6

I'm not sure what the cause of the problem is, but it has happened 3 times in two weeks under similar conditions.

I checked with the laptop's support desk and they had me run several tests to see if my machine was overheating, but there weren't any signs of that. So, here is the problem:

I sometimes run some heavy CPU-bound programs in Python and, when I'm not using any multiprocessing stuff, I usually set the affinity to one of my 4 "cores" (Core i5 - 2 cores and 4 threads by SMT) and use the priority "High".
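
For reference, the setup described above can also be applied from inside the Python script instead of through Task Manager. This is only a minimal sketch, assuming the third-party `psutil` package is installed; `affinity_mask` and `pin_and_prioritize` are hypothetical helper names, not part of any library.

```python
import sys


def affinity_mask(cores):
    """Return the bitmask that corresponds to a set of logical core indices."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core indices must be non-negative")
        mask |= 1 << core
    return mask


def pin_and_prioritize(cores, high_priority=False):
    """Pin the current process to `cores` and optionally raise its priority.

    Assumes psutil is installed. The HIGH_PRIORITY_CLASS constant only
    exists on Windows, so the priority change is skipped elsewhere.
    """
    import psutil  # third-party, imported lazily

    proc = psutil.Process()
    proc.cpu_affinity(list(cores))
    if high_priority and sys.platform == "win32":
        proc.nice(psutil.HIGH_PRIORITY_CLASS)
    return proc.cpu_affinity()
```

Calling `pin_and_prioritize([0], high_priority=True)` at the top of the script would roughly match "affinity to one core, priority High"; `affinity_mask([0])` is the value 1 that such a setting corresponds to.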

The first two times it happened, the computer had been running the heavy task for more than 24 hours when it unexpectedly turned off. I was browsing or doing other stuff and everything was gone. The machine was somewhat hotter than usual, but not really hot.

When I turned it on again, it behaved like nothing had happened... Just "Windows is starting" or something like that. Not a single message about the failure!

The third time it happened, the process was set to "Real Time" after it had been running for 5 minutes on Normal priority. The computer turned off about 5 seconds after it finished running the task (that was a quick one).

It wasn't even hot!!

When I turned it on again and tried to reproduce the error, I couldn't. So maybe the laptop has to stay on for at least a day or two before it shows this problem...

Now some funny things:

  • It always happens when I'm also using it for other tasks
  • Programs like Blender rendering 3D images for 2 days in a row using all 4 threads don't crash it. I also tried HandBrake converting videos on High Priority and using all threads.

Some thoughts:

  • I live in a very hot place (30°C to 40°C now that it's summer), but it also happened with the air conditioning system working.
  • When running on a single thread, Intel Turbo Boost raises the active core frequency from 2.67GHz to 2.93GHz

So, what should I do? Are there other tests I could run to see if there's a problem with the CPU? Should I rule out overheating even if I don't feel the laptop getting much hotter?

Hennes
JBernardo
  • what happens if you don't set the affinity? – soandos Jan 25 '12 at 05:36
  • @soandos I would say nothing, but I did some tests now and it showed the same problem with blender at "High Priority" (before I only tested Normal Priority) – JBernardo Jan 25 '12 at 05:38
  • I am talking about the affinity, not the priority – soandos Jan 25 '12 at 05:39
  • @soandos whoa, I made a couple more tests with HWMonitor and: affinity makes both cores hot (the one in use is 10°C hotter than the other). But it doesn't crash – JBernardo Jan 25 '12 at 05:42
  • @soandos Now with high priority, the cores get really hot: 95 to 100°C! – JBernardo Jan 25 '12 at 05:43
  • With Blender it crashed when both cores reached about 105°C (or 221°F) when I set High Priority. With Normal Priority they stay at about 80 to 85°C and sometimes 90°C – JBernardo Jan 25 '12 at 05:46
  • Turbo Boost is effectively an overclock of a single core when only that one core is in use? Its purpose is to make the computer as fast as possible for light tasks that use only one core. You want to make heavy use of this speed by pushing everything onto the one core and working it as hard as possible? The small difference in temperature between the cores when the load is sent to one of them shows that the cooling connection is NOT poor, which leaves just the overall temperatures getting high. If the CPU is that hot, there are also the VRMs and other things to consider. – Psycogeek Jan 25 '12 at 06:34
  • So far you have burst speeds, the burst of activity at OS boot, high GPU use, PLUS concentrated single-core activity, all causing a thermal or power-based shutdown? Do you have "Automatically Restart" on "System Failure" turned off in the computer properties? System properties / advanced / startup and recovery / system failure – Psycogeek Jan 25 '12 at 07:17
  • @Psycogeek I'm not sure but I think Automatic Restart is disabled. Can't check now. – JBernardo Jan 25 '12 at 07:45
  • @Psycogeek Another thing you just made me remember is that things like that didn't happen before I started using a full HD monitor plugged into the notebook. My [Core i5 480M](http://ark.intel.com/products/52952) has an integrated GPU. Do you think the GPU is heating up as well, since the problems always happen when I'm also using the laptop for other tasks? – JBernardo Jan 25 '12 at 07:48
  • You can test for that, I assume: mostly CPU-based (like Prime95 on small FFTs) versus mostly GPU-based (like FurMark) hard testing. Note that those tests are harder than any normal use. Even way-too-far "overclocked" systems that are unstable will still function when all the work going on is bottlenecked by other needs: disk needs, CPU needs or GPU needs. Those same systems only show instability and cooling issues when "burning" one item at a time. The hard testing isn't like normal use, but it can locate weak links. – Psycogeek Jan 25 '12 at 08:00
  • ...And memtest for mostly RAM-based testing and RAM overheating – Psycogeek Jan 25 '12 at 08:08
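
The temperatures reported in the comments (80 to 90°C at Normal priority, a shutdown near 105°C) suggest one possible workaround while the cooling is being sorted out: let the long-running job pause itself when the cores get too hot. This is only a rough sketch. `run_with_thermal_guard` is a hypothetical helper, `read_temp` is a placeholder for whatever sensor source is available (nothing in the thread specifies one), and the 90°C / 80°C thresholds are assumptions chosen from the numbers above, not recommended limits.

```python
import time


def run_with_thermal_guard(steps, read_temp, limit_c=90.0, resume_c=80.0,
                           sleep=time.sleep, poll_s=5.0):
    """Run each callable in `steps`, pausing while the CPU is too hot.

    read_temp: hypothetical callable returning the current temperature in °C.
    limit_c:   temperature at which the job pauses (assumed value).
    resume_c:  temperature the CPU must cool to before resuming (assumed value).
    Returns the number of polling pauses that were taken.
    """
    pauses = 0
    for step in steps:
        if read_temp() >= limit_c:
            # Too hot: idle until the reading drops to the resume threshold.
            while read_temp() > resume_c:
                pauses += 1
                sleep(poll_s)
        step()
    return pauses
```

Splitting the heavy task into small steps is what makes this work: the guard can only intervene between steps, so each one should be short compared with how fast the laptop heats up.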

2 Answers

1

There's a problem with your hardware. If the machine is within its warranty period, take it back for servicing. If it's not, try blowing out the dust, etc.

1

Your problems are most probably caused by dried-out thermal paste on your CPU and/or GPU. This will certainly happen on any laptop after a couple of years, especially if it runs heavy tasks for extended periods of time, as in your case.

I have experienced exactly the same problem once every couple of years on my Sony Vaio. Each time, it was solved with a couple of drops of thermal paste and a can of compressed air.

Buy new thermal paste, a can of compressed air for blowing the dust off all components (especially the ventilation ports) and a bottle of petrol for cleaning the dried-out remains of the old thermal paste off the CPU/GPU. You can google for details, and there's a good chance you can find a complete disassembly video for your laptop model on YouTube. It shouldn't cost more than $20, and you will end up with a laptop that runs much cooler, quieter and faster, without crashes.

pure.by