I think my nVidia graphics card might be starting to fail, as the machine randomly locks up while I'm playing video games.

Is there a (nVidia-specific) diagnostic program that can test the card for errors and heat issues?

Obviously, a program that just monitors heat won't do, as the machine completely locks up.

Ƭᴇcʜιᴇ007
Keltari
  • I see this question pop up from necro every so often (like now), so I might as well mention that the card did die shortly after the original post. Still, I would like to find a diagnostic tool... – Keltari Jun 18 '13 at 16:10
  • I'm not clear why this question was "_off-topic_"; in my experience, this problem is almost _common_, and _erratic_. One of our teams spent 3 person-months trying to address it, and they still have palette inconsistencies and artificial limitations. On my Linux box (and on Windows) the NVIDIA board freezes and goes black (or just dies?). Linux keeps running and writing logs or something. If this were a question about hard drives, you would recommend SMART tools. – will Feb 28 '17 at 11:25
  • Try the procedure outlined on this page: **[Graphics Troubleshooting Procedure](https://help.ubuntu.com/community/GraphicsTroubleshootingProcedure)**. I only tried it just now, but I'm hoping it will help. You are correct; there's a long-standing bug against the drivers that is NOT closed, but the best I've seen is suggestions for someone to try an (unstable) latest build and _suck-it-and-see_. Maybe it's the rules about questions that need to be labeled "Off-topic". – will Feb 28 '17 at 11:30
  • @will - I agree it's ridiculous to close this question. Especially since the answer (as of 2022) appears to be the "NVIDIA Validation Suite" (NVVS). – Nemo Apr 14 '22 at 14:31

1 Answer

First, try to establish that the card itself is the problem and not some other factor on your machine. This may be tricky, but try using the card in another known-good machine to see whether you experience the same thing.

If you do, then that should be enough to convince you it's a problem with the card.

At the same time, you could test a known-good graphics card in your machine; if you have no issues with the alternate card, you can have confidence in the rest of your system configuration.

As for software, try a variety of graphics benchmarking tools to put the card through its paces, along with something like RealTemp or GPUTemp as an indicator of how warm it's getting under load.
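
Since the machine locks up completely, an on-screen temperature readout is lost at the moment it matters. One possible workaround is to log readings to disk while a benchmark runs, so the last sample survives the reboot. Below is a minimal sketch, assuming the NVIDIA driver's `nvidia-smi` tool is installed and on the PATH; the function names and the log file name are hypothetical and not part of the original answer.

```python
import os
import subprocess
import time

# nvidia-smi query: one CSV line per GPU with timestamp, temperature (C), utilization (%).
QUERY = [
    "nvidia-smi",
    "--query-gpu=timestamp,temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def read_sample():
    """Run nvidia-smi once and return its output (raises if the tool is missing)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def parse_temperature(csv_line):
    """Extract the temperature from a single 'timestamp, temp, util' line."""
    fields = [field.strip() for field in csv_line.split(",")]
    return int(fields[1])


def log_samples(path, sampler=read_sample, count=None, interval=1.0):
    """Append samples to `path`, forcing each one to disk before sleeping.

    With count=None the loop runs until interrupted (Ctrl+C) or the machine locks.
    """
    written = 0
    with open(path, "a") as log:
        while count is None or written < count:
            log.write(sampler() + "\n")
            log.flush()
            os.fsync(log.fileno())  # ask the OS to write the sample to disk now
            written += 1
            time.sleep(interval)
    return written
```

Calling `log_samples("gpu_temp.log")` in one window while a benchmark runs in another leaves a trail of readings; after a lock-up and reboot, the final lines of the file show how hot and how busy the card was just before it froze. A high last reading hints at a heat problem, while a freeze at modest temperatures points elsewhere, though neither is conclusive on its own.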

Sara Gamage
  • I think that's why he's looking for a diagnostic tool, so that he can determine whether or not it's the graphics card. :-S – Mark Good Nov 20 '12 at 00:32
  • Using another card or using the card somewhere else is a surefire way to determine if it's the card. – David Jun 18 '13 at 15:59
  • Not everyone has a compatible machine lying around. I have a tower with an issue (why I'm here) and it's the only tower in the building; everything else is laptops. – Emma Talbert Feb 10 '20 at 00:22