Lockups on AMD Ryzen 3 1200

Hi everyone, I have a pretty complicated issue that I have been dealing with for years, and I don't really know how to fix it. The problem only shows up under Linux, so I don't think it's an issue with my build. I built a computer with an ASUSTek PRIME A320M-K motherboard and a Ryzen 3 CPU, following this guide: https://linux-hardware.org/?probe=4ccbcf1ee5. Almost immediately I noticed issues under Linux: several times a day the entire computer would become unresponsive. At first I thought it was a hardware thing, but then I started testing under Windows and never had the system freeze or lock up.

Someone else seems to have had a similar issue here: [SOLVED] Machine locks up after 16-48 hours. / Newbie Corner / Arch Linux Forums

I have read so many pages and articles: disable the C6 C-state, make modifications to your BIOS, update your firmware, change out your kernel... I am trying to figure out how to organize all this and troubleshoot the issue logically. I tried disabling C6 (I think I did it right), but that didn't seem to help. Does anyone know how to troubleshoot or fix this kind of issue?
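
One way to sanity-check whether an idle-state change took effect is to look at what the kernel itself reports. The sketch below is only an illustration: the function names `list_idle_states` and `show_cmdline` are made up here, and the idle states the kernel lists (C1, C2, ...) are not necessarily the same thing as the "C6" the Ryzen articles talk about, so treat the output as a hint rather than proof.

```shell
# List the idle states the kernel exposes for one CPU and whether each is disabled.
# The optional argument overrides the directory; the default is the standard sysfs path.
list_idle_states() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    if [ ! -d "$dir" ]; then
        echo "no cpuidle directory at $dir"
        return 1
    fi
    for state in "$dir"/state*; do
        [ -d "$state" ] || continue
        name=$(cat "$state/name")
        disabled=$(cat "$state/disable")
        echo "$(basename "$state") name=$name disabled=$disabled"
    done
}

# Print the kernel boot parameters, to confirm that any idle-related option
# you added to the bootloader is really in use.
show_cmdline() {
    cat "${1:-/proc/cmdline}"
}
```

After pasting these into a shell, running `list_idle_states` and then `show_cmdline` should show at a glance whether the setting survived the reboot.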

After a bit more research I decided to try updating the BIOS firmware first. I have version 4011 (4/19/2018), so not the 2017 version that some people were having issues with. I think I already updated it in the past to try to fix this, but I figured another update was worth a go. I will see if this helps.

I had this once with a computer: lockups after some time running Linux, but no problems with Windows.
It turned out to be a faulty RAM stick.
You could try running a memory test?

I ran a memtest and it came back normal. The firmware upgrade did not fix it either: the machine locked up about an hour after I applied it.

I am submitting a service request to AMD because of this article I have been reading: https://www.phoronix.com/scan.php?page=news_item&px=Ryzen-Segv-Response
This could just be a hardware bug. I will report back if it is, because I don't want other people to have to rip their hair out over this.

Good idea,

In the meantime I'm trying to make a list for you to check. Just in case the service request comes up empty.

Are the CPU and GPU temperatures OK?
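
If no monitoring tool is installed, the kernel's hwmon interface in sysfs can be read directly. A minimal sketch, assuming the sensors are exposed under `/sys/class/hwmon` (the function name `print_temps` is made up for illustration, and which chips appear depends on the loaded drivers):

```shell
# Print every temperature sensor the kernel exposes through hwmon, in degrees Celsius.
# The optional argument overrides the hwmon root; the default is the standard sysfs path.
print_temps() {
    root="${1:-/sys/class/hwmon}"
    for input in "$root"/hwmon*/temp*_input; do
        [ -r "$input" ] || continue
        chip=$(cat "$(dirname "$input")/name" 2>/dev/null || echo unknown)
        milli=$(cat "$input" 2>/dev/null) || continue
        # sysfs reports the value in millidegrees Celsius
        echo "$chip $(basename "$input" _input): $((milli / 1000)) C"
    done
}
```

Running `print_temps` a few times under load gives a rough idea of whether anything is running hot before a freeze.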

You have probably checked this already, but just to be sure:
when you say 'unresponsive', do you mean that:

  1. the mouse and keyboard do not work anymore,
    or
  2. the screen output is frozen,
    or
  3. literally every program is suddenly halted?

For troubleshooting it is vitally important to establish beyond doubt whether it is point 1, 2, or 3.
If the problem is point 2 (because, for instance, your NVIDIA card or driver is acting up), it will of course also look like points 3 and 1, which can be pretty confusing if you don't keep this in mind.

To check if it is point 1: configure your clock to also show seconds.
If the seconds keep ticking after the freeze, you are dealing with point 1:

  1. Your keyboard and mouse went to sleep; disable autosuspend on USB.

In every other case it is point 2 or 3.
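
To see how USB autosuspend is currently configured, the relevant sysfs files can be read directly. This is a rough sketch with a made-up function name (`usb_autosuspend_report`); the two arguments exist only so the paths can be overridden and default to the usual sysfs locations.

```shell
# Report the global USB autosuspend delay and each device's power-control mode.
usb_autosuspend_report() {
    param="${1:-/sys/module/usbcore/parameters/autosuspend}"
    devices="${2:-/sys/bus/usb/devices}"
    if [ -r "$param" ]; then
        # -1 means autosuspend is disabled globally; other values are a delay in seconds
        echo "global autosuspend: $(cat "$param")"
    fi
    for ctl in "$devices"/*/power/control; do
        [ -r "$ctl" ] || continue
        dev=$(basename "$(dirname "$(dirname "$ctl")")")
        # "auto" lets the kernel suspend the device, "on" keeps it powered
        echo "$dev: $(cat "$ctl")"
    done
}
```

If devices show `auto`, booting with the kernel parameter `usbcore.autosuspend=-1` turns autosuspend off globally, which is one way to rule point 1 in or out.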

To check if it is point 2, see if you can log in remotely from another computer (likely with SSH or whatever you use) and try to run some command-line stuff, like dmesg
or journalctl -b | tail -n40.
If you can log in remotely after the freeze, you are dealing with point 2.
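
The remote check can be wrapped in a small helper to run from the second computer. This is only a sketch: the function name `check_remote_alive` is made up, the target is whatever user and host you normally use for SSH, and a failed connection can also mean the network is down rather than the machine being frozen.

```shell
# Try to pull the last kernel and journal messages from the frozen machine over SSH.
# $1 is the SSH target, for example a user@host of your own.
check_remote_alive() {
    target="$1"
    if ssh -o ConnectTimeout=10 "$target" \
        'dmesg | tail -n 40; journalctl -b | tail -n 40'; then
        echo "remote shell works: likely a display problem (point 2)"
    else
        echo "no remote shell: possibly a full system freeze (point 3)"
        return 1
    fi
}
```

It is worth testing this once while the machine is healthy, so you know SSH access works before the next freeze.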

  1. Unload the NVIDIA proprietary driver and revert to Nouveau.
    If that works, keep it or find a different version of the NVIDIA driver.
  2. If that doesn't help, try a completely different video card instead.

If you can't log in remotely after the freeze, or if your remote session is cut off,
you are dealing with point 3.

Reboot and check the log files:
journalctl -b -1 (= the complete log of the previous boot)
and see if there are any conspicuous entries from just before the reboot.
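
The full log of a boot can be long, so it may help to look only at its tail and at error-level entries. A minimal sketch, assuming systemd's journal is in use (the function name `previous_boot_logs` is made up, and the optional argument only overrides the directory that is checked):

```shell
# Show the end of the previous boot's journal, then only its error-level entries.
previous_boot_logs() {
    journal_dir="${1:-/var/log/journal}"
    if [ ! -d "$journal_dir" ]; then
        # Without this directory the journal is often kept in memory only
        echo "note: $journal_dir is missing, so the journal may not survive a reboot"
    fi
    journalctl -b -1 -n 40 --no-pager
    journalctl -b -1 -p err --no-pager
}
```

If the note about the missing directory appears, the log of the previous boot may simply not exist, and persistent journal storage would have to be enabled before the next freeze.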

At least you have something to do now :wink:


I will look into this, but I did get a message back from AMD with an RMA for the chip, so hopefully that will work out.
