What's the best way to recover when Ubuntu Mate freezes?

The individual values are documented in /etc/sysctl.d/10-magic-sysrq.conf, and 176 is the sum of:

128 - allow reboot / poweroff
 32 - enable remount read-only
 16 - enable sync command
------
176

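That setting allows only part of the REISUB sequence (see Wikipedia for more info on what REISUB does): S (sync), U (remount read-only) and B (reboot) work, while R needs the keyboard-control bit (4) and E and I need the process-signalling bit (64), neither of which is part of 176.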
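The bit arithmetic can be checked with a short shell sketch. The bit meanings are the ones listed in the kernel's SysRq documentation; the function name decode_sysrq is made up for illustration.

```shell
# Decode a kernel.sysrq bitmask into the function groups it enables.
# Bit values follow the kernel's SysRq documentation; decode_sysrq is a
# hypothetical helper name.
decode_sysrq() {
    mask=$1
    if [ "$mask" -eq 0 ]; then echo "SysRq disabled"; return 0; fi
    if [ "$mask" -eq 1 ]; then echo "all SysRq functions enabled"; return 0; fi
    for entry in \
        "2:console logging level" \
        "4:keyboard control (SAK, unraw)" \
        "8:debugging dumps of processes" \
        "16:sync command" \
        "32:remount read-only" \
        "64:signalling of processes (term, kill, oom-kill)" \
        "128:reboot / poweroff" \
        "256:nicing of all RT tasks"
    do
        bit=${entry%%:*}    # number before the first colon
        name=${entry#*:}    # description after the first colon
        # Print the entry only if its bit is set in the mask.
        if [ $(( mask & bit )) -ne 0 ]; then
            printf '%3d - %s\n' "$bit" "$name"
        fi
    done
}

decode_sysrq 176
```

To decode the value actually in use, pass it the contents of /proc/sys/kernel/sysrq instead of 176.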

As for the Ctrl+Alt+Backspace combo: I have no idea why it’s not enabled by default. I don’t think it has any security implications. I always switch it on, just in case. If never needed, all the better.
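For anyone looking for the switch, here is a minimal sketch, assuming the stock Ubuntu keyboard configuration file /etc/default/keyboard is in use. The XKB option terminate:ctrl_alt_bksp is what turns the combo on; if other options are already set, add it to the same comma-separated list rather than replacing them.

```shell
# Fragment for /etc/default/keyboard (assumed location on Ubuntu).
# Any existing options stay in the same comma-separated list.
XKBOPTIONS="terminate:ctrl_alt_bksp"
```

For the current X session only, `setxkbmap -option terminate:ctrl_alt_bksp` should have the same effect.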

Some 11e laptops are using Intel Bay Trail processors. You might be affected by point #3 on this thread:

@ouroumov Thanks for the link, this does seem to be the issue for me. My 11e has an Intel Celeron N2940, and I have experienced random freezes twice now (once on Fedora 25, and today on Ubuntu Mate 16.04).

This is a horrible bug IMO, it’s like a Linux BSOD. Also, people always say Linux is a good choice for lower end hardware—which is what this bug affects. I have lost work over it before. Fortunately it seems to happen fairly rarely on my 11e, but I sure hope it’s fixed soon. I don’t really want to disable CPU power saving because I run my 11e on battery often.

But since this does happen, is it possible to recover using Ctrl + Alt + Backspace or the magic SysRq key combo? The only way I know how to recover now is holding down the power button. When Ubuntu freezes, nothing works, not the mouse, nor Alt+F2. I can’t switch to another tty either. I hadn’t enabled Ctrl+Alt+Backspace, but I have now. I hope it doesn’t happen again, but if it does I’ll try that first.

Thanks for explaining, makes sense now. :^) So REISUB was enabled when I tried to use it; either I don’t know where the SysRq key is or the system wasn’t responding. I enabled Ctrl+Alt+Backspace after having a freeze on my 11e just today. :^(

The issue seems to be Bug 109051 - intel_idle.max_cstate=c1, which I found listed as item number 3 in the Ubuntu Mate known bugs page. My 11e has an Intel Celeron N2940, one of the processors affected by the bug. Not a nice bug.

Just throwing this out there for comment:

sudo service lightdm restart

Couldn’t this also be used from console?

Yes, if one has access to console. Since we are now using systemd (although service + the old System V init scripts seem to still be there), it’s:

sudo systemctl restart lightdm.service
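A small sketch of choosing between the two forms, assuming that the presence of /run/systemd/system is a fair test for systemd being the running init system. The function name restart_command is hypothetical, and it only prints the command instead of running it.

```shell
# Print the command that would restart a display manager on this machine.
# Nothing is executed; restart_command is a hypothetical helper name.
restart_command() {
    unit=$1
    # /run/systemd/system exists only when systemd is the running init system.
    if [ -d /run/systemd/system ] && command -v systemctl >/dev/null 2>&1; then
        echo "sudo systemctl restart ${unit}.service"
    else
        echo "sudo service ${unit} restart"
    fi
}

restart_command lightdm
```

On a systemd-based Ubuntu the service wrapper generally forwards to systemctl, so either form should end up doing the same thing.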

Oh, so it’s a bit different now? Should I only use the new systemctl command or will the old one work still? I was reading a guide on recovering from a freeze that listed the old one, but it is several years out of date. I have noticed that often deprecated commands are kept as aliases for the new ones, though.

Unfortunately, I just experienced two freezes in a row. :^( I was forced to do a hard restart again because none of the other options worked. First I tried Ctrl+Alt+Backspace, but it didn’t work. I couldn’t switch to another tty, either. Then I tried the SysRq key combo, but nothing happened; this could be because I’m still not entirely sure where the SysRq key is on my keyboard. Some web hits said to hold down Alt+PrtSc, another said to press Alt+Fn+S, release S, then go through the REISUB sequence. Neither seemed to do anything.

For those of you who have used the SysRq key in the past, what exactly is supposed to happen? Shouldn’t the X session terminate after sending the signal to kill remaining processes? I was just pressing these keys without anything happening.

Second, have those of you who are affected by Bug #3 on the 16.04 known bugs thread succeeded in recovering by restarting X or the SysRq key combo? Every time it has been a COMPLETE freeze, and no keypresses have worked. Is hard reboot really the only option with this bug?

From my experience (I had the bug on a machine with an Intel J1900), yes. It’s a complete kernel crash. I suspect Magic SysRq needs basic kernel functionality to still be running in order to work.
You can try Magic SysRq while your computer is operating normally to find out which key combination your hardware uses.

Apparently a patch is underway (see comment 724 from assignee), let’s hope it gets here soon.

Just experienced another crash, but I think I’m figuring this out.

I was installing a program from CD in virtualbox when the system locked up.

Used REISUB to shut down and was expecting to just be able to restart. When I restarted I was getting page after page of ATA bus errors.

Booted to a live USB and went online. Most people who had this problem solved it by switching or unplugging and replugging their SATA cables, so I shut down and did this.

When I rebooted the error message said I need to do a manual fsck. So I did this.

So this is my question: What are the implications of an ATA bus error? Is this just a loose connection? Is my motherboard going? Are there any test programs I can run to detect and/or address this?

Thanks,

Jim

Hi Jim,

is it related to “TLP”? (see answers 3 & 4):

https://bugs.launchpad.net/elementaryos/+bug/1576634

I have a “desktop” and haven’t installed “TLP”, so I don’t think that’s it, unless TLP is installed by default on 16.04.

I think that I may have a faulty SATA cable connected to my DVD drive. When you first begin having screen freezes/lockups, you don’t often associate your actions at the exact moment with the lockup - such as using the DVD drive.

When I was doing my recent research on ATA bus errors, several people mentioned tracking the problem down to a faulty SATA cable or a loose cable connection. So it got me to thinking - Yeah, I was using the DVD at the time of the last crash. So now I’m trying to remember if I was using the DVD at the time of the other crashes. I think there’s a good chance that maybe I was. It’s impossible to be 100% sure at this point.

I have a spare SATA cable. I’m going to switch that out on my DVD and go from there.

Jim

FYI: It is installed by default on 16.04 (by ubuntu-mate-desktop metapackage).

Even if the package description says “save battery power on laptops”, TLP manages many other things (CPU frequency scaling related things, for example) so it’s useful on a desktop as well. See the TLP website for more information.

systemctl list-unit-files | grep tlp

Hi Jim,

if you can, take out the DVD drive (I assume it is a tower PC) and check whether the jumper is set to “Cable Select” instead of Master or Slave. :smiley:

The normal setting would be Master for a single drive, or Slave if it shares the same cable as your HDD! :smiley: (That wouldn’t normally be the case with a SATA cable, though!) :smiley:

I’m coming up with a different scenario so let me run this by you:

My harddrive is partitioned like this:
sda1 - swap
sda2 - / (ubuntu-mate 16.04 with kernel 4.4)
sda3 - home
sda4 - extended partition
sda5 - backup partition for home partition
sda6 - ubuntu-mate 16.10
sda7 - not used at present

I use mate 16.04 as my main system because my primary interest is stability. I was having a problem with virtualbox crashing so I installed it in mate 16.10 on sda6. Everything worked fine. No problems. I thought I had all the conflicts worked out and it’s a lot of trouble to logout and login to a different partition, so I installed virtualbox in my 16.04 partition on sda2. I was in the middle of installing Chessmaster8000 when all the above problems began. (I had previously installed it in the 16.10 partition with no problem.)

I’m thinking the problem may be some conflict with the 4.4 kernel. Even though fsck says that my file system is clean, when I boot up with a live USB and run fsck it cannot locate any partition other than sda2 - it says that maybe the others don’t exist. Obviously something got scrambled in the crash.

I’m going to do a reinstall, but before I do, if there are any log files that may prove helpful, let me know and I’ll send them to you.

Since I’m going to do a reinstall, and my primary interest is stability, and I know there have been some stability issues with 16.04 and 16.04.1, do you think these issues have been addressed in 16.04.2? It would seem that upgrading to the new kernel might address at least some of the issues. Or should I go back to 14.04 until these issues have been resolved?

Thanks, Jim

Hi Jim,

it may well be a kernel problem, but I cannot really say; it is possible that upgrading the kernel will solve the problems for you. I am using UM 16.04.2 and don’t have any hardware problems on two different PCs: one is a midi tower (HP Compaq Presario/1GB Nvidia GPU) and the other is a mini Packard Bell notebook with an Intel GPU! :smiley:

I know. That really wasn’t a fair question.

I was reading Jay’s comments in “Screen flickers twice every startup”.

I know that 16.04 was a huge upgrade and Canonical made a lot of changes from 14.04. While I know these things are eventually going to get “ironed out”, in the meantime, I would really prefer a seat on the sidelines.

A tip for next time: you can list the partition tables of the drive with

sudo fdisk -l /dev/sda

to see if all partitions are recognized. It also shows important information like Disklabel type.

It would not be a bad idea to view the drive’s SMART information and run its self-tests. A drive that’s about to fail can cause all kinds of “interesting” problems. As for the logs: saving journalctl -b output (logs from the current boot; journalctl -b -1 gives the previous boot if the journal is kept on disk) to a text file on a USB stick could be a good idea.
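A rough sketch of collecting both in one go, assuming smartmontools is installed for the SMART part and that the drive is /dev/sda. The function name collect_report and the output file names are made up for illustration; the SMART step is skipped unless the function runs as root.

```shell
# Save the current boot's journal and, when possible, SMART data to a directory.
# collect_report and the file names are hypothetical; adjust paths to taste.
collect_report() {
    outdir=$1
    disk=${2:-/dev/sda}   # assumed drive; check against the fdisk -l output
    mkdir -p "$outdir" || return 1

    # Logs from the current boot.
    if command -v journalctl >/dev/null 2>&1; then
        journalctl -b --no-pager > "$outdir/journal.txt" 2>&1 || true
    fi

    # SMART health verdict and attribute table; needs root and smartmontools.
    if command -v smartctl >/dev/null 2>&1 && [ "$(id -u)" -eq 0 ]; then
        smartctl -H -A "$disk" > "$outdir/smart.txt" 2>&1 || true
    fi

    echo "report saved in $outdir"
}
```

Called with the mount point of the USB stick as the first argument, it leaves the files there. A short self-test can be started separately with `sudo smartctl -t short /dev/sda`, and its result read later with `sudo smartctl -l selftest /dev/sda`.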

Better to fix the freeze. If you had Windows 10 on your machine and replaced it with Linux, then switch on legacy support in the BIOS. Worked for me on several laptops.

Thanks for the reply, I fixed the freezes months ago though. :slight_smile: I replaced Win7 on this machine. It was actually the nasty Bug #109051, which causes total freezes on computers using low end Bay Trail CPUs. I edited my grub config file as described here. One of the reasons I love the UM community is that the solution to this issue was written down on a common problems and workarounds page!

No more freezes, the only downside is that the computer uses somewhat more power since I disabled CPU power saving modes.
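For reference, a rough sketch of that kind of edit, assuming the workaround is the intel_idle.max_cstate=1 kernel parameter (the integer form the kernel accepts) added to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub. The helper name add_kernel_param is made up for illustration, and the demo runs on a throwaway file rather than the real config.

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless it is
# already present. add_kernel_param is a hypothetical helper name.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
        return 0
    fi
    # Keep a .bak copy, then append the parameter inside the quotes.
    sed -i.bak "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"/" "$file"
}

# Demo on a temporary file standing in for /etc/default/grub.
demo=$(mktemp)
printf 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$demo"
add_kernel_param "$demo" intel_idle.max_cstate=1
grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$demo"
```

On the real file the same change needs root, followed by `sudo update-grub` and a reboot before it takes effect.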