I have a new system on my laptop (Ubuntu GNOME 16.04). For the first week or so it was great, but then it started to freeze, seemingly at random, about once a day and sometimes more often. It is a complete system freeze, and so far the only way out has been a forced shutdown. I looked at some system logs (kern.log, syslog and dmesg) and found this in /var/log/syslog at the timestamp of the most recent freeze: [screenshot of the syslog file]

The complete syslog is here; the weird line is line 14582.

Does anybody have any hints on what could be causing the freezes? Should I look at other logs? I searched for advice on how to debug system crashes, but the information I found was scarce and not very helpful. For instance, the Ubuntu wiki guide tells me to replicate the issue on the CLI, but I don't know how, since I don't know what is crashing. I hoped to find something in the logs, but the ones recommended for checking show nothing of interest to my non-expert eye.

I want to try the SysRq method described in the Ubuntu wiki, but its description conflicts with the Wikipedia article on the topic, which is why I have hesitated to use it so far. Any advice on this would also be greatly appreciated.

Here is my system info:

$ uname -a
Linux ultrabook 4.13.0-31-generic #34~16.04.1-Ubuntu SMP Fri Jan 19 17:11:01 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Update

Today it froze again. Today's log, filtered to just the errors (the output of grep -i Error* /var/log/syslog, as suggested by Elder Geek), can be found here. The crash happened before 11:30:24, which is the time of the next boot.
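As a side note on the grep command above: an unquoted pattern such as Error* can be expanded by the shell if a matching file name exists in the current directory, so quoting it is safer. A minimal sketch, assuming the log is a plain-text file; log_errors is a hypothetical helper name, not an existing tool:

```shell
# Sketch: list error lines from a log file, case-insensitively,
# with their line numbers. log_errors is a hypothetical helper name.
log_errors() {
    # Quoting the pattern keeps the shell from glob-expanding it.
    grep -n -i 'error' "$1"
}
```

Calling log_errors /var/log/syslog would then print each matching line prefixed with its line number, which makes it easier to point at a specific entry.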

Thanks for any help.

jena

1 Answer

It might be hard to confirm for your past case, but I believe this was the cause, and others can upvote if it worked for them.

I found out that certain NVMe SSDs don't play well with Linux because of power-related issues.

After many months of troubleshooting, I found that this was why Ubuntu was suddenly freezing for no apparent reason.

The solution was to add the following kernel parameter:

nvme_core.default_ps_max_latency_us=0

To do so, edit /etc/default/grub and add the parameter above to your GRUB_CMDLINE_LINUX_DEFAULT= string. For example, mine looked like this:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nvme_core.default_ps_max_latency_us=0"

Then save the file, run sudo update-grub, and reboot the system.
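The edit to /etc/default/grub can also be scripted. The following is only a sketch, assuming GNU sed (as shipped with Ubuntu) and a GRUB_CMDLINE_LINUX_DEFAULT line written with double quotes; add_kernel_param is a hypothetical helper name, not part of GRUB or Ubuntu:

```shell
# Sketch: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a
# GRUB defaults file, unless it is already there.
# add_kernel_param is a hypothetical helper name (an assumption).
add_kernel_param() {
    file=$1
    param=$2
    # Nothing to do if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*${param}" "$file"; then
        return 0
    fi
    # Keep a backup copy next to the original before editing.
    cp "$file" "$file.bak"
    # Insert the parameter just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}
```

To apply it to the real file, the function would have to be run as root with /etc/default/grub and nvme_core.default_ps_max_latency_us=0 as arguments, followed by sudo update-grub and a reboot. After rebooting, cat /proc/cmdline should list the parameter if it took effect.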

I have had a 100% success rate since, i.e. zero crashes, not even an odd one once in a while.

@Jena, if you still have the original NVMe SSD that was causing the issue, it would be nice to have a full-circle confirmation.

References:

Official kernel bug report: https://bugzilla.kernel.org/show_bug.cgi?id=195039

Wadih M.