
I just upgraded my OS from Ubuntu 16.04 LTS to Ubuntu 18.04 LTS and then to Ubuntu 20.04 LTS, because I am trying to use the GPU to run neural networks and needed the newer OS to install the latest NVIDIA drivers. I have an NVIDIA GeForce GTX 1650 GPU. In Ubuntu 18.04 I installed NVIDIA driver 430, and when the OS was upgraded to Ubuntu 20.04 the driver was automatically updated to version 525, which is the one recommended for my card on the official NVIDIA driver website: https://www.nvidia.com/download/driverResults.aspx/199656/en-us/

When there are processes that use a lot of RAM, like playing videos, loading a lot of data in Firefox, or running the neural networks, my computer starts slowing down and the mouse pointer gets choppy. The GPU temperature goes to 95°C and GPU-Util goes to 100% (as reported by nvidia-smi just before freezing). Then the whole system goes into a deep freeze: the mouse and keyboard stop responding and the audio gets stuck in a loop. The only way out of the frozen state is a hard reset with the power button.

I see there are many similar questions related to this problem in this version of Ubuntu: "How to find out why Ubuntu 20.04 freezes?"; "Ubuntu 20.04 LTS freezes randomly - Suspecting Nvidia"; "Ubuntu 20.04 random freezes"; "Ubuntu 20.04 random freeze ups"; "Complete freezing - Ubuntu 20.04, probable problem with AMD driver".

In most of those questions the problem was related to the BIOS version, but in some of the posts the swap size was 2 GB or 4 GB, and when I checked mine it is only 976 MiB. I have no idea whether my problem is related to swap. My knowledge of Ubuntu and drivers is pretty limited. If anyone can help I would be really grateful; this is getting super frustrating and long.

Here is some useful info:

free -h result:

              total        used        free      shared  buff/cache   available
Mem:           15Gi       2,8Gi        10Gi        36Mi       2,2Gi        12Gi
Swap:         976Mi          0B       976Mi

sysctl vm.swappiness result:

vm.swappiness = 60

sudo lshw -C memory result:

PCI (sysfs)  
  *-firmware                
       description: BIOS
       vendor: American Megatrends Inc.
       physical id: 1
       version: E16S3IMS.108
       date: 11/18/2019
       size: 64KiB
       capacity: 16MiB
       capabilities: pci upgrade shadowing cdboot bootselect edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb biosbootspecification uefi
  *-memory
       description: System Memory
       physical id: 3b
       slot: System board or motherboard
       size: 16GiB
     *-bank:0
          description: SODIMM DDR4 Synchronous 2667 MHz (0,4 ns)
          product: M471A2K43CB1-CTD
          vendor: Samsung
          physical id: 0
          serial: 36BD8D3D
          slot: ChannelA-DIMM0
          size: 16GiB
          width: 64 bits
          clock: 2667MHz (0.4ns)
     *-bank:1
          description: [empty]
          physical id: 1
          slot: ChannelB-DIMM0
  *-cache:0
       description: L1 cache
       physical id: 45
       slot: L1 Cache
       size: 384KiB
       capacity: 384KiB
       capabilities: synchronous internal write-back unified
       configuration: level=1
  *-cache:1
       description: L2 cache
       physical id: 46
       slot: L2 Cache
       size: 1536KiB
       capacity: 1536KiB
       capabilities: synchronous internal write-back unified
       configuration: level=2
  *-cache:2
       description: L3 cache
       physical id: 47
       slot: L3 Cache
       size: 12MiB
       capacity: 12MiB
       capabilities: synchronous internal write-back unified
       configuration: level=3
  *-memory UNCLAIMED
       description: RAM memory
       product: Intel Corporation
       vendor: Intel Corporation
       physical id: 14.2
       bus info: pci@0000:00:14.2
       version: 00
       width: 64 bits
       clock: 33MHz (30.3ns)
       capabilities: pm cap_list
       configuration: latency=0
       resources: memory:d5418000-d5419fff memory:d541d000-d541dfff

[Screenshot: htop output just before freezing, after increasing the swap memory]

1 Answer


According to the output of free -h, you have only 976Mi of swap. In that snapshot none of it is in use yet (0B used) and 2,8Gi of your 15Gi of physical memory is used, but that is very little headroom: once a heavy workload fills physical memory, Linux has less than 1GB of swap to push pages into, and when that fills up too the system locks up.
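
To confirm whether swap really is the bottleneck, it helps to look at swap usage while the heavy workload is running rather than when the system is idle. Here is a minimal sketch, assuming a Linux system with the usual SwapTotal and SwapFree fields in /proc/meminfo; swap_used_kib is a made-up helper name, not a standard command.

```sh
#!/bin/sh
# Hypothetical helper: prints swap currently in use, in KiB.
# Reads SwapTotal and SwapFree from a meminfo-style file
# (defaults to /proc/meminfo; the path argument is only for testing).
swap_used_kib() {
    awk '/^SwapTotal:/ {total=$2} /^SwapFree:/ {free=$2} END {print total - free}' \
        "${1:-/proc/meminfo}"
}

# Report once; run it repeatedly (for example under `watch`) during the workload.
if [ -r /proc/meminfo ]; then
    echo "swap in use: $(swap_used_kib) KiB"
fi
```

If the number climbs towards the swap total shortly before the freeze, swap exhaustion is a likely suspect. If it stays near zero, the cause is probably elsewhere.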

Quick free fix: increase your swap size. On my system I have 8GB RAM so I use 24GB of swap, but anything above 25GB should be good for you (the general rule of thumb is that swap should be 2x the physical RAM size, but you can add more if you run into problems in the future).
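
A rough sketch of how that could look, assuming your swap can live in a file on an ext4 root filesystem and that nothing already exists at the path /swapfile (if your current 976Mi swap is a partition, or /swapfile is already in use, the steps differ). print_swap_commands is a made-up helper that only prints the commands so you can review them before running anything as root.

```sh
#!/bin/sh
# Hypothetical helper: prints (does NOT run) the commands to create and
# enable a swap file. Arguments: size in GiB, optional path.
# Assumption: swap file on ext4; /swapfile is an illustrative path.
print_swap_commands() {
    size_gib=$1
    path=${2:-/swapfile}
    echo "sudo fallocate -l ${size_gib}G $path"   # reserve the space
    echo "sudo chmod 600 $path"                   # readable by root only
    echo "sudo mkswap $path"                      # format it as swap
    echo "sudo swapon $path"                      # enable it now
}

# 2x rule of thumb for 16 GiB of RAM
ram_gib=16
print_swap_commands $((ram_gib * 2))
```

After running the printed commands, free -h should show the larger swap total. To keep the swap file across reboots, the usual approach is a line such as "/swapfile none swap sw 0 0" in /etc/fstab.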

Way more expensive, but results in a better experience: when Linux starts using your swap file, things typically get slow. I mean the slow kind of slow: cursor lag, waiting more than 15 seconds on browser tabs. It's a nightmare unless you're keeping your swap on some kind of multi-thousand-dollar NVMe drive. So I recommend just buying more RAM to get the best performance. You have a free RAM slot according to the output of sudo lshw -C memory, so that's not going to be a problem.