
Moments after copying a large file on my primary SSD, the IO PSI value rises above 90% and the system becomes very unresponsive. It takes around a minute to return to normal.

I first noticed the freezes during operations like docker pull, but I can reproduce what I believe to be the same problem with a plain rsync.

The problem does not seem to happen when the test file is located on the second SSD.

Reproducible example:

Inspired by this post, I log the load average and the CPU/IO PSI values to a CSV file every second.

In the first terminal session I run

rm -f monitor.csv; t=0; while true; do t=$((t+1)); echo "$t,$(cut -d' ' -f1 /proc/loadavg),$(grep some /proc/pressure/cpu | cut -d' ' -f2 | cut -d= -f2),$(grep some /proc/pressure/io | cut -d' ' -f2 | cut -d= -f2),$(grep full /proc/pressure/io | cut -d' ' -f2 | cut -d= -f2)" >> monitor.csv; sleep 1; done
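
For reference, the pressure files look like this; the one-liner extracts the avg10 field, i.e. the 10-second average (the total values here are just illustrative):

$ cat /proc/pressure/io
some avg10=0.00 avg60=0.00 avg300=0.00 total=123456789
full avg10=0.00 avg60=0.00 avg300=0.00 total=12345678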

I copy a 3.3 GB test file on the primary SSD:

rm -f test.zip ; echo "start rsync" >> monitor.csv ; rsync original.zip test.zip ; echo "end rsync" >> monitor.csv

Now monitor.csv looks like this:

1,0.64,0.18,0.00,0.00
2,0.67,0.14,0.00,0.00
3,0.67,0.14,0.00,0.00
4,0.67,0.12,0.00,0.00
start rsync
5,0.67,0.12,0.00,0.00
6,0.67,0.09,0.00,0.00
7,0.70,0.09,0.00,0.00
8,0.70,0.08,0.00,0.00
end rsync
9,0.70,0.08,0.00,0.00
10,0.70,0.42,6.15,5.79
11,0.70,0.42,6.15,5.79
12,0.80,0.35,16.99,15.97
13,0.80,0.35,16.99,15.97
14,0.80,0.28,30.94,29.38
15,0.80,0.28,30.94,29.38
16,1.14,0.23,43.09,41.27
17,1.14,0.23,51.59,49.73
18,1.14,0.19,51.59,49.73
19,1.14,0.19,51.59,49.73
20,1.14,0.33,58.00,55.40
21,2.57,0.27,62.89,60.22
22,2.57,0.27,62.89,60.22
23,2.57,0.22,67.80,65.25
24,2.57,0.22,67.80,65.25
25,2.57,0.18,72.37,69.73
26,3.57,0.18,72.37,69.73
27,3.57,0.15,75.38,72.68
28,3.57,0.15,75.38,72.68
29,3.57,0.12,79.48,76.90
30,3.57,0.12,79.48,76.90
31,4.56,0.10,83.01,80.36
32,4.56,0.10,83.01,80.36
33,4.56,0.08,84.46,81.38
34,4.56,0.08,84.46,81.38
35,4.56,0.06,86.37,83.12
36,5.24,0.06,86.37,83.12
37,5.24,0.05,86.66,83.10
38,5.24,0.05,86.66,83.10
39,5.24,0.04,87.09,82.90
40,5.24,0.04,87.09,82.90
41,5.94,0.03,84.71,80.02
42,5.94,0.03,84.71,80.02
43,5.94,0.02,84.77,79.65
44,5.94,0.02,84.77,79.65
45,5.94,0.02,86.26,80.08
46,6.91,0.02,86.26,80.08
47,6.91,0.01,86.93,80.42
48,6.91,0.01,86.93,80.42
49,6.91,0.01,86.04,79.80
50,6.91,0.01,86.04,79.80
51,7.95,0.01,85.67,79.29
52,7.95,0.01,85.67,79.29
53,7.95,0.01,86.45,79.78
54,7.95,0.01,86.45,79.78
55,7.95,0.00,86.73,79.82
56,8.12,0.00,86.73,79.82
57,8.12,0.18,80.08,73.69
58,8.12,0.18,80.08,73.69
59,8.12,0.33,65.57,60.34
60,7.55,0.33,65.57,60.34
61,7.55,0.27,54.05,49.77
62,7.55,0.27,54.05,49.77
63,7.55,0.22,44.26,40.76
64,7.55,0.18,36.24,33.37
65,7.02,0.18,36.24,33.37
66,7.02,0.18,36.24,33.37
67,7.02,0.14,29.67,27.32
68,7.02,0.12,24.30,22.37
69,7.02,0.12,24.30,22.37
70,6.46,0.10,19.89,18.32
71,6.46,0.10,19.89,18.32
72,6.46,0.08,16.29,15.00
73,6.46,0.08,16.29,15.00
74,6.46,0.06,13.34,12.28
75,5.94,0.06,13.34,12.28
76,5.94,0.05,10.92,10.06
77,5.94,0.05,10.92,10.06
78,5.94,0.04,8.94,8.23
79,5.94,0.04,8.94,8.23
80,5.55,0.03,7.32,6.74
81,5.55,0.03,7.32,6.74
82,5.55,0.02,5.99,5.52
83,5.55,0.02,5.99,5.52
84,5.55,0.02,4.91,4.52
85,5.10,0.02,4.91,4.52
86,5.10,0.01,4.02,3.70
87,5.10,0.01,4.02,3.70
88,5.10,0.01,3.29,3.03
89,5.10,0.01,3.29,3.03

Chart of the logged values:

[Chart: load average, CPU PSI, and IO PSI (some/full) over time]

  • Copying the file takes only ~5 seconds.
  • After that, the IO PSI goes very high and only normalizes after about a minute (see the sketch below for a way to confirm the data is still in flight).
  • The system GUI becomes very unresponsive during that time.
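
To confirm that the data is still in flight when rsync returns, one can time a sync right after the copy; a minimal sketch based on the commands above (a long sync duration would match the PSI spike):

rm -f test.zip
echo "start rsync" >> monitor.csv
rsync original.zip test.zip
echo "end rsync" >> monitor.csv
time sync     # blocks until all dirty pages have been written out
echo "end sync" >> monitor.csv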

Also, a warning blinks in atop after the copy process:

[atop screenshot: the NVMe and crypt device lines are flagged red]

The same operation with the file located on the second SSD takes a bit longer, but the system does not freeze and the IO PSI value does not spike as badly:

1,2.42,0.00,0.01,0.01
2,2.42,0.00,0.01,0.01
3,2.42,0.00,0.01,0.00
4,2.23,0.00,0.01,0.00
start rsync
5,2.23,0.00,0.00,0.00
6,2.23,0.00,0.00,0.00
7,2.23,0.00,1.27,1.09
8,2.23,0.00,1.27,1.09
9,2.13,0.00,2.85,2.70
10,2.13,0.00,2.85,2.70
11,2.13,0.18,3.42,3.30
12,2.13,0.18,3.42,3.30
13,2.13,0.14,4.25,4.15
14,2.04,0.14,4.25,4.15
15,2.04,0.12,5.11,5.03
16,2.04,0.12,5.11,5.03
17,2.04,3.72,9.44,7.02
end rsync
18,2.04,3.72,9.44,7.02
19,1.88,3.04,8.63,6.65
20,1.88,3.04,8.63,6.65
21,1.88,2.49,7.07,5.44
22,1.88,2.49,7.07,5.44
23,1.73,2.04,5.79,4.46
24,1.73,2.04,5.79,4.46
25,1.73,1.67,4.74,3.65
26,1.73,1.67,4.74,3.65
27,1.73,1.36,3.88,2.99
28,1.59,1.36,3.88,2.99
29,1.59,1.12,3.17,2.44
30,1.59,1.12,3.17,2.44

System Details

  • Lenovo E14

  • Ubuntu 22.04

  • Gnome & Wayland

  • CPU AMD Ryzen 7 4700U with Radeon Graphics

  • 2 NVMe SSDs built in, both with encrypted file systems

    • primary SSD (nvme0):
      • Model Number: KBG40ZNT512G TOSHIBA MEMORY
      • Firmware Version: 0109AELA (seems to be the latest available)
    • secondary SSD (nvme1):
      • Model Number: CT500P3SSD8
  • Schedulers are the same on both devices:

$ cat /sys/block/nvme0n1/queue/scheduler 
[none] mq-deadline 
$ cat /sys/block/nvme1n1/queue/scheduler 
[none] mq-deadline
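
As an aside, [none] is the active scheduler on both devices. If one wanted to rule the scheduler out as a factor, it can be switched at runtime; a hypothetical experiment (the change reverts on reboot):

$ echo mq-deadline | sudo tee /sys/block/nvme0n1/queue/scheduler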

Theories / Questions:

  • Is the primary SSD just broken?
  • Does it have to do with a limited number of PCIe lanes in the system?

Tried

  • I tried kernel 6.8.0-40-generic (the default on my system) as well as 6.5.0-45; both behave the same in this regard.

EDIT1 (as per @ubfan1's comment)

  • TRIM seems to run more often on the second device (the one without the problem). On the primary device it does not run as often; is this good or bad? (Note: the root file system lives on /dev/mapper/vgubuntu-root, so grepping for nvme0 only shows the /boot and /boot/efi trims.)
$ journalctl |grep "fstrim\["|grep nvme0
Jul 01 08:32:31 example fstrim[27758]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Jul 01 08:32:31 example fstrim[27758]: /boot: 1.2 GiB (1269686272 bytes) trimmed on /dev/nvme0n1p2
Jul 08 09:53:27 example fstrim[50828]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Jul 08 09:53:27 example fstrim[50828]: /boot: 1.2 GiB (1269567488 bytes) trimmed on /dev/nvme0n1p2
Jul 15 09:43:17 example fstrim[46889]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Jul 15 09:43:17 example fstrim[46889]: /boot: 1.2 GiB (1269567488 bytes) trimmed on /dev/nvme0n1p2
Jul 22 12:33:55 example fstrim[14093]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Jul 22 12:33:55 example fstrim[14093]: /boot: 1.2 GiB (1267404800 bytes) trimmed on /dev/nvme0n1p2
Jul 29 10:15:54 example fstrim[63916]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Jul 29 10:15:54 example fstrim[63916]: /boot: 1.2 GiB (1267404800 bytes) trimmed on /dev/nvme0n1p2
Aug 05 09:37:39 example fstrim[46612]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Aug 05 09:37:39 example fstrim[46612]: /boot: 1.2 GiB (1269469184 bytes) trimmed on /dev/nvme0n1p2
  • Both devices are built in, not connected via USB

  • smartctl output for the primary SSD:

smartctl 7.2 2020-12-30 r5155 [x86_64-linux-6.8.0-40-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KBG40ZNT512G TOSHIBA MEMORY
Serial Number:                      30TPxxxxxxx
Firmware Version:                   0109AELA
PCI Vendor/Subsystem ID:            0x1e0f
IEEE OUI Identifier:                0x8ce38e
Total NVM Capacity:                 512.110.190.592 [512 GB]
Unallocated NVM Capacity:           0
Controller ID:                      0
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512.110.190.592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            8ce38e 0400911b0e
Local Time is:                      Tue Aug 20 08:58:37 2024 CEST
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x001f):   Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x0e):         Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     86 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.50W       -        -    0  0  0  0        1       1
 1 +     2.60W       -        -    1  1  1  1        1       1
 2 +     1.80W       -        -    2  2  2  2        1       1
 3 -   0.0500W       -        -    4  4  4  4      800    1200
 4 -   0.0050W       -        -    4  4  4  4     3000   32000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         3
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        29 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    12%
Data Units Read:                    41.196.433 [21,0 TB]
Data Units Written:                 54.357.811 [27,8 TB]
Host Read Commands:                 792.033.014
Host Write Commands:                1.093.950.301
Controller Busy Time:               10.506
Power Cycles:                       3.308
Power On Hours:                     6.391
Unsafe Shutdowns:                   4
Media and Data Integrity Errors:    0
Error Information Log Entries:      44
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               29 Celsius

Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged


2 Answers


It looks like the OS is telling rsync that the copy is complete, but in reality it hasn't finished yet. The utilization therefore spikes afterwards, when the actual write to disk happens, and on the same disk as your OS, no less.

This looks like write-back caching, where a write is confirmed as complete as soon as the intention to write is entered into the cache. Your SSDs don't have a DRAM cache, so this likely means the ~3.3 GB file is being cached in RAM first.

How much RAM does your system have? If you have 4 GB of RAM, Ubuntu uses ~1.3 GB and the file is ~3.3 GB, which together is more than 4 GB. Perhaps there's some conflict between paging (write cache) to the same disk you're writing to from RAM?
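
If this theory is worth checking, here is a minimal sketch using stock tools: free shows the total RAM, the vm.dirty_* sysctls bound how much dirty data the kernel will buffer before forcing write-back, and /proc/meminfo shows the current backlog live:

$ free -h
$ sysctl vm.dirty_ratio vm.dirty_background_ratio
$ watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'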

If that is not the case and you have more than enough RAM, perhaps consider using the faster disk for your OS and disk-image storage? The Crucial (M.2 2280) is faster than the Toshiba (M.2 2242).

Crucial: https://www.harddrivebenchmark.net/hdd.php?hdd=Crucial%20P3%20500GB
Toshiba: https://www.harddrivebenchmark.net/hdd.php?hdd=KBG40ZNS512G%20NVMe%20TOSHIBA%20512GB

Checking your specs, I don't see any issues; both NVMe slots appear to be PCIe 3.0 x4 (likely sharing the same limited PCIe lanes, though, yes). I can only imagine that the problem is either a feature of the SSD (perhaps there is built-in encryption offload on the Crucial) or simply the IO capabilities of the Toshiba itself. Otherwise, I'd expect the IO over the PCIe bus to be scheduled equitably.
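
To verify the negotiated link for each controller, the PCIe speed and width are exposed in sysfs; a sketch assuming the kernel's standard /sys/class/nvme layout:

for d in /sys/class/nvme/nvme*/device; do
  # current_link_speed / current_link_width are standard PCI sysfs attributes
  echo "$d: $(cat "$d/current_link_speed"), x$(cat "$d/current_link_width")"
done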

As for PSI, you might want to narrow down which parameter it's referencing. The chart is color coded, which in your example means different points of contention, but I don't see a legend mapping each color to a pressure point. However, based on your script and the CSV output, the last two columns are highest, and those are both IO-based, so the issue appears to be IO related (see the per-device sketch below). In the atop output, the encryption process seems to be the busiest, which is why I pointed to some sort of SSD feature set helping with IO.
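
To pin down which device is saturated while the pressure is high, a per-device view helps; for example iostat from the sysstat package (a sketch, run in a separate terminal alongside the copy):

$ iostat -x 1 nvme0n1 nvme1n1

If the deferred write-back is the culprit, the %util column should stay near 100% on the affected device for the whole minute after rsync returns.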

I can only suggest attempting the same operations without disk encryption to see if that relaxes the IO penalty/overhead, but that's your OS disk, so I can't recommend leaving it in that state.
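
A non-destructive middle ground is to benchmark the kernel's dm-crypt ciphers in memory, which at least shows whether CPU-side encryption could be the bottleneck:

$ sudo cryptsetup benchmark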

I'm also wondering whether the problem would follow the disk or the slot if you swapped the disks, but I imagine you can't do that, since they are likely 80 mm and 42 mm, respectively.


I think the issue was that TRIM never ran. After setting it up (https://askubuntu.com/a/122007/39966) and running

$ sudo fstrim --all --verbose
/boot/efi: 504,9 MiB (529432576 bytes) trimmed on /dev/nvme1n1p1
/boot: 1,2 GiB (1265078272 bytes) trimmed on /dev/nvme1n1p2
/: 89,6 GiB (96226893824 bytes) trimmed on /dev/mapper/vgubuntu-root

which took around 25 minutes to complete, I get the following performance during and after the rsync: IO PSI stays much lower and the system stays responsive.

1,1.43,0.27,0.00,0.00
2,1.43,0.27,0.00,0.00
3,1.43,0.22,0.00,0.00
4,1.39,0.22,0.00,0.00
5,1.39,0.18,0.00,0.00
6,1.39,0.18,0.00,0.00
7,1.39,0.14,0.00,0.00
start rsync
8,1.39,0.14,0.00,0.00
9,1.44,0.12,5.61,5.61
10,1.44,0.12,5.61,5.61
11,1.44,0.10,7.85,7.67
12,1.44,0.10,7.85,7.67
13,1.44,0.08,7.70,7.55
14,1.33,0.08,7.70,7.55
15,1.33,0.06,7.75,7.63
16,1.33,0.06,7.75,7.63
17,1.33,0.05,7.62,7.52
18,1.62,0.05,7.62,7.52
19,1.62,0.40,11.49,11.23
20,1.62,0.40,11.49,11.23
21,1.62,0.33,10.86,10.64
22,1.62,0.33,10.86,10.64
23,2.29,0.45,9.79,9.62
24,2.29,0.45,9.79,9.62
25,2.29,0.91,19.25,17.84
26,2.29,0.91,19.25,17.84
27,2.29,0.93,26.81,24.39
28,2.99,0.93,26.81,24.39
29,2.99,0.76,39.71,37.54
30,2.99,0.76,39.71,37.54
31,2.99,0.98,45.74,42.15
end rsync
32,2.99,0.98,45.74,42.15
33,2.75,0.80,39.44,36.33
34,2.75,0.80,39.44,29.75
35,2.75,0.66,32.30,29.75
36,2.75,0.66,32.30,24.36
37,2.75,0.54,26.45,24.36
38,2.61,0.54,21.65,19.94
39,2.61,0.62,21.65,19.94
40,2.61,2.14,17.73,16.33
41,2.61,2.14,17.73,16.33
42,2.61,2.47,14.52,13.37
43,2.48,2.47,14.52,13.37
44,2.48,2.21,11.89,10.95
45,2.48,2.21,11.89,10.95
46,2.48,1.99,9.73,8.96
47,2.48,1.99,9.73,8.96
48,2.28,1.81,7.97,7.34
49,2.28,1.81,7.97,7.34
50,2.28,1.66,7.25,6.55
51,2.28,1.66,7.25,6.55
52,2.28,1.36,12.82,12.07
53,2.34,1.36,12.82,12.07
54,2.34,1.11,23.54,22.20
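
For completeness, to keep TRIM running regularly from now on, enabling the systemd timer should be enough; a sketch (the crypttab detail is an assumption about a typical LUKS setup like this one):

$ sudo systemctl enable --now fstrim.timer
$ systemctl list-timers fstrim.timer

Note that for the discards to reach the SSD through dm-crypt, they must be allowed, e.g. via the discard option in /etc/crypttab.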