I'm running Ubuntu 22.04.2 LTS on a Supermicro system with the following configuration:
- Intel Xeon Gold 6248R
- 256GB RAM
- Samsung PM1735 NVMe SSD (6.4 TB, current firmware: EPK98B5Q)
- XFS filesystem
TRIM consistently freezes the whole machine every time fstrim.timer runs it. The large amount trimmed on each run is also concerning. I can't monitor the output with fstrim -v, since running it manually would cause another freeze. Here's the journal output:
Apr 20 00:48:35 hostname fstrim[1361579]: /mnt: 3.8 TiB (4144756957184 bytes) trimmed on /dev/nvme0n1
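For reference, the fstrim(8) man page notes that fstrim reports the same potential discard bytes on every run, so the figure above may simply track the free space on the filesystem rather than the data actually discarded. A minimal sketch for comparing the two, assuming GNU df and the /mnt mount point from the log (bytes_to_tib is a hypothetical helper name, not an existing tool):

```shell
#!/bin/sh
# bytes_to_tib is a hypothetical helper: convert a byte count to TiB, one decimal.
bytes_to_tib() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.1f\n", b / (1024 ^ 4) }'
}

# Byte count copied from the journal line above.
trimmed=4144756957184
echo "trimmed: $(bytes_to_tib "$trimmed") TiB"

# Space available on the mount point, in bytes (GNU df; /mnt is taken from the log).
free=$(df -B1 --output=avail /mnt 2>/dev/null | tail -n 1)
if [ -n "$free" ]; then
    echo "free:    $(bytes_to_tib "$free") TiB"
fi
```

If the two numbers are roughly equal, the reported amount is most likely just the current free space.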
Running hdparm -I on the device doesn't produce any useful output:
root@hostname ~ # hdparm -I /dev/nvme0n1
/dev/nvme0n1:
I've read various guides on how to check whether TRIM is supported by the NVMe drive. I'd like to simply disable it, but I'm not sure of the consequences. Should it ever be disabled?
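hdparm targets ATA/SATA devices, which may be why it prints nothing useful for an NVMe drive. One alternative is to read the block-layer limits from sysfs, where the kernel documents a discard_max_bytes value of 0 as meaning the device has no discard support (lsblk --discard shows the same information). A minimal sketch, in which discard_supported and SYSFS_ROOT are hypothetical names introduced for illustration:

```shell
#!/bin/sh
# discard_supported is a hypothetical helper.
# Returns 0 if the device advertises discard (TRIM) support, 1 if it does not,
# and 2 if the sysfs attribute cannot be read.
# SYSFS_ROOT is a hypothetical override so the check can be tried on a fake tree.
discard_supported() {
    file="${SYSFS_ROOT:-/sys/block}/$1/queue/discard_max_bytes"
    [ -r "$file" ] || return 2
    max=$(cat "$file")
    [ "$max" -gt 0 ]
}

# Device name taken from the post.
if discard_supported nvme0n1; then
    echo "nvme0n1: discard supported"
else
    echo "nvme0n1: discard not supported, or device not found"
fi
```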
EDIT: After a lot of testing and monitoring, we have concluded that the TRIM operation is not the root cause of this problem but rather the trigger. The system runs LXD virtualization with I/O usage constantly at 85-99%, while writing only 70 MB/s and reading 5-10 MB/s. When TRIM starts, the additional load overwhelms the NVMe drive and the system crashes.
Using iotop and atop [-Ddp] showed that glusterfs is causing these high I/O numbers. After searching for hours, I couldn't find anything close to this issue; every other thread on the internet discusses the networking aspect instead.
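To put numbers on what iotop and atop show, the per-process counters in /proc/<pid>/io can be sampled directly. A rough sketch, assuming root privileges and that the Gluster processes match the pgrep pattern gluster; proc_write_bytes, write_delta and PROC_ROOT are hypothetical names introduced for illustration:

```shell
#!/bin/sh
# PROC_ROOT is a hypothetical override so the helpers can be tried on a fake tree.

# Print the cumulative write_bytes counter of a process (hypothetical helper).
proc_write_bytes() {
    awk '/^write_bytes:/ { print $2 }' "${PROC_ROOT:-/proc}/$1/io"
}

# Print how many bytes a process sent to storage during an interval in seconds
# (hypothetical helper).
write_delta() {
    before=$(proc_write_bytes "$1") || return 1
    sleep "$2"
    after=$(proc_write_bytes "$1") || return 1
    echo $((after - before))
}

# Sample every process whose name matches "gluster" for 5 seconds each.
for pid in $(pgrep gluster 2>/dev/null); do
    echo "$pid: $(write_delta "$pid" 5) bytes written in 5 s"
done
```

Comparing these deltas with the device-level throughput gives a rough idea of how much of the write load comes from the Gluster processes.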
We compared the LXD system with a host that uses OpenVZ containers and runs the same configuration. It shows no I/O issues whatsoever.
Any ideas on how to tweak the gluster performance?