Summary of Issue
After enabling the LUKS / dm-crypt full-disk-encryption option available through the Ubuntu installer, disk I/O performance is absolutely abysmal. Writing to the disk stalls / freezes the system. Data read from the disk appears to be corrupted.
If I don't use LUKS / dm-crypt then I don't have any problems at all. Everything is perfectly stable and performant. I understand that encryption has a performance hit. I expect lower performance, not minutes-long system freezes and data corruption.
I've never had so many issues with a completely clean Ubuntu installation before. I'm fine with being wrong about something. I just want my stuff to work!
Both systems listed below are affected by this issue. All experimentation happened on the Ryzen system. The i5 has been doing basically nothing for 2 years so I just never noticed the problem until now.
System #1 (running mostly idle for about 2 years)
- Ubuntu Server 20.04.3 LTS
- Intel i5-3570
- 8GB RAM, non-ECC
- Kingston 120GB A400 SATA SSD
- No errors reported by Memtest86+ or Prime95
System #2 (new system, where problem was first discovered)
- Ubuntu Server 22.04 LTS
- AMD Ryzen 5 5600
- 32GB RAM, ECC
- Kingston 120GB A400 SATA SSD
- No errors reported by Memtest86+ or Prime95
Steps to Reproduce
- Install Ubuntu 20.04 LTS or 22.04 LTS
- During installation, when setting up the disk, choose the following options:
  - Use an entire disk
  - Set up this disk as an LVM group
  - Encrypt the LVM group with LUKS
  - Expand the partition containing `/` so it takes up all available free space in the LVM group
- After installation and boot, use SSH / Samba / USB / whatever to transfer a large file to the OS disk
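For anyone who wants to reproduce the write stall without a second machine, here is a minimal sketch that generates a large file locally and times it. It only approximates the SSH / Samba transfer described above, since the data comes from `/dev/urandom` instead of the network. The function name `write_test_file` and the target path in the usage line are hypothetical, and it assumes GNU `dd` as shipped with Ubuntu.

```shell
#!/bin/sh
# write_test_file PATH SIZE_MIB   (hypothetical helper, not from the original post)
# Writes SIZE_MIB MiB of random data to PATH, flushes it to disk before
# returning, and prints the elapsed wall-clock time in seconds.
write_test_file() {
    path=$1
    size_mib=$2
    start=$(date +%s)
    # iflag=fullblock avoids short reads from /dev/urandom;
    # conv=fsync makes dd wait until the data has reached the disk.
    dd if=/dev/urandom of="$path" bs=1M count="$size_mib" \
        iflag=fullblock conv=fsync status=none || return 1
    end=$(date +%s)
    echo "wrote ${size_mib} MiB to ${path} in $((end - start)) s"
}

# Example (path is a placeholder; 8192 MiB is above the ~6GB mark where
# the freezes were observed):
#   write_test_file "$HOME/bigfile.bin" 8192
```

Watching `iotop` in a second terminal while this runs should show whether a kcryptd worker gets stuck the same way it does during a network transfer.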
Expectation
- Write big files (greater than ~6GB) to the disk without the system freezing
- Read files back from the disk and have them not be corrupted
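The second expectation can be checked mechanically by comparing digests of the original file and the copy that landed on the encrypted disk. This is a rough sketch, not a definitive test: the function name `verify_copy` and the paths in the usage line are hypothetical, and it assumes `sha256sum` from coreutils is available.

```shell
#!/bin/sh
# verify_copy SRC DST   (hypothetical helper, not from the original post)
# Returns 0 if both files have the same SHA-256 digest, 1 if they differ,
# and 2 if either file could not be read.
verify_copy() {
    # Reading via stdin makes sha256sum print "<digest>  -" for both files,
    # so the two outputs can be compared as plain strings.
    src_sum=$(sha256sum < "$1") || return 2
    dst_sum=$(sha256sum < "$2") || return 2
    if [ "$src_sum" = "$dst_sum" ]; then
        echo "OK: digests match"
        return 0
    fi
    echo "MISMATCH: $1 and $2 differ" >&2
    return 1
}

# Example (paths are placeholders):
#   verify_copy /mnt/usb/vm-image.vdi "$HOME/vm-image.vdi"
#
# A freshly written file may be served from the page cache instead of the
# disk. To force a real read, drop the caches first (requires root):
#   sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
```

A mismatch right after a clean copy would confirm that the corruption happens on the write or read path and not inside the application using the file.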
Reality
All of these issues were found with the Ryzen system. I tested heavy I/O load on the i5 system once and was able to reproduce the issue. I'm not brave enough to push it further, lest I corrupt the OS disk and have to rebuild it.
- Writing large files freezes the system to the point where only console echo works. Commands don't run. Even `ls` won't return anything. SSH transfers stall, time out, and fail.
- iotop says at least one kcryptd worker thread hits 99% IO load and then hangs there for several minutes (feels like 2-3 minutes)
- Large files read back from the disk appear to be corrupted. I moved a VM image over and it wasn't able to run for more than a few seconds without crashing out due to internal file system damage. After a few reboots `apt` started complaining about broken packages. The network connection stopped coming up. Eventually the system threw a kernel panic and I gave up.
- Oddly enough, `reboot` doesn't actually reboot. The system will hang with a black screen after shutting down the OS. Lights and fans stay on. The chassis reset button doesn't work in this state. I have to pull the power cable out of the wall to get things going again.
Please note that none of these issues occur when the OS is installed without LUKS / dm-crypt underneath. This includes the odd issue with the hung reboots.
Also note that I tried running Windows 10 + BitLocker on the Ryzen system and it had zero issues.
Additional Info
I did all of this on the new Ryzen system with Ubuntu 22.04 LTS.
- I tried setting `cryptsetup --allow-discards --perf-no_read_workqueue --perf-no_write_workqueue --persistent refresh` thinking that this was caused by some weirdness with the cheap SSD. It helped write performance, but reads still appear corrupted.
- I tried a full clean reinstall without any extra applications. Just the base OS and iotop. No updates. The problem persists.
- I swapped the Kingston SSD for a known-good 7200RPM spinning hard disk. SATA 2.5" 320GB non-SMR. Full clean reinstall. The problem persists.
- I swapped the Kingston SSD for a known-good NVMe drive, Samsung SSD 970 EVO Plus. Full clean reinstall. The problem persists.
- I replaced the SATA cables, even though everything works fine without encryption. The problem persists.
- All drives involved in this mess have passed badblocks and SMART tests.
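To confirm that the `cryptsetup` flags from the first bullet actually took effect, the active device-mapper table can be inspected. The sketch below assumes the dm-crypt table format documented by the kernel, where optional parameters such as `allow_discards`, `no_read_workqueue`, and `no_write_workqueue` appear at the end of the line. The function name `crypt_flags` and the `MAPPING` variable are hypothetical; substitute the real mapping name shown by `lsblk` or `dmsetup ls`.

```shell
#!/bin/sh
# crypt_flags   (hypothetical helper, not from the original post)
# Reads `dmsetup table` lines on stdin and prints, one per line, each of the
# three dm-crypt optional parameters of interest that appears in them.
crypt_flags() {
    while IFS= read -r line; do
        for flag in allow_discards no_read_workqueue no_write_workqueue; do
            # Pad with spaces so only whole words match.
            case " $line " in
                *" $flag "*) printf '%s\n' "$flag" ;;
            esac
        done
    done
}

# Example (requires root; MAPPING is a placeholder for the LUKS mapping name):
#   sudo dmsetup table "$MAPPING" | crypt_flags
```

If a flag is missing from the output after a reboot, the `--persistent` setting was probably not stored in the LUKS2 header, which would at least explain why the write-side improvement comes and goes.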
At this point I'm seriously considering moving back to Windows 10 + BitLocker because I don't know what else to do.
Links
- "How do I troubleshoot a disk IO performance issue possibly related to dm-crypt/LUKS?" is where I got the `cryptsetup` advice shown above.
- https://ubuntuforums.org/showthread.php?t=2474486 - my massive wall-of-text post on Ubuntu Forums. Please note that I make many references to VirtualBox in that thread, but VirtualBox is not the cause, as evidenced by the fact that the problem still persists on a clean installation of Ubuntu 22.04 without VirtualBox installed. I just happened to be working with VirtualBox when the problem appeared.