
I'm running Ubuntu 22.04.2 LTS on a Supermicro system with the following configuration:

  • Intel Xeon Gold 6248R
  • 256GB RAM
  • Samsung PM1735 NVMe SSD (6.4 TB, current firmware: EPK98B5Q)
  • XFS filesystem

TRIM causes a consistent freeze of the whole machine every time fstrim.timer runs it. The large amounts reported as trimmed each run are also concerning. Monitoring the output of fstrim -v is not an option, since running it manually would risk another freeze. Here's the journal output:

Apr 20 00:48:35 hostname fstrim[1361579]: /mnt: 3.8 TiB (4144756957184 bytes) trimmed on /dev/nvme0n1
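As far as I understand, the number fstrim logs is the total size of the free ranges it submitted for discard, not data that was physically erased, which is why it can look alarmingly large on a big, mostly-free filesystem. A quick sanity check of the units in that journal line (the byte value is taken from the log above):

```shell
# Sanity-check the units in the journal line: bytes divided by 2^40 should
# reproduce the TiB figure fstrim printed.
bytes=4144756957184                       # value from the journal entry
tib=$(awk -v b="$bytes" 'BEGIN { printf "%.1f", b / (1024 ^ 4) }')
echo "$tib TiB"                           # matches the "3.8 TiB" in the log
```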

Running hdparm -I /dev/nvme0n1 doesn't produce any useful output (hdparm speaks ATA, so this is expected for an NVMe drive):

root@hostname ~ # hdparm -I /dev/nvme0n1

/dev/nvme0n1:

I've read various guides on how to check whether TRIM is supported by the NVMe drive. I'd like to simply disable fstrim, but I'm not sure about the consequences. Should it ever be disabled?
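For reference, here is a sketch of how I'd check discard support without triggering an actual trim, plus the (reversible) way to stop the timer. It assumes nvme-cli may or may not be installed, and the actual disable command is left commented out:

```shell
DEV=/dev/nvme0n1

# Non-zero DISC-GRAN / DISC-MAX columns mean the block layer accepts
# discards for this device.
lsblk --discard "$DEV" 2>/dev/null || echo "$DEV not present on this host"

# With nvme-cli installed, the controller's "oncs" (Optional NVM Command
# Support) field shows TRIM support directly: bit 2 set means the Dataset
# Management (deallocate) command is supported.
{ command -v nvme >/dev/null && nvme id-ctrl "$DEV" 2>/dev/null; } | grep -i oncs \
  || echo "oncs not readable here (nvme-cli missing or no such device)"

# To stop the periodic trim without uninstalling anything (reversible with
# "systemctl enable --now fstrim.timer"):
# systemctl disable --now fstrim.timer
```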

EDIT: After a lot of testing and monitoring, we have concluded that the TRIM operation is not the root cause of this problem, but rather a victim of it. The system runs LXD virtualization with I/O utilization constantly at 85-99%, while writing only ~70 MB/s and reading 5-10 MB/s. When TRIM starts, the system freezes because the NVMe is already saturated.

Using iotop and atop -Ddp showed that glusterfs is causing these high I/O numbers. After searching for hours, I couldn't find anything close to this issue; every other thread on the internet discusses the networking aspect.

We compared the LXD system above with a host running the same configuration in OpenVZ containers. No I/O issues whatsoever.

Any ideas on how to tweak the gluster performance?

  • Does selecting a minimum size to trim with -m speed things up (it ignores small free extents)? If a large initial part of the filesystem remains unchanged, maybe use -l to skip it. Consider reducing the frequency of the trim (change the cron?). I wouldn't turn it off completely. – ubfan1 Apr 22 '24 at 15:53
  • Hi, thank you for sharing some ideas. We've already tried reducing the frequency, which only resulted in more frequent freezes. Our theory is that the longer we wait between trims, the more traffic has accumulated, hence more blocks need to be trimmed. It's a never-ending story. I'll play around with the parameters and report back. – eftotheoh Apr 23 '24 at 11:54
  • Look at the journalctl -l output (grep for fstrim). The weekly trim on my 1 TB SSD takes just over a second of CPU time (60 GB + 20 GB trimmed). Do you have disk activity indications (a light?), because CPU should not be a big factor. I have seen trimmed-bytes reports that seem way too big, like in the petabytes, so 3.8 TiB isn't in that category. Check for any firmware updates for your motherboard and SSD. – ubfan1 Apr 23 '24 at 16:00
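Putting the suggestions from the comments together, this is roughly what I plan to try. The 1 GiB minimum is an arbitrary starting point, not a recommendation, and given the freeze risk the fstrim command is only echoed here rather than executed:

```shell
# -m/--minimum skips free extents smaller than the threshold,
# -v/--verbose reports what was trimmed.
cmd="fstrim --minimum 1G --verbose /mnt"
echo "$cmd"   # echoed, not run: executing it may freeze the box again

# To thin out the schedule instead of disabling it, override the timer
# (default is weekly) via "systemctl edit fstrim.timer" and add:
#   [Timer]
#   OnCalendar=
#   OnCalendar=monthly
```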

0 Answers