I'm facing a 100% usage issue on the root partition (/dev/sda2) on my Ubuntu server, despite no obvious large files, and it's driving me insane:

$ df -h /
Filesystem      Size  Used Avail Use% Mounted on  
/dev/sda2       996G  976G     0 100% /

Yet du shows only 521G in use:

$ du -xh --max-depth=1 / | sort -hr | head -n 5

521G    /
466G    /home
25G     /var
12G     /usr
11G     /root

That leaves 455G used by something I can't account for.
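For reference, the 455G figure is simply the "Used" column from df minus the total from du. A minimal sketch of that arithmetic, where `usage_gap_gib` is a hypothetical helper name and the inputs are the rounded values shown above:

```shell
# Hypothetical helper: gap between df "Used" and the du total, both in whole GiB.
usage_gap_gib() {
    df_used_gib=$1
    du_total_gib=$2
    echo $(( df_used_gib - du_total_gib ))
}

# Rounded figures taken from the df and du output above.
usage_gap_gib 976 521
```

Because both inputs are rounded by `-h`, the result is only accurate to within a GiB or so.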

This issue prevented my users from accessing their sessions, so I cleaned up some users in /home, but the problem will creep back.

What I have already checked:

Docker images and volumes are healthy:

$ docker system df
TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          12        5         15.84GB   8.503GB (53%)
Containers      6         0         297.5kB   297.5kB (100%)
Local Volumes   4         3         72.06MB   25.08kB (0%)
Build Cache     0         0         0B        0B

Snap packages: removed old disabled versions with sudo snap remove --purge, got me 3G of headroom (yeah!).

Journal logs: only ~352MB used.

/var/lib/snapd is around 4.7G.

lsof | grep '(deleted)' shows only PulseAudio memfd buffers (not disk files)

$ lsof | grep '(deleted)'
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
      Output information may be incomplete.
lsof: WARNING: can't stat() nsfs file system /var/snap/lxd/common/ns/shmounts
      Output information may be incomplete.
lsof: WARNING: can't stat() nsfs file system /var/snap/lxd/common/ns/mntns
      Output information may be incomplete.
pulseaudi 210380                    administrateur    6u      REG                0,1 67108864    5195534 /memfd:pulseaudio (deleted)
pulseaudi 210380 210419 null-sink   administrateur    6u      REG                0,1 67108864    5195534 /memfd:pulseaudio (deleted)
pulseaudi 210380 210422 snapd-gli   administrateur    6u      REG                0,1 67108864    5195534 /memfd:pulseaudio (deleted)

I have a large mount under /mnt/NASLABO2, but this is a remote NAS mount and should not affect the local disk, right?

//192.168.26.102/Backup_Linux  118T   73T   46T  62% /mnt/NASLABO2

I moved a lot of my data from my main partition to the larger partition shown below, but I don't want to touch the /home of my main group of users yet.

/dev/sda3                      31T  5,4T   25T  19% /TDL
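One thing `du -x /` cannot see is data stored on the root filesystem underneath a mount point, for example files written into /mnt/NASLABO2 or /TDL at a time when nothing was mounted there. This is a possible check rather than a confirmed diagnosis. The sketch below bind-mounts / to a scratch directory so those hidden directories become visible; `inspect_root_only` and `/mnt/rootonly` are hypothetical names, and it must be run as root:

```shell
# Sketch only: look at the root filesystem without anything mounted on top of it.
# Assumptions: run as root; /mnt/rootonly is a hypothetical scratch directory.
inspect_root_only() {
    view=${1:-/mnt/rootonly}
    mkdir -p "$view" || return 1
    # A plain (non-recursive) bind mount exposes / alone, without its submounts.
    mount --bind / "$view" || return 1
    # Sizes listed here include anything hidden beneath active mount points.
    du -xh --max-depth=2 "$view" 2>/dev/null | sort -hr | head -n 20
    umount "$view"
}
```

Calling `inspect_root_only` and comparing the entries for `mnt` and `TDL` under the scratch directory against the earlier du output would show whether any local data sits under those mount points.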

I even had some downtime for an fsck, which did not find anything.

Something interesting a user told me: the hidden data appeared years ago, after a power failure and a hard reboot.

Since this is a critical server where downtime must be minimized, I'm looking for a more permanent solution. Any idea what could cause this issue, and how I could fix it?

Cheers!

Artur Meinild
Adrien T