Sunday, December 2, 2007

df and du issues

I received a call from one of my users today, and he mentioned that the /var file system utilization reported by df did not match the output from du. I logged into the box to see what was going on, and ran the df and du commands to see how much space was being used:

$ df -h /var
Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d3         3.9G   2.0G   1.8G    53%    /var

$ cd /var && du -sk .
302898

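Once I saw this information, I realized that a file had most likely been unlinked from the file system but was still held open by one or more processes. du only counts files it can reach through directory entries, while df reports the blocks actually allocated in the file system, so an unlinked file that is still open shows up in df but not in du. To see which process was responsible for this annoyance, I used the lsof "+L1" option to list open files with a link count of zero: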

$ lsof +L1
COMMAND   PID USER  FD  TYPE DEVICE   SIZE/OFF NLINK NODE NAME
evhandsd 1424 root  3w  VREG   85,3     897032     0 7404 /var (/dev/md/dsk/d3)
syslogd  1818 root 14w  VREG   85,3 1884238513     0 6803 /var (/dev/md/dsk/d3)

[ … ]
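
For anyone who wants to see the underlying behavior in isolation, here is a minimal sketch. It is not what was run on this box; the temporary directory and the demo.log file name are made up for the illustration, and it assumes a shell with mktemp available. It holds a file open on descriptor 3, removes the name, and shows that the data is still reachable through the descriptor:

```shell
#!/bin/sh
# Sketch: an unlinked file stays allocated while a descriptor is open.
# The directory and file name are hypothetical, created just for this demo.
tmpdir=$(mktemp -d)
f="$tmpdir/demo.log"

echo "first line" > "$f"
exec 3<>"$f"          # hold the file open (read/write) on descriptor 3
rm "$f"               # unlink the name; the inode survives while fd 3 is open

# The name is gone, so du (which walks directory entries) cannot see the file...
if [ -e "$f" ]; then exists=yes; else exists=no; fi

# ...but the data is still readable through the open descriptor.
read -r line <&3

exec 3>&-             # closing the last descriptor lets the blocks be freed
rmdir "$tmpdir"

echo "name exists: $exists, data via fd 3: $line"
```

Closing the descriptor at the end plays the same role as stopping the process that holds the file open: once the last reference is gone, the file system can release the space.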

Errrr — based on this information, it looks like syslogd has a 1.8GB logfile open with a link count of zero (I wish I had process accounting running to see which process unlinked this file out from under syslogd). To fix this issue and synchronize the df and du output, I restarted syslogd:

$ /etc/init.d/syslogd stop && /etc/init.d/syslogd start

Restarting syslogd closed its descriptor on the unlinked file, which allowed the file to go away and the df and du output to match:

$ df -h /var
Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d3         3.9G   302M   3.6G     8%    /var

$ du -sk /var
300668 /var
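
As a rough cross-check, the figures above add up. The sketch below uses only numbers shown in this post, and it rests on a few assumptions: that the SIZE/OFF column holds file sizes in bytes, that the files are not sparse, and that the entries elided from the lsof listing are small. The variable names are mine.

```shell
#!/bin/sh
# Cross-check using only figures shown in the post.
du_kb=302898            # du -sk . before the restart, in KB
evhandsd_b=897032       # SIZE/OFF of evhandsd's unlinked file, assumed bytes
syslogd_b=1884238513    # SIZE/OFF of syslogd's unlinked file, assumed bytes

# Convert the unlinked files to KB, add what du could see, then express in MB.
unlinked_kb=$(( (evhandsd_b + syslogd_b) / 1024 ))
total_mb=$(( (du_kb + unlinked_kb) / 1024 ))

echo "du total plus unlinked files: ${total_mb} MB"
```

That works out to roughly 2.0 GB, which is in line with the 2.0G that df reported as used before syslogd was restarted.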

This little exercise reminded me how awesome lsof is.

