Tuesday, September 18, 2012

Logrotate fills up root partition

Production machines were dropping dead. A few minutes (or sometimes seconds) before death, I received alerts that the root partition was full. Then poof, access to the machine was gone forever.

Logrotate ended up being the culprit. Even though our big log producer (Varnish) stored its access logs on a separate, large partition apart from root, logrotate still used /tmp to compress logs. It would attempt to rotate a log several times larger than the root partition and then croak.
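A quick way to spot this failure mode ahead of time is to compare the size of the live log with the free space on the filesystem holding /tmp. The following is only a rough sketch: the function names are made up for illustration, and it assumes a POSIX shell with the standard df, du, and awk.

```shell
# Free space, in KB, on the filesystem that holds the given directory.
free_kb() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Disk usage of a single file, in KB.
file_kb() {
    du -k "$1" | awk '{print $1}'
}

# Succeeds (exit 0) when the file is at least as large as the free space
# in the directory, i.e. when a temporary copy there could fill the disk.
would_overflow() {
    [ "$(file_kb "$1")" -ge "$(free_kb "$2")" ]
}
```

For example, `would_overflow /path/to/varnish/access.log /tmp && echo "too big for /tmp"`, where the log path is a placeholder for wherever your Varnish logs actually live.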

The solution for us was to keep twice as many rotated logs but cut the time between rotations in half, so each log is smaller when it gets compressed. Rather than try to make logrotate run faster, we just removed the varnish logrotate configuration file from /etc/logrotate.d and ran it from cron every 30 minutes:

*/30 * * * * /usr/sbin/logrotate -f $LOGROTATECONF > /dev/null

This forces a rotation every 30 minutes using our custom varnish logrotate conf, whose path is in $LOGROTATECONF. Within the configuration file we set:

rotate 48

With a rotation every 30 minutes, 48 rotated files effectively keep 24 hours of logs available for debugging.
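For reference, a standalone conf along these lines might look roughly like the sketch below. Only the rotate 48 line comes from our setup as described here; the log path is a placeholder, and compress and missingok are assumptions about what such a file would typically contain.

```
# Hypothetical path; substitute wherever Varnish writes its access log.
/path/to/varnish/access.log {
    # 48 rotations at 30-minute intervals = 24 hours of history.
    rotate 48
    # Assumed: gzip rotated logs.
    compress
    # Assumed: don't error out if the log is missing.
    missingok
}
```

Because the file lives outside /etc/logrotate.d, the regular daily logrotate run never touches it; only the cron entry does.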


