Re: Clean up cache directory

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Sat, 16 Oct 1999 15:52:41 +0200

Robert.Rose@centrelink.gov.au wrote:

> Squid 2.2stable4, Solaris 2.6, on Sun Enterprise 250, 1x400MHz CPU, 1Gb
> RAM, 4x9Gb disks striped with Volume Manager for the cache directory.
>
> cache_dir 30000 64 256

64 isn't quite enough. There will be an imbalance in the first 8 top
level directories. See earlier postings on how to calculate L1.
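The sizing rule behind that can be sketched roughly as follows. This is a back-of-the-envelope Python sketch, not Squid's actual code; the 13 KB default for store_avg_object_size and the target of about 256 files per L2 directory are assumptions based on Squid's defaults:

```python
# Rough L1 sizing for a Squid cache_dir (sketch, not Squid source).
# Assumes Squid's default store_avg_object_size of 13 KB and a
# target of ~256 files per L2 directory.
def l1_needed(cache_dir_mb, l2=256, avg_object_kb=13):
    # Squid allocates file numbers for about twice the estimated
    # number of objects, so plan for 2x.
    file_numbers = 2 * cache_dir_mb * 1024 / avg_object_kb
    # Each L1 directory holds l2 subdirectories of ~256 files each.
    return file_numbers / (l2 * 256)

# The poster's "cache_dir 30000 64 256" works out to roughly 72
# L1 directories needed, so 64 leaves the early ones over-full.
print(l1_needed(30000))
```

For a 30000 MB cache_dir with 256 L2 directories this comes out to about 72, which is why 64 is not quite enough.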

> Squid was restarted a few times in it's first few months of operation,
> each time the cache files started back at number 00000000.

Yes, each time Squid is restarted it begins looking for free file
numbers at 0.

> ready for an increase in the cache.log as above. To my surprise, Squid
> started back at top level directory "00", but didn't start back at
> 00000000, it just kept incrementing to 00400000 and continued on from
> there.

It will go up to something like 481D88 (2 * the estimated number of
objects), then restart at 0.

> When we originally compiled Squid, we didn't change from truncate to
> unlink, we'd carefully planned for inode usage based on 256 files in the
> second level directories. Unfortunately since the old files are truncated,
> not unlinked and Squid isn't reusing them, we've got a lot of zero length
> files scattered through our cache directory tree.

What you need to plan for is twice the estimated number of objects,
regardless of the number of L1 or L2 directories. The number of L1 and
L2 directories then needs to be chosen so that this number of files
can be stored comfortably.

> How can we go about cleaning them up? I'm not keen to do "find /cachedir
> -size 0c -exec rm {}\;",

Recompile Squid to use unlink, not truncate.

> and I'm not keen on stop/starting squid on a regular basis to go back to
> 00000000 and have it produce Size mismatch entries as above.

No, you should not restart Squid only to have it restart its file
numbering.

A probable reason why you get size mismatches is a race where Squid
reuses a file number while the old file is still being truncated. If
Squid is compiled to use truncate, you will see size mismatches where
the on-disk file is smaller than Squid expects; if you are using
unlink, you will instead get open failures (the file isn't there at
all). Both are mostly harmless errors and are treated as cache misses
once detected.

> I realise a recompile to use unlink instead of truncate
> may be the best option, but are there any other options?

a) Regularly run a find as mentioned above.
b) Have the correct number of L1 directories (at least 72 in your case
with a huge single cache directory), and make the filesystem with
enough inodes to house 2 * cache_dir_size / store_avg_object_size
files.
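Option a) boils down to removing the zero-length files that truncate leaves behind. A hypothetical standalone sketch of that cleanup in Python (equivalent in effect to the find command quoted above; test it against a scratch directory before pointing it at a live cache):

```python
import os

def remove_zero_length(cache_root):
    """Remove zero-length files under cache_root, much like
    'find /cachedir -size 0c -exec rm {} \\;' would."""
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) == 0:
                os.remove(path)
                removed += 1
    return removed
```

Note that like the raw find command, this sketch does not distinguish cache swap files from anything else living under the cache directory, so it should only be run over the cache_dir tree itself.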

> What happens when the cache file number eventually gets to FFFFFFFF?

It won't. It will only go up to 2 * cache_dir_size /
store_avg_object_size and then restart at 0.
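As a sanity check on the 481D88 figure mentioned earlier (assuming Squid's default store_avg_object_size of 13 KB):

```python
# 30000 MB cache_dir, 13 KB average object size (Squid's default).
# Twice the estimated object count gives the file number ceiling.
file_number_limit = 2 * 30000 * 1024 // 13
print(hex(file_number_limit))  # 0x481d89, i.e. 481D88 within rounding
```

So for this cache the file numbers wrap around at roughly 4.7 million, nowhere near FFFFFFFF.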

--
Henrik Nordstrom
Squid hacker
Received on Sat Oct 16 1999 - 08:10:16 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:48:55 MST