Re: Feature Request for 3 release

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Thu, 26 Jun 2003 00:41:48 +0200

On Thursday 26 June 2003 00.20, Joel Wiramu Pauling wrote:

> I have one major complaint about squid. I manage an intranet and I
> need to be able to quickly and succinctly see what's in the cache
> and sometimes pull objects from the cache (you would be amazed at
> how many people will download important docs, and have computers
> die...), as well as monitoring unwanted sites etc.

Unfortunately Squid itself does not provide any such view of the
cache. To conserve memory Squid only keeps an abstract MD5 hash of
each URL in memory; the rest is kept on disk only.

MD5 hashes have the very nice property of being fixed in size:
always exactly 16 bytes, regardless of how long the URL is.
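
For illustration, with the Perl Digest::MD5 module (Squid's real
store key also mixes in the request method, so this only sketches
the fixed-size property):

    #!/usr/bin/perl -w
    use strict;
    use Digest::MD5 qw(md5);

    # The binary digest is exactly 16 bytes no matter how long
    # the URL is.
    print length(md5("http://a.b/")), "\n";                      # 16
    print length(md5("http://example.com/" . "x" x 4000)), "\n"; # 16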

URL lengths vary from 9 characters up to 4 KB, with an average
somewhere around 40 characters. If Squid were to keep the full URLs
in memory as well, the already large per-object memory usage would
almost double, effectively halving the cache size a given amount of
RAM can index.

> I have found a great little utility.. but it's slow.. doesn't
> work with a running cache, and its output is hard to control. It's
> called purge (you may or may not be aware of it).
> The author's site is here:
> http://www.cache.dfn.de/DFN-Cache/Development/purge.html (GNU
> I believe)

It does work with a running cache; in fact it is designed to be used
while the cache is running, to remove specified content from it.

Due to the design of the cache, an implementation inside Squid would
be just as slow, and would not only run slowly itself but also
greatly slow down Squid in the process.

> This seems like basic functionality to me, and a lot of people
> agree, that this would be a huge benefit to all squid users out
> there.

I agree with this, but there are technical reasons that make it
unsuitable to do within Squid itself.

A viable approach is to add a second database for this purpose,
running in parallel with Squid and keeping track of the URLs in the
cache. Such a database can be built automatically by following the
store.log log file and feeding the data into a database of your
choice. For following store.log the Perl File::Tail module is very
suitable, but some database design work is probably needed to get a
database that can be searched in interesting ways. A minimal sketch
follows below.
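
As a rough sketch of the idea (the file locations are hypothetical,
and the exact store.log column layout varies between Squid versions,
so this only relies on the action being the second field and the URL
the last one), something like this could keep a Berkeley DB file in
sync with the cache:

    #!/usr/bin/perl -w
    use strict;
    use File::Tail;
    use DB_File;

    # Hypothetical paths -- adjust to your installation.
    my $store_log = '/var/log/squid/store.log';
    my $db_file   = '/var/lib/squid/urls.db';

    # Tied hash backed by a Berkeley DB file: URL => timestamp.
    tie my %urls, 'DB_File', $db_file
        or die "cannot open $db_file: $!";

    my $tail = File::Tail->new(name => $store_log, maxinterval => 5);

    while (defined(my $line = $tail->read)) {
        chomp $line;
        my @f = split ' ', $line;
        next unless @f >= 3;

        my ($time, $action) = @f[0, 1];
        my $url = $f[-1];

        if ($action eq 'SWAPOUT') {        # object written to disk cache
            $urls{$url} = $time;
        } elsif ($action eq 'RELEASE') {   # object removed from cache
            delete $urls{$url};
        }
    }

A tied DB_File hash is just the simplest possible store; SWAPOUT
marks an object being written to the disk cache and RELEASE marks
its removal, so inserting on the one and deleting on the other keeps
the URL list current. Feeding the same events into a real SQL
database instead would allow the kind of searching mentioned above.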

Regards
Henrik
Received on Wed Jun 25 2003 - 16:42:27 MDT
