RE: [SQU] Writing a script to view and download the contents of proxy cache?

From: Robert Collins <robert.collins@dont-contact.us>
Date: Mon, 19 Feb 2001 11:22:02 +1100

This is the wrong way to go about this. To find out what's in the cache
dirs you need to query squid itself. Via the cache_object:// protocol
you can retrieve a list of every object in the cache. It is machine
generated and should be easily parsable. In fact I believe that recent
versions of squid allow access to this via SNMP as well.

Once you've parsed the entire cache object list, you can of course query
your local copy. That list includes swap file details IIRC.
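
A minimal sketch of such a query, assuming squid listens on localhost
port 3128 and that the cache manager ACLs in squid.conf allow the
request. The function names are hypothetical; "objects" is the cache
manager page that lists stored objects. The listing format is not
reproduced here, so the sketch stops at returning the raw response text.

```python
import socket


def build_mgr_request(host="localhost", page="objects"):
    """Build the raw HTTP/1.0 request for a cache manager page."""
    request = "GET cache_object://%s/%s HTTP/1.0\r\nAccept: */*\r\n\r\n" % (host, page)
    return request.encode("ascii")


def fetch_mgr_page(host="localhost", port=3128, page="objects", timeout=30.0):
    """Send the request to the proxy port and return the whole response as text.

    host, port and timeout are assumptions about a typical local setup.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_mgr_request(host, page))
        while True:
            data = sock.recv(65536)
            if not data:  # squid closes the connection when the page is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")
```

On a large cache the object list can be very big, so it is probably
better to save the output once and query the saved copy.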

Finally, if this is something you are going to be working on a lot, you
could consider enhancing the cache_object protocol for the object list
to allow queries.

Rob

> -----Original Message-----
> From: Colin Campbell [mailto:sgcccdc@citec.qld.gov.au]
> Sent: Monday, 19 February 2001 10:46 AM
> To: Cameron Just
> Cc: squid-users@ircache.net
> Subject: Re: [SQU] Writing a script to view and download the contents
> of proxy cache?
>
>
> Hi,
>
> On Fri, 16 Feb 2001, Cameron Just wrote:
>
> > Hi,
> >
> > I was wondering if it is possible to view and download the contents of
> > a squid server via a web browser? I have checked the FAQs and could
> > not find any reference to it.
> >
> > I have since decided to write a script in php which will analyse the
> > contents of the proxy cache and make it available via a web browser.
> >
> > I am using squid-2.3.STABLE1-5
> > I first look at the contents of store.log and retrieve what I believe
> > to be the contents of the cache at the moment. I then have a link to
> > another page which will dump the contents of the relevant file in the
> > cache to the browser. The script will tell the browser what sort of
> > file it is by sending the appropriate headers.
> >
> > Here is a line from store.log:
> > 982373047.567 SWAPOUT 0000001A 200 982327734 956284084 -1 image/gif 10202/10202 GET http://ads.msn.com/ads/HOTBOS/00338AH0014_LG.gif
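
A small sketch of splitting a store.log line like the one quoted above
into named fields. The field names are an assumption based on reading
the sample (timestamp, action, file number, HTTP status, three header
timestamps, content type, sizes, method, URL); they are illustrative
labels, not squid's own identifiers.

```python
from collections import namedtuple

# Field names are hypothetical labels chosen for this sketch.
StoreLogEntry = namedtuple(
    "StoreLogEntry",
    "timestamp action fileno status date lastmod expires "
    "mime_type sizes method url",
)

SAMPLE_LINE = (
    "982373047.567 SWAPOUT 0000001A 200 982327734 956284084 -1 "
    "image/gif 10202/10202 GET "
    "http://ads.msn.com/ads/HOTBOS/00338AH0014_LG.gif"
)


def parse_store_log_line(line):
    """Split one whitespace-separated store.log line into a StoreLogEntry."""
    fields = line.split()
    if len(fields) != 11:
        raise ValueError("expected 11 fields, got %d" % len(fields))
    ts, action, fileno, status, date, lastmod, expires, mime, sizes, method, url = fields
    return StoreLogEntry(
        float(ts), action,
        int(fileno, 16),           # the file number is written in hexadecimal
        int(status), int(date), int(lastmod), int(expires),
        mime, sizes, method, url,
    )
```

Note that a SWAPOUT entry only says the object was written at that time;
later entries may show it being released again.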
> >
> > I have two problems at present.
>
> > 1. Once I have the name of the file in the cache, i.e. 0000001A, how do
> > I know where it is in the cache directory structure stored on the disk?
> > The directory structure is quite deep and I can't quite see any strict
> > pattern for locating the file.
>
> The name of the file actually includes the path to get there. That file
> "0000001A" is actually "00/00/1A". Now, which cache is it in? Dunno, cos
> that file exists in every cache_dir.
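
A minimal sketch of that mapping, under stated assumptions: the ufs
store is assumed to use L1 first-level and L2 second-level directories
(16 and 256 are the usual cache_dir values), and the file on disk is
assumed to keep the full eight-digit hex number as its name, so the
result would be 00/00/0000001A rather than a file literally named 1A.
The function name and the example cache_dir path are hypothetical;
check the numbers against your own cache_dir line and directory tree.

```python
def swap_file_path(cache_dir, fileno, l1=16, l2=256):
    """Map a swap file number to its assumed path under one cache_dir.

    l1 and l2 are the two directory counts from the cache_dir line in
    squid.conf; the defaults here are assumptions, not read from squid.
    """
    level1 = (fileno // l2 // l2) % l1   # first-level directory
    level2 = (fileno // l2) % l2         # second-level directory
    return "%s/%02X/%02X/%08X" % (cache_dir, level1, level2, fileno)
```

For example, swap_file_path("/var/spool/squid", 0x1A) gives
"/var/spool/squid/00/00/0000001A" under these assumptions.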
>
> >
> > 2. Even after downloading what I believe to be a GIF from the cache
> > directory and renaming it to a .gif file, I cannot view it in any
> > image viewer. Why? Is the file in some weird format?
>
> Squid puts a header on all files in the cache. You need to remove it
> before you get the "original" contents.
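
A heuristic sketch of removing that header, assuming a swap file is laid
out as squid's own binary metadata, then the stored HTTP reply headers,
then a blank line, then the object body. It does not decode the
metadata; it just looks for the start of the HTTP status line, so it can
be fooled if the metadata itself happens to contain "HTTP/". The
function name is hypothetical.

```python
def split_swap_file(raw):
    """Return (header_lines, body) from the raw bytes of one swap file.

    Assumes the layout: squid metadata + HTTP reply headers + CRLF CRLF
    + body. Raises ValueError if that layout is not found.
    """
    start = raw.find(b"HTTP/")
    if start < 0:
        raise ValueError("no HTTP status line found")
    end = raw.find(b"\r\n\r\n", start)
    if end < 0:
        raise ValueError("no end of HTTP headers found")
    header_lines = raw[start:end].decode("latin-1").split("\r\n")
    body = raw[end + 4:]          # skip the blank line itself
    return header_lines, body
```

The Content-Type line in the returned headers also gives the file type,
so there is no need to guess it from the URL.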
>
> >
> > So my main questions are
> >
> > Am I using the correct logfile?
>
> Since you don't know which cache is in use at the time, I'd say not.
>
> > How can I find out the exact location and filetype of a file stored in
> > the cache?
>
> You cannot, easily. Since squid is designed to find out quickly if a URL
> is cached, it uses two hashes. The first hashes the URL. This becomes the
> key to a (hash) lookup in swap.state. The data returned is the filename
> containing that URL.
>
> The only surefire way to look at the cache is to scan it, reading the
> headers in each file to determine what you are looking at.
>
> Colin
>
> --
> To unsubscribe, see http://www.squid-cache.org/mailing-lists.html
>
>

Received on Sun Feb 18 2001 - 17:30:29 MST