Re: [squid-users] An Old Question: Cache Query/Extraction from Genaro Flores on 2009-09-17 (squid-users)

From: Genaro Flores <genaro.flores_at_gmail.com>
Date: Thu, 17 Sep 2009 17:34:29 +0100

> One way to do it reasonably efficiently is to track store.log, where you
> have [...]

Many thanks for the idea--store.log itself is pretty much all I need, since
I mostly care about entries from a short interval before the present. Didn't
know it held that much information. RTFM, they say :-)

> There may be some small drift between this shadow database and the
> actual content if you miss some log entries, but it's self-healing over
> time as the cache content gets replaced.

For my purpose, even that wouldn't matter. A few lost entries in tens of
thousands is negligible for the use case.

--On Tuesday, September 15, 2009 20:44 +0200 Henrik Nordstrom
<henrik_at_henriknordstrom.net> wrote:

> On Tue 2009-09-15 at 18:03 +0100, Genaro Flores wrote:
>
>> I guessed so but I was thinking a specialized tool could do the indexing
>> for whoever wants/needs it. Maybe I'll try making a couple short scripts
>> for that purpose and for searching the index and retrieving the targets.
>> I was wishing somebody had done something similar before :-D
>
> Quite likely some have written such tools, but I am not aware of any
> published on the Internet.
>
> One way to do it reasonably efficiently is to track store.log, where you
> have
> - Squid object id
> - URL
> - Mime type
> - time
> - HTTP status
> - last-modified
> - content-length
> - object size
> - expires
>
> and some other small details.
>
> Just feed this into a database keyed by Squid object id, and indexed on
> the relevant pieces of the rest.
>
> There may be some small drift between this shadow database and the
> actual content if you miss some log entries, but it's self-healing over
> time as the cache content gets replaced.
>
> Regards
> Henrik
>
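
For anyone finding this thread later, the shadow-database idea quoted above
could be sketched roughly as below. This is only an illustration under some
assumptions, not a tested tool: it assumes the 13-field store.log layout
(time, action, dir number, file number, hash, status, date, last-modified,
expires, MIME type, expected/real size, method, URL), which may differ
between Squid versions, and it treats the hash column as the "Squid object
id". SQLite, the table name "objects" and the function names are arbitrary
choices made for the example.

```python
import sqlite3

# ASSUMPTION: field order of a store.log line; verify against your Squid version.
FIELDS = ("time", "action", "dirno", "fileno", "hash", "status",
          "date", "lastmod", "expires", "mime", "sizes", "method", "url")


def open_db(path=":memory:"):
    """Create the shadow database (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        " object_id TEXT PRIMARY KEY, url TEXT, mime TEXT, time REAL,"
        " status INTEGER, lastmod INTEGER, expires INTEGER,"
        " content_length INTEGER, object_size INTEGER)"
    )
    # Index the columns one is likely to search on.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_url ON objects (url)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_mime ON objects (mime)")
    return conn


def parse_line(line):
    """Return a dict for one store.log line, or None if it does not parse."""
    parts = line.split()
    if len(parts) < len(FIELDS):
        return None
    rec = dict(zip(FIELDS, parts[:12] + [" ".join(parts[12:])]))
    try:
        rec["time"] = float(rec["time"])
        rec["status"] = int(rec["status"])
        rec["lastmod"] = int(rec["lastmod"])
        rec["expires"] = int(rec["expires"])
        expected, _, real = rec["sizes"].partition("/")
        rec["content_length"] = int(expected)
        rec["object_size"] = int(real)
    except ValueError:
        return None
    return rec


def apply_entry(conn, rec):
    """SWAPOUT adds or refreshes an object; RELEASE removes it."""
    if rec["action"] == "SWAPOUT":
        conn.execute(
            "INSERT OR REPLACE INTO objects VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
            (rec["hash"], rec["url"], rec["mime"], rec["time"], rec["status"],
             rec["lastmod"], rec["expires"], rec["content_length"],
             rec["object_size"]),
        )
    elif rec["action"] == "RELEASE":
        conn.execute("DELETE FROM objects WHERE object_id = ?", (rec["hash"],))
    # Other actions are ignored in this sketch.


def feed(conn, lines):
    """Apply an iterable of log lines, skipping any that do not parse."""
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            apply_entry(conn, rec)
    conn.commit()
```

Usage would be something like feed(open_db("shadow.db"), open("store.log"))
followed by ordinary SELECTs on url or mime. Unparseable or missed lines are
simply skipped, which matches the "small drift, self-healing over time"
behaviour described above.
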
Received on Thu Sep 17 2009 - 16:36:57 MDT
