Re: [squid-users] Squid 3 - Page Content

From: Martin Ritchie <martin.ritchie@dont-contact.us>
Date: Wed, 22 Oct 2003 17:08:24 +0100

Robert Collins wrote:
> On Tue, 2003-10-21 at 21:31, Martin Ritchie wrote:
>
>>Sorry if this is a total newbie question, but I want to store the
>>actual page content in a database. Is there anyone out there who has
>>done anything like this? Do you have any pointers on where I should start?
>
>
> Well, there are a few approaches. The simplest would be to tail
> store.log, and copy out the objects as they are completed. You can use
> ufsdump in the squid3 sources (cd src && make ufsdump) as a sample
> application for examining a single cached object. Only a little work
> would be needed to list all the metadata, and the byte offset at which
> the actual data starts - from there you can insert that into your database.
> (Be sure to take a local copy (not a hardlink) first, so as to minimise
> the chance of the object being recycled before you get to it.) You
> can't do that with COSS, though. A second approach would be a hacked
> squid with an external call-out of some sort - perhaps ICAP, although
> the ICAP patches are still only for 2.5.
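
A rough sketch of the store.log approach described above, in Python rather
than C++ for brevity. It assumes a store.log line laid out as
"time action dirno fileno hash status date lastmod expires type sizes method url"
and the usual ufs on-disk layout of <cache_dir>/<L1>/<L2>/<file number>;
both are assumptions that should be checked against the Squid version in
use. The function names are made up for illustration.

```python
import os

# Assumed field order of one store.log line (verify for your Squid version).
FIELDS = ("time", "action", "dirno", "fileno", "hash", "status", "date",
          "lastmod", "expires", "type", "sizes", "method", "url")


def parse_store_log_line(line):
    """Split one store.log line into a dict, or return None if it does not fit."""
    parts = line.strip().split(None, len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def ufs_object_path(cache_dir, fileno, l1=16, l2=256):
    """Map a swap file number to its path inside a ufs cache_dir.

    l1 and l2 are the first- and second-level directory counts from the
    cache_dir line in squid.conf (16 and 256 are the common defaults).
    """
    d1 = (fileno // l2 // l2) % l1
    d2 = (fileno // l2) % l2
    return os.path.join(cache_dir, "%02X" % d1, "%02X" % d2, "%08X" % fileno)


def completed_objects(lines, cache_dir):
    """Yield (url, path) for every object that was swapped out to disk."""
    for line in lines:
        entry = parse_store_log_line(line)
        if entry is None or entry["action"] != "SWAPOUT":
            continue
        yield entry["url"], ufs_object_path(cache_dir, int(entry["fileno"], 16))
```

Feeding this the lines read from a tailed store.log gives the URL and the
file to copy for each completed object. The copy should happen straight
away, for the recycling reason given above.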

My CVS HEAD ufsdump doesn't want to compile. I'm getting a number of
multiple definition errors involving several comm_select methods. I'm
still new to C++, so please go easy on me. I'm also not sure that getting
this working will solve our problem, since only pages that Squid actually
caches will be in the cache.

If I want to go for the second approach of 'patching' squid with an
external call, where would I start? Is 2.5 with ICAP the best approach, or
should I be looking at v3?

I guess the HTML is sent to the client as it arrives, but is it ever
fully available in memory? And is it possible to add DB processing once
the content has been fully retrieved?
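
To make the idea concrete, here is a minimal sketch of what is meant by
"DB processing once the content has been fully retrieved". It is not Squid
code: PageRecorder, on_data, on_complete and open_db are hypothetical names,
and SQLite stands in for whatever database would really be used. Chunks are
collected as they pass through, and the insert only happens when the
transfer is reported complete.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) a database with a single table of stored pages."""
    db = sqlite3.connect(path)
    db.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT, body BLOB)")
    return db


class PageRecorder:
    """Collect body chunks for one URL and store the page once it is complete."""

    def __init__(self, db, url):
        self.db = db
        self.url = url
        self.chunks = []

    def on_data(self, chunk):
        # Called for each piece of the body as it arrives. A proxy would
        # still forward the chunk to the client at this point.
        self.chunks.append(chunk)

    def on_complete(self):
        # Only now is the whole body available in one place.
        body = b"".join(self.chunks)
        self.db.execute("INSERT INTO pages (url, body) VALUES (?, ?)",
                        (self.url, body))
        self.db.commit()
        return len(body)
```

The cost of this design is that every page is held in memory until it
finishes, which is the trade-off behind the question above.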

tia

-- 
Martin Ritchie
the Kelvin Institute
50, George Street
+44 (0) 141 548 5719
Received on Wed Oct 22 2003 - 10:08:28 MDT
