Re: store entry question

From: Matt Tuzzolo <mtuzzolo@dont-contact.us>
Date: Wed, 16 Apr 2003 11:05:07 -0400 (EDT)

Henrik,
        Ok. First of all, I'm not sure I actually have to reset the
entry. Let me explain.
        I'm trying to search the page for keywords. Once the page has
been read completely, if we've read fewer than X bytes, the number of
keywords found is compared against a keyword threshold (1). So if a small
page contains 1 or more keywords, squid appends a redirect. For pages
larger than X bytes the threshold is higher, and the comparison is done
on each call to httpReadReply.
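
A minimal sketch of the two-threshold check described above, in plain C
outside of squid. The names page_scan, scan_chunk and scan_finish, the
value used for X, and the large-page threshold of 5 are all hypothetical;
only the small-page threshold of 1 comes from this message. It also
assumes a page of exactly X bytes counts as large, and it does not catch
keywords split across two reads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values: only the small-page threshold (1) is from the message. */
#define SMALL_PAGE_BYTES     4096   /* stands in for "X" */
#define SMALL_PAGE_THRESHOLD 1
#define LARGE_PAGE_THRESHOLD 5      /* assumed; the message only says "higher" */

typedef struct {
    size_t bytes_seen;    /* total body bytes read so far */
    int    keyword_hits;  /* total keyword matches so far */
} page_scan;

/* Count non-overlapping occurrences of keyword in buf[0..len). */
int count_keyword(const char *buf, size_t len, const char *keyword)
{
    size_t klen = strlen(keyword);
    size_t i = 0;
    int hits = 0;

    assert(buf != NULL && klen > 0);
    while (i + klen <= len) {
        if (memcmp(buf + i, keyword, klen) == 0) {
            hits++;
            i += klen;
        } else {
            i++;
        }
    }
    return hits;
}

/*
 * Call once per read (the role httpReadReply plays in the message).
 * Returns 1 as soon as a large page crosses the large-page threshold.
 * Keywords that straddle two chunks are not detected by this sketch.
 */
int scan_chunk(page_scan *s, const char *buf, size_t len, const char *keyword)
{
    s->keyword_hits += count_keyword(buf, len, keyword);
    s->bytes_seen += len;
    return s->bytes_seen >= SMALL_PAGE_BYTES
        && s->keyword_hits >= LARGE_PAGE_THRESHOLD;
}

/* Call once the whole reply has been read. Returns 1 if the page should be blocked. */
int scan_finish(const page_scan *s)
{
    if (s->bytes_seen < SMALL_PAGE_BYTES)
        return s->keyword_hits >= SMALL_PAGE_THRESHOLD;
    return s->keyword_hits >= LARGE_PAGE_THRESHOLD;
}
```

With these assumed values, a 2k gallery page containing one keyword is
blocked at the end of the read, while a 50k page with a single incidental
keyword is not.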
        The idea behind this is for the filtering software to be able to
identify galleries (little text, lots of pictures), which may only be 2k
of html which contains 1 keyword.
        I don't want to always check the first X bytes of a page because
the page might be 50k in total, and contain one keyword coincidentally in
the first X bytes resulting in the page being blocked.
        So what I'm trying to do is wait for squid to finish reading the
page from the webserver. At this point, we can append a redirect if
necessary and skip appending buf. This only works for persistent
connections since with nonpconns we don't know that there isn't any more
to read from the server until we've already delivered the content to the
client. I'm not sure how to handle nonpconns without modifying the
content-length (which seems to cause all sorts of problems).

-Matt

---------------------------
Matt Tuzzolo
Merrimack Education Center
978.262.4000
---------------------------

On 16 Apr 2003, Henrik Nordstrom wrote:

> Normally you don't.. instead you create another StoreEntry with the same
> key and fill it with your data.. The new object will automatically
> invalidate the "older" object as soon as you assign the public key to
> your new object.
>
> The exception to this is when Squid is fetching data from the network.
> Then a StoreEntry may be reset if it is decided before any data have
> been sent to the client that the data retrieved so far is no good and
> should be immediately thrown away to allow for another attempt to
> retrieve the object. This is about the only case where it is safe to
> reset an existing StoreEntry.
>
>
> What is it you are actually trying to do here?
>
> Regards
> Henrik
>
>
> On Wed 2003-04-16 at 15.39, Matt Tuzzolo wrote:
> > Hey,
> > I'm trying to reset a store entry and then fill it with my own
> > data. My code basically looks like this:
> >
> > <------(entry contains "</html>")
> > storeMemObjectDump(entry->mem_obj);
> > storeEntryReset(entry);
> > storeAppend(entry, "blah", 4);
> > storeMemObjectDump(entry->mem_obj);
> > <------(entry now contains "</html>blah" should be "blah")
> >
> > But storeMemObjectDump prints the same data each time and the client
> > still receives the pre-existing content. Is storeEntryReset
> > not actually freeing the data_hdr? I can only find one other call to
> > storeEntryReset so I don't have much to go on.
> >
> > I was previously just appending my data after the read, but this only
> > works for dynamically generated sites (where the CL is unknown), so that's
> > no good (although some browsers will actually just ignore the
> > content-length and read from the socket indefinitely). I need to
> > somehow get rid of what's already in this entry and recreate it with my
> > own content so this is compatible with browsers that pay attention to the
> > content-length.
> >
> > Any suggestions?
> >
> > -Matt
> >
> > ---------------------------
> > Matt Tuzzolo
> > Merrimack Education Center
> > 978.262.4000
> > ---------------------------
> --
> Henrik Nordstrom <hno@squid-cache.org>
> MARA Systems AB, Sweden
>
Received on Wed Apr 16 2003 - 09:04:57 MDT
