Re: Questions/Remarks about rbcollins_filters branch

From: Robert Collins <robert.collins@dont-contact.us>
Date: Sat, 27 Jan 2001 01:36:48 +1100

If this answer doesn't make much sense.. it's because I'm too asleep :-]

Some notes before I answer: firstly, the rbcollins_filters branch is looking at generic data modifying/inspecting filters, to allow dynamic code
insertion (most of squid already follows the same calling pattern, but the function calls are placed by hand). Secondly, the
initial application I am aiming for is content modification to tie in with one of Joe Cooper's projects. I also expect to rationalise
the TE code significantly by using filters in it. Thirdly, filters are a generic tool for squid; data modifying filters are the
first application for the tool.

For content changing hooks, some sort of header modifying filter may also be required - for example to change content-type headers
on the fly... This is part of the thoughts I mention below...

----- Original Message -----
From: "Moez Mahfoudh" <moez.mahfoudh@imag.fr>
To: "Squid Dev" <squid-dev@squid-cache.org>
Sent: Saturday, January 27, 2001 1:21 AM
Subject: Questions/Remarks about rbcollins_filters branch

> <This message is targeted primarily to Robert and to people who know
> about the filters he's implementing>
> Hi all,
>
> I am interested in the filters implementation (rb_collins_filters
> branch) and after a few hacks around this new piece of code, I have
> some (maybe stupid) questions to make things clearer in my mind:
> * What is the exact flow of data (from sockets to store)? Are the
> filter functions (for example dochunk) called once or many times during
> an object reading? I mean, what is the exact scenario:
> - read a chunk -> pass it to filter 1 -> pass the result to filter 2
> -> ...... -> pass the result to the last filter -> append to store
> object -> goto read chunk.

this one. See below.

> or
> - read all the chunks -> pass all the chunks through filters -> put
> in the store

not this one. This potentially requires unlimited buffer space in memory, and unbounded time. Clients would time out if we buffered 20 MB files
before sending any data.

socket->store path:

read a chunk -> if it's the first chunk, build a list of filters (this could be done before the first chunk, and should be :-]) ->
call filter 1 -> call filter 2 -> call filter n -> call terminating filter -> append to store.

the store then calls the client_side callbacks for any listening clients.

filters can (and should) remove themselves from the list if they are no longer needed. For example the header processing filter
removes itself after the headers are handled, and thus is not called any longer.
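
To make the per-chunk flow concrete, here is a minimal sketch in C. Every name in it (filter_chain, chain_push_chunk, strip_headers and
so on) is made up for illustration and is not taken from the branch; the fixed-size buffers and the in-place rewriting are simplifying
assumptions too. It only shows the pattern: each chunk goes through the filters in order, a filter can ask to be dropped from the list,
and what is left is appended to a stand-in for the store.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is not the squid API. */
#define MAX_FILTERS 8
#define BUF_SIZE 256

typedef enum { FILTER_KEEP, FILTER_REMOVE } filter_status;

/* A filter may rewrite buf in place and shrink *len (it must not grow it here). */
typedef filter_status (*filter_fn)(char *buf, size_t *len, void *state);

typedef struct {
    filter_fn fn;
    void *state;
} filter;

typedef struct {
    filter filters[MAX_FILTERS];
    size_t count;
    char store[BUF_SIZE];   /* stands in for the store object */
    size_t store_len;
} filter_chain;

int chain_add(filter_chain *c, filter_fn fn, void *state)
{
    if (c->count >= MAX_FILTERS)
        return -1;
    c->filters[c->count].fn = fn;
    c->filters[c->count].state = state;
    c->count++;
    return 0;
}

/* Run one chunk through every filter in order, then append it to the store. */
int chain_push_chunk(filter_chain *c, const char *data, size_t len)
{
    char buf[BUF_SIZE];
    size_t i = 0;

    if (len > sizeof(buf))
        return -1;
    memcpy(buf, data, len);
    while (i < c->count) {
        filter_status st = c->filters[i].fn(buf, &len, c->filters[i].state);
        if (st == FILTER_REMOVE) {
            /* The filter is finished: close the gap so it is never called again. */
            memmove(&c->filters[i], &c->filters[i + 1],
                    (c->count - i - 1) * sizeof(filter));
            c->count--;
        } else {
            i++;
        }
    }
    if (c->store_len + len > sizeof(c->store))
        return -1;
    memcpy(c->store + c->store_len, buf, len);
    c->store_len += len;
    return 0;
}

/* Drops everything up to the blank line ending the headers, then removes itself.
 * Simplification: a terminator split across two chunks is not handled. */
filter_status strip_headers(char *buf, size_t *len, void *state)
{
    size_t i;

    (void)state;
    for (i = 0; i + 4 <= *len; i++) {
        if (memcmp(buf + i, "\r\n\r\n", 4) == 0) {
            size_t body = *len - (i + 4);
            memmove(buf, buf + i + 4, body);
            *len = body;
            return FILTER_REMOVE;
        }
    }
    *len = 0;               /* still inside the headers: pass nothing on */
    return FILTER_KEEP;
}

/* A trivial data modifying filter that stays in the list for every chunk. */
filter_status upcase(char *buf, size_t *len, void *state)
{
    size_t i;

    (void)state;
    for (i = 0; i < *len; i++)
        buf[i] = (char)toupper((unsigned char)buf[i]);
    return FILTER_KEEP;
}
```

With strip_headers and upcase added in that order, the first chunk leaves only upcase in the list, so later chunks skip the header
filter entirely.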

>
> * In a filter function, what is the meaning of returning TE_CHUNK_A/B ?

That's a bit of black magic whose answer is only known to Patrick McManus. It's actually part of the transfer encoding code - which
is why it's in the TE filter (TE==transfer_encoding). It should get optimised out when the TE loop becomes a series of filters. If I
understand Patrick's code correctly, it is used to indicate the terminating 0\r\n of a chunked encoding.
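
For reference, in chunked encoding every chunk starts with a hex size line, and the last chunk is the one whose size is zero. The sketch
below shows one way to spot that line in C. It says nothing about what TE_CHUNK_A/B actually mean, and the function names are invented
for the example, not taken from the TE code.

```c
#include <stddef.h>

/* Hypothetical helper: parse "<hex-size>[;extensions]\r\n" at the start of buf.
 * Returns the number of bytes consumed, or 0 if the line is incomplete or invalid.
 * Simplification: no overflow check on very long hex sizes. */
size_t parse_chunk_size_line(const char *buf, size_t len, unsigned long *size)
{
    size_t i = 0;
    unsigned long value = 0;
    int digits = 0;

    while (i < len) {
        char ch = buf[i];
        unsigned long d;

        if (ch >= '0' && ch <= '9')
            d = (unsigned long)(ch - '0');
        else if (ch >= 'a' && ch <= 'f')
            d = (unsigned long)(ch - 'a') + 10;
        else if (ch >= 'A' && ch <= 'F')
            d = (unsigned long)(ch - 'A') + 10;
        else
            break;
        value = value * 16 + d;
        digits++;
        i++;
    }
    if (digits == 0)
        return 0;
    /* Skip any chunk extensions up to the CR. */
    while (i < len && buf[i] != '\r')
        i++;
    if (i + 1 >= len || buf[i + 1] != '\n')
        return 0;
    *size = value;
    return i + 2;
}

/* Returns 1 if buf starts with the size line of the terminating (zero-length) chunk. */
int is_last_chunk(const char *buf, size_t len)
{
    unsigned long size;

    return parse_chunk_size_line(buf, len, &size) > 0 && size == 0;
}
```

A decoding filter written this way could remove itself from the list once is_last_chunk() fires and any trailers have been consumed.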

> Now, here is my (unique) suggestion (for the moment):
> * Sometimes, it is better to have a filter triggered by some condition
> (depending on header values; for example, a filter to transform txt
> files to html would be triggered only if the content type is "text/plain").
> So it would be interesting to have this possibility in the API.

It's in the works. I think that filters should be able to be applied to the four hook points (data from client->squid,
squid->origin, origin->squid, squid->client) in an arbitrary fashion via the config file and ACLs. However that's a long term goal
(feel free to base a branch on rbcollins.filters and work on that side of it :-]). In fact the TE & range request filters in
client_side are added based on whether the request was a range request, and whether TE is needed.... So for coder-determined criteria
that possibility is already there.. just the ACL framework needs looking into. I have some ideas on that which I'll try to put
together clearly and mail out in the next few days.
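
As a rough picture of the coder-determined case, the C sketch below pairs each filter with a predicate that decides whether to add it
when the list is built. The types and names (reply_info, filter_rule, select_filters) are hypothetical, and a real version would hang
this off the ACL machinery instead of hand-written predicates.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of what is known when the filter list is built. */
typedef struct {
    const char *content_type;
    int is_range_request;
} reply_info;

typedef int (*filter_predicate)(const reply_info *r);

typedef struct {
    const char *name;           /* filter to add when the predicate holds */
    filter_predicate applies;
} filter_rule;

/* Matches "text/plain", optionally followed by parameters such as "; charset=...".
 * Simplification: the comparison is case-sensitive. */
int is_text_plain(const reply_info *r)
{
    const char *ct = r->content_type;
    char next;

    if (ct == NULL || strncmp(ct, "text/plain", 10) != 0)
        return 0;
    next = ct[10];
    return next == '\0' || next == ';' || next == ' ';
}

int is_range_request(const reply_info *r)
{
    return r->is_range_request != 0;
}

/* Copies the names of all matching filters into out (up to cap); returns how many. */
size_t select_filters(const filter_rule *rules, size_t n, const reply_info *r,
                      const char **out, size_t cap)
{
    size_t i, k = 0;

    for (i = 0; i < n && k < cap; i++) {
        if (rules[i].applies(r))
            out[k++] = rules[i].name;
    }
    return k;
}
```

A txt-to-html filter would then be registered with is_text_plain, and would only end up in the list for replies whose content type
matches.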

>
> Thank you... It is a good work Rob...

thanks..

Rob
Received on Fri Jan 26 2001 - 07:36:30 MST
