Re: Squid3 with multiple ICAP Services

From: Richard Bishop <squid-dev@dont-contact.us>
Date: Tue, 25 Sep 2007 13:08:06 -0400

Alex,

Firstly, please disregard some of what I said in my previous email. Having had another good look over the code I've identified that I did make some mistakes in my interpretations. The statement posted on the 3.x developer guide - "The code is often difficult to follow because there are no explicit state variables for the active requests. Instead, thread execution progresses as a sequence of 'callback functions' which get executed when I/O is ready to occur, or some other event has happened" is definitely very true!

Between sending email my and receiving your I have been having a play around with this idea. I've gotten to the point where I have multiple ICAP services being processed for REQMOD methods, though nothing for RESPMOD as yet. As I see it, REQMOD is somewhat simpler than RESPMOD since you're never going to be dealing with such massive amounts of data. A request is generally only going to be a few KBs (a few MB in the case of an HTTP POST), whereas RESPMOD could be anything (together with your points made below about pipelining).

I've modified the ICAPConfig class such that the ICAPAccessCheckCallbackWrapper function makes a list of all applicable classes (using the existing criteria that a class should have at least one service that matches on both method and vector point to justify check its ACLs). This list is then passed down to another function that decides which services are applicable in that class (again, based on the existing up/down mandatory/optional criteria).

Once we have a list of services to apply these are applied to the request one at a time through the existing callouts, with the end result being processed as normal.

I have to admit that it is all a little hacky at the moment, and certainly doesn't make any of the sweeping changes that you've proposed. It does, however, work better than the existing multiple-service code (i.e. there being none). If anybody would like me to send them the patch please let me know.

Regards your other points:

> 1. The ICAP configuration classes (e.g., ICAPClass and ICAPAccessCheck)
> and code would need to support selection of multiple services. It is not
> yet clear whether such selection should be static (once per HTTP
> transaction) or dynamic (select next ICAP service after the previous
> service has finished). Perhaps the choice should even be configurable.

I think you make a very good point regards 'configurable' here. I would suggest that there should be an option to bypass particular services based on the results of earlier services - i.e. values of ICAP headers returned in the response. In the case of multiple chained virus scanners, this would mean that the presence of an X-Virus-Found after the first transaction (indicating the first scanner found a virus and rewrote the body), would then skip the other services since we then know this rewritten body to be clean.

> Besides handling bypass and aborts, the iteration code should correctly
> pass adapted HTTP bodies from one service to another via BodyPipe. It is
> probably important to support ICAP service "pipeline" where the next
> service can start working on the HTTP message still being adapted by the
> previous service (where possible). This implies that the code will have
> to deal with multiple ICAP services working on the same HTTP transaction
> at the same time.

Yes, agreed. It is certainly pointless (and slow) to buffer data that could be piped straight into the next service and onto the client. I don't think my C++ skills are up to tackling this one!

 

Richard
Received on Tue Sep 25 2007 - 13:08:39 MDT

This archive was generated by hypermail pre-2.1.9 : Mon Oct 01 2007 - 12:00:05 MDT