Re: Storing of information

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 12 Feb 2012 19:18:12 +1300

On 12/02/2012 5:05 p.m., Pieter De Wit wrote:
> <snip>
>> * the parsing bottleneck gets crunched several times: on first
>> arrival, in the ICAP server, and on return to Squid,
>> * the ICAP server bypass optimization can't be used since quota needs
>> to measure every byte,
>> * tunneled data does not get sent to ICAP services,
>>
>> Not exactly perfect service, but it offers the most complete quota
>> control without adding complexity to Squid.
>>
>> eCAP might be slightly better. It still runs inside Squid and has
>> some processing overhead, but should reduce the parse problems and
>> network delays involved with ICAP.
>>
>>>
>>> Pointers to reading URLs are more than welcome, and so are examples
>>> of libicapapi :)
>>
>> Hopefully someone else knows some then, because I don't :(
>>
>> Amos
> Hi Amos,
>
> You said that you proposed some work a while ago, would you mind
> sharing that? I gave the network thing some thought and I can see how
> the delay would hurt Squid. I kept on comparing it to milters, but
> these don't mind a few ms delay; email is a lot less interactive.
>
> The thought process I am going with is something along the lines of a
> process that is "spoken" to, like ecap perhaps, via pipes or a lib or
> some such. This process will be notified based on the following:
>
> (* - Request, **-Reply)
>
> * I would like to go to protocol://site
> ** Is there quota left to allow this, if the user has 0 quota left,
> block the request, no use
> * The server said the object is X bytes long, can I continue to
> download it
> ** Yes, there is quota. The problem comes in if the server didn't give
> a length, if that is the case, perhaps only allow 1024 bytes until his
> quota runs out. There is also the problem if the server said the
> object is bigger than it really is...
> * Can I send the following 1024 bytes
> ** Yes, there is quota.
>
> At any given step, if the quota runs out, the connection is aborted. This
> will involve some tie in with the FD struct that you guys have
> already. I do recall myself and Alex having a chat about this. I
> referred to it as "hooks" into the FD struct. I *think* the talk about
> "hooks" in the FD struct was aborted because it didn't add enough
> value at the time, or real life caught up to me or or or :)

A download, even one of known length, can be aborted at any time, and
the backend system may change the quota at any time as well.
So IMO the best idea is to collapse the requests all down to a single
request asking for N bytes and passing along any parameters which the
quota backend needs.
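
To make the "request asking for N bytes" idea concrete, here is a minimal
sketch. Everything in it is hypothetical (QuotaBackend, request_bytes and
relay are names made up for illustration, not Squid code), and the
in-memory dict stands in for whatever store a real quota backend would use.

```python
from dataclasses import dataclass, field


@dataclass
class QuotaBackend:
    """Hypothetical in-memory quota store (user -> bytes left)."""
    remaining: dict = field(default_factory=dict)

    def request_bytes(self, user: str, wanted: int, **params) -> int:
        """Grant up to `wanted` bytes and debit them from the user's quota.

        `params` stands for any extra details the backend might need;
        this sketch ignores them.
        """
        left = self.remaining.get(user, 0)
        granted = min(wanted, left)
        self.remaining[user] = left - granted
        return granted


def relay(backend: QuotaBackend, user: str, chunks) -> int:
    """Ask for each chunk's size before relaying it; return bytes relayed."""
    sent = 0
    for chunk in chunks:
        granted = backend.request_bytes(user, len(chunk))
        sent += granted
        if granted < len(chunk):
            break  # quota ran out part-way: abort the transfer here
    return sent
```

Because every grant is a fresh request, the caller never relies on a
Content-Length, and a quota that changes between two calls is picked up on
the next one.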

The basic idea was started here:
   http://bugs.squid-cache.org/show_bug.cgi?id=1849

Looking back, the discussion thread was started by you in Feb 2009. The
model description is here:
   http://marc.info/?l=squid-dev&m=123570800116923&w=1
Although it seems I sent you something in private before that with more
details. Sorry, that mail is gone now.

The Measurement Factory have since created the client_delay_pool part of
it, but without any helper hooks. So the current code only does /sec
capping. What remains is adding a helper API hook that sets the client
DB quota field values and updates them when exhausted.

That is fully controllable already with per-request limitations and speeds.

The big case that is left is fixed-size quotas that run down. No need
for lookups with details from particular headers or such at this point.

>
> Based on this, I would like to re-float the idea of "hooks" in the FD
> struct. Off the top of my head, one would have modules that expose
> certain functions/procedures:

The FD struct is on the hitlist for erasure, or at least for removing
anything that is not directly related to the FD value. The Comm layer
has been restructured in Squid-3.2+ into a set of dynamically created
listener Jobs (TcpAcceptor) which spawn traffic handler AsyncCalls based
on the http(s)_port settings. The hooks would best be added into the
call sequence and run out of those traffic handler functions.
Incidentally that would be...

>
> OnClientConnect (source_ip,source_port,target_ip,target_port);

This would be httpAccept(), httpsAccept() in client_side.cc where the
client DB entry is created/updated. It would need the config settings to
handle being limited to the TCP level details available here, with no
request details.
  The main idea behind using a helper was that we can completely avoid
the work of figuring out generically useful config directives. Just pass
the TCP details to the helper and let the admin decide which are used
and how.
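
As a rough illustration of "pass the TCP details to the helper", a helper
could look something like the sketch below. The line format (four
space-separated fields) and the "quota=" reply key are assumptions made up
for this example; no such helper interface exists in Squid today. Only the
general shape (one request line in, one OK/ERR reply line out) follows the
way existing Squid helpers talk.

```python
import sys

# Hypothetical per-client byte quotas keyed by source IP. An admin would
# replace this table with a database or file lookup.
QUOTAS = {"192.0.2.10": 5_000_000}


def handle_line(line: str) -> str:
    """Map one 'src_ip src_port dst_ip dst_port' line to a reply line."""
    fields = line.split()
    if len(fields) != 4:
        return "ERR"
    src_ip, _src_port, _dst_ip, _dst_port = fields
    quota = QUOTAS.get(src_ip, 0)
    if quota <= 0:
        return "ERR"
    # The admin decides which details matter; this sketch only uses src_ip.
    return f"OK quota={quota}"


def serve(stream_in=sys.stdin, stream_out=sys.stdout) -> None:
    """Answer one request per line until the input stream closes."""
    for line in stream_in:
        stream_out.write(handle_line(line) + "\n")
        stream_out.flush()  # reply immediately, never sit on a buffer
```

Calling serve() at the bottom of the script would turn it into a
stdin/stdout helper. The policy lives entirely in handle_line(), so no new
config directives are needed for it.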

> OnClientRequest (URL);
> OnClientRequestContent (content,size,offset);

The code structure allows for a hook in doCallouts(), after the request
headers are fully received and parsing has completed. The earlier processing
is locked inside some annoying loops. I'm hoping to kill those, but that
will take a while. For now we are stuck with doCallouts() being the
start- and end-all of request processing.

> OnClientResponse (URL,size);
> OnClientResponseContent (content,size,offset);

Squid offers http.cc processRequest() for hooks after the response
headers have been parsed.

> OnClientDisconnect (<not sure>);
>
> I will outright say, I have no clue how modules work (thinking about
> apache etc) and these are shamelessly based on my Delphi XP with Objects.

The hardest part is making the hook on quota runout work cleanly. Have a
good look at client_db.cc for how the "quota" stuff in there works already.
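
To show where the runout case bites, here is a toy model of the proposed
hooks. It borrows the hook names from the list quoted above, but the class,
the driver function and the on_runout callback are all hypothetical and
have nothing to do with how client_db.cc is actually written.

```python
class QuotaHooks:
    """Hypothetical module exposing some of the hooks proposed above."""

    def __init__(self, quota_bytes: int):
        self.left = quota_bytes

    def OnClientConnect(self, source_ip, source_port, target_ip, target_port):
        return self.left > 0

    def OnClientRequest(self, url):
        return self.left > 0

    def OnClientResponseContent(self, content, size, offset):
        if size > self.left:
            return False  # quota runout: refuse this block
        self.left -= size
        return True


def run_transaction(hooks, url, chunks, on_runout):
    """Drive the hooks in order; stop and report the first refusal."""
    # Addresses below are documentation-range placeholders.
    if not hooks.OnClientConnect("192.0.2.10", 51234, "203.0.113.5", 80):
        on_runout("connect")
        return 0
    if not hooks.OnClientRequest(url):
        on_runout("request")
        return 0
    offset = 0
    for chunk in chunks:
        if not hooks.OnClientResponseContent(chunk, len(chunk), offset):
            on_runout("response-content")
            break
        offset += len(chunk)
    return offset  # bytes of response body delivered
```

Note that a refusal in the middle of the response body leaves the client
with a partial object, which is the part that is hard to do cleanly.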

>
> Cheers,
>
> Pieter
>
> P.S. Might be worth starting a new thread perhaps ?

Same topic though. Rename?

Amos
Received on Sun Feb 12 2012 - 06:18:19 MST
