Re: Storing of information

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 11 Feb 2012 18:57:21 +1300

On 11/02/2012 5:50 p.m., Pieter De Wit wrote:
> On 11/02/2012 15:43, Amos Jeffries wrote:
>> On 11/02/2012 8:34 a.m., Pieter De Wit wrote:
>>> Hi Guys,
>>>
>>> So I saw on the mailing list the question about quotas came up
>>> again. I thought I would give it a shot (planning etc) and I was
>>> wondering, does "Squid" have a way to store data ?
>>>
>>> In Quota you will have Bandwidth (bytes, not per second) that has to
>>> be checked/adjusted and updated to disk. Instead of writing my own
>>> routines to store this in my "own" format, it would be nice to have
>>> something that grows with Squid.
>>
>> Discard the idea of disk formats for quota control. You are dealing
>> with individual packet read/writes at this level. Everything of
>> importance needs to be in RAM. Squid has a client_db memory cache
>> which stores statistics and details about each unique client. Several
>> transaction state controllers (ConnStateData, TunnelStateData,
>> *ServerData) already interact with that for per-client bandwidth
>> reporting.
>>
>> At the more abstracted level, semi-accurate quotas would need the
>> client_db to be backed up on disk or somewhere periodically. (oh yay,
>> yet another event to cause "squid keeps pausing" bugs).
>>
>> Or alternatively, the design I worked out a year or so ago uses a
>> helper process to query some database or system managing client
>> quotas. There is a lot of interest in RADIUS and ActiveDirectory
>> backends controlling this type of thing, and a helper interface is
>> much more flexible than pre-determined config formats. You can query
>> this helper on each new client to receive a quota value which gets
>> used up then re-checked. We still get some slowdown from the lookups,
>> but it is up to the admin to configure how many quota bytes the
>> helper requests each cycle and thus how much overhead they add.
>>
>> Amos
>>
> Hi Amos,
>
> Thanks - I second the idea of a helper process, I like the "milter"
> type idea where this can run on another box, away from Squid. I have
> been messing around with ICAP to do this, what are your thoughts on
> that? IMHO, the time part of quotas should be handled by ACLs, it is
> impossible for Squid to know how long someone has been on the net.
>
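
The helper design quoted above (ask a helper for a chunk of quota
bytes, use it up, then re-check) can be sketched roughly as follows.
This is only an illustration, not Squid code: QuotaLease, ask_helper,
chunk_bytes and make_fake_helper are hypothetical names, and the fake
helper stands in for whatever RADIUS, ActiveDirectory or database
backend would really answer the query.

```python
class QuotaLease:
    """One client's current grant of quota bytes (hypothetical sketch)."""

    def __init__(self, client, ask_helper, chunk_bytes):
        self.client = client
        self.ask_helper = ask_helper    # callable(client, wanted) -> bytes granted
        self.chunk_bytes = chunk_bytes  # admin-tunable: bigger chunk, fewer lookups
        self.remaining = 0              # bytes granted but not yet used
        self.lookups = 0                # how many times the helper was queried

    def consume(self, nbytes):
        """Return True if nbytes may pass, False once the helper grants nothing."""
        while self.remaining < nbytes:
            granted = self.ask_helper(self.client, self.chunk_bytes)
            self.lookups += 1
            if granted <= 0:
                return False            # quota exhausted according to the helper
            self.remaining += granted
        self.remaining -= nbytes
        return True


def make_fake_helper(balances):
    """Stand-in for an external quota backend: hands out bytes from a dict."""
    def ask(client, wanted):
        granted = min(wanted, balances.get(client, 0))
        balances[client] = balances.get(client, 0) - granted
        return granted
    return ask
```

With a small chunk_bytes the helper is queried more often and the
count stays closer to the backend's view; with a large one the lookup
overhead drops but more bytes sit granted-but-unused in each lease.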

I like the idea of pushing it off into ICAP. That does bring up the
problem that things like auth loops, errors and related self-DoS events
are omitted from the quota counts. Also, tunnels are adapted only for the
HTTP headers portion; once they get to the blind-tunnel part it's direct
byte shuffling between two TCP sockets.

> Back to the "sync" of client_db. I agree that the working set of the
> DB should be in memory, but would a threaded approach slow squid down
> "that much" ? I am thinking along the lines of a pthread that just
> sits there and lazy writes dirty objects to disk. A quick lock of the
> global mutex, copy the object, unlock mutex, write object, rinse
> repeat. There will also have to be a shutdown hook...the helper
> process is looking pretty good round about now :)

In Squid it would have to be an AsyncJob to start with, since the memory
spaces are still too twisted together to add threading cleanly.
When that is working, splitting a process or thread away may be an option
for improving over the Job.
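
For reference, the lock / copy / unlock / write cycle from the quoted
paragraph looks roughly like this as a generic, standalone sketch. It
is not Squid code and does not use AsyncJob; ClientDb, flush_once,
writer_loop and the JSON-lines file format are all assumptions made
up for the illustration.

```python
import copy
import json
import threading


class ClientDb:
    """Tiny in-memory per-client byte counter with dirty tracking."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}
        self._dirty = set()

    def add_bytes(self, client, nbytes):
        with self._lock:
            entry = self._entries.setdefault(client, {"bytes": 0})
            entry["bytes"] += nbytes
            self._dirty.add(client)

    def take_dirty_snapshot(self):
        # Hold the lock only long enough to copy the dirty entries.
        with self._lock:
            snapshot = {c: copy.deepcopy(self._entries[c]) for c in self._dirty}
            self._dirty.clear()
        return snapshot


def flush_once(db, path):
    """Append the dirty entries to path; the disk I/O runs without the lock."""
    snapshot = db.take_dirty_snapshot()
    if snapshot:
        with open(path, "a", encoding="utf-8") as out:
            for client, entry in sorted(snapshot.items()):
                out.write(json.dumps({"client": client, **entry}) + "\n")
    return len(snapshot)


def writer_loop(db, path, stop, interval=1.0):
    """Lazy writer: flush every interval seconds until stop is set."""
    while not stop.wait(interval):
        flush_once(db, path)
    flush_once(db, path)  # shutdown hook: one final flush
```

The writer would be started with something like
threading.Thread(target=writer_loop, args=(db, path, stop)) and shut
down by calling stop.set() on a threading.Event, then joining the
thread.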

>
> For some reason, version control systems and I have never been
> friends. I can't get a copy of the source code using BZR (using the
> instructions on the Squid wiki), so I will build against the latest
> .tar.gz that I can find. I also suspect this will be new files and
> very little modifying of current files.

That is a problem. Can you post the errors it gives you back here (new
thread)? lifeless might have a clue if the rest of us can't help.

Amos
Received on Sat Feb 11 2012 - 05:57:31 MST
