Re: Storing of information from Alex Rousskov on 2012-02-17 (squid-dev)

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Fri, 17 Feb 2012 15:50:58 -0700

On 02/17/2012 12:38 PM, Pieter De Wit wrote:

>> ICAP is not ideal for implementing control points other than at the
>> beginning of messages. It would be awkward and performance-expensive to
>> use ICAP for quota control if you want to change things in the middle of
>> a transaction. ICAP also lacks proactive notifications from the ICAP
>> server to Squid, which would be nice for certain quota operations.
>>
>> Compared to ICAP, eCAP has a much lower performance overhead, but has a
>> similar beginning-of-message design limitation. If you want to do this
>> using loadable modules, we could discuss extending eCAP, but I am not
>> sure it is the best approach.

>> I recommend considering reshaping client_db code to work with shared
>> memory so that it can work correctly in SMP mode. I do not think that
>> would be very difficult to do and it would be immediately useful.
>>
>> Once that is done, you can add a control message queue and use external
>> processes to manage the quota database as needed. This design would
>> minimize performance impact while allowing for any external quota
>> management programs to co-exist with Squid.

> Yip, seems to be the general feeling out there. With this redesign,
> would ICAP and eCAP not move into the same "hooks"? They might not be
> external programs, but the connections to the servers can be made at
> that time?

I do not see how ICAP or eCAP can benefit from this because they already
have their own communication protocols and APIs that cannot be changed
to use shared memory and IPC queues. It is remotely possible that some
other adaptation scheme will grow out of this work, of course, but I
think it is premature to discuss that right now.

> Alex, would it be ok if I pop a few emails over to you to assist with
> the SMP stuff, if needed?

It is best to keep the discussion on squid-dev so that everybody can
participate and there is an archive for others to come back to. It may
also provide you with better answers when I am wrong or not available.
And folks can always ignore your emails if they are not interested in
the subject.

The first steps should be relatively straightforward:

0. Create a test procedure to detect regression bugs. Collect results
using unmodified code. You will come back to this at the end of each
step below (at least). In my experience, quota-related code is
impossible to get right without a lot of testing. What may seem like a
reasonable algorithm often results in over- or under-allocation of
resources due to various combinations of concurrent transactions
competing for the same bandwidth.

1. Change client_db storage format to something you can store in a
memory segment (or a few segments). You can still use hashes and lists,
but you have to avoid raw C++ pointers to elements.

2. Add atomic locking to the resulting structure, preferably on a member
basis (i.e., not a single giant lock). Existing StoreMap and IPC Queues
are good examples of how this can be done.

3. Switch from per-process db to a shared db (if shared memory
primitives are supported and there are multiple workers).
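
To make steps 1 and 2 more concrete, here is a minimal sketch of a
pointer-free table that can be constructed inside any flat memory
block. It is not the existing client_db, StoreMap, or IPC Queue code:
every name in it (ClientTable, ClientEntry, findOrAdd, addBytes,
createIn) is hypothetical, the sizes are arbitrary, and it assumes
std::atomic<bool> is lock-free and usable across processes on the
target platform. Entries link to each other by array index instead of
by raw pointer, so the layout stays valid no matter where each worker
maps the segment, and each entry carries its own lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical sketch; none of these names exist in Squid.
using Index = std::uint32_t;
constexpr Index NoEntry = UINT32_MAX; // plays the role of a null pointer
constexpr Index Capacity = 64;        // arbitrary fixed segment size
constexpr Index Buckets = 16;

struct ClientEntry {
    std::atomic<bool> busy{false}; // per-entry lock (step 2)
    Index next = NoEntry;          // index link, not a raw pointer (step 1)
    std::uint32_t addr = 0;        // client IPv4 address, host byte order
    std::uint64_t bytes = 0;       // traffic accounted so far
};

struct ClientTable {
    std::atomic<bool> listLock{false}; // guards bucket heads and free list only
    Index freeHead = NoEntry;
    Index buckets[Buckets];
    ClientEntry entries[Capacity];
};

// Simple spin lock helpers; real code would bound or back off the spin.
inline void lockFlag(std::atomic<bool> &f) {
    while (f.exchange(true, std::memory_order_acquire)) {}
}
inline void unlockFlag(std::atomic<bool> &f) {
    f.store(false, std::memory_order_release);
}

// Constructs the table inside caller-provided memory, such as a shared
// segment. Only one process should do this; the others just attach.
ClientTable *createIn(void *mem) {
    ClientTable *t = new (mem) ClientTable;
    for (Index b = 0; b < Buckets; ++b)
        t->buckets[b] = NoEntry;
    for (Index i = 0; i < Capacity; ++i) // chain all entries into the free list
        t->entries[i].next = (i + 1 < Capacity) ? i + 1 : NoEntry;
    t->freeHead = 0;
    return t;
}

// Returns the index of the entry for addr, creating it if needed.
// Returns NoEntry when the table is full.
Index findOrAdd(ClientTable &t, std::uint32_t addr) {
    lockFlag(t.listLock);
    Index &head = t.buckets[addr % Buckets];
    Index i = head;
    while (i != NoEntry && t.entries[i].addr != addr)
        i = t.entries[i].next;
    if (i == NoEntry && t.freeHead != NoEntry) {
        i = t.freeHead; // take the first free slot
        t.freeHead = t.entries[i].next;
        t.entries[i].addr = addr;
        t.entries[i].bytes = 0;
        t.entries[i].next = head; // push onto the bucket chain
        head = i;
    }
    unlockFlag(t.listLock);
    return i;
}

// Updates one client's counter under that entry's own lock, so
// updates for different clients do not contend. Returns the new total.
std::uint64_t addBytes(ClientTable &t, Index index, std::uint64_t n) {
    assert(index < Capacity);
    ClientEntry &e = t.entries[index];
    lockFlag(e.busy);
    e.bytes += n;
    const std::uint64_t total = e.bytes;
    unlockFlag(e.busy);
    return total;
}
```

For step 3, the same createIn() call could be given either ordinary
heap memory (one worker, or no shared memory support) or the address
of a mapped shared segment, leaving the rest of the code unchanged.
The sketch leaves out entry removal and reuse, which is where most of
the locking difficulty is likely to be.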

HTH,

Alex.
Received on Fri Feb 17 2012 - 22:52:26 MST
