Re: external acl cache level

From: Gonzalo Arana <gonzalo.arana@dont-contact.us>
Date: Mon, 22 May 2006 23:17:18 -0300

On 5/22/06, Henrik Nordstrom <henrik@henriknordstrom.net> wrote:
> Mon 2006-05-22 at 11:46 -0300, Gonzalo Arana wrote:
>
> > Reordering and combining perhaps? Allowing combining would raise the
> > number of cached-entry invalidations from N-2 to (2**N)-2 (I am not
> > counting the current reply as an invalidation).
>
> The big problem is the lookup, which we want to keep quick..
>
> Invalidation is not strictly needed, depending on the lookup order. As
> long as the lookup gives less detail higher priority there is no
> conflict (only unneeded entries in the extacl cache).

Ah, I see now. As long as the helper and squid both follow a "lower
level number, higher priority" policy, there is no need for cache
invalidation.
Just for the record: if a helper replies with level=N, it means that
no response was cached at any level for this key (otherwise the
cached response would have been used and the helper would not have
been asked).
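
To make the "lower level number, higher priority" policy concrete, here
is a minimal sketch in Python. It is not Squid code: the LevelCache
class, its method names and the tuple-prefix keys are all assumptions
made for illustration. It only shows why a lookup that tries the least
detailed key first never needs invalidation.

```python
# Hypothetical sketch: names and structure are illustrative, not Squid code.
class LevelCache:
    def __init__(self):
        self._entries = {}  # maps a key prefix (tuple) to a cached result

    def store(self, key, level, result):
        """Cache `result` for the first `level` components of `key`."""
        self._entries[tuple(key[:level])] = result

    def lookup(self, key):
        """Try the least detailed prefix first; return None on a miss."""
        for level in range(1, len(key) + 1):
            hit = self._entries.get(tuple(key[:level]))
            if hit is not None:
                return hit
        return None


cache = LevelCache()
cache.store(["example.com", "/private"], 2, "ERR")
# A later request for another path missed the cache, so the helper was
# asked, and it answered at level 1 (host only).
cache.store(["example.com", "/other"], 1, "OK")

# The level-1 entry now shadows the older level-2 entry. The level-2
# entry is merely unneeded; nothing had to be invalidated.
print(cache.lookup(["example.com", "/private"]))
```

The cost is at most one hash lookup per level in the key.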

> To be able to make sane lookup structures it is very beneficial if the
> data can be structured in a path like structure. This worked out quite
> okay except that there are acl types where the acl arguments (the data in
> the acl statement) are more important than some request details
> (external_acl_type format tags)...

I may be wrong, but reordering is needed in those cases, which is why
I proposed 'combining' key components: letting the helper specify
which request-tokens may be used for caching this response.

> > The format tag should expand to some string we are sure is not present
> > in any other tags, which is something difficult to assure since we
> > have %{Header:member} tag. Adding 'level' support for external acl
> > cache implies the request/reply pair need some higher level structure
> > (say XML, or HTTP-alike), unless I am missing something.
>
> I am not sure I see the problem you refer to. Can you elaborate a bit on
> what kind of problem you see?

Sure! Here is an example:
external_acl_type useless %{Host} %| /some/helper some-argument
acl yet_another_useless external useless %{Cookie} %| %{MYADDR}

We could just demand that /some/helper be aware of request levels
(this is something you pointed out below), but sooner or later this
will lead to confusion.

Options:
1) Expand '%|' to some string that we know won't be present in any
other tag. I fear that no matter which string we choose for the '%|'
expansion, it could be present in (for instance) the Cookie request
header.

2) As you proposed:
> Another approach would be to mark the arguments per their key detail level.
Unless I misunderstand this, you are proposing that each request could
look something like this (I know that there are cleaner ways to do it,
this is just an example):
1=localhost 1=blah1 2=user_xxx 3=1.1.1.1
where each integer represents the key level.
With this approach, the key-component level is assigned by the squid
configuration, and is not per-request (which perhaps is what is
wanted).

3) We could let the external helper decide the key-component level by
using something like XML-RPC, or we could come up with our own protocol
based on, say, HTTP.
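
To make option 2 concrete, here is a small Python sketch of how a
helper might split the example line from that option into levels. The
function name is hypothetical, and the sketch assumes values contain no
raw whitespace (that is, they have already been encoded per the helper
protocol).

```python
from collections import defaultdict


def parse_leveled_request(line):
    """Group 'level=value' tokens by their integer level."""
    levels = defaultdict(list)
    for token in line.split():
        # Split on the first '=' only, so values may contain '='.
        level, _, value = token.partition("=")
        levels[int(level)].append(value)
    return dict(levels)


request = "1=localhost 1=blah1 2=user_xxx 3=1.1.1.1"
print(parse_leveled_request(request))
```

Because the level travels with each token, the helper does not have to
know how the admin ordered the format tags.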

This encoding/protocol/structure (whatever this should be called)
should add support for something like HTTP's Vary: header. In the
response, the helper should indicate which components of the request
were taken into consideration for building the reply.

Of course, this 'custom-made-protocol' approach is needed only if the
key-component level is to be assigned per-request, which is what I've
been chasing so far.
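
A rough Python sketch of the Vary-like idea: the helper's reply names
the request components it actually used, and the cache key is built
from those components only. Everything here (function names, the dict
representation of a request, remembering the component sets seen so
far) is an assumption for illustration, not an existing Squid
mechanism.

```python
cache = {}       # maps a key built from the used components to a result
known_vary = []  # each distinct set of component names seen in replies


def cache_key(request, used):
    """Build a cache key from only the components the helper used."""
    return tuple((name, request[name]) for name in sorted(used))


def store(request, used, result):
    used = frozenset(used)
    if used not in known_vary:
        known_vary.append(used)
    cache[cache_key(request, used)] = result


def lookup(request):
    """Try each component set seen so far; return None on a miss."""
    for used in known_vary:
        if used.issubset(request):
            hit = cache.get(cache_key(request, used))
            if hit is not None:
                return hit
    return None


req1 = {"Host": "example.com", "Cookie": "a=1", "MYADDR": "10.0.0.1"}
store(req1, ["Host"], "OK")  # helper says: only Host mattered

req2 = {"Host": "example.com", "Cookie": "b=2", "MYADDR": "10.0.0.2"}
print(lookup(req2))  # same Host, so the cached reply applies
```

In this sketch the number of lookups grows with the number of distinct
component sets helpers have reported, not with every possible
combination.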

> Draft patch attached. This patch adds %DATA expanding into the acl
> arguments (and %ACL expanding into the acl name for completeness).

Let me see if I follow correctly: with %DATA you can switch the order
of the arguments to external_acl, right? So you can make acl
arguments have higher priority than external_acl formats.

> Problem: %DATA has a slight problem with whitespace characters if the
> helper is to handle arguments with whitespace AND multiple arguments in
> the same acl type: as currently written, both look the same in the
> %DATA expansion (a space character, appropriately encoded per the
> helper protocol).

We seem to fall back into "some higher level structure is needed"
again, mainly because the external helper needs to tell squid which
arguments have been used (the "combining" approach).
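
The whitespace ambiguity described in the quoted problem can be
illustrated with a short Python sketch. The expand_data function is
hypothetical: it assumes, from the description alone, that the acl
arguments are joined with a space and the result is then encoded. The
actual expansion in the draft patch may differ.

```python
from urllib.parse import quote


def expand_data(acl_args):
    # Assumed behaviour: join the arguments with a space, then
    # percent-encode the whole string.
    return quote(" ".join(acl_args), safe="")


one_arg = expand_data(["foo bar"])      # one argument containing a space
two_args = expand_data(["foo", "bar"])  # two separate arguments

# Both expand to the same string, so the helper cannot tell them apart.
print(one_arg == two_args)
```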

> Which reminds me: the external acl helper protocol should be switched by
> default to the 3.0 format for 2.6. The "shell escaped" format used in
> 2.5 was a bit of a mistake (looks pleasing to humans, but is quite ugly
> to computers).

I guess url-escaped arguments are much easier to decode than 'shell
escaped' ones.
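
As an illustration of why url-escaping is easy on the helper side, here
is a Python sketch that encodes and decodes a space-separated request
line with percent-encoding. The function names are made up, and the
sketch does not claim to reproduce Squid's exact wire format; it only
assumes each argument is percent-encoded and arguments are separated by
a single space.

```python
from urllib.parse import quote, unquote


def encode_request(args):
    """Percent-encode each argument and join them with single spaces."""
    return " ".join(quote(arg, safe="") for arg in args)


def decode_request(line):
    """Split on spaces, then undo the percent-encoding of each token."""
    return [unquote(token) for token in line.split(" ")]


args = ["example.com", "user name", "a=1; b=2"]
line = encode_request(args)
print(line)
print(decode_request(line) == args)
```

Decoding needs no quoting state machine: a split followed by a
per-token unescape is enough.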

> The "level" adds structure to the requests by allowing it to be
> structured in a path like manner when needed by introducing the level
> separators in the request format.
>
> %DST %| %PATH
>
> Problems:
>
> The helper is assumed to know the key levels defined in
> external_acl_type. These are not reflected in the request data. Not sure
> this actually is a problem, but results may be odd if the admin
> configures his external_acl_type differently than expected by the
> helper..

I vote -1 for this; basically it is a headache-maker, unless we let
each 'token' of the line sent to the external helper be its own
'level'. This would lead to potentially more hash_lookup calls (which
should be fast anyway).

> With the lack of %DATA above this approach fails if the data from the
> acl is more important than some request details.

Reordering is needed in these cases. "Combining" provides "reordering".

> Another approach would be to mark the arguments per their key detail
> level. With this approach %DATA is not needed as the request parameters
> do not need to be sorted on their detail level and could even be
> extended into alternate priorities. However it shares the first problem
> above (if it is a problem..).

I don't understand why this shares the first problem (about external
helper being aware of the key levels).

> The key detail level markup provides the most flexible solution,
> but may be too complex for the admin. I suppose nothing stops us from using
> a combination to provide the best of both, as the first is a subset of
> the detail level markup, with the level increasing per level marker.

Key detail level markup could be good, but whatever format we choose
for indicating the key level could also appear in the expansion of
some of the tags (%{Cookie}, %{EXT_USER}).

Personally, I am against using complex structures like XML/HTTP-alike
for the 'level' idea, mainly because they make helpers more complex and
more CPU/RAM intensive, and the whole idea is to keep them small and
simple. If providing 'cache level' along with 'reordering' makes
things too complex, the sysadmin can always fall back to using more
than one external acl (one external acl per possible 'level').

I am aware that if we implement key-token-combining, we might need to
perform 2**N hash_lookup calls (where N is the number of tokens
present in the external acl key). If the 'combining' idea wins, this
O(2**N) issue must be tackled (using another data structure seems the
most promising alternative).
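
To show where the 2**N figure comes from, this Python sketch enumerates
every candidate key a naive 'combining' lookup would have to try. The
function name and the token names are illustrative only. Counting only
non-empty subsets gives 2**N - 1 candidates.

```python
from itertools import combinations


def candidate_keys(tokens):
    """Yield every non-empty subset of tokens, smallest subsets first."""
    items = sorted(tokens.items())
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            yield subset


tokens = {"Host": "example.com", "Cookie": "a=1", "MYADDR": "10.0.0.1"}
keys = list(candidate_keys(tokens))
print(len(keys))  # 2**3 - 1 = 7 hash lookups in the worst case
```

Each additional token doubles the worst-case number of lookups, which
is why another data structure would be needed.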

Regards,

-- 
Gonzalo A. Arana
Received on Mon May 22 2006 - 20:17:20 MDT
