Re: [RFC] %Sf to log request state flags

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 16 Jun 2011 09:29:24 -0600

On 06/15/2011 05:36 PM, Amos Jeffries wrote:
> On Wed, 15 Jun 2011 14:31:35 -0600, Alex Rousskov wrote:
>> On 06/12/2011 01:28 AM, Amos Jeffries wrote:
>>> On 21/05/11 12:28, Amos Jeffries wrote:
>>>> On 21/05/11 03:58, Alex Rousskov wrote:
>>>>> On 05/20/2011 08:56 AM, Amos Jeffries wrote:
>>>>>> I'm now looking at adding %Sf code to log the handling flags Squid
>>>>>> works with for modes etc.
>>>>>>
>>>>>> Anyone have ideas on exactly what to put in the log?
>>>>>> coded characters? (one, two, a whole word?)
>>>>>
>>>>> Can you give a few examples of what you call the handling or state
>>>>> flags?
>>>>
>>>> src/HttpRequestFlags.h
>>>>
>>>> The "mode" and type ones (accel, tproxy/spoof, intercepted,
>>>> transparent, sslbump, internal, ims, range) and the major behaviour
>>>> indicators (redirected, cacheable, nocache, ignore_cc, auth,
>>>> loopdetect, chunked_reply, stale, adapted).
>>>>
>>>> That's probably not all we will want to log, but these are the ones
>>>> that stand out right now as needing a mention.
>>>>
>>>> Amos
>>>
>>> I've broken that into three groups. One about the traffic "mode" type,
>>> one about what we do during HTTP handling, and one about what happens
>>> during the optional adaptations.
>>>
>>> Going through the list of things done and flags floating around. Also
>>> assuming one character per flag I get this:
>>>
>>>
>>> BNF form:
>>>
>>> squid-flags ::= port-mode '-' http-alt [ '-' adapts ]
>>>
>>> port-mode ::= 'P' [ 'a' | 'f' | 'I' | 'i' | 's' | 't' | 'z' ]
>>>
>>> http-alt ::= 'H' [ 'l' | 'm' | 'o' | 'r' | 'u' | 'v' | 'z' ]+
>>>
>>> adapts ::= 'A' [ 'E' | 'e' | 'i' | 's' | 'w' | 'x' | 'z' ]+
>>>
>>>
>>>
>>> port-mode tags meaning:
>
> Flags which outline the transit type (aka port "mode"). Derived from the
> http_port/https_port configuration directives. May contain deduced flags
> which were omitted from the config file definition.
>
>>>
>>> a - accel / reverse
>>> f - default / forward
>>> I - internal request
>>> i - interception (NAT)
>>> s - spoofing (TPROXY)
>>> t - transparent (HTTP definition)
>>> z - encrypted traffic
>>
>> It does not matter for now, but I think these should be documented using
>> specific squid.conf examples. Otherwise, it is not 100% clear (to me
>> anyway) that they are mutually exclusive and what some of them really
>> mean.
>>
>> I would also allow multiple flags for port-mode because a single
>> http_port may have multiple attributes/modes that apply to the same
>> transaction.
>
> Sorry, I missed the "+" on that BNF. No, these are not mutually
> exclusive for the purposes of %Sf. The ones which are mutually exclusive
> are documented on the http_port directive, where the relevance is greater.
>
>>
>>
>>> http-alt tags meaning:
>>>
>
> Flags which outline the manipulations and operations undertaken in the
> request-line and MIME header sections.
> Also request/reply alterations undertaken in accordance with RFC 2616 due
> to details in those sections. (for now only [de-]chunking)
>
>>> l - loop detected
>>> m - HTTP mangling adaptations (non-violation changes)
>>> o - ignore client Cache-Control
>>> r - HTTP Redirect by Squid (not by origins)
>>> u - HTTP Upgrade (1.0->1.1 required changes, etc)
>>> v - HTTP protocol violation
>>> z - HTTP transfer encoding mux (chunking etc)
>>
>> All of the above seem optional. Let's make this group of state flags
>> optional, just like adaptation flags below. I would also generalize the
>> format so that more groups can be added in the future:
>>
>> squid-flags ::= flags-group [ '-' flags-group ]*
>> flags-group ::= port-flags | http-flags | adapt-flags | other-flags
>> ...
>
> Okay. Seems reasonable.
>
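To make the generalized format concrete, here is a rough Python sketch of
how a log analysis script might split a %Sf value into groups. It is only
an illustration under assumptions: the function and table names are
hypothetical, the 'P', 'H' and 'A' group prefixes are taken from the
earlier BNF, and the sample value "Pai-Hlz-Ai" is made up.

```python
import re

# Group prefixes as proposed earlier in this thread (assumption: they stay
# as 'P' = port mode, 'H' = HTTP handling, 'A' = adaptation).
GROUP_NAMES = {"P": "port", "H": "http", "A": "adapt"}


def parse_squid_flags(field):
    """Split a %Sf value such as 'Pai-Hlz-Ai' into {group name: set of flags}."""
    groups = {}
    for token in field.split("-"):
        # One letter naming the group, then zero or more alphanumeric flags.
        if not re.fullmatch(r"[A-Za-z][A-Za-z0-9]*", token):
            raise ValueError("malformed flags-group: %r" % token)
        # Unknown prefixes are kept, so new groups can be added later.
        name = GROUP_NAMES.get(token[0], "other:" + token[0])
        groups.setdefault(name, set()).update(token[1:])
    return groups
```

Because every group is optional and carries its own prefix, a reader of
the log does not depend on group order, and a group it does not know
about is preserved rather than rejected.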
>>
>>
>>> adapts tags meaning:
>
> Adaptation manipulations which involve the request or reply body.

Squid does not always know whether the body was adapted by eCAP or ICAP.
And in many cases, header adaptations are as important as body
adaptations. I would not exclude header adaptations and define the
adapt-flags group as

  Request and response adaptation flags (ICAP, eCAP, URL rewriting,
  ESI, etc.) and man-in-the-middle message decryption flags (SslBump).

I would probably exclude SslBump from this group because it "adapts" on
a very different level, but that is your call.

>>> E - ESI constructed reply
>>> e - eCAP adapted
>>> i - ICAP adapted
>>> s - SSL bump
>>> w - URL re-write
>>> x - cross-protocol gateway (FTP->HTTP, Gopher->HTTP, etc)
>>> z - encrypted / decrypted (ie gzip)
>>
>> For eCAP and ICAP, should we distinguish REQMOD from RESPMOD?
>
> Maybe. If so, we make ESI "s". *CAP case sensitive REQMOD (e|i), RESPMOD
> (E|I). Dropping ssl-bump into "z".
>
>>
>>
>> I do not want to fight over specific flag placements, but please try to
>> better define what each group means (or what it is for). A more precise
>> and detailed definition will help those who want to add new flags or new
>> groups.
>
>
>
> The definition here was for things which involve manipulation of the
> body. ssl-bump log entry for the CONNECT fits in here as "z" (decrypted
> the body) or something similar. The decrypted sub-requests get their own
> log entries and ssl-bump there fits in the http_port flag group as "z"
> (encrypted transit).

I do not understand why you want to separate the body from the headers
as far as flags are concerned. I would focus on modules or algorithms
that touched the message first. Which part of the message was touched
may be important as well, but seems like a secondary concern and we
cannot always tell.
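
As a sketch of what "focus on modules" could look like, the flag can be
chosen from the module and vector point that handled the message, without
recording whether headers or body were changed. This is Python rather than
Squid code, and every name in it is hypothetical. The letters follow the
tentative lower-case REQMOD / upper-case RESPMOD idea quoted above, which
is not a settled mapping.

```python
# Hypothetical (module, vector) -> flag table. 'w' for URL re-write comes
# from the adapts list in this thread; the e/E and i/I split is tentative.
ADAPT_FLAGS = {
    ("ecap", "reqmod"): "e",
    ("ecap", "respmod"): "E",
    ("icap", "reqmod"): "i",
    ("icap", "respmod"): "I",
    ("url-rewrite", None): "w",
}


def adapt_group(touched_by):
    """Build the 'A...' group from the modules that touched the transaction.

    touched_by is an iterable of (module, vector) pairs. An empty string is
    returned when nothing adapted the message, so the group can be omitted.
    """
    letters = sorted({ADAPT_FLAGS[key] for key in touched_by})
    return "A" + "".join(letters) if letters else ""
```

For example, a transaction handled by an ICAP RESPMOD service and an eCAP
REQMOD service would log "AIe" under this table, whichever part of the
message either service changed.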

>> Finally, we may want to reserve "+" or a similar symbol to be used for
>> multi-letter flags when we run out of single symbols or decide we want
>> to log more than just flags.
>
> Reserving all non-alphanumeric for now would be good.
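
A check along these lines would enforce that reservation when new flags
are proposed. It is a minimal Python sketch with a hypothetical function
name, assuming flags stay limited to single ASCII letters and digits:

```python
def is_valid_flag(ch):
    """Return True if ch may be used as a single-character flag today.

    All non-alphanumeric characters (including '+' and the '-' group
    separator) are treated as reserved for future syntax.
    """
    return len(ch) == 1 and ch.isascii() and ch.isalnum()
```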

Thank you,

Alex.
Received on Thu Jun 16 2011 - 15:30:25 MDT
