Re: [PATCH] Enhanced access logging, added ICAP logging, chunked requests, bug #2495 fix.

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 30 Jun 2009 11:09:11 +1200

On Sun, 28 Jun 2009 01:46:12 -0600, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 06/28/2009 12:45 AM, Amos Jeffries wrote:
>
>> I note that there are a lot of different fixes going on here. Is it easy
>> to pull out these ones for separate merges?
>>
>> - Bug 2495
>> - chunked requests
>> - DNS timers and logging
>> - ICAP logging changes
>
> Chunked requests are relatively easy to pull.
>
> Bug 2495 can probably be pulled out, but DNS logging stuff is
> essentially a part of it because the new DnsLookupDetail class caries
> the wait time info which gets updated in the right places, all for
> logging. The two were developed together, at the same time. It would
> take a relatively long time to fully separate the two but it can be done.
>
> Most logging is too tightly integrated to be separated (and it probably
> makes no sense to do that anyway because it should be reviewed and used
> together).
>
> Please note that if this patch is approved, separately developed
> features will be merged (by bzr) separately. My understanding is that
> bzr will replay the original commit sequence so we will have separate
> commit messages for separate (but dependent) features.
>
> With the bzr magic caveat in mind, do you want me to spend time on
> separating chunked requests hack and possibly bug 2495 fix?

It would probably be worth it for the chunking break-out. Though the rest
sounds less simple. Lets see how big the 'other' patch is when chunking is
not part of it.

FYI I'm asking for two reasons:
 1) easier audit. I got through ~10% of the lines before falling asleep the
other night.

 2) My experience with the merge features is that bzr does its 'replay'
only internally for consustency.

In the public changesets we keep for patches the merge shows up as one
commit with the entire branch log as the changeset comment. see
http://www.squid-cache.org/Versions/v3/3.0/changesets/b8726.patch for an
example of what happened when I tried it for maintenance updates to
3.0.STABLE3.
 These days I only use it to sync devel branch to trunk as a "Merged from
trunk" change.
 For such big and important stuff as these breaking it out allows us to
keep at least a minor separating in the public changesets. That said, the
'breaking out' can be done using the -r features to cherry-pick for both
the merge and the audit diff rather than creating whole new branches.

>
> > I'm taking a closer look at this one in light of the logging
alterations
>> that I have coming up.
>>
>>
>> Your comments suggest ICAP/eCAP are going to be done outside access.log
>> so the bits for them do not belong in AccessLogEntry. I know ALE is a
>> mix of uses already, but no need to add even more garbage there before
>> we fix that bug. Is it easy to make IcapLogEntry a main object in its
>> own files, possibly inheriting from AccessLogEntry if you need the ALE
>> fields in the icap.log? If not we shall have to do it after the fact in
>> the logging updates, but I would rather have it clean beforehand.
>
> There are two groups of logging features, some go into access.log and
> some go into icap.log. Moreover, icap.log can log almost everything
> access.log can (but in a different scope).
>
> I doubt I can pull out IcapLogEntry and make it an ALE kid without
> redesigning old ALE-dependent code that assumes everything is in ALE. I
> think somebody already looked at that and the conclusion was that either
> we have to redesign the whole logging mess or keep adding to it. I
> wanted to save resources for more important stuff so I said "add to the
> mess". This is also related to the namespace question below.
>
> Please note that ALE is currently organized as a collection of
> scope-specific subclasses. The \todo I added was specifically to
> emphasize that I know that this is the wrong style, but I am not going
> to change it for now. Note the plural in "Inner class declarations" :-)

Okay. Understood. I'll add this as yet another task on the redesign then.

>
>> When splitting an existing code (Hs in this case) for inbound and
>> outbound. Please update them to use both directional < and > format
>> codes.
>
> OK.
>
>> The non-directional then needs to default to the old behavior
>> with a configu parse WARNING: about the change.
>>
>> We appear to be fallign into a predictable pattern for LFT. May as well
>> keep that up...
>
> Are you sure about the warnings? Every squid.conf that used a custom (or
> copied default) format will generate these warnings. I think this may
> annoy too many folks. I am not against adding support for both "<*" and
> "*" versions if you think I must, but I wonder if warnings should wait
> until most folks switch or start using to the documented/recommended
> fields?

No, we have other stuff such as the reply/request header split to be
identical situation. When the split happens the old format is deprecated
immediately. We need to start warning as early as possible so that when it
does get remove people have less excuses for not being ready.
The deprecation gets notified in four places: config parsing, release notes
for logformat, the roadmap deprecations list, and my next release
announcement where it takes effect.

It's only a level-1 warning that the format is now deprecated and will at
some undefined future time star to be fatal. Admin are used to this as long
as the message is polite and clear on its message, like: "%h is deprecated
and will be removed without further notice in a future release. Please use
%>h in your " << cfg_file << " file instead."
(assuming its the %>h that provides the old behavior).

>
>> These don't fit however. What is the difference now?
>>
>> + LFT_SENT_HTTP_CODE,
>> + LFT_RECEIVED_HTTP_CODE,
>
> One is sent by Squid to the client. The other one is received by Squid
> from the server. Or are you asking about some other difference?

That was the Q & A, I was looking for. I was not sure from the naming if it
was ICAP related. Seems not.

Naming these clearly seems a bother. I'd go for LFT_HTTP_*_CODE in line
with the other naming patterns then. Pending a more client-side/server-side
descriptive word than SENT/RECEIVED.

>
>>
>> These are ICAP headers right? but not with names 'Last-Matched-Icap:'
>> etc.? So can you make the ICAP ones all LFT_ICAP_*
>> - LFT_LAST_MATCHED_ICAP_HEADER,
>> - LFT_LAST_MATCHED_ICAP_HEADER_ELEM,
>> - LFT_LAST_MATCHED_ICAP_ALL_HEADERS,
>> + LFT_ICAP_LAST_MATCHED_HEADER,
>> + LFT_ICAP_LAST_MATCHED_HEADER_ELEM,
>> + LFT_ICAP_LAST_MATCHED_ALL_HEADERS,
>
> There are ICAP headers logged to access.log. I think this is why they do
> not start with LFT_ICAP_. I am happy to change the enums if you do not
> think it would be better the other way around.
>
> I am not sure what you mean by "with names 'Last-Matched-Icap:" but
> these will log actual ICAP header fields. There could be many header
> fields with the same name across multiple ICAP transactions (e.g., in a
> chain) so we use the "last matched" trick.

Okay, my explain; Now that we have multiple protocols being logged with
these LFT_* options. What I'm seeing almost there and working toward
making formal. Is that protocol-specific stuff gets called
LFT_<PROTOCOL>_* (already LFT_HTTP_*, LFT_HTCP_*, LFT_ICAP_* ).

So and _HTTP_ header called "Blah-ICAP:" if needs special logging tag gets
enum as LFT_HTTP_BLAH_ICAP_HEADER. TO differ it from an ICAP header "Blah:"
 which gets enum LFT_ICAP_BLAH_HEADER.

Clear as Mud?

>
>> Is the tag form: %adaptation_ and %icap_* set in stone? There was a very
>> nice 'namespacing' proposition a while back which never seems to appear.
>> Using the ':' separator syntax like so --> %icap::<field>, or
>> %adapt::icap::<field>
>> The HTTP tags are all reserved char-codes. So if we are adding a new tag
>> format on top it might as well be a nice extensible one.
>
> AFAIK, the person who was going to implement
> http://wiki.squid-cache.org/Features/AclNamespaces
> is no longer working on Squid. I still stand behind that proposal in
> principle, but will not have the time to implement it myself in the
> foreseeable future :-(.
>
> Also, do we support line \
> continuation in squid.conf? We are approaching the limit of \
> what can be managed on one line and namespaces will only make \
> things worse

I believe we do. There are comments to the effect in the config parser.
Will have to check if it works though.

Amos
Received on Mon Jun 29 2009 - 23:09:19 MDT

This archive was generated by hypermail 2.2.0 : Tue Jun 30 2009 - 12:00:06 MDT