Re: [PATCH] log format tokens upgrade

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 17 Nov 2011 00:47:15 +1300

On 8/11/2011 9:37 a.m., Alex Rousskov wrote:
> On 11/05/2011 05:13 AM, Amos Jeffries wrote:
>> This updates the format parser and storage objects in the Format::
>> namespace and separates some into separate files for SourceLayout
>> project dependency reductions.
>>
>> Functionally it adds a registration API so components can register
>> themselves an array of tokens in a namespace. Registering the arbitrary
>> namespace "example" with some tokens ("a","b") will cause the parser to
>> accept those tokens in a logging format like so: "%example::a %example::b".
>>
>> Due to time limitations resolving the library-library symbol
>> perculiarities in GCC I have had to cut short of moving the "adapt::"
>> and "icap::" tokens into their libraries. But have gone so far as to
>> make them use the registration process in this patch ready for that step.
>>
>> Future work:
>> - convert the error pages to use format for the page body macros
>> - convert the %ssl_* tokens in src/ssl/* to use format and the
>> namespace "ssl::"
>> - convert external_acl_type to use formats for its helper input string.
>>
>> Amos
>> + /// array of tokens inside this namespace
>> + TokenTableEntry const *tokenSet;
> AFAICT, 3rd-party code modifies this set. Should we prohibit copying and
> assigning of TokenNamespace objects then?
>

It may be modified by the component which registered it. There are only
two limits: that the pointer is kept valid until shutdown, and that the
array ends with the pair {NULL,LFT_NONE}

>
>> - {"<st", LFT_ICAP_BYTES_READ},
>> + {">st", LFT_ICAP_BYTES_SENT},
>> + {"<st", LFT_ICAP_BYTES_READ},
>> {"<bs", LFT_ICAP_BODY_BYTES_READ},
>>
>> {">h", LFT_ICAP_REQ_HEADER},
>> {"<h", LFT_ICAP_REP_HEADER},
>>
>> {"tr", LFT_ICAP_TR_RESPONSE_TIME},
>> - {"tio", LFT_ICAP_IO_TIME},
>> + {"tio", LFT_ICAP_IO_TIME},
>> {"to", LFT_ICAP_OUTCOME},
>> {"Hs", LFT_ICAP_STATUS_CODE},
>>
>> - {NULL, LFT_NONE} /* this must be last */
>> + {NULL, LFT_NONE} /* this must be last */
> Whitespace non-changes?

Yes. Polish within scope as the array is altered.

>> === modified file 'src/main.cc'
>> --- src/main.cc 2011-11-03 10:02:02 +0000
>> +++ src/main.cc 2011-11-05 09:41:52 +0000
>> @@ -63,6 +63,7 @@
>> #include "event.h"
>> #include "EventLoop.h"
>> #include "ExternalACL.h"
>> +#include "format/Token.h"
>> #include "fs/Module.h"
>> #include "PeerSelectState.h"
>> #include "Store.h"
>> @@ -1105,6 +1106,8 @@
>> // TODO: pconn is a good candidate for new-style registration
>> // PconnModule::GetInstance()->registerWithCacheManager();
>> // moved to PconnModule::PconnModule()
>> +
>> + Format::Token::Init();
>> }
>>
>> if (IamPrimaryProcess()) {
>
> Consider removing this dependency or even Format::Token::Init() as such
> by using RunnersRegistry API. I understand you may not have time for
> that, but perhaps you can add a corresponding TODO to discourage more
> code being added to Format::Token::Init()?

I gave it a try but run up against time issues. This is for now matching
the ACL and Auth registrations methodology.

I would really like the other libraries to register and manage their own
token arrays. This is pretty much a requirement before the ssl::
namespace can be merged. So this is definitely going to get another look
long term.

>
>> === renamed file 'src/format/Tokens.h' => 'src/format/Token.h'
>> --- src/format/Tokens.h 2011-10-13 17:05:25 +0000
>> +++ src/format/Token.h 2011-11-05 08:38:12 +0000
>> @@ -1,5 +1,7 @@
>> -#ifndef _SQUID_FMT_TOKENS_H
>> -#define _SQUID_FMT_TOKENS_H
>> +#ifndef _SQUID_FORMAT_TOKEN_H
>> +#define _SQUID_FORMAT_TOKEN_H
>> +
>> +#include "format/TokenTableEntry.h"
> Can you avoid this dependency by pre-declaring class TokenTableEntry? It
> looks like we are only using a pointer to TokenTableEntry in
> src/format/Token.h, no?

Token.h uses the ByteCode and Quoting which it pulls in. Switched the
dependency to ByteCode.h.

The remainder of the notes all done. New patch coming after the test run.

>
>
> The above are all minor polishing touches. I did not see any bugs, and I
> cannot say I understand this code well enough to find them.

Before the patch we have one global array of tokens to scan sequentially
when parsing formats. After the change we have a tree (linked-list of
arrays) with branches selected by prefix (aka "namespace"). Reducing the
total scan time for namespace prefixed tokens, at some minimal expense
to the older generic tokens.

Amos
Received on Wed Nov 16 2011 - 11:47:24 MST

This archive was generated by hypermail 2.2.0 : Thu Nov 17 2011 - 12:00:04 MST