Re: [PATCH] HttpRequest::helperNotes to NotePairs

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 19 Mar 2013 14:31:14 -0600

On 03/19/2013 01:36 PM, Tsantilas Christos wrote:
> On 03/16/2013 01:52 AM, Alex Rousskov wrote:
>> On 03/15/2013 05:18 PM, Amos Jeffries wrote:
>>> On 16/03/2013 7:11 a.m., Tsantilas Christos wrote:
>>>> This patch converts the HttpRequest::helperNotes to a NotePairs object
>>>> This patch is similar with the one posted under the "Helper and Notes"
>>>> thread and just add a new method:
>>>> NotePairs::hasPair to check if a key/value pair is stored
>>>>
>>>>
>>>> We need to discuss, how to handle multiple notes with the same key.
>>>> Assume that you have the following answer from a helper:
>>>> OK user="J, Smith" group="group1,A" group=group2
>>>>
>>>> Currently (with or without this patch), formating code "%note{group}"
>>>> will print:
>>>> "group1,A,group2"
>>>>
>>>> The formating code "%note" will print:
>>>> "user: J, Smith\r\ngroup: group1,A\r\ngroup: group2"
>>>> I must note here that the two "group" kv-pair keys will be printed as
>>>> separate entries.
>>>>
>>>> How should we handle this case? Is it OK as is?
>>>>
>>>> We need to decide if the ',' will be used as separator or as valid
>>>> values character.

> My question has to do with the way we are interpreting the helpers response.
> The current code allow helper answers in the form:
> OK user="\"J, Smith\",chtsanti" user="Alex"
>
> This one will store the following notes:
> user: J, Smith
> user: chtsanti
> user: Alex
>
> Is it OK?

No, there are only two user values in the above response IMO:

   value #1: "J, Smith",chtsanti
   value #2: Alex

The quoted comma should not become a value delimiter -- it is a part of
the first value.

> An alternate method can be to support the following.
> helper response:
> OK user="J, Smith",chtsanti user=Alex
> be stored as the following notes:
> user: J, Smith
> user: chtsanti
> user: Alex

Yes, the above looks correct to me.

>>> IMO when we are printing the syntax key '=' value, it makes a lot of
>>> sense to use the HTTP ABNF format which uses both encoded token or
>>> 'bare' quoted-string ("...") for any value with reserved characters such
>>> as '',' or ':' in our notes.
>>
>> And I think we already have code somewhere that smartly adds quotes to
>> values that should be quoted while not quoting simple tokens (a common
>> case).
>
> I did not found....

ConfigParser::QuoteString().

>>> The simple way to do that is to just
>>> quoted-string all values on %note and print as duplicate keys in both
>>> formats.
>>
>> It is better not to over-quote for performance and aesthetic reasons, I
>> think. It also helps to match what folks write in squid.conf with what
>> Squid shows in logs. If a token does not contain special characters,
>> let's not quote it. A comma can be declared a special character, of course.
>
> The %note currenlty is a little strange... We are urlencoding the result
> before print it so any '"' or other special character will be printed
> urlencoded.
> This is the policy we are using for printing headers. I think is not so
> bad, but maybe the result looks a little strange . The "%{user}note"
> will print:
> %22J,%20Smith%22,chtsanti,Alex

We need to print values as close to their preferred input form as
possible. Contexts that do not URL-decode notes, should not URL-encode
them. Contexts that do URL-decode values, should URL-encode them (unless
support for URL decoding is deprecated). Otherwise, folks get really
confused when trying to troubleshoot issues. I have already seen that
confusion happen in cases where admins were upgrading their helper scripts.

I hope URL decoding of notes is limited to [old] helper responses. Can
we deprecate that practice? If yes, we should not URL-encode notes when
reporting, sending, or logging them (by default).

>>> Please don't implement hasPair using hasByNameListMember(). The whole
>>> set of HttpHeader 'strList' manipulator functions are rather nasty in
>>> the way they build the headers into strings before re-parsing the
>>> values. They also take a side-trip through the HTTP reserved headers
>>> matching which is completely unnecessary for Notes.
>>> The simple loops you have for find() should work a lot faster despite
>>> the string comparision on each key.

>> If the existing API does what we need, should not we use it unless its
>> speed makes it really unacceptable? IMO, this does not sound like such a
>> critical code path that we should not use an existing API.

> If we agree that the current helper response interpreting is OK then the
> hasByNameListMember is helpful to be used in hasPair.
> The hasPair currently used in ICAP/eCAP interface to avoid store
> duplicate values for a key.
> It can be used in an acl note too...

Does hasByNameListMember() handle quoted commas correctly? If yes, I
think we should keep using it. If not, I think we should fix it (and
keep using its API).

$0.02,

Alex.
Received on Tue Mar 19 2013 - 20:31:16 MDT

This archive was generated by hypermail 2.2.0 : Wed Mar 20 2013 - 12:00:09 MDT