Re: [RFC] Annotation quoting

From: Tsantilas Christos <chtsanti_at_users.sourceforge.net>
Date: Tue, 18 Jun 2013 13:35:56 +0300

On 06/18/2013 12:50 PM, Amos Jeffries wrote:
> On 18/06/2013 8:21 p.m., Tsantilas Christos wrote:
>> Some customers are complaining that their annotations (set by url_rewriter)
>> look wrong when logged to access.log. Here are a few examples of logged
>> annotations:
>> %22-%22
>> %22Default_Google%22
>> %22pg13,k12%22
>
> Where are those %22 characters coming from?

Please see my comment below ("Currently the logging code...").

> They are not part of HTTP/ICAP header syntax, and quoting should be
> stripped by the helper on receipt. So this is something special being
> added somewhere along the line.

Yep. The quoting is added only when the annotation is logged. But
currently it looks like it is done in a bad way.

>
>> Currently, when the logging code needs to log an annotation, it:
>> 1) first checks each note value to see whether quoting is needed (it
>> uses ConfigParser::QuoteString, so it adds quotes if the string
>> contains any non-alphanumeric character)
>> 2) then applies the logformat quoting. By default, this means
>> URL-encoding the value before printing it.
>>
>>
>> If we just remove the default logging quote, the values will be printed
>> inside quotes (""), but a user who wants to change the quoting style
>> will have the same problem.
>> If we remove the quotes from the values we may have problems, because
>> annotations may include commas (',') or spaces, which will confuse logging.
>>
>>
>> It looks like we have the following choices to fix this:
>>
>> 1. Do nothing. Claim that we do not support values with commas even
>> though they can be passed to Squid (and Squid will correctly interpret
>> them) by helpers and HTTP/ICAP/eCAP agents. This is pretty bad because
>> customers want to use such annotations (and comma is a natural delimiter
>> in many cases).
>
> Test that the logformat quoting syntax is working and inform the
> customers how to use it to customize quoting.
>
> %'{X-Name:;}note %"{X-Name:;}note %[{X-Name:;}note %#{X-Name:;}note
>
> Which of the four above tokens displays the log field the way they want?

Assume we have the following annotations:
  X-Name: test
  X-Name: a,sec-test

The %'{X-Name}note will print:
  test, "a,sec-test"

(Note the space after first comma)

The %"{X-Name}note will print:
  test, \"a,sec-test\"

The %[{X-Name}note will print:
  test, "a,sec-test"

The %#{X-Name}note will print:
  test,%20%22a,sec-test%22
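
To make the double quoting easier to see, here is a rough Python sketch
of the two steps. It is not Squid code. quote_string() is a hypothetical
stand-in for ConfigParser::QuoteString and only follows the rule
described in this thread (add quotes if any non-alphanumeric character
is present); it ignores any escaping the real function may do. The ", "
join and the choice of which characters the URL encoding leaves alone
(comma, '-', '_') are assumptions taken from the outputs shown above.

```python
from typing import Sequence
from urllib.parse import quote


def quote_string(value: str) -> str:
    """Hypothetical stand-in for ConfigParser::QuoteString.

    Assumption: a value is wrapped in double quotes unless it consists
    only of ASCII letters and digits. No escaping is attempted.
    """
    if value and all(c.isascii() and c.isalnum() for c in value):
        return value
    return '"' + value + '"'


def log_note_values(values: Sequence[str]) -> str:
    """Apply step 1 (per-value quoting), then step 2 (URL encoding)."""
    # Step 1: quote each value if needed and join with ", ".
    joined = ", ".join(quote_string(v) for v in values)
    # Step 2: URL-encode the whole string. Commas are kept as-is here
    # (an assumption based on the logged examples); urllib always keeps
    # letters, digits and "_.-~" unencoded.
    return quote(joined, safe=",")


if __name__ == "__main__":
    samples = (
        ["-"],
        ["Default_Google"],
        ["pg13,k12"],
        ["test", "a,sec-test"],
    )
    for values in samples:
        print(log_note_values(values))
```

Under these assumptions the sketch reproduces the values the customers
reported (%22-%22, %22Default_Google%22, %22pg13,k12%22) as well as the
%# output above, which suggests the %22 comes from URL-encoding quotes
that step 1 has already added.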

>
>
>> 2. Encode individual annotation values separately while encoding
>> in-value commas if needed (e.g., when using URL-encoding). This solves
>> the problem but adds overhead. It also makes logging somewhat
>> inconsistent: HTTP values are encoded as one string, not individually,
>> and in-value HTTP commas are not encoded (but possibly should?).
>>
>> 3. Make the value delimiter configurable. The admin may be able to use a
>> delimiter string that will not clash with characters used inside
>> annotations. This is not ideal because the admin has to know in advance
>> what annotations are possible (to avoid clashes). This option is
>> probably easier to implement than option #2.
>>
>>
>> Personally I prefer 3. We can give the delimiter as an argument in
>> the %note formatting code, e.g.:
>> %{X-Name:;}note
>
> Delimiter is not up for question is it? The quoting syntax is the
> problem, and the logformat definition contains flexible quoting codecs.

The delimiter can be used, for example, in cases where there is a comma
in the values. For example:
  X-Name: test
  X-Name: a,sec-test

We may want to use %{X-Name:;}note to print:
   test;a,sec-test
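
A minimal Python sketch of how such a delimiter argument could behave.
This only illustrates the proposal and is not an implementation:
format_note() is a hypothetical helper, the "name:delimiter" argument
syntax is the one proposed in this thread, and the "," default and the
exact-match name lookup are assumptions.

```python
from typing import List, Tuple


def format_note(notes: List[Tuple[str, str]], arg: str) -> str:
    """Join all values of one annotation with a configurable delimiter.

    `arg` is the text inside the braces of the proposed
    %{name:delimiter}note code. When no delimiter is given, "," is
    used (an assumption for this sketch).
    """
    name, _, delimiter = arg.partition(":")
    if not delimiter:
        delimiter = ","
    # Collect the values in the order they were added; names are
    # compared exactly here, which is a simplification.
    values = [value for (key, value) in notes if key == name]
    return delimiter.join(values)


if __name__ == "__main__":
    notes = [("X-Name", "test"), ("X-Name", "a,sec-test")]
    print(format_note(notes, "X-Name:;"))  # delimiter chosen by the admin
    print(format_note(notes, "X-Name"))    # default delimiter
```

With ";" the two values stay distinguishable (test;a,sec-test). With the
default "," the result is test,a,sec-test, which cannot be split back
into the original values; this is the clash the admin has to avoid when
choosing a delimiter.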

>
> Amos
>
Received on Tue Jun 18 2013 - 10:36:12 MDT
