Re: [RFC] Annotation quoting

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 18 Jun 2013 21:50:26 +1200

On 18/06/2013 8:21 p.m., Tsantilas Christos wrote:
> Some customers are complaining that their annotations (set by url_rewriter)
> look wrong when logged to access.log. Here are a few examples of logged
> annotations:
> %22-%22
> %22Default_Google%22
> %22pg13,k12%22

Where are those %22 characters coming from?
  They are not part of HTTP/ICAP header syntax, and quoting should be
stripped by the helper on receipt. So this is something special being
added somewhere along the line.

> Currently, when the logging code needs to log an annotation, it:
> 1) first checks, for each note value, whether quoting is needed (it uses
> ConfigParser::QuoteString, so it adds quotes if the string contains any
> non-alphanumeric character)
> 2) then applies logformat quoting. By default, this means URL-encoding
> the value before printing.
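
The two steps described above can be reproduced with a minimal Python
sketch. It is not Squid code: quote_if_needed() is a hypothetical stand-in
for ConfigParser::QuoteString based only on the description above, and
leaving commas unescaped in the URL-encoding step is an assumption made to
match the logged examples.

```python
from urllib.parse import quote


def quote_if_needed(value: str) -> str:
    """Hypothetical stand-in for ConfigParser::QuoteString.

    Wraps the value in double quotes when it contains any
    non-alphanumeric character, as described in the thread.
    """
    if value.isalnum():
        return value
    # Assumption: backslashes and embedded quotes are backslash-escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def log_note(value: str) -> str:
    """Step 1 (quote if needed), then step 2 (default URL-encoding)."""
    # Assumption: commas stay literal, matching the logged examples.
    return quote(quote_if_needed(value), safe=",")


for note in ["-", "Default_Google", "pg13,k12", "plain"]:
    print(f"{note!r:>18} -> {log_note(note)}")
```

Under these assumptions, the double quote added in step 1 is what turns
into %22 in step 2, while a purely alphanumeric value passes through
unchanged.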
>
>
> If we just remove the default logging quote, the values will be printed
> inside quotes (""), but if the user wants to change the quoting style, they
> will have the same problems.
> If we remove the quotes from the values we may have problems, because
> annotations may include commas (',') or spaces, which will confuse logging.
>
>
> It looks like we have the following choices to fix this:
>
> 1. Do nothing. Claim that we do not support values with commas even
> though they can be passed to Squid (and Squid will correctly interpret
> them) by helpers and HTTP/ICAP/eCAP agents. This is pretty bad because
> customers want to use such annotations (and comma is a natural delimiter
> in many cases).

Test that the logformat quoting syntax is working, and tell the customers
how to use it to customize quoting.

     %'{X-Name:;}note %"{X-Name:;}note %[{X-Name:;}note %#{X-Name:;}note

Which of the four tokens above displays the log field the way they want?
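
As a rough illustration of why the choice of quoting modifier matters, the
Python sketch below renders an already-quoted note value under three of the
four styles. It is not Squid's implementation: render() is a hypothetical
helper, the escaping rules for the %" style are an assumption, and the %[
style is left out because its rules are not described in this thread.

```python
from urllib.parse import quote


def render(value: str, style: str) -> str:
    """Approximate a logformat quoting modifier applied to a note value."""
    if style == "'":
        # No quoting: print the value as-is.
        return value
    if style == '"':
        # Assumption: quoted-string style backslash-escapes \ and ".
        return value.replace("\\", "\\\\").replace('"', '\\"')
    if style == "#":
        # URL quoting; commas are left literal (assumption).
        return quote(value, safe=",")
    raise ValueError(f"style {style!r} is not modelled in this sketch")


# A note value that already had quotes added before logformat quoting.
stored = '"pg13,k12"'
for style in ["'", '"', "#"]:
    print(f"%{style}{{X-Name:;}}note -> {render(stored, style)}")
```

In this sketch only the %' form prints the stored value untouched; the
other two re-encode the quotes that were added earlier.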

> 2. Encode individual annotation values separately while encoding
> in-value commas if needed (e.g., when using URL-encoding). This solves
> the problem but adds overhead. It also makes logging somewhat
> inconsistent: HTTP values are encoded as one string, not individually,
> and in-value HTTP commas are not encoded (but possibly should?).
>
> 3. Make the value delimiter configurable. The admin may be able to use a
> delimiter string that will not clash with characters used inside
> annotations. This is not ideal because the admin has to know in advance
> what annotations are possible (to avoid clashes). This option is
> probably easier to implement than option #2.
>
>
> Personally I prefer 3. We can give the delimiter as an argument in the
> %note formatting code, e.g.:
> %{X-Name:;}note
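
To make option 3 concrete, here is a small Python sketch of how a token
such as the one above could be split into a note name and a delimiter, and
how the delimiter would be used when joining values. The parser, the
function names, and the "," default are all hypothetical; this only
illustrates the proposal and does not describe existing Squid behaviour.

```python
import re

# Optional quoting modifier, then {name[:delimiter]}note (proposed syntax).
TOKEN_RE = re.compile(
    r"""^%(?P<quote>['"\[#])?\{(?P<name>[^:}]+)(?::(?P<delim>[^}]*))?\}note$"""
)


def parse_note_token(token: str):
    """Return (quoting modifier or None, note name, delimiter)."""
    m = TOKEN_RE.match(token)
    if m is None:
        raise ValueError(f"not a note token: {token!r}")
    # Assumption: fall back to "," when no delimiter is given.
    return m.group("quote"), m.group("name"), m.group("delim") or ","


def join_values(values, delim):
    """Join note values, reporting whether the delimiter clashes with any."""
    clash = any(delim in v for v in values)
    return delim.join(values), clash


quote_mod, name, delim = parse_note_token("%{X-Name:;}note")
joined, clash = join_values(["pg13,k12", "Default_Google"], delim)
print(name, repr(delim), joined, "clash" if clash else "no clash")
```

The clash flag shows the weakness mentioned in option 3: the admin has to
pick a delimiter that never occurs inside a value, and with "," the first
value here would clash.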

The delimiter is not up for question, is it? The quoting syntax is the
problem, and the logformat definition already provides flexible quoting codecs.

Amos
Received on Tue Jun 18 2013 - 09:50:42 MDT