Re: parsing quoted-string HTTP header fields

From: Tsantilas Christos <chtsanti_at_users.sourceforge.net>
Date: Thu, 26 May 2011 10:02:32 +0300

On 05/26/2011 07:42 AM, Amos Jeffries wrote:
> On Wed, 25 May 2011 20:07:13 +0300, Tsantilas Christos wrote:
>> Here are two patches the first implements the (1) and the second the (3).
>
> Attach?
oops!
Now attsached.

>
>> The second patch does not look slower, but maybe we do not want to be
>> strict for many reasons. One reason is that this function used to
>> "Configurable SSL error details messages" patch, to parse the "descr:"
>> and "detail:" fields :-)
>> The other reason is that it is usual to find web clients or servers
>> which uses a simple "\n" instead of the "\r\n"...
>>
>
> I'm thinking a copy process may be slower than a basic scan, due to new
> memory allocations etc.
> Not necessarily true, waiting to see those patches. :)
>
> Amos
>
>>
>> On 05/25/2011 02:51 PM, Amos Jeffries wrote:
>>> On 25/05/11 22:44, Tsantilas Christos wrote:
>>>> The quoted string fields parsing done in squid3 using the
>>>> httpHeaderParseQuotedString function (HttpHeaderTools.cc file)
>>>>
>>>> I found that we do not support multiline quoted string fields, but
>>>> according the rfc2616 multiline quoted string fields should supported:
>>>>
>>>> quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
>>>> qdtext = <any TEXT except <">>
>>>> TEXT = <any OCTET except CTLs,
>>>> but including LWS>
>>>> LWS = [CRLF] 1*( SP | HT
>>>>
>>>>
>>>>
>>>> I am planning to fix the httpHeaderParseQuotedString function, using
>>>> one
>>>> of the following rules:
>>>>
>>>> 1) Just ignore any "\r" or "\n" character. This is the fastest and
>>>> simpler approach
>>>> 3) Require "\r\n " or "\r\n\t" as line separator and just remove the
>>>> "\r\n"
>>>> 3) Require "\r\n " or "\r\n\t" as line separator and replace it with a
>>>> space
>>>> 4) Also support "\n\t" or "\n\n\t" as line separators.
>>>
>>> (4) case #2 "\n\n\t" is also non-compliant.
>>>
>>>>
>>>> Any comments or suggestions?
>>>
>>> RFC states "A recipient MAY replace any linear white space with a single
>>> SP".
>>>
>>> So (1) and the second (3) {replace with 0x20} are the valid options.
>>>
>>> It may be worth having two quoted-string parsers. One for each option.
>>> The preference being (1) for its higher performance.
>>>
>>> Amos
>
>

Received on Thu May 26 2011 - 07:02:51 MDT

This archive was generated by hypermail 2.2.0 : Thu May 26 2011 - 12:00:06 MDT