Re: [PATCH v2] new request-line parser and unit tests

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 01 Sep 2010 17:33:33 -0600

On 09/01/2010 04:28 AM, Amos Jeffries wrote:
> Amos Jeffries wrote:
>> Split into two patches.
>>
>> The parser_unittests.patch adds tests for all the request line
>> permutations of content and garbage I could think of over the last few
>> days with their RFC 1945 and RFC 2616 expected outcomes. On the cases
>> which were not mentioned, ie binary crap, I've erred to leniency
>> accepting as 'unknown' content in the field with relevant cases
>> assumed to be HTTP/0.9.
>>
>>
>> The request-line parser existing in squid failed on several types of
>> test, mostly the ones centered around missing or invalid URIs.
>>
>> So, parser_changes.patch contains an upgrade/fix to
>> HttpParserParseReqLine() handling HTTP request-lines as specific by
>> section 5.1 of the above RFCs. Specifically to handle a sequence of
>> unknown bytes up to a terminating LF (\n) octet. Nothing outside that
>> is relevant to this parser.
>>
>> Changes:
>> * The semantics as previously documented are taken on. No changes
>> there, but documentation clarified a bit. Some things previously not
>> erroring are now doing so. External code impact is in the nature of
>> reduced special cases to be handled. Specifically raw-CR weirdness in
>> the request line fields. This occuring in URL was a vulnerability at
>> least once.
>>
>> * Prior updates to HttpParser object for other parse stages opens the
>> possibility of this parse action returning HTTP status code directly.
>> Additions are done to make use of this (with the existing status codes
>> only).
>>
>> * Input permutations where the unit-tests showed the old parser was
>> violating its own documentation have been fixed to produce expected
>> outputs.
>>
>> * Old parser operated three distinct potentially long parse loops.
>> Added several local variables to remember various octets seen while
>> searching for the terminal LF. This removed the need for two of the
>> parse re-scans (length of method, length of URI).
>>
>>
>>
>> Future work TODO but outside the scope of this update and not going to
>> be added:
>>
>> * making this part of HttpParser
>> * making HttpParser a trickle-scanner to speed up large request line
>> parsing.
>> * adding 6xx codes for internal use producing better errors on several
>> interesting error cases identified by this parser.
>> * using HttpParser fields output from this to seed the following
>> parser stages and prevent re-scanning this line to find bits.
>> Particularly the HttpRequest sanity checks.
>>
>>
>
> Updated version meeting Alex comments.
>
> made into a member of HttpParser without gotos.
>
> relaxed_header_parser will enable it to also skip prefix whitespace
> (space character only) and multiple-\r sequences at the end of line.
>
> --enable-http-violations is still required before it will accept
> non-HTTP version types 'downgraded' to HTTP/0.9

Thank you for addressing my comments and improving the code.

FWIW, we have tested your patch with Co-Advisor, and it does not break
any non-caching cases.

Alex.
Received on Wed Sep 01 2010 - 23:33:45 MDT

This archive was generated by hypermail 2.2.0 : Thu Sep 02 2010 - 12:00:03 MDT