[PATCH] new request-line parser and unit tests

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 01 Sep 2010 03:20:19 +1200

Split into two patches.

The parser_unittests.patch adds tests for all the request line
permutations of content and garbage I could think of over the last few
days with their RFC 1945 and RFC 2616 expected outcomes. On the cases
which were not mentioned, ie binary crap, I've erred to leniency
accepting as 'unknown' content in the field with relevant cases assumed
to be HTTP/0.9.

The request-line parser existing in squid failed on several types of
test, mostly the ones centered around missing or invalid URIs.

So, parser_changes.patch contains an upgrade/fix to
HttpParserParseReqLine() handling HTTP request-lines as specific by
section 5.1 of the above RFCs. Specifically to handle a sequence of
unknown bytes up to a terminating LF (\n) octet. Nothing outside that is
relevant to this parser.

Changes:
  * The semantics as previously documented are taken on. No changes
there, but documentation clarified a bit. Some things previously not
erroring are now doing so. External code impact is in the nature of
reduced special cases to be handled. Specifically raw-CR weirdness in
the request line fields. This occuring in URL was a vulnerability at
least once.

  * Prior updates to HttpParser object for other parse stages opens the
possibility of this parse action returning HTTP status code directly.
Additions are done to make use of this (with the existing status codes
only).

  * Input permutations where the unit-tests showed the old parser was
violating its own documentation have been fixed to produce expected outputs.

  * Old parser operated three distinct potentially long parse loops.
Added several local variables to remember various octets seen while
searching for the terminal LF. This removed the need for two of the
parse re-scans (length of method, length of URI).

Future work TODO but outside the scope of this update and not going to
be added:

  * making this part of HttpParser
  * making HttpParser a trickle-scanner to speed up large request line
parsing.
  * adding 6xx codes for internal use producing better errors on several
interesting error cases identified by this parser.
  * using HttpParser fields output from this to seed the following
parser stages and prevent re-scanning this line to find bits.
Particularly the HttpRequest sanity checks.

Amos

Received on Tue Aug 31 2010 - 15:20:29 MDT

This archive was generated by hypermail 2.2.0 : Tue Aug 31 2010 - 12:00:05 MDT