Re: [PATCH] HTTP Parser upgrade

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Wed, 01 Jan 2014 14:08:24 -0700

On 01/01/2014 05:15 AM, Amos Jeffries wrote:
> This patch renames the HttpParser class as Http1Parser (to differentiate
> from the future planned Http2Parser) and moves it into the Http namespace.
...
> Also, the HttpRequestMethod class is moved into the Http namespace and
> library to reduce dependencies on the parser class outside the library.

> Overall this does not seem to affect performance much.

Any performance testing performed on this patch to support the above
estimation?

> Http::Http1ParserPointer

The proposed naming approach results in awkward names with too many
"HTTP"s in them. I suggest a different arrangement:

  * namespace Http // stuff common to HTTP/1 and HTTP/2 handling
  * namespace Http::One // stuff specific to HTTP/1 handling
  * namespace Http::Two // stuff specific to future HTTP/2 handling
  * No Http prefixes for names inside the above three namespaces.

This alternative approach would result in

   Http::One::Parser
   Http::Two::ParserPointer

instead of

   Http::Http1Parser
   Http::Http2ParserPointer

Same length, but better looking names without repetition of information.
Moreover, after adding these global namespace aliases (to http/forward.h):

   namespace Http1 = Http::One;
   namespace Http2 = Http::Two;

we can use

   Http1::Parser
   Http2::ParserPointer

which are even shorter that the proposed names.

Note that declaring Http1 and Http2 namespaces outside of the Http
namespace (rather than namespace aliases) is probably a bad idea because

* it will force us to use Http:: prefixes inside Http1 and Http2
namespaces, which is kind of wrong, OR

* force us to add blank "use Http" statements, which opens another can
of warms, especially if used in header files.

The final sketch, putting all the naming-related pieces together:

> namespace Http {
> namespace One {
> class Parser ...
> };
> namespace Two {
> class Parser ...
> };
> };
> namespace Http1 = Http::One;
> namespace Http2 = Http::Two;

You can even place all HTTP/N-specific protocol code into src/http/N/
directories, but that may be an overkill today.

> === modified file 'src/client_side.cc'

> +prepareAcceleratedURL(ConnStateData * conn, ClientHttpRequest *http, const Http::Http1ParserPointer &hp)
...
> +prepareTransparentURL(ConnStateData * conn, ClientHttpRequest *http, const Http::Http1ParserPointer &hp)

If possible, pass references to the parser instead of pointers. All
other factors being equal, references are cheaper and safer than
pointers, so code that does not store an object (or pass an object for
storage by others) should accept a reference, especially when the object
is guaranteed to exist in the callers.

> + const char *url = hp->requestUri().c_str();
> if (*url != '/') {

Declare the "url" pointer itself as constant, to expose "url is no
longer the same as requestUri()" problems discussed further below.

> + const char *url = hp->requestUri().c_str();
> if (*url != '/') {

Please add an "XXX: performance regression" comment for the above if you
do not want to convert the rest of url-using code to use the new SBuf API.

> - int url_sz = strlen(url) + 32 + Config.appendDomainLen +
> + int url_sz = hp->requestUri().length() + 32 + Config.appendDomainLen +
> ...
> - int url_sz = strlen(url) + 32 + Config.appendDomainLen;
> + int url_sz = hp->requestUri().length() + 32 + Config.appendDomainLen;

AFAICT, there are cases where url is no longer the same as requestUri()
by the time you get to this code in prepareAcceleratedURL().

Please make url_sz constant if possible since you are modifying its
declaration. Same for url_sz in prepareTransparentURL().

> - if (*url != '/')
> + if (hp->requestUri()[0] != '/')
> return; /* already in good shape */

The old code "worked" for empty URLs (""). The new does not. I do not
know whether empty URLs are possible at this point. If they are not
possible, perhaps add a comment or even a Must()?

> + bool parsedOk = hp->parseRequest();

Please declare as constant if possible.

> === renamed file 'src/HttpParser.h' => 'src/http/Http1Parser.h'

> + /// the request-line URI if this is a request, or an empty string.
> + SBuf requestUri() const {return uri_;}

Consider returning "const SBuf &" to avoid refcounting overheads. Why is
this method in the general message parser? Should not it (and the
corresponding data member) be placed in HTTP *request* parser?

> + /// the HTTP method if this is a request method
> + const HttpRequestMethodPointer & method() const {return method_;}

Similar concerns here. This should be in a request parser IMO.

Also, s/a request method/a request message/ if method() stays in the
general message parser.

Finally, either this method should return a method object (rather than a
pointer to it) OR we should get rid of METHOD_NONE. Having two ways to
signal a missing method is too much IMO. The former is probably easier
and requires fewer code changes.

HTH,

Alex.
Received on Wed Jan 01 2014 - 21:08:27 MST

This archive was generated by hypermail 2.2.0 : Thu Jan 02 2014 - 12:00:10 MST