[PREVIEW] Dechunk incoming requests as needed and pipeline them to the server side.

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Sun, 05 Sep 2010 02:53:58 -0600

Hello,

     This is the second revision of the patch. The server-side now sends
chunked requests if the original request was chunk. There is one
relatively big XXX that I could not resolve so far. See notes below or
search for WE_KNOW_HOW_TO_SEND_ERRORS in the patch. Fixes welcome!

Amos, your polishing comments for the first revision has not been
addressed yet, but I am not ignoring them.

Thank you,

Alex.

---------- draft commit message ----------

Dechunk incoming requests as needed and pipeline them to the server side.

The server side always chunks the request if the original request was
chunked. No next hop version checks are performed.

* Client-side:

Removed clientIsRequestBodyValid() as unused. It was called with a
content-length>0 precondition that made the function always return true.

Removed old dechunking hack that was trying to buffering the entire
request body, pretending that we are still reading the headers. Adjusted
related code. More work may be needed to identify client-side code that
assumes the request size is always known.

Removed ConnStateData::bodySizeLeft() because we do not always know how
much body is left to read -- chunked requests do not have known sizes
until we read the last-chunk. Moreover, it was possibly used wrong
because sometimes we want to know whether we want to comm_read more body
bytes and sometimes we want to know whether we want to "produce" more
body bytes (i.e., copy already read bytes into the BodyPipe buffer,
which can get full).

Added ConnStateData::mayNeedToReadMoreBody() to replace
conn->bodySizeLeft() with something more usable and precise.

XXX: If there is a chunks parsing error, the new code just resets the
connection. I tried to find a way to send an error page to the client,
but failed to do so. It is easy to do when the headers and the body
prefix is parsed, but if the error is send later, the server-side may
start sending us its response, and the two responses would clash,
causing assertions. I do not know how to fully avoid that. Search for
WE_KNOW_HOW_TO_SEND_ERRORS.

Tried to break deep recursion/iteration around clientParseRequest. When
chunked parser fails during the message prefix parsing, the rest of the
code may decide that the connection is no longer used (and that there is
no pending transaction, even though the currentobject member is not
NULL!) and start parsing the second request. If that second parse fails
(for example), there will be two concurrent errors to be sent to the
client and the client-side code cannot handle that. However, due to the
XXX above, we never send an error when chunking parser fails, making
most of the related code polishing useless, at least for now.

Removed my wrong XXX related to closing after initiateClose.

Removed my(?) XXX related to endless chunked requests. There is nothing
special about them, I guess, as a non-chunked request can be virtually
endless as well if it has a huge Content-Length value.

Use commIsHalfClosed() instead of fd_table[fd].flags.socket_eof for
consistency with other client-side code and to improve readability. I
think these should return the same value in our context but I am not sure.

Correctly handle identity encoding. TODO: HTTPbis dropped it. Should we?

* Server-side:

Separated "received whole request body" state from "sent whole request
body". When we chunk requests, we need to add last-chunk. Thus, we may
receive (and written) the whole raw body but still need to write
last-chunk. This is not trivial because we should not write last-chunk
if the body producer aborted. XXX: check all pipe->exhausted() callers
to make sure all code has been adjusted.

Added getMoreRequestBody() virtual method that Server uses to get
encoded body bytes from its kids. FTP does not encode and uses default
implementation.

Fixed HTTP/FTP doneSendingRequestBody() to call its parent. I am not
sure it helps with correctly processing transactions, but the parent
method was designed to be called, and calling it make the transaction
state more clear.

Moved "broken POSTS" handling code into its own method and polished it
(HttpStateData::finishingBrokenPost). We now skip the "broken POSTS" fix
if the request is chunked.

Resolved old XXX: HttpStateData::handleRequestBodyProducerAborted() was
indeed doing nothing useful despite all the pretense. Now it aborts the
transaction.

fwd->handleUnregisteredServerEnd() was called without calling
fwd->unregister() first. It led to assertions because FwdState was being
aborted (on fd closure) without any error being set. Now fixed because
fwd->unregister() removes FwdState's close handler and FwdState
destructor does not require an error to be set.

Received on Sun Sep 05 2010 - 08:54:14 MDT

This archive was generated by hypermail 2.2.0 : Sun Sep 05 2010 - 12:00:04 MDT