Re: [PATCH] HTTP/2.0 initial support

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sat, 16 Nov 2013 05:21:54 +1300

New patch attached for review.

On 24/08/2013 3:08 a.m., Alex Rousskov wrote:
> On 08/22/2013 11:57 PM, Amos Jeffries wrote:
>> On 23/08/2013 5:11 a.m., Alex Rousskov wrote:
>>> Perhaps more importantly, the way you parse PRI requests may violate
>>> draft-ietf-httpbis-http2-04 that seems to require that those requests
>>> start with a specific 24-octet magic, not with a general HTTP1 header
>>> followed by the 6-octet magic. Why not parse HTTP2 Connection Header
>>> correctly, without dealing with the apparently inappropriate HTTP1
>>> request parser at all??
>
>
>> The magic may still occur at any time thanks to Upgrade:, so the HTTP/1
>> parser avoidance can only be added after the
>> ConnStateData/ClientSocketContext/ClientHttpRequest interactions have
>> all been re-worked for layer separation and multiplexed framing HTTP/2
>> support added. I am not keen to do that all as one huge step.
>
> Perhaps you misunderstood my suggestion because it does not lead to any
> huge changes. In fact, it would probably simplify your patch. I suggest
> that the existing request parser is adjusted to work like this:
>
> if (the buffered request content is too short to be the magic prefix)
> keep accumulating as if we did not receive the end of headers
> else if (the buffered request content starts with the magic)
> create the corresponding header structures (PRI v2.0)
> else
> old code
>
> The above is done in the parser so no caller will have to be changed
> unless it has to be changed to handle HTTP2 PRI correctly anyway. And
> all changes in the callers will deal with regular header structures. No
> magic, no offset adjustments, no custom HTTP2 parsing code.
>

Maybe we are misaligning on our definition of "parser". I mean the
HttpParser class.

The first problem here is that the HttpParser class is currently not
responsible for interpreting the first line; only for splitting it up
into SP-delimited pieces and validating that the bytes forming each
piece are valid for their particular positions.

The second problem is that the rest of the magic after the first "line"
is not valid MIME syntax. If it is treated/parsed as headers by
client-side code we will end up emitting "SM: " or ": SM" on the
server side. The magic is intentionally invalid HTTP/1 syntax precisely
to detect that kind of brokenness.

It is the parseHttpRequest() function which does the parsing
interpretation. And that is the piece of code I am changing with this patch.

So as far as I can see I *am* changing Squid exactly as you are
proposing.

NOTE: The shortest possible HTTP/1.1 request is *shorter* by 8 bytes
than the HTTP/2.0 magic prefix. And the shortest possible request-line
(HTTP/0.9) is shorter by 17 bytes.
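The byte arithmetic behind that NOTE can be checked directly. The preface string below is the 24-octet connection header magic from the draft; the "shortest request" strings are only illustrative examples:

```cpp
#include <cassert>
#include <cstring>

// The 24-octet HTTP/2.0 connection header magic from the draft:
// a PRI request-line followed by the deliberately non-MIME "SM" tail.
static const char Http2Preface[] = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";

// Illustrative shortest request-lines, for length comparison only.
static const char ShortestHttp11[] = "GET / HTTP/1.1\r\n"; // 16 octets
static const char ShortestHttp09[] = "GET /\r\n";          // 7 octets
```

So a valid HTTP/1 request can be complete well before enough bytes for the magic have arrived, which is why the too-short case must keep accumulating rather than reject.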

# >
# > if (the buffered request content is too short to be the magic prefix)
# > keep accumulating as if we did not receive the end of headers

 The loop calling HttpParser handles both HTTP/1 requests which can be
many bytes shorter than the HTTP/2 magic, and the first line of that
magic prefix. It keeps accumulating bytes until a first-line syntax has
been received.
  - there are no changes to this existing code required to handle the
HTTP/2 magic.

 The parseHttpRequest() function takes over from there, and checks for
the magic PRI prefix. If there are not enough bytes for the rest of the
prefix it also keeps accumulating, OR handles the HTTP/1 request as before.
  - this is where the isHttp20MagicPri if-condition is being added, and
what it is doing.
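As a rough sketch, with hypothetical names (the real logic is the isHttp20MagicPri check inside Squid's parseHttpRequest()), that accumulate-or-dispatch decision looks something like:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical names; illustrative of the isHttp20MagicPri decision.
static const char Http2Magic[] = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";
static const size_t Http2MagicLen = sizeof(Http2Magic) - 1; // 24 octets

enum ParseOutcome { NEED_MORE_DATA, HTTP2_PRI, HTTP1_REQUEST };

ParseOutcome checkHttp2Magic(const char *buf, size_t len)
{
    // Compare only the bytes received so far against the magic prefix.
    const size_t cmpLen = (len < Http2MagicLen) ? len : Http2MagicLen;
    if (memcmp(buf, Http2Magic, cmpLen) != 0)
        return HTTP1_REQUEST;   // not the magic: old HTTP/1 code path
    if (len < Http2MagicLen)
        return NEED_MORE_DATA;  // prefix matches so far: keep accumulating
    return HTTP2_PRI;           // full 24-octet magic is present
}
```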

# > else if (the buffered request content starts with the magic)
# > create the corresponding header structures (PRI v2.0)

- creating the corresponding header structures is what "old code" does.

- this special case is redundant, since the HttpRequest structure is
generated from the HttpParser class structure, using the parse already
done in order to find the first-line syntax (which is shorter than the
magic sequence length).

# > else
# > old code

 The parseHttpRequest() code after the isHttp20MagicPri if-condition
does all this, including doing the structure creation correctly for the
case where the magic PRI was found.

 - so instead I drop the "else" and run the old code whenever the magic
is valid OR whenever PRI and HTTP/2.0 are not present.

>
>>> Why prohibit PRI HTTP/1.1 requests? That method was not reserved and,
>>> hence, ought to be a valid extension method for HTTP1.
>>
>> If PRI is being used as a method in HTTP/1 that is a problem for the
>> HTTP/2 magic and we need to start rejecting it
>
> I do not see why handling HTTP/1.x PRI as before is a problem. What
> exactly will it suddenly break?

What breaks is HTTP/2 failure detection. That method name was
specifically chosen from unused names so that implementations which
treat an unknown HTTP/1.x request-line as HTTP/0.9 format will
"upgrade" the request according to HTTP/1 rules.

The result when they do that is the line "PRI * HTTP/1.1" ... which is
a sign of a broken middleware implementation, *not* a sign of PRI being
used legitimately as a method by HTTP/1.1 software.
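The detection signal the magic is designed to produce can be illustrated like this (not Squid code; the helper is a hypothetical model of a broken middlebox):

```cpp
#include <cassert>
#include <string>

// Model of the brokenness being detected: a middlebox that treats the
// unknown request-line as HTTP/0.9 keeps only "METHOD SP URL" and then
// re-emits the request "upgraded" to HTTP/1.1.
std::string brokenUpgrade(const std::string &firstLine)
{
    // HTTP/0.9 requests are just "METHOD SP URL"; drop the version token.
    const std::string::size_type lastSp = firstLine.rfind(' ');
    return firstLine.substr(0, lastSp) + " HTTP/1.1";
}
```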

Mark and the others went out of their way to identify tokens for the
request which were not used in existing traffic. So if anyone is using
the PRI request in HTTP/1 for any legitimate reason we need to be made
aware of it and bring it to the HTTPbis WG's attention as fast as
possible so the magic can be changed. Allowing such requests to pass
silently through Squid will only help to hide any such uses further.

>
>> because technically the
>> "PRI" is now a HTTP/1 extension method for signalling that HTTP/2 is
>> already started on this connection (SETTINGS frame will be in its
>> payload in binary form).
>
> That does not compute for me. Perhaps I do not understand how HTTP2 is
> supposed to work, but my current understanding is that no HTTP1 request
> can affect HTTP2 operation, including HTTP1 requests with method PRI.
>
> HTTP2 requires HTTP/2.0 text for the magic to match so it will never
> match any HTTP1 request, including HTTP1 PRI request. Thus, I do not see
> why we need to prohibit HTTP1 PRI or even treat it specially.
>
> If HTTP2 specs assign some special meaning to HTTP1 PRI, please point me
> to the right documentation.
>

The HTTP/2 specification does not cover handling of the PRI magic by
HTTP/1 software. That is expected to be handled according to RFC 2616,
or at least the updated HTTPbis specifications.

* In HTTP/2.0 it is magic octets to be checked and discarded.

* In HTTP/1.1 it is a message to be processed and handled.
 - in a 1.1-only forward/reverse proxy it should be rejected with an
HTTP/1.1 error.

 - an interception proxy can either reject it (as we do right now), or
pass it through and preserve end-to-end HTTP/2.0 (which is what this
patch is trying to implement).

>
>>> 2) At memcpy() time, tunnelState->client.buf may already contain data
>>> from the server. Note the tunnelState->copy() call above.
>
> That was wrong, sorry. What I should have said is that at memcpy() time,
> we may be comm_writing server bytes to the client already. The following
> call in tunnelStartShoveling() starts writing server data to the client:
>
>> tunnelState->copy(tunnelState->server.len, tunnelState->server,
>> tunnelState->client, TunnelStateData::WriteClientDone);
>

It does not matter that we are *writing* to the client.conn->fd. The
shovelling memcpy() operation being done on the client is a *read*
buffer operation.

The shoveling is:
  * if there are server bytes already known (server.buf) then
shovel/write to client.fd
  * else read into server.buf from server.fd

  * if there are client bytes already known (in ConnStateData) then
memcpy to client.buf and shovel/write from client.buf to server.fd
  * else read into client.buf from client.fd

These two if-conditions can be done in either order as they are
operating on opposite directions of the traffic flow. At no point does
one of them alter state used by the other if-condition.

The write handlers alter the buffer which was written *from*, but only
to shift it ready for the next write. This is the reason we need to
memcpy into client.buf before writing; otherwise the write handler halts
when it identifies that we just wrote N>0 bytes out of a 0-byte buffer.
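A much-simplified model of those two directions, with hypothetical names standing in for the real TunnelStateData members in tunnel.cc, may help show their independence:

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of the two shoveling directions;
// the real state lives in TunnelStateData in tunnel.cc.
struct Side {
    std::string buf;    // this side's read buffer
    std::string wrote;  // bytes written out this side's fd (simulated)
};

// Direction 1: server -> client. Touches only server.buf and client.wrote.
void shovelServerBytes(Side &server, Side &client)
{
    if (!server.buf.empty()) {
        client.wrote += server.buf;  // shovel/write to client.fd
        server.buf.clear();          // write handler shifts the written-from buffer
    }
    // else: schedule a read into server.buf from server.fd
}

// Direction 2: client -> server. Early client bytes buffered by
// ConnStateData must be copied into client.buf *before* the write,
// so the write handler sees N>0 bytes written out of N>0 buffered.
void shovelClientBytes(std::string &connStateBuf, Side &client, Side &server)
{
    if (!connStateBuf.empty()) {
        client.buf = connStateBuf;   // the memcpy step
        connStateBuf.clear();
        server.wrote += client.buf;  // shovel/write to server.fd
        client.buf.clear();          // write handler shifts the written-from buffer
    }
    // else: schedule a read into client.buf from client.fd
}
```

Neither function reads or writes state used by the other, so they can run in either order, matching the claim above.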

>
>>> I am concerned what would happen when both the
>>> server has data to sent to the client and the client has data to send to
>>> the server. I am not 100% sure tunnel.cc code can handle that.
> That concern still exists. This is just a concern. I just do not know
> whether the current code can handle what you are asking it to do. If you
> studied it and think it is going to work, that is fine with me.

I have double-checked this; see above. I hope that calms your concern.
I will continue to experiment with shuffling the data into client.buf in
the request setup function in future, but that adds the concern of a
time delay across potentially multiple TCP connection attempts. I would
rather do that as a separate experiment outside the scope of this patch.

>
>>>> + } else if (*method_p == Http::METHOD_PRI || *http_ver ==
>>>> Http::ProtocolVersion(2,0)) {
>>>> + // other uses of PRI method or HTTP/2.0 version moniker are
>>>> not supported
>>>> + hp->request_parse_status = Http::scMethodNotAllowed;
>>>> + return parseHttpRequestAbort(csd, "error:method-not-allowed");
>>>> + }
>>> The error message seems wrong in case of a PRI method -- we do support
>>> it.
>>
>> We support it but _do not allow it_ on this port type.
>
> Please fix or remove the comment then. It says "not supported".
>

Fixed.

>
>>>
>>>> + // Let tunneling code be fully responsible for CONNECT and PRI
>>>> requests
>>>> + if (http->request->method == Http::METHOD_CONNECT ||
>>>> http->request->method == Http::METHOD_PRI) {
>>> and
>>>
>>>> - if (request->method == Http::METHOD_CONNECT && !redirect.status) {
>>>> + if (!redirect.status && (request->method == Http::METHOD_CONNECT
>>>> || request->method == Http::METHOD_PRI)) {
>>> Since HTTP1 PRI requests are valid but should not be tunneled, the new
>>> logic here should be based on version being 2.0 rather than the method
>>> being PRI, I guess.
>>>
>>> More changes may be needed to get this right, depending on how we parse
>>> and validate HTTP2 requests.
>>
>> If we make this check by version it will start matching GET/PUT/POST
>> requests as soon as we start actually parsing the HTTP/2 messages.
>
> Yes, more changes will be needed as soon as we allow more HTTP2 message
> methods. Still, HTTP1 PRI should not cause tunnels.
>

Yes. It does not cause tunnels. It causes the HTTP/1
"error:method-not-allowed" response indicating that Squid does not allow
the method to be used in HTTP/1. Squid only allows it to start HTTP/2.

>>>> /// Called when we are done writing CONNECT request to a peer.
>>>> static void
>>>> tunnelConnectReqWriteDone(const Comm::ConnectionPointer &conn, char
>>>> *buf, size_t size, comm_err_t flag, int xerrno
>>>> {
>>>> TunnelStateData *tunnelState = (TunnelStateData *)data;
>>>> debugs(26, 3, conn << ", flag=" << flag);
>>>> - assert(tunnelState->waitingForConnectRequest());
>>> Why was this assertion removed while the description of the function
>>> remained unchanged? They go together.
>>
>> I have not yet fully sorted out how to handle failover between peers if
>> the first one does not support HTTP/2 but the PRI is still permitted to
>> go there. For now the recourse is to always deliver the peer response to
>> the client. Just like CONNECT used to be handled before you added the
>> whole waiting mechanism. So the waiting flag will be false in this pathway.
>>
>> Re-adding those assertions and managing server HTTP/1 rejections/replies
>> is a TODO.
>
> I am having trouble correlating my question with your answer. According
> to the above quoted code, tunnelConnectReqWriteDone() is called when we
> are done writing CONNECT request to a peer. Is that still true? If yes,
> the assertion seems appropriate. If it is no longer true, let's fix the
> function description accordingly and may be adjust the assertion as well.

No, it is called whenever we are done writing *the* request to the
peer. The request may be CONNECT or PRI now.

I've altered the comment: s/CONNECT/CONNECT or PRI/

>
> As for failover, I am not sure we need it: If Squid is misconfigured to
> send HTTP2 requests where they should not go, do we really want to try
> to recover from that configuration failure automatically?
>

I am expecting that a lot of old configurations will be used as-is
without admins manually adding ACLs to forbid the PRI method to their
peers. As such Squid will, at least initially, face a lot of situations
where the peers do not support HTTP/2 and reject the PRI request. We
need to handle those cleanly and run through the set of other
destinations in case we can find a route that will work.

I think the same thing should also be happening for CONNECT, but that is
outside scope here.

>
>>>> // we have to eat the connect response from the peer (so
>>>> that the client
>>>> // does not see it) and only then start shoveling data to
>>>> the client
>>> ...
>>>> -
>>>> - assert(tunnelState->waitingForConnectExchange());
>>> Why was this assertion removed? It seems to go together with the comment
>>> above it (which was not adjusted).
>>
>> see above about the other one.
>
> Ditto. Perhaps you can describe a specific case where these seemingly
> valid assertions fail after your patch? IIRC, those two assertions deal
> with sending a CONNECT request to a peer when the client did not send a
> CONNECT request. That situation needs special care because Squid becomes
> responsible for (a) sending the CONNECT and (b) hiding CONNECT response
> from the client. Does HTTP2 need different handling in that situation?
>

Yes. When the PRI request is sent through Squid.

I have re-added these asserts with the extra condition method==PRI. That
should cover the two cases going through and still catch problems on any
other state leading there.
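Sketched with hypothetical stand-ins (the real code asserts on TunnelStateData state and Http::METHOD_PRI), the relaxed condition is roughly:

```cpp
#include <cassert>

// Hypothetical stand-ins for TunnelStateData state; illustrative only.
enum MethodType { METHOD_CONNECT, METHOD_PRI };

struct TunnelState {
    bool waitingForConnectExchange;
    MethodType method;
};

// The re-added assertion, relaxed with the extra method==PRI condition:
// it still trips on unexpected states, but lets the PRI pathway through
// where the waiting flag is legitimately false.
bool peerReplyStateOk(const TunnelState &ts)
{
    return ts.waitingForConnectExchange || ts.method == METHOD_PRI;
}
```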

>
>>>> + // we support HTTP/2.0 magic PRI requests
>>>> + if (r->method == Http::METHOD_PRI)
>>>> + return (r->urlpath == "*");
>>>> +
>>> This will block HTTP1 PRI requests, which we should continue to support
>>> and will allow HTTP2 non-PRI requests which we do not want to support
>>> for now.
>>
>> Intentionally so on both points.
>> a) see above about the HTTP/1 requests.
>
> Yes, we disagree on whether HTTP1 PRI should be allowed.
>
>
>> b) when we get on to parsing HTTP/2 requests we will face HttpRequest
>> generated with version 2.0 and other non-PRI methods where the URL needs
>> parsing.
>
> "When we get on to parsing HTTP/2 requests", this code should be
> adjusted further, of course. However, it seems wrong to allow something
> we do not support yet. Should not this be the place to reject HTTP2
> requests that we do not [yet] support?
>

Remember the parser sequence you requested above?

# > else if (the buffered request content starts with the magic)
# > create the corresponding header structures (PRI v2.0)

This urlCheckRequest() function is being called by the code processing
those generated header structures. If it does not return true for this
PRI with *-URL case Squid will reject all HTTP/2 traffic.

On the other side, parsing of the HTTP/2 frames is not being done yet,
so no parsed HTTP/2 messages ever reach this code. When that parsing is
added, this code will need to be adjusted to fit. For now it only needs
to handle the one PRI case.

>>> * If you have not checked that the new code works with intercepted SSL
>>> traffic forwarded to an origin server when ssl_bump none matches, please
>>> do. That condition exercises the tunnel path you are changing using a
>>> fake CONNECT request.
>>
>> I have not and cannot test the SSL cases. Instead I have simulated
>> various data arrival permutations on a CONNECT.
>> If the bumping code breaks with any of those flow types something is
>> very broken in the bumping, but needs a separate fix.
>
> Please do not commit your changes until the above common case is tested.
> If you cannot test it, we will once you post the final patch.

Okay. I am re-testing after all these tweaks and should have a new patch
for you guys to test the SSL stuff with soonish.

>>> * If you have not checked that the new code works with CONNECT traffic
>>> forwarded to an SSL peer, please do. That condition exercises the tunnel
>>> path you are changing when the CONNECT response from the SSL peer is
>>> forwarded to the client. Please test with a simple CONNECT request and a
>>> CONNECT request that has some data attached after it (bug 3371).
>>
>> I have tested CONNECT relayed to direct and peer with early pushed data
>> from client and from server. These appear to be working fine.
>> Given those results, I have to assume that a "normal" HTTPS inside
>> CONNECT will retain its existing behaviour.
>>
>> I assume you are referring to the ssl_bump none case above because the
>> ConnStateData has a buffer filled with SSL handshake data?
>
> No. The key in this use case is that the client is sending CONNECT and
> the peer is an SSL peer, which results in peer CONNECT response to Squid
> being blindly forwarded to the client, without parsing. This is not
> related to SslBump.

Ah. That was covered in my tests. It is how the PRI request is supposed
to be handled by this patch design. The CONNECT case is not changed by
this patch in any way.

  * the connect exchange code managing that case is only enabled for
CONNECT requests (PRI requests never set the flag to trigger it), and
  * the methods doing the handling are untouched by this patch, and
  * the startShovelling code still operates on the server response as
before.

Amos

Received on Fri Nov 15 2013 - 16:22:02 MST
