Re: [PATCH] HTTP/2.0 initial support

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 23 Aug 2013 17:57:21 +1200

On 23/08/2013 5:11 a.m., Alex Rousskov wrote:
> On 08/02/2013 04:30 AM, Amos Jeffries wrote:
>> This patch lays the groundwork for HTTP/2.0 support in Squid:
>>
>> * registers PRI method for detection of the HTTP/2.0 "magic" connection
>> header when expecting HTTP/1.1 traffic
>>
>> * extend the request-line HTTP versions accepted by the parser to
>> include "HTTP/2.0".
>>
>> * reject 2.0 traffic received by reverse- or forward-proxy ports and log
>> "error:http-2.0-not-supported" in access.log.
>>
>> * reject "HTTP/2.0" version label or "PRI" method in any use other than
>> a complete HTTP/2.0 magic connection header.
>>
>> * change delivered error page to the "Method Not Allowed" response,
>> indicating the PRI magic "method" is rejected by this HTTP/1.1 proxy.
>>
>> * intercepted HTTP/2.0 traffic is relayed to ORIGINAL_DST or a
>> cache_peer transparently.
>>
>> Note that HTTP/2.0 traffic relaying can be prevented by defining an ACL
>> for method PRI and limited by http_access controls in the same way as
>> method CONNECT. cache_peer_access selection can also make use of the PRI
>> method in ACLs to control tunneling to peers.
>>
>> Also, fix bug 3371 (CONNECT payload data sent in the first packet being
>> dropped by Squid) as a necessary step towards relaying the HTTP/2.0
>> frames which are expected to be pushed in by the client immediately
>> after the HTTP/2.0 magic header.
>>
>> Also, add URL parsing and generation support for "*" URL in
>> request-lines of PRI, OPTIONS and TRACE methods. This is only updating
>> the URL internal code for OPTIONS and TRACE, the remainder of support
>> necessary to cope with that URL special case according to HTTP/1.1
>> requirements is still missing and outside the scope of this patch.
>
>> + /* deny "PRI * HTTP/2.0" via non-intercepted ports */
>> + bool isHttp20Magic = (*method_p == Http::METHOD_PRI &&
>> + *http_ver == Http::ProtocolVersion(2,0) &&
>> + memcmp(&hp->buf[hp->hdr_end+1], "SM\r\n\r\n", 6) == 0);
>> +
> I very much doubt peeking 6 bytes beyond hp->buf[hp->hdr_end+1] is safe
> without checking the buffer boundary. It is most likely conceptually
> wrong even when that does not lead to looking beyond the allocated buffer.

Okay. Yes, it also does not account for the case where the magic is
split over multiple read() calls.
Fixed both now.
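As an illustration of both fixes, a bounds-checked probe for the 6-octet magic tail might look like the sketch below. This is not the Squid code; the names (checkHttp2MagicTail, MagicCheck) are hypothetical, and the point is only that the comparison never reads past the valid bytes and can report "need more data" when the magic is split across reads:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Possible outcomes when probing for the HTTP/2 magic trailer.
enum class MagicCheck { No, Yes, NeedMoreData };

// Look for the 6-octet "SM\r\n\r\n" trailer starting at `off` inside a
// buffer holding `len` valid bytes, without reading past the end.
static MagicCheck
checkHttp2MagicTail(const char *buf, size_t len, size_t off)
{
    static const char kMagicTail[] = "SM\r\n\r\n";
    static const size_t kMagicTailLen = sizeof(kMagicTail) - 1; // 6 octets
    if (off >= len)
        return MagicCheck::NeedMoreData; // nothing past the header yet
    const size_t avail = len - off;
    const size_t cmpLen = avail < kMagicTailLen ? avail : kMagicTailLen;
    if (memcmp(buf + off, kMagicTail, cmpLen) != 0)
        return MagicCheck::No; // mismatch within the bytes we do have
    // a full match so far; only a definite Yes once all 6 octets arrived
    return avail < kMagicTailLen ? MagicCheck::NeedMoreData : MagicCheck::Yes;
}
```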

> Perhaps more importantly, the way you parse PRI requests may violate
> draft-ietf-httpbis-http2-04 that seems to require that those requests
> start with a specific 24-octet magic, not with a general HTTP1 header
> followed by the 6-octet magic. Why not parse HTTP2 Connection Header
> correctly, without dealing with the apparently inappropriate HTTP1
> request parser at all??

It is only magic when interpreted as HTTP/2. When interpreted as
HTTP/1.1 it is intentionally a valid request-line.
The point of this is that Squid is still HTTP/1.1.
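To make the dual interpretation concrete: the 24-octet connection preface from draft-ietf-httpbis-http2 parses, under HTTP/1.1 rules, as a "PRI * HTTP/2.0" request-line with an empty header block, with the trailing "SM\r\n\r\n" looking like (the start of) a body:

```cpp
#include <cassert>
#include <cstring>

// The HTTP/2 connection preface: 24 octets. The first 18 form a
// syntactically valid HTTP/1.1 request-line plus empty header block;
// the final 6 ("SM\r\n\r\n") are the magic tail checked by the patch.
static const char kHttp2Preface[] = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";

static_assert(sizeof(kHttp2Preface) - 1 == 24, "preface is 24 octets");
```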

The magic may still occur at any time thanks to Upgrade:, so the HTTP/1
parser avoidance can only be added after the
ConnStateData/ClientSocketContext/ClientHttpRequest interactions have
all been re-worked for layer separation and multiplexed HTTP/2 framing
support has been added. I am not keen to do all that as one huge step.

If I move the magic detection out into the pre-parser initial loop as
the code stands today, we end up writing an entire parallel code pathway
linking that ConnStateData loop to tunnel.cc (without an HttpRequest or
ClientHttpRequest, mind). This is problematic in several ways: it is
entirely temporary and needs tearing down when the HTTP/2 frame parser
is put in place; it requires extensive redesign of tunnel.cc; and
tunnel.cc is itself already a parallel mechanism we should be merging
back into the main line. If we are going to go down that road it is FAR
and away simpler to use the Frame-based architecture I passed by you
earlier, against which you gave several strong arguments.

This is the next best approach for incremental changes:
- it allows us in the next step to switch parsers without affecting the
post-parse processing, and
- it provides a faster roadmap to working HTTP/2<->HTTP/1 gateway which
appears to be the most important form of proxy in the early stages of
HTTP/2 rollout, and
- it allows detection of PRI anywhere in the traffic stream - including
bumped, and
- it allows us to use the ACL control mechanisms on these setup magic
requests in all the same ways CONNECT can be, *even if the PRI was
intercepted*.

>
>> + memmove(const_cast<char*>(&hp->buf[hp->hdr_end+1]), &hp->buf[hp->hdr_end + 6 +1], hp->bufsiz - hp->hdr_end - 6);
>> + hp->bufsiz -= 6;
> Similar "access outside of allocated buffer" concerns here, but now we
> are not only reading but writing. And (hp->bufsiz - hp->hdr_end - 6) may
> be negative.

These can only be hit if the bytes outside the buffer happen to match
the SM\r\n\r\n pattern, which is rather unlikely. Fixed anyway, along
with the earlier one.
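The shape of that fix might look like the following sketch (the names stripMagicTail, hdrEnd are illustrative, not Squid's; std::string stands in for the raw buffer and memmove() used in the patch). The essential part is the bounds guard before any bytes are shifted:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Strip the 6 magic-tail octets that follow the parsed header ending at
// offset `hdrEnd`, but only when the buffer really contains all of them.
static const size_t kMagicTailLen = 6; // strlen("SM\r\n\r\n")

static bool
stripMagicTail(std::string &buf, size_t hdrEnd)
{
    const size_t tailStart = hdrEnd + 1;
    // guard: bail out unless all kMagicTailLen octets are inside the buffer
    if (buf.size() < tailStart + kMagicTailLen)
        return false;
    // shift the remaining bytes left over the magic tail
    buf.erase(tailStart, kMagicTailLen);
    return true;
}
```

With the guard in place, a short read leaves the buffer untouched instead of shifting garbage (or worse, bytes beyond the allocation) into the message.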

>
>> + memcmp(&hp->buf[hp->hdr_end+1], "SM\r\n\r\n", 6) == 0);
>> + memmove(const_cast<char*>(&hp->buf[hp->hdr_end+1]), &hp->buf[hp->hdr_end + 6 +1], hp->bufsiz - hp->hdr_end - 6);
>> + hp->bufsiz -= 6;
> Please name the magic integer 6 since it is used more than once. It may
> be a good idea to name the magic string as well, but that is optional if
> it is used only once.

Done and fixed these overruns, I think.

>
>> + memmove(const_cast<char*>(&hp->buf[hp->hdr_end+1]), &hp->buf[hp->hdr_end + 6 +1], hp->bufsiz - hp->hdr_end - 6);
>> + hp->bufsiz -= 6;
>> +
>> + // we only support tunneling intercepted HTTP/2.0 traffic for now
>> + if (csd->port && !csd->port->flags.isIntercepted()) {
>> + hp->request_parse_status = Http::scMethodNotAllowed;
>> + return parseHttpRequestAbort(csd, "error:http-2.0-not-supported");
>> + }
> It is probably better to bail _before_ we manipulate the unsupported
> message with memmove().

Okay Done.

>
>> + /* deny "PRI * HTTP/2.0" via non-intercepted ports */
>> + bool isHttp20Magic = (*method_p == Http::METHOD_PRI &&
>> + *http_ver == Http::ProtocolVersion(2,0) &&
>> + memcmp(&hp->buf[hp->hdr_end+1], "SM\r\n\r\n", 6) == 0);
>> + if (isHttp20Magic) {
>> + } else if (*method_p == Http::METHOD_PRI || *http_ver == Http::ProtocolVersion(2,0)) {
>> + }
>> +
> Why prohibit PRI HTTP/1.1 requests? That method was not reserved and,
> hence, ought to be a valid extension method for HTTP1.

If PRI is being used as a method in HTTP/1, that is a problem for the
HTTP/2 magic, and we need to start rejecting it: technically "PRI" is
now an HTTP/1 extension method signalling that HTTP/2 has already
started on this connection (the SETTINGS frame will be in its payload in
binary form).

>
>> + } else if (*method_p == Http::METHOD_PRI || *http_ver == Http::ProtocolVersion(2,0)) {
>> + // other uses of PRI method or HTTP/2.0 version moniker are not supported
>> + hp->request_parse_status = Http::scMethodNotAllowed;
>> + return parseHttpRequestAbort(csd, "error:method-not-allowed");
>> + }
> The error message seems wrong in case of a PRI method -- we do support it.

We support it but _do not allow it_ on this port type.

>
>> - prepareTransparentURL(csd, http, url, req_hdr);
>> + if (isHttp20Magic)
>> + http->uri = xstrdup(url);
>> + else
>> + prepareTransparentURL(csd, http, url, req_hdr);
> Can this logic go inside prepareTransparentURL()? There you can check
> for 2.0 version of the protocol (we reject all non-isHttp20Magic HTTP2
> requests).

I will give it a try, but it involves several layers of pointer
indirection to get the version out of http->request.

>
>> + // Let tunneling code be fully responsible for CONNECT and PRI requests
>> + if (http->request->method == Http::METHOD_CONNECT || http->request->method == Http::METHOD_PRI) {
> and
>
>> - if (request->method == Http::METHOD_CONNECT && !redirect.status) {
>> + if (!redirect.status && (request->method == Http::METHOD_CONNECT || request->method == Http::METHOD_PRI)) {
> Since HTTP1 PRI requests are valid but should not be tunneled, the new
> logic here should be based on version being 2.0 rather than the method
> being PRI, I guess.
>
> More changes may be needed to get this right, depending on how we parse
> and validate HTTP2 requests.

If we make this check by version it will start matching GET/PUT/POST
requests as soon as we start actually parsing the HTTP/2 messages.

For now the assumption is that the discussions underway (about CONNECT
becoming invalid for HTTP/2, and PRI not being on the cards for adding)
will come out with useful results. If not, this check will get a whole
lot more complicated.

>
>> +
>> + // consume header early so that tunnel gets just the body
>> + connNoteUseOfBuffer(conn, http->req_sz);
>> + notedUseOfBuffer = true;
>> }
> Can we do the above for all non-FTP requests and remove
> notedUseOfBuffer? It looks like all non-FTP paths that reach this point
> call connNoteUseOfBuffer() now.

That would be nice, but it is out of scope for this patch.
I will leave that for you as a follow-up patch, since you understand the
other code paths to check better than I do.

>
>> + // We just need to ensure the bytes from ConnStateData are in client.buf already
>> + memcpy(tunnelState->client.buf, tunnelState->http->getConn()->in.buf, tunnelState->http->getConn()->in.notYetUsed);
> Two problems here:
>
> 1) This is very unsafe. in.notYetUsed can be almost as big as the
> admin-configurable client_request_buffer_max_size, while
> tunnelState->client.buf is just SQUID_TCP_SO_RCVBUF. While nothing bad
> should happen by default because tunnelState->http->getConn()->in.buf
> starts small, a "proper" sequence of requests before CONNECT can grow it.
>
> To properly fix bug 3371, you will either need to enlarge (if needed)
> tunnelState->client.buf dynamically before copying (easy) or drain
> tunnelState->http->getConn()->in.buf slowly (lower RAM usage).

Good point. Enlarging would be the best way forward with minimal changes
AFAICS.
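A minimal sketch of "enlarge before copying" follows; the SideBuf struct and appendEnlarging name are hypothetical stand-ins, not Squid's TunnelStateData API. The idea is just that the destination capacity is grown to fit the pending client bytes before the memcpy():

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Stand-in for one direction's tunnel buffer.
struct SideBuf {
    char *data = nullptr;
    size_t capacity = 0; // allocated bytes
    size_t len = 0;      // bytes in use
};

static void
appendEnlarging(SideBuf &dst, const char *src, size_t srcLen)
{
    if (dst.len + srcLen > dst.capacity) {
        // a real implementation would likely round up, and respect
        // client_request_buffer_max_size as an upper bound
        dst.capacity = dst.len + srcLen;
        dst.data = static_cast<char *>(realloc(dst.data, dst.capacity));
    }
    memcpy(dst.data + dst.len, src, srcLen); // now provably in bounds
    dst.len += srcLen;
}
```

The alternative mentioned in the review, draining ConnStateData::in.buf in SQUID_TCP_SO_RCVBUF-sized chunks, trades this allocation for lower peak RAM but needs more bookkeeping.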

> 2) At memcpy() time, tunnelState->client.buf may already contain data
> from the server. Note the tunnelState->copy() call above.

client.buf never contains data from the server. It only ever contains
the results of a read() from the client FD.
server.buf is where any early server response data is placed.

> You probably need to move the new memcpy-related code before the
> tunnelState->server.len check, but I am not sure that would be
> correct/sufficient. I am concerned what would happen when both the
> server has data to sent to the client and the client has data to send to
> the server. I am not 100% sure tunnel.cc code can handle that.

You created the "shovel" logic here! I tested this quite a lot and the
existing code order is the only one I found which does *not* trigger
assertions about buffer being empty (note that those are still in
place). If you can find another way to do this please point me at it.
squidclient.cc in this patch contains a simulation of the expected data
pattern: use "squidclient -V 2.0 -p 3129 *" to send a 2.0 request to
port 3129 configured with the 'intercept' option.

If the client pushed data in early it will be in ConnStateData::in.buf
and needs moving to client.buf for tunnel readClient(...) to manage
delivery directly to the server FD when kicking off the client->server
directional specific shovelling.

If the server pushed data early it will be in server.buf (I think, or
maybe the temporary response buffer you added) and the normal tunnel
readServer(...) will manage delivery directly to the client FD when
kicking off the server->client directional specific shovelling.

At no point do the opposing data streams get shuffled between tunnel
buffers. The only problem is how and when to shuffle the data into those
tunnel buffers to begin the shovelling without colliding with data
already in the buffer.
I placed the client's data copy in startShovelling itself to reduce
duplicating that copy in the two code paths leading there (peer or
DIRECT connection setup).

AFAIK, in the ideal situation we would have TunnelStateData pulling data
out of ConnStateData via BodyPipe and not doing this buffering at all.
Next best would be to shuffle the bytes into client.buf in tunnelStart()
but I ran into assert(!client.len) issues doing that.

> Also, please store tunnelState->http->getConn()->in in a local reference
> variable to avoid repeating that expression many times in
> tunnelStartShoveling().

Okay. Done that much.

>
>> /// Called when we are done writing CONNECT request to a peer.
>> static void
>> tunnelConnectReqWriteDone(const Comm::ConnectionPointer &conn, char *buf, size_t size, comm_err_t flag, int xerrno, void *data)
>> {
>> TunnelStateData *tunnelState = (TunnelStateData *)data;
>> debugs(26, 3, conn << ", flag=" << flag);
>> - assert(tunnelState->waitingForConnectRequest());
> Why was this assertion removed while the description of the function
> remained unchanged? They go together.

I have not yet fully sorted out how to handle failover between peers if
the first one does not support HTTP/2 but the PRI is still permitted to
go there. For now the recourse is to always deliver the peer response to
the client. Just like CONNECT used to be handled before you added the
whole waiting mechanism. So the waiting flag will be false in this pathway.

Re-adding those assertions and managing server HTTP/1 rejections/replies
is a TODO.

>
>> + // send full HTTP/2.0 magic connection header to server
>> + mb.Printf("PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n");
> Please use mb.append, not Printf(). If my parsing suggestion above is
> implemented, this will be a named constant.

Done.

>
>> + debugs(11, 2, "Tunnel server REQUEST: " << tunnelState->server.conn << ":\n----------\n" <<
>> + StringArea(mb.content(), mb.contentSize()) << "\n----------");
> Please capitalize server as "Server", for consistency with other similar
> messages:
>
>> ./http.cc: debugs(11, 2, "HTTP Server REPLY:\n---------\n" << readBuf->
>> ./http.cc: debugs(11, 2, "HTTP Server REQUEST:\n---------\n" << mb.buf << "
>> ./client_side.cc: debugs(11, 2, "HTTP Client REPLY:\n---------\n" << mb->bu
>> ./client_side.cc: debugs(11, 2, "HTTP Client REQUEST:\n---------\n" << (hp-
> Needless to say, somebody should add a function to format all these, but
> that is outside your patch scope.

Done.

>
>> // we have to eat the connect response from the peer (so that the client
>> // does not see it) and only then start shoveling data to the client
> ...
>> -
>> - assert(tunnelState->waitingForConnectExchange());
> Why was this assertion removed? It seems to go together with the comment
> above it (which was not adjusted).

see above about the other one.

>
>> + if (request->urlpath == "*")
>> + return (request->canonical = xstrdup(request->urlpath.termedBuf()));
>> + // else fall through to default
> Please set urlbuf and break instead of duplicating this pretty important
> canonical=xstrdup() code and returning immediately.

I sought to avoid one data copy: copying into urlbuf and then copying
out of it again with the xstrdup().
Done by making this case identical to the matching case in the other
function.

>
>> + memcpy(buf, "*", 2);
> Please use "*\0" above to avoid doubts that the termination character is
> always there and copying it is intentional.

Done.

>
>> + case Http::METHOD_OPTIONS:
>> + case Http::METHOD_TRACE:
>> + case Http::METHOD_PRI:
>> + if (request->urlpath == "*") {
>> + memcpy(buf, "*", 2);
>> + return buf;
>> + }
> Here again, I would recommend breaking instead of returning so that the
> post-formatting code can do its job (even though it currently does
> nothing for * URLs).

Done.

>
>> + // we support HTTP/2.0 magic PRI requests
>> + if (r->method == Http::METHOD_PRI)
>> + return (r->urlpath == "*");
>> +
> This will block HTTP1 PRI requests, which we should continue to support
> and will allow HTTP2 non-PRI requests which we do not want to support
> for now.

Intentionally so on both points.
a) see above about the HTTP/1 requests.
b) when we get on to parsing HTTP/2 requests we will face HttpRequest
generated with version 2.0 and other non-PRI methods where the URL needs
parsing.

> * If you have not checked that the new code works with intercepted SSL
> traffic forwarded to an origin server when ssl_bump none matches, please
> do. That condition exercises the tunnel path you are changing using a
> fake CONNECT request.

I have not and cannot test the SSL cases. Instead I have simulated
various data arrival permutations on a CONNECT.
If the bumping code breaks with any of those flow types something is
very broken in the bumping, but needs a separate fix.

> * If you have not checked that the new code works with CONNECT traffic
> forwarded to an SSL peer, please do. That condition exercises the tunnel
> path you are changing when the CONNECT response from the SSL peer is
> forwarded to the client. Please test with a simple CONNECT request and a
> CONNECT request that has some data attached after it (bug 3371).

I have tested CONNECT relayed to direct and peer with early pushed data
from client and from server. These appear to be working fine.
Given those results, I have to assume that a "normal" HTTPS inside
CONNECT will retain its existing behaviour.

I assume you are referring to the ssl_bump none case above because the
ConnStateData has a buffer filled with SSL handshake data? AIUI that
data is handled by your existing waitingForConnect logic. The check that
disables setting the waiting flag is intentionally wrapped in a test for
the PRI method, to ensure the waiting logic is unchanged and simply not
enabled for PRI.
  This update should at worst do nothing, and at best fix those edge
cases where a client SNI or similar is pushed into ConnStateData early
and hits bug 3371.

> * Will everything work correctly if an HTTP2 PRI request is sent over a
> bumped connection?

I believe so. That, and having them pipelined via Upgrade: without a
CONNECT wrapper, are two reasons why I placed the PRI detection after
parsing as an HTTP/1 request.

>
>> + debugs(26, 9, "\n----------\n" << StringArea(from.buf, from.len) << "\n----------");
>> + debugs(26, 9, "Tunnel PUSH Payload: \n" << StringArea(tunnelState->http->getConn()->in.buf, tunnelState->http->getConn()->in.notYetUsed) << "\n----------");
> Please remove these (and any other payload dumping debugs if I missed
> them). Today, ALL,9 logs are essential for triage in many cases and they
> are huge enough without payload dumps. If we start dumping payload like
> that, they will become unusable in many cases.
>
> We may have agreed that level-9 debugging is for data dumping. However,
> the current code uses that level for other things _and_ any current
> dumping is limited to rare "interesting" or "important" cases. This
> keeps ALL,9 usable. The above dumps violate that principle.
>
> Also, the correct way to debug payload is the Raw() API from Debug.h,
> not StringArea.

I will drop the main copy loop one, but for a while the PUSH payload
dump will remain useful for debugging the initial bytes in the cases you
queried testing of above.

Converted to Raw(). Thanks for the reminder.

>
>> if (cbdataReferenceValid(tunnelState)) {
>> +
>> if (!tunnelState->server.len)
> Please remove non-changes.

Oops. Done.

Amos
Received on Fri Aug 23 2013 - 05:57:31 MDT
