Re: WebSockets negotiation over HTTP

From: Henrik Nordstrom <henrik_at_henriknordstrom.net>
Date: Thu, 30 Jul 2009 11:44:09 +0200

On Thu, 2009-07-30 at 08:21 +0000, Ian Hickson wrote:

> > 1) CONNECT and HTTP Upgrade are optional, and independent. One may be used or
> > the other. They may be both tried in any order.
>
> That doesn't sound specific enough to get interoperable behaviour. It
> seems like we'd want to define exactly what the sequence of events should
> be (as the draft does now, for instance), rather than leaving it open.

The proposal is that you have the following:

1. A definition of the independent WebSockets protocol, being a
bidirectional octet stream protocol per your specification.

2. Profiles defining how to establish a WebSockets transport in HTTP
infrastructure if needed, covering both "normal" and proxied setups.

For (1) we don't care what you do, just as we don't care about any other
non-HTTP protocol (SMTP, IRC, SNMP, SMB, whatever).

But for (2) you need to use HTTP, which essentially boils down to
defining that a web server may be switched to WebSockets by using the
HTTP Upgrade mechanism as defined by HTTP.

> If you want to first try port 81, then try port 815, then try another
> port, then we are talking about at least _three_ connection attempts, each
> of which could have multisecond latency. That simply isn't workable.

Well, you are basically trying to do firewall avoidance. That in itself
isn't workable and will require a bit of work to be successful.

If you were not trying to do firewall avoidance then none of this
"trying different ports" business would be needed, and the server would
specify THE method to use from the start.

> > 3) The specific order of bytes is not mentioned _anywhere_ in the new text.
>
> That seems like a problem, not a benefit.

It's a benefit.

HTTP is not an exact octet sequence protocol, but it still has very well
defined message syntax and boundaries.

What happens after the HTTP upgrade has completed (101 HTTP response
seen) is up to you, but before that the exchange is HTTP, if you are to
use an HTTP Upgrade sequence for switching to WebSockets at all
(another port is most likely better).

The HTTP Upgrade method is in HTTP terms just a request "please can we
switch to protocol X instead?", and this question and its answer are
defined in HTTP terms. What happens after that is however of no concern
to HTTP and is specified by whatever protocol the connection got
"upgraded" to.
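
To make the "please can we switch to protocol X" request concrete, here is a minimal sketch in Python of what a client could send. The function name, the host and the path are made up for illustration; the "WebSockets/1.0" token is the one used in the 101 example in this mail, and the "Connection: Upgrade" line is what HTTP/1.1 requires alongside an Upgrade header.

```python
def build_upgrade_request(host, path="/", protocol="WebSockets/1.0"):
    """Return the bytes of a minimal HTTP/1.1 request asking to switch protocols.

    Hypothetical helper: host and path are placeholders supplied by the caller.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: Upgrade",   # HTTP/1.1 wants the "upgrade" token listed here
        f"Upgrade: {protocol}",  # the protocol we are asking to switch to
        "",                      # empty line terminating the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")
```

Everything up to the server's answer is ordinary HTTP. Only after a 101 has been seen does the connection belong to the new protocol.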

> > 4) The order of headers _received_ is not mentioned past the 101 / 4xx /5xx
> > line. HTTP varies order in-transit.
>
> I'd feel much more confident with more than one line's worth of handshake.

The 101 is more than one line. A minimal 101 response for switching to
WebSockets is:

HTTP/1.1 101 Switching Protocols
Upgrade: WebSockets/1.0
[empty line]

But as already said there MAY be additional headers, and the
Reason-Phrase MAY be different; it is intended for humans and SHOULD NOT
be interpreted by machines.

For example the following response is equivalent to the above:

HTTP/1.1 101 Protocoles de commutation
Server: WonderfulMagicServer/4.2 (WonderOS, MagicLogic/4.3)
SomeHeaderDefinedByWonderfulMagic: SomeValue
Authenticate-Info: xxxx
Upgrade: WebSockets/1.0
[empty line]
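
As a rough sketch of why the two responses above are equivalent to a client, the following Python parses a response head by HTTP's message syntax rather than by octet comparison. The function names are made up for illustration, and the parsing is deliberately simplified (no folded header lines, and repeated headers simply overwrite each other).

```python
def parse_response_head(raw):
    """Parse an HTTP response head into (status_code, headers).

    The Reason-Phrase is discarded and header names are lower-cased, so
    neither the wording nor the header order affects the result.
    """
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    version, code, *_reason = status_line.split(" ", 2)
    if not version.startswith("HTTP/"):
        raise ValueError("not an HTTP response")
    headers = {}
    for line in header_lines:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line: %r" % line)
        headers[name.strip().lower()] = value.strip()
    return int(code), headers


def is_websockets_switch(raw):
    """True if the response is a 101 that upgrades to WebSockets/1.0."""
    code, headers = parse_response_head(raw)
    return code == 101 and headers.get("upgrade", "").lower() == "websockets/1.0"
```

Fed the minimal response and the verbose one, is_websockets_switch() gives the same answer for both, because only the status code and the Upgrade header are consulted.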

Additionally this is just the negotiation for switching protocol. What
follows after this is mostly intended to be self-descriptive and not
relying much on the properties of the HTTP Upgrade handshake. But the
server MAY require HTTP authentication to accept the Upgrade.

> > 5) Specific mention is made to ignore non-understood headers added randomly by
> > intermediaries.
>
> So long as that happens after the handshake, that's ok, but we can't allow
> that inside the handshake, it would allow for smuggling data through,

If you hold this view then you CANNOT use HTTP for the Upgrade
handshake, and MUST use another port and other protocol signatures.

> effectively faking the handshake with unexpecting servers. (I also don't
> really understand the point. If there are intermediaries adding data, then
> frankly we probably _do_ want the connection to fail.)

Faking the handshake to an unsuspecting server is equally hard even if
you use an exact octet sequence.

An HTTP server cannot respond with the 101 mentioned above unless it
supports WebSockets or the attacker has full control over the server,
down to the exact octet sequence of the response. If the server does not
accept the upgrade you may see a 200, 3xx, 4xx or 5xx response, but not
101. Some of those response codes mean that the client is expected to
take certain actions, like trying again with proper authentication
(e.g. 401), but most are just different variants of "not supported" in
the context of WebSockets. See the Authenticate-Info header in the
response above for an example.
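
The client-side decision described above can be sketched in a few lines of Python. The function name and the returned labels are made up for illustration, and only the 401 case is singled out because it is the one named in this mail; a real client would likely act on more codes.

```python
def classify_upgrade_result(status_code):
    """Map the status code answering an Upgrade request to a client action.

    The labels are hypothetical; they only mirror the cases discussed here.
    """
    if status_code == 101:
        return "switched"          # server agreed, connection is now WebSockets
    if status_code == 401:
        return "retry-with-auth"   # try again with proper authentication
    return "not-supported"         # 200, 3xx, other 4xx, 5xx: no upgrade
```

A server that knows nothing about WebSockets ends up in the last branch whatever it answers, since it has no reason to produce a 101.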

> The handshake absolutely must be the very first byte, otherwise you can
> just trick the remote end into sending back the appropriate bytes for the
> handshake half-way through what it thinks is an unrelated part of the
> connection, depending on what the remote end's protocol is.

HTTP does not accept this. The format of HTTP requests and responses is
well defined, and an HTTP implementation does not skip over random
"unknown" data hoping to make sense of what follows.

HTTP does define a header format, and expects "unknown" but otherwise
well-formed headers to be ignored, allowing for the protocol to be
extended when needed.
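
A small Python sketch of that distinction, under simplifying assumptions: the function name is made up, folded (continuation) header lines are not handled, and the header-name check is looser than HTTP's real token grammar. It accepts a head that starts with a status line followed by well-formed headers, known or not, and rejects anything else.

```python
import re

# Status line: version, three-digit code, optional Reason-Phrase.
STATUS_LINE = re.compile(rb"HTTP/\d\.\d (\d{3})(?: [^\r\n]*)?\r\n")
# Any "name: value" line, whether or not the name is known to us.
HEADER_LINE = re.compile(rb"[^\s:]+:[^\r\n]*\r\n")


def accept_response_head(data):
    """Return the status code if data begins with a well-formed head, else None."""
    m = STATUS_LINE.match(data)   # must match at the very first byte
    if m is None:
        return None
    pos = m.end()
    while not data.startswith(b"\r\n", pos):   # until the empty line
        h = HEADER_LINE.match(data, pos)
        if h is None:
            return None           # junk or truncated data: not skipped over
        pos = h.end()
    return int(m.group(1))
```

An unknown header such as "X-Unknown: 1" passes, while stray bytes before the status line or a line without a colon make the whole head invalid.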

> We don't want WebSocket relayed by MITM proxies. That's not compatible
> with HTTP as far as I can tell.

The HTTP Upgrade mechanism is compatible with MITM proxies, but they
need to agree on supporting it (HTTP Upgrade, not WebSockets
specifically).

> (I really don't understand what is "flaky" about a fixed set of bytes.)

It's about as flaky as placing requirements on exactly where the stamps
(including the postal service's own stamps on them) should be on a
letter sent through the post, and then refusing letters that the postal
service delivered and considers perfectly valid, because the stamps are
not aligned exactly as demanded, or have a different value than
specified, in your specification of "this is the kind of letter I accept
to be delivered to me by the postal service".

> I don't understand why HTTP would be involved here at all. The only reason
> we have any kind of HTTP-like handshake is to allow for port sharing, not
> to allow any kind of relaying. I wouldn't expect or desire HTTP proxies to
> know anything about WebSocket. If any proxies are going to be used, they
> should be protocol-agnostic proxies like SOCKS.

In which case you SHOULD NOT be using an HTTP handshake at all.

It may well be specified along the same lines as HTTP if you like, but
the protocol signature SHOULD NOT be HTTP, and the intention SHOULD NOT
be "to make it easy to share a port".

Regards
Henrik
Received on Thu Jul 30 2009 - 09:44:28 MDT
