Re: WebSockets negotiation over HTTP

From: Ian Hickson <ian_at_hixie.ch>
Date: Fri, 4 Sep 2009 01:44:48 +0000 (UTC)

On Fri, 14 Aug 2009, Amos Jeffries wrote:
> > >
> > > Being sensitive to whether the server replies "101 Blah" versus "101
> > > blah" absolutely cripples WebSockets. We want to help you fix this
> > > problem.
> >
> > I still do not understand why anything gets crippled. Maybe you could
> > show an example of how you expect this problem to occur?
>
> Your protocol definition used byte-level. At the byte-level 'b' (0x62
> IIRC) does not equal 'B' (0x42 IIRC). Thus the response is a different
> byte pattern and a failed WebSocket connection.

Sure, but why does this cripple Web Sockets? A compliant Web Socket server
won't send back a lower-case "b" here.

(This has nothing to do with bytes vs text, by the way; we can just as
easily have case-insensitivity with a byte-based definition. Indeed, a
later part of the handshake does exactly that.)
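
As a rough illustration of that point (a sketch only, not text from the
spec): a comparison can be defined entirely over bytes and still ignore
ASCII case. The helper names ascii_lower and equal_ignoring_ascii_case
are made up for this example.

```python
def ascii_lower(data: bytes) -> bytes:
    """Map bytes 0x41-0x5A ('A'-'Z') to 0x61-0x7A; leave every other byte alone."""
    return bytes(b + 0x20 if 0x41 <= b <= 0x5A else b for b in data)


def equal_ignoring_ascii_case(a: bytes, b: bytes) -> bool:
    """Byte-level comparison that treats ASCII upper and lower case as equal."""
    return ascii_lower(a) == ascii_lower(b)


# Exact byte equality distinguishes "Blah" from "blah"...
exact = b"101 Blah" == b"101 blah"
# ...while the case-folding comparison, still defined on bytes, does not.
folded = equal_ignoring_ascii_case(b"101 Blah", b"101 blah")
```

No text decoding is involved at any point; only the 26 ASCII letter
bytes are folded, so bytes outside that range still have to match
exactly.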

> One very real example of this would be the web server or a fully
> WebSocket capable intermediary sending back bytes
>
> Your spec section 3.1 sub 12 says:
> "
> 12. Read the first 85 bytes from the server. If the connection
> closes before 85 bytes are received, or if the first 85 bytes
> aren't exactly equal to the following bytes, then fail the Web
> "
> [ note the words _exactly equal_ ]
> "
> Socket connection and abort these steps.
>
> 48 54 54 50 2f 31 2e 31 20 31 30 31 20 57 65 62
> 20 53 6f 63 6b 65 74 20 50 72 6f 74 6f 63 6f 6c
> 20 48 61 6e 64 73 68 61 6b 65 0d 0a 55 70 67 72
> 61 64 65 3a 20 57 65 62 53 6f 63 6b 65 74 0d 0a
> 43 6f 6e 6e 65 63 74 69 6f 6e 3a 20 55 70 67 72
> 61 64 65 0d 0a
> "
>
> example #1 suppose there was an intermediary translating websockets-over-http
> to websockets-port-81 which used HTTP to format said headers of confirmation.

Here is the problem I have with this: Why would we suppose the existence
of such an intermediary? Why would anyone ever implement a
WebSocket-specific proxy?

> In all other ways it is fully WebSockets compliant. But sends byte 18 as
> 73 (s) instead of 53 (S). Boom! The entire application is not WebSockets
> compliant and will fail every single transaction that goes through it.

Nobody would ever ship such a proxy, since the bug would be immediately
detected with the most rudimentary of testing.

This seems like a benefit to me, not a problem.

> example #2 is where the traffic is processed by an HTTP-only
> intermediary which sees the 'Upgrade:' header and flags the connection
> for transparent pass-thru (This by the way is the desirable method of
> making Squid support WebSockets).
>
> Being a good HTTP relay it accepts these bytes:
> HTTP/1.1 101 Web Socket Protocol Handshake
> Upgrade: WebSocket
> Connection: Upgrade
>
> It violates HTTP by omitting the Via and other headers your spec omits to
> handle. And passes these on:
> HTTP/1.1 101 Web Socket Protocol Handshake
> Connection: Upgrade
> Upgrade: WebSocket
>
> then moves to tunnel mode for you.

Why would it violate HTTP in all the ways you mention? If it goes to such
extreme lengths to have transparent pass-through to the point of violating
HTTP, why would it then go out of its way to reorder header lines?

Such a proxy would raise all kinds of alarm bells for me, and I would be
_glad_ that the connection failed. If it didn't, I wouldn't be sure we
could trust the rest of the data.
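
To make the two examples above concrete, here is a minimal sketch of
the exact-match check described in the quoted step 12. The 85 bytes are
taken from the hex dump quoted earlier; the names handshake_ok and
first_mismatch are hypothetical and exist only for this illustration.

```python
# The 85 bytes listed in the quoted step 12, written as ASCII text.
EXPECTED = (
    b"HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    b"Upgrade: WebSocket\r\n"
    b"Connection: Upgrade\r\n"
)


def handshake_ok(received: bytes) -> bool:
    """True only if at least 85 bytes arrived and the first 85 match exactly."""
    return len(received) >= len(EXPECTED) and received[:len(EXPECTED)] == EXPECTED


def first_mismatch(received: bytes):
    """Return the 1-based position of the first differing byte, or None
    if no compared byte differs."""
    for i, (got, want) in enumerate(zip(received, EXPECTED), start=1):
        if got != want:
            return i
    return None


# Example #1: byte 18 sent as 0x73 ('s') instead of 0x53 ('S').
lowercase_s = EXPECTED[:17] + b"s" + EXPECTED[18:]

# Example #2: the two header lines swapped by an intermediary.
reordered = (
    b"HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
    b"Connection: Upgrade\r\n"
    b"Upgrade: WebSocket\r\n"
)
```

Under this sketch handshake_ok() rejects both lowercase_s and
reordered, and first_mismatch() points at the first offending byte, so
either kind of intermediary fails on the very first connection
attempt.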

> > > > > This is because if it doesn't, how can it say 'I won't upgrade'.
> > > > It can just not upgrade. Returning anything but the correct handshake
> > > > will be treated as a failed connection by the WebSocket client.
> > > Correct.
> >
> > So why would it need to say "I won't upgrade"?
>
> To inform MITM that the upgrade is not going to happen and the links they have
> open may be used for other HTTP things without wasting network resources
> tearing them down and rebuilding.

The MITM isn't the WebSocket client. In this situation, it's a
(non-compliant, since it forwarded hop-by-hop headers) HTTP proxy. What's
more, in this scenario the server isn't a WebSocket server, either, it's
an HTTP server. So what Web Socket says is irrelevant.

> > > > On Thu, 30 Jul 2009, Robert Collins wrote:
> > > > > > Suppose we had no handshake at all, and that there was no data
> > > > > > framing, so that as soon as we connected to a port, we could send
> > > > > > arbitrary data down.
> > > > > >
> > > > > > A Web page, say evil.example.net, could open a Web Socket connection
> > > > > > to http://www.corp.example.com/, send it a GET request for
> > > > > > /secret-plans, and then forward the contents of the file to a remote
> > > > > > host. If they could then trick someone on example.com's intranet to
> > > > > > look at this file, and assuming www.corp.example.com did nothing
> > > > > > more than rely on connectivity for authentication (pretty common
> > > > > > in small intranets), then evil.example.net could steal the company's
> > > > > > secret plans.
> > > > > Can't the web page just send an ajax request to corp.example.com
> > > > > anyway?
> > > > It can't read the response, no.
> > > Can you explain this please?
> > > AFAIK, Sending a request then closing the connection immediately is the
> > > only way said response might be unreadable. I don't understand what you
> > > mean.
> >
> > A script on a Web page can cause a GET request to be sent to an arbitrary
> > URL, but it has no way to obtain the contents of the result of that request
> > unless that server opts in (using CORS) to allowing the script to see the
> > contents of the response.
>
> So in other words:
>
> the remote web server being queried has to accept the TCP connection,
> perform the correct HTTP-level handshakes, then correct CORS handshakes in
> order for you to use those links?
>
> as compared to:
>
> the remote web server being queried has to accept the TCP connection, perform
> the correct HTTP-level handshakes, then correct WebSockets handshakes in order
> for you to use those links?
>
> (note the single word difference).

Yes, CORS and the Web Socket handshake are more or less equivalent. (There
are just some minor differences because they are different protocols with
different needs.)

> > WebSockets, even when initiating the connection by talking to an HTTP
> > server and then switching to WebSockets, isn't layered on HTTP. It's
> > just doing the bare minimum required to allow the HTTP server to
> > understand the request and get out of the way. Once the handshake is
> > complete, there is no HTTP anywhere on the stack at all.
>
> It's not doing the bare minimum.
>
> The bare minimum would be to accept the valid HTTP transforms which the
> Internet _will_ perform on the handshake. Discard those useless transforms and
> validate the handshake status line.

The "bare minimum" is the least amount of processing possible. What you
describe is _more_ processing, not less. Therefore it's not the minimum.

> > Could you show me a proof of concept of this? I've not been able to
> > get any sort of two-way connection to a remote host when there's a
> > MITM proxy in the middle of the connection, regardless of what packets
> > I send. Any help you could provide here would be much appreciated.
>
> It's not the sending that matters so much. It's the receiving.
>
> Write your code as I spec'd out for you earlier.
> * Send what you want.
> * Receive whatever HTTP arrives and discard/skip/ignore all the useless
> headers that come back before the first pair of 'CRLFCRLF' (two sequential
> CRLF).
> * The bytes that follow after the 101 reply (for Upgrade:) or 200 (for
> CONNECT) are your two-way TCP link.

This is nowhere near secure enough. For example, it would allow scripts to
talk to HTTP servers unhindered.
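
A sketch of why, under a deliberately simplified reading of the quoted
procedure: skip everything up to the first CRLFCRLF, accept a 101 or
200 status, and hand whatever follows to the script. The function name
lenient_accept is hypothetical, and the sketch assumes the client does
not track whether it sent Upgrade or CONNECT.

```python
from typing import Optional


def lenient_accept(stream: bytes) -> Optional[bytes]:
    """Return the bytes after the header block if the status is 101 or 200,
    otherwise None. All header lines are ignored."""
    head, sep, rest = stream.partition(b"\r\n\r\n")
    if not sep:
        return None  # no end-of-headers marker seen
    status_line = head.split(b"\r\n", 1)[0]
    parts = status_line.split(b" ", 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return None
    if parts[1] in (b"101", b"200"):
        return rest
    return None


# An ordinary HTTP server that knows nothing about Web Sockets.
plain_http_reply = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"secret plans"
)

exposed = lenient_accept(plain_http_reply)
```

Here the server never opted in to anything, yet the body of its
response ends up readable by the script, which is the situation the
handshake is meant to prevent.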

On Thu, Jul 30 2009, Henrik Nordstrom wrote:
> tor 2009-07-30 klockan 08:21 +0000 skrev Ian Hickson:
> > >
> > > 1) CONNECT and HTTP Upgrade are optional, and independent. One may
> > > be used or the other. They may be both tried in any order.
> >
> > That doesn't sound specific enough to get interoperable behaviour. It
> > seems like we'd want to define exactly what the sequence of events
> > should be (as the draft does now, for instance), rather than leaving
> > it open.
>
> The proposal is that you have the following:
>
> 1. A definition of the independent WebSockets protocol, being a
> bidirectional octet stream protocol per your specification.
>
> 2. Profiles defining how to establish a WebSockets transport in HTTP
> infrastructure if needed, covering both "normal" and proxied setups.
>
> For '1' we don't care what you do, just as we don't care about any other
> non-HTTP protocol (SMTP, IRC, SNMP, SMB, whatever).
>
> But for 2 you need to use HTTP, which essentially boils down to defining
> that one may switch a webserver to use WebSockets by using the HTTP
> Upgrade mechanism as defined by HTTP.

I understand that you disagree with my interpretation, but my
interpretation is that this is exactly what the spec does already.

> > > 5) Specific mention is made to ignore non-understood headers added
> > > randomly by intermediaries.
> >
> > So long as that happens after the handshake, that's ok, but we can't
> > allow that inside the handshake, it would allow for smuggling data
> > through,
>
> If having this view then you CAN NOT use HTTP for the Upgrade handshake,
> and MUST use another port and other protocol signatures.

IANA expert review has informed me that I must use ports 80 and 443, so
I don't have a choice here.

> It may well be specified along the same lines as HTTP if you like, but
> the protocol signature SHOULD NOT be HTTP, and intentions SHOULD NOT be
> "to make it easy to share port".

Those are my intentions. I am sorry you disagree with them, but I don't
really see what I can do about it.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Fri Sep 04 2009 - 01:41:37 MDT
