Re: WebSockets negotiation over HTTP

From: Ian Hickson <ian_at_hixie.ch>
Date: Wed, 21 Oct 2009 23:52:37 +0000 (UTC)

On Sat, 17 Oct 2009, Mark Nottingham wrote:
> On 17/10/2009, at 9:09 AM, Ian Hickson wrote:
> > On Wed, 14 Oct 2009, Mark Nottingham wrote:
> > >
> > > Section 5.2 does constrain the bytes the server accepts from the
> > > client, thereby conflicting with HTTP, but only in some small
> > > details. In particular, it makes HTTP header field-names
> > > case-sensitive, and requires certain arrangements of whitespace in
> > > them.
> > >
> > > Ian, if you can address these small things in section 5.2 it would
> > > help.
> >
> > If a WebSocket client is connecting to a WebSocket server, then this
> > isn't HTTP, it's just the WebSocket protocol. So whether the fields
> > are parsed like HTTP is presumably not a problem.
> >
> > If an HTTP client is connecting to a WebSocket server, then the
> > server's response is going to be garbage (from the HTTP client's
> > perspective) anyway, much like if an HTTP client were to connect to an
> > SMTP server. So how the server parses the fields doesn't really
> > matter.
> >
> > If a WebSocket client is connecting to an HTTP server, then the
> > requirements in this section don't apply to the server.
> >
> > If an HTTP client is connecting to an HTTP server, then the whole spec
> > doesn't apply.
> >
> > Which case is the one you are concerned about? Are my conclusions
> > above incorrect?
>
> Until the upgrade is complete, you're speaking HTTP and working with
> HTTP implementations.

How so? A WebSocket client is always talking Web Socket, even if it might
also sound like HTTP.

> Have you verified that implementations (e.g., Apache module API) will
> give you byte-level access to what's on the wire in the request, and
> byte-level control over what goes out in the response?

On the server side, you don't need wire-level control over what's coming
in, only over what's going out.

There's already a WebSocket module for Apache, by the way:

   http://code.google.com/p/pywebsocket/

> Overall, I guess I'm just not seeing how running WebSockets on port 80
> (i.e., co-existent with an HTTP server) is ever a good idea.

I wouldn't recommend co-existing with a port 80 HTTP server. The
co-existing support is really for port 443.

> Since a sizeable portion of the Internet is accessed through proxies
> (e.g., hotels, universities, corporations, mobile phones, and some
> ISPs), and none of the deployed infrastructure will support WebSockets,
> deploying in this fashion alone won't be workable on the open Internet;
> people using this technique will have to also deploy a fallback server
> on a different port. So, why bother, and why force people to write the
> code for fallback? What value is there in doing it this way?

By and large, you can connect over port 443 without the proxy getting in
the way. That's the model that I would expect most Web Socket deployments
to use.

> Despite all of this, you say:
>
> > The simplest method is to use port 80 to get a direct connection to a
> > Web Socket server. Port 80 traffic, however, will often be
> > intercepted by HTTP proxies, which can lead to the connection failing
> > to be established.
>
> which I think is misleading; this is far from the simplest way to use
> WebSockets, from a deployment perspective.

True. I've tried to reword this to avoid this possible ambiguity.

> > > The other aspect here is that you're really not using Upgrade in an
> > > appropriate fashion; as mentioned before, its intended use is to
> > > upgrade *this* TCP connection, not redirect to another one.
> >
> > There's only one TCP connection established. As far as I can tell,
> > WebSocket never does a redirect of any kind.
>
> Section 5.1 of draft -48 says:
>
> > Send the string "WebSocket-Location" followed by a U+003A COLON (:)
> > and a U+0020 SPACE, followed by the URL of the Web Socket script,
> > followed by a CRLF pair (0x0D 0x0A).
> >
> > For instance:
> >
> > WebSocket-Location: ws://example.com/demo
> >
> > NOTE: Do not include the port if it is the default port for Web
> > Socket protocol connections of the type in question (80 for
> > unencrypted connections and 443 for encrypted connections).
>
> This looks an awful lot like a redirect.

There's no redirection involved here. It's just confirming the opened URL,
as part of the handshake. The TCP connection is not closed (unless the
handshake fails, and then it's not reopened).
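
To make the quoted requirement concrete, here is a rough Python sketch of
how a server might build that line. It is an illustration only, not text
from the spec and not code from pywebsocket; the function name, its
parameters, and the DEFAULT_PORTS table are hypothetical.

```python
# Hypothetical helper: builds the WebSocket-Location handshake line as
# described in the quoted text of section 5.1 (name, colon, space, URL, CRLF).
DEFAULT_PORTS = {"ws": 80, "wss": 443}  # default ports named in the quoted NOTE


def websocket_location_line(host, path, port=None, secure=False):
    """Return the WebSocket-Location line as bytes, ending in CRLF."""
    scheme = "wss" if secure else "ws"
    # Leave the port out when it is the default for this kind of connection.
    if port is None or port == DEFAULT_PORTS[scheme]:
        authority = host
    else:
        authority = "%s:%d" % (host, port)
    url = "%s://%s%s" % (scheme, authority, path)
    return ("WebSocket-Location: " + url + "\r\n").encode("ascii")


if __name__ == "__main__":
    print(websocket_location_line("example.com", "/demo"))
```

The line only echoes the URL that was opened; nothing in it asks the
client to open a second connection.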

> I see now that you have the client-side fail a connection where the URL
> doesn't match, but that's really not obvious in 5.1. Please put some
> context in there and reinforce that the URL has to be the URL of the
> current script, not just any script.

Ok, I've added a note at the end of that section explaining that the user
agent will fail the connection if the strings don't match what the UA
sent. Please let me know if you'd like anything else clarified; I don't
really know exactly what should be made clearer.
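
For what it's worth, the client-side check amounts to something like the
following Python sketch. This is an assumption-laden illustration, not
the spec's algorithm: the function name is hypothetical, and it assumes
a plain byte-for-byte comparison against the URL the client opened.

```python
# Hypothetical client-side check: accept the handshake line only if it
# carries exactly the URL this client opened. Byte-for-byte comparison
# is an assumption of this sketch.
PREFIX = b"WebSocket-Location: "


def location_matches(header_line, opened_url):
    """Return True if header_line confirms opened_url, else False."""
    # The field name, the single space, and the CRLF are matched exactly,
    # so a different case or different whitespace counts as a failure.
    if not header_line.startswith(PREFIX) or not header_line.endswith(b"\r\n"):
        return False
    received_url = header_line[len(PREFIX):-2]
    return received_url == opened_url.encode("ascii")


if __name__ == "__main__":
    line = b"WebSocket-Location: ws://example.com/demo\r\n"
    print(location_matches(line, "ws://example.com/demo"))
    print(location_matches(line, "ws://example.com/other"))
```

On a False result the client would fail the connection rather than
follow the URL anywhere.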

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wed Oct 21 2009 - 23:39:44 MDT
