Re: Fundamental architecture of a proxy implementing keep-alive.

From: Dancer <dancer@dont-contact.us>
Date: Thu, 26 Aug 1999 10:54:19 +1000

"James A. Donald" wrote:
>
> "James A. Donald" wrote:
> > > Junkbuster relies on the connection NOT being kept alive in order
> > > to tell where one request ends and another begins, and fails to
> > > translate and act upon
> > > - Keep-Alive
> > > - Connection
> > > - Proxy-Authenticate
> > > - Proxy-Authorization
> > > - TE
> > > - Trailers
> > > - Transfer-Encoding
> > > - Upgrade
> > >
> > > Instead it mindlessly passes them along, producing all sorts of
> > > nasty failures.
> > >
> > > The first problem with keep-alive is that I do not know how to
> > > tell where a request or response ends, so that I do not know when I
> > > can throw away state for the previous request and prepare to
> > > translate new state.
>
> dancer:
> > The answer is Content-length. Unless you are doing chunked encoding.
> > So if you drop transfer-encoding support and anything else that
> > relies on chunked encoding then your job gets a whole lot simpler.
> > If the response has a Content-length header, then your object is
> > exactly that long, or ends when the connection does, whichever
> > happens first. If you don't have a content-length, then the object
> > ends when the connection does, and that's that.
>
> What about objects that do not have bodies, for example most gets, and
> the response to a HEAD request?

Objects without bodies don't have bodies to worry about :)
For requests, the presence or absence of a content-length header will
tell you if you should expect a body. You cannot send one without, IIRC.
As for a HEAD, you know you're doing a HEAD..but there _might_ be a body
if the object is generated by a CGI that doesn't check the method.
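
A minimal sketch of the "exactly that long, or ends when the connection
does, whichever happens first" rule, assuming the body is read from a
blocking binary stream. The name read_body and the bufsize parameter are
illustrative only, not taken from any existing proxy:

```python
import io


def read_body(stream, content_length=None, bufsize=4096):
    """Read one message body from a binary stream.

    With a content_length, read at most that many bytes, stopping early
    if the stream ends first. Without one, read until the stream ends.
    """
    chunks = []
    remaining = content_length
    while remaining is None or remaining > 0:
        want = bufsize if remaining is None else min(bufsize, remaining)
        data = stream.read(want)
        if not data:
            # End of stream: the connection closed before (or instead of)
            # the advertised length being reached.
            break
        chunks.append(data)
        if remaining is not None:
            remaining -= len(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Bytes past Content-Length are left on the stream for the next message.
    s = io.BytesIO(b"helloNEXT")
    print(read_body(s, 5), s.read())
```

With a real socket you would wrap it first (for example with
socket.makefile("rb")) so that read() behaves as above.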

>
> Will the following algorithm work?
>
> If there is no Content-Length or Transfer-Encoding, assume
> there is no body, and the message ends with the first blank
> line.
>
> If there is a Content-Length, assume the Content-Length bytes
> following the first blank line are the body.
>
> If there is a Transfer-Encoding, and no Content-Length,
> suppress the Connection: Keep-Alive header and ensure the
> presence of a "Connection: Close" header. Assume that the end
> of the message will be indicated by a termination of the
> connection, and just pass everything as is until the
> connection terminates.
>
> Pass through the TE header
>
> Suppress the Upgrade header.

This presupposes that you have multiple outstanding requests on a server
connection, doesn't it? (see below)
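
For reference, the three framing rules in the quoted algorithm can be
sketched roughly as follows. This follows the quoted text literally and
is not a complete HTTP/1.1 implementation; the function names and the
("none" / "length" / "until_close") labels are made up for illustration:

```python
def parse_headers(head):
    """Parse a raw header block (bytes, up to and including the blank
    line) into a dict keyed by lower-cased header name."""
    lines = head.decode("latin-1").split("\r\n")
    headers = {}
    for line in lines[1:]:  # lines[0] is the request or status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def classify_framing(headers):
    """Apply the quoted rules in the order they were given.

    Returns (kind, length):
      ("length", n)         body is the n bytes after the blank line
      ("until_close", None) pass bytes through until the connection
                            closes; the caller should force
                            "Connection: close"
      ("none", 0)           no body; message ends at the blank line
    """
    if "content-length" in headers:
        # Note: HTTP/1.1 says Transfer-Encoding wins when both headers
        # are present; the quoted algorithm checks Content-Length first.
        return ("length", int(headers["content-length"]))
    if "transfer-encoding" in headers:
        return ("until_close", None)
    return ("none", 0)


if __name__ == "__main__":
    head = b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n"
    print(classify_framing(parse_headers(head)))
```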

> Will a client issue to a proxy two requests for two different
> servers on the same connection? If it does, then the proxy would
> need some nasty serialization code.

I don't see anything to say that it shouldn't or can't. I don't think
many implementations _do_ (I have code that does this, for example), but
I think it must be allowed for. If you are supporting chunked encoding,
then I think it likely. You have two options (referring also to the
sequence above, for the latter one), as I see it:

1) Keep a connection id for each connection, and a request id for each
request, so that you can preserve the order of responses. You might want
to do this anyway just for accounting.

2) Don't accept a second request from the client-side connection until
you've dealt with the previous one. That's got to be the simplest durn
method of dealing with them. It's not unreasonable to assume that the
client may open multiple parallel persistent connections on a
time/workload basis. I use a similar strategy in opening connections to
my own server-thing.

Not-Really-3-because-I-only-mentioned-2) If you're using chunked
encoding, then I believe you are not required to return responses in the
same order that they were requested. Interleave chunks of any objects
you have available in any order as-and-when bits of them become
available. By signalling support for chunked encoding, it's the client's
job to defragment the responses and sort them into coherent order.
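
A rough sketch of option 1, assuming each client connection gets its own
sequencer object and that responses may finish out of order on the
server side. The class name ResponseSequencer and its methods are
hypothetical:

```python
class ResponseSequencer:
    """Hand out request ids for one client connection and release
    finished responses strictly in the order the requests arrived."""

    def __init__(self):
        self._next_id = 0   # id to assign to the next incoming request
        self._next_out = 0  # id of the next response owed to the client
        self._ready = {}    # finished responses waiting for their turn

    def register(self):
        """Call when a request arrives; returns its request id."""
        rid = self._next_id
        self._next_id += 1
        return rid

    def complete(self, rid, response):
        """Call when the response for rid is finished. Returns the list
        of responses that can now be sent, in request order."""
        self._ready[rid] = response
        out = []
        while self._next_out in self._ready:
            out.append(self._ready.pop(self._next_out))
            self._next_out += 1
        return out


if __name__ == "__main__":
    seq = ResponseSequencer()
    first, second = seq.register(), seq.register()
    print(seq.complete(second, "B"))  # held back until "A" is done
    print(seq.complete(first, "A"))
```

Option 2 is the degenerate case: never register a second request until
the first has been completed and sent.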

> If this never happens, then the following algorithm should work.
>
> When the client initiates a new connection with the proxy,
> create a thread to handle client to server data. This thread
> reads the first header of the request, and then opens a
> connection with the server, creating a second thread to handle
> the server to client data.
>
> If it is talking to a server, the thread handling client
> requests strips the domain name out of the URLs, and
> translates Proxy-Connection headers to Connection headers, and
> suppresses Upgrade headers. If it is talking to a chained
> proxy, it passes stuff as it is given, unless it needs to
> replace a "Keep-Alive" with a "Close" because it cannot figure
> the length.
>
> Similarly the server to client thread translates Connection to
> Proxy-Connection headers, and "Keep-Alive" to Close as needed.
>
> If the proxy-server connection closes expectedly (after having
> provided a response to each request) the server thread
> terminates.
>
> If the proxy-server connection closes unexpectedly (the server
> has not provided a response to each request), the proxy-client
> connection is terminated, and both threads terminate.

I think you can (and perhaps should) return a 502 here.
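
The header and URL translation described in the quoted scheme might look
something like this. It is only a sketch: headers are assumed to be a
list of (name, value) pairs already parsed elsewhere, and all function
names here are invented for the example:

```python
from urllib.parse import urlsplit


def origin_form(url):
    """Strip the scheme and host from an absolute URL, leaving the
    path (and query) to send to an origin server."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return path + ("?" + parts.query if parts.query else "")


def translate_request_headers(headers, upstream_is_proxy=False):
    """Client-to-server direction: rename Proxy-Connection to
    Connection and drop Upgrade. Passed through untouched when the
    next hop is itself a proxy."""
    if upstream_is_proxy:
        return list(headers)
    out = []
    for name, value in headers:
        lname = name.lower()
        if lname == "upgrade":
            continue
        if lname == "proxy-connection":
            name = "Connection"
        out.append((name, value))
    return out


def translate_response_headers(headers):
    """Server-to-client direction: rename Connection to
    Proxy-Connection."""
    return [("Proxy-Connection" if n.lower() == "connection" else n, v)
            for n, v in headers]


if __name__ == "__main__":
    print(origin_form("http://example.com/a/b?x=1"))
    print(translate_request_headers([("Proxy-Connection", "Keep-Alive"),
                                     ("Upgrade", "TLS/1.0")]))
```

Replacing "Keep-Alive" with "Close" when the length cannot be determined
would be a further step on top of this.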

> If the client thread closes expectedly or unexpectedly, both
> threads terminate and proxy-server connection is closed.
>
> If the client thread receives a new request when the server
> connection has already closed, and the server thread has
> terminated, it reopens the server thread and the server
> connection.

Don't forget sensible timeouts for idle connections.

I mislike me the use of threads. Having written a couple small http
proxies recently, I find that state-machines seem to work better than
threaded models. Of course, that might simply be that the state design
means that you absolutely cannot go off half-cocked, and without a good
plan.
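
To give an idea of what a state-machine design looks like, here is a
small sketch of a splitter that is fed bytes as they arrive and hands
back complete messages. It is not the code from my proxies; it handles
only Content-Length framing (no chunked encoding, no read-until-close),
and the names State and MessageSplitter are invented for the example:

```python
from enum import Enum, auto


class State(Enum):
    HEADERS = auto()  # waiting for the blank line that ends the headers
    BODY = auto()     # waiting for `need` bytes of body


class MessageSplitter:
    """Feed bytes in any size pieces; get back (head, body) tuples."""

    def __init__(self):
        self.state = State.HEADERS
        self.buf = b""
        self.head = b""
        self.need = 0

    def feed(self, data):
        self.buf += data
        done = []
        while True:
            if self.state is State.HEADERS:
                end = self.buf.find(b"\r\n\r\n")
                if end < 0:
                    break  # headers incomplete, wait for more data
                self.head = self.buf[:end + 4]
                self.buf = self.buf[end + 4:]
                self.need = self._content_length(self.head)
                self.state = State.BODY
            else:  # State.BODY
                if len(self.buf) < self.need:
                    break  # body incomplete, wait for more data
                body = self.buf[:self.need]
                self.buf = self.buf[self.need:]
                done.append((self.head, body))
                self.state = State.HEADERS
        return done

    @staticmethod
    def _content_length(head):
        # No Content-Length header is treated as "no body".
        for line in head.split(b"\r\n")[1:]:
            name, sep, value = line.partition(b":")
            if sep and name.strip().lower() == b"content-length":
                return int(value.strip())
        return 0


if __name__ == "__main__":
    sp = MessageSplitter()
    print(sp.feed(b"POST /b HTTP/1.0\r\nContent-Length: 3\r\n\r\nab"))
    print(sp.feed(b"c"))
```

Because all the state lives in one object per connection, one
select()-style loop can drive many connections without any threads.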

D
Received on Wed Aug 25 1999 - 19:46:09 MDT
