Re: Range processing

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Fri, 5 Apr 2002 17:47:24 +0200

Adrian Chadd wrote:

> If you need the stream to be sequentially read, and the headers
> happen to start at offset 0 (well, the status first, _THEN_ the
> headers) ...

Yes. As I said, I got a bit confused because I misread former/latter in your
sentence, thinking that you somehow saw a connection between sequential
reading and header parsing.

> > Then the question arises on how do you signal the returned ranges to the
> > caller? As a linear stream with the ranges concatenated, or structured
> > somehow?
>
> Have an array of sub-requests of a given object.
>
> Eg, pass storeLookup the following:
>
> req protocol: HTTP
> target protocol: HTTP
> request URI: GET foo
> headers: <foo>
> chunk:
> headers: <foo> (including the actual range)
> chunk:
> headers: <foo>
>
> etc, so each requested chunk can have specific attributes.
>
> See, I see it as the client side's responsibility to format the
> data in the HTTP "format". It won't be as fast as the high-end
> commercial caches, but it's a decent compromise.

I am talking about the other direction: how the store signals back to the
caller what range is currently being read, and how to deal with ranges not
known by the store.

> > And how to deal with partial objects where only part of the requested
> > range or ranges can be satisfied?
>
> How is it normally handled?

Not, I guess.

In HTTP I basically see only two options here:

a) The simple path: If what you have cannot fully satisfy the request, then
ignore what you already have and forward the full request. Then if the new
reply is compatible with what you have (same strong ETag) then optionally
attempt to merge the two to form a larger subset of the object.

b) The advanced path: First try to fetch the missing pieces. If the new reply
is compatible with what you have then make a combined reply to the client,
else discard both and forward the full request.

There are a couple of other theoretical approaches in HTTP, but I think only
these two are practically manageable.
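
Both options rest on two checks: which parts of the requested ranges the
store is missing, and whether a new reply is compatible with what is already
held (same strong ETag). A minimal Python sketch of those two checks follows.
The names missing_ranges and compatible are hypothetical and not taken from
Squid, and ranges are assumed to be inclusive byte offsets as in an HTTP
Range header.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive (first_byte, last_byte), assumed form


def missing_ranges(requested: List[Range], cached: List[Range]) -> List[Range]:
    """Return the parts of `requested` that `cached` does not cover."""
    gaps: List[Range] = []
    have = sorted(cached)
    for start, end in requested:
        pos = start  # first byte of this range not yet accounted for
        for c_start, c_end in have:
            if c_end < pos:
                continue  # cached piece lies entirely before pos
            if c_start > end:
                break  # cached piece lies entirely after this range
            if c_start > pos:
                gaps.append((pos, c_start - 1))  # hole before this piece
            pos = max(pos, c_end + 1)
            if pos > end:
                break
        if pos <= end:
            gaps.append((pos, end))  # tail not covered by any cached piece
    return gaps


def compatible(cached_etag: Optional[str], reply_etag: Optional[str]) -> bool:
    """True only when both ETags are present, strong, and identical."""
    if not cached_etag or not reply_etag:
        return False
    if cached_etag.startswith("W/") or reply_etag.startswith("W/"):
        return False  # weak validators are not enough for merging ranges
    return cached_etag == reply_etag
```

With these, option a) forwards the full request whenever missing_ranges is
non-empty and merges afterwards only if compatible is true, while option b)
fetches just the result of missing_ranges and falls back to the full request
if compatible is false.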

a) is obviously a lot simpler to deal with.

b) is more conservative in bandwidth usage, but there are two problems here.
Unless one wants to return the ranges in a different order than requested,
the "new" fetch needs to be both throttled to the client speed without
actually talking to the client, and buffered so the data can be sent to the
client when we get there in the range processing.
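
The buffering half of that problem can be sketched as a small reorder
buffer: completed ranges are held until every earlier requested range has
been sent. This is only an illustration in Python. The class name
OrderedRangeSender and its methods are hypothetical, and it assumes each
requested range appears once and arrives as one complete piece.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # inclusive (first_byte, last_byte), assumed form


class OrderedRangeSender:
    """Hold completed ranges until they can be sent in the requested order."""

    def __init__(self, requested: List[Range]) -> None:
        self._order = list(requested)  # order the client asked for
        self._next = 0  # index of the next range owed to the client
        self._pending: Dict[Range, bytes] = {}

    def deliver(self, rng: Range, data: bytes) -> List[Tuple[Range, bytes]]:
        """Accept one completed range; return those now sendable, in order."""
        self._pending[rng] = data
        ready: List[Tuple[Range, bytes]] = []
        while (self._next < len(self._order)
               and self._order[self._next] in self._pending):
            current = self._order[self._next]
            ready.append((current, self._pending.pop(current)))
            self._next += 1
        return ready

    def buffered_bytes(self) -> int:
        """Bytes held back; a caller could pause the upstream read on this."""
        return sum(len(chunk) for chunk in self._pending.values())
```

The throttling half is not implemented here. One possible approach is for
the caller to stop reading from upstream while buffered_bytes() is above
some limit, which slows the fetch without involving the client.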

I don't think all of this kind of logic belongs at the store layer at all.
It belongs at an HTTP proxy layer acting as an interchange between the client
side, store and upstream protocols. In Squid this layer does not exist as an
isolated layer but is distributed over client_side.c and the protocols.

Received on Fri Apr 05 2002 - 08:47:29 MST