Re: client-store interface from Robert Collins on 2003-10-30 (squid-dev)

From: Robert Collins <robertc@dont-contact.us>
Date: Fri, 31 Oct 2003 08:58:21 +1100

On Thu, 2003-10-30 at 12:25, Henrik Nordstrom wrote:
> On Thu, 30 Oct 2003, Robert Collins wrote:
>
> > I don't agree on this - but it's largely irrelevant at this point, as
> > I've no short term plans for merging of ranges - when someone starts to
> > work on this we can finalise the design.
>
> Some reasons just scratching the surface:
>
> * read-ahead gap logics

Addressable quite straightforwardly in either approach.

> * store merge logics when there are multiple data feeds (different clients
> requesting different not yet cached ranges)

Heh, I actually consider this a reason to feed into one store object
rather than a new one.

> > > * when a store client is initiated it must indicate if a ranged/partial
> > > response is acceptable, and in such case what ranges it wants.
> >
> > This is sort of in place - the client simply asks for the ranges it
> > wants, one at a time. The range metadata is passed through to http.cc,
> > which then obtains those ranges or the full object as appropriate and as
> > per policy.
>
> Here we differ considerably.
>
> What I propose is an API where the store dictates to the client what the
> client gets. The client just gives the guidelines on what it wants at
> start.

That has serious issues for things like 1XX Continue support.

> > The data ordering is only a SHOULD for multipart requests, it's a MUST
> > for single-part requests.
>
> Right, which is a special case of the non-ranged store client logics.

I'm trying to avoid special cases: I'm attempting to do only two general
cases:
1) Satisfiable [range] requests
2) Requests that are corrupt in some fashion, which fall back to a 'stream' mode.
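
As a rough illustration of the two cases above, the sketch below normalises a
byte-range spec against an object length and falls back to 'stream' mode when
that is not possible (for example a suffix range such as -50 when the length is
unknown). It is not code from Squid: RangeSpec, ByteRange, Mode, normalise()
and chooseMode() are all hypothetical names invented for this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Hypothetical types for illustration only; not Squid's actual classes.
struct RangeSpec {
    std::optional<std::int64_t> first; // absent for the suffix form "-N"
    std::optional<std::int64_t> last;  // absent for the open form "N-"; holds N in the suffix form
};

struct ByteRange {
    std::int64_t offset;
    std::int64_t length;
};

enum class Mode { Ranged, Stream };

// Turn a range spec into an absolute (offset, length) pair.
// Returns std::nullopt when the spec cannot be normalised.
std::optional<ByteRange> normalise(const RangeSpec& spec,
                                   std::optional<std::int64_t> objectLength)
{
    if (!spec.first) {
        // Suffix form "-N": only meaningful when the object length is known.
        if (!spec.last || *spec.last <= 0 || !objectLength || *objectLength <= 0)
            return std::nullopt;
        const std::int64_t n = std::min(*spec.last, *objectLength);
        return ByteRange{*objectLength - n, n};
    }
    if (*spec.first < 0)
        return std::nullopt;
    if (spec.last) {
        // Closed form "A-B".
        if (*spec.last < *spec.first)
            return std::nullopt;
        std::int64_t last = *spec.last;
        if (objectLength) {
            if (*spec.first >= *objectLength)
                return std::nullopt;
            last = std::min(last, *objectLength - 1);
        }
        return ByteRange{*spec.first, last - *spec.first + 1};
    }
    // Open form "A-": needs the object length to find the end.
    if (!objectLength || *spec.first >= *objectLength)
        return std::nullopt;
    return ByteRange{*spec.first, *objectLength - *spec.first};
}

// Case 1 when the range can be normalised, case 2 ('stream') otherwise.
Mode chooseMode(const RangeSpec& spec, std::optional<std::int64_t> objectLength)
{
    return normalise(spec, objectLength) ? Mode::Ranged : Mode::Stream;
}
```

With a known length of 1000, the suffix range -50 becomes offset 950, length 50;
with an unknown length it cannot be normalised and chooseMode() selects Stream.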

> > Case 1:
> > the upstream returns a partial response with invalid Range metadata -
> > the range request (such as -50) cannot be normalised for whatever
> > reason.
>
> If the response is invalid we should either pass the response as is (if
> composed as a valid HTTP message and compatible with the request) or an
> error.

Exactly.

> If/when we start doing merging of partial responses then there will be
> situations where such invalid responses are not compatible with the
> request.

True - and that's where http + store logics will belong, to fall back to
unconditional upstream retrieval when needed.

> > Case 2:
> > the upstream returns multi-part data and we parse it. The order doesn't
> > match what we requested (which is valid).
> > Our request for specific bytes in the store may end up deadlocking:
> > - if the object isn't cached.
> > - the data we need to send isn't inside the readahead gap.
> > - the memory flush algorithm is aware of non lowest-highest retrieval
> > patterns.
> > then we will never read the bytes the client needs.
>
> this is indeed a problem, and is why I say it is the store who should tell
> the client what the client gets.
>
> However, special care needs to be considered in the cases where the
> request does not allow for a multi-part response. I don't really see any
> other option than to make sure the forwarded request does not allow for
> multi-part responses unless the original request allows for such
> responses.
>
> Doing downgrading from multi-part to single-part responses will always be
> open to corner cases of this type.

Sure. The issue I have with your design is that it makes the
server/store drive everything, which is exactly the problem we had before
where the client side drove everything. Downgrading is easier if the
client side can drive its needs. Teaching the store about freeable
ranges in random access needs to be done - in either approach.

> > Case 3:
> > the upstream response isn't actually compatible with what the client
> > requested - an ETag violation. Again, dealing with a buggy server, we
> > should pass through the data or display an error.
>
> In this case we should pass through the data if possible, and discard the
> response. Only in cases where the response is incompatible with the
> request due to modifications applied by Squid (i.e. Squid initiated range
> request to fill in an incomplete object etc) should an error be returned
> in this case.

Right. And where squid initiated these changes, squid should re-request
unconditionally. IOW, no headers should be returned to the client side
until the required upstream requests have returned their headers and
been compatibility tested against the basis for the request.
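
A minimal sketch of that decision, under stated assumptions: the reply is
considered compatible when its ETag strongly matches the ETag of the object
being completed, and an incompatible reply is either passed through uncached
(buggy server, request not altered by Squid) or triggers an unconditional
re-request (Squid altered the request). Action, Decision, strongETagMatch()
and decide() are hypothetical names for this example, not Squid functions.

```cpp
#include <string>

// Hypothetical names for illustration only; not Squid's actual API.
enum class Action { ForwardToClient, RefetchUnconditionally };

struct Decision {
    Action action;
    bool cacheable; // whether the reply may be kept in / merged into the store
};

// Strong entity-tag comparison: neither tag is weak ("W/" prefix)
// and both are identical character for character.
bool strongETagMatch(const std::string& a, const std::string& b)
{
    const bool weakA = a.rfind("W/", 0) == 0;
    const bool weakB = b.rfind("W/", 0) == 0;
    return !a.empty() && !weakA && !weakB && a == b;
}

// Decide what to do once the upstream headers have arrived and
// before any headers are handed to the client side.
Decision decide(bool squidModifiedRequest, bool replyCompatible)
{
    if (replyCompatible)
        return {Action::ForwardToClient, true};
    if (squidModifiedRequest)
        return {Action::RefetchUnconditionally, false};
    // Buggy server, request untouched by us: pass the data through, do not cache it.
    return {Action::ForwardToClient, false};
}
```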

> > The notification approach (combined with a couple of query methods that
> > already exist) allows the client side to examine the memory store as
> > data arrives, rather than needing to know which data is going to arrive
> > next.
>
> Which basically is the same thing that I propose, but from a different
> angle.

Yeah. There are other uses for the notification - to implement the arrival
of headers, and for the transactions needed for 1XX Continue support.
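
To make the notification idea concrete, here is a small sketch of a sparse
store that notifies listeners whenever data arrives and offers a query method
so a client can check whether the bytes it needs are present, regardless of
arrival order. SparseStore, subscribe(), append() and has() are hypothetical
names made up for this example; they are not the store_client API.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical sketch; names do not correspond to Squid's store_client API.
class SparseStore {
public:
    using Listener = std::function<void(const SparseStore&)>;

    void subscribe(Listener l) { listeners_.push_back(std::move(l)); }

    // Record that bytes [offset, offset + length) have arrived, merge the
    // span with any overlapping or adjacent spans, then notify listeners.
    void append(std::int64_t offset, std::int64_t length)
    {
        if (length <= 0)
            return;
        std::int64_t begin = offset;
        std::int64_t end = offset + length;
        auto it = spans_.lower_bound(begin);
        if (it != spans_.begin()) {
            auto prev = std::prev(it);
            if (prev->second >= begin)
                it = prev; // previous span touches or overlaps the new data
        }
        while (it != spans_.end() && it->first <= end) {
            begin = std::min(begin, it->first);
            end = std::max(end, it->second);
            it = spans_.erase(it);
        }
        spans_[begin] = end;
        for (const auto& l : listeners_)
            l(*this);
    }

    // Query method: are all bytes [offset, offset + length) present?
    bool has(std::int64_t offset, std::int64_t length) const
    {
        auto it = spans_.upper_bound(offset);
        if (it == spans_.begin())
            return false;
        --it;
        return it->second >= offset + length;
    }

private:
    std::map<std::int64_t, std::int64_t> spans_; // start -> end (exclusive)
    std::vector<Listener> listeners_;
};
```

A client that wants bytes 0-99 would subscribe, and in its callback call
has(0, 100); it does not need to know in advance which range arrives next.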

> > Effectively, we have this. It's a little baroque and could be improved.
>
> I need to read up a little on the Squid-3 design it seems. Been too much in
> 2.5 lately.
>
> If we really have this then this whole discussion is a bit upside down as
> the store should in such a case not need to care about any of this (except
> the ability to store sparse data). Most of the issues discussed here should
> in such a case be in the Client-side<->forward/broker<->Server-side API chain,
> not the store client interface.

Well... The broker is called 'store_client'... It's the layer for
accessing the hot store and the cold store by the client side. The
improvements that could be made are to make store writes occur through a
store client too. The store_client has very little knowledge of store
internals now, and when finished will be just a broker.

I have to get something in place for fixing bug 624, so it'll be what
I've sketched out. Shouldn't be too hard to fiddle the design to fit
whatever we agree on.

Cheers,
Rob

-- 
GPG key available at: <http://members.aardvark.net.au/lifeless/keys.txt>.


Received on Thu Oct 30 2003 - 14:58:27 MST
