Re: range request cache

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 23 Feb 2012 10:01:50 -0700

On 02/21/2012 05:09 PM, Zhu, Shan wrote:
> I have an urgent need for caching requested ranges so I want to do a quick "hack" on this topic before the new feature becomes available.
>
> What I want to achieve is for Squid to cache a range request without pre-fetching and caching the whole object, so that if the cached range is requested again it can be served from the cache.
>
> What I want to do is to change the URL with range request into a unique file name, and once the response is received from the back-end server, the response can be cached as a single object.
> The workflow should be like this,
>
> (1) Change URL with range request into something like: "[original URL]_[range start]_[range end]", internally to Squid only.
>
> (2) Squid checks cache to see if it is cached. If yes, Squid responds to the client with the cached object. If no, go to Step 3.
>
> (3) When it is a cache miss in Step 2, Squid forwards the original URL to the back-end server, with range request, as normal. (Not to pre-fetch the whole object file.)
>
> (4) Squid receives the response from back-end server, for the requested range only, recognizes it as the response corresponding to the changed URL "[original URL]_[range start]_[range end]".
>
> (5) Squid caches the range according to the changed URL "[original URL]_[range start]_[range end]", like a single object file.
>
> (6) Squid responds to the client for the URL with range request, as normal.
>
> Here I may have simplified the problem and omitted the time-stamp issue, etc.
>
> Is this doable? How difficult would it be? Can I get any suggestion on how to proceed? I am starting from scratch on the source code change.

Adding support for range caching is doable, of course. It is a difficult
project though, even if you limit that support to the absolute minimum.

I am not sure changing the URL is the best or even the easiest way
forward. Instead, I would try to change how the cache key is computed by
adding Range information to the hashing function, and then adjust the
"does the cached store entry match the request" code to account for
Range request headers. As Amos has mentioned already, looking at Vary
support may be helpful here. You can more or less treat Range responses
as having an implicit "Vary: Range" header.
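
To make the idea concrete, here is a rough standalone sketch in Python,
not Squid code. The function names (normalize_range, cache_key,
entry_matches), the key layout, and the use of MD5 are all assumptions
made for illustration. It only shows the principle: fold the normalized
Range header into the key, and compare ranges when matching an entry.

```python
import hashlib


def normalize_range(range_header):
    # Hypothetical normalization: drop whitespace and lowercase the unit,
    # so " Bytes=0-499 " and "bytes=0-499" produce the same key.
    return "".join(range_header.split()).lower()


def cache_key(method, url, range_header=None):
    # Illustrative key: hash of method, URL and, if present, the range.
    h = hashlib.md5()
    h.update(method.encode("ascii"))
    h.update(b"\0")
    h.update(url.encode("utf-8"))
    if range_header is not None:
        h.update(b"\0")
        h.update(normalize_range(range_header).encode("ascii"))
    return h.hexdigest()


def entry_matches(stored_range, request_range):
    # A full-object entry (no stored range) only matches a request
    # without a Range header; a range entry needs the same range.
    if stored_range is None or request_range is None:
        return stored_range is None and request_range is None
    return normalize_range(stored_range) == normalize_range(request_range)
```

With this scheme only an identical range is a hit. A request for a
sub-range of something already cached would still be a miss, which is
one of the simplifications of the "absolute minimum" approach.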

I suspect the most difficult parts would be correctly adjusting the code
responsible for computing expected/actual/maximum entry size to respect
Content-Range limits, and adjusting the swap code to expect/write/read
the right number of bytes. The response size-related code is really
messy. Squid v3.2 has some improvements in that area, but we are still a
long way from a good API.
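
As an illustration of the size question only, here is a small Python
sketch (again not Squid code; expected_body_size is a hypothetical
helper) that derives the expected body length of a single-range 206
response from its Content-Range header instead of from the full object
size. It assumes the "bytes first-last/total" form and does not handle
multipart/byteranges responses.

```python
import re

# Matches "bytes first-last/total", where total may be "*" (unknown).
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def expected_body_size(content_range):
    # Returns the number of body bytes the range covers, or None if the
    # header is malformed or inconsistent.
    m = _CONTENT_RANGE.match(content_range.strip())
    if m is None:
        return None
    first, last = int(m.group(1)), int(m.group(2))
    if last < first:
        return None
    total = m.group(3)
    if total != "*" and last >= int(total):
        return None
    # Both ends are inclusive, hence the +1.
    return last - first + 1
```

For "bytes 500-999/1234" this gives 500, which is the number the entry
size and swap code would have to expect rather than 1234.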

HTH,

Alex.
Received on Thu Feb 23 2012 - 17:02:04 MST
