Re: A youtube and windows update cache thing. from Amos Jeffries on 2013-05-28 (squid-dev)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 29 May 2013 01:27:13 +1200

On 28/05/2013 9:14 p.m., Reiner Karlsberg wrote:
>
>>>
>>> Hi,
>>> just a basic question, relevant to squid2.7, at least:
>>>
>>> AFAIK, without looking into the src code, squid2.7 requests and caches
>>> the complete object, before delivering the data to fulfill the first
>>> range request.
>>> Delivery might even start as soon as the data for the actual range
>>> request is available.
>>>
>>> Further range requests will be handled from the cached "complete
>>> object".
>>>
>>> If this is correct, then it should also hold when using
>>> storeurl_rewrite_program in squid2.7
>>>
>>> The probability that other parts of the cached "complete object"
>>> will be requested is quite high, so there is no need for
>>> something new.
>>>
>>> This strategy might even be an advantage if the range requests
>>> for the same object vary in size, which might depend on
>>> connection speed.
>>>
>> Hey,
>> As far as I know squid serves 206 partial responses from cache only
>> if the full object exists in cache.
>
> I cannot follow the flow of execution in squid2.7, but I am not
> sure that you are correct.
> These excerpts from squid2.7 MIGHT indicate that transmission of
> data for a range request to the client can start before the complete
> object is cached:
>
> squid.conf:
>
> # TAG: range_offset_limit (bytes)
> # Sets an upper limit on how far into the file a Range request
> # may be to cause Squid to prefetch the whole file. If beyond this
> # limit Squid forwards the Range request as it is and the result
> # is NOT cached.
> #
> # This is to stop a far ahead range request (lets say start at 17MB)
> # from making Squid fetch the whole object up to that point
> # before sending anything to the client.   <=============================
>
>
>
> client_side.c:
> clientCheckRangeForceMiss(StoreEntry * entry, HttpHdrRange * range)
> {
>     /*
>      * If the range_offset_limit is NOT in effect, there
>      * is no reason to force a miss.
>      */
>     if (0 == httpHdrRangeOffsetLimit(range))
>         return 0;
>     /*
>      * Here, we know it's possibly a hit. If we already have the
>      * whole object cached, we won't force a miss.
>      */
>     if (STORE_OK == entry->store_status)
>         return 0; /* we have the whole object */
>     /*
>      * Now we have a hit on a PENDING object. We need to see   <==========
>      * if the part we want is already cached. If so, we don't
>      * force a miss.
>      */
>     assert(NULL != entry->mem_obj);
>     if (httpHdrRangeFirstOffset(range) <= entry->mem_obj->inmem_hi)
>         return 0;
>     /*
>      * Even though we have a PENDING copy of the object, we
>      * don't want to wait to reach the first range offset,
>      * so we force a miss for a new range request to the
>      * origin.
>      */
>     return 1;
> }
>

This looks like a collapsed_forwarding code check to me.
IIRC 2.x will only cache complete objects. However, when
collapsed_forwarding is enabled, an incomplete or never-finished object
may be collapsed and used. Those types of objects then get dropped later,
in the disk swapout stages, when determined incomplete, so they won't HIT
with 206 unless some client is currently fetching.
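
To make the three outcomes of the quoted check easier to see, here is a
minimal stand-alone model of it. This is only a sketch, not Squid code:
the model_* names and types are hypothetical simplifications of
StoreEntry and its mem_obj, and limit_in_effect is assumed to stand in
for the result of httpHdrRangeOffsetLimit(range).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for Squid's store entry state. */
typedef enum { MODEL_STORE_OK, MODEL_STORE_PENDING } model_store_status;

typedef struct {
    model_store_status store_status; /* complete, or still being fetched */
    long inmem_hi;                   /* highest byte offset received so far */
} model_entry;

/*
 * Returns 1 when a new request should be sent to the origin (forced
 * miss), 0 when the existing entry can be used for the range.
 * limit_in_effect is an assumption: it models a non-zero result from
 * httpHdrRangeOffsetLimit(range) in the quoted excerpt.
 */
int
model_range_force_miss(const model_entry *entry, long first_offset,
                       int limit_in_effect)
{
    assert(entry != NULL);
    /* The limit is not in effect: no reason to force a miss. */
    if (!limit_in_effect)
        return 0;
    /* The whole object is already available. */
    if (entry->store_status == MODEL_STORE_OK)
        return 0;
    /* PENDING object, but the first wanted byte has already arrived. */
    if (first_offset <= entry->inmem_hi)
        return 0;
    /* PENDING object and the range starts beyond the received data. */
    return 1;
}
```

In this model, a PENDING entry that has received 1000 bytes can be used
for a range starting at offset 500, while a range starting at 17MB forces
a miss when the limit is in effect.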

Henrik will know with more certainty if he can be tracked down. Alex
maybe, since he is now working on collapsed forwarding, but possibly not
as well, since the CF feature is being re-designed at a basic level for
the very different 3.x codepaths.

Amos
Received on Tue May 28 2013 - 13:27:18 MDT
