Re: [squid-users] cache peer: hit, miss and reject from Nikolai Gorchilov on 2013-09-05 (squid-users)

From: Nikolai Gorchilov <niki_at_x3me.net>
Date: Fri, 6 Sep 2013 05:26:09 +0530

Sorry for the late reply. I was traveling for the last two days.

On Wed, Sep 4, 2013 at 10:05 AM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
>
> On 4/09/2013 7:14 a.m., Niki Gorchilov wrote:
>>
>> 2. We know that 50% of the objects in our cache never get requested
>> a second time, thus only creating load on the system to store and
>> later evict them.
>
> How did you get to that conclusion please?

These are not the squid stats, but our own custom cache peer stats.

> What Squid version are you using at present?

3.3.8. Waiting for 3.4 :-)

>> So we prefer to be able to cache on second, third,
>> etc... request without passing the first requests via the peer at all.
>
> You understand that will possibly halve your caching efficiency, right? It turns the 2-request URLs into MISS+MISS+... and makes only the 3-times fetched URLs worth caching...

Yeah. I did the math very carefully :) for the 1+, 2+, 3+, 4+, up to 10+ request scenarios.

> Caching is at its core a tradeoff between storage delays and bandwidth delays. If you explicitly weight it in the direction of bandwidth delays by not caching things on first request the benefits drop off significantly fast.

You are right in general, but not in our specific case. Our cache peer
produces about a 60% cache hit rate. Going from 1st-request to
2nd-request caching will reduce the hit rate to about 58%. That is still
a good result, with much less thrashing of the HDDs. Or we keep the
thrashing, but reduce the cache size by half :)
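
For anyone who wants to run the same kind of comparison on their own
logs, here is a minimal sketch of the counting involved. It is a toy
model, not the calculation behind the figures above: it assumes an
unbounded cache with no eviction or expiry, treats every URL as
cacheable, and the trace and the names hit_rate and admit_on are made up
for illustration.

```python
from collections import Counter

def hit_rate(requests, admit_on=1):
    """Fraction of requests that are hits when an object is stored only
    on its admit_on-th request (toy model: unbounded cache, no expiry)."""
    seen = Counter()
    hits = 0
    for url in requests:
        seen[url] += 1
        # Requests up to and including the admitting one are misses;
        # every later request for the same URL is a hit.
        if seen[url] > admit_on:
            hits += 1
    return hits / len(requests) if requests else 0.0

# Hypothetical trace: "a" is requested 4 times, "b" twice, "c" and "d" once.
trace = ["a", "b", "a", "c", "a", "b", "d", "a"]
print(hit_rate(trace, admit_on=1))  # store on first request
print(hit_rate(trace, admit_on=2))  # store on second request
```

How much the hit rate drops depends entirely on how requests are
distributed over URLs, so a real access log should replace the toy trace.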

>> Why? Same reasons as above.... ICP is cheap enough for statistics and
>> decision making...
>
> This is not possible with ICP as far as I know.

OK. Clear.

>> Any ideas how to resolve my issue and offload the cache peer by at
>> least 50% of the requests it serves currently?
>
> Answer: Do not use cache existence test(s) to solve access control and routing problems.

Great point :-) Thanks.

> I would use an external_acl_type helper to do the calculation about whether a request was to be cached and set a tag=value on the transaction. The tag type ACL can then test for this tag and do a "cache deny". Since you have all traffic
>
> Something like this:
>
> external_acl_type tagger ttl=0 %URL ... (helper returns "OK tag=first-seen" or just "OK").
> acl firstSeen external tagger
> acl taggedFirst tag first-seen
> http_access deny firstSeen !all
> cache deny taggedFirst

Yeah. I did something like this, and it works like a charm.
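
For the archive, a minimal sketch of what such a tagger helper could
look like. This is not our production helper. It assumes the helper
receives one URL per line (the %URL format above, no concurrency channel
IDs) and keeps the seen set in memory, so it only gives consistent
answers with a single helper process and forgets everything on restart.
The names reply_for and serve are made up for illustration.

```python
import io
import sys

def reply_for(url, seen):
    """Return the helper reply for one URL and remember the URL."""
    if url in seen:
        return "OK"
    seen.add(url)
    return "OK tag=first-seen"

def serve(inp, out):
    """Answer one request line at a time until the input is closed."""
    seen = set()  # assumption: in-memory only, unbounded, per process
    for line in inp:
        url = line.strip()
        out.write(reply_for(url, seen) + "\n")
        out.flush()  # Squid waits for each reply, so do not buffer

# Dry run against in-memory streams instead of Squid.
# Under Squid the last line would be: serve(sys.stdin, sys.stdout)
demo = io.StringIO()
serve(io.StringIO("http://example.com/a\nhttp://example.com/a\n"), demo)
print(demo.getvalue(), end="")
```

A real helper would need a bounded or persistent store, and a shared one
if several helper processes are running.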

I even tried to remove ICP altogether, as it is used only for marking
via qos_flows parent. The helper mostly replicates the logic behind our
custom ICP listener, so returning tag=parent-hit was a no-brainer.
Unfortunately, I discovered that clientside_tos doesn't support
slow ACLs like tag. I believe this should be mentioned somewhere
at http://www.squid-cache.org/Doc/config/clientside_tos/. I will stick
to ICP HIT/MISS & qos_flows for DSCP marking for now, while looking at
the ZPH kernel patch as an alternative.

Thanks, Amos, for all the effort in keeping Squid development going
and its community alive :)

Best,
Niki
Received on Thu Sep 05 2013 - 23:56:56 MDT
