Re: [squid-users] Squid losing connectivity for 30 seconds from Amos Jeffries on 2011-12-05 (squid-users)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 06 Dec 2011 00:56:49 +1300

On 5/12/2011 7:14 p.m., Elie Merhej wrote:
>
>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I am currently facing a problem that I wasn't able to find a
>>>>>>>>>>> solution for in the mailing list or on the internet,
>>>>>>>>>>> My squid is dying for 30 seconds every one hour at the same
>>>>>>>>>>> exact time, squid process will still be running,
>>>>>>>>>>> I lose my wccp connectivity, the cache peers detect the
>>>>>>>>>>> squid as a dead sibling, and the squid cannot serve any
>>>>>>>>>>> requests
>>>>>>>>>>> The network connectivity of the server is not affected (a
>>>>>>>>>>> ping to the squid's ip doesn't timeout)
>>>>>>
>>>>> Hi,
>>>>>
>>>>> here is the strace result
>>>>> -----------------------------------------------------------------------------------------------------
>>>>>
>>> <snip: looks like perfectly normal traffic: file opening and closing,
>>> data reading, DNS lookups and other network read/writes>
>>>>> read(165, "!", 256) = 1
>>> <snip bunch of other normal traffic>
>>>
>>>>> read(165, "!", 256) = 1
>>>>> ----------------------------------------------------------------------------------------------------------------------------------------------------
>>>>>
>>>>> Squid is freezing at this point
>>>
>>> The 1-byte read on FD #165 seems odd. It is particularly suspicious
>>> coming just before a pause, with only a constant 256-byte buffer
>>> available. No idea what it is yet, though.
>>>
>>>
>>>
>>>> wccp2_router x.x.x.x
>>>> wccp2_forwarding_method l2
>>>> wccp2_return_method l2
>>>> wccp2_service dynamic x
>>>> wccp2_service_info x protocol=tcp flags=src_ip_hash priority=240
>>>> ports=80
>>>> wccp2_service dynamic x
>>>> wccp2_service_info x protocol=tcp flags=dst_ip_hash,ports_source
>>>> priority=240 ports=80
>>>> wccp2_assignment_method mask
>>>>
>>>>
>>>> #icp configuration
>>>> maximum_icp_query_timeout 30
>>>> cache_peer x.x.x.x sibling 3128 3130 proxy-only no-tproxy
>>>> cache_peer x.x.x.x sibling 3128 3130 proxy-only no-tproxy
>>>> cache_peer x.x.x.x sibling 3128 3130 proxy-only no-tproxy
>>>> log_icp_queries off
>>>> miss_access allow squidFarm
>>>> miss_access deny all
>>>
>>> So, if I understand this right, you have a layer of proxies defined
>>> as "squidFarm" which client traffic MUST pass through *first* before
>>> it is allowed to fetch MISS requests from this proxy. Yet you
>>> are receiving WCCP traffic directly at this proxy with both NAT and
>>> TPROXY?
>>>
>>> This miss_access policy seems decidedly odd. Perhaps you can
>>> enlighten me?
>> Hi,
>>
>> Let me explain what I am trying to do (I was hoping that this is the
>> right setup). The squids are siblings, so my clients pass through one
>> squid only. This squid uses ICP to check whether the object is in my
>> network; if not, the squid fetches the object from the internet.
>>
>> clients --> WCCP --> squid --(if miss)--> ICP --(if miss)--> Internet
>>         --> WCCP --> squid --> clients
>>
>>
>> I have over 400Mbps of bandwidth, but one squid (3.1) cannot
>> withstand this kind of bandwidth (number of clients); this is why I
>> have created a squidFarm.
>> I have the following hardware: i7 xeon 8 cpus - 16GB Ram - 2 HDDs
>> 450GB & 600GB no RAID
>> Software: Debian OS squeeze 6.0.3 with kernel 2.6.32-5-amd64 and
>> iptables 1.4.8
>> Please note that when I use only one cache_dir (the small one:
>> cache_dir aufs /cache1/squid 320000 480 256) I don't face this problem.
>> The problem starts when the cache dir size is bigger than 320 GB.
>> Please advise
>>
>> Thank you for the advice on the refresh patterns
>> Regards
>> Elie
>>
> Hi Amos,
>
> Thank you for your help. The problem was solved when I replaced the
> refresh patterns with what you recommended:
> I replaced more than 20 lines of refresh patterns with 4 lines.

Wow. Not an effect I was expecting there. But great news either way :)
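
For anyone reading this in the archive: the four lines themselves are not
quoted in this message. As an assumption only, the stock refresh_pattern
set shipped in the default Squid 3.1 squid.conf is also four lines and
looks like the sketch below. It may not be exactly what was suggested
earlier in the thread.

```
# ASSUMPTION: stock Squid 3.1 defaults, shown for reference only.
# Fields: regex, min (minutes), percent, max (minutes).
refresh_pattern ^ftp:             1440  20%  10080
refresh_pattern ^gopher:          1440   0%   1440
refresh_pattern -i (/cgi-bin/|\?)    0   0%      0
refresh_pattern .                    0  20%   4320
```

Rules are tested in order and the first regex that matches a URL wins, so
a long list means more regex tests per request, and the catch-all "."
line belongs last.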

>
> One more question: do you recommend a specific replacement policy? I
> have read that when the cache directory is large, you advise
> leaving the default replacement policy.

I don't really have a specific opinion about the available policies.
There have been quite a few comments that the heap algorithms are faster
than the classical linked-list LRU. But each is suited to different
caching needs, so pick whichever you think best matches the data you
want to keep in cache.
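
As a rough sketch of how that choice is expressed in squid.conf: the
policy names below are the ones Squid documents, but choosing LFUDA here
is purely illustrative, not a recommendation. The heap policies are only
usable if Squid was built with them enabled. The cache_dir line is the
one quoted earlier in this thread.

```
# Illustrative only: enable exactly ONE cache_replacement_policy line.
# It is placed ahead of the cache_dir it is meant to govern.

# Default: classical linked-list LRU.
#cache_replacement_policy lru

# LRU implemented on a heap.
#cache_replacement_policy heap LRU

# Greedy-Dual Size Frequency: favours small, popular objects
# (better object hit ratio).
#cache_replacement_policy heap GDSF

# LFU with Dynamic Aging: favours popular objects regardless of size
# (better byte hit ratio).
cache_replacement_policy heap LFUDA

cache_dir aufs /cache1/squid 320000 480 256
```

Whichever one is picked, comparing hit ratio and byte hit ratio before
and after the change is the only real way to tell whether it suits the
traffic.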

Amos
Received on Mon Dec 05 2011 - 11:57:00 MST
