Re: [squid-users] detecting dead parent problem - understanding parent and icp

From: Rietzler, Markus (RZF, SG 324 / ) <markus.rietzler_at_fv.nrw.de>
Date: Wed, 3 Jul 2013 10:29:24 +0000

> -----Original Message-----
> From: Amos Jeffries [mailto:squid3_at_treenet.co.nz]
> Sent: Tuesday, 7 May 2013 04:56
> To: squid-users_at_squid-cache.org
> Subject: Re: [squid-users] detecting dead parent problem
>
> On 7/05/2013 3:16 a.m., Rietzler, Markus (RZF, SG 324 /
> <RIETZLER_SOFTWARE>) wrote:
> > we have a setup with one squid (user-proxy) that connects to 4 parent proxies.
> >
> > cache_peer proxy-inter1 parent 8083 0 sourcehash no-query no-digest no-netdb-exchange connection-auth=off
> > cache_peer proxy-inter2 parent 8083 0 sourcehash no-query no-digest no-netdb-exchange connection-auth=off
> > cache_peer proxy-inter3 parent 8083 0 sourcehash no-query no-digest no-netdb-exchange connection-auth=off
> > cache_peer proxy-inter4 parent 8083 0 sourcehash no-query no-digest no-netdb-exchange connection-auth=off
> >
> > recently two of those 4 parents were gone. in cache log we saw messages
> like:
> >
> > 2013/05/06 16:27:33 TCP connection to proxy-inter4/8083 failed
> >
> > and then after 10s or so (which should be the dead_peer_timeout)
> >
> > 2013/05/06 16:27:34 Detected DEAD Parent: proxy-inter4
> >
> > that seems to be normal.
> >
> > BUT
> > 1) those messages reappear in cache.log again and again. normally we would expect them not to appear again until the parent has been detected as alive and then fails again. we see many "TCP connection failed" and sometimes "DEAD parent" messages
> > 2) browsing the web was extremely SLOW
> >
> > we use squid 3.2.4 as user-proxy and the 4 parent proxies.
> >
> > configure options: '--enable-auth-basic=MSNT,SMB' '--enable-external-acl-helpers=ldap_group' '--enable-auth-basic' '--enable-auth-ntlm' '--enable-auth-negotiate=kerberos' '--enable-delay-pools' '--enable-follow-x-forwarded-for' '--enable-removal-policies=lru,heap' '--with-filedescriptors=4096' '--with-winbind' '--with-async-io' '--enable-storeio=ufs,aufs,diskd,rock' '--disable-ident-lookups' '--prefix=/www/squid' '--enable-underscores' '--with-large-files' 'PKG_CONFIG_PATH=/opt/gnome/lib64/pkgconfig:/opt/gnome/share/pkgconfig' --enable-ltdl-convenience
> >
> > top on the two living parent proxies was ok.
> >
> >
> > we also have two development systems, one running squid 2.7.3 and one running 3.2.9. the one with 3.2.9 showed the same problems: many entries in cache.log and SLOW browsing. on the old squid browsing was no problem at all, all requests were fast enough. the old squid showed no further messages in cache.log after "DEAD parent". on both development systems only a few (2-3) users were active.
> >
> >
> > any idea where to look?
>
> Start with removing "no-query" from the cache_peer lines. One of the
> main purposes of proxy queries is to determine UP/DEAD status. You can
> also tune the connect-fail-limit= option on cache_peer to reduce the
> number of failed requests before the peer is declared DEAD.
>
> FYI: 3.2 forwarding path algorithm has been altered a fair bit in a way
> which might account for the behaviour change. Namely DNS is only looked
> up once per path available, and re-tries are done sequentially down the
> resulting set of IPs - 3.1 and older would do DNS lookups on every
> re-try so you would easily get the 10 failed connects in a few ms while
> retrying a single request which never gets through. In 3.2 you will get
> 10 *different* requests trying the peer over a slightly longer time
> (better chance of short-outage recovery detection) and getting serviced
> by a later path (hopefully more successful, and definitely less lag on
> errors than before).

I am coming back to this subject, as the problem prevents us from updating our systems. as said, we have a load-balancing/fallback setup with 4 central proxy servers and two DMZ proxy servers.

you suggest removing no-query from the cache_peer lines. we have set up our proxy servers without any ICP/ICMP, so removing no-query will make no difference, as our parent proxies don't listen for ICP queries.
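
For reference, a minimal sketch of what switching ICP on between proxy_user and the parents would probably involve. Port 3130 is the conventional ICP port; the ACL name and the address of proxy_user are placeholders, not our real values. As far as I read the squid.conf documentation, the 0 in the ICP port field of our current cache_peer lines already disables ICP for those peers, so dropping no-query alone would not be enough.

```
# On each proxy_inter (parent): listen for ICP and answer proxy_user only.
# ACL name "user_proxy" and address 192.0.2.10 are placeholders.
icp_port 3130
acl user_proxy src 192.0.2.10
icp_access allow user_proxy
icp_access deny all

# On proxy_user: put the ICP port in the 4th field and drop no-query.
cache_peer proxy-inter1 parent 8083 3130 sourcehash no-digest no-netdb-exchange connection-auth=off
```

The same cache_peer change would be repeated for proxy-inter2 to proxy-inter4.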

one question: as said, we don't use ICP in our proxies. our setup is as follows:

PC - proxy_user - proxy_inter1 -
                - proxy_inter2 - proxy_dmz1
                - proxy_inter3 - proxy_dmz2
                - proxy_inter4 -

clients connect to proxy_user, which has the 4 proxy_inter defined as parents. each of those 4 parents in turn has the two DMZ proxies set up as its parents, and the two DMZ proxies make the actual requests to the internet.
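
Roughly, the middle layer looks like the sketch below. The hostnames, port 8080 and the round-robin selection are placeholders for illustration, not copied from our real configuration.

```
# On each proxy_inter: hand every request to one of the two DMZ proxies.
# Hostnames, port and round-robin are assumptions for this example.
cache_peer proxy-dmz1 parent 8080 0 no-query no-digest round-robin
cache_peer proxy-dmz2 parent 8080 0 no-query no-digest round-robin

# Never contact origin servers directly; always go through a parent.
never_direct allow all
```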

none of our proxies use ICP. what would we gain by activating it? could the missing ICP be one reason for the performance problem when 2 of the 4 proxy_inter are stopped? will there be a problem detecting DEAD parents without ICP?
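
If we stay without ICP, the TCP-based detection can be tuned per peer. A minimal sketch, assuming the cache_peer options documented in squid.conf as connect-fail-limit= and connect-timeout= behave as described there for our version; the values 3 and 5 are only illustrative.

```
# On proxy_user: mark the parent DEAD after 3 failed TCP connects
# (the documented default is 10) and give up on each connect attempt
# after 5 seconds. Both values are illustrative, not tested.
cache_peer proxy-inter1 parent 8083 0 sourcehash no-query no-digest no-netdb-exchange connection-auth=off connect-fail-limit=3 connect-timeout=5
```

A lower limit means fewer client requests run into the dead parent before it is skipped, at the price of declaring a parent dead on a shorter outage.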

as far as I understand it, ICP helps when proxy_user doesn't have the requested object. in our current setup it has to ask e.g. proxy_inter1 (chosen by sourcehash). if proxy_inter1 has the object, good; if not, proxy_inter1 will fetch it. with ICP, proxy_user would first send an ICP query to all 4 proxy_inter to see if one of them has the object. if one answers "have it", proxy_user would fetch the object from that one, e.g. proxy_inter3, although sourcehash would normally pick proxy_inter1. is this right?

markus
Received on Wed Jul 03 2013 - 10:29:40 MDT
