Re: [squid-users] Squid not responding and 100% cpu

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 07 Jun 2013 19:19:53 +1200

On 6/06/2013 8:31 p.m., Nuno Fernandes wrote:
> Hello,
>
> I have squid 3.3.5 that stopped responding. I can see that it is using 100%cpu.
> Cache.log reports thousands of:
>
> 2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.997 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.998 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:08.999 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
> 2013/06/06 09:19:09.000 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 29
> 2013/06/06 09:19:09.000 kid1| TcpAcceptor.cc(197) doAccept: New connection on FD 27
>
> and then
>
> 2013/06/06 09:19:11.835 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
> 2013/06/06 09:19:12.310 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
> 2013/06/06 09:19:12.843 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
> 2013/06/06 09:19:13.254 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
> 2013/06/06 09:19:13.705 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
> 2013/06/06 09:19:14.271 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
> 2013/06/06 09:19:14.704 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
> 2013/06/06 09:19:15.032 kid1| TcpAcceptor.cc(272) acceptNext: connection on local=0.0.0.0:3126 remote=[::] FD 29 flags=41
>
> # ls -la /proc/4857/fd|wc -l
> 770
>
> Using 770 file descriptors
>
> # netstat -anp|grep 4857|grep ESTAB|wc -l # established connections
> 682
>
> The wierd thing is that if i don't activate ssl intercept i don't get this error. Another wierd thing is the established connections:
>
> tcp 0 0 10.10.10.254:36046 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:36032 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:35757 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:35972 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 121 0 10.10.10.254:3126 10.10.10.254:43033 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:3126 10.10.10.254:35965 ESTABLISHED 4857/(squid-1)
> tcp 121 0 10.10.10.254:3126 10.10.10.254:35696 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:36011 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:42963 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:36091 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 121 0 10.10.10.254:3126 10.10.10.254:35688 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:36098 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:36008 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 121 0 10.10.10.254:3126 10.10.10.254:35758 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:35944 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 121 0 10.10.10.254:3126 10.10.10.254:42939 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:35689 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:3126 10.10.10.254:35972 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:35700 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 121 0 10.10.10.254:3126 10.10.10.254:42995 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:35770 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:3126 10.10.10.254:35958 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:3126 10.10.10.254:35976 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:3126 10.10.10.254:35941 ESTABLISHED 4857/(squid-1)
> tcp 0 0 10.10.10.254:43051 10.10.10.254:3126 ESTABLISHED 4857/(squid-1)
>
>
> 10.10.10.254 is the squid box. 3126 is the ssl intercept port.
>
> # grep 3126 /etc/sysconfig/iptables
> [0:0] -A PREROUTING -i vlan10 -s 10.10.10.4 -p tcp -m tcp --dport 443 -j REDIRECT --to-port 3126
>
> Only my ip address is forwarded to 3126... Here is the sslbump part of the conf.
>
> https_port 3126 transparent ssl-bump generate-host-certificates=on dynamic_cert_mem_cache_size=4MB cert=/etc/etproxy/ssl/myCA.pem

Funny story ...

  *if* Squid were actually being "transparent proxy" here the outgoing
details on these connections would be "source 10.10.10.4:random-port,
destination some-IP:443". And your rule would loop that connection back
into Squid.

Unluckily for you "transparent" is currently an alias for "intercept"
and the Squid outgoing IP should not be 10.10.10.4. So the same
behaviour is being caused by something else more difficult to determin.

> acl sslsniff src 10.10.10.4
> acl sslbumpbypass dst "/etc/etproxy/whitelist.https"
> acl broken_sites dstdomain .twitter.com
> acl broken_sites dstdomain .facebook.com
> always_direct allow sslsniff
> ssl_bump none sslbumpbypass
> ssl_bump none broken_sites
> ssl_bump server-first all
> sslcrtd_program /usr/lib/squid/ssl_crtd -s /etc/etproxy/ssl/ssl_db -M 4MB
> sslcrtd_children 5

For starters check your configuration for the directive "via off" and
*remove* it. If it does not exist, please report that fact.

When that is done the broken requests should be rejected with a
forwarding loop error message and not DoS the machine while you are
testing for the source of the loop.

To track that down you can configure "debug_options ALL,1 11,2". Which
will dump the HTTP protocol headers and IP:port details for each HTTP
message through the proxy. Look for the ssl-bumped messages outbound
from Squid and see what IP:port details are on them and try to work back
from there how those details came to be.

Amos
Received on Fri Jun 07 2013 - 07:20:03 MDT

This archive was generated by hypermail 2.2.0 : Fri Jun 07 2013 - 12:00:06 MDT