Re: [squid-users] Facebook page very slow to respond

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 09 Oct 2011 16:24:25 +1300

On 09/10/11 09:15, Wilson Hernandez wrote:
> I disabled squid and I'm doing simple FORWARDING and things work, this
> tells me that I'm having a configuration issue with squid 3.1.14.
>
> Now, I can't afford to run our network without squid since we are also
> running SquidGuard for disabling some websites to certain users.
>
> Here's part of my squid.conf:
>
> # Port Squid listens on
> http_port 172.16.0.1:3128 intercept disable-pmtu-discovery=off
>
> error_default_language es-do
>
> # Access-lists (ACLs) will permit or deny hosts to access the proxy
> acl lan-access src 172.16.0.0/16
> acl localhost src 127.0.0.1
> acl localnet src 172.16.0.0/16
> acl proxy src 172.16.0.1
> acl clientes_registrados src "/etc/msd/ipAllowed"
>
> # acl adstoblock dstdomain "/etc/squid/blockAds"
>
> acl CONNECT method CONNECT
>
<snip>
>
> http_access allow proxy
> http_access allow localhost
>
> #---- Block some sites
>
> acl blockanalysis01 dstdomain .scorecardresearch.com clkads.com
> acl blockads01 dstdomain .rad.msn.com ads1.msn.com ads2.msn.com
> ads3.msn.com ads4.msn.com
> acl blockads02 dstdomain .adserver.yahoo.com ad.yieldmanager.com
> acl blockads03 dstdomain .doubleclick.net .fastclick.net
> acl blockads04 dstdomain .ero-advertising.com .adsomega.com
> acl blockads05 dstdomain .adyieldmanager.com .yieldmanager.com
> .adyieldmanager.net .yieldmanager.net
> acl blockads06 dstdomain .e-planning.net .super-publicidad.com
> .super-publicidad.net
> acl blockads07 dstdomain .adbrite.com .contextweb.com .adbasket.net
> .clicktale.net
> acl blockads08 dstdomain .adserver.com .adv-adserver.com
> .zerobypass.info .zerobypass.com
> acl blockads09 dstdomain .ads.ak.facebook.com .pubmatic.com .baynote.net
> .publicbt.com

Optimization tip:
   These ACLs are all identical as far as Squid is concerned, and you
are using them the same way at the same point below. So the best thing
to do is drop those 01,02,03 suffixes and put all the blocked domains
under one ACL name.

Then the below testing can be reduced to a single:
    http_access deny blockads
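
For example, a sketch using a few of the domains from your config (the
full list would carry over unchanged). Repeating the same ACL name on
multiple lines simply adds to it, so the domains can stay grouped for
readability:

    acl blockads dstdomain .scorecardresearch.com clkads.com
    acl blockads dstdomain .rad.msn.com ads1.msn.com ads2.msn.com
    acl blockads dstdomain .doubleclick.net .fastclick.net
    http_access deny blockads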

>
> http_access deny blockanalysis01
> http_access deny blockads01
> http_access deny blockads02
> http_access deny blockads03
> http_access deny blockads04
> http_access deny blockads05
> http_access deny blockads06
> http_access deny blockads07
> http_access deny blockads08
> http_access deny blockads09
> # http_access deny adstoblock
>
> acl bank dstdomain .popularenlinea.com .bpd.com.do
> acl google dstdomain .google.com .google.com.do
> acl ourPublicServer dstdomain .figureo56.com
>
> http_access allow bank
> http_access allow google
> http_access allow ourPublicServer

Similar for these allows.

Although note that this gives public access to google and the bank
sites through your proxy to anyone who knows your public server's IP.

Why you would need such unlimited access in a purely intercepting proxy
(no reverse-proxy) is questionable. It hints at a hidden problem
elsewhere in the setup.
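
If those sites genuinely need to bypass later restrictions, a sketch of
a safer version would at least limit the allows to your own LAN clients
instead of the whole Internet:

    http_access allow localnet bank
    http_access allow localnet google
    http_access allow localnet ourPublicServer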

<snip>
>
> acl manager proto cache_object
> # replace 10.0.0.1 with your webserver IP
> acl webserver src 172.16.0.1
>
> http_access allow manager webserver
> http_access deny manager
>

Good, you have some protection against cache manager access. But the
Safe_ports and "CONNECT !SSL_ports" security restrictions that prevent
abuse of the proxy are missing.

I'm starting to recommend they go above the manager controls in
preparation for future Squid releases. For 3.1 it does not matter
much whether they sit directly above or below, so long as they are
this high up in the config.
  This is both a security and a performance measure, since the cases
the security protection rejects are processing- and bandwidth-intensive
DoS risks.
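
The standard protection looks like this (port list trimmed to the
common entries from the default squid.conf; extend it to suit your
network):

    acl SSL_ports port 443
    acl Safe_ports port 80          # http
    acl Safe_ports port 21          # ftp
    acl Safe_ports port 443         # https
    acl Safe_ports port 1025-65535  # unregistered ports
    http_access deny !Safe_ports
    http_access deny CONNECT !SSL_ports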

<snip>
>
> # Also videos are LARGE; make sure you aren't killing them as 'too big
> to save'
> # - squid defaults to 4MB, which is too small for videos and even some
> sound files
> maximum_object_size 32500 KB
>
> maximum_object_size_in_memory 15625 KB
>
> minimum_object_size 0 KB
>
> cache_mem 1024 MB
>
> access_log /var2/squid/access.log
> cache_log /var2/squid/cache.log
> cache_store_log /var2/squid/store.log
>
> cache_dir aufs /var2/squid/cache 100000 64 255
<snip>
>
> ###### Newer VideoCache ######
> # --BEGIN-- videocache config for squid
<snip comments>
> url_rewrite_program /usr/local/bin/zapchain "/usr/bin/python
> /usr/share/videocache/videocache.py" "/usr/local/bin/squidGuard -c
> /usr/local/squidGuard/squidG$
>
> url_rewrite_children 10
>
> acl videocache_allow_url url_regex -i \.youtube\.com\/get_video\?
> acl videocache_allow_url url_regex -i \.youtube\.com\/videoplayback
> \.youtube\.com\/videoplay \.youtube\.com\/get_video\?
<snip regex ACLs>
>
> acl videocache_allow_url url_regex -i video\.break\.com\/(.*)\.(flv|mp4)
> acl videocache_allow_dom dstdomain .mccont.com .metacafe.com
> .redtube.com .cdn.dailymotion.com
> url_rewrite_access allow videocache_allow_url
> url_rewrite_access allow videocache_allow_dom
> url_rewrite_access allow all

Allow, allow, or "allow all".

  There is absolutely no reason to perform that complex regex ACL
testing when both success and failure lead to "allow". Drop the first
two url_rewrite_access lines.
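
With those two lines dropped, the whole access section reduces to:

    url_rewrite_access allow all

which sends every request through the rewrite chain without spending
any CPU on the regex ACLs first.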

> redirector_bypass on

This allows automatic bypass of the squidGuard security whenever the
traffic load on the whole redirector chain rises enough to slow it
down. I suspect that is not what you wanted.

Also, I'm not sure what zapchain does internally, but I suspect you
want to order its child helpers the same way you would order Squid
ACLs: the ones most likely to cut down load tested first (probably
squidGuard, since it blocks requests rather than just reducing WAN
bandwidth like videocache).

>
> # --END-- videocache config for squid
>
> #---------------------------------
> # --- Windows Update
>
>
> acl windowsupdate dstdomain windowsupdate.microsoft.com
> acl windowsupdate dstdomain .update.microsoft.com
> acl windowsupdate dstdomain download.windowsupdate.com
> acl windowsupdate dstdomain redir.metaservices.microsoft.com
> acl windowsupdate dstdomain images.metaservices.microsoft.com
> acl windowsupdate dstdomain c.microsoft.com
> acl windowsupdate dstdomain www.download.windowsupdate.com
> acl windowsupdate dstdomain wustat.windows.com
> acl windowsupdate dstdomain crl.microsoft.com
> acl windowsupdate dstdomain sls.microsoft.com
> acl windowsupdate dstdomain productactivation.one.microsoft.com
> acl windowsupdate dstdomain ntservicepack.microsoft.com
> acl windowsupdate dstdomain .go.microsoft.com
> acl windowsupdate dstdomain
> .update.microsoft.com/windowsupdate/v7/default.aspx
> acl windowsupdate dstdomain .download.microsoft.com
> acl windowsupdate dstdomain activex.microsoft.com
> acl windowsupdate dstdomain codecs.microsoft.com
> acl windowsupdate dstdomain urs.microsoft.com
>
> acl wuCONNECT dstdomain www.update.microsoft.com
> acl wuCONNECT dstdomain sls.microsoft.com
>
> http_access allow CONNECT wuCONNECT localnet
> http_access allow windowsupdate localnet
>
> # --- Windows update ends -----------------------------
>
> # ------ Test AntiVirus Caching --------------
> acl avast_allow_url url_regex -i \.vpu
> acl avast_allow_url url_regex -i \.vpx
>
> url_rewrite_access allow avast_allow_url

You already said "url_rewrite_access allow all". The above lines do nothing.

>
> acl avast dstdomain avast.com
> http_access allow CONNECT localnet
> http_access allow avast localnet

Um, you already allowed unlimited access to google through your Squid.
Why you would want that but then restrict avast is a questionable choice.

> #---------------------------------
>
<snip irrelevant comments>
> # TAG: half_closed_clients
> # Some clients may shutdown the sending side of their TCP
> # connections, while leaving their receiving sides open. Sometimes,
> # Squid can not tell the difference between a half-closed and a
> # fully-closed TCP connection. By default, half-closed client
> # connections are kept open until a read(2) or write(2) on the
> # socket returns an error. Change this option to 'off' and Squid
> # will immediately close client connections when read(2) returns
> # "no more data to read."
> #
> #Default:
> half_closed_clients off
>
> #Default:
> store_dir_select_algorithm least-load
>
> #extension_methods SEARCH NICK
>
> range_offset_limit 0 KB
>
> quick_abort_min 0 KB
>
> #quick_abort_pct 95
>
> #negative_ttl 1 minutes
>
> connect_timeout 60 seconds
>
> dns_nameservers 172.16.0.2 172.16.0.1
>
> logfile_rotate 5
>
> offline_mode off
>
> balance_on_multiple_ip on

This erases some of the benefit of connection persistence and reuse.
It is not as good an idea with 3.1+ as it was with earlier Squid.

Although you turned off connection persistence anyway below, so this
is only noticeable when it breaks websites that depend on IP-based
security.

>
> refresh_pattern ^ftp: 1440 20% 10080
> refresh_pattern ^gopher: 1440 0% 1440
> refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
> refresh_pattern . 0 20% 4320
>

You may as well erase all the refresh_pattern rules below this point.
The CGI and '.' pattern rules are the last ones Squid processes.
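
If you do want the jpg/gif/swf rules to take effect, they must appear
before the CGI and '.' patterns, since Squid uses the first
refresh_pattern that matches. For example:

    refresh_pattern -i \.jpg$ 0 50% 21600 reload-into-ims
    refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
    refresh_pattern . 0 20% 4320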

>
> #Suggested default:
> refresh_pattern -i \.jpg$ 0 50% 21600 reload-into-ims
> refresh_pattern -i \.gif$ 0 50% 21600 reload-into-ims
<snippety, snip, snip>
> override-expire ignore-private
> refresh_pattern -i \.swf$ 10080 90% 525600 ignore-no-cache
> override-expire ignore-private
>
>
> read_ahead_gap 32 KB
>
> visible_hostname www.optimumwireless.com
> cache_mgr optimumwireless_at_hotmail.com
>

Optimum wireless. Hmm. I'm sure I've audited this config before and
mentioned the same things...

> # TAG: store_dir_select_algorithm
> # Set this to 'round-robin' as an alternative.
> #
> #Default:
> # store_dir_select_algorithm least-load
> store_dir_select_algorithm round-robin
>

Interesting. Forcing round-robin selection between one dir. :)

>
>
> # PERSISTENT CONNECTION HANDLING
> # -----------------------------------------------------------------------------
> #
> # Also see "pconn_timeout" in the TIMEOUTS section
>
> # TAG: client_persistent_connections
> # TAG: server_persistent_connections
> # Persistent connection support for clients and servers. By
> # default, Squid uses persistent connections (when allowed)
> # with its clients and servers. You can use these options to
> # disable persistent connections with clients and/or servers.
> #
> #Default:
> client_persistent_connections off
> server_persistent_connections off
> # TAG: persistent_connection_after_error
> # With this directive the use of persistent connections after
> # HTTP errors can be disabled. Useful if you have clients
> # who fail to handle errors on persistent connections proper.
> #
> #Default:
> persistent_connection_after_error off
>

<snip settings left around their default>

>
> # TAG: pipeline_prefetch
> # To boost the performance of pipelined requests to closer
> # match that of a non-proxied environment Squid can try to fetch
> # up to two requests in parallel from a pipeline.
> #
> # Defaults to off for bandwidth management and access logging
> # reasons.
> #
> #Default:
> pipeline_prefetch on

Pipelining ON combined with persistent connections OFF earlier. This
could be the whole problem all by itself.

  What is happening: Squid accepts 2 requests from the client
(pipeline on), parses them both, services the first one from a random
DNS IP (balance_on_multiple_ip on) and *closes* the connection
(persistence off). The client is forced to repeat the TCP connection
and the second request from scratch, likely pipelining another request
behind that.

This doubles the load on Squid's parser (one of the slowest, most
CPU-intensive parts of proxying), as well as potentially doubling the
client->squid request traffic.

I recommend you remove balance_on_multiple_ip and
server_persistent_connections from your config. That will enable server
persistence in 3.1 in accordance with its HTTP/1.1 capabilities.

Also you can try this:
   pipeline_prefetch on
   client_persistent_connections on
   pconn_timeout 30 seconds

If that client-facing change causes resource problems use:
   pipeline_prefetch off
   client_persistent_connections off

BUT, please be sure to make a note of why before turning off
persistence. You will want to re-check that reason periodically.
Persistence enables several major performance boosters (like
pipelining) in HTTP/1.1, and the problems-vs-benefits balance changes
over time with external factors such as client HTTP compliance and
network hardware outside of Squid.

>
> http_access allow clientes_registrados

Um, I assume the clientes_registrados are registered IPs within the LAN network?

As a double-check on the http_access permissions you can use squidclient
to get a simple list of the access permissions:
   squidclient mgr:config | grep http_access

and check the output rules follow your required policies.

>
> shutdown_lifetime 45 seconds
>
> http_access deny all
>
>
> Wilson Hernandez
> www.figureo56.com
> www.optimumwireless.com
>
>
> On 10/8/2011 3:34 PM, Wilson Hernandez wrote:
>> So far this is what cache.log looks like:
>>
>> 2011/10/08 15:10:14| Starting Squid Cache version 3.1.14 for
>> i686-pc-linux-gnu...
>> 2011/10/08 15:10:14| Process ID 1498
>> 2011/10/08 15:10:14| With 65536 file descriptors available
>> 2011/10/08 15:10:14| Initializing IP Cache...
>> 2011/10/08 15:10:14| DNS Socket created at [::], FD 7
>> 2011/10/08 15:10:14| DNS Socket created at 0.0.0.0, FD 8
>> 2011/10/08 15:10:14| Adding nameserver 172.16.0.2 from squid.conf
>> 2011/10/08 15:10:14| Adding nameserver 172.16.0.1 from squid.conf
>> 2011/10/08 15:10:14| helperOpenServers: Starting 10/10 'zapchain'
>> processes
>> 2011/10/08 15:10:15| Unlinkd pipe opened on FD 33
>> 2011/10/08 15:10:15| Swap maxSize 102400000 + 1048576 KB, estimated
>> 1616384 objects
>> 2011/10/08 15:10:15| Target number of buckets: 80819
>> 2011/10/08 15:10:15| Using 131072 Store buckets
>> 2011/10/08 15:10:15| Max Mem size: 1048576 KB
>> 2011/10/08 15:10:15| Max Swap size: 102400000 KB
>> 2011/10/08 15:10:16| Version 1 of swap file with LFS support detected...
>> 2011/10/08 15:10:16| Rebuilding storage in /var2/squid/cache (DIRTY)

Seems like your shutdown timeout is too small; Squid is not completing
the save of the index.
  But then it is only taking 16 minutes and 15 seconds to re-scan the
disk on startup before Squid starts using the cache again, so not a
big issue for you.
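
You could test whether a longer timeout lets the index save complete,
e.g.:

    shutdown_lifetime 90 seconds

(90 is just an illustrative value. Watch cache.log after the next
restart: a completed save shows the rebuild as CLEAN instead of DIRTY.)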

<snip>
>> 2011/10/08 15:26:31| Completed Validation Procedure
>> 2011/10/08 15:26:31| Validated 9023701 Entries
>> 2011/10/08 15:26:31| store_swap_size = 99318180
>> 2011/10/08 15:26:31| storeLateRelease: released 0 objects
>> 2011/10/08 15:26:31| IpIntercept.cc(137) NetfilterInterception: NF
>> getsockopt(SO_ORIGINAL_DST) failed on FD 12: (2) No such file or
>> directory

This is a major problem. Either your NAT system is broken or the
connection did not traverse the NAT system at all (direct use of port
3128 as forward proxy port).

When you move up to the next Squid versions with Host: security
supported this will reject every one of these connections instead of
just warning.

The current Squid 3.1 and older _assume_ that the IPs given by the OS
connection setup were real and a forward-proxy connection made. This
could be filling your logs with garbage IP addresses.

   In particular, the incoming packets can be seen by Squid as coming
from the box the NAT was performed on. I suspect this is why you are
placing "http_access allow proxy" at the top of your config:
essentially allowing blanket free access to anyone who can figure out
how to avoid your NAT and contact the proxy directly.
  (NP: the "allow localhost" rule should also move down next to "allow
clientes_registrados" if you still need it after fixing the NAT.)

The recommended config is to have two http_port entries, one for
forward-proxy requests (3128) and another (secret random port) for NAT
intercepted connections. The NAT receiving port should be firewalled
such that nothing coming from outside the firewall software can reach it
(iptables mangle table DROP).
  See the recently updated config at
http://wiki.squid-cache.org/ConfigExamples/Intercept/LinuxDnat
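
A minimal sketch of that split (8099 is an arbitrary example port;
pick your own random one and firewall it):

    # forward-proxy port, for clients configured to use the proxy
    http_port 3128
    # NAT interception port - must never be reachable by clients directly
    http_port 8099 intercept

with the firewall REDIRECT rule sending port-80 traffic to 8099, and a
mangle-table DROP rule stopping any outside packet addressed to 8099
itself.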

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.15
   Beta testers wanted for 3.2.0.12
Received on Sun Oct 09 2011 - 03:24:35 MDT
