Re: [squid-users] TCP_SWAPFAIL/200

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 18 Apr 2012 15:12:21 +1200

On 18.04.2012 12:46, Linda Walsh wrote:
> I recently (well a month or so ago) tried to upgrade squid after my
> old version got overwritten
> by an OS related upgrade.
>
> now I am seeing TCP_SWAPFAIL/200 messages in my log -- that doesn't
> sound good.
>
> Why would I be getting such?

http://wiki.squid-cache.org/SquidFaq/SquidLogs#Squid_result_codes

>
> It appears the local disk-store isn't growing over time -- so I'm
> assuming it is telling
> me the on-disk store isn't working right?

Yes.

>
> I used a similar config to my previous one (below), so I'm not sure
> why it would be
> croaking now... Is there something "illegal" about my config? I also
> included my non-comment squid.conf lines following that just to be
> thorough....
> i'd really like to get squid back to being 100%
> solid-bullet-proof...which it
> isn't right now (have had truncated downloads on longer downloads)...
>
> I'm also getting occasional core dumps in the base of the cache dir,
> which is
> usually a bad sign... ;-| Haven't had a chance to try to check the
> stack trace yet,
> but was wondering if anything looked amiss with my swap setup.

Please prioritise the core dump investigation.
Please use gdb to find out where the crash is coming from. The crash
and core dump could be what is behind those incomplete or truncated
responses.

At this point I suggest updating to 3.2.0.17. There are a bunch of
cache related fixes in that release. The new cache swap.state format
will rebuild your cache_dir metadata from scratch and discard anything
which has visible problems.

If the core dumps continue with the new release, please prioritise
those. Most of the rest of what you describe may be side effects of the
crashing.

>
>
> It's a 12core 48G machine, with a reasonably fast Raid so it should
> have plenty of horse power for 1 user...but I find it can't keep up
> with
> my browsing habits... which is insane considering it's usually used
> for 10's-100's of users w/no prob...I know I am not that fast..
>
>
> Any pointing out of "gotcha's" would be appreciated!...
>
>
>
> squid -v
> Squid Cache: Version 3.2.0.16
> configure options: 'CFLAGS=-g -m64 -O2 -march=native -pipe
> -D_REENTRENT
> 'CCFLAGS=-g -m64 -O2 -march=native -pipe -D_REENTRENT 'LDFLAGS= -s'
> '--prefix=/usr' '--bindir=/usr/sbin' '--datadir=/usr/share/squid'
> '--libexecdir=/usr/sbin' '--libdir=/usr/lib64'
> '--localstatedir=/var/cache/squid' '--sharedstatedir=/var/lib/squid'
> '--sysconfdir=/etc/squid' '--docdir=/usr/share/packages/doc/squid'
> '--with-aufs-threads=24' '--with-logdir=/var/log/squid'
> '--with-mandir=/usr/share/man'
> '--with-piddir=/var/run/squid/squid.pid'
> '--with-default-user=squid' '--with-gnu-ld' '--with-included-ltdl'
> '--with-pic' '--with-large '--with-ltdl-lib=/usr/lib64'
> '--enable-build-info' '--enable-cachemgr-hostname' '--enable-disk-io'
> '--disable-ecap' '--disable-icap-client' '--enable-kill-parent-hack'
> '--enable-linux-netfilter' '--enable-ltld-install'
> '--enable-referer-log'
> '--enable-removal-policies' '--enable-stacktraces' '--enable-storeio'
> '--enable-useragent-log' '--enable-zph-qos'
> '--enable-x-accelerator-vary'
> '--disable-xmalloc-statistics' '--disable-auto-locale'
> '--disable-htcp'
> '--disable-ident-lookups' '--disable-ipv6' '--disable-snmp'
> '--disable-translation' '--without-netfilter-conntrack'
> 'EXT_LIBECAP_CFLAGS=-lecap' 'EXT_LIBECAP_LIBS=/usr/lib/libecap.so.2'
>
> + a bunch of compiler optimization switches: (that I also mostly
> used,
> though gcc is a newer version and a few options might be different)
>
> -fpie -fmessage-length=0 -funwind-tables -fasynchronous-unwind-tables
> -fbranch-target-load-optimize -fira-loop-pressure -fgcse -fgcse-las
> -fgcse-lm -fgcse-sm -floop-interchange -floop-strip-mine -floop-block
> -flto -fpredictive-commoning -frename-registers -ftree-loop-linear
> -ftracer -ftree-loop-distribution -ftree-loop-im -ftree-loop-ivcanon
> -fivopts -ftree-vectorize -funswitch-loops
> -fvariable-expansion-in-unroller -freorder-blocks-and-partition
> -fweb'
>
>
>
> Non-comment squid conf lines :
>
> acl sc_subnet src 192.168.3.0/24
> acl localnet src 192.168.3.0/24 # RFC1918 possible internal network
> acl localnet src fc00::/7 # RFC 4193 local private network
> range
> acl localnet src fe80::/10 # RFC 4291 link-local (directly
> plugged) machines
> acl SSL_ports port 443
> acl Safe_ports port 80 # http
> acl Safe_ports port 81 # http
> acl Safe_ports port 82 # http
> acl Safe_ports port 21 # ftp
> acl Safe_ports port 443 # https
> acl Safe_ports port 70 # gopher
> acl Safe_ports port 210 # wais
> acl Safe_ports port 1024-65535 # unregistered ports
> acl Safe_ports port 280 # http-mgmt
> acl Safe_ports port 488 # gss-http
> acl Safe_ports port 488 # gss-http
> acl Safe_ports port 591 # filemaker
> acl Safe_ports port 777 # multiling http
> acl Allowed_Connect port 1024-65535 #allowed non-SSL Connects to
> non-reserved ports
> acl CONNECT method CONNECT
> http_access allow manager localhost
> http_access allow manager sc_subnet
> http_access deny manager
> http_access deny !Safe_ports
> http_access allow CONNECT Safe_Ports

NOTE: Dangerous. Safe_Ports includes ports 1024-65535 and other ports
that are unsafe to permit CONNECT to. This could trivially be used as a
multi-stage spam proxy or worse.
   e.g. a trivial DoS of "CONNECT localhost:8080 HTTP/1.1\n\n" results
in a CONNECT loop until your machine's ports are all used up.
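
A minimal sketch of a tighter rule set, assuming you keep your existing
SSL_ports and Safe_ports ACLs and that the built-in to_localhost ACL is
available in your build (an assumption worth checking against your
squid.conf.documented):

```
# Refuse requests aimed back at the proxy host via localhost.
http_access deny to_localhost
# Refuse anything outside the listed ports.
http_access deny !Safe_ports
# Only allow CONNECT tunnels to the SSL ports (443 in this config).
http_access deny CONNECT !SSL_ports
```

With these in place of the "allow CONNECT Safe_Ports" line, CONNECT is
still permitted for localnet by the later allow rules, but only to
port 443.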

> http_access deny CONNECT !SSL_ports
> http_access allow localnet
> http_access allow localhost
> http_access deny all
> http_port 192.168.3.1:8080
> hierarchy_stoplist cgi-bin ?

You can drop hierarchy_stoplist from your config for simplicity.

> cache_mem 8 GB
> memory_replacement_policy heap GDSF
> cache_replacement_policy heap LFUDA
> cache_dir aufs /var/cache/squid 65535 64 64

You have multiple workers configured. AUFS does not support SMP at this
time. That could be the problem you have with SWAPFAIL, as the workers
collide altering the cache contents.

To use this cache, either wrap it in "if ${process_number} = N" tests
for the workers you want to do caching, or add ${process_number} to the
path so that each worker gets its own unique directory area.

eg:
  cache_dir aufs /var/cache/squid_${process_number} 65535 64 64

or
if ${process_number} = 1
  cache_dir aufs /var/cache/squid 65535 64 64
endif

> maximum_object_size 1 GB
> cache_swap_low 93
> cache_store_log /var/log/squid/store.log
> pid_filename /var/run/squid/squid.pid
> strip_query_terms off
> buffered_logs on
> cache_log daemon:/var/log/squid/cache.log
> coredump_dir /var/cache/squid
> url_rewrite_host_header off
> url_rewrite_access deny all
> url_rewrite_bypass on

You do not have any re-writer or redirector configured. These
url_rewrite_* lines can all go.

> refresh_pattern -i (/cgi-bin/|\?) 0 0% 0

This above pattern ...

> refresh_pattern -i \.(ico|gif|jpg|png) 0 20% 4320
> ignore-no-cache ignore-private override-expire
> refresh_pattern -i ^http: 0 20% 4320 ignore-no-cache
> ignore-private

"private" means the contents MUST NOT be served to multiple clients.
Since you say this is a personal proxy just for you, that's okay, but be
careful if you ever open it for use by other people. Things like your
personal details embedded in some pages are cached by this.

"no-cache" *actually* just means check for updates before using the
cached version. This is usually not as useful as many tutorials make it
out to be.

> refresh_pattern ^ftp: 1440 20% 10080
> refresh_pattern ^gopher: 1440 0% 1440

  ... is meant to be here (second to last).
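
A sketch of the intended tail of the refresh_pattern list, following
the ordering in the default squid.conf (the column alignment is only
for readability):

```
refresh_pattern ^ftp:              1440  20%  10080
refresh_pattern ^gopher:           1440   0%   1440
refresh_pattern -i (/cgi-bin/|\?)     0   0%      0
refresh_pattern .                     0  20%   4320
```

refresh_pattern lines are tested in order and the first match wins, so
a broad ^http: line placed earlier would still catch http:// URLs
before the cgi-bin line is reached.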

> refresh_pattern . 0 20% 4320
> read_ahead_gap 256 MB

Uhm... 256 MB buffering per request.... sure you want to do that?
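
For comparison, a sketch with a much smaller gap. The default is 16 KB;
the 64 KB here is an illustrative value, not a tuned recommendation:

```
# How far Squid may read ahead of what the client has consumed.
read_ahead_gap 64 KB
```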

> negative_ttl 3 seconds
> range_offset_limit 16 MB
> store_objects_per_bucket 16
> request_header_max_size 384 KB
> via off
> vary_ignore_expire on
> request_header_access From deny all
> request_header_access Referer deny all
> request_header_access Server deny all
> request_header_access User-Agent deny all
> request_header_access WWW-Authenticate deny all
> request_header_access Link deny all
> reply_header_access From deny all
> reply_header_access Referer deny all
> reply_header_access Server deny all
> reply_header_access User-Agent deny all
> reply_header_access WWW-Authenticate deny all
> reply_header_access Link deny all
> request_timeout 4 minutes
> half_closed_clients on
> shutdown_lifetime 8 seconds
> visible_hostname web-proxy
> hostname_aliases ishtar ishtar.sc.tlinx.org web-proxy
> ns1.sc.tlinx.org
> umask 027
> dns_defnames on
> memory_pools_limit 4096 GB
> forwarded_for delete
> pipeline_prefetch on
> high_response_time_warning 7000
> high_page_fault_warning 1024
> high_memory_warning 24 GB
> workers 8

Please run "squid -k parse" and fix the messages about obsolete or
changed config options.

Amos
Received on Wed Apr 18 2012 - 03:12:27 MDT
