[squid-users] Squid on DualxQuad Core 8GB Rams - Optimization - Performance - Large Scale - IP Spoofing

From: Haytham KHOUJA (devnull) <devnull@dont-contact.us>
Date: Sun, 14 Oct 2007 02:07:16 +0300

Hello,
The purpose of this thread is to join forces and arrive at the best Squid
configuration for generic, affordable Intel machines available from major
vendors (Dell/HP...), specifically for ISPs and corporations that want a
basic setup with optimal response time and throughput and maximum
bandwidth savings.
I work for an important ISP and recently replaced 2 NetApp NetCache
appliances with 3 Dell 2950s hooked up to a Foundry switch for load
balancing.
I used tproxy to spoof the source IP address of outgoing requests,
together with some configuration on the Cisco core router. I had to compile
iptables and tproxy against a Debian kernel source (2.6.18).

I've read almost every single thread on Optimizing Squid and Linux and
want to share my setup with you.
I do have some questions, points that need clarification, and bugs to
report, but overall the performance is pretty impressive. (Yes, much
better than the NetApps.)

What I want to do, since I have 8 GB of RAM, is store more hot objects in
memory to maximize the memory hit ratio. With my setup, however, Squid
doesn't go above 2~3 GB of usage. (Remember that there are no other heavy
processes on the machine.)
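
A rough way to see where the 2~3 GB figure may come from: a commonly
quoted rule of thumb from the Squid FAQ is about 10 MB of index memory per
GB of cache_dir, plus cache_mem. The sketch below applies that rule to the
cache_dir and cache_mem values from the squid.conf listed further down. The
function name and the 10 MB/GB default are assumptions for illustration;
real usage depends on mean object size and on the build.

```shell
#!/bin/sh
# Rule-of-thumb estimate only, not a measurement.
# Assumption: ~10 MB of index RAM per GB of cache_dir (overridable as $4).
estimate_squid_mem_mb() {
    cache_dir_mb=$1        # size of one cache_dir, in MB
    cache_dir_count=$2     # number of cache_dir lines
    cache_mem_mb=$3        # cache_mem setting, in MB
    index_mb_per_gb=${4:-10}
    total_disk_gb=$(( cache_dir_mb * cache_dir_count / 1024 ))
    echo $(( total_disk_gb * index_mb_per_gb + cache_mem_mb ))
}

# 3 x 61440 MB cache_dirs and cache_mem 600 MB, as in this setup
estimate_squid_mem_mb 61440 3 600   # prints 2400
```

By this estimate the index for 180 GB of disk cache plus 600 MB of
cache_mem already accounts for roughly 2.4 GB, which is in the range
observed here.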

If I had known beforehand that Squid doesn't make use of SMP, I wouldn't
have bought dual quad-core machines and would have invested in Intel CPUs
with 8 MB of cache instead, but what's done is done :)

Previously Squid went down because it ran out of file descriptors (the
maximum open files limit) and because the ip_conntrack table filled up. I
fixed both with some iptables and sysctl configuration.
Now I'm hitting this error:
"Oct 14 01:17:06 proxy4 squid[8883]: assertion failed:
diskd/store_io_diskd.c:384: "!diskdstate->flags.close_request""
so Squid dies and restarts (which flushes the memory cache).
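
To catch the file handle and conntrack limits before Squid falls over,
something like the following sketch could be run from cron. It is only an
illustration: the pct_used helper is a made-up name, and the conntrack
paths are an assumption for 2.6.18-era kernels with the ip_conntrack module
loaded (newer kernels use different paths), so each check is skipped if the
file is not readable.

```shell
#!/bin/sh
# Integer percentage of a limit currently in use.
pct_used() {
    used=$1
    max=$2
    if [ "$max" -gt 0 ]; then
        echo $(( used * 100 / max ))
    else
        echo 0
    fi
}

# System-wide file handles: file-nr holds "allocated unused max".
if [ -r /proc/sys/fs/file-nr ]; then
    read -r allocated unused max < /proc/sys/fs/file-nr
    echo "file handles: $allocated of $max ($(pct_used "$allocated" "$max")%)"
fi

# Connection tracking table (assumed path for the ip_conntrack module).
ct=/proc/sys/net/ipv4/netfilter
if [ -r "$ct/ip_conntrack_count" ] && [ -r "$ct/ip_conntrack_max" ]; then
    read -r ct_count < "$ct/ip_conntrack_count"
    read -r ct_max < "$ct/ip_conntrack_max"
    echo "conntrack: $ct_count of $ct_max ($(pct_used "$ct_count" "$ct_max")%)"
fi
```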

I'm looking forward to contributions, shared ideas, and corrections so
that this setup can become a standard configuration for large-scale,
well-optimized, high-performance Squid deployments and a basis for future
tuning. I hope the resulting configuration can then be uploaded to the
Squid wiki.

Here's my setup:
Dell 2950
Dual Quad Core 2.4 GHz / 8 GB RAM / 4x 136 GB 15000 RPM drives

I have 3 cache_dirs on separate drives, and I formatted the 3 disks with
ReiserFS:
    /dev/sdb1 /CACHE1 reiserfs notail,noatime 0 0
    /dev/sdc1 /CACHE2 reiserfs notail,noatime 0 0
    /dev/sdd1 /CACHE3 reiserfs notail,noatime 0 0

I run Debian GNU/Linux Etch and compiled Squid with the following:
Squid Cache: Version 2.6.STABLE16
configure options: '--bindir=/usr/bin' '--sbindir=/usr/sbin/'
'--sysconfdir=/etc' '--enable-icmp' '--enable-snmp' '--enable-async-io'
'--enable-linux-netfilter' '--enable-linux-tproxy' '--with-dl'
'--with-large-files' '--enable-large-cache-files' '--with-maxfd=1000000'
'--enable-storeio=diskd,ufs' '--with-aio' '--enable-epoll'
'--disable-ident-lookups' '--enable-removal-policies=heap'
'CFLAGS=-DNUMTHREADS=120'

As you can see, I have the following features enabled: linux-tproxy,
diskd, epoll, and the heap removal policies.
epoll improves network I/O performance, diskd moves disk I/O into
separate processes (so the main Squid process does not block on disk
writes), and I chose the memory and disk removal policies after reading
benchmarks.

My /etc/squid.conf is composed of the following:

http_port 80 transparent tproxy
tcp_outgoing_address IP of the Machine
:: Those are for IP spoofing and transparency

via off
forwarded_for off
:: Those are for total transparency, remote hosts will never guess that
the request came from a proxy

cache_mem 600 MB
:: I'm a bit confused about this. When I go higher than 2 GB, Squid dies
with an "out of memory" error. I have 8 GB and want to make maximum use of it.

cache_effective_user nobody
cache_effective_group nogroup
:: Security and bla bla

cache_replacement_policy heap LFUDA
memory_replacement_policy heap GDSF
:: Very objective, you can google about them

cache_dir diskd /CACHE1 61440 16 256 Q1=144 Q2=128
cache_dir diskd /CACHE2 61440 16 256 Q1=144 Q2=128
cache_dir diskd /CACHE3 61440 16 256 Q1=144 Q2=128
:: diskd configuration; I'm only using 60 GB of each disk

cache_access_log /var/log/squid/access.log
cache_log /var/log/squid/cache.log
cache_store_log none
:: No need for the store log, so this minimizes disk I/O

fqdncache_size 51200
ipcache_size 51200
:: Caching IPs/Domain Name and whatnot

pipeline_prefetch on
:: Performance enhancement

shutdown_lifetime 1 second
:: I got tired of waiting whenever I restart my Squids (testing only)

read_ahead_gap 60 KB
maximum_object_size 2 GB
minimum_object_size 0 KB
maximum_object_size_in_memory 128 KB
cache_swap_high 80%
cache_swap_low 70%
half_closed_clients off
memory_pools on
positive_dns_ttl 24 hours
negative_dns_ttl 30 seconds
request_timeout 60 seconds
connect_timeout 30 seconds
pconn_timeout 30 seconds
ie_refresh on
dns_nameservers DNS1 DNS2
emulate_httpd_log off
log_ip_on_direct on
debug_options ALL, 9
pid_filename /var/run/squid.pid
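
As a worked example of what the cache_swap_low/cache_swap_high settings
above mean in megabytes: replacement starts once usage passes the low
mark and becomes more aggressive near the high mark, so the disks are kept
at roughly 70% of the configured cache_dir size. The helper name below is
made up for illustration, and the numbers are plain arithmetic on the
values from this configuration.

```shell
#!/bin/sh
# Convert a percentage watermark into MB for a given cache size.
watermark_mb() {
    size_mb=$1
    percent=$2
    echo $(( size_mb * percent / 100 ))
}

per_dir_mb=61440
total_mb=$(( per_dir_mb * 3 ))

echo "per cache_dir: low $(watermark_mb "$per_dir_mb" 70) MB, high $(watermark_mb "$per_dir_mb" 80) MB"
echo "all cache_dirs: low $(watermark_mb "$total_mb" 70) MB, high $(watermark_mb "$total_mb" 80) MB"
```

With these settings about 129024 MB of the 184320 MB configured would be
kept on disk in steady state; raising both percentages would use more of
the configured space.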

My IPtables/sysctl and startup file:
#!/bin/sh
iptables -t tproxy -A PREROUTING -i eth0 -p tcp -m tcp --dport 80 -j TPROXY --on-port 80
:: I run the Squids on port 80 so that the Cisco router can forward all
incoming port 80 requests to them

echo 1 > /proc/sys/net/ipv4/ip_forward
echo 1 > /proc/sys/net/ipv4/ip_nonlocal_bind
echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter
echo 1024 65535 > /proc/sys/net/ipv4/ip_local_port_range
echo 102400 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 1000000 > /proc/sys/net/ipv4/ip_conntrack_max
echo 1000000 > /proc/sys/fs/file-max
echo 60 > /proc/sys/kernel/msgmni
echo 32768 > /proc/sys/kernel/msgmax
echo 65536 > /proc/sys/kernel/msgmnb
:: Maximizing Kernel configuration

ulimit -HSn 1000000
/etc/init.d/squid stop
/etc/init.d/squid start
:: Enforcing the ulimit parameters for the Squid process.
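
The echo lines in the script above take effect immediately but are lost on
reboot unless the script is run again at startup. One alternative is to put
the same values in /etc/sysctl.conf, where each key is the /proc/sys path
with slashes replaced by dots. The sketch below only prints such lines; the
two helper function names are made up for illustration, and only a few of
the settings are shown.

```shell
#!/bin/sh
# Turn a /proc/sys path into the dotted key used by sysctl.conf.
proc_to_sysctl_key() {
    echo "$1" | sed -e 's|^/proc/sys/||' -e 's|/|.|g'
}

# Print one "key = value" line suitable for /etc/sysctl.conf.
sysctl_line() {
    printf '%s = %s\n' "$(proc_to_sysctl_key "$1")" "$2"
}

sysctl_line /proc/sys/net/ipv4/ip_forward 1
sysctl_line /proc/sys/net/ipv4/conf/all/rp_filter 0
sysctl_line /proc/sys/net/ipv4/ip_local_port_range "1024 65535"
sysctl_line /proc/sys/fs/file-max 1000000
```

The printed lines can be reviewed and appended to /etc/sysctl.conf by hand,
then loaded with sysctl -p.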

Thank you
Received on Sat Oct 13 2007 - 17:07:44 MDT
