Re: [squid-users] Trying to improve the Byte Hit Ratio, any tips ?

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 06 Jan 2009 17:49:10 +1300

Vianney Lejeune wrote:
> Hello,
>
> I'm trying to improve the Byte Hit Ratio of SquidCache on my
> network. There are 220 computers in the LAN, using internet on a general
> usage basis. The maximum bandwidth is 4Mbps in/out, the total amount of
> data is estimated to be 30 to 60 Gbytes daily.
>
>
> This is the report from cachemgr:
> =>
> Average HTTP requests per minute since start: 1023.9
> Average ICP messages per minute since start: 0.0
> Select loop called: 1208577 times, 5.619 ms avg
> Cache information for squid:
> Request Hit Ratios: 5min: 37.9%, 60min: 41.1%
> Byte Hit Ratios: 5min: 13.2%, 60min: 13.8% (It's quite low, these
> values are usual)
> Request Memory Hit Ratios: 5min: 2.0%, 60min: 2.6% (I rebooted
> the server 3 hours ago, this can explain these low values)
> Request Disk Hit Ratios: 5min: 41.3%, 60min: 36.3%
> Storage Swap size: 27654312 KB
> Storage Mem size: 190364 KB
> Mean Object Size: 29.65 KB
> Requests given to unlinkd: 33035
> Median Service Times (seconds) 5 min 60 min:
> HTTP Requests (All): 0.23230 0.46965
> Cache Misses: 0.35832 0.72387
> Cache Hits: 0.19742 0.35832
> Near Hits: 0.20843 0.55240
> Not-Modified Replies: 0.03829 0.05331
> DNS Lookups: 0.00094 0.00779
> ICP Queries: 0.00000 0.00000
> <=
>
> This is my squid.conf file:
> =>
>
> http_port 3128 transparent
> hierarchy_stoplist cgi-bin ?

> acl QUERY urlpath_regex cgi-bin \?
> cache deny QUERY

Without cache peers you can drop the above QUERY acl and its
"cache deny" line. That will raise both hit ratios on semi-dynamic
objects. BUT, see the addition to refresh_pattern below...

> acl apache rep_header Server ^Apache
> broken_vary_encoding allow apache
> maximum_object_size 128 MB

Re: the above maximum. Objects larger than 128 MB are never cached, and
there may be huge objects going through that could be cached if the
limit were higher.
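
As a rough sketch only (512 MB is an assumed figure, not something
measured on this network), raising the limit would look like:

  # assumption: 512 MB is illustrative; pick a value that covers the
  # largest downloads you actually want to keep on disk
  maximum_object_size 512 MB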

> cache_mem 250 MB
> maximum_object_size_in_memory 50 KB

Memory, memory, memory. The more you can throw at the problem, the more
objects can be kept and served while hot. A 64-bit Squid can easily
handle many GBs of memory cache (at the cost of a slow shutdown, when it
saves the hottest objects to disk for the next round).
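
A minimal sketch, assuming the box has a few GB of RAM free for Squid
(both numbers are illustrative, not tuned for this network):

  # assumption: roughly 1 GB can be given to hot objects
  cache_mem 1024 MB
  # assumption: let somewhat larger objects stay in memory too
  maximum_object_size_in_memory 512 KB

Note that cache_mem only limits the memory used for objects; the Squid
process as a whole will be larger than that.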

> cache_replacement_policy heap LFUDA

It has been a while since I looked at these, but to maximize bytes you
want the policy that looks at object size as well as 'coldness', so
that the smaller cool objects are removed before the larger, equally
cool ones.
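
For comparison, a sketch of the two heap policies, going by their
descriptions in the squid.conf documentation (worth checking against
the squid.conf.default shipped with your version):

  # keeps popular objects regardless of size; documented as favouring
  # byte hit ratio at the expense of request hit ratio
  cache_replacement_policy heap LFUDA

  # keeps smaller popular objects; documented as favouring request hit
  # ratio, with a lower byte hit ratio than LFUDA
  # cache_replacement_policy heap GDSF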

> cache_dir ufs /data/spool/squid 30000 16 256

Your cache dir is only 30GB. That's one day's traffic or less by your
above statements. For good hit ratios you may need at least 7 days'
worth, preferably as close to 30 as possible.

Depending on your OS, AUFS (Linux) or diskd (*BSD) may give much faster
access than UFS.
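
As a sketch of what a week of storage could look like on Linux, taking
the low end of your 30 to 60 GB/day estimate and assuming the disk
under /data actually has the room:

  # 7 days x 30 GB/day = 210 GB; the cache_dir size is given in MB
  # assumption: the filesystem has at least that much space, plus headroom
  cache_dir aufs /data/spool/squid 210000 16 256

Keep in mind that the in-memory index grows with the size of the
cache_dir, so a larger disk cache also needs more RAM.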

> access_log none
> cache_log none

The above is generating a log file named "none". It would be more
useful to set debug_options ALL,0. If you really don't want to know
about the critical problems that do happen, then set the filename to
/dev/null as well.
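
In config terms that is roughly:

  # log only the critical messages
  debug_options ALL,0
  # only if you really want no record of critical problems either
  cache_log /dev/null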

> cache_store_log none
> log_ip_on_direct off
> hosts_file /etc/hosts
> refresh_pattern ^ftp: 1440 20% 10080
> refresh_pattern ^gopher: 1440 0% 1440

Without the QUERY acl above, you will need this right here in the pattern order:
  refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
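
Put together, the result would look roughly like this (a sketch
assembled from the lines quoted in this mail; commenting out the QUERY
lines assumes there are no cache peers):

  # acl QUERY urlpath_regex cgi-bin \?
  # cache deny QUERY
  refresh_pattern ^ftp:              1440 20% 10080
  refresh_pattern ^gopher:           1440 0%  1440
  refresh_pattern -i (/cgi-bin/|\?)  0    0%  0
  refresh_pattern .                  0    20% 4320

Squid uses the first refresh_pattern that matches, so the cgi-bin/query
rule has to sit above the catch-all "." line.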

> refresh_pattern . 0 20% 4320
> quick_abort_min 0 KB
> quick_abort_max 0 KB
> range_offset_limit 0 KB

Be careful, but you may want to play with setting these so that
downloads continue after the client aborts (quick_abort_min -1 KB).
That will cause all partial and restarted downloads to become HITs
later, at the risk of some wasted bandwidth.
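
A sketch of that, with the values taken from the squid.conf comments
(check the exact syntax against your version's squid.conf.default):

  # always finish fetching a cacheable object, even if the client
  # goes away
  quick_abort_min -1 KB
  # assumption: also fetch the whole object from the start when a
  # client asks for a range, so the result can be cached
  range_offset_limit -1 KB

On a 4Mbps link the combination can waste a fair amount of bandwidth on
large downloads that nobody asks for again, so watch the traffic after
changing it.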

> half_closed_clients off
> shutdown_lifetime 0 seconds
> acl all src 0.0.0.0/0.0.0.0
> acl manager proto cache_object
> acl localhost src 127.0.0.1/255.255.255.255
> acl to_localhost dst 127.0.0.0/8
> acl SSL_ports port 443 # https
> acl SSL_ports port 563 # snews
> acl SSL_ports port 873 # rsync
> acl Safe_ports port 80 # http
> acl Safe_ports port 21 # ftp
> acl Safe_ports port 443 # https
> acl Safe_ports port 70 # gopher
> acl Safe_ports port 210 # wais
> acl Safe_ports port 1025-65535 # unregistered ports
> acl Safe_ports port 280 # http-mgmt
> acl Safe_ports port 488 # gss-http
> acl Safe_ports port 591 # filemaker
> acl Safe_ports port 777 # multiling http
> acl Safe_ports port 631 # cups
> acl Safe_ports port 873 # rsync
> acl Safe_ports port 901 # SWAT
> acl purge method PURGE
> acl CONNECT method CONNECT
> acl ReseauLocal src 10.0.0.0/16
> http_access allow manager localhost
> http_access deny manager
> http_access allow purge localhost
> http_access deny purge
> http_access allow localhost
> http_access allow ReseauLocal
> http_access deny all
> http_reply_access allow all
> icp_access deny all
> cache_effective_group proxy
> httpd_suppress_version_string on
> via off
> forwarded_for off
> log_icp_queries off
> client_db off
> coredump_dir /var/spool/squid
> pipeline_prefetch off
> <=
>
> Do you see something that needs to be improved? Did I miss something?

There are a lot of tweaks with refresh_pattern that can be done to warp
things into cache longer than they are supposed to be stored. I won't
advocate any though.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE5 or 3.0.STABLE11
   Current Beta Squid 3.1.0.3
Received on Tue Jan 06 2009 - 04:50:27 MST
