Re: [squid-users] How can I cache most content

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Mon, 15 Feb 2010 09:23:47 +1300

Landy Landy wrote:
> Hello.
>
> I was looking at a post (how to force windos update to cache all update) from a week ago that was trying to cache all Windows updates. I was looking into using thundercache, which does exactly that, but I'm using videocache and can't get the two (thundercache and videocache) to work together. After reading the post I decided to use Squid to cache Windows updates, but I don't know if I'm doing it correctly since I haven't copied the refresh_patterns from the post. I actually followed the http://wiki.squid-cache.org/SquidFaq/WindowsUpdate wiki. I would also like to cache antivirus updates and the content from the most visited sites: hi5, facebook, etc. I would like to cache everything if possible.
>
> Looking at my access.log file I notice that content I thought was supposed to be cached is not getting cached, for example:
>
> 1266160859.409 146 172.16.100.61 TCP_REFRESH_UNMODIFIED/304 382 GET http://col.stb.s-msn.com/i/98/996247A8EF5F5991FCD8AACF6528F.jpg - DIRECT/65.54.81.185 image/jpeg
> 1266160859.421 148 172.16.100.61 TCP_REFRESH_UNMODIFIED/304 382 GET http://col.stb.s-msn.com/i/FA/E1B3F9B5667878F033D4C68A911AFD.jpg - DIRECT/65.54.81.209 image/jpeg

Cached content being revalidated. The 304 means the stored copy was still
good, so only headers crossed your link.

> 1266160859.660 168 172.16.100.61 TCP_MISS/200 2273 GET http://a.rad.msn.com/ADSAdClient31.dll? - DIRECT/65.55.197.125 text/html
> 1266160859.865 100 172.16.100.61 TCP_MISS/200 419 GET http://b.scorecardresearch.com/r? - DIRECT/204.2.241.162 image/gif

Adverts. Explicitly non-cacheable.
As advised by others, you really want to block these outright, or educate
your users on the use of ad-blockers to save bandwidth.

> 1266160861.376 123 172.16.100.61 TCP_REFRESH_UNMODIFIED/304 306 GET http://col.stc.s-msn.com/br/gbl/css/6/decoration/pipe.gif - DIRECT/4.23.59.126 -

Cached content being updated...

> 1266160861.387 289 172.16.100.45 TCP_MISS/200 415 GET http://0.channel53.facebook.com/p - DIRECT/69.63.178.123 text/plain

Facebook. Private update channel for a user's page display.

> 1266160862.600 0 172.16.100.45 TCP_MISS/000 0 GET http://0.channel53.facebook.com/p - DIRECT/0.channel53.facebook.com -
> 1266160865.819 0 172.16.100.18 TCP_MISS/000 0 GET http://sn120w.snt120.mail.live.com/mail/SafeRedirect.aspx? - DIRECT/sn120w.snt120.mail.live.com -

Several transfer errors (status 000 and zero bytes: no reply was received).

> 1266160872.391 146 172.16.254.1 TCP_MISS/200 319 GET http://www.kottke.org/frontpage/updates/index.php? - DIRECT/67.18.227.74 text/html

The only page in the list which appears to be cacheable.
First visit maybe?

> 1266160872.647 136 172.16.100.61 TCP_REFRESH_UNMODIFIED/304 264 GET http://col.stb.s-msn.com/i/50/832D93022C9184EBE368DD81A3874.jpg - DIRECT/65.54.81.209 image/jpeg

Cached content being updated...

> 1266160873.653 346 172.16.100.16 TCP_MISS/200 1119 GET http://www.facebook.com/ajax/presence/reconnect.php? - DIRECT/69.63.189.11 application/x-javascript

Facebook. 'nuff said.

> 1266160874.001 199591 172.16.100.99 TCP_MISS/200 4303656 GET http://streamer.soundclick.com/jarry_lo/14/06/freemp3/mamajuana+ajudemedeus.mp3 - DIRECT/8.14.112.23 audio/x-mpegurl

Streamed mp3, VERY likely never to have been visited before...

> 1266160876.884 270 172.16.100.99 TCP_MISS/200 651 GET http://w88.go.com/b/ss/wdgespcom,wdgespge/1/H.17/s73739908562219? - DIRECT/66.235.138.18 image/gif

Explicitly non-cacheable private page. Created several hours in the
future!! (Even to me sitting here in timezone +1300).

> 1266160877.214 198 172.16.100.110 TCP_MISS/200 5516 GET http://www.myhotcomments.com/graphics/53933 - DIRECT/75.126.132.34 text/html

ERROR: "The resource doesn't send Vary consistently."

>
> Here's my squid.conf file. Please correct things that might not be correct or optimized to cache the most content as possible.
>
> # Port Squid listens on
> http_port 172.16.0.1:3128 transparent

I seriously advise doing "transparent" on a different port.
Allowing direct external connections to a port flagged for "transparent"
interception is asking for trouble these days.
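
A minimal sketch of that split, assuming port 3129 is free and is the one
your firewall redirect rule points at (the port number is an assumption,
not something taken from your config):

   # browsers configured to use the proxy explicitly connect here
   http_port 172.16.0.1:3128
   # only NAT-redirected port 80 traffic should arrive here
   http_port 172.16.0.1:3129 transparent

The firewall should then refuse direct outside connections to the
interception port.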

>
> # Access-lists (ACLs) will permit or deny hosts to access the proxy
> acl lan-access src 172.16.0.0/16
> acl localhost src 127.0.0.1
> acl localnet src 172.16.0.0/16
>
> acl CONNECT method CONNECT
>
> http_access allow localhost
> http_access allow lan-access

Hmm. http_access rules are checked top-down and the first match wins.
With "lan-access" machines being allowed complete uncontrolled access to
the Internet here, it's no wonder your attempts at using http_access
below this line are not working....

Also "lan-access" and "localnet" can be reduced to only one ACL. Pick one
name and replace the other.
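
Something along these lines, as a sketch only. The "blocked_sites" ACL and
its domain are hypothetical placeholders for whatever restrictions you
actually want applied:

   acl localhost src 127.0.0.1
   acl localnet src 172.16.0.0/16
   # hypothetical restriction, replace with your own
   acl blocked_sites dstdomain .example.com

   # restrictions go first, the general allow rules after them
   http_access deny blocked_sites
   http_access allow localhost
   http_access allow localnet
   http_access deny all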

<snip...>
> acl windowsupdate dstdomain .go.microsoft.com
> acl windowsupdate dstdomain .update.microsoft.com/windowsupdate/v7/default.aspx
> acl windowsupdate dstdomain .download.microsoft.com
> acl windowsupdate dstdomain activex.microsoft.com
> acl windowsupdate dstdomain codecs.microsoft.com
> acl windowsupdate dstdomain urs.microsoft.com
>
> #acl CONNECT method CONNECT
> acl wuCONNECT dstdomain www.update.microsoft.com
> acl wuCONNECT dstdomain sls.microsoft.com
>
> http_access allow CONNECT wuCONNECT localnet
> http_access allow windowsupdate localnet
> # --- Windows update ends -----------------------------
>
> store_avg_object_size 48 KB

To maximize caching, don't set limits on what can be cached....
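
For example, if one of the limits in play is maximum_object_size (an
assumption, since that part of the config is not shown here), raising it
lets large files such as update packages be stored. The 200 MB figure is
illustrative only:

   # illustrative value, size it to your disk cache
   maximum_object_size 200 MB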

> half_closed_clients off
>
> store_dir_select_algorithm round-robin
> quick_abort_min -1
> negative_ttl 1 minutes
> connect_timeout 90 seconds
> dns_nameservers 196.3.81.5 200.88.127.22 196.3.81.132
> logfile_rotate 5
> offline_mode off
>
> #balance_on_multiple_ip on
>
> refresh_pattern ^ftp: 1440 20% 10080
> refresh_pattern ^gopher: 1440 0% 1440
> refresh_pattern -i (cgi-bin|\?) 0 0% 0

Sorry, we have a better version of that now:
   refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
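
The difference is that it only matches "/cgi-bin/" as a path segment
instead of any URL that happens to contain the string "cgi-bin". Dropped
into your existing list it would look like this, keeping the same order
since the first matching pattern is the one used:

   refresh_pattern ^ftp: 1440 20% 10080
   refresh_pattern ^gopher: 1440 0% 1440
   refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
   refresh_pattern . 0 20% 4320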

> refresh_pattern . 0 20% 4320
>
> read_ahead_gap 32 KB
>
> visible_hostname Optimum
> cache_mgr sdfs_at_hotmail.com
>
>
> client_persistent_connections off
> server_persistent_connections off
> persistent_connection_after_error off

The above will be sucking a fair bit of speed out of your connection.
TCP handshakes on every request...
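
Unless there is a specific reason they were turned off, something like
this restores connection reuse. Both directives default to "on", so
simply deleting the two "off" lines should have the same effect:

   client_persistent_connections on
   server_persistent_connections on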

> detect_broken_pconn off
> memory_pools off
> #memory_pools_limit 64 MB
> refresh_all_ims on
> reload_into_ims on
> retry_on_error on
> coredump_dir none

> pipeline_prefetch on

With bandwidth limitations this will be sucking a fair bit of useless
crap in.
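
On a link this small I would expect the default to serve you better:

   pipeline_prefetch off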

>
>
> Sorry for the long post but, I'm in desperate need of saving bandwidth since the most I can get in my part of the world is only 5MB and have to handle over 100 users with this connection.
>
> Thanks in advance for your help and guidance.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE8 or 3.0.STABLE24
   Current Beta Squid 3.1.0.16
Received on Sun Feb 14 2010 - 20:23:56 MST
