Re: [squid-users] Re: Cache Windows Updates ONLY

From: Nick Hill <nick_at_nickhill.co.uk>
Date: Sat, 12 Apr 2014 20:08:20 +0100

I have been ironing out issues with my windows updates set-up for
Squid. I have been through my squid.conf file to de-cruft it.

The following squid.conf should be self-documenting. I have found this
works well in a multi-computer environment where you can expect a lot
of Windows machines to perform updates. A computer shop is a good
example. Of course, you will want to configure a DHCP server with a
wpad.dat address so that your client machines will auto-configure to
use your proxy.

The principal difference between this and other configurations is that
it will cache Windows updates even where a query string operates on a
cab, exe, or other non-dynamic response. I find the query string does
not change the file contents. (I know - it is possible that it
could...)

The other feature is that Microsoft conveniently include SHA1 hashes
in URLs for static content files. Often, these static content files
will be found at differing locations, and will often be called with
query strings! Web cache hell! This configuration represents the data
internally to Squid based purely on the SHA1 hash where available. If
two content items really have a SHA1 match, then for all practical
purposes they are identical. Any successive file accesses from any of
the Windows update domains which match the general SHA1 pattern used
in Windows updates will generate a cache HIT, even where the URL is
quite different, and irrespective of any cache-bashing query string.
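
To illustrate the idea, here is a minimal Python sketch of the mapping.
It is not the Perl helper itself, only the same regex idea. The pattern
is adapted from the windowsupdate.com rule at the end of this post; the
two example URLs, the hash value and the function name store_id are
made up for illustration.

```python
import re

# Adapted from the windowsupdate.com rule for /etc/squid3/storeid_rewrite.
# The rule file writes the captured hash as $1; here it fills a format slot.
RULE = re.compile(
    r"^http:\/\/.+?\.windowsupdate\.com\/.+?_([0-9a-z]{40})"
    r"\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd)"
)
TEMPLATE = "http://wupdate.squid.local/{}"


def store_id(url):
    """Return the hash-based store ID for url, or None if the rule does not match."""
    match = RULE.match(url)
    return TEMPLATE.format(match.group(1)) if match else None


# Hypothetical URLs: same 40-character hash, different host, path and query.
HASH = "0123456789abcdef0123456789abcdef01234567"
a = "http://au.download.windowsupdate.com/msdownload/x/kb1_" + HASH + ".cab"
b = "http://bg.v4.download.windowsupdate.com/d/y/kb1_" + HASH + ".cab?P1=abc"

print(store_id(a) == store_id(b))
print(store_id("http://www.example.com/index.html"))
```

Both example URLs collapse to the same store ID, which is why the
second fetch can be served from the cache. A URL without the hash
pattern gets no store ID and is cached (or not) under its own URL.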

I will monitor the configurations over the next week. Empirically, so
far, it all works!
If anyone can see howlers, let me know. Thanks!

#squid.conf file for Squid Cache: Version 3.4.4
#compiled on Ubuntu with configure options: '--enable-async-io=8'
#'--enable-storeio=ufs,aufs,diskd' '--enable-removal-policies=lru,heap'
#'--enable-delay-pools' '--enable-underscores' '--enable-icap-client'
#'--enable-follow-x-forwarded-for' '--with-logdir=/var/log/squid3'
#'--with-pidfile=/var/run/squid3.pid' '--with-filedescriptors=65536'
#'--with-large-files' '--with-default-user=proxy'
#'--enable-linux-netfilter' '--enable-storeid-rewrite-helpers=file'

#Recommendations: in full production, you may want to set debug
#options from 2 to 1 or 0.
#You may also want to comment out strip_query_terms off for user privacy

#Explicitly define logs for my compiled version
cache_store_log /var/log/squid3/store.log
access_log /var/log/squid3/access.log
cache_log /var/log/squid3/cache.log

#Let's have a fair bit of debugging info
debug_options ALL,2
#Include query strings in logs
strip_query_terms off

acl all src all
acl windowsupdate dstdomain .windowsupdate.microsoft.com
acl windowsupdate dstdomain .c.microsoft.com
acl windowsupdate dstdomain .ws.microsoft.com
acl windowsupdate dstdomain .update.microsoft.com
acl windowsupdate dstdomain images.metaservices.microsoft.com
acl windowsupdate dstdomain .download.windowsupdate.com
acl windowsupdate dstdomain wustat.windows.com
acl windowsupdate dstdomain swcdn.apple.com
acl windowsupdate dstdomain data-cdn.mbupdates.com
acl QUERY urlpath_regex cgi-bin \?

#I'm behind a NAT firewall, so I don't need to restrict access
http_access allow all

#Uncomment these if you have web apps on the local server which auth
#through local ip
#acl to_localhost dst 127.0.0.0/8 0.0.0.0/32
#http_access deny to_localhost

visible_hostname myclient.hostname.com
http_port 3128

#Optimise for bandwidth saved (byte hit rate) rather than number of hits
cache_replacement_policy heap LFUDA
#200 MB max object if not windowsupdate
maximum_object_size 200000 KB
#Set these according to your file system
cache_dir ufs /home/smb/squid/squid 70000 16 256
coredump_dir /home/smb/squid/squid

refresh_pattern -i microsoft.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd) 43200 80% 43200 override-lastmod override-expire ignore-reload ignore-must-revalidate ignore-private
refresh_pattern -i windowsupdate.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd) 43200 80% 43200 override-lastmod override-expire ignore-reload ignore-must-revalidate ignore-private
refresh_pattern -i windows.com/.*\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd) 43200 80% 43200 override-lastmod override-expire ignore-reload ignore-must-revalidate ignore-private
#Default refresh patterns last if no others match
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern . 0 20% 4320

#Directive sets I have been experimenting with
#override-lastmod override-expire ignore-reload ignore-must-revalidate
#ignore-private
#reload-into-ims

#Windows updates use a lot of range requests. The only way to deal with this
#in Squid is to fetch the whole file as soon as requested
range_offset_limit -1 windowsupdate
quick_abort_min -1 KB windowsupdate

#Windows update files are HUGE! I have set this to 6 GB.
#A recent (as of Apr 2014) Windows 8 update file is 4 GB
maximum_object_size 6000000 KB windowsupdate

#My internet connection is not just used for Squid. I want to leave
#responsive bandwidth for other services. This limits D/L speed
delay_pools 1
delay_class 1 1
delay_access 1 allow all
delay_parameters 1 1200000/1200000

#We use the store_id helper to convert windows update file hashes to bare URLs.
#This way, any fetch for a given hash embedded in the URL will deliver
#the same data.
#You must create your own /etc/squid3/storeid_rewrite; instructions are at the end.
#Change the helper program location from
#/usr/local/squid/libexec/storeid_file_rewrite to wherever yours is.
#It is written in Perl, so on most Linux systems, put it somewhere
#convenient and chmod 755 filename
store_id_program /usr/local/squid/libexec/storeid_file_rewrite /etc/squid3/storeid_rewrite
store_id_children 10 startup=5 idle=3 concurrency=0
store_id_access allow windowsupdate
store_id_access deny all

#We want to cache windowsupdate URLs which include queries
#but only those queries which act on an installable file.
#we don't want to cache queries on asp files as this is a genuine server
#side query as opposed to a cache breaker
acl wupdatecachablequery urlpath_regex (cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|appxbundle|esd)\?

#Deny caching for URLs matching query but not windowsupdate
cache deny QUERY !windowsupdate
#Deny caching for URLs matching query and windowsupdate but not cachable updates
cache deny QUERY windowsupdate !wupdatecachablequery

#Given windows update is un-cooperative towards third party
#methods to reduce network bandwidth, it is safe to presume
#cache-specific headers or dates significantly differing from
#system date will be unhelpful
reply_header_access Date deny windowsupdate
reply_header_access Age deny windowsupdate

#Put the two following lines in /etc/squid3/storeid_rewrite, omitting
#the starting hash. Each rule is a single line: the pattern, a TAB,
#then the replacement.
#^http:\/\/.+?\.ws\.microsoft\.com\/.+?_([0-9a-z]{40})\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd)	http://wupdate.squid.local/$1
#^http:\/\/.+?\.windowsupdate\.com\/.+?_([0-9a-z]{40})\.(cab|exe|ms[iuf]|asf|wm[va]|dat|zip|psf|appx|esd)	http://wupdate.squid.local/$1
Received on Sat Apr 12 2014 - 19:08:31 MDT
