Re: [squid-users] Re: YouTube and other streaming media (caching)

From: Adrian Chadd <adrian_at_creative.net.au>
Date: Wed, 28 May 2008 09:31:37 +0800

This stuff requires Squid-2.7. Henrik will roll Squid-2.7.STABLE2 soon, so
wait until that's done.

Adrian

On Tue, May 27, 2008, samk_at_twinix.com wrote:
> See Thread at: http://www.techienuggets.com/Detail?tx=32811 Posted on behalf of a User
>
> We have about 600 users behind a Squid 2.6.STABLE20 proxy, and YouTube represents a big chunk of our bandwidth. Will this method work in 2.6, or is 2.7 needed? We tried to use 3.0 for a while but hit a proxy auth bug, and when that was fixed it was unstable, so I went back to 2.6.
>
> Thanks.
>
> In Response To:
>
> On Thu, Apr 17, 2008 at 08:11:51AM +0800, Adrian Chadd wrote:
> > The problem with caching Youtube (and other CDN content) is that
> > the same content is found at lots of different URLs/hosts. This
> > unfortunately means you'll end up caching multiple copies of the
> > same content and (almost!) never see hits.
> >
> > Squid-2.7 -should- be quite stable. I'd suggest just running it from
> > source. Hopefully Henrik will find some spare time to roll 2.6.STABLE19
> > and 2.7.STABLE1 soon so 2.7 will appear in distributions.
>
> Thanks Adrian. FYI I got this to work with 2.7 (latest) based off the
> instructions you provided earlier. Here is my final config and the
> perl script used to generate the storage URL:
>
> http_port 3128
> append_domain .esri.com
> acl apache rep_header Server ^Apache
> broken_vary_encoding allow apache
> maximum_object_size 4194240 KB
> maximum_object_size_in_memory 1024 KB
> access_log /usr/local/squid/var/logs/access.log squid
>
> # Some refresh patterns including YouTube -- although YouTube probably needs to
> # be adjusted.
> refresh_pattern ^ftp: 1440 20% 10080
> refresh_pattern ^gopher: 1440 0% 1440
> refresh_pattern -i \.flv$ 10080 90% 999999 ignore-no-cache override-expire ignore-private
> refresh_pattern ^http://sjl-v[0-9]+\.sjl\.youtube\.com 10080 90% 999999 ignore-no-cache override-expire ignore-private
> refresh_pattern get_video\?video_id 10080 90% 999999 ignore-no-cache override-expire ignore-private
> refresh_pattern youtube\.com/get_video\? 10080 90% 999999 ignore-no-cache override-expire ignore-private
> refresh_pattern . 0 20% 4320
>
> acl all src 0.0.0.0/0.0.0.0
> acl esri src 10.0.0.0/255.0.0.0
> acl manager proto cache_object
> acl localhost src 127.0.0.1/255.255.255.255
> acl to_localhost dst 127.0.0.0/8
> acl SSL_ports port 443
> acl Safe_ports port 80 # http
> acl Safe_ports port 21 # ftp
> acl Safe_ports port 443 # https
> acl Safe_ports port 70 # gopher
> acl Safe_ports port 210 # wais
> acl Safe_ports port 1025-65535 # unregistered ports
> acl Safe_ports port 280 # http-mgmt
> acl Safe_ports port 488 # gss-http
> acl Safe_ports port 591 # filemaker
> acl Safe_ports port 777 # multiling http
> acl CONNECT method CONNECT
> # Some YouTube ACLs
> acl youtube dstdomain .youtube.com .googlevideo.com .video.google.com .video.google.com.au
> acl youtubeip dst 74.125.15.0/24
> acl youtubeip dst 64.15.0.0/16
> cache allow youtube
> cache allow youtubeip
> cache allow esri
>
> # These are from http://wiki.squid-cache.org/Features/StoreUrlRewrite
> acl store_rewrite_list dstdomain mt.google.com mt0.google.com mt1.google.com mt2.google.com
> acl store_rewrite_list dstdomain mt3.google.com
> acl store_rewrite_list dstdomain kh.google.com kh0.google.com kh1.google.com kh2.google.com
> acl store_rewrite_list dstdomain kh3.google.com
> acl store_rewrite_list dstdomain kh.google.com.au kh0.google.com.au kh1.google.com.au
> acl store_rewrite_list dstdomain kh2.google.com.au kh3.google.com.au
>
> # This needs to be narrowed down quite a bit!
> acl store_rewrite_list dstdomain .youtube.com
>
> storeurl_access allow store_rewrite_list
> storeurl_access deny all
>
> storeurl_rewrite_program /usr/local/bin/store_url_rewrite
>
> http_access allow manager localhost
> http_access deny manager
> http_access deny !Safe_ports
> http_access deny CONNECT !SSL_ports
> http_access allow localhost
> http_access allow esri
> http_access deny all
> http_reply_access allow all
> icp_access allow all
> coredump_dir /usr/local/squid/var/cache
>
> # YouTube options.
> quick_abort_min -1 KB
>
> # This will block other streaming media. Maybe we don't want this, but using
> # it for now.
> hierarchy_stoplist cgi-bin ?
> acl QUERY urlpath_regex cgi-bin \?
> cache deny QUERY
>
> And here is the store_url_rewrite script. I added some logging:
>
> #!/usr/bin/perl
>
> use IO::File;
> use IO::Socket::INET;
> use IO::Pipe;
>
> $| = 1;
>
> $fh = new IO::File("/tmp/debug.log", "a");
>
> $fh->print("Hello!\n");
> $fh->flush();
>
> while (<>) {
>     chomp;
>     #print LOG "Orig URL: " . $_ . "\n";
>     $fh->print("Orig URL: " . $_ . "\n");
>     if (m/kh(.*?)\.google\.com(.*?)\/(.*?) /) {
>         print "http://keyhole-srv.google.com" . $2 . ".SQUIDINTERNAL/" . $3 . "\n";
>         # print STDERR "KEYHOLE\n";
>     } elsif (m/mt(.*?)\.google\.com(.*?)\/(.*?) /) {
>         print "http://map-srv.google.com" . $2 . ".SQUIDINTERNAL/" . $3 . "\n";
>         # print STDERR "MAPSRV\n";
>     # [^&\s]+ stops at the first '&' or space, so the video ID cannot
>     # swallow the fields that follow the URL on the input line.
>     } elsif (m/^http:\/\/([A-Za-z]*?)-(.*?)\.(.*)\.youtube\.com\/get_video\?video_id=([^&\s]+).* /) {
>         print "http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=" . $4 . "\n";
>         $fh->print("http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=" . $4 . "\n");
>         $fh->flush();
>     } elsif (m/^http:\/\/([A-Za-z]*?)-(.*?)\.(.*)\.youtube\.com\/get_video\?video_id=(\S*) /) {
>         # http://lax-v290.lax.youtube.com/get_video?video_id=jqx1ZmzX0k0
>         print "http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=" . $4 . "\n";
>     } else {
>         print $_ . "\n";
>     }
> }
>
> The last elsif block could likely be removed at this point, since the
> previous one now catches those URLs. But this is working great! There is
> probably some tuning yet to be done. Maybe someone could update the wiki
> with the new regexp syntax.
>
> Ray
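
For anyone who wants to check the rewrite rules outside Squid, here is a
rough Python sketch of the same idea as the Perl script quoted above. It is
not a drop-in replacement. It assumes each helper input line starts with the
URL followed by space-separated fields (which is what the trailing space in
the Perl regexes suggests), and it returns only the URL for unmatched lines,
whereas the Perl script echoes the whole line. The sjl-v12 host name, the
signature parameter and the client fields in the demo are made up for
illustration.

```python
import re

# Same three URL families as the quoted Perl script: Google Earth (kh*),
# Google Maps (mt*) and YouTube get_video requests.
KEYHOLE = re.compile(r"kh(.*?)\.google\.com(.*?)/(.*)")
MAPS = re.compile(r"mt(.*?)\.google\.com(.*?)/(.*)")
YOUTUBE = re.compile(
    r"^http://[A-Za-z]*?-.*?\..*\.youtube\.com/get_video\?video_id=([^&\s]+)"
)


def store_url(line):
    """Map one helper input line to the URL Squid should store it under."""
    fields = line.split()
    if not fields:
        return ""
    url = fields[0]  # assumption: the URL is the first field on the line

    m = KEYHOLE.search(url)
    if m:
        return ("http://keyhole-srv.google.com" + m.group(2)
                + ".SQUIDINTERNAL/" + m.group(3))
    m = MAPS.search(url)
    if m:
        return ("http://map-srv.google.com" + m.group(2)
                + ".SQUIDINTERNAL/" + m.group(3))
    m = YOUTUBE.search(url)
    if m:
        # Drop the per-server host name and any extra query parameters so
        # that every mirror of a video shares one store URL.
        return ("http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id="
                + m.group(1))
    return url  # no rule matched: leave the URL as it is


def serve(infile, outfile):
    """Helper loop: one input line in, one store URL out, flushed each time."""
    for line in infile:
        outfile.write(store_url(line) + "\n")
        outfile.flush()


# Two different mirrors serving the same video (second host is hypothetical).
demo = [
    "http://lax-v290.lax.youtube.com/get_video?video_id=jqx1ZmzX0k0 10.0.0.1/- - GET",
    "http://sjl-v12.sjl.youtube.com/get_video?video_id=jqx1ZmzX0k0&signature=XYZ 10.0.0.1/- - GET",
]
for entry in demo:
    print(store_url(entry))
```

Both demo lines should come out as the same store URL, which is the whole
point: one cached copy no matter which server handed out the video. To try it
as a real helper you would call serve(sys.stdin, sys.stdout) after importing
sys, but I have only sketched that part.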

-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
Received on Wed May 28 2008 - 01:30:16 MDT