Re: [squid-users] Caching youtube videos problem/ always getting TCP_MISS

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Tue, 23 Nov 2010 01:14:45 +0000

On Mon, 22 Nov 2010 17:39:12 +0530, Saurabh Agarwal
<Saurabh.Agarwal_at_citrix.com> wrote:
> Hi All/Amos
>
> I am using squid 2.7 Stable7 and trying to cache this youtube video
> http://www.youtube.com/watch?v=7M-jsjLB20Y but I am always getting a tcp
> miss. I have done the required configuration as mentioned on
> http://wiki.squid-cache.org/ConfigExamples/DynamicContent/YouTube. After a
> few redirects this http get request response returns the content type of
> video/x-flv.
>
>
http://v24.lscache6.c.youtube.com/videoplayback?ip=202.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor&fexp=901306%2C900025&algorithm=throttle-factor&itag=34&ipbits=8&burst=40&sver=3&expire=1289592000&key=yt1&signature=1E5E015856CF11DE13A253255DFA638D9084981C.D49489F758A488EF2DF2200E8DD8EFADE4F4ADF7&factor=1.25&id=eccfa3b232c1db46&
>
> From above url I can successfully extract id=eccfa3b232c1db46 field from
> this url using storeurl rewriter feature as confirmed by cache.log below.

No. The log below shows the transformation being applied to the URL, but
the rewriter output is malformed where the channel ID should be.

> Please search for
>
http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=eccfa3b232c1db46
> below. When I turn on debug logs using debug level ALL,9 in squid I see
> following in cache.log file. The messages like "Rewrote to" prints the same
> input url in the cache.logs. I think this url should be
>
http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=eccfa3b232c1db46
>
>
> ---------output from cache.log using debug level
> ALL,9-------------------------------
> helperHandleRead: '0
>
http://v24.lscache6.c.youtube.com/videoplayback?ip=202.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor&fexp=901306%2C900025&algorithm=throttle-factor&itag=34&ipbits=8&burst=40&sver=3&expire=1289592000&key=yt1&signature=1E5E015856CF11DE13A253255DFA638D9084981C.D49489F758A488EF2DF2200E8DD8EFADE4F4ADF7&factor=1.25&id=eccfa3b232c1db46&
> 10.102.79.81/client - GET - myip=10.102.79.88 myport=3128
>
http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=eccfa3b232c1db46
>
> '
> 2010/11/12 12:36:40| helperHandleRead: end of reply found: 0
>
http://v24.lscache6.c.youtube.com/videoplayback?ip=202.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor&fexp=901306%2C900025&algorithm=throttle-factor&itag=34&ipbits=8&burst=40&sver=3&expire=1289592000&key=yt1&signature=1E5E015856CF11DE13A253255DFA638D9084981C.D49489F758A488EF2DF2200E8DD8EFADE4F4ADF7&factor=1.25&id=eccfa3b232c1db46&
> 10.102.79.81/client - GET - myip=10.102.79.88 myport=3128
>
http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=eccfa3b232c1db46
>

This whole string appears to be what the helper is sending back.
The correct output is just "0
http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=eccfa3b232c1db46"

Note that the "0 " concurrency channel ID is missing from in front of the
altered URL. Instead, the whole received request line, including its
newline terminator, is being sent back as a prefix.

<snip>

> 2010/11/12 12:36:40| Rewrote to
>
http://v24.lscache6.c.youtube.com/videoplayback?ip=202.0.0.0&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Calgorithm%2Cburst%2Cfactor&fexp=901306%2C900025&algorithm=throttle-factor&itag=34&ipbits=8&burst=40&sver=3&expire=1289592000&key=yt1&signature=1E5E015856CF11DE13A253255DFA638D9084981C.D49489F758A488EF2DF2200E8DD8EFADE4F4ADF7&factor=1.25&id=eccfa3b232c1db46&
> ....

Once Squid identifies the channel ID and discards the garbage trailing the
line, this "Rewrote to" URL is what Squid uses as the response to the
channel 0 request.
Note the absence of SQUIDINTERNAL in the domain name and the abundance of
extra parameter strings: it is just the original request URL.

...

> 2010/11/12 12:36:40| cbdataUnlock: 0xafd4d8
> 2010/11/12 12:36:40| cbdataUnlock: Freeing 0xafd4d8
> 2010/11/12 12:36:40| helperHandleRead: end of reply found:
>
http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=eccfa3b232c1db46
>
>
> 2010/11/12 12:36:40| helperHandleRead: unexpected reply on channel -1 from
> store_rewriter #1
>
'http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id=eccfa3b232c1db46'

It looks like the actual response is coming back on a second line
without any channel ID, so Squid rejects it as an unexpected reply. That
missing channel ID is the only reason your client requests are not being
sent the wrong reply objects from cache.

What should be happening is that the rewriter pulls the channel ID and
requested URL off the request line, then sends back the channel ID and
altered URL *only*.
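
As a rough illustration of that behaviour, here is a minimal sketch of a
helper in Python. It is not the script from the wiki page. It assumes
concurrency is enabled, so each request line starts with a channel ID as in
the log above, and that the video ID is the value of the "id" query
parameter. The names rewrite_line and serve, and the regular expression, are
made up for this example.

```python
import re
import sys

# Assumption: the video ID is the value of the "id" query parameter.
VIDEO_ID = re.compile(r"[?&]id=([^&\s]+)")
# Store-URL form taken from the thread above.
STORE_URL = "http://video-srv.youtube.com.SQUIDINTERNAL/get_video?video_id="


def rewrite_line(line):
    """Turn one helper request line into one reply line (no newline)."""
    fields = line.split()
    if len(fields) < 2:
        # Malformed request: send back only the channel ID, if there is one.
        return fields[0] if fields else ""
    channel, url = fields[0], fields[1]
    match = VIDEO_ID.search(url)
    if "youtube.com/videoplayback?" in url and match:
        url = STORE_URL + match.group(1)
    # Channel ID and URL only; client IP, method, myip= etc. are dropped.
    return channel + " " + url


def serve(requests=sys.stdin, replies=sys.stdout):
    """Answer each request with exactly one line, flushed immediately."""
    for line in requests:
        replies.write(rewrite_line(line) + "\n")
        replies.flush()
```

To run it as the helper, call serve() at the bottom of the script. URLs that
do not match are echoed back unchanged behind their channel ID, so every
request still gets exactly one reply line. If concurrency is not enabled
(storeurl_rewrite_concurrency in 2.7, as far as I recall) there is no channel
ID on the request line and the field positions in this sketch would need
adjusting.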

Amos
Received on Tue Nov 23 2010 - 01:14:49 MST
