Re: [squid-users] Youtube Issue!

From: Ghassan Gharabli <sounarose_at_googlemail.com>
Date: Sun, 27 Nov 2011 15:58:17 +0200

Hello again,

I have tested this video myself, and "&range=*" shows up in the URL for
some videos even though nobody skipped anything.

Everything else is okay now, but some videos are being cached twice with
the same Content-Length!

Please see this one:

1322399742.127 66732 192.168.10.14 TCP_HIT/200 69489664 GET
http://o-o.preferred.orange-par1.v14.lscache1.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=903311&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=81E9381A2DF2C1F61388DB08F270607E4CF8F67E.233A3E093009D8EE0123DFC0C3CAE35FB97D7348&source=youtube&expire=1322424000&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1RNUl9FSkNOMV9MR1ZBOl9kd3dzRzJKZlhJ&id=db5b3a6267109fd6
- NONE/- video/x-flv
1322399847.393 79657 192.168.10.14 TCP_HIT/200 69489664 GET
http://o-o.preferred.orange-par1.v14.lscache1.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=903311&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=81E9381A2DF2C1F61388DB08F270607E4CF8F67E.233A3E093009D8EE0123DFC0C3CAE35FB97D7348&source=youtube&expire=1322424000&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1RNUl9FSkNOMV9MR1ZBOl9kd3dzRzJKZlhJ&id=db5b3a6267109fd6&range=13-2375679
- NONE/- video/x-flv

The request moves from (id=db5b3a6267109fd6) to
id=db5b3a6267109fd6&range=13-2375679, and that is really strange. How
come part of the video is being skipped when nobody skipped it manually?
As far as I know, when someone skips forward manually the URL gets
"&begin=*" or "start=*", and those are already ignored. I am no longer
getting errors when playing those videos on YT, because "&range=*" URLs
are no longer being cached.

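To make the comparison concrete, here is a minimal Perl sketch that splits
the query string and flags partial requests. The URLs are shortened,
hypothetical stand-ins for the two log entries above (example.invalid is
not a real host), and query_params / is_partial are illustrative names,
not part of any existing helper.

```perl
use strict;
use warnings;

# Split the query string of a URL into a hash of parameters.
# Values stay percent-encoded; only simple key=value pairs are handled.
sub query_params {
    my ($url) = @_;
    my (undef, $query) = split /\?/, $url, 2;
    my %params;
    for my $pair (split /&/, defined $query ? $query : '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $params{$key} = defined $value ? $value : '';
    }
    return %params;
}

# True (1) when the URL carries range=, begin= or start=.
sub is_partial {
    my ($url) = @_;
    my %p = query_params($url);
    return (grep { exists $p{$_} } qw(range begin start)) ? 1 : 0;
}

# Abbreviated, hypothetical forms of the two logged URLs.
my $full    = 'http://example.invalid/videoplayback?itag=34&id=db5b3a6267109fd6';
my $partial = $full . '&range=13-2375679';

for my $url ($full, $partial) {
    my %p = query_params($url);
    printf "id=%s itag=%s partial=%d\n", $p{id}, $p{itag}, is_partial($url);
}
```

Both URLs report the same id and itag; only the second one is flagged as
partial.
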
This is how I denied it:
refresh_pattern (get_video|videoplayback|videodownload|\.flv).*range\=[0-9\-]* 0 0% 0

I have also ignored it in the storeurl_rewrite helper.
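
As a rough illustration of that helper-side rule, here is a minimal Perl
sketch. It is an assumption about how such a check could look, not the
exact code I run: store_url is a hypothetical name, the hosts are made
up, and only the id-only key from the script quoted further down is
reproduced (the HD itag branch is left out).

```perl
use strict;
use warnings;

# Return the key the object should be stored under.
# Assumption: partial requests are passed through unchanged, so they
# never share a store key with the full video.
sub store_url {
    my ($url) = @_;

    # range=, begin= or start= in the query string: leave the URL as it is.
    return $url if $url =~ m{[?&](?:range|begin|start)=};

    # Full video: collapse every mirror onto one internal key built from id.
    if ($url =~ m{^http://[^/]+/videoplayback\?(?:.*?&)?(id=[a-zA-Z0-9]+)}) {
        return "http://video-srv.youtube.com.SQUIDINTERNAL/$1";
    }

    # Anything else is not rewritten.
    return $url;
}

# Hypothetical, shortened URLs for demonstration only.
my @urls = (
    'http://mirror-a.example.invalid/videoplayback?itag=34&id=db5b3a6267109fd6',
    'http://mirror-a.example.invalid/videoplayback?itag=34&id=db5b3a6267109fd6&range=13-2375679',
);
print store_url($_), "\n" for @urls;
```

With this rule the first URL maps to the shared internal key, while the
second keeps its own full URL and therefore cannot overwrite or
duplicate the full video under the same key.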

Amos, if &range only appears when the viewer skips during playback, then
how come it is happening here?

Ghassan

On 11/27/11, Ghassan Gharabli <sounarose_at_googlemail.com> wrote:
> BTW, that is what was happening to me while testing YT, and of course you
> can't even think of caching videos after the client has skipped in them.
>
> Concerning the FLV object: yes, I had noticed before that when
> you upload a YouTube video they split the whole video into frames,
> so different objects seem to be sent with the same video ID.
> Of course these should be ignored by Squid.
>
> The 302 redirection was only found for "240p" FLV by default, and I
> have applied the code so that it does not hit the loop.
>
> ACCESS.LOG
> -------------------
> 1322360339.081 88 192.168.10.14 TCP_HIT/200 86436 GET
> http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&source=youtube&expire=1322384400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&id=283246f338ece5ad
> - NONE/- video/x-flv
> 1322360339.242 445 192.168.10.14 TCP_MISS/204 229 GET
> http://clients1.google.com/generate_204 - DIRECT/209.85.148.138
> text/html
> 1322360339.549 453 192.168.10.14 TCP_MISS/204 422 GET
> http://s.youtube.com/stream_204?event=streamingerror&erc=1&retry=1&ec=100&fexp=912600,907605,915002&plid=AASyrgMkZZEo1OUT&v=KDJG8zjs5a0&el=detailpage&rt=0.749&fmt=34&shost=o-o.preferred.orange-par1.v3.lscache3.c.youtube.com&scoville=1&fv=WIN%2011,0,1,152
> - DIRECT/74.125.39.100 text/html
> 1322360339.619 434 192.168.10.14 TCP_MISS/204 422 GET
> http://s.youtube.com/stream_204?fv=WIN%2011,0,1,152&event=streamingerror&el=detailpage&erc=2&rt=0.873&fexp=912600,907605,915002&fmt=34&v=KDJG8zjs5a0&shost=tc.v3.cache3.c.youtube.com&plid=AASyrgMkZZEo1OUT&scoville=1&ec=100
> - DIRECT/74.125.39.101 text/html
> 1322360340.112 10781 192.168.10.14 TCP_MISS/204 230 GET
> http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/generate_204?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&source=youtube&expire=1322384400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&id=283246f338ece5ad
> - DIRECT/64.15.118.50 text/html
> 1322360341.351 10833 192.168.10.14 TCP_MISS/204 422 GET
> http://s.youtube.com/stream_204?rt=0.460&fmt=34&el=detailpage&shost=o-o.preferred.orange-par1.v3.lscache3.c.youtube.com&scoville=1&ec=100&event=streamingerror&retry=1&erc=1&fv=WIN%2011,0,1,152&plid=AASyrgKgSyateKe8&fexp=912600,907605,915002&v=KDJG8zjs5a0
> - DIRECT/74.125.39.102 text/html
> 1322360341.818 2729 192.168.10.14 TCP_HIT/200 2376087 GET
> http://tc.v3.cache3.c.youtube.com/videoplayback?fexp=907605%2C912600%2C915002&key=yt1&ipbits=8&burst=40&sver=3&algorithm=throttle-factor&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&id=283246f338ece5ad&factor=1.25&expire=1322384400&itag=34&source=youtube&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&ip=84.0.0.0&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&playretry=1
> - NONE/- video/x-flv
>
>
> As you can see, it moves once but causes an error in the FLV player.
>
>
> I need someone to test this URL:
> http://www.youtube.com/watch?v=KDJG8zjs5a0
>
> If anyone is interested, here is my script:
>
>
> #!/bin/perl
> # Put your own perl location on the line above; mine is /bin/perl.
> $|=1;    # unbuffered output so Squid sees each reply immediately
> while (<>) {
> @X = split;
> $x = $X[0];    # first field of the input line
> $_ = $X[1];    # second field, matched as the URL below
> # youtube 1080p HD itag=37, 720p HD itag=22
> if (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?\&(itag=37|itag=22).*?\&(id=[a-zA-Z0-9]*)/) {
> print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "&" . $3 . "\n";
> # youtube 360p itag=34, 480p itag=35 and others
> } elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*?(itag=[0-9]*).*?(id=[a-zA-Z0-9]*)/) {
> print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $3 . "\n";
> # same as above, but with id= appearing before itag=
> } elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*?(id=[a-zA-Z0-9]*).*?(itag=[0-9]*)/) {
> print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "\n";
> } else {
> # anything else is passed through unchanged
> print $x . $_ . "\n";
> }
> }
>
> I didn't add "\&" because sometimes "itag" comes right after the "?", as
> in "videoplayback?itag=34"; the same goes for "id".
>
>
> Now I'm only getting errors on the videos with the 302 redirection. The
> loop patch was applied successfully before compiling Squid, and
> access.log shows the request moving normally to the location of the
> video URL, but both URLs are being cached, since we cache
> "/videoplayback\?" and both return FLV video.
>
> When somebody skips to a timestamp in the video which hasn't
> been downloaded yet, YT adds something like &begin=[0-9] to the URL.
> I have denied caching those URLs because they would make
> your cache directory grow bigger and bigger in a short time.
>
>
> Ghassan
>
>
>
> On Sun, Nov 27, 2011 at 4:02 AM, Amos Jeffries <squid3_at_treenet.co.nz>
> wrote:
>> On 27/11/2011 5:32 a.m., Ghassan Gharabli wrote:
>>>
>>> Hello Amos,
>>>
>>>
>>> Finally, I have captured almost all YouTube videos, except for
>>> something I would like some assistance with.
>>>
>>>
>>> As I have tested many times before, Chudy's script is
>>> outdated.
>>>
>>> After testing and logging YouTube videos, I finally found
>>> something not being fully cached. If you remember, I said in my
>>> earlier messages that the ID wasn't being captured in all places,
>>> but that is okay now, I have fixed it. I will post the details once I
>>> have completely finished them.
>>>
>>> Could you please explain to me what is happening here?
>>>
>>> If &range=13-2375679 is found in a URL then Squid doesn't understand
>>> how to cache the full video: it only caches the first 13 seconds, I
>>> guess, and then it stops. If I download this cached
>>> movie, its size is about 2.2 MB. If you try to remove it
>>> from the cache, Squid can't even find it and claims it is not cached,
>>> but access.log shows TCP_HIT. STRANGE!
>>
>> (NP: by remove you mean a PURGE request? HIT just means cached data was
>> found to service the request, which is right, since purging the data
>> involves locating it (HITing) before erasing the cached entry. Follow-up
>> requests after the purge should not be HIT.)
>>
>> I took a look at these "range" replies being generated by YT a while back.
>>
>> What I found was that a request for the video URL would send back an FLV
>> object with bytes, e.g., "[SWF...]ABCDEFGH". All fine and good, this is
>> the cacheable video.
>>
>> If the user skips around in the video, the player generates a range=
>> request stating what timestamp or bytes they want to start at. It's not
>> clear which, because the reply which comes back has a *different* byte
>> sequence than the video at the same URL. For example, on the
>> "[SWF...]ABCDEFGH" video it would produce "[SWF...]EFGH" or something
>> similar.
>>
>> Under the HTTP rules, the range object to be combined must be a snippet
>> portion of the base object (range 4-999 should have been just "DEFGH").
>> By adding the SWF headers on each reply, YT are making them unique and
>> different objects. Combining them in the middle (i.e. by a caching app)
>> will cause errors in the binary object and crash the Flash player or
>> cause it to display an error message instead of the video.
>>
>> This range request only seems to happen if the user skips into a portion
>> of video the player has not yet downloaded. So sending them the whole
>> video, which is what we try to do with Squid, will cause a display lag
>> for the user but not cause problems in their player.
>>
>>
>>>
>>> Now look into this URL:
>>> -------------------------------
>>>
>>>
>>> "http://o-o.preferred.orange-par1.v4.lscache7.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=8223490C23E48CB708E04666E4A550422757CEC6.9D8D78E66DD14FEFC4B5F960F493ED4CDFD7C51C&source=youtube&expire=1322348400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPVl9FSkNOMV9LSVpFOkpsV3BkS1B1ZXNF&id=e120643085f56831&range=13-2375679"
>>>
>>> HTTP/1.0 200 OK
>>> Last-Modified: Fri, 27 Nov 2009 12:44:54 GMT
>>> Content-Type: video/x-flv
>>> Date: Sat, 26 Nov 2011 16:06:29 GMT
>>> Expires: Sat, 26 Nov 2011 16:06:29 GMT
>>> Cache-Control: private, max-age=24511
>>> Accept-Ranges: bytes
>>> Content-Length: 2375667
>>> X-Content-Type-Options: nosniff
>>> Server: gvs 1.0
>>> X-Cache: MISS from Peer6
>>> X-Cache-Lookup: MISS from Peer6:3128
>>> Connection: close
>>>
>>> What is the job of "Accept-Ranges: bytes" here?
>>
>> Accept-* means the software producing that reply or request supports a
>> certain HTTP feature. In this case it is Squid, and maybe the server as
>> well, supporting HTTP range requests. Not related to YT particularly.
>>
>>>
>>> And the confusing part again: you can see another similar URL with the
>>> same "/videoplayback?.*(id)", where the ID comes at the end of the
>>> URL, and it just moves temporarily. I must mention that this URL
>>> redirects to the FLV URL, as Squid already shows in access.log, and
>>> then it adds &ir=1&playretry=1 or pr=1&playretry, which means Squid
>>> would be confused into caching the FLV twice.
>>>
>>> EXAMPLE:
>>> ---------------
>>>
>>>
>>> "http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=908525%2C910207%2C916201&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=0489805DCC95F6EADBA9D43C3FD8C107FC768662.73AA6897FE78CF78BE7819E089F1A4FC47534C7D&source=youtube&expire=1322344800&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPUl9FSkNOMV9LSVZJOmdmQWdwWC01dlpn&id=283246f338ece5ad"
>>>
>>> HTTP/1.0 302 Moved Temporarily
>>> Last-Modified: Wed, 02 May 2007 10:26:10 GMT
>>> Date: Sat, 26 Nov 2011 15:50:47 GMT
>>> Expires: Sat, 26 Nov 2011 15:50:47 GMT
>>> Cache-Control: private, max-age=900
>>> Location: http://r9.orange-par2.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=908525%2C910207%2C916201&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=0489805DCC95F6EADBA9D43C3FD8C107FC768662.73AA6897FE78CF78BE7819E089F1A4FC47534C7D&source=youtube&expire=1322344800&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPUl9FSkNOMV9LSVZJOmdmQWdwWC01dlpn&id=283246f338ece5ad&ir=1
>>> X-Content-Type-Options: nosniff
>>> Content-Type: text/html
>>> Server: gvs 1.0
>>> Age: 2068
>>> Content-Length: 0
>>> X-Cache: HIT from Peer6
>>> X-Cache-Lookup: HIT from Peer6:3128
>>> Connection: close
>>
>> This is the 302 redirect Adrian and Chudy were discussing at the end of
>> the wiki page. If you cache it with storeurl_access reductions it will
>> loop infinitely back at itself.
>>
>> Amos
>>
>>
>
Received on Sun Nov 27 2011 - 13:58:25 MST
