Re: [squid-users] Youtube Issue!

From: Benjamin <benjo11111_at_gmail.com>
Date: Sun, 27 Nov 2011 19:32:27 +0530

On 11/27/2011 07:28 PM, Ghassan Gharabli wrote:
> Hello again,
>
> I have tested this video myself, and "&range=*" shows up on some videos
> even when nobody skips anything.
>
> Everything is okay now, except that some videos are being cached twice with
> the same Content-Length!
>
> Please see this one:
>
> 1322399742.127 66732 192.168.10.14 TCP_HIT/200 69489664 GET
> http://o-o.preferred.orange-par1.v14.lscache1.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=903311&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=81E9381A2DF2C1F61388DB08F270607E4CF8F67E.233A3E093009D8EE0123DFC0C3CAE35FB97D7348&source=youtube&expire=1322424000&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1RNUl9FSkNOMV9MR1ZBOl9kd3dzRzJKZlhJ&id=db5b3a6267109fd6
> - NONE/- video/x-flv
> 1322399847.393 79657 192.168.10.14 TCP_HIT/200 69489664 GET
> http://o-o.preferred.orange-par1.v14.lscache1.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=903311&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=81E9381A2DF2C1F61388DB08F270607E4CF8F67E.233A3E093009D8EE0123DFC0C3CAE35FB97D7348&source=youtube&expire=1322424000&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1RNUl9FSkNOMV9MR1ZBOl9kd3dzRzJKZlhJ&id=db5b3a6267109fd6&range=13-2375679
> - NONE/- video/x-flv
>
> The content moves from (id=db5b3a6267109fd6) to
> id=db5b3a6267109fd6&range=13-2375679, and that is really strange. How come
> it is being skipped if nobody skipped it manually? As far as I know, when
> someone skips forward manually you will find "&begin=*" or "start=*" in the
> URL, and that is already ignored. I am no longer getting errors playing
> those videos on YT because "&range=*" URLs are no longer cached.
>
> This was denied:
> refresh_pattern (get_video|videoplayback|videodownload|\.flv).*range\=[0-9\-]* 0 0% 0
>
> I have also ignored it in the storeurl_rewrite helper.
>
> Amos, if &range only appears when you skip while playing, then how come it
> is happening like this?
>
>
> Ghassan
>
>
>
> On 11/27/11, Ghassan Gharabli <sounarose_at_googlemail.com> wrote:
>> BTW, that is what was happening to me while testing YT, and of course you
>> can't even think of caching videos after the client has skipped.
>>
>> Concerning the FLV object: yes, I noticed before that when you upload a
>> YouTube video they split the whole video into frames, which seems to send
>> different objects with the same video ID. Of course this one should be
>> ignored by Squid.
>>
>> The 302 redirection was only found for "240p" FLV by default, and I have
>> applied the code so as not to hit the loop.
>>
>> ACCESS.LOG
>> -------------------
>> 1322360339.081 88 192.168.10.14 TCP_HIT/200 86436 GET
>> http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&source=youtube&expire=1322384400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&id=283246f338ece5ad
>> - NONE/- video/x-flv
>> 1322360339.242 445 192.168.10.14 TCP_MISS/204 229 GET
>> http://clients1.google.com/generate_204 - DIRECT/209.85.148.138
>> text/html
>> 1322360339.549 453 192.168.10.14 TCP_MISS/204 422 GET
>> http://s.youtube.com/stream_204?event=streamingerror&erc=1&retry=1&ec=100&fexp=912600,907605,915002&plid=AASyrgMkZZEo1OUT&v=KDJG8zjs5a0&el=detailpage&rt=0.749&fmt=34&shost=o-o.preferred.orange-par1.v3.lscache3.c.youtube.com&scoville=1&fv=WIN%2011,0,1,152
>> - DIRECT/74.125.39.100 text/html
>> 1322360339.619 434 192.168.10.14 TCP_MISS/204 422 GET
>> http://s.youtube.com/stream_204?fv=WIN%2011,0,1,152&event=streamingerror&el=detailpage&erc=2&rt=0.873&fexp=912600,907605,915002&fmt=34&v=KDJG8zjs5a0&shost=tc.v3.cache3.c.youtube.com&plid=AASyrgMkZZEo1OUT&scoville=1&ec=100
>> - DIRECT/74.125.39.101 text/html
>> 1322360340.112 10781 192.168.10.14 TCP_MISS/204 230 GET
>> http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/generate_204?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&source=youtube&expire=1322384400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&id=283246f338ece5ad
>> - DIRECT/64.15.118.50 text/html
>> 1322360341.351 10833 192.168.10.14 TCP_MISS/204 422 GET
>> http://s.youtube.com/stream_204?rt=0.460&fmt=34&el=detailpage&shost=o-o.preferred.orange-par1.v3.lscache3.c.youtube.com&scoville=1&ec=100&event=streamingerror&retry=1&erc=1&fv=WIN%2011,0,1,152&plid=AASyrgKgSyateKe8&fexp=912600,907605,915002&v=KDJG8zjs5a0
>> - DIRECT/74.125.39.102 text/html
>> 1322360341.818 2729 192.168.10.14 TCP_HIT/200 2376087 GET
>> http://tc.v3.cache3.c.youtube.com/videoplayback?fexp=907605%2C912600%2C915002&key=yt1&ipbits=8&burst=40&sver=3&algorithm=throttle-factor&signature=712F1A94A31D43D03E1DB0F67FF9B7F1A9EDA4EC.029774C29E789ACC1D557E1172163D90F6610205&id=283246f338ece5ad&factor=1.25&expire=1322384400&itag=34&source=youtube&sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&ip=84.0.0.0&cp=U0hRR1NTUl9FSkNOMV9LTVZFOkpsV3BkS1RxZXNF&playretry=1
>> - NONE/- video/x-flv
>>
>>
>> As you can see, it moves once but causes an error in the FLV player.
>>
>>
>> I need someone to test this URL
>> http://www.youtube.com/watch?v=KDJG8zjs5a0
>>
>> If someone is interested :
>>
>>
>> # Put your perl location here; mine is #!/bin/perl
>> $|=1;  # unbuffered output, so Squid sees each reply immediately
>> while (<>) {
>>     @X = split;
>>     # First field, echoed back in front of the rewritten URL. The trailing
>>     # space assumes it is the channel ID of a helper run with concurrency.
>>     $x = $X[0] . " ";
>>     $_ = $X[1];  # second field: the requested URL
>>     # youtube 1024p HD itag=37, 720p HD itag=22: key on itag and id
>>     if (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com).*?\&(itag=37|itag=22).*?\&(id=[a-zA-Z0-9]*)/) {
>>         print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "&" . $3 . "\n";
>>     # youtube 360p itag=34, 480p itag=35 and others: key on id only
>>     } elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*?(itag=[0-9]*).*?(id=[a-zA-Z0-9]*)/) {
>>         print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $3 . "\n";
>>     # same as above, but with id appearing before itag in the URL
>>     } elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*?(id=[a-zA-Z0-9]*).*?(itag=[0-9]*)/) {
>>         print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "\n";
>>     } else {
>>         # anything else is passed through unchanged
>>         print $x . $_ . "\n";
>>     }
>> }
>>
>> I didn't add "\&" because sometimes "itag" comes first, as in
>> "videoplayback?itag=34"; same thing for "id".
>>
>>
>> Now I'm only getting errors on videos with a 302 redirection. The loop
>> patch was applied successfully before compiling Squid, and access.log shows
>> the request moving normally to the location of the video URL, but both URLs
>> are being cached since we are caching "/videoplayback\?" and both
>> produce FLV videos.
>>
>> When somebody skips to a timestamp in the video which hasn't been
>> downloaded yet, YT adds something like &begin=[0-9] to its URL. I have
>> denied caching those URLs because they would make your cache directory
>> grow very quickly.
>>
>>
>> Ghassan
>>
>>
>>
>> On Sun, Nov 27, 2011 at 4:02 AM, Amos Jeffries <squid3_at_treenet.co.nz>
>> wrote:
>>> On 27/11/2011 5:32 a.m., Ghassan Gharabli wrote:
>>>> Hello Amos,
>>>>
>>>>
>>>> Finally, I have captured almost all YouTube videos, except for
>>>> something I would like some assistance with.
>>>>
>>>>
>>>> As I have tested before, many times, Chudy's script is outdated.
>>>>
>>>> After testing and logging YouTube videos, I finally found something
>>>> that is not being fully cached. If you remember, I said in my old
>>>> messages that the ID isn't being captured in all places, but that is
>>>> okay, I have done this. I will post my details once I have completely
>>>> finished them.
>>>>
>>>> Could you please explain to me what's happening here?
>>>>
>>>> If &range=13-2375679 is found in a URL, then Squid doesn't understand
>>>> how to cache the full video: it only caches the first 13 seconds, I
>>>> guess, and then it stops. If I try to download this finished cached
>>>> movie, its size is about 2.2 MB. Try to remove it from the cache and
>>>> Squid can't even find it, as it claims it is not cached, yet
>>>> access.log shows TCP_HIT. STRANGE!
>>> (NP: by remove you mean a PURGE request? HIT just means cached data was
>>> found to service the request, which is right, since purging the data
>>> involves locating it (HITting) before erasing the cached entry. Follow-up
>>> requests after the purge should not be HIT.)
>>>
>>> I took a look at these "range" replies being generated by YT a while back.
>>>
>>> What I found was that a request for the video URL would send back an FLV
>>> object with bytes e.g. "[SWF...]ABCDEFGH". All fine and good; this is the
>>> cacheable video.
>>>
>>> If the user skips around in the video, the player generates a range=
>>> request stating what timestamp or bytes they want to start at. It's not
>>> clear which, because the reply which comes back has a *different* byte
>>> sequence than the video at the same URL. For example, on the
>>> "[SWF...]ABCDEFGH" video it would produce "[SWF...]EFGH" or something
>>> similar.
>>>
>>> Under the HTTP rules, the range object to be combined must be a snippet
>>> portion of the base object (range 4-999 should have been just "DEFGH").
>>> By adding the SWF headers to each reply, YT are making them unique and
>>> different objects. Combining them in the middle (i.e. by a caching app)
>>> will cause errors in the binary object and crash the Flash player, or
>>> cause it to display an error message instead of the video.
>>>
>>> This range request only seems to happen if the user skips into a portion
>>> of the video the player has not yet downloaded. So sending them the whole
>>> video, which is what we try to do with Squid, will cause a display lag
>>> for the user but not cause problems in their player.
>>>
>>>
>>>> Now look into this URL:
>>>> -------------------------------
>>>>
>>>>
>>>> "http://o-o.preferred.orange-par1.v4.lscache7.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=907605%2C912600%2C915002&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=8223490C23E48CB708E04666E4A550422757CEC6.9D8D78E66DD14FEFC4B5F960F493ED4CDFD7C51C&source=youtube&expire=1322348400&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPVl9FSkNOMV9LSVpFOkpsV3BkS1B1ZXNF&id=e120643085f56831&range=13-2375679"
>>>>
>>>> HTTP/1.0 200 OK
>>>> Last-Modified: Fri, 27 Nov 2009 12:44:54 GMT
>>>> Content-Type: video/x-flv
>>>> Date: Sat, 26 Nov 2011 16:06:29 GMT
>>>> Expires: Sat, 26 Nov 2011 16:06:29 GMT
>>>> Cache-Control: private, max-age=24511
>>>> Accept-Ranges: bytes
>>>> Content-Length: 2375667
>>>> X-Content-Type-Options: nosniff
>>>> Server: gvs 1.0
>>>> X-Cache: MISS from Peer6
>>>> X-Cache-Lookup: MISS from Peer6:3128
>>>> Connection: close
>>>>
>>>> What's the job of "Accept-Ranges: bytes" here?
>>> Accept-* means the software producing that reply or request supports a
>>> certain HTTP feature. In this case it is Squid, and maybe the server as
>>> well, supporting HTTP range requests. Not related to YT particularly.
>>>
>>>> And the real confusion: here is another similar URL with the same
>>>> "/videoplayback?.*(id)", where the ID comes at the end of the URL, and
>>>> it just gets a temporary redirect. I must mention that this URL sends
>>>> the FLV URL that Squid has already logged in access.log, with
>>>> &ir=1&playretry=1 or pr=1&playretry added, which means Squid would be
>>>> confused into caching it (the FLV) twice.
>>>>
>>>> EXAMPLE:
>>>> ---------------
>>>>
>>>>
>>>> "http://o-o.preferred.orange-par1.v3.lscache3.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=908525%2C910207%2C916201&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=0489805DCC95F6EADBA9D43C3FD8C107FC768662.73AA6897FE78CF78BE7819E089F1A4FC47534C7D&source=youtube&expire=1322344800&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPUl9FSkNOMV9LSVZJOmdmQWdwWC01dlpn&id=283246f338ece5ad"
>>>>
>>>> HTTP/1.0 302 Moved Temporarily
>>>> Last-Modified: Wed, 02 May 2007 10:26:10 GMT
>>>> Date: Sat, 26 Nov 2011 15:50:47 GMT
>>>> Expires: Sat, 26 Nov 2011 15:50:47 GMT
>>>> Cache-Control: private, max-age=900
>>>> Location: http://r9.orange-par2.c.youtube.com/videoplayback?sparams=id%2Cexpire%2Cip%2Cipbits%2Citag%2Csource%2Calgorithm%2Cburst%2Cfactor%2Ccp&fexp=908525%2C910207%2C916201&algorithm=throttle-factor&itag=34&ip=84.0.0.0&burst=40&sver=3&signature=0489805DCC95F6EADBA9D43C3FD8C107FC768662.73AA6897FE78CF78BE7819E089F1A4FC47534C7D&source=youtube&expire=1322344800&key=yt1&ipbits=8&factor=1.25&cp=U0hRR1NPUl9FSkNOMV9LSVZJOmdmQWdwWC01dlpn&id=283246f338ece5ad&ir=1
>>>> X-Content-Type-Options: nosniff
>>>> Content-Type: text/html
>>>> Server: gvs 1.0
>>>> Age: 2068
>>>> Content-Length: 0
>>>> X-Cache: HIT from Peer6
>>>> X-Cache-Lookup: HIT from Peer6:3128
>>>> Connection: close
>>> This is the 302 redirect Adrian and Chudy were discussing at the end of
>>> the wiki page. If you cache it with storeurl_access reductions it will
>>> loop infinitely back at itself.
>>>
>>> Amos
>>>
>>>
Hi,

Can we use a bash shell script to write a Squid redirector for YouTube
video caching? I am quite good with bash shell scripting.
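
To make the question concrete: a store-URL helper only reads request lines on stdin and writes one reply line per request, so a shell script can do the job in principle. Below is a minimal sketch, not a tested production helper. It assumes the helper is run with concurrency, so each line starts with a channel ID followed by the URL. It reuses the video-srv.youtube.com.SQUIDINTERNAL key from the Perl helper quoted above, keys every video on itag and id, and passes range/begin/start URLs through untouched, since those replies are different objects. The function names and the "--serve" flag are made up for this sketch.

```bash
#!/bin/bash
# Sketch of a store-URL rewrite helper in shell (functions avoid bash-only syntax).
# Assumed request line:  "<channel-id> <url> <extra fields...>"
# Assumed reply line:    "<channel-id> <store-url>"

CANON="http://video-srv.youtube.com.SQUIDINTERNAL"

# query_param NAME URL: print the value of query parameter NAME.
# Returns non-zero if the parameter is not present.
query_param() {
    case $2 in
        *"?$1="* | *"&$1="*) ;;
        *) return 1 ;;
    esac
    qp_rest=${2#*[?&]"$1"=}          # drop everything up to the first "NAME="
    printf '%s\n' "${qp_rest%%&*}"   # drop any parameters that follow
}

# store_url URL: print the cache key for a plain videoplayback request,
# or the URL unchanged for anything else.
store_url() {
    su_url=$1
    case $su_url in
        http://*) ;;
        *) printf '%s\n' "$su_url"; return ;;
    esac
    su_host=${su_url#http://}
    su_path=/${su_host#*/}           # path and query string
    su_host=${su_host%%/*}           # host name only
    case $su_host in
        *.youtube.com | *.googlevideo.com) ;;
        *) printf '%s\n' "$su_url"; return ;;
    esac
    case $su_path in
        /videoplayback\?*) ;;
        *) printf '%s\n' "$su_url"; return ;;
    esac
    # Partial replies must not share a key with the full video.
    for su_skip in range begin start; do
        if query_param "$su_skip" "$su_url" >/dev/null; then
            printf '%s\n' "$su_url"
            return
        fi
    done
    su_itag=$(query_param itag "$su_url") &&
        su_id=$(query_param id "$su_url") ||
        { printf '%s\n' "$su_url"; return; }
    printf '%s/itag=%s&id=%s\n' "$CANON" "$su_itag" "$su_id"
}

# serve: answer requests from stdin until Squid closes the pipe.
serve() {
    while read -r sv_chan sv_url sv_rest; do
        printf '%s %s\n' "$sv_chan" "$(store_url "$sv_url")"
    done
}

# Only start the loop when asked to, so the functions can be tried by hand.
if [ "${1:-}" = "--serve" ]; then
    serve
fi
```

Trying it by hand with `printf '0 http://example.com/ -\n' | ./helper.sh --serve` should echo the line back as `0 http://example.com/`. A shell loop like this starts a subshell for every request, so it will be slower than the Perl version under load.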

Regards,
Benjamin
Received on Sun Nov 27 2011 - 14:01:37 MST
