Re: [squid-users] anyone knows some info about youtube "range" parameter?

From: Eliezer Croitoru <eliezer_at_ngtech.co.il>
Date: Thu, 26 Apr 2012 04:41:05 +0300

On 25/04/2012 20:39, Ghassan Gharabli wrote:
> Hello,
>
> As I remember, I already discussed this subject before, mentioning that
> YouTube added a new variable/URI "RANGE" several months ago. I tried
> to deny all URLs that come with "RANGE" to avoid presenting the error
> in the YouTube player, but I tried to investigate more and came up with
> a solution like this:
>
Nice solution!
I have used Squid (various versions) with nginx as a cache_peer for
quite a long time, so I don't deny any URLs.
The main thing for me is that the user should not notice in any way
that there is a cache mechanism!
I haven't used store_url_rewrite for a long time, because nginx has some
really nice features that help cache a lot of sites with dynamic content.
I added the "range" argument to nginx pretty easily, but the main
problem is that the "204" (end of data for the HTTP GET content) for
some reason is not sent to nginx while the file is playing, so the
YouTube player reaches a point where it will not "preload" the next
"range" until playback gets to the end of the current "range".
So in this case I can think of three options to solve it:
1. Make nginx send an HTTP/1.0 request, which solves the problem of
getting stuck at the end of each chunk (1.7 MB).
2. Rewrite every YouTube video URL and strip the "range" parameter.
3. Use store_url_rewrite and store id, itag and range.

Problems:
1. I don't know how this can be done in nginx.
2. This is really the best solution and avoids any complication with
the "range" thing, but users who jump to some part of the video will be
returned to the first second of the video.
This is one hell of a major problem for users!
With the "begin" range that YouTube was using, you could just pass the
request to YouTube without any caching involved.
So another solution is to cache the whole video for requests with a
range of "13-XX": a "start of the video" request triggers a download of
the whole video, and all other chunks are passed through without any
caching.
The problem is that I haven't had the time yet to check what happens
when rewriting the basic "range" that starts with 13, and then to see
how the YouTube player accepts the response of the "full" file without
a range.
3. This is one of the best choices, but I haven't had the time to try
it yet.
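
A minimal sketch of the "13-XX" idea from problem 2, in Python, only to
make the decision logic concrete. It is untested against the real
player: whether YouTube accepts a full file in place of a ranged chunk
is exactly the open question above. The function name and the example
host are hypothetical; the start offset 13 comes from the "13-XX"
observation.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: a "start of the video" request has a range beginning at 13.
START_OFFSET = 13


def rewrite_start_request(url):
    """Strip "range" from a start-of-video URL.

    Returns (url, cacheable). Any other URL is returned unchanged with
    cacheable=False, meaning it should be passed through uncached.
    """
    parts = urlsplit(url)
    pairs = parts.query.split("&") if parts.query else []

    # Work on the raw query string so the other arguments keep their
    # original encoding and order.
    range_value = next(
        (p[len("range="):] for p in pairs if p.startswith("range=")), ""
    )
    start = range_value.split("-", 1)[0]
    if not start.isdigit() or int(start) != START_OFFSET:
        return url, False

    kept = [p for p in pairs if not p.startswith("range=")]
    return urlunsplit(parts._replace(query="&".join(kept))), True
```

With this, a request ending in "&range=13-1781759" would be fetched and
cached as the whole video, while a later chunk or a seek would go
straight to YouTube.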

> --------------------------------------------------------------------------------
>
> # youtube 360p itag=34 ,480p itag=35 [ITAG/ID/RANGE]
> if (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*(itag=[0-9]*).*(id=[a-zA-Z0-9]*).*(range\=[0-9\-]*)/)
> {
> print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $3 . "&" .
> $4 . "\n";
>
> # youtube 360p itag=34 ,480p itag=35 [ID/ITAG/RANGE]
> } elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*(id=[a-zA-Z0-9]*).*(itag=[0-9]*).*(range\=[0-9\-]*)/)
> {
> print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $2 . "&" .
> $4 . "\n";
>
> # youtube 360p itag=34 ,480p itag=35 [RANGE/ITAG/ID]
> } elsif (m/^http:\/\/([0-9.]{4}|.*\.youtube\.com|.*\.googlevideo\.com|.*\.video\.google\.com)\/.*(range\=[0-9\-]*).*(itag=[0-9]*).*(id=[a-zA-Z0-9]*)/)
> {
> print $x . "http://video-srv.youtube.com.SQUIDINTERNAL/" . $4 . "&" .
> $2 . "\n";
> }
> --------------------------------------------------------------------------------------
>
> I already discovered that rewriting them and saving them as
> videoplayback?id=0000000&range=00-000000 would solve the problem, but
> the thing is that the cache folder grows faster, because we are not
> saving one file but multiple files for one ID!
>
> As for me, it saves a lot of bandwidth at the cost of a bigger cache.
> If you check and analyze it more, you will notice that for the same ID
> or the same video the link changes while watching, for example:
>
> It starts as [ITAG/ID/RANGE], then changes to [ID/ITAG/RANGE] and
> finally to [RANGE/ITAG/ID], so with my script you can capture all of
> these orderings!
This has a pretty simple solution: use one store_url syntax and order.
I don't know why you didn't use the itag at all in your store_url
rewriting.
You should store "itag", "id" and "range" in a specific order to
maintain a "consistent" cache syntax.

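A minimal sketch of such a fixed-order store key, in Python rather than
the Perl of the quoted script. It parses the query arguments by name,
so the order in which they appear in the URL does not matter. The
function name and the example host are hypothetical; the internal host
name is the one from the quoted script.

```python
from urllib.parse import urlsplit, parse_qs

# Internal host name taken from the quoted script.
STORE_HOST = "video-srv.youtube.com.SQUIDINTERNAL"
# One fixed order for the store key: itag, id, range.
KEY_ORDER = ("itag", "id", "range")


def store_key(url):
    """Return a canonical store URL, or None if an argument is missing."""
    query = parse_qs(urlsplit(url).query)
    parts = []
    for name in KEY_ORDER:
        values = query.get(name)
        if not values:
            return None  # not a ranged video request, leave it alone
        parts.append(f"{name}={values[0]}")
    return f"http://{STORE_HOST}/" + "&".join(parts)
```

All three orderings ([ITAG/ID/RANGE], [ID/ITAG/RANGE], [RANGE/ITAG/ID])
then map to the same key, and different itags of the same id no longer
collide.
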
About the ranges themselves: I found that they advance in a specific
order, which makes them pretty easy to cache. The parts that are not in
the regular 1.7 MB playback chunk order don't add up to much, because
most users use basic playback without any "jumping" in the middle.
If you lose some disk space (duplicated by users who jump randomly in a
video) in exchange for bandwidth savings, it is worth it most of the
time.

About the order of id, itag and range in the URL: because I was using
nginx, I didn't have this problem at all.
nginx uses the info from the URL args simply and smoothly.
Generally, store_url_rewrite has much more potential to be cache
effective than nginx proxy_store, as nginx proxy_store is a permanent
store mechanism without any time limit calculation.
As of now, nginx has the option to integrate with Perl, which can be
used for many things such as request manipulation.

Another option I was thinking of is using ICAP to rewrite the URL or do
some other stuff.

But so far nginx has been fine, so I have kept working with it.

Regards,
Eliezer

>
>
> Ghassan
>
> On 4/25/12, Eliezer Croitoru<eliezer_at_ngtech.co.il> wrote:
<snip>

-- 
Eliezer Croitoru
https://www1.ngtech.co.il
IT consulting for Nonprofit organizations
eliezer <at> ngtech.co.il
Received on Thu Apr 26 2012 - 01:41:17 MDT
