Re: [squid-users] anyone knows some info about youtube "range" parameter? from Eliezer Croitoru on 2012-04-26 (squid-users)

From: Eliezer Croitoru <eliezer_at_ngtech.co.il>
Date: Fri, 27 Apr 2012 07:43:59 +0300

On 25/04/2012 20:48, Hasanen AL-Bana wrote:
> wouldn't be better if we save the video chunks ? youtube is streaming
> files with 1.7MB flv chunks, youtube flash player knows how to merge
> them and play them....so the range start and end will alaways be the
> same for the same video as long as user doesn't fast forward it or do
> something nasty...even in that case , squid will just cache that
> chunk...that is possible by rewriting the STORE_URL and including the
> range start& end
>
> On Wed, Apr 25, 2012 at 8:39 PM, Ghassan Gharabli
> <sounarose_at_googlemail.com> wrote:
<SNIP>

I have written a small Ruby store_url_rewrite helper that works with the
range argument in the URL.
(It is at the bottom of this mail.)

It's written in Ruby, and I took some of Andre's work from
http://youtube-cache.googlecode.com

It's not a fancy script, and it is meant only for this specific YouTube
problem.

I know that YouTube hasn't changed this range behavior for the whole
globe, because right now I'm working from a remote location that still
has no "range" at all in the URL.
So in the same country you can get two different URL patterns.

This script is not CPU friendly (it always runs the same number of regex
lookups on every request), but it's not what will bring your server down!

This is only a prototype, and if anyone wants to add more domains
and patterns I will be more than glad to make this script better than
it is now.

This is one hell of a nasty regex script. I could have used the uri
and cgi libs to make the script more user friendly, but I chose
to just build the script skeleton and move on from there using the basic
methods and classes of Ruby.
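
For comparison, here is a rough sketch of what the library-based route could look like. It is not part of the script below. It uses only the standard uri library (URI.decode_www_form stands in for the cgi lib), and the method name store_key_from_query and the sample URL are made up for illustration.

```ruby
require "uri"

# Build a store key from the query string with the uri library instead of
# regexes. Returns nil when the URL cannot be parsed or an argument is missing.
def store_key_from_query(url)
  query = URI.parse(url).query
  return nil if query.nil?

  # decode_www_form gives [[key, value], ...]; to_h keeps the last value per key.
  params = URI.decode_www_form(query).to_h
  id, itag, range = params["id"], params["itag"], params["range"]
  return nil if id.nil? || itag.nil? || range.nil?

  "http://video-srv.youtube.com.SQUIDINTERNAL/id_#{id}_itag_#{itag}_range_#{range}"
rescue URI::InvalidURIError
  nil
end

# Hypothetical sample URL, for illustration only.
sample = "http://cache.example.com/videoplayback?itag=34&id=abc123&range=0-1781759"
puts store_key_from_query(sample)
```

The trade-off is that URI.parse is strict and rejects some URLs that real clients send, which is why the sketch falls back to nil instead of raising.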

The idea of this script is to extract each of the arguments, such as id,
itag and range, one by one rather than with a single regex for all of
them, because YouTube uses a couple of different URL structures.
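
To illustrate why one regex per argument helps, here is a small sketch using the same three patterns as the script. The two sample URLs are hypothetical and differ only in host and argument order; the helper name extract_args is made up for this example.

```ruby
# Same patterns as in the script at the bottom of this mail.
IDRX    = /.*(id\=)([A-Za-z0-9]*).*/
ITAGRX  = /.*(itag\=)([0-9]*).*/
RANGERX = /.*(range\=)([0-9\-]*).*/

# Pull out each argument on its own; a missing argument comes back as nil.
def extract_args(url)
  [url[IDRX, 2], url[ITAGRX, 2], url[RANGERX, 2]]
end

# Hypothetical URLs: same video chunk, different host and argument order.
a = "http://host-a.example.com/videoplayback?id=abc123&itag=34&range=0-1781759"
b = "http://host-b.example.com/videoplayback?range=0-1781759&itag=34&id=abc123"

p extract_args(a)
p extract_args(b)
# A URL without "range" yields nil in the last slot.
p extract_args("http://host-a.example.com/videoplayback?id=abc123&itag=34")
```

Both sample URLs give the same three values, so they map to the same stored object. Note that these loose patterns also match "id=" at the end of a longer argument name, so they are only as safe as the URLs they are fed.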

If someone can help me reorganize this script so it is more flexible
for other sites, with numbered cases per site\domain\url_structure,
I will be happy to get any help I can.
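
One possible shape for that reorganization is a numbered rule table, sketched below. Everything here is an assumption for discussion: the Rule struct, the rewrite method, the stricter [?&] patterns, and the mirror.example.com rule (a placeholder, not a real site pattern).

```ruby
# A rule has a case number, a site label, a host pattern, and a builder
# that returns a store URL, or nil when the URL layout does not fit.
Rule = Struct.new(:number, :site, :host_rx, :builder)

RULES = [
  Rule.new(1, "youtube", /\.youtube\.com\z/, lambda { |url|
    id    = url[/[?&]id=([A-Za-z0-9]+)/, 1]
    itag  = url[/[?&]itag=([0-9]+)/, 1]
    range = url[/[?&]range=([0-9\-]+)/, 1]
    if id && itag && range
      "http://video-srv.youtube.com.SQUIDINTERNAL/id_#{id}_itag_#{itag}_range_#{range}"
    end
  }),
  # Placeholder rule: collapse all mirrors of a hypothetical site into one object.
  Rule.new(2, "example-mirror", /\.mirror\.example\.com\z/, lambda { |url|
    path = url[%r{\Ahttp://[^/]+(/[^?]+)}, 1]
    path && "http://mirror.example.com.SQUIDINTERNAL#{path}"
  })
]

# Try the rules in order; fall back to the original URL when none applies.
def rewrite(url)
  host = url[%r{\Ahttp://([^/:]+)}, 1]
  return url if host.nil?

  RULES.each do |rule|
    next unless host =~ rule.host_rx
    result = rule.builder.call(url)
    return result if result
  end
  url
end

puts rewrite("http://v1.cache.youtube.com/videoplayback?itag=34&id=abc&range=0-99")
puts rewrite("http://eu.mirror.example.com/files/app.tar.gz?x=1")
```

Adding a site then means adding one numbered entry, and the number can be logged to see which case matched.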

Planned additions to this script for now:
SourceForge: catch all download mirrors into one object
IMDb HQ (480p and up) videos
Vimeo videos

If more than one person wants them:
blip.tv
some Facebook videos
some other image storage sites

If you want me to add anything to my "try to cache" list, I will be happy
to hear from you by e-mail.

Regards,
Eliezer

##code start##
#!/usr/bin/ruby
require "syslog"

class SquidRequest
attr_accessor :url, :user
attr_reader :client_ip, :method

         def method=(s)
                 @method = s.downcase
         end

         def client_ip=(s)
                 @client_ip = s.split('/').first
         end
end

def read_requests
         # URL <SP> client_ip "/" fqdn <SP> user <SP> method [<SP> kvpairs]<NL>
         STDIN.each_line do |ln|
                 r = SquidRequest.new
                 r.url, r.client_ip, r.user, r.method, *dummy = ln.rstrip.split(' ')
                 (STDOUT << "#{yield r}\n").flush
         end
end

def log(msg)
Syslog.log(Syslog::LOG_ERR, "%s", msg)
end

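def main
  Syslog.open('nginx.rb', Syslog::LOG_PID)
  log("Started")

  # Each argument gets its own regex because YouTube uses several URL
  # structures and the argument order is not fixed.
  idrx = /.*(id\=)([A-Za-z0-9]*).*/
  itagrx = /.*(itag\=)([0-9]*).*/
  rangerx = /.*(range\=)([0-9\-]*).*/

  read_requests do |r|
    id = r.url[idrx, 2]
    itag = r.url[itagrx, 2]
    range = r.url[rangerx, 2]

    # Some URLs carry no "range" (or no id/itag) at all. Hand those back
    # unchanged instead of crashing on a nil match.
    next r.url if id.nil? || itag.nil? || range.nil?

    newurl = "http://video-srv.youtube.com.SQUIDINTERNAL/id_#{id}_itag_#{itag}_range_#{range}"

    log("YouTube Video [#{newurl}].")

    newurl
  end
end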

main
##code end##
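
For trying the line protocol without a running Squid, something like the sketch below may help. It is a simplified stand-in for the read_requests loop above, not the script itself: each_request, the StringIO streams and the sample input line are assumptions made for this example.

```ruby
require "stringio"

# Read helper lines of the form "URL client_ip/fqdn user method [kvpairs]"
# from input and write one reply line per request to output.
def each_request(input, output)
  input.each_line do |ln|
    url, client, user, method, * = ln.rstrip.split(' ')
    # Answer a blank line with an empty reply so replies stay in step.
    reply = url.nil? ? "" : yield(url, client.to_s.split('/').first, user, method.to_s.downcase)
    output << "#{reply}\n"
    output.flush
  end
end

# Hypothetical input line, as Squid would send it on STDIN.
input = StringIO.new("http://a.example.com/videoplayback?id=abc 192.0.2.10/- - GET\n")
output = StringIO.new
each_request(input, output) { |url, ip, _user, method| "#{method} #{ip} #{url}" }
print output.string
```

Swapping the StringIO objects for STDIN and STDOUT gives the same loop shape as the real helper.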

-- 
Eliezer Croitoru
https://www1.ngtech.co.il
IT consulting for Nonprofit organizations
eliezer <at> ngtech.co.il

Received on Fri Apr 27 2012 - 04:44:10 MDT
