Re: [squid-users] url_rewrite_program and Squid 2.6STABLE

From: Travis Derouin <travis@dont-contact.us>
Date: Tue, 16 Jan 2007 22:06:32 -0500

Hi,

Thanks for the info.

This is what our rewrite.pl script looks like:

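#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # unbuffered output, so Squid sees each reply immediately
my @servers = qw(10.234.169.206 10.234.169.205 10.234.169.196);

while (<>) {
    # Squid sends the URL first, followed by other fields; only the URL is needed.
    my ($url) = split;
    unless (defined $url) { print "\n"; next; }    # blank line: no rewrite
    if ($url !~ m{^http://www\.wikihow\.com(?:[:/]|$)}) {
        # Any other host: 301 redirect to the same path on www.wikihow.com.
        $url =~ s{^http://[^/]*}{http://www.wikihow.com};
        print "301:$url\n";
    } else {
        # Our own domain: rewrite to a randomly chosen back-end server.
        my $server = $servers[rand @servers];
        $url =~ s{^http://www\.wikihow\.com}{http://$server};
        print "$url\n";
    }
}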

The 10.x addresses are our back-end Apache servers (we're using
Squid for load balancing and caching). The script checks whether each
request is for a page on the www.wikihow.com domain. If it is not, the
script 301 redirects the client to the same page on www.wikihow.com;
otherwise it rewrites the request to a randomly chosen back-end. We
do this because we used to host wikihow on the wiki.ehow.com
subdomain and have since moved it, and it's important that we 301
redirect old URLs to the new www.wikihow.com domain for SEO
purposes.

Is a url_rewrite_program still needed to do this? If so, how can I
make it concurrent? If not, how can I configure Squid to issue the 301
redirects for pages requested from the other domains?
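
For reference, here is a rough sketch of what a concurrent version of
the helper might look like. It keeps the same redirect logic as
rewrite.pl and rests on one assumption about the protocol: with
url_rewrite_concurrency set in squid.conf, each request line starts
with a channel ID, and the reply must start with that same ID. The sub
names reply_for and serve are made up for this sketch, and the sample
request lines (including the field layout after the URL) are
illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Back-end addresses copied from rewrite.pl.
my @servers = qw(10.234.169.206 10.234.169.205 10.234.169.196);

# Build the reply for one request line.
# ASSUMPTION: the line looks like "ID URL <other fields>" and the
# reply must be prefixed with the same ID.
sub reply_for {
    my ($line) = @_;
    my ($id, $url) = split ' ', $line;
    return ''  unless defined $id;     # blank line: nothing to answer
    return $id unless defined $url;    # ID only: leave the URL unchanged

    if ($url !~ m{^http://www\.wikihow\.com(?:[:/]|$)}) {
        # Foreign host: 301 redirect to the same path on www.wikihow.com.
        $url =~ s{^http://[^/]*}{http://www.wikihow.com};
        return "$id 301:$url";
    }
    # Our own domain: pick a back-end at random.
    my $server = $servers[rand @servers];
    $url =~ s{^http://www\.wikihow\.com}{http://$server};
    return "$id $url";
}

# Answer every line read from $in on $out, one reply per request.
sub serve {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        print {$out} reply_for($line), "\n";
    }
}

# Demo on an in-memory handle. A real helper would set $| = 1 and
# call serve(\*STDIN, \*STDOUT) instead.
my $sample = "0 http://wiki.ehow.com/page2 1.2.3.4/- - GET\n"
           . "1 http://www.wikihow.com/page2 1.2.3.4/- - GET\n";
open my $in, '<', \$sample or die "cannot open in-memory handle: $!";
serve($in, \*STDOUT);
```

The first sample line should come back as a 301 to
http://www.wikihow.com/page2 on channel 0, and the second as a rewrite
to one of the three back-ends on channel 1. Because each reply carries
its ID, a single helper process could in principle answer many
outstanding requests.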

It seems that specifying a deny_info URL sends browsers a 302 redirect,
but it's essential that we send them a 301. It's also essential that
requests for www1.wikihow.com/page2 get 301 redirected to their
counterpart, www.wikihow.com/page2.

I'm not sure why this version of Squid is running out of rewrite
children. The only differences between this installation and the other
one are that we are using epoll and that it's on a 64-bit processor.
I'm not sure whether either affects anything. How much memory do the
helper instances use?

Thanks!
Travis

On 1/16/07, Henrik Nordstrom <henrik@henriknordstrom.net> wrote:
> On Tue, 2007-01-16 at 20:20 -0500, Travis Derouin wrote:
> > Hi,
> >
> > I have a few issues that I thought someone might have some advice on.
> >
> > I read in the documentation that url_rewrite_program was no longer
> > required as it was possible to force a domain name, ie. if Squid
> > received a request for www1.wikihow.com, you could issue a 301
> > redirect to www.wikihow.com.
>
> Yes. This is done by denying the request and using deny_info to redirect
> the browser to the correct URL.
>
> > http_port 80 defaultsite=www.wikihow.com vhost
> >
> > Is that correct, or should it be something else?
>
> The above says you are running an accelerator with domain based virtual
> host support, and requests from old HTTP/1.0 clients not sending Host
> headers will be processed as requests for the www.wikihow.com domain.
>
>
> > FATAL: Too many queued url_rewriter requests (72 on 12)
> >
> > This didn't seem to be a problem on our other server which was running
> > 2.6.STABLE3-20060825, which we just moved off of recently. The new
> > server is currently only getting half of the total traffic the old
> > server was receiving.
>
> Odd.. Squid-2.6 is the same as 2.5 there..
>
> > 2 questions: Could this be a problem particular to 2.6STABLE6? Are
> > there any serious drawbacks to having a lot of url_rewrite_children ?
>
> Main drawback is memory usage by the helper instances.
>
> What is the URL-rewriter used for? And what does it do? If it's just
> local processing of the URL with no external lookups then a single
> helper instance is sufficient, but requires the helper to be modified to
> use the "concurrent" protocol.
>
> > Ideally I'd like a configuration that doesn't rely on
> > url_rewrite_program, I haven't figured out the configuration for that
> > yet though.
>
> If you describe the purpose for which you use the URL rewriter helper,
> then we may be able to help you with that.
>
> Regards
> Henrik
>
>
>
Received on Tue Jan 16 2007 - 20:06:41 MST
