Re: [RFC] One helper to rewrite them all

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 12 Sep 2012 11:05:31 +1200

On 12.09.2012 09:52, Alex Rousskov wrote:
> On 09/11/2012 01:08 AM, Eliezer Croitoru wrote:
>> On 09/11/2012 01:52 AM, Amos Jeffries wrote:
>
>>> url-rewrite, store-url,
>>> and location-rewrite (also not ported) all have the same API
>
>> This is the first time I have heard about location-rewrite and it's a
>> nice and interesting interface.
>> it can be helpful for cdn stuff if i got it right.
>> But this is not for now.
>
>
> Hm... I wonder if we are making a design mistake here by following
> Squid2 steps: one helper to rewrite request URL, one helper to rewrite
> store URL, then one helper to rewrite some special HTTP header, etc.
> Would it be better to extend (in a backward compatible way) the URL
> rewriter interface so that ONE helper can do all rewriting that is
> needed (today and tomorrow)?
>
> [ Well, you do need two helpers, one for requests and one for responses,
> but you get the idea. ]
>
> For example, a helper that knows about the enhancement may get a single
> request from Squid and return something like this:
>
> _response_lines: 2
> store_uri:http://foo/....
> request_uri:302:http://bar/....
>
> and Squid would know to parse the two additional response lines and act
> accordingly.
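
To make the quoted proposal concrete, here is a minimal Python sketch of how such a reply could be parsed. It assumes only the three-line shape shown in the quote; the function name `parse_helper_reply`, the fallback for helpers that do not send a `_response_lines` header, and the returned dictionary layout are illustrative assumptions, not anything Squid implements.

```python
def parse_helper_reply(lines):
    """Parse a reply in the hypothetical multi-line format quoted above.

    Returns a dict. "request_uri" maps to (status_or_None, url);
    every other key maps to its raw string value.
    """
    it = iter(lines)
    header = next(it).strip()
    name, _, count = header.partition(":")
    if name.strip() != "_response_lines":
        # Assumed fallback: an old-style helper that answers with just
        # one line, taken to be the rewritten request URL.
        return {"request_uri": (None, header)}

    result = {}
    for _ in range(int(count)):
        key, _, value = next(it).strip().partition(":")
        if key == "request_uri":
            # An optional numeric status (e.g. 302) may precede the URL.
            status, sep, rest = value.partition(":")
            if sep and status.isdigit():
                result[key] = (int(status), rest)
            else:
                result[key] = (None, value)
        else:
            result[key] = value
    return result


reply = parse_helper_reply([
    "_response_lines: 2",
    "store_uri:http://foo/a",
    "request_uri:302:http://bar/b",
])
```

The fallback branch is what would keep such an extension backward compatible: a helper that knows nothing about the new header is still read as a plain URL rewriter.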

The two-URL response can be done with the squid-3.3 helper key-value
format I'm working on, using one key for each URL. (Alex: I have one more
test to run through it when I can find the time, then a final-stage audit.)

OR, by requiring rewriters to implement concurrency, one channel for
each URL with a type marker.
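
A rough Python sketch of the concurrency variant, seen from the helper side. It relies on the existing concurrency convention that each request line starts with a channel-ID which the helper echoes back in its reply. The type marker itself (the words `rewrite` and `store` as the second token) and the function name `answer` are assumptions made up for illustration; no marker syntax has been decided.

```python
def answer(line, rewrite_url, store_url):
    """Answer one concurrent helper request.

    Assumed request shape: "<channel-id> <type> <url> [extras...]",
    where <type> is a hypothetical marker, "rewrite" or "store".
    The reply is "<channel-id> <result>"; an empty result is taken
    to mean "leave the URL unchanged".
    """
    parts = line.split()
    if len(parts) < 3:
        # Malformed request: echo the channel-ID (if any) with no result.
        return parts[0] if parts else ""
    channel, kind, url = parts[0], parts[1], parts[2]
    if kind == "store":
        result = store_url(url)
    elif kind == "rewrite":
        result = rewrite_url(url)
    else:
        result = ""  # unknown marker: change nothing
    return f"{channel} {result}".rstrip()


# Example: drop the query string for the store URL, leave request URLs alone.
out = answer(
    "0 store http://foo/v?id=1 10.0.0.1/- - GET",
    rewrite_url=lambda u: "",
    store_url=lambda u: u.split("?")[0],
)
```

With this shape one helper process serves both purposes, and Squid decides per request which of the two lookups it actually needs to send.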

However, we also want to back-port this feature into 3.2 (3.1?) and
maintain upgrade compatibility with the 2.7 implementation, at least to
start with.

It is also questionable whether the internal-rewriter feature, when it
gets prioritized again, will replace the store-URL helper completely. The
use-cases here are very simple: "strip away some variant URL portion
with a regex mapping to a static URL", which is almost the perfect case
for an internal re-write.
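
As an illustration of that use-case, here is a small Python sketch of a regex-to-static-URL mapping. The two rules and all host names are invented examples, and `store_url` is a hypothetical name; this is not the internal rewriter's design, only the kind of mapping it would need to express.

```python
import re

# Hypothetical rules: (pattern, replacement template). The first rule that
# matches wins; URLs matching no rule are stored under their own URL.
RULES = [
    # Collapse numbered mirrors onto one static store URL.
    (re.compile(r"^http://mirror\d+\.example\.com/(.*)$"),
     r"http://mirror.example.com.internal/\1"),
    # Strip a varying query string from video URLs.
    (re.compile(r"^(http://video\.example\.com/[^?]+)\?.*$"),
     r"\1"),
]


def store_url(url):
    """Return the store URL for `url` according to RULES."""
    for pattern, template in RULES:
        new_url, count = pattern.subn(template, url)
        if count:
            return new_url
    return url


a = store_url("http://mirror3.example.com/a/b.iso")
b = store_url("http://video.example.com/clip.flv?session=abc")
c = store_url("http://other.example.org/x")
```

Since the whole job is a static table of pattern and replacement pairs, it needs no external process at all, which is why it fits an internal re-write so well.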

>
> AFAICT, the primary advantage of a single URL rewriter is performance.

Yes. However, using a helper at all is a performance loss, one that is
only covered by the HIT benefits of object de-duplication.

> The primary disadvantage is some loss of flexibility: the script has to
> be invoked once, without adaptation or other actions possible between
> the redirector and store URL rewriter invocations, for example.

Not a big loss of flexibility, since URL-rewrite already happens after
the adaptations. But with cache ACLs between re-write and store-URL, a
combined helper could waste a bunch of bandwidth on helper I/O for
non-cacheable things.

It also adds complexity, since storeurl_access and url_rewrite_access
would have to be merged and any collisions between them resolved.

>
> I am an ICAP/eCAP guy so I do not have much experience with URL helpers,
> but I thought I would bring this up before it is too late, in case
> others find it useful. The decision to go down the single-helper path
> should be made now, before we add another URL rewriter.

IMO backward compatibility and an easy upgrade from 2.7 override doing
this now. Migrating to this design later should be easy enough that we
need not worry.

Amos
Received on Tue Sep 11 2012 - 23:05:35 MDT
