Re: same URL, different encoding

From: Duane Wessels <wessels@dont-contact.us>
Date: Wed, 14 Jul 1999 08:50:43 -0600

On 14 Jul 1999, Miquel van Smoorenburg wrote:

> If I have the following URL:
>
> http://www.cistron.nl/~miquels/
>
> I can encode it as:
>
> http://www.cistron.nl/~miquels/
> http://www.cistron.nl/%7emiquels/
> http://www.cistron.nl/%7Emiquels/

I believe the unencoded '~' is illegal in a URL.

>
> Should squid realize that these are the same URLs and cache them only
> once, or should squid treat them as 3 different URLs ? I think the
> latter is what it does now.

Squid does not rewrite URLs and does not try to figure out if two URLs
are equivalent. We take a conservative approach, so as to minimize the
chance of guessing wrong.

It's a tradeoff. By trying to be "smart" we may gain some cache hits, but
we also might lose some correctness.
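
To make the tradeoff concrete, here is a rough sketch in Python. It is not
Squid code; the names opaque_key and normalized_key are made up for this
illustration, and the "smart" variant handles only the '%7E' case from the
question.

```python
import re

# The three spellings from the question above.
urls = [
    "http://www.cistron.nl/~miquels/",
    "http://www.cistron.nl/%7emiquels/",
    "http://www.cistron.nl/%7Emiquels/",
]

def opaque_key(url):
    # Conservative approach: the URL string is used as the key, unchanged.
    return url

def normalized_key(url):
    # Hypothetical "smart" approach: fold '%7e' and '%7E' back to '~'.
    return re.sub(r"%7[eE]", "~", url)

# Number of distinct cache entries each approach would create.
opaque_count = len({opaque_key(u) for u in urls})
normalized_count = len({normalized_key(u) for u in urls})
```

The opaque keys give three entries and the normalized keys give one. The
risk is in generalizing the rule: decoding '%2F' to '/', for instance,
changes the path structure, so a cache that guesses wrong could serve the
wrong object.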

I could instead argue that your origin server should send HTTP redirects
for the equivalent forms. So '~' and '%7e' get redirected to a URL with '%7E'.
Then squid only caches the response under a single URL, and you can figure out
(from the referer field) where the bad forms are being used.
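
A minimal sketch of that redirect logic, assuming the origin server can
run a small hook on each request path. The function names and the choice
of a 301 status are assumptions for illustration, not the configuration
of any particular server.

```python
import re

def canonical_path(path):
    # Rewrite '~' and lowercase '%7e' to the single agreed form '%7E'.
    # The match is case-sensitive, so an existing '%7E' is left alone.
    return re.sub(r"~|%7e", "%7E", path)

def redirect_for(path):
    # Return (status, location) when the request used a non-canonical
    # form, or None when the path is already canonical and should be
    # served normally.
    canon = canonical_path(path)
    if canon == path:
        return None
    return (301, canon)
```

With this, '/~miquels/' and '/%7emiquels/' both redirect to '/%7Emiquels/',
and only the last form is answered with content.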

Duane W.
Received on Wed Jul 14 1999 - 08:46:19 MDT
