Re: Handling aliased download sites

From: Bert Driehuis <bert_driehuis@dont-contact.us>
Date: Mon, 10 Apr 2000 09:37:56 +0200 (CEST)

On Mon, 10 Apr 2000, Dancer wrote:

> However, let's download...oh, I don't know...random.exe from a server.
> http://www.somewhere.com/random.exe
>
> We store that object.
>
> Someone else comes along and gets
> http://msdownload.com/4598AB024985700FF/Considerable_unpredictable_guff/RANDOM.EXE

Yeah, that's why I'd propose having the administrator specify a list of
sites known to have identical content. Squid would then rewrite the URL to
a "common" form just before taking the MD5, without touching the actual
download or storage process.

It could look like this in squid.conf:

multisite 1 ^http://ms....\.www\.connxion\.com/
multisite 1 ^http://msdownload\.somesite\.be/
multisite 1 ^http://msdownload\.somesite\.ch/

where the entire matched portion of the URL would be replaced by a string
of the form "\001multisite1\001" for MD5 generation purposes. The Control-A
characters are not legal in a URL, so the rewritten string can never
collide with a real, valid URL.
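
To make the rewriting step concrete, here is a minimal sketch in Python
(not Squid's own C code). It assumes the "multisite" lines have already been
parsed into (group, pattern) pairs. The names MULTISITE_RULES,
normalize_for_key and cache_key are made up for illustration, and "multisite"
itself is only the directive proposed here, not an existing squid.conf option.

```python
import hashlib
import re

# Hypothetical parsed form of the proposed "multisite" lines:
# (group number, compiled pattern). Patterns are copied from the example.
MULTISITE_RULES = [
    (1, re.compile(r"^http://ms....\.www\.connxion\.com/")),
    (1, re.compile(r"^http://msdownload\.somesite\.be/")),
    (1, re.compile(r"^http://msdownload\.somesite\.ch/")),
]


def normalize_for_key(url):
    """Replace the matched part of the URL with a per-group marker.

    The marker is wrapped in Control-A (\\x01) characters, which cannot
    appear in a valid URL, so the result never equals a real URL.
    """
    for group, pattern in MULTISITE_RULES:
        match = pattern.match(url)
        if match:
            return "\x01multisite%d\x01" % group + url[match.end():]
    # No rule matched: the key is computed from the URL unchanged.
    return url


def cache_key(url):
    """MD5 of the normalized URL; the fetch itself still uses the real URL."""
    return hashlib.md5(normalize_for_key(url).encode("utf-8")).hexdigest()
```

With these rules, http://msdownload.somesite.be/files/RANDOM.EXE and
http://msdownload.somesite.ch/files/RANDOM.EXE hash to the same key, while a
URL on an unlisted host keeps its own key. Note that only the matched prefix
is unified: the rest of the path still has to be identical on every mirror.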

I completely agree that Squid should not second-guess the web designer, and
that the burden of verifying that sites have identical content should be
on the cache admin.

Cheers,

                                -- Bert

Bert Driehuis, MIS -- bert_driehuis@nl.compuware.com -- +31-20-3116119
Every nonzero finite dimensional inner product space has an
orthonormal basis. It makes sense, when you don't think about it.
Received on Mon Apr 10 2000 - 01:38:09 MDT
