Re: Save 15% on your bandwidth...

From: David J N Begley <david@dont-contact.us>
Date: Tue, 17 Sep 1996 12:57:42 +1000 (EST)

On Sun, 15 Sep 1996 13:42:03 +0100, Gunnar Ingvi Thorisson wrote:

>> I recommend having a list (or lists) that the proxy can get automatically
>> that would give the information of who is mirroring what.
> I don't think that is possible, how can squid figure out which site is
> closer (maybe by domain extensions)? We are talking about too
> many mirrors / sites / files that should follow each rule, setting up
> tables with rules of where it should get each file from each location
> in the world is too much work.

Agreed. Besides, it just generates too much administrative overhead for
something that, let's face it, is best left to the computers since they're
designed for all this tiresome repetitive stuff anyway! ;-)

> We are dealing with this problem when the same files are being
> retrieved from many sites (mostly big files). I think the best
> solution for this now is running checksum (maybe MD5) on big ftp
> files (not the path) name / date / time / size / (extension (.exe /
> .zip)) and if that matches to the file that was retrieved from another
> server before and is still in the cache it returns that file.

I like the sound of this idea - 'cept that if the remote servers do not
already provide an MD5 digest in the return header of each object, say in
response to a "HEAD" request, then the proxy will have to download the
entire file anyway just to generate the digest for comparison. :-(
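
To make that trade-off concrete, here is a minimal sketch in Python (not
Squid code) of a cache indexed by MD5 digest. Everything in it is an
assumption for illustration: the DigestCache class, the fetch_via_cache
helper, the download callback, and the idea that a server-supplied digest
arrives as a hex string. The point it shows is only that without a digest
in the header, the proxy has to pull the whole body before it can compare.

```python
import hashlib


class DigestCache:
    """Hypothetical cache index keyed by the MD5 digest of the object body."""

    def __init__(self):
        self._by_digest = {}

    def store(self, url, body):
        # The digest can only be computed once the entire body is in hand.
        digest = hashlib.md5(body).hexdigest()
        self._by_digest.setdefault(digest, (url, body))
        return digest

    def lookup(self, digest):
        # Returns (original_url, body), or None on a miss.
        return self._by_digest.get(digest)


def fetch_via_cache(cache, url, header_digest, download):
    """Return (body, downloaded).

    header_digest is a digest the remote server volunteered (assumed hex
    here), or None if it sent nothing. download is a hypothetical callable
    that fetches the full body for a URL.
    """
    if header_digest is not None:
        hit = cache.lookup(header_digest)
        if hit is not None:
            return hit[1], False  # same content cached from another site
    # No digest from the server (or no match): fetch everything anyway.
    body = download(url)
    cache.store(url, body)
    return body, True
```

With a digest supplied by the second mirror, the second request is served
from the cache; with header_digest=None the download happens regardless,
which is exactly the ":-(" case.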

> This should be an option in squid.conf also running a checksum on
> one/some/all of the possible checks (that is date / time / file name /
> extension etc..).

Since transient external attributes, like "date/time modified", can be
changed just by moving a file around, they make poor candidates for
determining any correlation between two files. Whilst "file name"
falls in the same category, it's generally (crosses fingers..) less likely
to change between two instances of essentially the same file.

Further, when considering "file name", I wouldn't want to consider a
separate "extension", since that's a DOS-ism for which there is no exact
equivalent across all platforms (a "dot-something" don't count); a neat-o
RE pattern matcher would be much nicer. ;-)
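
As a rough sketch of what I mean (Python, purely illustrative): match the
whole file name against administrator-supplied regular expressions rather
than splitting off an "extension", and pair the name with the size while
ignoring the modification date. The PATTERNS list and the candidate_key
function are hypothetical names, not anything that exists in squid.conf.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Hypothetical admin-supplied patterns, applied to the whole file name.
PATTERNS = [
    re.compile(r".*\.(zip|exe)", re.IGNORECASE),
    re.compile(r".*\.tar\.(gz|Z)"),
]


def candidate_key(url, size):
    """Return a (file name, size) key for mirror comparison, or None.

    The path and host are ignored, so the same file on two mirrors maps
    to the same key. The modification date is deliberately left out,
    since it can change when a file is merely moved around.
    """
    name = posixpath.basename(urlsplit(url).path)
    if any(p.fullmatch(name) for p in PATTERNS):
        return (name, size)
    return None
```

Two URLs that yield the same key are only candidates for being the same
file; a digest comparison would still be needed to confirm it.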

Cheers..

dave
Received on Mon Sep 16 1996 - 19:57:49 MDT
