Re: Integrating FTP-Mirror

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Tue, 10 Nov 1998 22:28:07 +0100

Alex Rousskov wrote:

> True. ICP will work for up to 3-5 mirrors though. Mirrors are
> ideal application for Cache Digests since mirrored content is
> static and does not require frequent updates. Building and
> distributing a mirror digest is even simpler than building an
> index and listening/replying to ICP queries, I think.

Not always true. To distribute a digest you have to build and update a
complete index. To answer ICP queries it is sufficient to have some
means of looking the object up relatively quickly, like checking if the
file is there or if we have an up-to-date mirror of the site. FTP
mirrors are usually composed of several different parts, some of which
do the mirroring, and others which are client access methods (FTP,
HTTP, NFS, whatever). There is rarely a complete and up-to-date index
of a large mirror server.

My idea here was this: if you have a large mirror server based on
standard mirroring tools building a filesystem-based mirror, then adding
two small components to it is a very easy way to make it accessible
from Squid.
* An HTTP server that accepts "proxy-style" requests and fetches the
objects from the local filesystem. Any server which can map ftp://* to
/mirror/* is sufficient for this task (even the old CERN/W3C HTTP server
is capable of this).
* An ICP server that listens for queries and answers whether the file
exists in the mirror filesystem or not. This is only a minimal variation
of the icpserver.pl script delivered with Squid.
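
To make the second component concrete, here is a rough sketch of such an
ICP responder. It is written in Python rather than as a patch to
icpserver.pl, and it is only an illustration under stated assumptions:
the mirror root /mirror, the layout /mirror/<host>/<path>, and the names
url_to_path, build_reply and serve are all hypothetical. The packet
layout follows ICPv2 as described in RFC 2186 (20-byte header, query
payload of a 4-byte requester address followed by a NUL-terminated URL).

```python
import os
import socket
import struct
from urllib.parse import urlsplit, unquote

ICP_OP_QUERY, ICP_OP_HIT, ICP_OP_MISS = 1, 2, 3
ICP_VERSION = 2
# opcode, version, length, request number, options, option data, sender address
ICP_HEADER = struct.Struct("!BBHIII4s")  # 20 bytes, network byte order

MIRROR_ROOT = "/mirror"  # assumption: mirrors live in /mirror/<host>/<path>


def url_to_path(url, root=MIRROR_ROOT):
    """Map ftp://host/path to root/host/path, or None if it cannot be mapped."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return None
    if parts.scheme != "ftp" or not host:
        return None
    root = os.path.abspath(root)
    rel = unquote(parts.path).lstrip("/")
    candidate = os.path.normpath(os.path.join(root, host, rel))
    # Refuse anything that escapes the mirror root (e.g. via "..").
    if not candidate.startswith(root + os.sep):
        return None
    return candidate


def build_reply(query, root=MIRROR_ROOT):
    """Return an ICP HIT or MISS packet for an ICP query, or None to ignore it."""
    if len(query) < ICP_HEADER.size + 4:
        return None
    opcode, version, _len, reqnum, _opts, _optdata, _sender = \
        ICP_HEADER.unpack_from(query)
    if opcode != ICP_OP_QUERY or version != ICP_VERSION:
        return None
    # Skip the 4-byte requester address, then read up to the terminating NUL.
    raw_url = query[ICP_HEADER.size + 4:].split(b"\0", 1)[0]
    path = url_to_path(raw_url.decode("latin-1"), root)
    hit = path is not None and os.path.isfile(path)
    payload = raw_url + b"\0"
    return ICP_HEADER.pack(
        ICP_OP_HIT if hit else ICP_OP_MISS, ICP_VERSION,
        ICP_HEADER.size + len(payload), reqnum, 0, 0, b"\0\0\0\0",
    ) + payload


def serve(host="0.0.0.0", port=3130, root=MIRROR_ROOT):
    """Answer ICP queries over UDP forever (3130 is the usual ICP port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        while True:
            data, peer = sock.recvfrom(16384)
            reply = build_reply(data, root)
            if reply is not None:
                sock.sendto(reply, peer)
```

Calling serve() starts the responder. Note that the only lookup is a
single os.path.isfile() call per query, so no index of the mirror is
ever built. The same url_to_path mapping is what the HTTP component
would use to find the object to return.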

It is also not very likely that any cache will benefit from peering with
more than 2 such servers for a given FTP site.

Building a cache digest of such a server requires both more work to code
it up, and more CPU time and I/O to rebuild the digest.

Another major drawback of cache digests applied here is that Squid does
not yet handle false hits. This requires the HTTP access point to the
FTP mirror to be a full proxy server, or things may go really badly
("randomly" selected URLs get sent to the FTP mirror, which then returns
errors). Using ICP effectively eliminates false hits in this case
(especially unrelated false hits), and thus no proxy server is needed on
the mirror.
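
To illustrate where the false hits come from: a cache digest is a Bloom
filter, so a lookup can say "yes" for a URL that was never added. The
sketch below is a generic Bloom filter, not Squid's actual digest code.
The class name Digest, the SHA-256 based hashing, the example URLs and
the sizing of 5 bits per entry are all assumptions chosen for
illustration.

```python
import hashlib


class Digest:
    """A tiny Bloom filter standing in for a cache digest (illustrative only)."""

    HASHES = 4  # number of bit positions set per URL

    def __init__(self, bits):
        self.bits = bits
        self.bitmap = bytearray((bits + 7) // 8)

    def _positions(self, url):
        # Derive four bit positions from four 32-bit slices of one hash.
        h = hashlib.sha256(url.encode("utf-8")).digest()
        for i in range(self.HASHES):
            yield int.from_bytes(h[4 * i:4 * i + 4], "big") % self.bits

    def add(self, url):
        for p in self._positions(url):
            self.bitmap[p // 8] |= 1 << (p % 8)

    def __contains__(self, url):
        return all(self.bitmap[p // 8] & (1 << (p % 8))
                   for p in self._positions(url))


# Building the digest means enumerating every object in the mirror.
mirrored = [f"ftp://ftp.example.org/pub/file{i}.tar.gz" for i in range(1000)]
digest = Digest(bits=5 * len(mirrored))
for url in mirrored:
    digest.add(url)

# URLs that have nothing to do with the mirror can still match.
unrelated = [f"http://www.example.com/page{i}.html" for i in range(10000)]
false_hits = [url for url in unrelated if url in digest]
print(f"{len(false_hits)} of {len(unrelated)} unrelated URLs look like hits")
```

Every URL in false_hits is one that a digest-using cache would forward
to the mirror even though the mirror has never heard of it. An ICP query
for the same URL is answered by an actual lookup and so comes back as a
miss.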

/Henrik
Received on Tue Nov 10 1998 - 14:41:15 MST
