Re: Cache Digests Diffs

From: Alex Rousskov <rousskov@dont-contact.us>
Date: Thu, 15 Jul 1999 18:08:27 -0600 (MDT)

On Fri, 16 Jul 1999, Kevin Littlejohn wrote:

> Slightly different approach: Have you looked at http://rsync.samba.org/ ?
> This is a package that generates a diff without a second copy - it uses
> rolling checksums and other niftyness.

As far as I can tell, you have misunderstood the rsync algorithm. The
algorithm does require two copies, of course. It does not require the
_transfer_ of the second copy over a [slow] link to generate a diff.
In our case, both copies are local (reside in the same proxy process).

The fundamental difference from the rsync environment is that we have a
single source of modification (the digest generating proxy). Rsync has
to deal with modifications on both ends.
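
To illustrate the point about both copies being local: when the old and
the new digest sit in the same process, a diff is just a walk over the
two bit arrays. The sketch below is only an illustration under stated
assumptions, not the diff format proposed in this thread. The function
name digest_diff, the representation of a diff as a list of flipped bit
positions, and the LSB-first bit numbering are all hypothetical.

```python
def digest_diff(old: bytes, new: bytes) -> list[int]:
    """Return positions of the bits that differ between two digests.

    Assumption: both digests are bit arrays of the same size, and bit
    positions are numbered LSB-first within each byte.
    """
    if len(old) != len(new):
        raise ValueError("digests must have the same size")
    flipped = []
    for byte_index, (a, b) in enumerate(zip(old, new)):
        changed = a ^ b  # set bits mark positions that differ
        bit = 0
        while changed:
            if changed & 1:
                flipped.append(byte_index * 8 + bit)
            changed >>= 1
            bit += 1
    return flipped
```

No rolling checksums are needed here, because nothing has to be
located or matched across a link; the two arrays are compared position
by position.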

> It
> would seem that if diffs can be generated without holding an entire second
> copy of the cache-digest in memory, it might be a big win...

I do not consider an additional 1-2MB of RAM a big deal for this
particular purpose. There are alternative techniques that avoid keeping
two digests in memory (Pei Cao's Summary Cache is one of them). However,
they have similar, if not larger, memory overhead.

Also, the diff format we have proposed does not require holding the
entire diff file in memory on the receiving end to "patch" the stale
digest. This may be a nice feature because (as opposed to generating a
diff) a receiving proxy may have many digests that it has to update
simultaneously.
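
A rough sketch of what patching without buffering the whole diff could
look like on the receiving end. The record layout here (a stream of
4-byte big-endian bit positions, each meaning "flip this bit") is an
assumption made for the example and is not the proposed diff format;
patch_digest is a hypothetical name.

```python
import struct
from typing import BinaryIO

# Assumed record: one unsigned 32-bit bit position, network byte order.
RECORD = struct.Struct("!I")

def patch_digest(digest: bytearray, diff_stream: BinaryIO) -> None:
    """Apply a diff to a stale digest in place, one record at a time.

    Only a single record is held in memory at any moment, so the
    memory cost per digest being updated stays constant.
    """
    while True:
        raw = diff_stream.read(RECORD.size)
        if not raw:
            break  # end of diff
        if len(raw) != RECORD.size:
            raise ValueError("truncated diff record")
        (pos,) = RECORD.unpack(raw)
        digest[pos // 8] ^= 1 << (pos % 8)  # flip the bit, LSB-first
```

With a layout like this, a proxy updating many peer digests at once
needs only the digests themselves plus a few bytes of state per
incoming diff.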

Thanks for attacking the problem!

Alex.
Received on Tue Jul 29 2003 - 13:15:59 MDT
