Re: memory-mapped files in Squid

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Fri, 29 Jan 1999 00:49:54 +0100

Andres Kroonmaa wrote:

> Yes, of course, mentioned that. But this would be too simplistic.
> It is hard to believe that fifo as Replacement Policy could be very
> effective in both reducing network traffic and disk utilisation.

Not if you take into account the past evolution of disk technology. The
size/cost ratio is increasing at a far higher rate than the speed/cost
ratio, and the network-speed/cost ratio is increasing fastest of all. If
this persists we may soon see a situation where disk speed utilization
is of far greater significance than size utilization when it comes to
keeping up with the network speed.

You could say that the current approach in Squid is at one far end of
the scale (or rather some of the earlier Squid versions.. see below),
and pure FIFO at the other. My thinking at the moment is to start from
FIFO and see what can be done to adapt it to the needs of a cache
without sacrificing too much disk space.

Also, Squid 2 does not really pay much attention to expiration; it is
almost pure LRU, so it is not that far from FIFO replacement with
respect to objects that are only used once. In fact it is very
questionable whether the expiration check in the store maintenance
routine does anything useful at all other than destabilizing the LRU
age (see storeMaintainSwapSpace, but ignore the description, which is
hopelessly out of date).

The reason why Squid preserves expired objects is that even an expired
object is useful in a later refresh check: many objects haven't
actually changed even though they have expired. Throwing away expired
objects is therefore almost as wasteful as throwing away old
(non-expired) objects, as both need to be revalidated with the origin
server and both have probably not been modified.
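In request terms (a hypothetical sketch, not Squid code; the structure
and field names are made up), keeping the stale copy means the refresh
can be done with a conditional request, and a 304 reply saves
transferring the body again:

/*
 * Hypothetical illustration, not Squid code: the only thing a kept
 * copy needs to provide for a refresh is a validator, and it does not
 * matter whether the copy is expired or merely old.
 */
#include <stdio.h>

struct cached_obj {
    char last_modified[64];   /* e.g. "Thu, 28 Jan 1999 12:00:00 GMT" */
};

/*
 * Build the refresh request.  Expired or not, the request is the same
 * If-Modified-Since GET, and in both cases the usual answer is
 * 304 Not Modified, i.e. no body crosses the network again.
 */
static void
build_refresh_request(const struct cached_obj *obj,
                      char *buf, size_t len)
{
    snprintf(buf, len,
             "GET /some/object HTTP/1.0\r\n"
             "If-Modified-Since: %s\r\n"
             "\r\n",
             obj->last_modified);
}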

> This basically means that we do NOT want to blindly overwrite objects
> that could be useful and longterm. And this leads us to unavoidable

I am not exactly proposing to blindly overwrite objects, but at the
same time I don't agree that a blind FIFO is too simplistic. There
are, however, several approaches which can be used to preserve useful
information without a serious I/O impact. You could say that what I am
trying to do is to have a FIFO mimic some of the properties of LRU,
and if necessary with some unused-space optimization, but my emphasis
is on optimized write performance.
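One possible reading of that (only a sketch of the general idea, using
the classic "second chance" trick; none of this is actual Squid code
and all names are invented) is to reclaim space in strict write order
so that writes stay sequential, but give objects that have been
referenced since they were written one extra lap in the queue:

/*
 * Sketch of a "second chance" FIFO.  Not Squid code; names invented.
 * Space is reclaimed in write order, but an object that has been hit
 * since it was written is requeued once instead of being overwritten.
 */
#include <stdbool.h>
#include <stddef.h>

struct fifo_obj {
    bool referenced;          /* hit since it entered the queue? */
    struct fifo_obj *next;
};

struct fifo_queue {
    struct fifo_obj *head;    /* oldest, next candidate for reuse */
    struct fifo_obj *tail;    /* newest */
};

/* Pick the next object whose disk space may be reused. */
static struct fifo_obj *
fifo_reclaim(struct fifo_queue *q)
{
    while (q->head != NULL) {
        struct fifo_obj *obj = q->head;
        q->head = obj->next;
        if (q->head == NULL)
            q->tail = NULL;             /* queue emptied */
        obj->next = NULL;
        if (!obj->referenced)
            return obj;                 /* untouched since written */
        /*
         * Popular object: clear the bit and move it to the tail.
         * In a disk cache this is where the extra write cost of
         * preserving an object shows up.
         */
        obj->referenced = false;
        if (q->tail != NULL)
            q->tail->next = obj;
        else
            q->head = obj;
        q->tail = obj;
    }
    return NULL;
}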

It would be nice if someone with some real-life logs could calculate
the ratio of TCP_REFRESH_HIT to TCP_REFRESH_MISS to guide me in my
future thoughts.
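Something as small as this would do (a standalone filter in C, assuming
access.log is fed on stdin and that each line carries one of the
TCP_REFRESH_* result codes as a plain token; not part of Squid):

/*
 * Count TCP_REFRESH_HIT vs TCP_REFRESH_MISS lines in an access.log
 * read from stdin and print the ratio.  Standalone illustration only.
 */
#include <stdio.h>
#include <string.h>

int
main(void)
{
    char line[4096];
    long hits = 0, misses = 0;

    while (fgets(line, sizeof(line), stdin) != NULL) {
        if (strstr(line, "TCP_REFRESH_HIT") != NULL)
            hits++;
        else if (strstr(line, "TCP_REFRESH_MISS") != NULL)
            misses++;
    }
    printf("refresh hits:   %ld\n", hits);
    printf("refresh misses: %ld\n", misses);
    if (misses > 0)
        printf("hit/miss ratio: %.2f\n", (double)hits / (double)misses);
    return 0;
}

Compile it with any C compiler and run it as e.g.
"refreshratio < access.log".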

Hmm.. speaking about refreshes: if I am not mistaken, Squid avoids
using IMS on some classes of objects even if it has a copy in cache (no
Content-Length and some other criteria). Why is that? To me it seems
like a very stupid thing to do when many servers can be configured to
support IMS on generated content (SSI and similar).

> this cyclic idea reminded me a lot of a log-structured FS, and I went on
..
> Henrik, there's work that has been done that seems extremely related to what
> you are thinking about. You should read it, it may be useful for your
> work.
> http://now.cs.berkeley.edu/Papers2/Abstracts/sosp97.html

Thanks. I'll look into that.

> > test it using simple trace-driven simulations because it is not obvious
> > how many/what objects have to be specially preserved on any decent size
> > FIFO queue.

On the question of traces: is store.log fixed in Squid yet, or does it
still forget to log some (most) releases? This has some importance when
comparing trace-driven results to Squid.

/Henrik