IMS bugs and proposals

From: Christian Balzer <>
Date: Mon, 7 Oct 1996 09:32:50 +0200 (MET DST)

Hi there,

as announced earlier, here are my somewhat confusing and frustrating
experiences with the IMS (combined with retain-in-cache)
functionality in 1.1b4.
Lemme do this by pointing out the ideal cache behaviour (for me :)
and comparing it to Squid's current behaviour as I understand it; the
Squid comments will be in square brackets [].
Corrections to any errors of mine are more than welcome.

For me the ideal cache is a combination of minimum bandwidth usage and
maximally up-to-date (valid) data. The domain my Squid serves hangs off
a 64Kb/s link, though I have plenty of disk space (2GB currently, but I'd
upgrade to 2-4 times this amount if it made sense!) and processing
power, as these cost a fraction of the amount I'd have to shell out for
a link >128Kb/s...

So the cache should keep as much data as the local resources allow, to
maximize the chance of a local hit.
[The expire_age tag _should_ do this nicely; however, I'd like to see an
option "forever" there and leave any cache cleaning to space constraints.]
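A rough sketch of that policy, assuming a simple LRU store where objects are evicted only under space pressure, never by age alone (the class and its names are purely illustrative, not Squid code):

```python
from collections import OrderedDict

class SpaceBoundedCache:
    """Hypothetical cache that keeps objects 'forever' until disk fills up."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.objects = OrderedDict()  # url -> size, oldest access first

    def store(self, url, size):
        # Evict least-recently-used entries only when space runs out.
        while self.used + size > self.capacity and self.objects:
            _, old_size = self.objects.popitem(last=False)
            self.used -= old_size
        self.objects[url] = size
        self.used += size

    def hit(self, url):
        if url not in self.objects:
            return False
        self.objects.move_to_end(url)  # refresh the LRU position
        return True
```

This maximizes the chance of a local hit exactly as described above: nothing leaves the cache until the disk itself forces it out.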

This way, data that still has a valid local lifetime (TTL) or hasn't changed
since the last access (IMS check) could and would be served from the local
cache.
[This is, however, where 1.1b4 still fails a lot. I've seen a _load_ of
TCP_MISS entries for things that were in the cache at one point and clearly
couldn't have expired (expire_age is set to 90000), nor did they have an
Expires header. I was at a loss as to why, but something I suspected already
has just now been proven (whether this accounts for all the misses I can't
tell yet, though). If something is retrieved with a TCP_EXPIRED_HIT it
becomes immediately eligible to be flushed out of the cache, instead of
taking the expire_age and the result of the IMS check into account.
Since the data is still valid, its local lifetime should be recalculated
according to the TTL patterns, and it should only become eligible for
expiry once its total lifetime (expire_age) has been exceeded.]
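The recalculation requested above could look roughly like this (a hypothetical sketch; the entry fields and the `ttl_for` hook are mine, not Squid internals):

```python
import time

EXPIRE_AGE = 90000  # total lifetime cap, as in the expire_age setting above


def revalidate(entry, ims_not_modified, ttl_for):
    """entry: dict with 'stored_at' and 'expires_at' UNIX timestamps.

    After a conditional (IMS) check says the object is unchanged, the
    local TTL is recalculated from the TTL rules instead of the object
    being marked for immediate flushing.
    """
    now = time.time()
    if ims_not_modified:
        # Data unchanged upstream: rejuvenate the local copy.
        entry['expires_at'] = now + ttl_for(entry)
    # Eligible for expiry only once the total lifetime is exceeded.
    entry['flushable'] = (now - entry['stored_at']) > EXPIRE_AGE
    return entry
```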

However, to minimize external accesses it's necessary to be as "smart" as
possible about verifying the integrity of the local data.
[Squid forces the reload of data which has expired due to the Expires
header; an option to keep/use _and_ "rejuvenate" the local data if a
combined IMS and size check shows it to be still valid would be
greatly appreciated!]
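As a sketch of that combined check (purely illustrative; this is the behaviour being asked for, not what Squid does today): treat the expired local copy as still valid if the origin answers a conditional request with 304 Not Modified, or if a 200 reply carries the same size as the stored object.

```python
def should_rejuvenate(local, upstream_status, upstream_length):
    """Hypothetical IMS-plus-size validity check for an expired object.

    local: dict with at least a 'size' field for the cached object.
    """
    if upstream_status == 304:
        return True  # IMS check: origin says the object is unchanged
    if upstream_status == 200 and upstream_length == local['size']:
        return True  # size check: same length, assume still valid
    return False
```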

Then there is the currentness/validity of the data; if the local data is
stale, people will avoid the cache.
[A sensible TTL pattern should take care of that, _IF_ the above bug
is fixed. However, people who think the latest Dilbert strip has
to be out now can and will hit the "Reload" button, a lot. Right now
I estimate that about 20-30% of the external accesses are TCP_REFRESH
requests for perfectly valid data. An option in Squid to transform
these requests into an IMS/size check procedure would
greatly reduce this load. Incidentally, I do see some TCP_IMS_HIT
entries in the access logs, so some browsers must be doing things more
sensibly than Netscape (or am I missing something obvious here?).]
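The proposed option could work along these lines (a minimal sketch under my own assumptions about the header handling; the function and field names are mine): when a client forces a reload (Pragma: no-cache) for an object the cache still holds, rewrite the upstream request into a conditional GET, so an unchanged object costs only a 304 instead of a full transfer.

```python
from email.utils import formatdate


def upstream_headers(client_headers, cached):
    """Rewrite a forced-reload request into an IMS check when possible.

    cached: dict with a 'last_modified' UNIX timestamp, or None if the
    object is not in the cache (in which case pass the reload through).
    """
    headers = dict(client_headers)
    if headers.get('Pragma') == 'no-cache' and cached is not None:
        headers.pop('Pragma')
        # Ask the origin to reply 304 if nothing changed since our copy.
        headers['If-Modified-Since'] = formatdate(
            cached['last_modified'], usegmt=True)
    return headers
```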

Mata ne,


P.S. Since I don't think we'll get a human-readable date (back?) into
the native access log format, does anybody have a _fast_ filter which will
do the trick for me?

  // <CB> aka Christian Balzer, Tannenstr. 23c, D-64342 Seeheim, Germany
\X/ | Voice: +49 6257 83036, Fax/Data: +49 6257 83037
SWB  - The Software Brewery - | Team H, Germany HQ | Anime no Otaku
Received on Mon Oct 07 1996 - 00:35:32 MDT
