Re: Cache server size?

From: Dancer <dancer@dont-contact.us>
Date: Thu, 05 Feb 1998 00:14:07 +1000

Empirically, I find that any moderately sized cache attracts about a 25%
hit rate. I use that figure in my head when I want to factor the
hit-rates of our leaf-caches into the figures for the hub-cache. On
the occasions I check, I find that 25% is a 'close-enough' figure.

As it is, though, we've _reduced_ our hit-rate in the interests of
saving more data. (Yeah, I know that sounds a little odd.) What I mean
is that we get _more_ data delivered to the users from the cache,
despite the slightly lower hit-rate, because we're storing larger,
more persistent objects.
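
To make the distinction concrete, here is a minimal Python sketch that computes both the request hit-rate and the fraction of bytes served from cache. The two request lists are invented purely for illustration (they are not taken from our logs), and hit_rates is a hypothetical helper, not anything Squid provides.

```python
def hit_rates(requests):
    """Return (request hit rate, byte hit rate).

    `requests` is a non-empty list of (size_in_bytes, was_hit) pairs.
    """
    total_requests = len(requests)
    total_bytes = sum(size for size, _ in requests)
    hit_requests = sum(1 for _, hit in requests if hit)
    hit_bytes = sum(size for size, hit in requests if hit)
    return hit_requests / total_requests, hit_bytes / total_bytes


# Invented example A: several small objects hit, the one large object misses.
small_object_policy = (
    [(2_000, True)] * 3 + [(2_000, False)] * 6 + [(1_000_000, False)]
)

# Invented example B: fewer hits overall, but the large object is a hit.
large_object_policy = (
    [(2_000, True)] * 1 + [(2_000, False)] * 8 + [(1_000_000, True)]
)

for name, log in [("A", small_object_policy), ("B", large_object_policy)]:
    by_request, by_byte = hit_rates(log)
    print(f"{name}: {by_request:.0%} of requests, {by_byte:.1%} of bytes")
```

Example B has the lower request hit-rate but delivers far more data from the cache, which is the trade-off described above.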

Take a look at the following policy.conf file from our squid setup. Yes,
I realise that it probably doesn't look all that much like anyone
else's. I revamped the config structure (no patches required). It _is_
more readable, though. Look at what we _do_ cache, and what we don't.
I'd be more than happy to compare notes with other people about policies
like this. I have the nagging feeling that I've missed some good bets
for long-caching.

Before I digress too far, the LRU time (accessible from
cache-information via the cachemanager.cgi) is one important guide.
That's how long objects stay in the cache before they are pushed out by
newer things (or, if nothing is getting pushed out, it's the value you
set it to). If it's less than a week, you either have an insanely
eclectic user-base with a lot of bandwidth, or your cache is far too
small. I figure a 3GB cache with an LRU of about 9 days is a fairly
healthy sign. It's not the only indicator, but it's a place to start. A
good caching policy that reflects your usage is another good thing to
have.
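
As a back-of-envelope aid, the sketch below relates cache size to LRU age. It rests on an assumption of mine that is not stated anywhere above: that the cache is full and in steady state, so the LRU age is roughly the cache size divided by the amount of new data stored per day. The function names are hypothetical, and the real figure should always be read from the cache manager.

```python
def estimated_lru_age_days(cache_size_mb, stored_mb_per_day):
    """Rough LRU age, ASSUMING a full cache in steady state."""
    return cache_size_mb / stored_mb_per_day


def cache_size_for_lru_age_mb(target_days, stored_mb_per_day):
    """Cache size needed to reach a target LRU age under the same assumption."""
    return target_days * stored_mb_per_day


def looks_too_small(lru_age_days, threshold_days=7):
    """Apply the 'less than a week' rule of thumb from the text above."""
    return lru_age_days < threshold_days


# A 3GB cache with an LRU age of 9 days implies, under this assumption,
# that roughly 3072 / 9 MB of new objects are stored each day.
daily_mb = 3072 / 9
print(round(daily_mb), "MB/day")
print(looks_too_small(estimated_lru_age_days(3072, daily_mb)))
print(looks_too_small(estimated_lru_age_days(1024, daily_mb)))
```

Under the same assumption, a 1GB cache at that fill rate would hold objects for only about three days, which falls on the wrong side of the one-week guide.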

D

// ******************************************************************************
// Define units of time
// ******************************************************************************

#define HONOUR_EXPIRY_HEADER 0
#define HOUR 60
#define HALF_DAY 720
#define DAY 1440
#define WEEK 10080
#define MONTH 40320

// ******************************************************************************
// Define min/pct/max values for basic policies
// ******************************************************************************

#define OBJECT_CHANGES_INFREQUENTLY WEEK 50% MONTH
#define OBJECT_CHANGES_OFTEN HONOUR_EXPIRY_HEADER 20% HALF_DAY
#define ANYTHING .
#define MODERATE_CACHING DAY 50% WEEK

// ******************************************************************************
// Define rules to make policies easier to read
// No user-serviceable parts inside.
// ******************************************************************************

#define rule(PATTERN,POLICY) refresh_pattern/i PATTERN POLICY
#define file_extension_rule(PATTERN,POLICY) rule(PATTERN ## $,POLICY)
#define case_sensitive_rule(PATTERN,POLICY) refresh_pattern PATTERN POLICY

// ******************************************************************************
// ******************************************************************************
// ******************************************************************************
//
// Configuration directives start here.
//
// ******************************************************************************

// TAG: cache_stoplist
// A list of words which, if found in a URL, cause the object to
// be immediately removed from the cache. In other words, use this
// to force certain objects to never be cached. You may list this
// option multiple times.
//
// The default is to not cache URLs containing 'cgi-bin' or '?'.
//
cache_stoplist cgi-bin ?

// TAG: cache_stoplist_pattern # case sensitive
// TAG: cache_stoplist_pattern/i # case insensitive
//
// Just like 'cache_stoplist' but you can use regular expressions
// instead of simple string matching. There is no default.
//
// cache_stoplist_pattern

// TAG: refresh_pattern # case sensitive
// TAG: refresh_pattern/i # case insensitive
//
// usage: refresh_pattern regex min percent max
//
// min and max are specified in MINUTES.
// percent is an integer number.
//
// Please see the file doc/Release-Notes-1.1.txt for a full
// description of Squid's refresh algorithm. Basically a
// cached object is:
//
// FRESH if age < min
// STALE if expires < now
// STALE if age > max
// FRESH if lm-factor < percent
//
// The refresh_pattern lines are checked in the order listed here.
// The first entry which matches is used. If none of the entries
// match, then the default will be used.
//

// Cover any microsoft boo-boos
rule(microsoft,OBJECT_CHANGES_INFREQUENTLY)
rule(msn.com,OBJECT_CHANGES_INFREQUENTLY)

// Cache Active Server Pages (.asp), overriding their wishes (if possible)
file_extension_rule(.asp,OBJECT_CHANGES_INFREQUENTLY)

// Changes very rarely, if ever
file_extension_rule(.gif,OBJECT_CHANGES_INFREQUENTLY) // GIF images
file_extension_rule(.zip,OBJECT_CHANGES_INFREQUENTLY) // ZIP archives
file_extension_rule(.exe,OBJECT_CHANGES_INFREQUENTLY) // Executables
file_extension_rule(.arj,OBJECT_CHANGES_INFREQUENTLY) // ARJ archives
file_extension_rule(.a[0-9][0-9],OBJECT_CHANGES_INFREQUENTLY) // ARJ archive volumes
file_extension_rule(.jpg,OBJECT_CHANGES_INFREQUENTLY) // JPEG images
file_extension_rule(.jpeg,OBJECT_CHANGES_INFREQUENTLY) // JPEG images
file_extension_rule(.jpe,OBJECT_CHANGES_INFREQUENTLY) // JPEG images
file_extension_rule(.png,OBJECT_CHANGES_INFREQUENTLY) // PNG images
file_extension_rule(.mpg,OBJECT_CHANGES_INFREQUENTLY) // MPG movies
file_extension_rule(.mpeg,OBJECT_CHANGES_INFREQUENTLY) // MPG movies
file_extension_rule(.mpe,OBJECT_CHANGES_INFREQUENTLY) // MPG movies
file_extension_rule(.avi,OBJECT_CHANGES_INFREQUENTLY) // msvideo movies
file_extension_rule(.mov,OBJECT_CHANGES_INFREQUENTLY) // Quicktime movies
file_extension_rule(.wav,OBJECT_CHANGES_INFREQUENTLY) // WAV audio files

file_extension_rule(.rar,OBJECT_CHANGES_INFREQUENTLY) // RAR archives
file_extension_rule(.r[0-9][0-9],OBJECT_CHANGES_INFREQUENTLY) // RAR archive volumes
file_extension_rule(.ram,OBJECT_CHANGES_INFREQUENTLY) // Realaudio control files
file_extension_rule(.viv,OBJECT_CHANGES_INFREQUENTLY) // VIVO-active movies
file_extension_rule(.cab,OBJECT_CHANGES_INFREQUENTLY) // Microsoft CABINET archive
file_extension_rule(.class,OBJECT_CHANGES_INFREQUENTLY) // Java component
file_extension_rule(.tar,OBJECT_CHANGES_INFREQUENTLY) // tape-archive
file_extension_rule(.tgz,OBJECT_CHANGES_INFREQUENTLY) // Gzipp'ed tar
file_extension_rule(.gz,OBJECT_CHANGES_INFREQUENTLY) // Gzip compressed
file_extension_rule(.swf,OBJECT_CHANGES_INFREQUENTLY) // Shockwave/flash

file_extension_rule(.js,OBJECT_CHANGES_INFREQUENTLY) // Javascript
file_extension_rule(.deb,OBJECT_CHANGES_INFREQUENTLY) // Debian Package
file_extension_rule(.rpm,OBJECT_CHANGES_INFREQUENTLY) // RedHat Package
file_extension_rule(.dcr,OBJECT_CHANGES_INFREQUENTLY) // application/x-director
file_extension_rule(.mid,OBJECT_CHANGES_INFREQUENTLY) // MIDI file
file_extension_rule(.qt,OBJECT_CHANGES_INFREQUENTLY) // Quicktime Movie file
file_extension_rule(.au,OBJECT_CHANGES_INFREQUENTLY) // Sun audio file
file_extension_rule(.xbm,OBJECT_CHANGES_INFREQUENTLY) // X-bitmap

// Frequent updates
file_extension_rule(.htm,OBJECT_CHANGES_OFTEN) // HTML documents
file_extension_rule(.html,OBJECT_CHANGES_OFTEN) // HTML documents
file_extension_rule(./,OBJECT_CHANGES_OFTEN) // Directory indexes and/or HTML documents

// Everything else
rule(ANYTHING,MODERATE_CACHING) // Anything we missed

// TAG: reference_age
// As a part of normal operation, Squid performs Least Recently
// Used removal of cached objects. The LRU age for removal is
// computed dynamically, based on the amount of disk space in
// use. The 'reference_age' value defines the maximum LRU age.
// For example, setting reference_age to '1 week' will cause
// objects to be removed if they have not been accessed for a week
// or more. If set to zero, LRU removal is disabled, and objects
// will be removed only when disk usage is over the high water
// mark. The default value is one year.
//
// Specify a number here, followed by units of time. For example:
// 1 week
// 3.5 days
// 4 months
// 2.2 hours
//
reference_age 1 year
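
For anyone who wants to reason about the policy without running Squid, the four FRESH/STALE rules and the first-match ordering described in the refresh_pattern comment above can be sketched in Python roughly as follows. This is only my reading of that comment, not Squid's code. Assumptions: lm-factor is taken to be the object's age divided by how old it was (per Last-Modified) when it was fetched, an object that passes none of the four rules is treated as STALE, and policy_for and is_fresh are hypothetical names. Only a handful of the rules are reproduced, in the form the macros should expand to.

```python
import re

# Units are minutes, matching the #defines in policy.conf.
HALF_DAY, DAY, WEEK, MONTH = 720, 1440, 10080, 40320

# (min, percent, max) triples copied from the policy definitions.
OBJECT_CHANGES_INFREQUENTLY = (WEEK, 50, MONTH)
OBJECT_CHANGES_OFTEN = (0, 20, HALF_DAY)
MODERATE_CACHING = (DAY, 50, WEEK)

# A small subset of the rules, in file order, all case-insensitive.
RULES = [
    (re.compile(r"microsoft", re.IGNORECASE), OBJECT_CHANGES_INFREQUENTLY),
    (re.compile(r".gif$", re.IGNORECASE), OBJECT_CHANGES_INFREQUENTLY),
    (re.compile(r".html$", re.IGNORECASE), OBJECT_CHANGES_OFTEN),
    (re.compile(r".", re.IGNORECASE), MODERATE_CACHING),
]


def policy_for(url):
    """Return the policy of the first rule whose regex matches the URL."""
    for pattern, policy in RULES:
        if pattern.search(url):
            return policy
    return None  # Squid would fall back to its built-in default here.


def is_fresh(age, policy, lm_age=None, minutes_until_expiry=None):
    """Apply the four rules in the order they are listed in the comment.

    age: minutes since the object was fetched.
    lm_age: minutes between Last-Modified and the fetch (assumed definition).
    minutes_until_expiry: negative once the Expires time has passed.
    """
    minimum, percent, maximum = policy
    if age < minimum:
        return True  # FRESH if age < min
    if minutes_until_expiry is not None and minutes_until_expiry < 0:
        return False  # STALE if expires < now
    if age > maximum:
        return False  # STALE if age > max
    if lm_age is not None and lm_age > 0:
        return (age / lm_age) * 100 < percent  # FRESH if lm-factor < percent
    return False  # Assumed fallback: nothing made it FRESH.


url = "http://www.example.com/index.html"
print(policy_for(url), is_fresh(60, policy_for(url), lm_age=600))
```

Read this way, a min of 0 (HONOUR_EXPIRY_HEADER) means the expiry check always gets a say, whereas a min of a WEEK keeps an object FRESH for its first week regardless of its expiry header.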

Claudia Baertschi wrote:

> We are running Squid 1.1.8 at the University of Kassel (Germany).
> As we will install Squid on another machine we wonder whether the
> cache size is ok or too small (or too big).
>
> Do you have an idea how to evaluate the cache size (is there
> anything to gain if we make the cache size bigger...)?
>
> - scripts to examine the squid logs
> - experiences
> - best marks to reach
>
> Thank you for any tips and hints,
> Claudia

--
Did you read the documentation AND the FAQ?
If not, I'll probably still answer your question, but my patience will
be limited, and you take the risk of sarcasm and ridicule.
Received on Wed Feb 04 1998 - 06:29:54 MST