RE: Data all piles up in one directory.

From: Thorne Lawler <thorne@dont-contact.us>
Date: Wed, 29 Mar 2000 17:31:07 +1000

Squid people,

Since I last posted, I've added another gigabyte of disk to the Squid cache
mentioned below, but have not changed the default number of cache
directories in either case, i.e.:

cache_dir /squid/cache 400 16 256
cache_dir /squid2/cache 850 16 256

Now that the cache has filled, and been running for a few months, I am still
seeing some unsatisfactory responses: large downloads simply not being
cached. Our hit rates are still alarmingly low too: 7.8% from the last
month's report.

I can see that the subdirectories in my cache_dirs are not being filled
evenly, but I gathered from the previous correspondence that this was
semi-expected behaviour, and not really a problem...

My problem is this: my company has almost no need to cache small files (i.e.
less than 50 KB) as we are immediately downstream from a good commercial
cache, but we *do* need to cache large downloads; the larger the download,
the more important it is.

Is there a straightforward way to optimize (or limit) Squid to cache only
extremely large files? The ideal target range would be any file over
100 KB, with no upper limit beyond the constraints of disk space.

Alternatively, getting the cache working in its current condition would
do...

Any suggestions for a possible cause would be appreciated.

--
Thorne Huw Lawler,
Technical Support
Verve Inc.
> -----Original Message-----
> From: Thorne Lawler [mailto:thorne@verveinc.com]
> Sent: Thursday, 13 January 2000 10:28
> To: 'Clifton Royston'
> Cc: 'Squid Users Mailing List (E-mail)'
> Subject: RE: Data all piles up in one directory.
>
>
> Thanks Clifton, and thanks to those who replied directly.
>
> I have looked more closely at what the cache is doing, and am not so
> worried any more.
>
> For what it's worth, the cache to which I was referring is servicing a
> small link (128 kbit/s ISDN; bandwidth is *expensive* in .au) and a small
> office (9 people). It is also using a set of really *nice* parent caches
> provided by our ISP. For this reason, its main purpose is to catch the
> really large downloads. Caching any objects under 100 KB is an added
> bonus as far as we're concerned.
>
> That said, I have scrounged about the office and found a few gigabytes
> of disk which can be patched into the cache, and I will be doing this
> shortly.
>
> Thanks again for the speedy assistance. :)
>
> --
> Thorne Huw Lawler
> Verve, Inc.
>
>
> > -----Original Message-----
> > From: Clifton Royston [mailto:cliftonr@lava.net]
> > Sent: Wednesday, 12 January 2000 15:45
> > To: Thorne Lawler
> > Cc: Squid Users Mailing List (E-mail)
> > Subject: Re: Data all piles up in one directory.
> >
> >
> > On Wed, Jan 12, 2000 at 02:13:13PM +1100, Thorne Lawler wrote:
> > > Newbie question:
> > >
> > > Given a Squid 2.2 system set up, essentially, according to the
> > > Quickstart instructions, with only one significant modification (I
> > > have increased the 'maximum_object_size' to '40 MB'), why would all
> > > the cache data be placed in a single cache directory?
> >
> > ...
> > > --------------------------
> > > When Squid wants to create a new disk file for storing an object, it
> > > first selects which cache_dir the object will go into. This is done
> > > with the storeDirSelectSwapDir() function. If you have N cache
> > > directories, the function identifies the 3N/4 (75%) of them with the
> > > most available space. These directories are then used, in order of
> > > having the most available space. When Squid has stored one URL to
> > > each of the 3N/4 cache_dir's, the process repeats and
> > > storeDirSelectSwapDir() finds a new set of 3N/4 cache directories
> > > with the most available space.
> > > --------------------------
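
The selection rule quoted above can be sketched roughly as follows. This is
only an illustration of the described behaviour, not Squid's actual source:
the function names are hypothetical, and the floor of one directory (so that
a single cache_dir can still be chosen) is an assumption.

```python
def select_swap_dirs(free_space):
    """Return indices of the cache_dirs to use for the next round.

    free_space holds the available space of each cache_dir, in any
    consistent unit.
    """
    n = len(free_space)
    # 3N/4 with integer division. The minimum of 1 is an assumption, so
    # that a lone cache_dir is still selectable.
    k = max(1, (3 * n) // 4)
    # Rank directories by available space, largest first.
    ranked = sorted(range(n), key=lambda i: free_space[i], reverse=True)
    return ranked[:k]


def place_objects(free_space, object_sizes):
    """Store one object per selected cache_dir, then reselect and repeat."""
    free = list(free_space)
    placements = []
    queue = []
    for size in object_sizes:
        if not queue:
            queue = select_swap_dirs(free)
        d = queue.pop(0)
        free[d] -= size
        placements.append(d)
    return placements
```

Under these assumptions, a configuration with only one cache_dir sends every
object to that one cache_dir, since there is nothing else to rank it against.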
> >
> > cache_dir here has a specialized meaning, not the same as a UNIX
> > directory.  In general terms it would better be described as a cache
> > directory hierarchy, or in most cases (where Squid is given a chunk of
> > disk to itself) a cache partition or cache file system.  Directories
> > within that cache partition are filled "from the bottom up", filling
> > cache_dir/00/00 until 256 files have been stored there, then moving to
> > cache_dir/00/01, etc.
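
A small sketch of the "bottom up" fill order described above, for a
cache_dir line ending in "16 256". The function name is hypothetical, and
two details are assumptions rather than anything stated in this thread: that
files are numbered sequentially from zero, and that they are named with
eight hex digits.

```python
def swap_file_path(cache_dir, file_number, l1=16, l2=256):
    """Map a swap file number to a path under cache_dir.

    Assumed layout: each second-level directory holds l2 files, and each
    first-level directory holds l2 second-level directories.
    """
    second = (file_number // l2) % l2
    first = (file_number // (l2 * l2)) % l1
    return "%s/%02X/%02X/%08X" % (cache_dir, first, second, file_number)


# The first 256 files land in 00/00, the next 256 in 00/01, and so on.
for n in (0, 255, 256, 65536):
    print(n, swap_file_path("/squid/cache", n))
```

With this layout, the first-level directory 01 is not touched until 256 * 256
= 65536 files have been written, which is why a small cache appears to pile
everything up under 00.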
> >
> >
> > > Given that this is the case, after ~2 weeks of operation under
> > > moderate load, with a total of 450MB of space, and a cache_dir line
> > > which reads:
> > >
> > > --------------------------
> > > cache_dir /squid/cache 400 16 256
> > > --------------------------
> >
> > This is a single cache_dir (note the keyword used to define it is the
> > same as in the explanation above) so everything goes in there, and the
> > subdirectories in it are filled from the bottom up as you observed.
> >
> >
> > > The cache also appears to be replacing cache data at what seems (to
> > > me) to be a very rapid rate... I will probably understand this better
> > > as I read more of the on-line support documentation.
> >
> > A significant issue is that 450MB will not hold very much web data,
> > especially if you defined the largest cacheable object to be 40MB.  Ten
> > users downloading different pieces of demo software, MP3s, or QuickTime
> > movies off the web will completely empty your cache.  Another way of
> > looking at it is that 450MB is about 41 minutes of full-throttle T1
> > traffic - not long!  If that's all you can allocate to the cache, I'd
> > suggest turning the maximum object size way way down.  Try adding more
> > disk, or setting the maximum object size to 8KB, where at least it
> > should end up with a lot of cached button GIFs/etc. from popular sites.
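
The "41 minutes" figure above can be checked with a little arithmetic. This
sketch assumes 1 MB = 2^20 bytes and a nominal T1 line rate of 1.544 Mbit/s,
and ignores protocol overhead and uncacheable traffic. The 128 kbit/s figure
is the ISDN link mentioned elsewhere in this thread; the function name is
hypothetical.

```python
def fill_time_seconds(cache_mib, link_bits_per_second):
    """Seconds of full-rate traffic needed to move cache_mib MiB of data."""
    cache_bits = cache_mib * 1024 * 1024 * 8
    return cache_bits / link_bits_per_second


T1_BPS = 1_544_000    # nominal T1 line rate (assumption)
ISDN_BPS = 128_000    # 128 kbit/s ISDN link

t1_minutes = fill_time_seconds(450, T1_BPS) / 60
isdn_hours = fill_time_seconds(450, ISDN_BPS) / 3600

print("T1:   %.1f minutes" % t1_minutes)
print("ISDN: %.1f hours" % isdn_hours)
```

On those assumptions the T1 figure comes out a little under 41 minutes, and
the same 450MB corresponds to roughly eight hours of a saturated 128 kbit/s
link.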
> >
> > A good hit rate is around 30%; I doubt you can get near that with your
> > current configuration, because objects will get flushed out of the
> > cache before Squid would even have a chance to figure out if they're
> > popular or not.
> >
> > > Is this a known behaviour?
> >
> > Yep to both.
> >
> >   -- Clifton
> >
> > --
> >  Clifton Royston  --  LavaNet Systems Architect --  cliftonr@lava.net
> >         "An absolute monarch would be absolutely wise and good.
> >            But no man is strong enough to have no interest.
> >              Therefore the best king would be Pure Chance.
> >               It is Pure Chance that rules the Universe;
> >           therefore, and only therefore, life is good." - AC
> >
> >
>
>
Received on Wed Mar 29 2000 - 00:33:45 MST
