Re: doublecheck

From: Robert Collins <robert.collins@dont-contact.us>
Date: Wed, 15 Nov 2000 09:09:33 +1100

I was planning on a readdir() style approach anyway. I have seen swap.state
situations where files aren't in the log, and after a rebuild they just aren't
seen by squid. The rebuild-swap-from-disk routine checks for this anyway.

Yes, time-based ACLs are the method to make it happen.... I was talking
concept first :]

As far as ongoing vs. one-off checking goes... I'll put it out to the jury :]
Do we run through the cache once at startup (possibly according to time-based
ACLs), or do we loop forever (only operating according to time-based ACLs,
current cache load, etc.)?

What if some bright spark decides to delete half their files to free disk
space? Wouldn't it be sensible for squid to detect that and shrink the
in-memory swap.state? Or, if a disk is starting to fail, seeing hundreds or
thousands of corrupt-file warnings would prompt the admin.

Rob

----- Original Message -----
From: "Andres Kroonmaa" <andre@online.ee>
To: "Robert Collins" <robert.collins@itdomain.com.au>
Cc: <squid-dev@squid-cache.org>
Sent: Wednesday, November 15, 2000 12:39 AM
Subject: Re: doublecheck

On 14 Nov 2000, at 22:15, Robert Collins <robert.collins@itdomain.com.au>
wrote:

> Well, as I understand it, the dirty rebuild from swaplog / dirty rebuild
> from disk process is there to start serving hits ASAP. The doublecheck code
> was to allow users to validate the cache without it going offline when an
> inconsistency is found.

 Sure. If you just read in the swap.state, it's fast. Then sort the file
 list and validate with whatever means, including stat() if needed/wanted.
 While swap.state isn't validated, we can live with some false hits. The
 worst thing, which we cannot tolerate, is per-entry reading of swap.state
 that takes hours to complete, choking a lot of resources and slowing down
 the cache.
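
 For the sorting step, something as simple as this should do (just a sketch,
 assuming the file numbers collected while reading swap.state sit in a flat
 array; with the usual filn-to-L1/L2 path mapping, plain numeric order should
 already walk the cache_dir in on-disk order):

/* Sketch: sort the swap file numbers read from swap.state so that the
 * later per-entry stat() calls proceed in directory order rather than
 * in swap.state (log) order. */
#include <stdlib.h>

static int
cmp_filn(const void *a, const void *b)
{
    int fa = *(const int *) a;
    int fb = *(const int *) b;
    return (fa > fb) - (fa < fb);
}

/* filns[] holds the file numbers collected while reading swap.state */
void
sort_for_sequential_stat(int *filns, size_t count)
{
    qsort(filns, count, sizeof(int), cmp_filn);
}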

> By doing that we can easily go L2 dir by L2 dir and address your
> performance suggestion. We can't do that during a rebuild from log because
> we are parsing the log. I'll whip something up this week to go through all
> the cache dirs L2 by L2 and crosscheck.

 After I wrote my mess, I started to hesitate: what if, over a very large
 set of files, both approaches would indeed show similar results? So I ran a
 test over my two squid cache dirs, both equally filled and equal in size.
 The first test had a few L2 dirs already touched, so potentially in the OS
 cache; the second run was done on a "cold", untouched ufs.

./a.out -g -l 384 -L 10 -n 1234234 -m 1234234 -p /www/cidera2 | ptime ./a.out
Done Reading Filenames: 1234234

real 54:44.085
user 5.871
sys 2:24.190
------------------
 (wait: 52:19.024)

./a.out -g -l 384 -L 10 -n 1234234 -m 1234234 -p /www/cidera1 -s | ptime ./a.out
Done Reading Filenames: 1234234

real 2:22.715
user 2.899
sys 42.163
------------------
 (wait: 1:37.653)

 So it takes about 1 hour to stat() 1.2M files in random order, and about
 2.5 minutes in sequential order. That's over a 20-fold increase in
 performance.
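
 For reference, the shape of that test is roughly this (a sketch only, not
 the actual a.out source; in a real run each pass would need a cold OS cache,
 as in the numbers above, since the first pass warms it for the second):

/* Time stat() over the same list of cache file paths in two orders:
 * sorted (directory order) and shuffled (random order). */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/stat.h>

static void
shuffle(char **paths, size_t n)
{
    size_t i;
    for (i = n - 1; i > 0; i--) {
        size_t j = (size_t) rand() % (i + 1);
        char *tmp = paths[i];
        paths[i] = paths[j];
        paths[j] = tmp;
    }
}

static void
stat_all(char **paths, size_t n, const char *label)
{
    struct stat sb;
    size_t i;
    time_t start = time(NULL);
    for (i = 0; i < n; i++)
        (void) stat(paths[i], &sb);
    printf("%s: %ld seconds\n", label, (long) (time(NULL) - start));
}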

> Perhaps this could be an ongoing background process? With start-stop
> periods in squid.conf
>
> i.e.
> cache_consistency_start 12:01 am
> cache_consistency_stop 5am

 Better to use time-based ACLs ;)
 cache_consistency_access allow AT_NIGHT ON_SUNDAY
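 (with the ACLs defined using the existing time ACL syntax, something like:
  acl AT_NIGHT time 00:01-05:00
  acl ON_SUNDAY time S
  cache_consistency_access being the hypothetical new directive)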

 But I think we shouldn't need to check periodically. An inconsistency is an
 emergency situation, which must be resolved as it occurs...

 Perhaps we could get the best consistency checking if we use readdir() per
 L2 dir: checking all files found along the way, removing ones that do not
 belong there, creating L2 dirs if not present, and stat()ing all existing
 files.
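
 Roughly, per L2 directory it could look like this (just a sketch, not Squid
 code; file_belongs() and index_has() stand in for the real file-number/path
 check and the in-memory index lookup):

/* Walk one L2 directory: create it if missing, drop objects that do not
 * belong there, and stat() the rest while checking them against the index. */
#include <stdio.h>
#include <errno.h>
#include <dirent.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

extern int file_belongs(const char *l2path, const char *name); /* hypothetical */
extern int index_has(const char *name);                        /* hypothetical */

static void
check_l2_dir(const char *l2path)
{
    char full[1024];
    struct stat sb;
    DIR *dir;
    struct dirent *de;

    if (mkdir(l2path, 0755) < 0 && errno != EEXIST)
        return;                 /* create the L2 dir if it was missing */
    if ((dir = opendir(l2path)) == NULL)
        return;
    while ((de = readdir(dir)) != NULL) {
        if (de->d_name[0] == '.')
            continue;
        snprintf(full, sizeof(full), "%s/%s", l2path, de->d_name);
        if (!file_belongs(l2path, de->d_name)) {
            unlink(full);       /* object stored in the wrong place */
            continue;
        }
        if (stat(full, &sb) < 0 || !index_has(de->d_name))
            printf("inconsistent object: %s\n", full);
    }
    closedir(dir);
}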

------------------------------------
 Andres Kroonmaa <andre@online.ee>
 Delfi Online
 Tel: 6501 731, Fax: 6501 708
 Pärnu mnt. 158, Tallinn,
 11317 Estonia