Re: assertion failed: store_dir_diskd.c:839: "rb->flags.need_to_validate"

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Tue, 10 Jul 2001 16:07:13 +0200

Andres Kroonmaa wrote:

> Actually, I'm a bit unhappy with how swap.state is currently managed.

You are not the first one...

Attached is a fragment of a prehistoric thread on the subject...

--
Henrik

Message-ID: <34D793F4.7CA7503C@hem.passagen.se>
Date: Tue, 03 Feb 1998 23:02:28 +0100
From: Henrik Nordstrom <hno@hem.passagen.se>
X-Mailer: Mozilla 3.01Gold (X11; I; Linux 2.0.32 i586)
MIME-Version: 1.0
To: Duane Wessels <wessels@nlanr.net>
CC: squid-dev@nlanr.net
Subject: Store state logs
References: <199802030312.OAA25697@moto.off.connect.com.au>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

My idea on how the store state maintenance should work:

1. Keep a transaction based log of the swap state. We should be able to
quickly recover the state from the state logs even if Squid restarts
several times while starting, and this without corrupting the cache or
losing track of cached objects.

2. Inside each swap file, duplicate the metadata that cannot be deduced
from the object data, most notably the URL.

All extra overhead while the swap state is recovered should be avoided
if possible (and touching each swap object is a HUGE overhead).

The object based metadata is primarily for verification and recovery
purposes.

- Verify that a swapped-in object really is the object we think it
should be (the correct URL), to guarantee that Squid under no
circumstances gives the wrong object to the clients, even if the cache
is corrupted.

- If using hash based store keys, the metadata does not contain the
object URL and we should be able to recover this somehow. The URL is
needed for proper Hit-metering and possibly other operations as well,
but it is not required in-memory. This also guards against any
unexpected hash collisions, which gives the hash based keys a higher
trust factor.

- If the swap state log is lost, Squid could slowly rebuild the cache
from the disk objects. This should be done at a moderate speed to not
saturate the system by the cache rebuild.
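
The URL check in the list above could be sketched roughly as follows.
This is only an illustration in Python, not Squid code: the on-disk
layout (a 4-byte length followed by the URL, then the body), the choice
of MD5 for the store key, and all function names are assumptions made
up for the example.

```python
import hashlib
import struct
from typing import Optional, Tuple

def store_key(url: str) -> bytes:
    """Hash based store key. MD5 is an assumption for this sketch."""
    return hashlib.md5(url.encode("utf-8")).digest()

def pack_swap_object(url: str, body: bytes) -> bytes:
    """Prefix the object body with a length-tagged copy of its URL."""
    raw = url.encode("utf-8")
    return struct.pack("!I", len(raw)) + raw + body

def unpack_swap_object(blob: bytes) -> Tuple[str, bytes]:
    """Split a swap file image into (stored URL, object body)."""
    (n,) = struct.unpack_from("!I", blob, 0)
    url = blob[4:4 + n].decode("utf-8")
    return url, blob[4 + n:]

def swap_in(blob: bytes, requested_url: str) -> Optional[bytes]:
    """Return the body only if the stored URL matches the request."""
    stored_url, body = unpack_swap_object(blob)
    if stored_url != requested_url:
        # Corrupted index or hash collision: treat it as a miss.
        return None
    return body
```

Because the URL travels with the object, the in-memory index can hold
only the fixed-size key, and a rebuild from disk can recompute the key
from the stored URL.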

The requirements for the state logs depend a bit on how we do the store
object validation. Without validation the transaction model of the state
logs needs to be strict and include expunged objects. If we add
validation on swapin (combined with a graceful fall back to fetch the
object when the validation fails) we can handle the state more loosely,
with the main purpose of quickly knowing at least which objects we have
in the store. In this situation it does not matter much if we think we
have a few objects that are in fact not there, as this will be recovered
gracefully when encountered.
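
A minimal sketch of the loose model described above, with validation on
swapin and a graceful fall back. The dictionaries standing in for the
index and the disk, and the names lookup and fetch, are hypothetical and
exist only for this example.

```python
from typing import Callable, Dict, Tuple

def lookup(index: Dict[str, int],
           disk: Dict[int, Tuple[str, bytes]],
           url: str,
           fetch: Callable[[str], bytes]) -> bytes:
    """Serve url from the store if the swap object validates, else fetch.

    index maps URL -> swap file number (what the state log told us).
    disk maps swap file number -> (stored URL, body); a missing key
    models a swap file that is not actually there.
    """
    fileno = index.get(url)
    if fileno is not None:
        entry = disk.get(fileno)
        if entry is not None and entry[0] == url:
            return entry[1]          # validated hit
        # The log claimed an object we do not really have: forget it.
        del index[url]
    return fetch(url)                # graceful fall back
```

A stale index entry costs one failed swapin and is then dropped, which
is why the log does not need to be strictly accurate in this model.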

And in either case, the URL validation is more or less required, to
reliably handle store hits while recovering the swap state from log
files.

Some ideas on how the store logs can be handled:

The store state logs are kept in files named
swap.state.<nnnn>

The base log is called
swap.state.1

Updates are written to transaction logs named
swap.state.<timestamp>
where timestamp is when the log was opened (for example when Squid
starts).

Periodically write the swap state to new.swap.state. When this is done,
rename it to swap.state.1 and remove all previous transaction logs (all
but the current one).

When Squid starts, read all the swap.state files sorted numerically by
timestamp. Any old new.swap.state file is ignored.
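
The startup replay could look something like the sketch below. The file
naming follows the scheme above; the record format ("ADD <fileno> <url>"
and "DEL <fileno>", one per line) is purely an assumption for the
example and is not the actual swap.state format.

```python
import re
from typing import Dict, Iterable, List

# Matches swap.state.<nnnn> only; new.swap.state does not match.
LOG_NAME = re.compile(r"^swap\.state\.(\d+)$")

def replay_order(names: Iterable[str]) -> List[str]:
    """Return the log files to read, sorted numerically by suffix."""
    numbered = []
    for name in names:
        m = LOG_NAME.match(name)
        if m:
            numbered.append((int(m.group(1)), name))
    return [name for _, name in sorted(numbered)]

def replay(lines: Iterable[str], state: Dict[int, str]) -> None:
    """Apply hypothetical ADD/DEL records to state (fileno -> URL)."""
    for line in lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "ADD" and parts[1].isdigit():
            state[int(parts[1])] = parts[2]
        elif len(parts) == 2 and parts[0] == "DEL" and parts[1].isdigit():
            state.pop(int(parts[1]), None)
        # Anything else (for example a truncated last line) is skipped.
```

Since swap.state.1 has the smallest suffix it is read first, and since
each record simply sets or removes one entry, replaying the same logs
again after an interrupted start gives the same state.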

---
Henrik Nordström
Sparetime Squid Hacker

Message-ID: <34D8BD24.F5CC30@hem.passagen.se>
Date: Wed, 04 Feb 1998 20:10:28 +0100
From: Henrik Nordstrom <hno@hem.passagen.se>
X-Mailer: Mozilla 3.01Gold (X11; I; Linux 2.0.32 i586)
MIME-Version: 1.0
To: Stewart Forster <slf@connect.com.au>
CC: Duane Wessels <wessels@nlanr.net>, squid-dev@nlanr.net
Subject: Re: Store state logs
References: <199802040052.LAA28226@moto.off.connect.com.au>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Stewart Forster wrote:

> > Periodically write the swap state to new.swap.state. When this is done,
> > rename it to swap.state.1 and remove all previous transaction logs (all
> > but the current one).

One of my ideas was that transaction based logs allow us to compact the
state logs while running. Any changes during the "clean-write" are
recorded in the last transaction log.
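
Roughly, compaction while running might be sketched like this. The
function name compact, its arguments and the "ADD <fileno> <url>" record
format are assumptions for illustration only (the format also assumes
URLs without whitespace).

```python
import os
import re
from typing import Dict

LOG_NAME = re.compile(r"^swap\.state\.(\d+)$")

def compact(log_dir: str, state: Dict[int, str], current_log: str) -> None:
    """Write a clean base log and drop the superseded transaction logs.

    state is the in-memory swap state (fileno -> URL); current_log is
    the name of the transaction log still being appended to.
    """
    tmp = os.path.join(log_dir, "new.swap.state")
    with open(tmp, "w") as f:
        for swap_fileno, url in sorted(state.items()):
            f.write(f"ADD {swap_fileno} {url}\n")
        f.flush()
        os.fsync(f.fileno())         # make sure it is on disk first
    # The rename replaces the old base log in one step.
    os.replace(tmp, os.path.join(log_dir, "swap.state.1"))
    for name in os.listdir(log_dir):
        if LOG_NAME.match(name) and name not in ("swap.state.1", current_log):
            os.remove(os.path.join(log_dir, name))
```

If Squid dies before the rename, only a stale new.swap.state is left
behind and it is ignored at startup. If it dies after the rename but
before the old logs are removed, replaying them over the new base log
is harmless as long as the records are idempotent.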

---
Henrik Nordström
Sparetime Squid Hacker
Received on Tue Jul 10 2001 - 08:30:27 MDT
