Re: [squid-users] Re: ordering in rebuilt swap.state from Amos Jeffries on 2012-01-25 (squid-users)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 26 Jan 2012 18:47:10 +1300

On 26/01/2012 3:27 p.m., RW wrote:
> On Thu, 26 Jan 2012 12:19:30 +1300
> Amos Jeffries wrote:
>
>> On 26.01.2012 12:02, RW wrote:
>>> I noticed today that according to "squidclient mgr:storedir" my
>>> retention had dropped from 200+ days to 5 days, but all the content
>>> still seemed to be there. I tried rebuilding swap.state from the
>>> cache
>>> files, but it made no difference.
>> Under the LRU retention policy the cache age is just a measure of the
>> oldest file creation timestamp.
> mtime surely.

Modification is done by erase and replace. The 'U' in LRU means *used*
though, so if anything it's an inverted atime order list.

>
> My assumption was that the linked list in memory was not correctly
> ordered, and so the head object wasn't the least recently used.

Then that would be a bug in the list code. It has not been touched for
many years, so I would hope not.

>
>> I suspect you had one file up at 200 days and most of the cache under
>> 5 days. That file recently being erased the retention details drop
>> immediately to a different oldest entry (added 5 days ago).
> It isn't, it's 200 days of steady use, and the cache is less than half
> full, so nothing has been erased. And as I said squid had picked-up all
> the 200 days worth of content (8GB and 300,000 objects).

That "half full" tells me that Squid has not had reason to need more
space cleared by force. In which case old stale objects are left in
place just in case they are needed one day. At least 1 could reasonably
have not been touched since that first day.

>
> Actually I did lower the cache size to delete a few hundred MB, and I
> watched the "cache age" in real-time and it was all over the place

Do you mean "LRU reference age:" was randomly changing "all over the
place"? It is the head of the list, but if that happens to be locked for
use right then it will skip down and try displaying the next one. That
might account for LRU rising sometimes during the purge. It's expected to
change randomly but steadily downward during a purge until the new size
limit is reached.
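
A minimal sketch of that display behaviour, as a toy model rather than
Squid's actual list code. The class and method names (ToyLruList, touch,
set_locked, reference_age) are hypothetical, and times are plain numbers
(days here):

```python
from collections import OrderedDict


class ToyLruList:
    """Toy LRU policy list: the first entry is the least recently used."""

    def __init__(self):
        # key -> {"lastref": number, "locked": bool}; insertion order is LRU order
        self._entries = OrderedDict()

    def touch(self, key, now):
        """Record a use: move (or add) the entry to the tail of the list."""
        entry = self._entries.pop(key, None)
        if entry is None:
            entry = {"locked": False}
        entry["lastref"] = now
        self._entries[key] = entry

    def set_locked(self, key, locked):
        """Mark an existing entry as in use (locked) or free again."""
        self._entries[key]["locked"] = locked

    def reference_age(self, now):
        """Age of the first unlocked entry, scanning from the head."""
        for entry in self._entries.values():
            if not entry["locked"]:
                return now - entry["lastref"]
        return None  # empty list, or everything is locked


lru = ToyLruList()
lru.touch("a", 0)      # last used on day 0
lru.touch("b", 195)    # last used on day 195
print(lru.reference_age(200))   # head is "a"
lru.set_locked("a", True)
print(lru.reference_age(200))   # "a" is skipped, "b" is reported instead
```

In this model a single old entry at the head decides the reported age, so
locking or erasing it makes the figure jump to whatever comes next.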

Regular traffic flowing in and out of the cache may be freeing up space
by erasing things as they are invalidated and replaced. Squid is not
caching _everything_ so much as it is caching the latest copy of every
unique URL across that period. atime stays very short for most things,
but some get old. If you could graph it as count vs. age you would expect
to see a decay curve: as high as your req/sec traffic rate at 0-1 seconds
of age, tailing off out to the "reference age", days long, at the tail.

>>> I'm guessing that the rebuild had already been done automatically
>>> and that when that happens there's no explicit sort to restore lru
>>> order, it just relies on subsequent access and ageing. Does that
>>> sound reasonable?
>>>
>>> I was wondering, if I wrote a bit of script to rename the cache files
>>> into mtime order could I rebuild swap.state in lru order?
>> You could, but it would be of little value. The cache file names and
>> swap.state order have no bearing on the memory index hash algorithm,
>> retention policy algorithm,
> I assumed that part of the point of having a journal was that reading
> it in sequentially would lru order the queue when squid starts up.
> Does squid not preserve that information across restarts?

You are right. I checked the code and it uses the policy list to dump
the clean journal. I was thinking it used the index hash. But that makes
sense now that I think about it, the old index contains things for every
cache_dir, not just the one being dumped.
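
Roughly, the idea looks like the sketch below. It is only an
illustration under assumptions: the real swap.state is a binary file
with its own record layout, and the tuple format, the "ADD"/"DEL" tags
and the function names here are invented. It also assumes the dump runs
from least to most recently used:

```python
from collections import OrderedDict


def dump_clean_journal(policy_order):
    """Write one ADD record per entry, walking the policy list in order.

    policy_order: list of (key, lastref) pairs, least recently used first.
    """
    return [("ADD", key, lastref) for key, lastref in policy_order]


def replay_journal(records):
    """Rebuild the list by reading the journal sequentially.

    Each ADD appends to the tail, so a clean journal written in policy
    order comes back in the same order. A DEL drops the entry.
    """
    order = OrderedDict()
    for op, key, lastref in records:
        order.pop(key, None)          # forget any earlier record for this key
        if op == "ADD":
            order[key] = lastref      # (re)insert at the tail
    return list(order.items())


policy = [("old", 10.0), ("mid", 50.0), ("new", 90.0)]
journal = dump_clean_journal(policy)
print(replay_journal(journal) == policy)
```

Records appended after the clean dump simply replay on top of it, so a
later ADD for an existing key moves it to the tail.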

>> FYI: swap.state is simply a journal of what has been added/removed
>> from that cache. A "rebuild" is just a dump of the memory index
>> contents into a fresh empty journal file. Effectively erasing all
>> records of things removed.
>
> I was referring to deleting swap.state when squid isn't running, in
> which case it's recreated from the cache files.

If you do this erase you will lose the LRU atime details in the
journal. I'm saying atime, but it's also accounting for memory HIT stuff
which does not get recorded in disk atime records.

Amos
Received on Thu Jan 26 2012 - 05:47:18 MST
