Re: LFU in squid from Yee Man Chan on 1999-04-25 (squid-dev)

From: Yee Man Chan <ymc@dont-contact.us>
Date: Sun, 25 Apr 1999 21:15:55 -0400 (EDT)

>
> This is correct as of Squid 2 (I've modified the replacement
> policy in 2.1.PATCH2). In Squid 1 (1.1.20 and presumably others) the
> replacement policy was implemented by bucket scanning; those comments
> still remain but are not valid.

Is it true that there is a list of objects in Squid 2 now? I looked at
1.1.20 and it obviously has no list at all.

> In the work we did here we implemented two LFU-variants. We
> used a heap for executing the policy efficiently. We also implemented a
> version of LRU that used the heap for comparison. These are not true
> LFU, since we do not keep metadata across object lifetimes -- when an
> object is removed from cache we remove its metadata. Proper LFU keeps
> object metadata. We are considering how to do this, and I'd be happy to
> discuss it further if someone wants to hack. Basic idea: keep the
> object metadata in a separate hash table and garbage collect it...

When you say proper LFU, do you mean perfect LFU, ie we keep the refcount
even if the object is purged?

> You're absolutely correct about the CPU usage. I have written a
> technical report that discusses this in a fair amount of detail. I
> presented a work in progress paper at WCW'99 and have now completed the
> work (there were a couple of puzzles I had to solve first). The paper
> is in final review and will be published externally in a little while.
> If anyone on this list would like to give me comments on it before
> publication (i.e., some time this week) I would be more than happy to
> send a pre-publication version to you, and would appreciate the eyes...

Can I get a copy? You can send it by email or give me the URL. Thanks in
advance.

>
> One other thing we noted in the experiments is that the heap
> based LRU policy achieved a significantly lower hit rate than the list
> based one (only the list one used storeExpiredReferencedAge). So, it
> costs more but achieves a little better hit rate. Nevertheless the
> frequency-based policies behaved quite a bit better than LRU. We intend
> to release the code changes to the Squid cache community. With the heap
> implementation it is trivial to investment and implement new replacement
> policies. Lots of people say it doesn't matter (because disks are so
> cheap and you can improve the cache in many other ways). But my feeling
> is if you can improve your resource utilization through the replacement
> policy, why not?

Does your LFU-variant have any aging policy? I know that if you don't do
that the old but high refcount objects will hardly be purged, so the
effective cache size keeps shrinking over a long period of time.
If you have aging policy, can you tell me how it works?
Plus, what do you mean by variant? How does it deviate from the
traditional memory paging LFU, ie refcount is lost when object is
purged and new objects are always allow to get in.

Actually, my colleague (the guy in CC:) also wrote a simulator using heap
with a comparison function pointer to implement LFU, LRU, GD-size and
other policies.

>
> One final note -- the heap implementation requires O(log N) time
> (where N is the number of objects in the heap) every time an object is
> hit or evicted. The list implementation requires O(1) to move an object
> to the head of the list or evict from the tail. We expected the new
> policies would take a CPU hit, although since CPU is not the bottleneck
> we felt it was worthwhile. To our surprise, the heap policies require
> less CPU than LRU due to storeExpiredReferencedAge, and since we are
> able to keep more objects in cache. That was quite pleasing.

What size of cache do you simulate? I don't know whether this is true when
the cache is 64GB - the size of cache COMPAQ sold for 200K.

>
> I'll have more information for the caching community shortly,
> including source code (unless my hands get cut off). A copy of the
> paper will appear on my home page. Best regards,

Where is your HP?

Thanks.
Yee Man Chan

>
> -- jad --
> John Dilley <jad@hpl.hp.com>
>
Received on Tue Jul 29 2003 - 13:15:57 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:12:06 MST