Squid refresh and LRU logic. Proposal.
by Andres Kroonmaa, 15.Jan.97

The current implementation of squid decides when to issue a refresh or purge an object from disk based on very few things: the timestamp (when the object was last validated with the source), the object's last-modified time as given by the source, and the object's expires time as given by the source. Expires is very rarely sent and is effectively an unused field in squid's metadata. There is also a last-reference time, but it is used only in memory and is not preserved across restarts; after a restart it takes the value of the last refresh time (timestamp).

The base logic when an object is requested is as follows. When an object is retrieved successfully, squid checks whether it is of private or public type; if it is public and passes the cache_stop regex, it is saved to disk storage, and that is mostly it. The principal logic is "save everything that comes in and see later whether that was effective". When the object is requested again, it must pass the refresh_patterns to be considered fresh. If it is fresh it is served immediately; if it is stale, it is IMS-refreshed, with a good chance of coming back not modified.

Overall this gives quite a good hitrate for caches with plenty of free storage, where object reuse frequency is higher than object purge frequency. The cache fills until there is no more space for additional objects. Then the LRU algorithm starts to purge objects that were refreshed longest ago (since lastref == timestamp after a reload), or that have expired.

What gets into the cache cannot be controlled in any finer way than yes or no (cache_stop), and once something is in the cache it cannot be selectively expired and thrown out. An example would be the desire to cache only certain TLDs, or _not_ to cache some TLDs; but since the cache is still asked for these objects, it would seem a good idea to cache them for some short time period and purge them from disk afterwards. To make the most of this, and not to blindly purge objects that would be requested again shortly, knowledge of each object's usage count and/or reference frequency is needed. Currently squid equates last-reference time with last-refresh time, but between refreshes there can be many references that do not trigger a refresh, so the actual time an object was last referenced is known only while squid is running; across reloads this information is lost. In this light, the LRU algorithm has a slightly different meaning than its name suggests.

Squid 1.0 used to estimate how long an object would live in the cache using ttl_patterns, which gave quite reasonable control over how cache storage was used. It was fairly easy to mark some URL patterns for deletion in the near future, whereas now it is almost impossible.

Web usage statistics show that the majority of cache hits occur within quite a limited time frame, usually from a few hours to a few days after the initial fetch. After that the hitrate of any object drops considerably and stays at some low level, depending on general interest in that particular object. It is therefore desirable to cache any object for some short time period and later, depending on its individual hitrate, either drop it or keep it.

With the current design squid "hopes" that any cached object will be re-requested soon (and is usually right about that), and the only other thing squid worries about is keeping cached objects up to date. Very little control is given to the cache manager over how objects will be purged when space is needed, and the LRU algorithm does not take any site-specific preferences into account.
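For reference, the freshness test that the refresh_patterns drive boils down to roughly the following. This is only a simplified sketch in C, not squid's actual code: the min/percent/max parameters stand for those of a matching refresh_pattern line (percent here as a fraction), the struct and function names are illustrative, and the various override options are ignored.

    #include <time.h>

    /* Simplified freshness test of the kind driven by a refresh_pattern
     * line (min, percent, max).  Illustrative only. */
    typedef struct {
        time_t timestamp;   /* when last validated with the source          */
        time_t lastmod;     /* Last-Modified from the source, 0 if unknown  */
        time_t expires;     /* Expires from the source, 0 if unknown        */
    } ObjectMeta;

    /* returns 1 for fresh (serve from cache), 0 for stale (IMS-refresh) */
    int
    is_fresh(const ObjectMeta *obj, time_t now,
             time_t min_age, double percent, time_t max_age)
    {
        time_t age = now - obj->timestamp;

        if (obj->expires)               /* explicit Expires wins, but is rare */
            return now < obj->expires;
        if (age > max_age)
            return 0;
        if (age <= min_age)
            return 1;
        if (obj->lastmod) {             /* last-modified factor heuristic */
            time_t lm_age = obj->timestamp - obj->lastmod;
            if (lm_age > 0 && (double) age / lm_age < percent)
                return 1;
        }
        return 0;
    }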
Very busy caches, and caches too small to hold all the usage patterns their clients push at them, maintain their contents quite ineffectively and take no account of site-specific bandwidth costs. The LRU as implemented is more like emergency firefighting: the need to free at least this much space right now by purging the oldest-refreshed objects.

On the other hand, cache managers tend to give the longest refresh intervals to URL patterns that are used the most or change the least. From this come two contradictory rules: on one side, the refresh logic refreshes static, rarely changing, ideal-for-caching objects least frequently, as there is simply no active need to; on the other side, LRU finds that exactly these most rarely refreshed objects are the best candidates to be purged. Meanwhile LRU tends never to purge objects whose lifetimes are known to be short and which are refreshed constantly, often on every request; yet it is just those objects that stay stuck to the disk space.

Consider a parent cache with 4GB of disk and 10 children with 1GB each. Suppose the children have around 20% of their interest in common, making 2GB in common. Now suppose each child's hitrate is 50%, which is fairly good. In some hypothetical time frame the children will push around 5GB of misses at their parent, of which only 20%, around 1GB, is of common interest. This means that over that period the parent will have to reuse all of its cache contents, and its hitrate can be 1GB/5GB = 20% at best, and that only if its LRU is ideal and manages to keep exactly that 1GB of common interest at the right time (the arithmetic is spelled out in the small sketch at the end of this section). A parent cache's hitrate can be as good as its children's only if its total storage is bigger than the children's aggregate. If it is not, its effectiveness depends entirely on the effectiveness of its LRU algorithm. Incidentally, big national parent caches seem to have poor hitrates exactly because they cannot accommodate all the aggregate traffic in their storage, and their LRU algorithm fails to find the most useful objects to keep.

Parent/child paradox. Is there any good reason to keep a copy of an object in the parent cache when it was requested by a child cache, the child is going to keep it anyway, and the parent can ask its children for it when needed? If the parent's total storage is smaller than the aggregate of its children, then there is not, as the copy will be flooded away by upcoming requests. Then is there any point in having disk storage on the parent at all? Make it only a coordinator of its children: when one child makes a request, poll the other children for a hit, and if they all miss, just proxy it. Then again, if the parent cache is huge and can hold anything its children could ever dream of, there seems to be no point in having disk storage at any child. Of course, in real life there are limited-bandwidth links between parent and children to be conserved, but the main truth is that a parent either has to have huge storage or must be pretty smart in selecting what to keep and what to drop.

Why LRU? LRU is by definition an algorithm that sorts items in order of their last access time and discards those that were accessed longest ago. Is this actually what Web caching would benefit from most? What if we instead implemented an algorithm that sorts items in order of their value, in the context of reusability and usefulness so far, and discards those which are least useful to keep; in short, weighting the cost of keeping or dropping an item. Let's still call it LRU, as in general it is a means to free some space when needed.
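As promised above, here is the arithmetic of the parent/child example spelled out as a tiny C program. All numbers are the assumptions from the example (traffic per child over the time frame is taken to be roughly its 1GB of disk); nothing here is measured data.

    #include <stdio.h>

    int
    main(void)
    {
        int    children         = 10;
        double child_traffic_gb = 1.0;   /* per child, over the time frame   */
        double child_hitrate    = 0.50;  /* served from the child's own disk */
        double common_fraction  = 0.20;  /* interest the children share      */
        double parent_disk_gb   = 4.0;

        /* what the children push at the parent = their misses */
        double pushed_gb = children * child_traffic_gb * (1.0 - child_hitrate);

        /* only the shared part can ever become a parent hit */
        double common_gb = pushed_gb * common_fraction;

        printf("pushed to parent:       %.1f GB\n", pushed_gb);      /* 5.0 */
        printf("of common interest:     %.1f GB\n", common_gb);      /* 1.0 */
        printf("parent disk:            %.1f GB\n", parent_disk_gb);
        printf("best possible hitrate:  %.0f%%\n",
               100.0 * common_gb / pushed_gb);                       /* 20% */
        return 0;
    }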
Let's try to define the values of a Web object that an LRU should weight. In no particular order:

* cost of retrieving the object in terms of delay, bandwidth, billing (this includes object size, connectivity to the source, byte cost);
* frequency of reuse of the particular object (overall average hitrate, or cost savings so far from keeping it);
* reference count so far (compared to reuse: is it saving any cost, or just wasting disk space?);
* cache operator preference for keeping/dropping otherwise equal items (to mark some types explicitly as candidates for dropping when space is needed);
* cacheability (constantly changing items should be preferred for dropping);
* last time of reference (objects that were very useful to keep, but that was a year ago...);
* possibly the trend of hitrate over time (hitrate going up or down).

Currently squid uses only the last time of reference (= refresh) of an object as the criterion for dropping. And it does that in portions of 256 objects at a time, removing the 8 worst of those 256. The problem is that this does not take into account whether the removed item actually gives a good hitrate in the long term, whether it is big or small, how many times it has been reused, etc. There are lots of objects that are referenced constantly but are never useful to cache; why store those to disk at all?

To make LRU more optimal, squid would need additional per-object statistics kept in its metadata and preserved across reloads. Parts of these weighting values would be discovered when the object enters the cache, parts would be modified during its reuse or hits, and parts would be weighted when the object is about to be dropped.

* * *

In general, consider this model (a rough sketch of the bookkeeping, in C, follows below).

1. When the object is first fetched.
Estimate the penalty of the fetch: size, time needed, delay, hopcount to the source. Estimate the cost of retrieval: a pattern-configurable billing weight, possibly taking hopcount into account.

   penalty = size*SizeFactor + time*TimeFactor [+ hopcount*HopFactor]
   cost    = penalty * PatternFactor

Express the cost in $$.$$ if you like.

2. During reuse (normal operation).
A pure hit increases the average hitrate and the hit count. The object has been useful to keep one more time; it has conserved bandwidth, time and thus some money:

   refcount++; usefulcount++;   ( and bytesaved = size*usefulcount )

A refresh or reload decreases or resets the object's usefulness and implicitly reduces the average hitrate and the money conserved:

   refcount++; usefulcount -= somefuncof(cost);

Save the last reference time into the swaplog file independently from the last refresh time, to keep track of the object's useful reuse count. Then we can calculate at any time:

   avghitrate == usefulcount / ( lastreference - firstfetch )
   or, avghitrate == usefulcount / ( lastreference - lastrefresh )
   or, usefulness == cost * usefulcount   ( $$ conserved )
   or, usefulness == ( usefulcount / refcount (%) ) * cost

Usefulness can drop below 0 and become negative. This is logically quite possible: consider objects that expire every 5 minutes but are requested no more often than every 30 minutes with a max-age of 5. Logically, it is cheaper not to use disk space for those at all.

Possibly use a linked list and move each used entry to its top, thus pushing the least recently used items towards the end. Garbage collection would then start from the end of the list.

Into the swaplog save, upon writecleanlog, in order from MRU to LRU:

   id timestamp expires lastmod (lastref refcount usefulcnt penalty) size URL

that is, 4 additional fields.

3. During garbage collection, weight each object by some rules.
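The sketch below shows how this bookkeeping might look in C. The field names mirror the proposal (timestamp, lastref, refcount, usefulcount, penalty, cost); everything else (WeightedObject, the factor values, the choice of somefuncof) is illustrative and not squid's actual code. The final function anticipates the threshold-based garbage collection discussed in the next section.

    #include <stddef.h>
    #include <time.h>

    /* Tunable weights; in squid these would presumably come from
     * pattern-based configuration.  The values here are placeholders. */
    #define SIZE_FACTOR   1.0     /* per byte           */
    #define TIME_FACTOR 100.0     /* per second         */
    #define HOP_FACTOR   10.0     /* per hop, optional  */

    typedef struct {
        time_t timestamp;      /* last validated with the source         */
        time_t lastref;        /* last reference, kept across reloads    */
        time_t firstfetch;
        size_t size;
        int    refcount;       /* all references                         */
        int    usefulcount;    /* references that were pure hits         */
        double penalty;        /* estimated at first fetch               */
        double cost;           /* penalty * pattern billing weight       */
    } WeightedObject;

    /* 1. When the object is first fetched: estimate penalty and cost. */
    void
    on_first_fetch(WeightedObject *o, size_t size, double fetch_secs,
                   int hopcount, double pattern_factor)
    {
        o->size       = size;
        o->firstfetch = o->timestamp = o->lastref = time(NULL);
        o->refcount   = o->usefulcount = 0;
        o->penalty    = size * SIZE_FACTOR + fetch_secs * TIME_FACTOR
                        + hopcount * HOP_FACTOR;
        o->cost       = o->penalty * pattern_factor;
    }

    /* 2a. A pure hit: the object conserved bandwidth once more. */
    void
    on_hit(WeightedObject *o)
    {
        o->refcount++;
        o->usefulcount++;
        o->lastref = time(NULL);
    }

    /* 2b. A refresh or reload: the object cost us another fetch.
     * "somefuncof(cost)" is left open in the proposal; charging one
     * full fetch is just one possible choice. */
    void
    on_refresh(WeightedObject *o)
    {
        o->refcount++;
        o->usefulcount -= 1;
        o->lastref = o->timestamp = time(NULL);
    }

    /* Usefulness as defined above: $$ conserved so far.  This can go
     * negative for objects refreshed more often than they give hits. */
    double
    usefulness(const WeightedObject *o)
    {
        return o->cost * o->usefulcount;
    }

    /* 3. During garbage collection: release objects whose usefulness
     * falls below a threshold (how the threshold could be chosen is
     * discussed in the next section). */
    int
    should_release(const WeightedObject *o, double threshold)
    {
        return usefulness(o) < threshold;
    }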
Ideally, garbage collection would sort all items in order of their weighted usefulness or cost-penalty and then release them in order of least importance. In reality that would mean incredible overhead, not suitable for an on-demand basis. Still, it could be done, say, once per day at some preconfigured off hours.

To be better suited to a constant runtime task, some kind of threshold can be used: failing to reach it means the item is to be released. Specifying this threshold manually would need heavy investigation, but it can be calculated from statistics and needs. For example, during runtime and at metadata reload it is fairly easy to calculate the average cost and usefulness of all objects, and then assign some percentage of that as the threshold. If this threshold is recalculated during a full garbage collection, self-tuning occurs. The threshold calculation should take into account the disk space used and the average daily addition to it, to keep free space between high/low watermarks. Average usefulness will slowly grow, thus increasing the bang for the buck.

Garbage collection could then be run as a restartable event in squid, each run scanning some of the objects and suspending until later. As the threshold is calculated to be some percentage of the overall average, there must be some percentage of objects to be released, and their particular order is of no importance. Compared to the current garbage collection, which scans a fixed 256 objects and blindly deletes the worst 8, this method would release strictly the objects that fall below some usefulness level.

As a result of such weighting and a more complex LRU, the cache contents would consist of the most useful material in terms of bandwidth conservation; the cache would tend to drop the least useful objects instead of the least recently refreshed ones, and in the end reduce costs and increase the overall hitrate, especially on caches with heavy transit traffic.

There will always be objects that are retrieved without prior knowledge that they are useless to cache. These items would be saved to disk only to be found, on the next few accesses, to be worthless for caching, and dropped. They include objects that expire immediately, objects whose contents include meta tags that force browsers to refresh, etc. Interestingly, if the reference count keeps increasing while the useful count only decreases and goes below 0, squid would heuristically discover that the referenced object is not cacheable, something that is currently done with manual configuration patterns. To keep such objects from entering the cache at all, there might be a separate list of recently dropped objects, and this list could itself be cached: upon a request for such an object, squid could retrieve the information about its (lack of) usefulness from this metacache and decide not to cache it at all. Disk usage is reduced and less work is left for the LRU. In fact, such objects in the swaplog file might carry all the information needed for caching, except that the file-id could be some fixed value declaring that this object is not cached and should not be. These entries might be built into the hash tables as always expired and with minimal data, and after some time without being referenced they can be dropped altogether.

-------------------------------------------------------------------
 Andres Kroonmaa                        Telefon: 6308 909
 Network administrator                  E-mail:  andre@ml.ee
 Organization: MicroLink Online         Phone:   (+372) 6308 909
 EE0001, Estonia, Tallinn, Sakala 19    Fax:     (+372) 6308 901
-------------------------------------------------------------------