Re: Pinning objects in Squid Cache from aditya agarwal on 2014-03-19 (squid-dev)

From: aditya agarwal <adi_agrwl_at_yahoo.co.in>
Date: Thu, 20 Mar 2014 13:22:16 +0800 (SGT)

Hi,

As Alex is out of office till April 2nd, can any other developer in the group comment on this approach?

Thanks,
Aditya

On Thursday, 20 March 2014 9:52 AM, aditya agarwal <adi_agrwl_at_yahoo.co.in> wrote:

Hi Alex,

Thanks for all the help. I understand it's difficult to find time with the ongoing Squid-3.0 project.

I have made some changes to the Squid code and wanted to get the approach reviewed:

1. I have added one more constant, 'ENTRY_PINNED', to the enum for StoreEntry->flags. As flags is a ushort and the enum had 15 entries, there was space for one more.
2. In clientCheckNoCacheDone(), instead of calling clientProcessRequest(), I call clientCheckPinned().
3. In clientCheckPinned():
   a. If the request is cachable and Config.accessList.pin_cached exists, I call aclNBCheck with clientCheckPinnedDone() as the callback.
   b. Else I call clientProcessRequest().
4. In clientCheckPinnedDone(), if the answer is ACCESS_ALLOWED, I set the ENTRY_PINNED flag. At the end of clientCheckPinnedDone() I call clientProcessRequest().
5. In lru_purgeNext(), I call a function storeEntryPinned() and, if it returns true, I add the StoreEntry to the tail of the list. This is similar to what happens when storeEntryLocked() returns true.
6. I add the object to the tail because I observed that when we access the object in lru_referenced() we add the object to the tail of the list.

Please let me know if this approach will work.

Thanks,
Aditya

On Wednesday, 19 March 2014 8:31 PM, Alex Rousskov <rousskov_at_measurement-factory.com> wrote:

On 03/19/2014 12:18 AM, aditya agarwal wrote:
> I would go with Implementation 4.1 and below is my understanding of it:
> 1. If the user wants to pin a video object from 10.102.1.1/abc.wmv.
> We define an acl as follows:
> acl ToPin 10.102.1.1/abc.wmv

Yes, although the above is not a valid acl declaration (it is missing an ACL type). I assume you just sketched it without the type.

FWIW, this project should not require new ACL types if you add support for the existing dst, dstdomain, etc. ACL types, but that may not be easy (see 4.1.2b below). Alternatively, you can add a new ACL type that matches the store entry key. That would ease implementation a lot, but force an admin to use a really ugly/unnatural interface.

> pin_cached allow ToPin
> 2. At the time of eviction check if the URL of the cached-entry
> matches(regex) with ToPin entries given in squid.conf

Yes. This may not be as trivial as it sounds though. Most ACL code in Squid requires an HTTP request to extract things like Request URI. You have two options:

4.1.2a: Restrict pin_cached to ACL types that do not require an HTTP request. This does not require special code (just documentation), although it would be best to check whether the configured ACLs require a request or not (there is an ACL::requiresRequest() API for that, but it needs more work to be usable for this configuration-time check). However, this option may not give you enough ACL types to use.

4.1.2b: Create a fake HttpRequest object and populate it with the victim's StoreEntry details. To avoid performance overheads, the object creation should happen once so that the same object can just be updated whenever a new victim is considered for replacement. Another problem you may face here is that the StoreEntry object for ufs caches does not have the Request URL (which is stored on disk). I cannot think of a good solution to this problem (without entering implementation option 4.2 territory).

> 3. If it is a match, move the entry to the top of the replacement
> policy index.

Yes.

> 4. If the user wants to unpin the object we will go and remove the
> acl from the squid.conf and eventually the object would be flushed
> out from the cache or we can also go and specifically delete it using
> SquidClient.

Correct. The pinned object may also be deleted while Squid handles regular traffic.
For example, an HTTP DELETE request for the same object may delete the cached entry. The pin_cached feature, as described here, does not violate HTTP because it only affects the cache replacement policy (which is outside of HTTP control). If you want to prevent cache deletions for reasons other than replacement policy decisions, more work is needed (and you may see significantly higher Squid Project resistance during the patch review).

> Please correct me if I have misunderstood something. Also as I am new
> to squid code, can you please point me to the files in Squid 2.7
> where I would need to make the required modifications?

Sorry, I cannot help with Squid v2.7 improvements at this time -- too much Squid3 work to spend time on a dead branch... The following could be a good starting step though:

    $ fgrep -RI Config.replPolicy squid-2.7/src/

Most of my comments should apply to Squid2 and Squid3, but please do not expect v2.7 patches to be officially accepted (or even reviewed).

HTH,

Alex.

> On Friday, 14 March 2014 8:51 PM, Alex Rousskov wrote:
> On 03/13/2014 11:46 PM, aditya agarwal wrote:
>
>> We have a set of new requirements in which we need to provide pinning
>> of objects in squid cache such that they are not evicted by squid's
>> LRU policy. I needed help in doing this and therefore had posted in
>> squid-users group regarding the same. I got responses from Alex and
>> Amos on some of the approaches that I can take to achieve this.
>
>> Alex also mentioned that I should post the same on squid-dev group
>> and discuss the best possible option. As per Alex following are the
>> approaches:
>
>> 1. Adding an interface to cache manager to pin-unpin specific cached objects.
>> 2. An extension of HTTP request method
>> 3. Adding ICAP/eCAP interfaces to mark misses for pinning
>> 4. Mark a pinning set configurable via squid.conf
>
>> I understand Options 1 and 4 but I am not very clear on what he means
>> in options 2 & 3.
>> Also we might only have a few 10s or 100-200 of
>> videos to pin.
>
> Option 2 means supporting a new HTTP method like X-PIN that will pin the
> referenced object to the cache. To use that option, the admin would have
> to send an HTTP X-PIN request to Squid.
>
> Option 3 uses an ICAP or eCAP response meta header that tells Squid that
> the requested object should be pinned. To use that option, the admin
> would have to use an ICAP service or an eCAP adapter.
>
>
>> Can you please provide your opinions on which option would be best and why?
>
> I think you should go with option #4. Add a new pin_cached directive
> similar to send_hit and store_miss in Squid trunk:
>
>     acl ToPin ...
>     pin_cached allow ToPin
>     pin_cached deny !ToPin
>
>
> * Implementation sketch 4.1 (simpler and more flexible):
>
> Check pin_cached whenever Squid replacement policy wants to delete a
> cached entry. Move rejected candidates to the top of the replacement
> policy index (as if they were "used" now) to avoid rechecking them too
> frequently. Eventually, you can add specialized ACLs that would match on
> the specific deletion reason/location and check pin_cached in other
> deletion cases/places.
>
>
> * Implementation sketch 4.2 (faster):
>
> Check pin_cached whenever Squid starts caching an entry or loads cached
> entry metadata during a cache index rebuild. Based on the results of
> that check, raise or clear a new STORE_PINNED entry flag. Do not let the
> replacement policy delete pinned entries, moving them to the top of
> the replacement policy index (as in 4.1). A Squid restart would be
> required to unpin pinned entries.
>
>
> If pinning protection goes beyond replacement policy decisions, it
> violates HTTP proxy caching rules and the feature should be marked as such.
>
>
>> Note: Our squid version is 2.7.
>
> Could have been worse!
>
>
> HTH,
>
> Alex.
>
>
>> On Thursday, 13 March 2014 11:13 PM, Alex Rousskov <rousskov_at_measurement-factory.com> wrote:
>> On 03/13/2014 09:52 AM, aditya agarwal wrote:
>>
>>> We had already thought of the second option to fetch the objects at
>>> regular intervals so that they are always at the head of the queue in
>>> cache, but it doesn't seem to be very scalable as we can have 100s of
>>> videos which the client might want to pin to cache.
>>
>> Please also keep in mind that not all cache_dir types support LRU. For
>> example, frequently requesting URLs in Rock storage would not help much.
>>
>>
>>> I wanted to know if there is any modification that can be done in
>>> squid to support pinning of objects.
>>
>> Yes, it would be possible to add such support. I can think of several
>> options:
>>
>> * A cache manager interface to pin and unpin individual cached objects.
>> It will not be simple if you want pinning to last across Squid restarts
>> or if you want to pin using regular expressions and such.
>>
>> * An extension HTTP request method for the same purpose, but cache
>> manager may be an overall better approach, especially from access
>> control point of view.
>>
>> * It is also possible to add an eCAP/ICAP (or even a new helper)
>> interface to mark misses for pinning. Adaptation makes pinning using
>> regular expressions easy, but it will add performance overheads unless
>> you are already using an adaptation service.
>>
>> * Finally, one could make a pinning set configurable via squid.conf
>> ACLs. For mostly static sets that can be stored in a few MB of RAM
>> (thousands of URLs, not millions), this is probably the most efficient
>> and simple option.
>>
>> If you decide to work on any of this, please consider discussing
>> specifics on squid-dev first. There are caveats related to each option
>> and the choice of the best option is not obvious IMO.
>>
>>
>> Cheers,
>>
>> Alex.
>>
>>
>>> On Thursday, 13 March 2014 3:05 PM, Amos Jeffries <squid3_at_treenet.co.nz> wrote:
>>> On 13/03/2014 9:22 p.m., aditya agarwal wrote:
>>>
>>>> Hi,
>>>>
>>>> I wanted to know if there is a way to PIN certain objects in Squid's
>>>> cache, so that they are not removed or subjected to eviction because
>>>> of the LRU policy running in squid.
>>>>
>>>> Thanks, Aditya
>>>>
>>>
>>> That depends on what the objects are ... so "what exactly are you trying
>>> to achieve?"
>>>
>>> Meanwhile ... objects locally served up by Squid using the
>>> /squid-internal-static/ well-known URL path prefix have it. Such things
>>> as icons for the error pages and FTP directory listings.
>>> See the mime.conf file installed with your Squid on how to configure
>>> those URL objects.
>>>
>>>
>>> However, if you are wanting this for arbitrary objects served up elsewhere:
>>>
>>> * the best way is not to bother.
>>> Cache is a _temporary_ storage area (a type of buffer) not a long term
>>> archive. Correctly following HTTP protocol ensures up to date reliable
>>> content at all times.
>>>
>>> * the second-best way is to simply poll your proxy with a request for
>>> it before the replacement policy removes it. This works on the same
>>> principle as prefetching and has all the same problems with generating
>>> correct client headers.
>>>
>>> Amos
>>>
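
The replacement-policy behaviour discussed in this thread (skip pinned entries during a purge and move them to the most-recently-used end of the index, as is described for locked entries) can be illustrated with a small model. This is only a sketch under stated assumptions: it is written in Python rather than Squid's C, it is not Squid code, and the names PinnedLruIndex, add, referenced and purge_next are hypothetical stand-ins for the StoreEntry flag and the lru_referenced()/lru_purgeNext() functions mentioned above.

```python
from collections import OrderedDict


class PinnedLruIndex:
    """Toy LRU index: the first key is the oldest entry, the last is the newest.

    Hypothetical model only; it mirrors the idea in the thread, not Squid's code.
    """

    def __init__(self):
        self._entries = OrderedDict()  # key -> pinned flag

    def add(self, key, pinned=False):
        # New (or re-added) entries go to the most-recently-used end.
        self._entries[key] = pinned
        self._entries.move_to_end(key)

    def referenced(self, key):
        # A hit moves the entry to the tail of the list.
        self._entries.move_to_end(key)

    def purge_next(self):
        """Evict and return the oldest unpinned key, or None if all are pinned."""
        # Examine each entry at most once so an all-pinned index cannot loop forever.
        for _ in range(len(self._entries)):
            key, pinned = next(iter(self._entries.items()))
            if pinned:
                # Treat the pinned entry as if it had just been used.
                self._entries.move_to_end(key)
                continue
            del self._entries[key]
            return key
        return None

    def keys(self):
        return list(self._entries)
```

One design point the model makes visible: a purge walk needs an upper bound, otherwise an index in which every entry is pinned would be rotated indefinitely. Here the walk stops after one pass and reports that nothing could be evicted.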
Received on Thu Mar 20 2014 - 05:25:25 MDT
