Re: adding content to the cache from Alex Rousskov on 2009-05-12 (squid-dev)

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 12 May 2009 14:23:46 -0600

On 05/12/2009 01:52 PM, Laurent Luce wrote:

> Is there an easy way to check whether a specific URL has been cached
> by Squid? I know about sending a HEAD request to Squid, but is there
> another way to do that?

A HEAD request with an only-if-cached Cache-Control directive is probably
the best way to do it. I am not sure whether Squid honors only-if-cached
for HEAD requests, but it should. Please test.

Wget and curl can do that check. I did not check squidclient, but I
suspect it can do it too.
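
For a scripted check, here is a minimal Python sketch of the same
probe. The proxy address 127.0.0.1:3128 and the names PROBE_HEADERS,
classify and is_cached are assumptions for illustration, not anything
Squid ships. It relies on the HTTP rule that a cache unable to satisfy
only-if-cached answers 504 (Gateway Timeout); whether Squid applies
that rule to HEAD requests still needs to be tested.

```python
import http.client

# Request header that asks the proxy to answer from its cache only.
PROBE_HEADERS = {"Cache-Control": "only-if-cached"}


def classify(status):
    """Map the proxy's status code to True (answered from cache),
    False (not in cache) or None (some other condition, e.g. an ACL denial)."""
    if status == 504:
        return False
    if 200 <= status < 400:
        return True
    return None


def is_cached(url, proxy_host="127.0.0.1", proxy_port=3128, timeout=5.0):
    """Send 'HEAD <absolute URL>' to the proxy and classify the reply.

    proxy_host and proxy_port are assumed values; adjust to your http_port.
    """
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    try:
        # Proxies expect the absolute URL as the request target.
        conn.request("HEAD", url, headers=PROBE_HEADERS)
        return classify(conn.getresponse().status)
    finally:
        conn.close()
```

Calling is_cached("http://example.com/image.gif") against a running
Squid returns one of the three values above. A None result deserves a
look at the Squid access log before drawing any conclusion.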

HTH,

Alex.

> ----- Original Message ----
> From: Alex Rousskov <rousskov_at_measurement-factory.com>
> To: Laurent Luce <laurentluce49_at_yahoo.com>
> Cc: squid-dev_at_squid-cache.org
> Sent: Tuesday, May 12, 2009 12:33:35 PM
> Subject: Re: adding content to the cache
>
> On 05/12/2009 11:03 AM, Laurent Luce wrote:
>
>> I think we are going to use a fake server and ask Squid to go get the
>> file.
>>
>> Can you describe what squidclient -P (put file) does exactly?
>
> It sends an HTTP PUT request to Squid and reads the response. The PUT
> request body is taken from the named file. You most likely do not need
> this because you want to GET content from the fake server, and not PUT
> content to the fake server. Squid does not cache request bodies.
>
>> Are you planning to do a segment cache in the future? This would be
>> interesting especially for video caching. Or maybe it is there
>> already and I missed it. Let me know.
>
> What is a "segment cache"? Caching of partial responses? Caching one
> response across several cache_dirs?
>
> Thanks,
>
> Alex.
>
>
>> ----- Original Message ----
>> From: Alex Rousskov <rousskov_at_measurement-factory.com>
>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>> Sent: Sunday, May 10, 2009 9:36:27 PM
>> Subject: Re: adding content to the cache
>>
>> On 05/10/2009 08:57 PM, Laurent Luce wrote:
>>
>>> We thought of using a web server and getting the object using Squid.
>>> We were hoping for a way of telling Squid that an object has been
>>> added to its cache using some API. Reading your email, it seems to be
>>> quite complicated. We would have preferred to keep the download part
>>> separated from Squid and this is kind of a requirement for us. I
>>> think we will explore the server solution and ask Squid to request
>>> the object itself.
>> It is difficult for me to get into details without knowing more about
>> your environment and requirements. If you cannot use the existing
>> "external" cache push APIs via a simple fake "server" or "peer" hack,
>> eCAP may be the second-best option overall, one that would still not require
>> duplicating existing code and maintaining a lot of custom code.
>>
>> Good luck,
>>
>> Alex.
>>
>>
>>> ----- Original Message ----
>>> From: Alex Rousskov <rousskov_at_measurement-factory.com>
>>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>>> Cc: squid-dev_at_squid-cache.org
>>> Sent: Sunday, May 10, 2009 2:22:16 PM
>>> Subject: Re: adding content to the cache
>>>
>>> On 05/10/2009 02:08 PM, Laurent Luce wrote:
>>>
>>>> I am looking into adding objects at runtime, on a regular basis,
>>>> probably 2 or 3 every minute or so. Those objects are stored on the
>>>> machine running Squid and I also have the HTTP headers for those
>>>> objects. I just want to support one storage scheme (the one by
>>>> default I guess).
>>>>
>>>> Looking at your answer, you are saying that adding objects at runtime
>>>> is quite complicated due to the way Squid keeps that information
>>>> internally, am I correct?
>>> Squid keeps some store information in RAM. If you update storage
>>> externally, you will need some kind of communication channel to update
>>> that information if you want Squid to notice the changes in store
>>> information. You will also need to take care of conflicts. AFAICT,
>>> implementing that would be a waste of time for your use case.
>>>
>>> You could implement store updates from the running Squid process itself.
>>> This will work and will be very efficient. You may be able to use eCAP
>>> request satisfaction, so that you do not have to learn a lot of Squid
>>> code, deal with its complexities, and keep your custom patch in sync
>>> with Squid code changes. AFAICT, this is _not_ the best way forward, but
>>> I may be missing some important requirement. Please let me know if I am.
>>>
>>>
>>>> I started looking at store*.c files to see if I could modify the
>>>> store at runtime and add those objects. What approach would you
>>>> recommend if I wanted to go down this road?
>>> I would recommend reusing the existing code:
>>>
>>> 1) When an object needs to be added, request it from Squid using an
>>> off-the-shelf HTTP client like wget or curl. The client will talk to
>>> Squid using HTTP.
>>>
>>> 2) Write a very simple script that would serve the right object with the
>>> right headers, pretending to be an HTTP server or a peer cache. Any
>>> popular scripting language will have an HTTP library that will make
>>> writing such a fake server easy. Squid will talk to your script using HTTP.
>>>
>>> 3) If you are pushing objects for URLs that have different entities if
>>> accessed directly, then you will need to play with configuration so that
>>> only requests from your HTTP client are routed to your fake HTTP
>>> server/peer. I do not have a recipe, but a simple IP- or header-based
>>> ACL may be sufficient.
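
Step 2 above can be sketched in Python with the standard http.server
module. Everything named here is hypothetical: the OBJECTS table, the
/var/push/ path, the Cache-Control value, the 127.0.0.1:8080 listening
address, and the helper names. Treat it as a starting point under those
assumptions, not a tested recipe.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlsplit

# Hypothetical table: request target -> (file on disk, headers to send).
OBJECTS = {
    "/image.gif": ("/var/push/image.gif",
                   {"Content-Type": "image/gif",
                    "Cache-Control": "max-age=86400"}),
}


def resolve(objects, target):
    """Look up a request target, given as a path or as an absolute URL
    (the form a proxy uses when it talks to a peer cache)."""
    parts = urlsplit(target)
    key = parts.path + ("?" + parts.query if parts.query else "")
    return objects.get(key)


def make_handler(objects):
    """Build a request handler class bound to the given object table."""
    class FakeOrigin(BaseHTTPRequestHandler):
        def do_GET(self):
            entry = resolve(objects, self.path)
            if entry is None:
                self.send_error(404)
                return
            filename, headers = entry
            with open(filename, "rb") as f:
                body = f.read()
            # send_response() also adds Server and Date headers itself.
            self.send_response(200)
            for name, value in headers.items():
                self.send_header(name, value)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
    return FakeOrigin


def serve(objects, host="127.0.0.1", port=8080):
    """Run the fake origin until interrupted."""
    HTTPServer((host, port), make_handler(objects)).serve_forever()
```

Running serve(OBJECTS) starts the fake server. Squid then has to be
configured to send the pushing client's requests to that address, as
described in step 3.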
>>>
>>> Your 3/minute push rate is very low, so further optimizations are
>>> probably unnecessary. If your objects are huge, you could optimize so that
>>> the client does not have to receive the content even though Squid
>>> fetches and stores everything. This would be similar to how some Range
>>> and some IMS requests are processed by Squid.
>>>
>>> The above approach will work with any storage scheme. Do you see any
>>> compelling reason to implement what you want inside Squid instead?
>>>
>>> Thank you,
>>>
>>> Alex.
>>>
>>>
>>>> ----- Original Message ----
>>>> From: Alex Rousskov <rousskov_at_measurement-factory.com>
>>>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>>>> Cc: squid-dev_at_squid-cache.org
>>>> Sent: Sunday, May 10, 2009 12:01:20 PM
>>>> Subject: Re: adding content to the cache
>>>>
>>>> On 05/09/2009 08:04 PM, Laurent Luce wrote:
>>>>
>>>>> Actually, I am looking at a way of adding it directly to the squid
>>>>> cache. Basically, take the file and add it to the cache. I am looking
>>>>> into patching Squid to provide an API to do that. How complicated do
>>>>> you think it is if I want to add the file content along with the
>>>>> metadata directly into the cache?
>>>> Do you want to add objects at runtime or offline? How many objects do you
>>>> need to add (ballpark estimates: hundreds, thousands, millions)? How
>>>> often do you need to add them (once, daily, weekly, etc.)? Are those
>>>> objects stored on the machine running Squid? How are those objects
>>>> stored now? Do stored objects come with HTTP response headers?
>>>>
>>>> If you want to add objects while Squid is running and want them to
>>>> become available as they are being added, fetching those objects using
>>>> wget/curl may be the best short-term solution (which can be optimized
>>>> and tuned using special headers and a local script pretending to be an
>>>> origin server).
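
The wget/curl fetch described above can also be done from a script.
Below is a minimal Python sketch using urllib.request. The proxy URL
http://127.0.0.1:3128 and the function names are assumptions for
illustration. The body is read and discarded, since the only goal is to
make Squid fetch and store the object.

```python
import urllib.request


def build_proxy_opener(proxy="http://127.0.0.1:3128"):
    """Return an opener that sends http:// requests through the proxy.

    The default proxy URL is an assumed value; use your Squid http_port.
    """
    handler = urllib.request.ProxyHandler({"http": proxy})
    return urllib.request.build_opener(handler)


def prime(url, opener, chunk_size=65536, timeout=30):
    """GET url through the opener, discard the body, and return the
    number of body bytes that were read."""
    total = 0
    with opener.open(url, timeout=timeout) as resp:
        while True:
            chunk = resp.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

A push then amounts to prime(url, build_proxy_opener()) for each new
object. Whether Squid actually stores the response depends on the
response headers and on the cache configuration.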
>>>>
>>>> If you want to add objects offline, do you want to support multiple
>>>> Squid storage schemes (e.g., ufs, COSS, RockStore, etc.)? The best
>>>> implementation will probably depend on that and on the answers to the
>>>> questions above.
>>>>
>>>> Thank you,
>>>>
>>>> Alex.
>>>>
>>>>
>>>>> ----- Original Message ----
>>>>> From: Amos Jeffries <squid3_at_treenet.co.nz>
>>>>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>>>>> Cc: squid-dev_at_squid-cache.org
>>>>> Sent: Friday, May 8, 2009 1:42:53 AM
>>>>> Subject: Re: adding content to the cache
>>>>>
>>>>> Laurent Luce wrote:
>>>>>> I am looking for a way to manually add content to the cache. Is
>>>>>> there an API to do that?
>>>>>>
>>>>>> For example, I have the following file image.gif and I want to add
>>>>>> it to the proxy cache so it can be served from there when needed.
>>>>>>
>>>>>> Laurent
>>>>>>
>>>>> Not at present.
>>>>> You need to place it somewhere on a web server, assign it sufficient cache-control settings to store for a long while and then request it from squid.
>>>>>
>>>>> Common tools such as wget or squidclient can do that for you.
>>>>>
>>>>> Amos
>>>>> -- Please be using
>>>>> Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
>>>>> Current Beta Squid 3.1.0.7
Received on Tue May 12 2009 - 20:23:57 MDT
