Re: adding content to the cache from Alex Rousskov on 2009-05-10 (squid-dev)

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Sun, 10 May 2009 15:22:16 -0600

On 05/10/2009 02:08 PM, Laurent Luce wrote:

> I am looking into adding objects at runtime, on a regular basis,
> probably 2 or 3 every minute or so. Those objects are stored on the
> machine running Squid and I also have the HTTP headers for those
> objects. I just want to support one storage scheme (the one by
> default I guess).
>
> Looking at your answer, you are saying that adding objects at runtime
> is quite complicated due to the way Squid keeps that information
> internally, am I correct ?

Squid keeps some store information in RAM. If you update storage
externally, you will need some kind of communication channel to update
that information if you want Squid to notice the changes in store
information. You will also need to take care of conflicts. AFAICT,
implementing that would be a waste of time for your use case.

You could implement store updates from the running Squid process itself.
This will work and will be very efficient. You may be able to use eCAP
request satisfaction, so that you do not have to learn a lot of Squid
code, deal with its complexities, and keep your custom patch in sync
with Squid code changes. AFAICT, this is _not_ the best way forward, but
I may be missing some important requirement. Please let me know if I am.

> I started looking at store*.c files to see if I could modify the
> store at runtime and add those objects. Is there any approach you would
> recommend if I wanted to go down this road?

I would recommend reusing the existing code:

1) When an object needs to be added, request it from Squid using an
off-the-shelf HTTP client like wget or curl. The client will talk to
Squid using HTTP.

2) Write a very simple script that would serve the right object with the
right headers, pretending to be an HTTP server or a peer cache. Any
popular scripting language will have an HTTP library that will make
writing such a fake server easy. Squid will talk to your script using HTTP.

3) If you are pushing objects for URLs that have different entities if
accessed directly, then you will need to play with configuration so that
only requests from your HTTP client are routed to your fake HTTP
server/peer. I do not have a recipe, but a simple IP- or header-based
ACL may be sufficient.
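
To make steps 1 and 2 more concrete, here is a minimal sketch of such a
fake server in Python, using only the standard library. Everything named
here is an assumption for illustration and is not part of Squid: the
OBJECTS table, the FakeOriginHandler class, the start_fake_origin() and
push_via_proxy() helpers, the placeholder body bytes, and the header
values. A real script would load your stored headers and object bytes
from disk instead of an in-memory table.

```python
import http.server
import threading
import urllib.parse
import urllib.request

# Hypothetical table: URL path -> (response headers, body bytes).
# The header values and body below are placeholders, not real data.
OBJECTS = {
    "/image.gif": (
        {"Content-Type": "image/gif",
         "Cache-Control": "public, max-age=86400"},
        b"GIF89a-placeholder-bytes",
    ),
}


class FakeOriginHandler(http.server.BaseHTTPRequestHandler):
    """Serves the stored object and headers for each requested URL."""

    def do_GET(self):
        # A proxy talking to a peer may send an absolute URL in the
        # request line, so keep only the path component for the lookup.
        path = urllib.parse.urlsplit(self.path).path
        entry = OBJECTS.get(path)
        if entry is None:
            self.send_error(404)
            return
        headers, body = entry
        self.send_response(200)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Suppress the default per-request logging to stderr.
        pass


def start_fake_origin(port=0):
    """Start the fake server in a background thread (port 0 = any free port)."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port),
                                             FakeOriginHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def push_via_proxy(url, proxy):
    """Step 1: request `url` through the proxy at `proxy` (e.g. Squid)."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy}))
    with opener.open(url) as response:
        return response.status, response.read()
```

With the fake server running, calling something like
push_via_proxy("http://example.com/image.gif", "http://127.0.0.1:3128")
would play the role of wget or curl in step 1, assuming Squid listens on
port 3128 and is configured to route that request to the fake server as
described in step 3. Whether Squid then stores the response still depends
on the headers you serve and on your cache configuration.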

Your 3/minute push rate is very low, so further optimizations are
probably unnecessary. If your objects are huge, you could optimize so that
the client does not have to receive the content even though Squid
fetches and stores everything. This would be similar to how some Range
and some IMS requests are processed by Squid.

The above approach will work with any storage scheme. Do you see any
compelling reason to implement what you want inside Squid instead?

Thank you,

Alex.

> ----- Original Message ----
> From: Alex Rousskov <rousskov_at_measurement-factory.com>
> To: Laurent Luce <laurentluce49_at_yahoo.com>
> Cc: squid-dev_at_squid-cache.org
> Sent: Sunday, May 10, 2009 12:01:20 PM
> Subject: Re: adding content to the cache
>
> On 05/09/2009 08:04 PM, Laurent Luce wrote:
>
>> Actually, I am looking at a way of adding it directly to the squid
>> cache. Basically, take the file and add it to the cache. I am looking
>> into patching Squid to provide an API to do that. How complicated do
>> you think it is if I want to add the file content along with the
>> metadata directly into the cache ?
>
> Do you want to add objects at runtime or offline? How many objects do you
> need to add (ballpark estimates: hundreds, thousands, millions)? How
> often do you need to add them (once, daily, weekly, etc.)? Are those
> objects stored on the machine running Squid? How are those objects
> stored now? Do stored objects come with HTTP response headers?
>
> If you want to add objects while Squid is running and want them to
> become available as they are being added, fetching those objects using
> wget/curl may be the best short-term solution (which can be optimized
> and tuned using special headers and a local script pretending to be an
> origin server).
>
> If you want to add objects offline, do you want to support multiple
> Squid storage schemes (e.g., ufs, COSS, RockStore, etc.)? The best
> implementation will probably depend on that and on the answers to the
> questions above.
>
> Thank you,
>
> Alex.
>
>
>> ----- Original Message ----
>> From: Amos Jeffries <squid3_at_treenet.co.nz>
>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>> Cc: squid-dev_at_squid-cache.org
>> Sent: Friday, May 8, 2009 1:42:53 AM
>> Subject: Re: adding content to the cache
>>
>> Laurent Luce wrote:
>>> I am looking for a way to manually add content to the cache. Is there an API to do that ?
>>>
>>> For example, I have the following file image.gif and I want to add it to
>>> the proxy cache so it can be served from there when needed.
>>>
>>> Laurent
>>>
>> Not at present.
>> You need to place it somewhere on a web server, assign it sufficient cache-control settings to store for a long while and then request it from squid.
>>
>> Common tools such as wget or squidclient can do that for you.
>>
>> Amos
>> -- Please be using
>> Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
>> Current Beta Squid 3.1.0.7
Received on Sun May 10 2009 - 21:22:26 MDT
