Re: adding content to the cache from Alex Rousskov on 2009-05-12 (squid-dev)

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 12 May 2009 13:33:35 -0600

On 05/12/2009 11:03 AM, Laurent Luce wrote:

> I think we are going to use a fake server and ask Squid to go get the
> file.
>
> Can you describe what squidclient -P (put file) does exactly?

It sends an HTTP PUT request to Squid and reads the response. The PUT
request body is taken from the named file. You most likely do not need
this because you want to GET content from the fake server, and not PUT
content to the fake server. Squid does not cache request bodies.
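A minimal sketch of the fake-server idea, assuming Python and its standard library only. The object table, the port, the header values, and the helper names (OBJECTS, FakeOrigin, start_fake_origin, fetch) are hypothetical and not part of Squid. The sketch also assumes Squid contacts the script as an origin server, so request paths arrive in the "/image.gif" form.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical object table: URL path -> (response headers, body bytes).
# In practice the headers and bodies would come from the stored files.
OBJECTS = {
    "/image.gif": (
        {"Content-Type": "image/gif",
         "Cache-Control": "public, max-age=86400"},
        b"GIF89a placeholder bytes",
    ),
}


class FakeOrigin(BaseHTTPRequestHandler):
    """Serves each known object with the headers recorded for it."""

    def do_GET(self):
        entry = OBJECTS.get(self.path)
        if entry is None:
            self.send_error(404)
            return
        headers, body = entry
        self.send_response(200)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def start_fake_origin(port=0):
    """Start the fake server in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), FakeOrigin)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url, proxy=None):
    """GET a URL, optionally through an HTTP proxy such as Squid."""
    proxies = {"http": proxy} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    with opener.open(url, timeout=10) as resp:
        return resp.status, dict(resp.headers), resp.read()
```

With Squid listening on, say, 127.0.0.1:3128 (an assumed address), calling fetch(url, proxy="http://127.0.0.1:3128") would play the role of wget or curl in this scheme. Whether Squid actually stores the response still depends on the response headers and on the Squid configuration.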

> Are you planning to do a segment cache in the future? This would be
> interesting especially for video caching. Or maybe it is there
> already and I missed it. Let me know.

What is a "segment cache"? Caching of partial responses? Caching one
response across several cache_dirs?

Thanks,

Alex.

> ----- Original Message ----
> From: Alex Rousskov <rousskov_at_measurement-factory.com>
> To: Laurent Luce <laurentluce49_at_yahoo.com>
> Sent: Sunday, May 10, 2009 9:36:27 PM
> Subject: Re: adding content to the cache
>
> On 05/10/2009 08:57 PM, Laurent Luce wrote:
>
>> We thought of using a web server and getting the object using Squid.
>> We were hoping for a way of telling Squid that an object has been
>> added to its cache using some API. Reading your email, it seems to be
>> quite complicated. We would have preferred to keep the download part
>> separated from Squid and this is kind of a requirement for us. I
>> think we will explore the server solution and ask Squid to request
>> the object itself.
>
> It is difficult for me to get into details without knowing more about
> your environment and requirements. If you cannot use the existing
> "external" cache push APIs via a simple fake "server" or "peer" hack,
> eCAP may be the second-best option overall that would still not require
> duplicating existing code and maintaining a lot of custom code.
>
> Good luck,
>
> Alex.
>
>
>> ----- Original Message ----
>> From: Alex Rousskov <rousskov_at_measurement-factory.com>
>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>> Cc: squid-dev_at_squid-cache.org
>> Sent: Sunday, May 10, 2009 2:22:16 PM
>> Subject: Re: adding content to the cache
>>
>> On 05/10/2009 02:08 PM, Laurent Luce wrote:
>>
>>> I am looking into adding objects at runtime, on a regular basis,
>>> probably 2 or 3 every minute or so. Those objects are stored on the
>>> machine running Squid and I also have the HTTP headers for those
>>> objects. I just want to support one storage scheme (the one by
>>> default I guess).
>>>
>>> Looking at your answer, you are saying that adding objects at runtime
>>> is quite complicated due to the way Squid keeps that information
>>> internally, am I correct?
>> Squid keeps some store information in RAM. If you update storage
>> externally, you will need some kind of communication channel to update
>> that information if you want Squid to notice the changes in store
>> information. You will also need to take care of conflicts. AFAICT,
>> implementing that would be a waste of time for your use case.
>>
>> You could implement store updates from the running Squid process itself.
>> This will work and will be very efficient. You may be able to use eCAP
>> request satisfaction, so that you do not have to learn a lot of Squid
>> code, deal with its complexities, and keep your custom patch in sync
>> with Squid code changes. AFAICT, this is _not_ the best way forward, but
>> I may be missing some important requirement. Please let me know if I am.
>>
>>
>>> I started looking at store*.c files to see if I could modify the
>>> store at runtime and add those objects. What approach would you
>>> recommend if I wanted to go down this road?
>> I would recommend reusing the existing code:
>>
>> 1) When an object needs to be added, request it from Squid using an
>> off-the-shelf HTTP client like wget or curl. The client will talk to
>> Squid using HTTP.
>>
>> 2) Write a very simple script that would serve the right object with the
>> right headers, pretending to be an HTTP server or a peer cache. Any
>> popular scripting language will have an HTTP library that will make
>> writing such a fake server easy. Squid will talk to your script using HTTP.
>>
>> 3) If you are pushing objects for URLs that have different entities if
>> accessed directly, then you will need to play with configuration so that
>> only requests from your HTTP client are routed to your fake HTTP
>> server/peer. I do not have a recipe, but a simple IP- or header-based
>> ACL may be sufficient.
>>
>> Your 3/minute push rate is very low so no further optimizations are
>> probably necessary. If your objects are huge, you could optimize so that
>> the client does not have to receive the content even though Squid
>> fetches and stores everything. This would be similar to how some Range
>> and some IMS requests are processed by Squid.
>>
>> The above approach will work with any storage scheme. Do you see any
>> compelling reason to implement what you want inside Squid instead?
>>
>> Thank you,
>>
>> Alex.
>>
>>
>>> ----- Original Message ----
>>> From: Alex Rousskov <rousskov_at_measurement-factory.com>
>>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>>> Cc: squid-dev_at_squid-cache.org
>>> Sent: Sunday, May 10, 2009 12:01:20 PM
>>> Subject: Re: adding content to the cache
>>>
>>> On 05/09/2009 08:04 PM, Laurent Luce wrote:
>>>
>>>> Actually, I am looking at a way of adding it directly to the squid
>>>> cache. Basically, take the file and add it to the cache. I am looking
>>>> into patching Squid to provide an API to do that. How complicated do
>>>> you think it is if I want to add the file content along with the
>>>> metadata directly into the cache?
>>> Do you want to add objects at runtime or offline? How many objects do you
>>> need to add (ballpark estimates: hundreds, thousands, millions)? How
>>> often do you need to add them (once, daily, weekly, etc.)? Are those
>>> objects stored on the machine running Squid? How are those objects
>>> stored now? Do stored objects come with HTTP response headers?
>>>
>>> If you want to add objects while Squid is running and want them to
>>> become available as they are being added, fetching those objects using
>>> wget/curl may be the best short-term solution (which can be optimized
>>> and tuned using special headers and a local script pretending to be an
>>> origin server).
>>>
>>> If you want to add objects offline, do you want to support multiple
>>> Squid storage schemes (e.g., ufs, COSS, RockStore, etc.)? The best
>>> implementation will probably depend on that and on the answers to the
>>> questions above.
>>>
>>> Thank you,
>>>
>>> Alex.
>>>
>>>
>>>> ----- Original Message ----
>>>> From: Amos Jeffries <squid3_at_treenet.co.nz>
>>>> To: Laurent Luce <laurentluce49_at_yahoo.com>
>>>> Cc: squid-dev_at_squid-cache.org
>>>> Sent: Friday, May 8, 2009 1:42:53 AM
>>>> Subject: Re: adding content to the cache
>>>>
>>>> Laurent Luce wrote:
>>>>> I am looking for a way to manually add content to the cache. Is there an API to do that?
>>>>>
>>>>> For example, I have the following file image.gif and I want to add
>>>>> it to the proxy cache so it can be served from there when needed.
>>>>>
>>>>> Laurent
>>>>>
>>>> Not at present.
>>>> You need to place it somewhere on a web server, assign it sufficient cache-control settings to store for a long while and then request it from squid.
>>>>
>>>> Common tools such as wget or squidclient can do that for you.
>>>>
>>>> Amos
>>>> -- Please be using
>>>> Current Stable Squid 2.7.STABLE6 or 3.0.STABLE15
>>>> Current Beta Squid 3.1.0.7
Received on Tue May 12 2009 - 19:33:49 MDT
