Re: [squid-users] Compute digest as content is written to cache

From: Drunkard Zhang <gongfan193_at_gmail.com>
Date: Sun, 12 Aug 2012 15:36:57 +0800

2012/8/12 Amos Jeffries <squid3_at_treenet.co.nz>:
> On 11/08/2012 10:21 p.m., Jack Bates wrote:
>>
>> On 11/08/12 12:30 AM, Amos Jeffries wrote:
>>>
>>> On 11/08/2012 7:22 p.m., Jack Bates wrote:
>>>>
>>>> I am interested in intercepting content as it is written to the cache,
>>>> and computing a digest from the content. Do you know if this can be
>>>> done in some kind of add on, or would it require a change to the core?
>>>
>>>
>>> What type of digest and to what purpose?
>>
>>
>> I was thinking of using OpenSSL
>> SHA256_Init()/SHA256_Update()/SHA256_Final(). The purpose I have in mind is
>> to detect identical content at different URLs
>>
>> Given a response with a "Location: ..." header and a "Digest: SHA-256=..."
>> header (such as from MirrorBrain), if the URL in the "Location: ..." header
>> is not already cached but the "Digest: SHA-256=..." header matches the
>> content at some other URL that is already cached, then I want to update the
>> "Location: ..." header with the cached URL. I think this should redirect
>> clients to mirrors that are already cached
>
>
> Small problem there. The digest is not calculated/known until the object is
> finished arriving. By then it is too late to attach new headers. And way too
> late to decide whether to ask that source for it.
>
Agreed. The same content arriving under multiple different sets of
headers is really hard to store. Splitting headers and content and
storing them separately may work, with the headers stored as a
one-to-many mapping (one body, many header sets).
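
To make the idea concrete, here is a rough sketch in Python, not Squid
code. It assumes the body arrives in chunks, feeds each chunk to an
incremental SHA-256 (the same pattern as SHA256_Init/Update/Final), and
keeps bodies and headers in separate tables. The class DedupStore and
its methods are hypothetical names made up for illustration, and the
digest is kept as hex here, whereas a real "Digest:" header would carry
a different encoding. As Amos points out, the digest is only known once
the last chunk has arrived.

```python
import hashlib


class DedupStore:
    """Toy in-memory store: one body per digest, many header sets per body.

    Hypothetical illustration only; this is not how Squid stores objects.
    """

    def __init__(self):
        self.bodies = {}          # digest -> body bytes (stored once)
        self.entries = {}         # URL -> (headers, digest)
        self.urls_by_digest = {}  # digest -> list of URLs with that body

    def store(self, url, headers, chunks):
        """Consume the body chunk by chunk, hashing as it is written."""
        hasher = hashlib.sha256()
        parts = []
        for chunk in chunks:
            hasher.update(chunk)  # incremental update, one call per chunk
            parts.append(chunk)
        digest = hasher.hexdigest()  # only available after the last chunk

        # Keep the body once, no matter how many URLs share it.
        self.bodies.setdefault(digest, b"".join(parts))
        # Headers stay per-URL and point at the shared body.
        self.entries[url] = (dict(headers), digest)
        urls = self.urls_by_digest.setdefault(digest, [])
        if url not in urls:
            urls.append(url)
        return digest

    def cached_urls(self, digest):
        """Return the URLs whose cached body matches the given digest."""
        return list(self.urls_by_digest.get(digest, []))
```

With something like this, a lookup by the digest from a "Digest:"
header would return the already cached URLs that a "Location:" header
could be rewritten to, provided the digest is sent in the response
headers rather than computed from the body being fetched.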
Received on Sun Aug 12 2012 - 07:37:24 MDT
