Re: MD5 and URL validation (continue to other very old thread)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 23 Nov 2012 16:11:42 +1300

On 23/11/2012 3:49 p.m., Eliezer Croitoru wrote:
> On 11/22/2012 10:35 AM, Henrik Nordström wrote:
>> ons 2012-11-21 klockan 21:06 +0200 skrev Eliezer Croitoru:
>>
>>> The problem is that it is only being checked while a file is being fetched
>>> from UFS (which is what I have checked), while from RAM it won't be checked.
>>
>> There is no risk of object store displacement in RAM.
>>
> I think that 99% of the time you should not expect object store
> displacement.
> If we do suspect it then it should be tested while rebuilding the store.
> the OS suppose to be very steady so

We get a surprisingly large number of queries about how to manipulate the
cache directories manually, in place, while the proxy is running, and
there are tools like squidpurge and server-push cache seeders all over
the place.

>
>>> The result is that when the store_url_rewrite feature is being used, the
>>> check points to an inconsistency between the request URL and the object in
>>> cache_dir (naturally).
>>
>> The metadata URL should be the store_url, not requested URL.
>>
> Of course, but the question is "to mess with it or not?"
>
> I had trouble making the needed data available, such as a flag or the
> rewritten URL, at the step of the check, since I kind of lost the scope
> of specific variables.
> It seems to me like I should use some transit stage in the process.

FYI: The very use of store_url means you have 'displaced' the actual
client query from some other object to reading this one. So it does not
matter what this one's original URL actually was; we just have to trust
that the admin rewriting .* -> http://example.com/ did so intentionally. The
problem of getting the wrong object back when fiddling with URLs is well
publicized and will continue to be so.
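
To make the point above concrete, here is a minimal sketch in Python (not
Squid code) of keying and verifying on the store URL rather than the
requested URL. The names store_key, store_url, metadata_matches, METHOD_GET
and the rewrite rule are all hypothetical, and treating the leading "byte"
as a method id is an assumption made for illustration.

```python
import hashlib
import re

METHOD_GET = 1  # assumed numeric id for GET; hypothetical, for illustration only


def store_key(method_id, url):
    """MD5 over one leading byte (assumed to be a method id) plus the URL bytes."""
    return hashlib.md5(bytes([method_id]) + url.encode("utf-8")).digest()


def store_url(request_url, rules):
    """Apply the first matching (pattern, replacement) rule, else keep the URL."""
    for pattern, replacement in rules:
        new_url, count = re.subn(pattern, replacement, request_url, count=1)
        if count:
            return new_url
    return request_url


def metadata_matches(meta_key, method_id, request_url, rules):
    """Check the key stored in the object's metadata against the store URL's key."""
    return meta_key == store_key(method_id, store_url(request_url, rules))


# Hypothetical rule that folds numbered mirrors into one cache entry.
RULES = [(r"^http://cdn\d+\.example\.com/", "http://cdn.example.com/")]

# Object was stored after a request to cdn1; a later request goes to cdn2.
key_on_disk = store_key(METHOD_GET, store_url("http://cdn1.example.com/a.bin", RULES))
print(metadata_matches(key_on_disk, METHOD_GET, "http://cdn2.example.com/a.bin", RULES))  # True
print(key_on_disk == store_key(METHOD_GET, "http://cdn2.example.com/a.bin"))  # False
```

Comparing against the raw request URL fails by construction once a rewrite
is in play, which is why the check has to use the rewritten URL or its key.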

>
>>> After a small talk with alex I sat down and made some calculations
>>> about
>>> MD5 collision risks.
>>> The hash used to make the index hash is a string from "byte + url".
>>> For most caches that I know of there is a very low probability for
>>> collision considering the amount of objects and urls.
>>
>> Verifying the MD5 is sufficient. But either MD5 or URL MUST be verified
>> in metadata on objects fetched from disk.
> OK, so MD5 is being checked and it's the store_url hash, which I think is
> how it should be done by design (and was done before).
>
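
For a rough feel of the collision risk discussed above, the standard
birthday approximation p ~= 1 - exp(-n(n-1)/2^(b+1)) can be evaluated
directly. This is a back-of-envelope sketch in Python, assuming MD5 output
behaves like a uniform 128-bit value for accidental (non-adversarial)
collisions; collision_probability is a hypothetical helper name.

```python
import math


def collision_probability(n_objects, hash_bits=128):
    """Birthday-bound estimate that any two of n_objects share a hash value."""
    pairs = n_objects * (n_objects - 1) // 2
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(-pairs / 2 ** hash_bits)


# Even a billion cached objects gives a probability on the order of 1e-21.
print(collision_probability(10 ** 9))
```

This only covers accidental collisions between cache keys; it says nothing
about deliberately constructed MD5 collisions.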

So can I take it that you have a working store_url implementation in squid-3
for review? :-P

Amos
Received on Fri Nov 23 2012 - 03:11:56 MST
