Re: [squid-users] Denying write access to the cache

From: Amos Jeffries <squid3@dont-contact.us>
Date: Sat, 24 Mar 2007 12:23:30 +1200

Guillaume Smet wrote:
> On 3/23/07, Amos Jeffries <squid3@treenet.co.nz> wrote:
>> Looks like a case for something like this that prevents the group
>> 'robots' from retrieving data not already in the cache:
>>
>> acl robots <....>
>> always_direct deny robots
>
> No, that's not what I want. It's not a problem for us that robots
> index all the content of our website. I just want them to not put
> garbage into our cache.
> So they should be able to access every page of the site, using cache
> or not, but they shouldn't be able to put the generated pages in the
> cache so that they don't pollute the cache.
>
>> Still, I would pose you a question:
>> if people find and visit your page by going to a search engine how
>> can they find useful pages that nobody else has visited recently??
>
> I agree. That's why it's not what I'm asking for :).
>
> Thanks for your help.
>
> --
> Guillaume

Ah, now I understand.

This is a problem for your web server configuration then. Your cache and
others around the world can be expected to cache any content that they
are allowed to.
The best way to prevent this content from being cached is for the originating
web server to mark it as non-cacheable using "Pragma: no-cache" and
"Cache-Control: no-cache".

You can find more info on them here:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32
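For example, if the backend happens to be Apache httpd (just an assumption
for illustration; any web server or application framework can do the same),
something along these lines would add those headers only on requests whose
User-Agent looks like a robot. The pattern is only a placeholder; adjust it
to the crawlers you actually see in your logs:

  # Sketch only: assumes mod_setenvif and mod_headers are enabled, and that
  # the robots identify themselves via the User-Agent request header.
  SetEnvIfNoCase User-Agent "(googlebot|msnbot|slurp)" is_robot
  Header set Cache-Control "no-cache" env=is_robot
  Header set Pragma "no-cache" env=is_robot

With that in place, pages generated for robot requests carry the no-cache
headers, so Squid (and any other shared cache on the path) should not reuse
those responses, while responses to normal visitors remain cacheable.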

Amos
Received on Fri Mar 23 2007 - 18:23:34 MDT
