Re: [squid-users] Managing clusters of siblings (squid2.7)

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 02 Oct 2009 14:33:04 +1300

Chris Hostetter wrote:
> : > Couldn't the same thing be done with ACLs? (deny icp/htcp from
> : localhost)
>
> : The problem is multi-stage loops: proxyA->proxyB->proxyA which only shows
> : up in the two headers.
> :
> : The first-degree example I list above can be solved by ACL in icp_access,
> : but when you go another level out into a mesh things get much more
> : complicated unless pulling the data from those headers. So for example
>
> Gotcha ... I didn't even realize that multi-stage queries were something
> squid would do with siblings.
>
> But now you've got me scared: what prevents this from happening even
> with distinct configurations for each peer?
>
> If A, B, and C have cache_peer sibling configs that look like this...
>
> A -> B, C
> B -> C, A
> C -> A, B
>
> ...what prevents an A->B->C->A ICP loop from happening right now?

Nothing I know of. But that's not something to be terribly scared of.
'Merely' a waste of resources.

"via on" prevents A->B->A->C type loops nicely by aborting inside the
second A. The absence of that will leave requests looping until they hit
some timeout and all the looped peers close their connections.
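
As a rough illustration of that check, here is a minimal Python sketch. It
is a toy model, not Squid's code: the route() helper and the proxy names
are hypothetical, and it assumes each proxy adds its own name to the Via
list and refuses any request that already carries that name.

```python
def route(hops, via_on=True):
    """Walk a request along `hops` (proxy names, in order) and return the
    hops actually traversed.

    With via_on, each proxy records its name in a Via list and aborts a
    request that already carries its own name (a forwarding loop).
    """
    via = []
    traversed = []
    for name in hops:
        if via_on and name in via:
            # The second visit to the same proxy is caught here.
            traversed.append(name + " (loop detected, aborted)")
            break
        traversed.append(name)
        if via_on:
            via.append(name)
    return traversed


print(route(["A", "B", "A", "C"]))
# ['A', 'B', 'A (loop detected, aborted)']
print(route(["A", "B", "A", "C"], via_on=False))
# ['A', 'B', 'A', 'C']
```

With the Via list disabled nothing in this model stops the request, which
corresponds to the wait-for-a-timeout case described above.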

This is why we grit our teeth when people say "turn off the via header"
for 'privacy'. As if there was such a thing. The Via: header can be
easily anonymized by setting the visible_hostname to something simple
like 'the' or 'cat' which gives no identification info away.

>
> : Multicast-ICP with all the siblings NOT relaying tests at all might be the
> : best option for your current setup. Where peers simply get added to the
> : multicast group and first responder to a broadcast query gets used. You
>
> Hmmm... I briefly considered multicast, and while it seemed like it might
> be a worthwhile network optimization (to reduce the number of packets on
> the wire) I hadn't done any checking to see if my network was supporting
> it because it didn't seem like it would actually simplify the administration
> of clusters, particularly because of this line from the FAQ...
>
> http://wiki.squid-cache.org/SquidFaq/MultiCast#Should_I_be_using_Multicast_ICP.3F
> Multicast does not simplify your Squid configuration file.
> Every trusted neighbor cache must still be specified.
>
> ...but based on your comments, and reading a little closer, it seems like
> that's mainly just a security issue (further down: "...it would be a bad
> idea to implicitly trust any ICP reply from an unknown address"). If squid
> is running in a private network, and only reachable by internal hosts,
> then there's probably little downside in accepting multicast-replies
> blindly.
>
> Which raises the question: how would you configure squid to do that? Is
> putting a "multicast-responder" option on a "multicast" group cache-peer
> line supported?

By defining a network range, say 192.168.50.0/24 where all the trusted
multicast peers exist and denying any responses from outside that range.

This also helps with keeping multiple clusters of siblings separate,
each defined by its own IP range. But I imagine that is a bit beyond
what you need right now.
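
The range test itself can be sketched in Python. This only illustrates the
idea of accepting replies from one trusted network; it is not Squid
configuration syntax, and the accept_reply() helper and the sample
addresses are made up for the example.

```python
import ipaddress

# Range holding all trusted multicast peers (the example range from above).
TRUSTED_NET = ipaddress.ip_network("192.168.50.0/24")


def accept_reply(source_ip, trusted=TRUSTED_NET):
    """Return True only if the reply came from inside the trusted range."""
    return ipaddress.ip_address(source_ip) in trusted


# Hypothetical ICP replies as (source address, opcode) pairs.
replies = [
    ("192.168.50.11", "HIT"),
    ("10.0.0.7", "HIT"),
    ("192.168.50.12", "MISS"),
]
accepted = [(ip, op) for ip, op in replies if accept_reply(ip)]
print(accepted)
# [('192.168.50.11', 'HIT'), ('192.168.50.12', 'MISS')]
```

Separate clusters would each get their own network object in this model.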

>
> Your comment about "first responder to a broadcast query gets used" is
> also a little concerning. By "first responder" do you literally
> mean just the first server to respond to the multicast ICP request, even
> if it's a "MISS" ? ... or will squid not reply to a multicast query
> if it doesn't have a cached copy?
>
> (The situation I'm worried about is when squidB & squidC both receive the
> same multicast ICP query from squidA and B replies first with a MISS
> before C can reply with a HIT ... would squidA ever pay attention to
> squidC's reply?)

From what I could tell of the code when I looked, it takes the first HIT,
or gives up when the local timeout for the multicast response wait expires.
MISS responses are ignored. I'm not even sure if the siblings in
multicast mode would bother sending MISS since it seems mostly useless
(the only reason would be an I'm-alive indicator).
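
To make the squidB/squidC case concrete, here is a small Python model of
"first HIT or timeout". It reflects the behaviour as described above, not
a verified reading of the Squid source; choose_peer(), the arrival times
and the timeout values are all hypothetical.

```python
def choose_peer(replies, timeout):
    """Pick the peer whose HIT arrives first within `timeout` seconds.

    `replies` is an iterable of (arrival_seconds, peer_name, opcode)
    tuples in any order. MISS replies are ignored. Returns None if no
    HIT arrives before the timeout.
    """
    for arrival, peer, opcode in sorted(replies):
        if arrival > timeout:
            break  # everything after this arrived too late
        if opcode == "HIT":
            return peer
    return None


# squidB answers first with a MISS, squidC a little later with a HIT.
replies = [(0.002, "squidB", "MISS"), (0.005, "squidC", "HIT")]
print(choose_peer(replies, timeout=2.0))    # squidC
print(choose_peer(replies, timeout=0.004))  # None
```

In this model the early MISS from squidB never prevents squidC's HIT from
being used, as long as the HIT arrives before the wait expires.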

>
> : Or cache digest sharing, where all current peers' indexes are already known
> : and no query at all is made.
>
> This would still require each peer to be configured to point to every other
> peer (except itself) correct? ... so this would just be a network i/o
> optimization, not an administration simplification.
>

Aye, mostly network benefits and speedup. The admin gains from not having
to configure around possible loop scenarios. The config of each peer is
still there.
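
Since every peer still has to list every other peer, one way to keep that
manageable is to generate the lines from a single host list. A minimal
Python sketch, assuming the default ports 3128 (HTTP) and 3130 (ICP); the
hostnames and the sibling_lines() helper are invented for the example, and
any per-peer options would need to be added to suit the setup.

```python
def sibling_lines(peers, me, http_port=3128, icp_port=3130):
    """Return one cache_peer sibling line for every peer except `me`."""
    return [
        f"cache_peer {host} sibling {http_port} {icp_port}"
        for host in peers
        if host != me
    ]


# Hypothetical cluster members.
peers = ["squidA.example.net", "squidB.example.net", "squidC.example.net"]
for host in peers:
    print(f"# squid.conf fragment for {host}")
    print("\n".join(sibling_lines(peers, host)))
```

Adding a peer then means editing one list and regenerating the fragment
for each host.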

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE7 or 3.0.STABLE19
   Current Beta Squid 3.1.0.14
Received on Fri Oct 02 2009 - 01:33:12 MDT
