Re: [squid-users] 2-gigabit throughput, 2 squids

From: Dave Dykstra <dwd@dont-contact.us>
Date: Wed, 13 Jun 2007 08:25:51 -0500

On Wed, Jun 13, 2007 at 09:33:19AM -0300, Michel Santos wrote:
>
> Dave Dykstra disse na ultima mensagem:
> > Hi,
> >
> > I wanted more throughput for my application than I was able to get with
> > one gigabit connection, so we have put in place a bonded interface with
> > two one-gigabit connections aggregated into one two-gigabit connection.
> > Unfortunately, with one squid, re-using objects that are small enough to
> > fit into the Linux filesystem cache but large enough to be efficient (a
> > few megabytes each), it maxes out a CPU core at around 140MB/s. This is
> > a dual dual-core AMD Opteron 270 (2GHz) machine, so it is natural to
> > want to take advantage of another CPU. (This is a 64-bit 2.6.9 Linux
> > kernel and I think I have squeezed about all I am going to get out of
> > software). At first I tried running two squids separately on the two
> > different interfaces (without bonding, 2 separate IP addresses) but that
> > confused the Cisco Service Load Balancer (SLB) we're using to share the
> > load & availability with another machine so I had to drop that idea.
> > For much the same reason, I don't want to use two different ports.
> > So then the problem is how to distribute the load coming in on the one
> > IP address & port to two different squids. Two different processes
> > can't open the same address & port on Linux, but one process can open a
> > socket and pass it to two forked children. So, I have modified
> > squid2.6STABLE13 to accept a command line option with a file descriptor
> > of an open socket to use instead of opening its own socket. I then
> > wrote a small perl script to open the socket and fork/exec the two
> > squids. This is working and I am now getting around 230MB/s throughput
> > according to the squid SNMP statistics.
> >
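In case anyone wants to try the same trick, here is a from-memory
sketch of the launcher (not my exact script; "--listen-fd" is a
made-up name standing in for whatever option your patched squid
defines, and the paths are just examples):

    #!/usr/bin/perl
    # Sketch: open one listening socket and hand it to two squids.
    use strict;
    use warnings;
    use IO::Socket::INET;
    use Fcntl;

    my $listen = IO::Socket::INET->new(
        LocalPort => 3128,
        Listen    => 1024,
        ReuseAddr => 1,
    ) or die "cannot open listening socket: $!";

    # Perl sets close-on-exec on new descriptors above $^F; clear it
    # so the socket is still open in the exec'd squids.
    my $flags = fcntl($listen, F_GETFD, 0) or die "fcntl F_GETFD: $!";
    fcntl($listen, F_SETFD, $flags & ~FD_CLOEXEC)
        or die "fcntl F_SETFD: $!";

    for my $i (1, 2) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # -N keeps squid in the foreground; each instance needs
            # its own config (separate cache_dir, pid_filename, etc.).
            # "--listen-fd" is hypothetical; use your patch's option.
            exec("/usr/local/squid/sbin/squid", "-N",
                 "-f", "/etc/squid/squid$i.conf",
                 "--listen-fd", fileno($listen))
                or die "exec failed: $!";
        }
    }
    wait() for 1 .. 2;    # reap both children

Since both children inherit the same listening descriptor, the kernel
hands each incoming connection to whichever squid is waiting in
accept(), so the load gets spread between them.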
>
> I use another approach: I run three squids, using two on 127.0.0.2 and .3
> which serve as parents. So the IP address which contacts the remote sites
> is the IP address of the server. I get very high performance and the setup
> is easy, without helper programs.

So everything is filtered through one squid? I would think that would
be a bottleneck. Does it not actually cache anything itself, just pass
objects through with proxy-only?
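I'm picturing the front end as something roughly like this (the ports
and options are my guesses, not your actual config):

    # front-end squid.conf sketch
    http_port 3128
    cache_peer 127.0.0.2 parent 3128 0 no-query proxy-only round-robin
    cache_peer 127.0.0.3 parent 3128 0 no-query proxy-only round-robin
    never_direct allow all    # always go through the parents
    cache deny all            # don't store anything on the front end

Is that close?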

> What was important to me is that I can
> sibling both in order to avoid getting objects cached twice

So you have the two backend squids set as proxy-only too? I would think
that half of your objects would then go through all 3 squids!
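(By "set as proxy-only" I mean something like this on each back-end,
pointing at the other one's loopback address; again just my guess:

    cache_peer 127.0.0.3 sibling 3128 3130 proxy-only

where 3130 is the ICP port the sibling answers queries on.)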

We have a pair of machines for availability & scaling purposes and I
wanted them to be siblings so they wouldn't both have to contact the
origin server for the same object. The problem with cache_peer siblings
is that once an item is in a cache but expired, squid will no longer
contact its siblings. In my situation objects currently expire
frequently so having siblings was pretty much useless (as I like to
describe it -- like the elevators in the 40-floor Wayside School, one
that goes up and one that goes down, which both worked perfectly --
once). We had instead configured them as cache_peer parents to avoid
that problem, but when we turned on collapsed_forwarding in squid 2.6
it led us into some deadlock situations. We decided that
collapsed_forwarding would result in fewer contacts to the origin server
than peering, so we ended up disabling peering altogether.
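(For anyone following along, that's just the one squid.conf line

    collapsed_forwarding on

which makes squid merge simultaneous requests for the same URI into a
single fetch from the origin server, instead of one fetch per client.)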

> I am curious if you can get higher throughput using sockets or TCP over
> the loopback.

What kind of throughput do you get with your arrangement?

- Dave