RE: [squid-users] Proxy Benchmarks

From: Chris Robertson <crobertson@dont-contact.us>
Date: Wed, 1 Dec 2004 10:33:44 -0900

>> > From: Ow Mun Heng [mailto:Ow.Mun.Heng@wdc.com]
>> > On Tue, 2004-11-30 at 03:10, Chris Robertson wrote:
>> > Do you have any experience with load_balance??
>>
>> I have some. I have somewhere between 150 and 200 remote sites each with
>> their own squid server that all have to pass traffic by a collection point
>> at the central office.
>
> I'm thinking more like a distributed collection point and not only 1
> Central Location.
>
> eg: X number of Remote server farms and X+1 number of squid servers.

As far as I know, either ICP or cache digest exchange should work. ICP is
very chatty (a query goes to the peers for each request that misses
locally), so it seems far better suited to peers that are very close (same
network segment), whereas a digest is an occasional transfer and seems
better suited to distant peers.

>
>
>> At the CO we have three Squid servers. Two are
>> acting as load balancing peers (each running one squid process)
> OK
>
>
>> and the
>> third is a parent for the two (running two Squid processes on a dual proc
>> box)
>
> Why 2 instances of Squid Processes?

Squid can't natively take advantage of multiple processors. In the interest
of not overwhelming the parent with requests from two children, and in the
interest of taking advantage of the second processor, while still having all
requests come from one IP address, I have the two children round robin
between the two processes on the parent squid box. If I had it to do over
again, I would set the three up as a virtual server
(http://www.linuxvirtualserver.org/). But if it ain't broke, don't fix it.

>
>> to give the world a single IP address that our traffic comes from.
> Is this advisable? Maybe for a private establishment, but may not be so
> for end-users (eg: ISP)
>

At first we just had the three central proxies acting as round-robin parents
for the remote sites. There are some web applications (some banking, others
educational) that don't like seeing a single client's "session" coming from
multiple IP addresses, which is why everything now goes out through the one
parent.
 
>> If
>> the parent dies, the two load balancers will surf direct.
> Surf Direct? What do you mean? No Squid proxy at all? Doesn't the 2,
> load balancers become the failover for the parent?
>

If the parent dies, the client sites continue to round-robin through the
children, which then fetch directly from the origin servers. If one of the
children dies, the clients surf through the remaining one. If the internet
link to the children dies, the sites don't have internet access. Hopefully
that answers your question.
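
A rough sketch of the failover behavior described above, written in Python
rather than as Squid configuration. The host names, the route() function and
its arguments are made up for illustration. What happens when both children
are down is not covered above, so treating that case as "no route" is an
assumption.

```python
def route(request_no, children_alive, parent_procs_alive, link_up=True):
    """Return the hops one request takes, or None if there is no route.

    children_alive: names of central child proxies that are up.
    parent_procs_alive: names of Squid processes on the parent box that are up.
    All names are hypothetical placeholders.
    """
    if not link_up or not children_alive:
        # Link down means no internet access. "No children" is an assumption.
        return None

    # Remote sites round-robin across whichever children are still alive.
    child = children_alive[request_no % len(children_alive)]

    if not parent_procs_alive:
        # Parent is dead: the child fetches directly from the origin server.
        return [child, "origin"]

    # Each child alternates between the parent's Squid processes.
    turn = request_no // len(children_alive)
    parent = parent_procs_alive[turn % len(parent_procs_alive)]
    return [child, parent, "origin"]


children = ["child-a", "child-b"]
parents = ["parent-1", "parent-2"]
for n in range(4):
    print(n, route(n, children, parents))
print("parent down:", route(0, children, []))
```

With everything up, every request leaves through the parent box, so the
outside world sees one source address. With the parent down, requests leave
from whichever child handled them.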

>> It's not the most
>> graceful solution, but it has been working for several months.
>>
>> Currently traffic is peaking about 100 requests/sec and 1.5MB/sec, with CPU
>> usage under 50% on all processors (Intel Xeon 3.0GHz, 2GB RAM on the peers
>> 4GB on the parent).
>
> Wow. how big is your cache_dir then? 10MB per 1GB of space.. you have
> what 200GB of Cache_dir on the peers?
>

Actually, I only have about 6 GB of disk cache on each central proxy. Much
of the RAM is being used to store hot objects, but these servers are not
really used for caching. Most of my customers access the internet over
satellite, so most of the caching is done at the customer's end.

> Reiserfs on aufs?
>

ext2 and aufs on the central children proxies; the parent is currently
running FreeBSD.

> What's your max_object_size?
>

After running an awk script (scalar.awk,
http://scalar.risk.az/scalar091/scalar.awk), I saw that the vast majority of
requests (over 90%) were for objects less than 10KB in size, so that's what
I set the central proxy servers' max_object_size to. At the client sites,
it's set to 50MB.
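
If you want a quick check of the same figure without scalar.awk, something
like the Python sketch below should get close. It is not what scalar.awk
does internally. It assumes Squid's native access.log format, where the
fifth field is the reply size in bytes (headers included, so it slightly
overstates object size). The function name and the sample log lines are made
up for illustration.

```python
def fraction_under(lines, threshold_bytes=10 * 1024):
    """Return the fraction of logged requests with reply size below the threshold."""
    total = 0
    small = 0
    for line in lines:
        fields = line.split()
        # Native access.log: time elapsed client code/status bytes method URL ...
        if len(fields) < 5 or not fields[4].isdigit():
            continue  # skip malformed or truncated lines
        total += 1
        if int(fields[4]) < threshold_bytes:
            small += 1
    return small / total if total else 0.0


# Hypothetical sample lines; in practice pass an open access.log file instead.
sample = [
    "1101929624.123 120 10.0.0.1 TCP_MISS/200 2048 GET http://example.com/a - DIRECT/192.0.2.1 text/html",
    "1101929625.456 80 10.0.0.2 TCP_HIT/200 512 GET http://example.com/b - NONE/- image/gif",
    "1101929626.789 900 10.0.0.3 TCP_MISS/200 5242880 GET http://example.com/c - DIRECT/192.0.2.1 application/zip",
    "1101929627.012 60 10.0.0.1 TCP_MISS/304 300 GET http://example.com/d - DIRECT/192.0.2.1 -",
]
print(fraction_under(sample))
```

To run it against a real log: with open("access.log") as f, call
fraction_under(f). The path is whatever your cache_access_log points to.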

> Thanks

Glad to be what help I can.
Received on Wed Dec 01 2004 - 12:33:49 MST
