Re: [squid-users] Tuning Squid for large user base

From: James MacLean <macleajb@dont-contact.us>
Date: Sat, 6 Mar 2004 23:16:25 -0400 (AST)

On Fri, 5 Mar 2004, Peter Smith wrote:

> I can definitely help here as I have squid setup for a large Educational
> setting as well.

Hi Peter. Very thankful for your response and insight. I will be looking
at these suggestions.
 
> I agree with Mr. Crockett's suggestion with regards to hyperthreading.
> You will get more use out of all your CPU cache than half of it. Also,
> squid would not benefit from the lowered overhead of thread switching
> provided by hyperthreading. However I don't think that should affect
> what you are seeing.

I'm also not sure I want to start testing this until I've exhausted some
other avenues. It's a good suggestion, but as you say, it may not
make a great difference in my case.

> I'd be concerned with the "Median Service Times / HTTP Requests (All)"
> #s that you are seeing.

Since this was from a short run done just to gather the debug output to
send along, it may not have been the best data to pass along :(.

> Also, the "Largest file desc currently in use" of 11111 seems
> excessively high. I'd guess that your connection is slow enough that
> squid is having to hold and back-log too many requests.

This is _definitely_ the case. We have a rate-limited (QoS via Cisco)
6 Mb/s link. It's on a 100 Mb/s ethernet circuit. It runs _full_ all day,
hence the idea to apply Squid. We are already using Squid at over 300
educational sites and have had great success with it.

We do have an enormous amount of traffic trying to come through the pipe.
According to NTOP, about 70% of it is HTTP. Again, hence the idea to plug
in Squid :).

What is obvious is that letting everyone on our WAN fight the remote sites
directly for their web pages is quicker than having everyone proxied by
Squid. This is the first time we have actually seen Squid behave this way,
and obviously we have been trying to work out what we are doing wrong. Some
slight delay because of the proxy is fine, but as you watch, the traffic to
the Internet drops and client response time jumps :(.

We had to do the ulimit -n 25000 and update some .h files so that Squid
would compile to use more than 1024 FDs.

Now that I think of it, we didn't recompile the kernel. I wonder if there
is something we should have done there to accommodate the larger number of
open FDs?
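
For reference, here is roughly what we did, plus the sysctl that I believe
covers the kernel side without a rebuild (a sketch from memory, so treat
the exact file names and values as approximate):

  # raise the per-process FD limit before building and before starting Squid
  ulimit -n 25000

  # system-wide FD ceiling on Linux; no kernel recompile needed, if I
  # understand it correctly
  echo 65536 > /proc/sys/fs/file-max

  # after ./configure, bump the compiled-in limit to match
  # (SQUID_MAXFD in include/autoconf.h, if memory serves), then rebuild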

> You might do well to ssh into your squid box and try doing browsing
> actually at the proxy itself and compare the experience with an inside
> browser getting the same page through squid. My guess is it is probably
> the same--not very fast.

I would bet that if I tried from the Squid box itself, it would be very
slow, but that from another box on the same subnet it would be faster. But
I have not gone down that road yet. I've been using an MRTG graph of the
NIC activity and some other numbers to see the effect.
 
> Another thing that might be driving up the # of file desc usage is the
> "half_closed_clients off" you have. On my squids I run with this as in
> the default config, "on".

I've tried this both ways. But as noted above, a large backlog of client
requests builds up, so whichever setting helps in that situation is the
one I should probably be running.

> Also, on my systems we are running named instead of using squid's fqdn
> cache, this may help things out a bit--ymmv..

If the DNS times are any gauge, they always seem _very_ fast. We have a
DNS server on the same LAN and I believe it is OK... except if it is
lagging because the main pipe is full ;).

> I was running a squid box with only 1 GB of RAM recently and moved it to
> 4GB RAM. The only thing I altered was I changed from "cache_mem 128 MB"
> to "cache_mem 1024 MB" and "maximum_object_size 256 MB" to
> "maximum_object_size 512 MB".

Interesting about the maximum_object_size. I had it set large, but thought
smaller would be better so that we simply would not cache large objects...
They are long transactions and the client expects them to take longer :).
What is _really_ the benefit of a larger number here?

> I also increased the amount of "cache_dir" but found that this caused
> squid to eat up too much RAM for the disk index and pushed it back to
> the original setting.

This would be our next step... if we were happy with how
"no_cache deny all" was performing.

In the beginning we set it up like any other Squid site, except we gave it
a 20 GB cache_dir. But when it kept stalling requests so badly, we began
testing with nothing being saved to the cache.
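
For reference, the test setup amounted to something like this (a sketch;
the exact cache_dir line is from memory and the values are approximate):

  # original setup: a single ~20 GB cache_dir
  #cache_dir diskd /cache 20000 16 256

  # what we fell back to for testing: answer from the network, save nothing
  # to disk ("all" is the standard catch-all acl from the default config)
  no_cache deny all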

Again, and this may be important in our case, later Friday afternoon when
students started to leave, the 6M feed was not full. We turned up a 2G
cache_dir and it flew. Or at least it was very acceptable. But this, to me,
leaves two possibilities:

. Squid slows way down when its upstream request pipe is full, or
. there is a certain number of open FDs beyond which Squid starts to stall?

> I notice you have a "redirect_children 30"--are you running a
> redirector? This can significantly alter your numbers.

I am not doing any redirecting... or shouldn't be. That line was there,
but I understood it had no effect if you had no redirectors... Argh. Is
that not the case? Maybe that's causing us havoc :(?

> We currently have 3 seperate ICP-ing squids and each box has from 3-5
> cache drives. If you do not have multiple squids, you might want to
> consider more hardware.. Take a look at my CPU usage--these boxes are
> kept VERY busy, and I have three of them!

Are the 3 cache_dirs per box on different channels... for speed?

We only have one box at this time, but maybe we should be using more than
one.

> negative_ttl 30 seconds

I'll use this; it should allow some DNS failures to recover faster.
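
Actually, re-reading the docs, I think the DNS side may be covered by
negative_dns_ttl rather than negative_ttl (I could be wrong), so I will
probably try setting both:

  # cache failed requests only briefly (your suggestion)
  negative_ttl 30 seconds
  # and, if I have it right, this is the one for failed DNS lookups
  negative_dns_ttl 30 seconds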

> cache_dir aufs /cache1 2048 64 64
> cache_dir aufs /cache2 2048 64 64
> cache_dir aufs /cache3 2048 64 64

Why not diskd? We tried aufs and it didn't seem to work nearly as well as
diskd. Maybe we were not running it properly. Actually, we had compiled it
with threading. Maybe we should have tried it without threading?
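
In other words, the diskd equivalent of your lines for us would be
something like the following (the optional Q1/Q2 queue limits are just
values I have seen suggested, not something we have tuned ourselves):

  cache_dir diskd /cache1 2048 64 64 Q1=72 Q2=64
  cache_dir diskd /cache2 2048 64 64 Q1=72 Q2=64
  cache_dir diskd /cache3 2048 64 64 Q1=72 Q2=64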

> fqdncache_size 0

Does this mean 0 or no limit?

> Connection information for squid:
> Number of clients accessing cache: 7835
> Number of HTTP requests received: 22331805
> Average HTTP requests per minute since start: 2177.7
> Select loop called: 186487837 times, 3.299 ms avg
> File descriptor usage for squid:
> Maximum number of file descriptors: 4096
> Largest file desc currently in use: 946
> Number of file desc currently in use: 352
> Files queued for open: 1
> Available number of file descriptors: 3743
> Reserved number of file descriptors: 100
> Store Disk files open: 1

Does this suggest you are servicing 7835 clients and need no more than 946
FDs? 'Cause that looks like the opposite of mine. For example, the last
time we tested, I saw:

Number of clients accessing cache: 938
Number of HTTP requests received: 450489
Maximum number of file descriptors: 32768
Largest file desc currently in use: 2358

Fewer clients, way more FDs... Hmm.

> HTH,
> Peter Smith

Thanks again for the information,
JES

-- 
James B. MacLean        macleajb@ednet.ns.ca
Department of Education 
Nova Scotia, Canada
     