Re: [squid-users] max capacity for a T1/E1 in terms of HTTP req/sec

From: Joe Cooper <joe@dont-contact.us>
Date: Mon, 22 Oct 2001 01:40:42 -0500

You're reading way too much precision into this article, Khiz. I'm
knowledgeable about Squid and ISP client loads, but I'm not omniscient.

This is simply a rule of thumb. There is no hard and fast rule for how
many requests a T1/E1 can support or how many reqs/sec a client
population will generate. There cannot be. Alex and others have
explained this quite clearly here in response to your previous questions.

But, we know that a T1 is 1.5Mbits/sec. No more can be pulled from a
T1...so, given that, we can make some guesstimates. If the entire T1 is
dedicated to web traffic (which it never is, but let's say it is for
this estimate), about 150-300 dialup clients can browse normally at one
time, each making a few requests per minute. When all is said and done,
this workload will probably be about 15-20 requests/sec at peak load.
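As a back-of-the-envelope check on the numbers above (the ~12KB mean
object size is my assumption for illustration, not a measured value):

```python
# Rough estimate of HTTP req/sec a saturated T1 can carry.
# The mean object size is an assumed, illustrative figure.

T1_BITS_PER_SEC = 1_544_000        # T1 line rate
MEAN_OBJECT_BYTES = 12 * 1024      # assumed average response size

bytes_per_sec = T1_BITS_PER_SEC / 8
reqs_per_sec = bytes_per_sec / MEAN_OBJECT_BYTES
print(f"~{reqs_per_sec:.0f} reqs/sec at full saturation")
```

Real traffic is bursty and objects vary wildly in size, so this only
confirms the order of magnitude, not an exact figure.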

With a web cache, however, it is possible for more data to flow to
clients than comes through the T1, possibly a lot more. So, maybe, in
some perfect combination of events those 150-300 dialup users will all
request the same objects from the cache, or will suddenly storm the most
popular news sites--which have some cachable images and such--pushing
the request rate much higher than would be possible from a non-cached
T1. So, to be extraordinarily safe, you put a cache capable of 40
reqs/sec on your T1...you can relax and never worry about whether your
client load will overload your cache. If your client population is
normally generating 20 reqs/sec (not unusual for a single T1), then a
cache capable of 40 will be more than plenty.
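The amplification described above can be sketched with a toy model (the
35% hit ratio below is my illustrative assumption, not a measured
figure):

```python
# Toy model: how a cache lets client-side throughput exceed the uplink.
# Assumes misses are what saturate the uplink; hits are served locally.

def client_throughput_bits(uplink_bits: float, hit_ratio: float) -> float:
    """Client-side throughput (bits/sec) when misses fill the uplink.

    Misses make up (1 - hit_ratio) of traffic, so the same uplink can
    feed uplink / (1 - hit_ratio) bits/sec toward the clients.
    """
    return uplink_bits / (1.0 - hit_ratio)

t1 = 1.5e6
print(client_throughput_bits(t1, 0.35) / 1e6)  # ~2.3 Mbit/s to clients
```

With no cache hits at all the model degenerates to the raw uplink rate,
which is the sanity check you'd expect.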

That said, I couldn't build a cache using today's available hardware
that would be slower than that, even if I tried. In fact, I can't even
spec a machine small enough to handle only a single T1...it would cost
the same as a machine supporting two T1s. A single IDE drive and 128MB
of RAM will push 50-80 reqs/sec if tuned correctly. So, there is no
point
trying to 'precisely' spec a machine for a single T1. If your machine
is configured correctly, it will just work.

Oh, and the answer to your questions:

> since i am not very familiar with the polymix-2 workload can someone
> pls explain the same

Polymix-2 was the workload used at the second cacheoff. It is similar
to Polymix-3 (used at the third cacheoff) and Polymix-4 to be used at
the upcoming cacheoff in November. All are quite accurate simulations
of 'real' client traffic. The data generated is actual HTML web pages
with random sizes, and faked data files, the result being a very
realistic simulation of actual web browsing behavior.

They include two 4 hour 'peak' periods at the load specified (i.e. 40
reqs/sec, in this discussion), as well as ramp up/down periods and an
intervening 'idle' phase, simulating a work day in the life of a web
cache over a ~14 hour period. It also starts with a 'fill' phase (or
requires a separate one in the case of PM-2) which pumps data into the
cache at a high rate with low recurrence, so that the cache receives
enough data to fill its total capacity twice. This ensures that the
cache is operating in its 'normal' state during the benchmarking run,
complete with disk fragmentation and the other factors that slow a web
cache in real-world operation.

All very accurate and good. But it takes about 24-36 hours to run a
full test and requires some time and reading to configure correctly.
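For a rough picture, the phase structure described above can be laid
out as a timeline (the individual durations below are my guesses chosen
to total ~14 hours; the official .pg workload files define the real
ones):

```python
# Approximate shape of a Polymix-style measurement run.
# Durations are illustrative guesses, not the official definitions.
schedule = [
    ("ramp-up",   1.0),
    ("peak #1",   4.0),
    ("ramp-down", 1.0),
    ("idle",      2.0),
    ("ramp-up",   1.0),
    ("peak #2",   4.0),
    ("ramp-down", 1.0),
]
total = sum(hours for _, hours in schedule)
print(f"measurement portion: ~{total:.0f} hours (fill phase is extra)")
```

The fill phase on top of this is what pushes a full run out to the
24-36 hours mentioned above.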

> if i had to test using simple.pg workload on the same T1 line .would
> the number of HTTp req/sec change????????

simple.pg is barely a workload. It has no resemblance to reality. Do
not test a web cache using simple.pg. Just don't do it. There is no
point and no knowledge to be gained by doing so. simple.pg only tests
whether the Polygraph stations can talk to each other and whether your
build of Polygraph works. Nothing more.

If you must have an easy-to-use workload, and precision doesn't matter
much, Datacomm-1 is easy to configure and quick to provide results. It
isn't nearly as realistic as the Polymix-[234] workloads, nor is it as
exhaustive, but it can tell you roughly what kind of client load your
cache will support in about 4 hours (without a fill--so guesstimate a
much lower real result than what you can do on an empty cache run). I
run it quite often just to make sure nothing is broken on a new hardware
or software platform.

> does this imply that 40 req/sec is the max that can be achieved on a T1

No way! Sorry. A T1 will probably never support that many reqs/sec, or
4Mbits of web throughput, even for a few minutes, no matter how big or
fast your web cache is. More likely you'll be able to achieve extreme
peaks in the area of 20. I don't monitor any T1 web caches right now,
so I can't be terribly precise on this, but I think 20 might even be a
little high of an estimate. I was merely making a suggestion for how to
size a web cache to allow plenty of room for big spikes in usage.

Don't take simple suggestions to be "rules" or statistically precise
mathematical formulas, as they are not. There is no way to precisely
estimate traffic patterns of a group of users--a business or school LAN
will behave very differently than an ISP, and a dialup ISP will act a
lot different than a DSL ISP, which in turn differs from a satellite or
wireless ISP. All behave differently. But a T1 is always 1.5Mbits--so
you can be absolutely /sure/ that a cache capable of ~40 reqs/sec and
~4Mbits throughput on a Polymix-[234] workload will be able to handle
the client load of a 1.5Mbits uplink (while one that handles only
~20/2Mbits might not handle the peaks very well).
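The sizing rule running through this whole thread (spec the cache for
roughly double the expected peak so spikes never overload it) is
trivial to write down; the 2x headroom factor is just the rule of
thumb from this discussion, not a formula from anywhere official:

```python
# Rule-of-thumb cache sizing: benchmark rating = expected peak * headroom.
# The default 2x headroom is the informal suggestion from this thread.

def cache_spec_reqs(expected_peak_rps: float, headroom: float = 2.0) -> float:
    """Return the reqs/sec benchmark rating to look for in a cache."""
    return expected_peak_rps * headroom

print(cache_spec_reqs(20))  # 40.0 -> the ~40 reqs/sec figure for a T1
```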

Now, my question: Have you been worrying about getting 'only' 70
reqs/sec from your cache when you only have a single 1.5Mbit uplink to
support? ;-)

I guess the silly notion that Squid is 'slow' hasn't been put in
perspective lately. Squid is only slow if you're talking about 48Mbit
uplinks. If you're talking about a single T1, or even a couple T1s,
Squid is /fast/. You can't push Squid to its limits even if you try.
Tuning Squid is
required if you have to support a big pipe. If you're supporting a
little pipe, like a single T1, you don't really even need to read my
articles. Just build a correctly configured Squid and run it on any
modern hardware.

khiz code wrote:

> Hi all
> i recently went thru some article by Joe which says
> "A T1 or lower only requires
> about 4Mbps throughput from the cache (or ~40 reqs/sec of Polymix-2
> workload, if you are familiar with the IRcache bake-offs)."
> ????????????
> since i am not very familiar with the polymix-2 workload can someone
> pls explain the same
> if i had to test using simple.pg workload on the same T1 line .would
> the number of HTTp req/sec change????????
> does this imply that 40 req/sec is the max that can be achieved on a T1
> pls get back
> TIA
> khiz

-- 
Joe Cooper <joe@swelltech.com>
http://www.swelltech.com
Web Caching Appliances and Support
Received on Mon Oct 22 2001 - 00:36:43 MDT
