Re: [squid-users] Ideal cache placement (was Re: Why Squid is great (was: fourth cache off??))

From: Jon Kay <jkay@dont-contact.us>
Date: Sat, 22 Dec 2001 20:57:42 -0600

Vivek grumbled:
> Jon Kay wrote:
> > Vivek originally sed:
> > # My gut feeling is that caches near real choke points make sense,
> > # and that caches near similar user populations also potentially
> > # makes sense. As for ideal placement, that probably depends more
> > # on topology than any blanket assertion can easily cover.
>
> > OK, so if we should centralize everything, then let's have One Big
> > Fast Cache in the center to serve all requests.
>
> If there aren't any choke points between the users and that central
> server, what's wrong with that argument?

1) If there wasn't a choke point before, there is one now.

2) Latency. The One Big Fast Cache can't be near everybody. Those
   Afghans whom you were so protective of in that other email will Not
   Be Happy. Even if the Red Cross buys them 30 Mbit/s of leased
   satellite bandwidth.

3) Hops. Chain together enough individually low-loss networks and you
   end up with a lossy path, because per-hop loss compounds
   exponentially with hop count (see the sketch after this list).

4) It will be administered in a way equally unpleasant to everybody.

5) Reliability.
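
To put a rough number on the hops point, here's a quick
back-of-the-envelope sketch in Python; the 0.1% per-segment loss rate
is just an assumption for illustration, not a measurement:

  # Per-hop packet loss compounds across a path: cross enough "unlossy"
  # segments and the end-to-end path is lossy anyway.
  per_hop_loss = 0.001                         # assumed 0.1% loss per segment
  for hops in (1, 5, 10, 20):
      delivered = (1 - per_hop_loss) ** hops   # chance a packet survives every hop
      print(f"{hops:2d} hops: {100 * (1 - delivered):.2f}% effective loss")

At 20 hops that notionally clean 0.1% per-segment loss is already about
a 2% end-to-end loss rate, which TCP definitely notices.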

> Again, I already addressed this:
>
> > # Adding new communications
> > # into the mix may not be a clear win, especially if the extra
> > # communication doesn't scale well with the number of caches.
>
> and
>
> > # The drawback to hierarchical caches
> > # is the additional latency involved in the hierarchy - much worse
> > # than router hops or line losses.

We quite agree. That's why we proposed hint caches, which get rid of
most of those troubles. Our paper shows that a system of hint caches
deployed over the US significantly outperforms hierarchical or
individualized caches.

Are you telling us that iMimic caches don't support ICP, Cache Digests,
or even cache hierarchies? Really??? Yet another reason to go Squid!

> > Plus, I didn't suggest one box per user, but one box per internal group.
> > In the real world, people setting up caches are able to make reasonable
> > decisions about these things.

> But again, why the arbitrary decision to aim for internal groups?
> If the group actually generates enough traffic to make their
> link to the internal backbone a choke point, then that's one thing.
> Otherwise, it's just more administration for little gain.

> Even from a price-performance standpoint, the approach of
> arbitrarily peppering internal groups doesn't make sense. The
> cheapest commercial cache at the 4th cacheoff was 130 reqs/sec
> for $2500. The next one up handled 800 reqs/sec for $3250.
> If I'm a system administrator at a company, I'd probably buy one
> or two of the 800 req/sec boxes rather than 6 or 12 of the
> 130 req/sec units.

That's because network server companies have spent years spinning the
benefits of centralizing everything. And they're right: there are real
savings in sysadmin effort. But it costs you user response time and
reliability.

> LAN bandwidth is pretty cheap and not nearly
> as lossy as you suggest.

OK, let's take a front-and-center look at these wondrous central LANs
of yours. They can often be 3, 4, even 5 bridge/router hops away, plus
3-5ms of assorted routing and propagation delay per round trip, which
we pay once for the TCP handshake and again for the HTTP
request/response. To that we have to add 3-30ms of store/forward delay
for the average 12k object crossing an assortment of 10BT and 100BT
Ethernets (quick arithmetic below). Just one old Ethernet, and a
less-than-aggressively designed network, and you have already doubled
that hit time of yours.
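
For reference, that 3-30ms store/forward figure is just serialization
delay for the assumed 12k object; a rough sketch, using nominal 10BT
and 100BT line rates:

  # Store-and-forward (serialization) delay per hop for a ~12k object.
  object_bits = 12 * 1024 * 8                # average object size, in bits
  for name, mbps in (("10BT", 10), ("100BT", 100)):
      per_hop_ms = object_bits / (mbps * 1e6) * 1000.0
      print(f"{name}: {per_hop_ms:.1f} ms per store-and-forward hop")

Three hops of the 10BT kind puts you at the top of that 3-30ms range;
three hops of 100BT sits near the bottom.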

Even in a 100BT environment, where I'll spot you a mere 9ms
delay, well, let's take a look at the math:

Total cache response time = L1part + L2part + misspart

L1part   = L1 hit time * L1 hit probability
L2part   = L2 hit time * L2 hit probability
misspart = miss time   * miss probability

In the monolithic world, we have just one iMimic DataReactor 2100, with
hit and miss times as reported in the Measurement Factory cacheoff
results, PLUS the 9ms just discussed. And no L2 cache or intercache
protocol of any kind.

Its user population is unlikely to be big enough to get up to the
maximal hit rate in the report. We'll give it a 35% hit rate, at a
guess.

iMimic L1part = 0.35 * (140 ms + 9 ms) = 52 ms
iMimic L2part = 0 ms
iMimic misspart = 0.65 * (3000 ms + 9 ms) = 1956 ms
iMimic total = 52 ms + 0 ms + 1956 ms = 2008 ms

OK, now let's take a look at a hypothetical cloud of hint caches, each
operating at Swell 1000 / Squid performance levels. Because each cache
has only a tenth of the user population, it loses about 10 percentage
points of hit rate, dropping to 25% (that's pretty uniformly the effect
of Zipf's Law). But we can get those 10 points back through "L2"
fetches to nearby caches, with about the same 9ms penalty we charged to
the iMimic L1 (less, really, but capturing that would take a more
complicated model and we don't need it to win).

Hint Cache L1part = 0.25 * 25 ms = 6 ms
Hint Cache L2part = 0.10 * (25 ms + 9 ms) = 3 ms
Hint Cache misspart = 0.65 * 2690 ms = 1749 ms
Hint Cache total = 6 + 3 + 1749 = 1758 ms
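
If you want to check the arithmetic or plug in your own numbers, here
it is as a small script; the 9 ms LAN penalty and the hit rates are the
assumptions discussed above, not cacheoff measurements:

  # Expected response time = L1part + L2part + misspart, per the model above.
  def cache_response_ms(l1_ms, l1_prob, l2_ms, l2_prob, miss_ms):
      miss_prob = 1.0 - l1_prob - l2_prob
      return l1_prob * l1_ms + l2_prob * l2_ms + miss_prob * miss_ms

  # Monolithic iMimic DataReactor 2100: 35% hit rate, no L2,
  # 9 ms LAN penalty on both hits and misses (assumed above).
  imimic = cache_response_ms(140 + 9, 0.35, 0, 0.0, 3000 + 9)

  # Cloud of hint caches at Swell 1000 / Squid levels:
  # 25% local hits, 10% recovered from nearby caches, 65% misses.
  hint = cache_response_ms(25, 0.25, 25 + 9, 0.10, 2690)

  print(f"iMimic total:     {imimic:.0f} ms")      # 2008 ms
  print(f"Hint cache total: {hint:.0f} ms")        # 1758 ms
  print(f"Ratio:            {imimic / hint:.2f}")  # ~1.14, about an eighth faster

Swap in your own hit rates and LAN penalty; the shape of the result
doesn't change much.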

There you go. Your users are about an eighth faster if you go Hint
Cache (1758 ms vs. 2008 ms). And it's all far more scalable.

Plus, the central version is less reliable, because now everybody
depends on the wires around those few boxes. It also needs faster, more
expensive LANs, because all the traffic converges on them. And if we
don't buy those faster LANs quickly enough, we get congestion and those
LOST PACKETS during busy periods.

-- 
Jon Kay        pushcache.com                      jkay@pushcache.com
http://www.pushcache.com/                             (512) 420-9025
Squid consulting				  'push done right.'