Re: [squid-users] fourth cache off??

From: Vivek Pai <vivek@dont-contact.us>
Date: Thu, 20 Dec 2001 14:57:20 -0500

Jon Kay wrote:
>
> > The Swell entry did a lot better on hit time than previous cacheoffs,
> > and I'm guessing that the underlying cause was largely the RAM disk.
> > It makes sense, since this workload has a median file size that is
> > smaller than its mean file size.
>
> ...which, of course, is true in spades for real cache workloads.
> Your point?

See below.

> > b) Is this approach scalable? The three entries with better hit times
> > had throughputs roughly 4-20 times as high. So, if your system can
> > only hold 4GB of memory, the RAM disk approach has less than a
> > factor of 2 from its current numbers before one would expect some
> > degradation.
>
> That's a silly comparison. You configure a given system to do best
> with the load it's given. One would almost think you were bringing
> this up to highlight iMimic's better throughput :-).

Please don't jump to conclusions. My point was that Joe stated that
his approach had lots of headroom based on what he saw with the disk
lights. However, if the memory needed scales with request rate, the
memory limit may be the bottleneck.

> Permit me to remind you of Squid's superior latency improvement/$ :-).

The top 3 hit times were iMimic OEMs, as were the top 6 overall
response times and the top 5-6 price/performance numbers. Our own
entry set a new throughput record for Linux-based systems and one
of our OEMs set the new record for single-box systems. I don't
think that it's a bad showing at all.

> > c) Is the comparison fair? Since polygraph/polymix is a disk-bound
> > workload . . .
>
> Of course it's fair. He had to pay the cost of the RAM in his entry.

That's not the point. I'm trying to look at technical metrics
rather than cost/marketing numbers. My point is that you're comparing
latencies for very different request rates. I believe I even said
that in the part that you clipped.

> > d) Finally, there's the issue of whether stable storage is important
> > for a proxy or not. If a large fraction of the content is stored on
> > a RAM disk, a reboot or power loss is a significant concern. My
> > conclusion is that if you can do without the RAM disk, it's
> > probably better to build a cache that uses stable storage for files
> > and uses memory only as a hot object cache.
>
> It's a CACHE. When the power goes out, your hit rate suffers for
> a while and then you come back. If your cache is located in
> Afghanistan, then I recommend going with the stable storage.

Here's a back-of-the-envelope calculation:

A 1GB RAM disk can hold about 250K objects of 4KB each. If
the cache is handling 130 reqs/sec and half of that is
cacheable, its fill rate will be about 65 reqs/sec. So, if
the RAM disk gets wiped, it'll take 250K/65 = 3846 seconds
(about 64 minutes) to rebuild it. Roughly one hour of degraded
performance after a reboot is possibly significant. If the
average object size in the RAM disk is smaller, then the
rebuild time gets even longer. It's up to the consumer as to
whether this is tolerable.
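
For anyone who wants to plug in their own numbers, here is a rough
sketch of the same arithmetic. The function and parameter names are
purely illustrative (they are not from polygraph or any cacheoff
tooling), and it assumes every cacheable request adds one new object
to the RAM disk, which is the same simplification as above.

```python
def rebuild_seconds(ram_disk_bytes, avg_object_bytes, req_rate, cacheable_fraction):
    """Estimate how long a wiped RAM disk takes to refill.

    Assumes each cacheable request stores exactly one new object,
    so the fill rate is simply req_rate * cacheable_fraction.
    """
    objects_to_refill = ram_disk_bytes / avg_object_bytes
    fill_rate = req_rate * cacheable_fraction  # objects stored per second
    return objects_to_refill / fill_rate


# The rounded figures from the estimate above: 250K objects of 4KB each,
# 130 reqs/sec with half of them cacheable.
baseline = rebuild_seconds(250_000 * 4096, 4096, 130, 0.5)
print(f"{baseline:.0f} seconds, {baseline / 60:.0f} minutes")

# Halving the average object size doubles the rebuild time.
smaller_objects = rebuild_seconds(250_000 * 4096, 2048, 130, 0.5)
print(f"{smaller_objects:.0f} seconds with 2KB objects")
```

If hits also refresh objects or the RAM disk only holds a subset of
what is cacheable, the real refill time will differ, so treat this as
an order-of-magnitude estimate only.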

-Vivek
Received on Thu Dec 20 2001 - 12:57:23 MST
