Re: [squid-users] Squid Hardware requirements.

From: Marcus Kool <Marcus.Kool_at_urlfilterdb.com>
Date: Fri, 14 Jun 2013 17:34:32 +0200

On Fri, Jun 14, 2013 at 09:53:20PM +0800, csn233 wrote:
> With YMMV in mind, I get different mileage:
>
> On Fri, Jun 14, 2013 at 7:41 PM, Marcus Kool
> <marcus.kool_at_urlfilterdb.com> wrote:
> > and if your network pipe has sufficient capacity, also fetching
> > an object again from the internet can be faster than fetching from disk.
>
> Your network may be fast, but it doesn't imply a fast path between you
> and the origin server. In other words, it depends on other factors
> than just your own network pipe.

Yes, mileage varies and depends on many factors.
Overall, a Squid server without a disk cache can be faster than one with a disk cache,
so it is worth looking into.
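For what it is worth, a memory-only setup is easy to try. A minimal squid.conf
sketch (the sizes are placeholders, not recommendations); as far as I know,
recent Squid versions cache in memory only when no cache_dir is configured:

   # no cache_dir lines at all -> Squid caches objects in memory only
   cache_mem 8192 MB
   # allow larger objects into the memory cache than the small default
   maximum_object_size_in_memory 1 MB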

> > - more expensive (disks + battery-backed I/O controller)
>
> Expensive disks/battery-backed are over-kill. More/adequate spindles
> should do the job just as well. Why do you need a battery-backed
> controller? Squid is not a transaction-based system - if you lose the
> cache, tough, do "squid -z" and start again.

Fast disks are good; multiple controllers and multiple buses are good.
An EMC disk array is the most expensive and best-performing option, since Squid
wants a huge number of IOPS.
Battery-backed disk controllers are a good tradeoff: they are not so expensive
and give a reasonable performance boost.

> > - Squid uses more memory to index the disk cache (14 MB memory per GB disk
> > cache)
>
> My memory allocation is only about 20-30% of that (formula), and
> paging/swapping metrics don't indicate there is a problem. General
> formulas may not always apply.

The 14 MB per GB figure is documented in the Squid wiki and is based on the
observation that the average object size is 13 KB.
If you only see 20-30% of the formula, you may have a larger average
object size or only use 20-30% of the configured disk cache.
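As a rough sanity check of where that figure comes from (my own arithmetic,
not numbers from the wiki):

   1 GB / 13 KB per object   ~=  80,000 objects per GB of disk cache
   14 MB / 80,000 objects    ~=  180 bytes of in-memory index per object

So a larger average object size means fewer index entries per GB and
therefore less index memory, which would explain your 20-30%.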

> > unless a redundant hot-swap RAID array is used, less downtime.
>
> Older versions have a problem if a cache_dir fails, I think. Has this
> changed with later versions, or in the pipeline to change, anyone?

The thread started with a web proxy for an ISP.
ISPs generally do not want to restart the proxy and/or rebuild the index.
It takes too long.

> > One can also redistribute budget:
> > - use the budget of the disk system to max out memory.
>
> The benefits of memory will plateau pretty quickly. Unless one
> regularly has a whole bunch of users wanting to access the same pages
> within a relatively short time, the benefit from more memory has its
> limits. Max-out could easily become wastage.

No, memory is by far the fastest cache medium. Since memory is
relatively cheap, it is the best option.

> > - put as much memory as possible.
>
> Disagree - see above. It depends.

OK, I stated it a bit too aggressively. It should read
"Buy as much memory as your budget allows".

> > - carefully size the disk cache; not too large since Squid keeps the index
>
> Agree. If your hit-ratios don't increase, there's not much point in
> having larger cache_dir's. But I wouldn't go as far as "carefully".
> You just need enough or more, just not too much more.

That is your point of view. I prefer to be careful not to configure
more disk cache than needed, since the excess wastes memory on the index.
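To put a number on it (hypothetical size and path, just to illustrate the
tradeoff): a 200 GB cache_dir costs roughly 200 x 14 MB ~= 2.8 GB of RAM for
the index alone, memory that could otherwise go to cache_mem:

   # 200 GB aufs cache_dir (the size argument is in MB)
   # index memory: roughly 200 x 14 MB ~= 2.8 GB
   cache_dir aufs /var/spool/squid 204800 16 256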

> > - if using a disk cache, use fast disks and a very good caching I/O
> > controller to get maximum disk performance
>
> Up to a point only, as mentioned above. Local disk I/O may be fast,
> but it doesn't mean your internet access will be as well. Which means
> you end up spending money on hardware that does not deliver actual
> results.

Squid is hungry for a large number of IOPS, so get the best that
your budget can buy.
For low budgets that means a relatively cheap caching disk controller;
for high budgets it ranges from low-end to high-end disk arrays
(the ones with between 32 and 1000+ spindles).

> As Amos said, get the fastest per-core GHz you can find, number of
> cores not important. And have enough disk spindles.