Re: [squid-users] Squid Hardware requirements.

From: Marcus Kool <marcus.kool_at_urlfilterdb.com>
Date: Fri, 14 Jun 2013 15:21:25 -0300

On 06/14/2013 01:03 PM, csn233 wrote:
> On Fri, Jun 14, 2013 at 11:34 PM, Marcus Kool
> <Marcus.Kool_at_urlfilterdb.com> wrote:
>
>>>> - more expensive (disks + battery-backed I/O controller)
>>>
>>> Expensive disks/battery-backed are over-kill. More/adequate spindles
>>> should do the job just as well. Why do you need a battery-backed
>>> controller? Squid is not a transaction-based system - if you lose the
>>> cache, tough, do "squid -z" and start again.
>>
>> Fast disks are good. Multiple controllers and multiple buses are good.
>> An EMC disk array is the most expensive and best option since Squid desires
>> a huge number of IOPS.
>> Battery-backed disk controllers are a good tradeoff: they are not so expensive
>> and give a reasonable performance boost.
>
> You are missing the point completely. I'm not discussing the
> performance in relation to good/bad hardware, or EMC or IOPS or
> otherwise. I'm talking about how it relates to Internet access
> performance. If you have good hardware, but the bottleneck is
> somewhere else, you are barking up the wrong tree.

Well, sorry for the misunderstanding. But how was I to know that you were talking
about disk versus Internet performance when the text you quoted is about disks?

>>>> - Squid uses more memory to index the disk cache (14 MB memory per GB disk
>>>> cache)
>>>
>>> My memory allocation is only about 20-30% of that (formula), and
>>> paging/swapping metrics don't indicate there is a problem. General
>>> formulas may not always apply.
>>
>> The 14 MB per GB is documented in the Squid wiki and based on the
>> observation that the average object size is 13 KB.
>> If you only have 20-30% of the formula you may have a larger average
>> object size or only use 20-30% of the configured disk cache.
>
> Yes, it may be documented. You forgot the IF's and MAY's. IF's and
> MAY's are very important. IF it applies to you, or it MAY apply to
> you. Try not to quote things without qualifications or understanding.

No. The FAQ is pretty clear about how many bytes are required per cached object.
No IF's or MAY's.
BTW: I co-authored this section of the FAQ.
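
To make the arithmetic behind the formula concrete, here is a minimal sketch.
It assumes that index memory scales linearly with the number of cached objects,
so it scales inversely with the average object size and proportionally with how
full the disk cache is. The function name and parameters are hypothetical and
for illustration only; they are not part of Squid.

```python
# Rough sizing sketch, not a Squid tool or API.
# Baseline from this thread: 14 MB of index memory per GB of disk cache,
# based on an average object size of 13 KB.
BASELINE_MB_PER_GB = 14.0
BASELINE_AVG_OBJECT_KB = 13.0


def index_memory_mb(cache_gb, avg_object_kb=BASELINE_AVG_OBJECT_KB,
                    fill_fraction=1.0):
    """Estimate index memory in MB for a disk cache of cache_gb gigabytes.

    Assumption (not from the FAQ text quoted here): memory is proportional
    to the object count, so larger objects or a partly filled cache reduce
    the estimate linearly.
    """
    if cache_gb < 0 or avg_object_kb <= 0:
        raise ValueError("cache_gb must be >= 0 and avg_object_kb > 0")
    if not 0.0 <= fill_fraction <= 1.0:
        raise ValueError("fill_fraction must be between 0 and 1")
    # Fewer objects per GB when the average object is larger than 13 KB.
    object_ratio = BASELINE_AVG_OBJECT_KB / avg_object_kb
    return BASELINE_MB_PER_GB * cache_gb * object_ratio * fill_fraction


if __name__ == "__main__":
    for size_kb in (13, 26, 52):
        estimate = index_memory_mb(100, avg_object_kb=size_kb)
        print(f"100 GB cache, {size_kb} KB average object: {estimate:.0f} MB")
```

Under this assumption, an average object size of 52 KB, or a cache that is only
a quarter full, gives about 25% of the baseline figure, which is in the range
of the 20-30% reported above.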

>> The thread started with a web proxy for an ISP.
>> ISPs generally do not want to restart the proxy and/or rebuild the index.
>> It takes too long.
>
> Don't assume. State the technical variables and let them decide.

We make assumptions with every word that we write here.
It is fair to assume that an ISP wants a stable proxy, and my
comment stated that assumption. IMHO there is nothing wrong with that.

>> No, memory is by far the fastest cache media. Since memory is
>> relatively cheap it is the best option.
>
> No, it's not when it doesn't solve the problem, if your bottleneck is
> somewhere else.

Eh? The "problem" of a web proxy is to serve HTTP/S requests as fast
as possible. As memory is the fastest medium for caching objects,
it is valid to go for a lot of memory.
The bottlenecks are network and disk.
The network bottleneck is the easiest: buy more bandwidth if required.
The disk bottleneck can be solved in two ways:
1) do not use disks (memory cache only)
2) go for fast disks, fast controllers and/or fast disk arrays.

>> Ok, I stated it a bit aggressively. It should read
>> "Buy as much memory as your budget allows".
>
> Wrong. Don't buy more than what you actually need.
>
>> That is your point of view. I prefer to be careful not to use
>> more than enough since it wastes memory.
>
> I said the same thing.
>
>> Squid is hungry for a large number of IOPS. So get the best that
>> your budget can buy.
>> For low budgets this is a relatively cheap caching disk controller,
>> for high budgets it varies between low-end and high-end disk arrays
>> (the ones that have between 32 and 1000+ spindles).
>
> Nothing to do with low or high budgets. Buy what will provide benefits
> in relation to cost.

Yes, that was the question of the person who started this thread:
what hardware will do?
And, as many will agree, there are a lot of IF's and MAY's and
there is no perfect answer. One can only try to help the person who
asks a question and help him decide. Explaining the need for
a fast I/O system is an important part of it. The implementation
options that I mentioned are suggestions that people can
evaluate. Most likely the original poster of this thread will
do that. A simple "Buy what will provide benefits in relation
to cost" neither answers the question nor offers any help.

Marcus
Received on Fri Jun 14 2013 - 18:21:35 MDT