Re: Seeking enterprise ideas for Squid

From: Francesco Chemolli <kinkie-ml@dont-contact.us>
Date: 09 Oct 2002 13:15:00 +0200

GV <gv_kovai@yahoo.com> writes:

[...]

> Network I/O: Instead of polling client connections,
> Squid could employ an interrupt mechanism to determine
> the next “ready” client. We implemented RT Signals
> (targeted at improving performance by reducing network
> I/O), and got a lot of help and support from the Open
> Source developers. (The intent here is not to glorify
> RT Signals vis-à-vis polling. Rather, it is an example
> of the kind of ideas we are looking for in order to
> bring Squid performance progressively closer to the
> “desired enterprise level performance” as defined by
> some of our potential clients).

I'd like to give your solution a spin once my current batch of
(unrelated) tests is done. I don't like the idea of a required kernel
patch, though.

> The question remains: What else can be done to improve
> networking I/O performance? In large systems, where
> one can mitigate the impact of disk I/O by throwing
> huge amounts of real memory at the problem, this seems
> to be the place to go for getting better performance
> numbers.
>
> Disk I/O: We have not looked at this issue in great
> detail. There does seem to be an opportunity to
> improve I/O performance by using raw I/O instead of
> going through the file system. There may be other
> alternatives as well.

Adrian is working on refactoring the store I/O interfaces. Raw-disk COSS,
or COSS with O_DIRECT (on Linux), may be a good idea, but this has to wait
until the store interfaces are sorted out.

> Multithreaded Squid: This comes up during discussion
> with enterprise customers. We are not sure it is
> necessarily the only way to go. But the economics of
> running Squid on a single box (duly partitioned)
> vis-à-vis running multiple instances on multiple
> systems can get pretty overpowering. Does the
> development community have any thoughts to share on
> the potential of a multithreaded Squid product? Or are
> there ways to run multiple instances of Squid on a
> properly partitioned SMP system, with acceptable
> levels of scaling in thruput and response time?

Squid IS multithreaded. It's just multithreaded in the lightest possible
way, namely select() and friends.
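
To make the "select() and friends" point concrete, here is a minimal
sketch of a single-threaded readiness loop. It is not Squid code: it uses
Python's standard select module, and the helper name serve_ready and the
socket-pair "clients" are made up for illustration.

```python
import select
import socket

def serve_ready(socks, timeout=1.0):
    """One pass of a select()-style loop.

    Returns {socket: data} for every socket that had data waiting.
    Sockets that are not ready are simply not touched.
    """
    readable, _, _ = select.select(socks, [], [], timeout)
    results = {}
    for s in readable:
        results[s] = s.recv(4096)
    return results

# Two fake "client connections" built from socket pairs (illustrative only).
a_client, a_server = socket.socketpair()
b_client, b_server = socket.socketpair()

a_client.sendall(b"GET /a")  # only client A has sent a request

# One thread, two connections: only the ready one gets serviced.
ready = serve_ready([a_server, b_server], timeout=0.5)

for s in (a_client, a_server, b_client, b_server):
    s.close()
```

One process multiplexes all connections and only ever touches the ones the
kernel reports as ready, which is the sense in which the above is
"multithreaded in the lightest possible way".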

Jests aside, I think a nice approach would be to allow pooling an array's
cache, but this also requires some infrastructure work, namely pushing the
store dirs' index into the store modules.

On the MT issue, Adrian already answered: one thread per connection is no
good. Another idea which might be interesting is parallelizing the I/O
tasks. Aufs is an example of this. It would be interesting to know if and
how it would be possible to do the same with client-squid and
squid-upstream connections.
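
As a rough illustration of the aufs-style idea (blocking disk reads pushed
to worker threads while the main loop keeps using select()), here is a
small sketch. It is not how aufs is actually implemented: the names
read_file and read_via_workers are hypothetical, and the wake-up pipe is
just one common way to tell a select() loop that a worker has finished.

```python
import os
import select
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_file(path):
    """Blocking disk read; runs in a worker thread."""
    with open(path, "rb") as f:
        return f.read()

def read_via_workers(paths, workers=2):
    """Hand blocking reads to a thread pool; wait for completions via select()."""
    wake_r, wake_w = os.pipe()
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(read_file, p): p for p in paths}
        for fut in futures:
            # Each finished read writes one byte to wake the main loop.
            fut.add_done_callback(lambda _f: os.write(wake_w, b"x"))
        pending = len(futures)
        while pending:
            # A real proxy would also pass its client sockets here.
            select.select([wake_r], [], [])
            pending -= len(os.read(wake_r, 64))
        for fut, p in futures.items():
            results[p] = fut.result()
    os.close(wake_r)
    os.close(wake_w)
    return results

# Usage: create three small "cache objects" and read them back in parallel.
with tempfile.TemporaryDirectory() as d:
    paths = []
    for i in range(3):
        p = os.path.join(d, f"obj{i}")
        with open(p, "wb") as f:
            f.write(b"object %d" % i)
        paths.append(p)
    results = read_via_workers(paths)
    contents = [results[p] for p in paths]
```

The main loop never blocks on the disk itself; it only waits on
descriptors. Whether the same trick pays off for the client-squid and
squid-upstream sockets is exactly the open question.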

> We are not looking for complete agreement on any of
> the ideas stated above. However, because there will be
> different views, what do people think of the notion of
> an enterprise edition of Squid, and a standard
> edition? Or is the overhead of two Squid versions
> (multiplied by the number of platforms) simply not
> worthwhile?

One of the (good) things about open-source development is that good ideas
eventually spread among all implementations. I'm all in favour of merging
what's good and GPLed of your work back into the main trunk.

> On a related note, is it appropriate to discuss
> proprietary vendor features on this discussion list?

Interesting question.
Would you define "proprietary"? I think that if you GPL them, or plan to
GPL them, or suggest GPLed improvements which in the meantime add hooks
for non-GPLed extensions, then that's fine by me.

-- 
	kinkie (kinkie-ml [at] libero [dot] it)
	Random fortune, unrelated to the message:
Dr. Jekyll had something to Hyde.
Received on Wed Oct 09 2002 - 05:16:39 MDT
