Thoughts for Squid-3

Licensing and Copyright

DW:

GPL or BSD?

I would kind of like to assign the copyright to FSF.

Language

DW:

I'd like to write C++. Is it portable and ubiquitous enough? Will it cause more support headaches from people who can't get it to compile?

Modularity

Henrik:

I think Squid should be divided into a number of "independent" processes:

The main problem with having multiple processes is how to make inter-process calls efficient, and to do this we probably have to make a large sacrifice in portability. Not all UNIX variants are capable of efficient inter-process communication at the level required, and most require some tuning. However, if layered properly we might be able to provide full portability at the cost of some performance on platforms with limited IPC capabilities.

I'd like to see the object database distributed to the disk processes, where each process maintains the database for the objects it holds, with only a rough estimate (something like a cache digest) collected centrally.
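
As an illustration only (not part of Henrik's proposal), the split between per-disk databases and a central rough estimate could be sketched as below. The sketch assumes a Bloom-filter style digest; the names Digest, DiskProcess and CentralIndex, the filter size, and the example URLs are all hypothetical, and the "processes" are plain objects so the example stays self-contained.

```python
import hashlib


class Digest:
    """Bloom-filter style summary of the keys one disk process holds."""

    def __init__(self, bits=1024, hashes=4):
        self.bits = bits
        self.hashes = hashes
        self.field = 0  # bit field kept in a single Python integer

    def _positions(self, key):
        # Derive `hashes` bit positions from salted SHA-256 digests of the key.
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.field |= 1 << pos

    def may_contain(self, key):
        # False means "definitely absent"; True means "probably present".
        return all((self.field >> pos) & 1 for pos in self._positions(key))


class DiskProcess:
    """Stand-in for one disk process; it owns the authoritative index."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def store(self, key, body):
        self.objects[key] = body

    def build_digest(self):
        digest = Digest()
        for key in self.objects:
            digest.add(key)
        return digest


class CentralIndex:
    """Holds only the digests, never the full per-disk databases."""

    def __init__(self):
        self.digests = {}

    def update(self, disk):
        self.digests[disk.name] = disk.build_digest()

    def candidates(self, key):
        # Disk processes worth asking; false positives are possible.
        return [name for name, d in self.digests.items() if d.may_contain(key)]


disk_a, disk_b = DiskProcess("disk-a"), DiskProcess("disk-b")
disk_a.store("http://example.com/", b"front page")
disk_b.store("http://example.org/logo.png", b"image bytes")

central = CentralIndex()
central.update(disk_a)
central.update(disk_b)
print(central.candidates("http://example.com/"))
```

The central index can only say which disk processes might hold an object. The disk process itself remains the authority, so a false positive costs one wasted query rather than a wrong answer.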

Any IPC should be carefully planned and performed at a macro level, with operations as large as feasible and with proper error recovery in case one of the components fails. If a networking process fails, only the requests currently processed by that process should be affected; similarly, if a disk process fails, only the requests currently served from that disk process should be affected.
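
A minimal sketch of the failure-isolation rule above, assuming some coordinator records which requests are in flight on which process. The Dispatcher class, the worker names and the request ids are hypothetical, and no real processes are started.

```python
class Dispatcher:
    """Tracks which requests are in flight on which worker process."""

    def __init__(self):
        self.in_flight = {}  # worker name -> set of request ids

    def assign(self, worker, request_id):
        self.in_flight.setdefault(worker, set()).add(request_id)

    def complete(self, worker, request_id):
        self.in_flight.get(worker, set()).discard(request_id)

    def worker_failed(self, worker):
        # Only requests owned by the failed worker are aborted;
        # every other worker's requests are left untouched.
        return sorted(self.in_flight.pop(worker, set()))

    def active(self):
        return {w: sorted(r) for w, r in self.in_flight.items()}


dispatcher = Dispatcher()
dispatcher.assign("net-1", 1)
dispatcher.assign("net-1", 2)
dispatcher.assign("net-2", 3)
dispatcher.assign("disk-1", 4)

aborted = dispatcher.worker_failed("net-1")
print("aborted:", aborted)
print("still active:", dispatcher.active())
```

Keeping the bookkeeping per process is what makes the recovery step cheap: a failure is handled by dropping one entry, without scanning or touching the requests of any other component.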

For DNS/proxy_auth/whatever else some limited distributed caching in the networking processes might be required to cut down on the number of IPC calls, but the bulk of these caches should be managed centrally.
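
The two-level arrangement could look roughly like the sketch below, assuming a small LRU cache in each networking process in front of the central cache. CentralCache, LocalCache, the capacity of 2 and the fake resolver are hypothetical; the ipc_calls counter stands in for real IPC round trips.

```python
from collections import OrderedDict


class CentralCache:
    """Stand-in for the centrally managed cache (e.g. DNS answers)."""

    def __init__(self, resolver):
        self.resolver = resolver
        self.entries = {}
        self.ipc_calls = 0  # counts simulated IPC round trips

    def lookup(self, key):
        self.ipc_calls += 1
        if key not in self.entries:
            self.entries[key] = self.resolver(key)
        return self.entries[key]


class LocalCache:
    """Small LRU cache held inside one networking process."""

    def __init__(self, central, capacity=2):
        self.central = central
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = self.central.lookup(key)  # local miss: one IPC call
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return value


central = CentralCache(resolver=lambda host: "addr-of-" + host)
local = LocalCache(central, capacity=2)
hosts = ["a.example", "a.example", "b.example",
         "a.example", "c.example", "b.example"]
for host in hosts:
    local.lookup(host)
print(central.ipc_calls, "IPC calls for", len(hosts), "lookups")
```

The local cache is deliberately tiny and holds nothing the central cache does not also hold, so losing a networking process loses no state that matters.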

This requires a number of major changes to the code design. For example, there will be no globally available StoreEntry structure to connect things together.

Dancer:

The modular, cooperative design allowed us to write different components in different languages (I had to draft a couple of Perl programmers to get enough manpower to rewrite the project in a week) and allowed us to test components separately. Now, after three months, we've yet to see any trouble with it. We had exactly two bugs, both found on the first day of testing.

Ah... except... the scheduler (at least under Linux 2.0, and probably under 2.2) can display asymptotic, sinusoidal behaviour when some of your components are entirely CPU bound but other components rely on them. It's an important thing to watch out for. I can explain more about this gotcha if anyone's interested, but it should be obvious.

Features

What do we want to support that is hard to do with the current code?