Re: STL revisited

From: Robert Collins <robertc@dont-contact.us>
Date: Mon, 04 Sep 2006 09:21:15 +1000

On Mon, 2006-09-04 at 00:11 +0800, Adrian Chadd wrote:
> On Sun, Sep 03, 2006, Robert Collins wrote:
> > I'd like to raise the STL again as an option for squid3 rather than
> > NIH'ing a whole bunch of core infrastructure.
>
> Its cool. The STL looks nice. The really trippy thing will be:
>
> * will it affect performance?

Of course. The STL generic algorithms and containers offer well studied
performance profiles - worst case scenarios, expected average
performance - and we should be choosing what containers to use, where,
with care. (If a hash capable of storing 10Million objects is what we
need, the STL map -may- be suitable, or it may not. If its not suitable,
we shouldn't use it for that specific part of the code base). The most
important thing for me though, is that these containers are being
maintained, and improved, by other people, so we can focus on writing
code that uses them, rather than having to improve our fundamental
datatypes.

So it can either give us huge wins, or huge losses, depending on whether
we consider each datatype we choose to convert, profile our use of it,
and then do an API specific benchmark of before and after... or whether
we charge in and go STL good, NIH bad, and randomly choose containers.
[meep].

> * how much will it cut the current codebase down by?

Well for starters, our basic data types such as:
 - dlink list ('list')
 - hash ('tr1/unordered_set')
 - bitmap ('bitmap')
 - Vector ('vector')

(Some of these come from
http://en.wikipedia.org/wiki/Technical_Report_1).

can be removed from the squid codebase.

So can some of our more recent utilities such as reference counting -
('tr1/memory' - the shared_ptr type) [documentation
http://www.boost.org/libs/smart_ptr/shared_ptr.htm] (A more precise
match to what we have today is
http://www.boost.org/libs/smart_ptr/intrusive_ptr.html).

And some older ones such as cbdata -
('tr1/memory' - the weak_ptr type) [documentation
http://www.boost.org/libs/smart_ptr/weak_ptr.htm]

(Boost is a extension library to the STL of extremely high quality and
portability. I consider it optional but recommended - you cant use it
without the STL, but you can use the Stl without it.)

Additionally, there is a effort to make policy based containers for the
STL - http://people.redhat.com/~bkoz/associative_containers/ is the best
reference I have to it at the moment - which include an implementation
of splay trees, and appears to have considerable thought going into it
for both performance and cleanliness. This could well subsume our splay
implementation if we wanted it too - or we can stay with our current
one, but make it more useful - for instance, we can make an associative
version of it by storing pairs, which could be extremely useful.

Part of the benefits we should expect are somewhat simpler things though
- building our own typedefs and extensions on the STL base.

Lastly, we have a lot of glue code, that is essentially getting
functions + pointers from objects and methods and vice verca. Utilities
like bind and function are good ways to reduce the impedance mismatch
there.

> * if they're at odds; which is more important?

Performance :).

> And:
>
> * can it wait for after -3? :)
>
> My suggestion is to wait until -3 is released, then create an "STL" SF
> branch and nut out the code to incorporate STL. We can then compare
> codebase clarity, performance and portability.

Right now, 3.0 is the top priority. However, I think the big step is
saying 'using the STL is allowed' - we should always be considering any
individual change with rigour - and doing a single large branch is going
to be much harder than just having a series of branches which say 'here
is an improvement X, and it happens to use this STL feature.'

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

Received on Sun Sep 03 2006 - 17:21:29 MDT

This archive was generated by hypermail pre-2.1.9 : Sun Oct 01 2006 - 12:00:06 MDT