Re: BIN_TREE vs SPLAY_TREE vs linked list

From: Arjan de Vet <Arjan.deVet@dont-contact.us>
Date: Mon, 30 Mar 1998 22:43:42 +0200 (CEST)

--MimeMultipartBoundary
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Duane Wessels:

>To see which one performed the best I ran a test with IP access
>controls. I used an access list of 388 hosts/networks and
>an input list of 4,146,495 IP addresses from 1 day's worth
>of the NLANR cache logs. Here are my results:
>
> TOTAL CPU TIME # OF COMPARISONS PER CHECK
> (seconds) MEDIAN MEAN
>-------------- -------------- -------- ------
> SPLAY 192.09 7 6.77
> BTREE 194.69 8 7.47
>LINKED LIST 200.21 12 16.9
>
>
>In terms of CPU time, the results are pretty close. The linked-list
>case took less than 5% more CPU time than the SPLAY case.

I did some similar experiments with your code but with much a larger access
list.

Access list of 3552 entries (a real one from one of my proxies). With an
input list of 1584386 entries I got the following results:

btree 59.428u 0.211s
splay 55.540u 0.366s
ll 179.970u 0.281s

During this test I noticed something strange: the btree experiment gave
different results. I tracked this down to a 'problem' in the access list:
some IP numbers occured twice and some IP numbers were also present in the
list because of network slices (x.x.x.x/y). The reason for this is that the
list is generated from different sources.

When thinking about it I understand why this goes wrong with btrees but I do
not understand yet why splay trees do not have the same problem with this
list :-(.

After removing the network slices the btree algorithm gave the same results
as splay or ll with the following times:

btree 59.130u 0.328s
splay 55.586u 0.335s
ll 224.521u 0.336s

>My first preference would be to keep the linked-list approach because
>it is much simpler to implement and maintain, but I am also open to
>keeping SPLAY instead.

I would vote for SPLAY, it's guaranteed fast (whereas linked lists use an
LRU heuristic which works most of the time for some ACLs but is not
guaranteed to be optimal). My BTREE approach seems to have problems with IP
numbers and network slices containing these IP numbers in the same list but
I don't have the time now to investigate this further.

I haven't done any experiments with 'dstdomain blockinglist' types of ACLs;
I hope to do them later this week.

Arjan

-- 
Arjan de Vet, Eindhoven, The Netherlands            <Arjan.deVet@adv.IAEhv.nl>
URL: http://www.IAEhv.nl/users/devet/       for PGP key: finger devet@IAEhv.nl
--MimeMultipartBoundary--
Received on Tue Jul 29 2003 - 13:15:47 MDT

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:11:44 MST