Bugfix for splay trees

From: Arjan de Vet <Arjan.deVet@dont-contact.us>
Date: Wed, 12 Feb 1997 01:49:36 +0100 (MET)

In article <9702101145.ZM13917@fatboy.geog.unsw.edu.au> Ed Knowles wrote:

>> >It has been #ifdef out as there appeares to be an, as yet, unsolved bug :(
>>
>> What kind of bug? I've been working on similar things and putting Squid's
>> IP lists into binary trees is not trivial I think (see URL above).
>
>I'm not sure :((
>
>The code worked on the small lists that I made up for testing, but Duane said
>he run into some problems when he put it on the NLANR caches. I am unable to
>test acl checks from many clients as not too many clients use my proxy.

I think I've found the bug. The bug is even architecture (little/big
endian) dependent :-(, on my Intel FreeBSD (little endian) machine I could
trigger the bug rather fast. Here's a fix:

--- acl.c.orig Thu Feb 6 20:59:53 1997
+++ acl.c Wed Feb 12 01:27:10 1997
@@ -370,16 +370,16 @@
     int rc = 0;
     addr.s_addr &= data->mask.s_addr; /* apply netmask */
     if (data->addr2.s_addr == 0) { /* single address check */
- if (addr.s_addr > data->addr1.s_addr)
+ if (ntohl(addr.s_addr) > ntohl(data->addr1.s_addr))
             rc = 1;
- else if (addr.s_addr < data->addr1.s_addr)
+ else if (ntohl(addr.s_addr) < ntohl(data->addr1.s_addr))
             rc = -1;
         else
             rc = 0;
     } else { /* range address check */
- if (addr.s_addr > data->addr2.s_addr)
+ if (ntohl(addr.s_addr) > ntohl(data->addr2.s_addr))
             rc = 1;
- else if (addr.s_addr < data->addr1.s_addr)
+ else if (ntohl(addr.s_addr) < ntohl(data->addr1.s_addr))
             rc = -1;
         else
             rc = 0;

>If this is the form that the tree takes after self-adjusting, sobeit. The
>overall efficiency of the tree is still O(log n). Given that we assume access
>is not made evenly by all sites listed in the acl at one time, the tree will be
>reorganised to give the best lookup for those clients that are using the proxy
>at the time. Least checked clients will be at the end of the tree. Balance is
>not a concern.
>
>Even if the tree did adjust into a linked list, a search for the last
>element would cause the most adjustment of the tree. It would only be a
>linked list for one search.

I don't think the above is completely true. Now that I've studied the code
some more I think you can get really heavy self-adjusting when two sites,
A with the smallest IP number in the list and B with the largest number in
the list, are the two only users of the cache. Every switch in client
leads to a complete rotation of the tree. And if you start with a linear
list (worst case situation for a tree) it will remain a linear list:

         B A
        / \
       / -> \
      / \
     / \
    A B

I hope to produce a patch in the next few days which implements both splay
trees and balanced binary trees so people can start comparing them in
real-life situations and choose the best solution for their purposes.

Arjan
Received on Tue Feb 11 1997 - 17:08:32 MST

This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:34:25 MST