Re: [code] [for discussion] map-trie

From: Francesco <gkinkie_at_gmail.com>
Date: Thu, 12 Jun 2014 11:33:14 +0200

On 12 Jun 2014, at 11:28, Amos Jeffries <squid3_at_treenet.co.nz> wrote:

> On 12/06/2014 10:43 a.m., Kinkie wrote:
>> Hi,
>> I've done some benchmarking, here are the results so far:
>> The proposal I'm suggesting for dstdomain acl is at
>> lp:~kinkie/squid/flexitrie . It uses the level-compact trie approach
>> I've described in this thread (NOT a Patricia trie). As a comparison,
>> lp:~kinkie/squid/domaindata-benchmark implements the same benchmark
>> using the current splay-based implementation.
>>
>> I've implemented a quick-n-dirty benchmarking tool
>> (src/acl/testDomainDataPerf); it takes as input an acl definition -
>> one dstdomain per line, as if it was included in squid.conf, and a
>> hostname list file (one destination hostname per line).
>>
>> I've run both variants of the code against the same dataset: a 4k
>> entries domain list, containing both hostnames and domain names, and a
>> 18M entries list of destination hostnames, both matching and not
>> matching entries in the domain list (about 7.5M hits, 10.5M misses).
>>
>> Tested 10 times on a Core 2 PC with plenty of RAM - source datasets
>> are in the fs cache.
>> level-compact-trie: the mean time is 11 sec; all runs take between
>> 10.782 and 11.354 secs; 18 Mb of core used
>> full-trie: mean is 7.5 secs +- 0.2secs; 85 Mb of core.
>> splay-based: mean time is 16.3sec; all runs take between 16.193 and
>> 16.427 secs; 14 Mb of core
>>
>> I expect compact-trie to scale better as the number of entries in the
>> list grows and with the number of clients and requests per second;
>> furthermore using it removes 50-100 LOC, and makes code more readable.
>>
>> IMO it is the best compromise in terms of performance, resources
>> useage and expected scalability; before pursuing this further however
>> I'd like to have some feedback.
>
> Does making compact vs full trie configurable affect compexity much? I
> think we have sufficiently wide customer base that people will want to
> go further now that its known there is a much faster version.

It's currently one single typedef in libtrie. Could be made templatized with no effort.
However libtrie needs some work still: it can't be walked, it's not type-safe, it can't be used for algorithms.
I expect I could merge the work I've done on the ternary search trie (lp:~kinkie/squid/tst) and this to have a fully-functional trie implementation in a few days of work, exposing the same API as std::map, having level-compact-trie or full-trie implementation via a template parameter.

        Kinkie
Received on Thu Jun 12 2014 - 09:33:25 MDT

This archive was generated by hypermail 2.2.0 : Thu Jun 12 2014 - 12:00:13 MDT