Re: [code] [for discussion] map-trie

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Sat, 14 Jun 2014 10:34:25 -0600

On 06/14/2014 12:39 AM, Kinkie wrote:
> On Fri, Jun 13, 2014 at 4:50 PM, Alex Rousskov
> <rousskov_at_measurement-factory.com> wrote:
>> On 06/11/2014 04:43 PM, Kinkie wrote:
>>> level-compact-trie: the mean time is 11 sec; all runs take between
>>> 10.782 and 11.354 secs; 18 Mb of core used
>>
>>> full-trie: mean is 7.5 secs +- 0.2secs; 85 Mb of core.
>>
>>> splay-based: mean time is 16.3sec; all runs take between 16.193 and
>>> 16.427 secs; 14 Mb of core
>>
>> How about std::map? Have you considered using a standard class for this
>> purpose?
>
> std::map doesn't offer efficient prefix matching, but sure, I'll try
> to prepare a test run on the same data to establish a baseline.

Is explicit support to prefix matching actually required though? I am
thinking of using std::map::lower_bound or similar to find the vicinity
of a possible prefix match. For example, when the map has a b.c domain
stored, and you are searching for a.b.c, the lower_bound method returns
a "pointer" to the stored b.c node. After that, a simple prefix test
between the two strings can return the final match/mismatch answer.

Needless to say, all domain names ought to be stored/compared in the
reverse order of their labels. For example, a.b.c is stored internally
as c.b.a. It just feels more natural for me to render it as a.b.c in a
human-oriented text like this email.

And if we need to support prefixes that do not end at domain label
boundaries (i.e., *foo.bar should match zzzfoo.bar), then we just treat
each character as a "label".

HTH,

Alex.
Received on Sat Jun 14 2014 - 16:34:41 MDT

This archive was generated by hypermail 2.2.0 : Mon Jun 16 2014 - 12:00:11 MDT