Re: [RFC] Tokenizer API

From: Francesco Chemolli <gkinkie_at_gmail.com>
Date: Tue, 10 Dec 2013 06:51:24 +0100

> Hi,
> SBuf supplies a few find() variants that could help. They are not constant-time, but they rely on lower-level primitives and related optimizations. My suggestion is to have CharacterSet be an SBuf and rely on them, at least for now. In any case, having it be an SBuf promotes better interface decoupling and abstraction.

Oh, one more argument for having the low-level matching primitives in SBuf: a pet project of mine is to use some form of compact trie and/or FSM to do single-pass low-level string matching in SBuf, possibly by lifting code from GNU grep (which is very efficient but complex). Redoing find_first_of() and startsWith() here would duplicate code, undermine that possibility, and qualify as premature optimisation IMO :)
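
To make the layering concrete, here is a rough sketch of what I have in mind. It is not Squid code: std::string stands in for SBuf so that it compiles outside the tree, and CharacterSet, nextToken() and skipPrefix() are hypothetical names used only for illustration. The point is that the tokenizer does no character matching of its own, so any later speed-up of the buffer's find primitives reaches it for free.

```cpp
#include <cstddef>
#include <string>

// ASSUMPTION: std::string is a stand-in for SBuf; the real SBuf API may differ.
using Buf = std::string;

// Hypothetical CharacterSet: nothing more than a buffer listing its members.
struct CharacterSet {
    Buf chars;
};

// Hypothetical helper: skip leading delimiters, extract the next token and
// consume it from 'input'. Returns false when no token is left.
// All matching is delegated to the buffer's own find primitives.
bool nextToken(Buf &input, const CharacterSet &delims, Buf &token)
{
    const std::size_t start = input.find_first_not_of(delims.chars);
    if (start == Buf::npos) {
        input.clear(); // only delimiters (or nothing) left
        return false;
    }
    const std::size_t end = input.find_first_of(delims.chars, start);
    token = input.substr(start, end == Buf::npos ? Buf::npos : end - start);
    input.erase(0, end == Buf::npos ? input.size() : end);
    return true;
}

// Hypothetical helper in the spirit of startsWith(): consume 'prefix' if
// 'input' begins with it, again relying on the buffer's own comparison.
bool skipPrefix(Buf &input, const Buf &prefix)
{
    if (input.compare(0, prefix.size(), prefix) != 0)
        return false;
    input.erase(0, prefix.size());
    return true;
}
```

With this shape, calling nextToken() on "  GET /index.html" with a whitespace set yields "GET" and leaves " /index.html" in the input, and the tokenizer itself never loops over characters.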

        Kinkie
Received on Tue Dec 10 2013 - 05:51:34 MST
