Re: SBuf review

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Thu, 18 Sep 2008 19:39:53 -0600

On Thu, 2008-09-18 at 22:14 +0200, Kinkie wrote:

> > *B1* I still think that string-manipulation interfaces should be moved
> > into a String class.
>
> I could agree to that if I could see an equally efficient way to implement this.
> I've called the class SBuf because I see it as being a bit of a String
> and a bit of a Buf.

This is not an efficiency issue, IMO. The all-in-one and
separate-and-conquer designs, if done correctly, have similar overheads.

> > *B2* I still think that having NULL Strings is a bad idea.
>
> Lazy instantiation?

Lazy instantiation is an implementation detail. I am talking about the
API. You can instantiate lazily with or without nil strings.

> I can agree to a name change. But then it's also a pretty
> straightforward porting of a MemBuf construct with the same name.
> Reasoning goes, if it's in MemBuf there may be a valid reason for it.

This is not about a name or the old C code that we should not mimic. It
is about introducing an invalid and unusual state that coders will
forget about and will have to bother with. The code will be _better_ if
isNull is removed. It will be worse if it stays.

> > *B2* I am not sure placing your classes in include/ is better than
> > placing them in src/. We have discussed similar questions before. I do
> > not know whether there was consensus, but I hope we agreed that only
> > Squid-independent portability/wrapper code should live outside of src/.
> > I think your code will (or should) be Squid-dependent. It will make a
> > lot of things easier. It is difficult to develop outside of src/ and it
> > does not buy us anything in this case.
>
> I agree it should be Squid-dependent, as it will have to hook into
> MemPools, Debug, CacheMgr and various other places.
> I'm trying to follow the squid 3 coding guidelines as expressed in the
> wiki, but maybe I misunderstood them.
> I can move it to src/ no problem.

Which wiki page? I do not see anything related at
http://wiki.squid-cache.org/Squid3CodingGuidelines

> > *B2* class outOfBoundsException: public std::exception
> >
> > It is better to inherit all your customized exceptions from the existing
> > TextException. Otherwise, the catching code will become unnecessary long
> > and the exception class code will duplicate a lot of things.
> > TextException itself should be polished and split or renamed to
> > Exception, but that is a different topic. Most generic catching code
> > should catch TextExceptions (which includes their children).
>
> I've followed recommended c++ coding guidelines, but I have no issues
> in implementing this.

Please do. We may make std::exception a parent of Exception, but that is
a different (and minor) issue.

> > *B1* Exception specifications: type method(...) throw(...);
> >
> > Surprisingly (well, at least it was to me!), exception specifications
> > are a bad idea in virtually all cases except for empty specifications.
> > Empty specifications should also be avoided unless you Know What You Are
> > Doing and Absulutely Sure About That.
> >
> > Please remove all your empty throw() exception specifications.
> >
> > Please comment out (but do not remove) your non-empty exception
> > specifications. They are useful as documentation:
> > type method(...); // throw(...)
> > or perhaps
> > type method(...); // throws ...
> >
> > Thinking I am nuts? Here is one of the authoritative references on this
> > fascinating subject: http://www.gotw.ca/publications/mill22.htm
>
> Ok. Leaving them documented is the most important thing to me.
> I can also move them to doxygen comments so that they'll be more visible.

It is your call. Some of us find doxygen comments less readable :-).

> > *B2* Move helper class declarations outside of SBuf. Class is not a
> > namespace. Nested classes complicate the overall impression from the API
> > and tempt you to do things you really should not.
>
> If possible at all, I'd prefer not to.
> The reason why classes are placed like that is to enforce
> accessibility constraints without using friend classes as much as
> possible (see below)

I doubt you will need friend classes to implement a clean API in this
case. However, I am certain that a lot of the current mess will be
resolved on its own when you are forced to think about natural class
boundaries.

> > *B3* If a class does not have both dump() and print() methods, let's
> > call the only printing method print(). This will make it easier to
> > manipulate printable objects for debugging.
>
> The convention I followed is: dump() is for debugging, stats etc.
> print() is for operator <<.

There is no such convention because "for operator <<" does not define
method semantics. I am not going to argue about this B3 issue though. We
have a bigger fish to catch...

> > *B2* If SBufStore is refcounted, why is it not using the RefCountable
> > API we already have? If this is some kind of optimization, I would
> > prefer to see it done after the code is reviewed, merged, and debugged.
>
> This code is very low-level. and it's also very self-contained. I aim
> for maximum efficiency.

We have gone through "maximum efficiency" arguments a few times already,
have not we? You give me a choice between these two options:

a) New, complex, poorly designed, low-level code that is virtually
guaranteed to have hard-to-find bugs. Hours of debugging and fixing
low-level problems that break user code. An unsubstantiated promise of a
minor speedup.

b) Old code that is proven to work. Similar design overhead. Can be
replaced with another code (e.g., more efficient or thread-safe) without
rewriting user code.

Which option do you think I am likely to pick? For something as critical
as String? After we have already been through several rounds of bug
fixing?

> > *B2* Please make SBufStore data members private. They are too sensitive
> > to expose them like that. With the sensitive info open, we will not
> > catch all the abuses (or spend too much time explaining why a public
> > interface should not be used).
>
> The whole SBufStore class is a protected member of SBuf. Noone except
> for SBuf and its derived or friend classes can even see objects of
> that class, let alone manipulate them.

(a) SBuf code is already abusing its access rights, and I expect that
abuse to grow, especially if SBuf remains both a meaningful string and
an opaque buffer.

(b) The nested classes are not very likely to stay that way.

> In this case I've deliberately broken object layering to increase
> performance and code compactness significantly; otherwise SBuf would
> have to proxy most of its methods to the underlying SBufStore, which
> then would lack a significant understanding of what's going on. It
> would also make code much harder to follow and read as the logic would
> be split in two parts.

I am not going to discuss performance again because same arguments apply
here. I do not know what "code compactness" means. However, I am
positive that code with clean separation of responsibilities will be
easier to "follow and read".

You are correct that if one class proxies most of its methods to
another, one has to review the design: sometimes proxying is not bad,
sometimes it is a bug. I do not think a String will be a proxy in front
of a Buffer. I do not know whether a Buffer needs a Store, but if it
does, it should not be a proxy for it either.

> > *B2* SBufStore(const SBuf &S) makes implicit conversions possible. Do we
> > really need these? Doesn't creating "store" from a buffer that uses that
> > "store" make you feel uncomfortable?
>
> No, it doesn't. See my previous comments about who's the only ones who
> can do that.
> If that was a public API, then I'd agree with you 100%; but it's not.

See (a) and (b) above.

> > *B1* SBuf assignment does not handle "assignment to self".
>
> Right. Is it safe to implement that as a noop?

Yes, if that noop is implemented safely :-)

> > *B2* I would remove numerous comments about what gets changed and what
> > does not. Const declarations already say as much!
>
> I prefer being redundant than unclear, but it's fine.

What's so unclear about "const"?! And redundancy has already gotten you
into trouble with comments not matching the code...

> > *B2* slice() should be implemented by calling other, more primitive
> > chopping methods. Same for other methods that simply combine existing
> > simpler actions. Please do not force us to review the same code many
> > times.
>
> I'll check. In my opinion, where doing otherwise doesn't make things
> significantly more efficient, slice() should be the low-level
> primitive, and the other methods the shortcuts.

That's fine too, of course.

> Regarding String, I recall that the aim of this class is to replace
> MemBuf and char[] instances with something which increases efficiency
> while retaining all the advantages of automatic memory management.

String should be used whenever we manipulate meaningful character data.
Buffer should be used whenever we manipulate opaque blobs of memory.
Both should be efficient, of course. It is better not to define either
in terms of old, often broken, APIs.

HTH,

Alex.
Received on Fri Sep 19 2008 - 01:40:08 MDT

This archive was generated by hypermail 2.2.0 : Fri Sep 19 2008 - 12:00:04 MDT