Re: [MERGE] c++-refactor HttpHdrCc

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 27 Sep 2011 10:24:04 -0600

On 09/27/2011 12:23 AM, Kinkie wrote:

>> The "may not be null-terminated" part can be removed, but it would be
>> useful to preserve it as it will allow us to use this class for many
>> other things. Please note that if you do preserve it, you would need to
>> fix comparison functions and add a stream output operator (among other,
>> minor changes).
>
> Preserved.
> The current iteration of the class is intentionally write-only (can be
> only initialized and compared, but not read, and its data members are
> private.
> This is consistent with the function it needs to perform now.
> Should I lift this limitation?

I do not consider private members a limitation if we do not need access
to them. We can always make them public later if needed.

You should document those data members though.

> + bool operator< ( const StringArea &s) const { return theLen < s.theLen || strncmp(theStart,s.theStart,theLen) < 0; }

This is still broken, on two counts:

A) The strncmp call may segfault while accessing wrong memory if theLen
is greater than s.theLen.

B) You need to make sure that if s1 is less than s2, then s2 is greater
than or equal to s1. Consider this test case with the current code:

  "bb" < "aaa" (because of the length test)

but the reverse is also true:

  "aaa" < "bb" (because of the strncmp test)

Your searches will not work correctly given the above conflict.

I suspect you want to do this instead:

    return theLen < s.theLen || (theLen == s.theLen && strcmp(...) < 0);

I would also replace strncmp() with memcmp() so that the same class can
be used for any buffer area. Once the less operator is fixed, memcmp()
would produce the same result as strncmp() and might be a tad faster.

>>> +class strblob {
>>
>> The class name must be changed to follow Squid style and, hopefully have
>> more meaning. Consider "StringArea". You may want to move it to
>> src/base/StringArea.h.
>
> Done.

Please note that v5 of your patch is missing StringArea.h

>>> + bool operator< ( const strblob &s2) const { return strncmp(thePtr,s2.thePtr,theLen) < 0; }
>>
>> This will crash if s2.theLen is smaller than this->theLen. Needs more
>> len-specific checks. Note that for your current purposes, you can
>> declare shorter strings smaller even if they are not so in the dictionary.
>
> Done.
>
>> This is one reason why such comparison may need to be moved outside the
>> object. Another reason is that we will probably want case-insensitive
>> comparison in other contexts.
>
> Can't be done without making the pointed-to string readable. Will do
> if you confirm that you think that's the best course forward.

Your call. This can be done later if we need different comparison operators.

>>> + const strblob tmpstr(item,nlen);
>>> + const CcNameToIdMap_t::iterator i=CcNameToIdMap.find(tmpstr);
>>
>> Consider s/tmpstr/lookup/ or some such. You can avoid this named object
>> altogether if you just create strblob when calling find(). The iterator
>> should be constant, I think. Here is a sketch:
>>
>> const CcNameToIdMap_t::const_iterator match =
>> CcNameToIdMap.find(strblob(item, nlen));
>
> std::map.find takes a const reference argument. With an unnamed
> temporary it wouldn't compile.

Why not? It does compile fine for me, but perhaps your compiler is
better. What error do you get?

>>> + cc->setMaxAge(MAX_AGE_UNSET);
>>
>> So it is set or unset? If you do not want to add an explicit clear
>> MaxAge() method, how about s/MAX_AGE_UNSET/MAX_AGE_UNKNOWN/?
>
> Sure.

I still see UNSET in v5 of the patch.

>> Similar for S_MAXAGE_UNSET, MIN_FRESH_UNSET, MAX_STALE_ALWAYS, etc. (see
>> the blob below for the full list).
>>
>>
>>> + static const int32_t MAX_AGE_UNSET=-1; //max-age is unset
>>> + static const int32_t S_MAXAGE_UNSET=-1; //s-maxage is unset
>>> + static const int32_t MAX_STALE_UNSET=-1; //max-stale is unset
>>> + static const int32_t MAX_STALE_ALWAYS=-2; //max-stale is set to no value
>>> + static const int32_t STALE_IF_ERROR_UNSET=-1; //stale_if_error is unset
>>> + static const int32_t MIN_FRESH_UNSET=-1; //min_fresh is unset
>>
>> The old types were "int". You are making them more specific now with
>> "int32_t". I am not sure such a change is withing this patch scope, but
>> even if it is, perhaps you should add a HttpCcDirective typedef or
>> similar to avoid spreading int32_t throughout the code.
>
> I'm simply refactoring what's in there.

No, you are also changing their type from "int" to "int32_t", exposing
yourself to more criticism about the type choice, typedefs, etc.

> They used to be initialized to
> -1, as a duplicate validity check to mask (see below for more on
> mask). even worse, they are (in trunk) tested inconsistently:
> sometimes with EBIT_TEST on mask, sometimes with checks on >=0 on the
> value.
> trunk's HttpHdrCc::parse has the clearest picture on the current conventions:
> for bool (present/not present) header values, EBIT_SET(mask) is used
> for integer header values, EBIT_SET(mask) + assigning value is used.
> In case of parse error on the value, both are cleared
> for optional-integer header values, EBIT_SET(mask) + assigning value
> is used. The special "-1" value means both "not set" and "set but not
> specified", to tell the difference the mask is used.
>
> What I've done is to define explicit constants to tell the three cases
> apart without needing to look up the map, and used getters/setters for
> those headers in order to enfoce consistency. Other headers are still
> set by looking at mask (although I'm inclined to define a generic
> get/set combo for bool values)

It is a shame that most of the above knowledge is not present in your
changes. Neither the static constant names nor the getters/setters relay
the above conventions. This will most likely lead to more bugs where
somebody will copy a mask-testing code when they should have copied the
value-checking code or vice versa. More on that below.

>>> + if (cc && minFresh!=HttpHdrCc::MIN_FRESH_UNSET) {
>> ...
>>> + entry->mem_obj->getReply()->cache_control->getStaleIfError() != HttpHdrCc::STALE_IF_ERROR_UNSET &&
>> ...
>>> + if (cc->getMaxAge() != HttpHdrCc::MAX_AGE_UNSET) {
>>
>> These and other similar comparisons now look even worse than they were.
>> How about adding HttpHdrCc::hasMaxAge() and such?
>
> Could be done, maybe reintroducing the mask tests?
>
> API would then tentatively be
> class HttpHdrCc {
> //generic getters/setters, for bool header values
> set(http_hdr_enum);
> clear(http_hdr_enum);
> bool get ((http_hdr_enum);
>
> //specific getters/setters, for integer values. Some special values
> need to remain though.
> getMinFresh()
> setMinFresh()
> clearMinFresh()

Fine with me, although I would probably use explicit names for all
directives, add dedicated has*() tests, and drop "get" (and probably
"set") prefixes:

public:
    /* boolean (present, not present) directives */
    bool hasFoo() const { return hasDirective(ccFoo); }
    bool foo() const { return foo_; }
    void foo(bool beSet) { setDirective(ccFoo, foo_, beSet); }
    void clearFoo() { clearDirective(ccFoo, foo_); }

    /* integer-valued directives */
    bool hasBar() const { return hasDirective(ccBar); }
    int32_t bar() const { return bar_; }
    void bar(int32_t value) { setDirective(ccBar, bar_, beSet); }
    void clearBar() { clearDirective(ccBar, bar_); }

protected:
    /* these know that all directives are stored as integers */
    bool hasDirective(int id) const { return EBIT_TEST(mask, id); }
    void setDirective(int id, int32_t &member, int32_t value) { ... }
    void clearDirective(int id, int32_t &member) { ... }

This adds a few more lines, but the added lines are simple, there is no
code duplication, and the whole thing makes the caller's code easier to
read (and may even detect usage bugs based on your "current usage"
findings discussed above!).

>> (One more reason to provide more than 3 lines of context during review!)
>
> Unfortunately "bzr send" doesn't allow to specify it, that I know of :(
> I've checked web and documentation, but I can't find of any
> alternative way to sanely produce the patch.

You can use bzr diff to create a patch for review. Personally, I never
have the time to actually apply bundles anyway (which is often not
trivial as the trunk may have been changed, creating conflicts).

We should ask Robert why there is no way to specify diff options for the
in-bundle patch.

>>>>> + HttpHdrCc(int32_t max_age_=-1, int32_t s_maxage_=-1,
>>>>> + int32_t max_stale_=-1, int32_t min_fresh_=-1) :
>>>>
>>>> Do we really all five of these constructors, including the one that
>>>> allows for implicit conversion from int32_t to HttpHdrCc? I do not think
>>>> so. Please leave just what we need and use "explicit" to prevent
>>>> implicit conversions as needed.
>>>
>>> Overengineering. Changed to only the default contstructor (we need it
>>> to set various fields to "-1" by default).
>>> Ok for explicit.
>>
>> An explicit keyword does not make sense for a default constructor, does
>> it? Not sure why it compiles for you, but I would recommend removing
>> that keyword because your current code does not allow implicit type
>> conversions.
>
> Not allowing implicit conversions was the objective,

Implicit conversions are allowed by constructors with exactly one
parameter. You no longer have such a constructor, so "explicit" is
redundant at best.

Cheers,

Alex.
Received on Tue Sep 27 2011 - 16:25:01 MDT

This archive was generated by hypermail 2.2.0 : Wed Sep 28 2011 - 12:00:04 MDT