Re: [MERGE] c++-refactor HttpHdrCc

From: Kinkie <gkinkie_at_gmail.com>
Date: Wed, 14 Sep 2011 20:30:48 +0200

On Tue, Sep 13, 2011 at 6:11 PM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 09/12/2011 05:38 PM, Kinkie wrote:
>> Hi,
>>   the attached bundle refactors the HttpHdrCc into a c++ class,
>> attempting also to clean the code up. Main changes:
>> - 6 less global functions in protos.h
>> - 1 less struct in structs.h
>> - proper c++ construction/destruction semantics, standard MemPool integration
>
>> - name->id header parsing now done via std::map instead of iterating
>> over list: complexity should go from O(n) to O(log n) with STL-derived
>> libs.
>
> On the other hand, you now deallocate and allocate a new SquidString for
> every CC directive just to find that directive ID so the total cost may
> be not be lower. I think we can do much better, and even if we cannot or
> will not, I do not think we should commit the "optimization" part of
> this change until proper performance regression testing.

Point taken. I've implemented a small helper class (basically a const
char*+length) to support the lookup in a very cheap way. I'd like to
get your feedback on it before trying to extend its use (it's
currently an in-cc-file definition).

> And if you want to mix optimization with cleanup, here is one place
> where you can save quite a few cycles for many messages by not ignoring
> the return value of getList():
>
>>      getList(HDR_CACHE_CONTROL, &s);
>>
>> -    cc = httpHdrCcParseCreate(&s);
>> +    cc=new HttpHdrCc();
>> +    if (!cc->parseInit(s)) {
>> +        delete cc;
>> +        cc = NULL;
>> +    }

Done.

>> +/** Http Cache-control header representation
>> + *
>> + * Store, parse and output the Cache-control HTTP header.
>
> In comments, please capitalize HTTP Cache-Control the way RFC 2616 does.

Done.

>> + * Store, parse and output the Cache-control HTTP header.
>
> I do not think your HttpHdrCc class deals with the output. On the other
> hand, please consider converting httpHdrCcPackInto() and
> httpHdrCcUpdateStats() as well.

I've so far chosen not to do it as it doesn't look strictly necessary
and it increases coupling.

>> -HttpHeaderFieldInfo *CcFieldsInfo = NULL;
>> +/// Map an header name to its type, to expedite parsing
>> +typedef std::map<String,http_hdr_cc_type> HdrCcNameToIdMap_t;
>> +static HdrCcNameToIdMap_t HdrCcNameToIdMap;
>
> HdrCcNameToIdMap should be constant so that we do not accidentally
> modify it. You can make it a pointer and use similar-to-current
> initialization code or try to initialize statically.

Are you sure we run such a risk? I've now changed it so that it is a
static data member with const key. Do you think it could be enough?

>> +        HdrCcNameToIdMap_t::iterator i;
>
> The above variable can be marked as const.

Done.

>> +static HdrCcNameToIdMap_t HdrCcNameToIdMap;
>
> Consider dropping the "Hdr" prefix since you are inside a HdrCc file
> already. This will also make it consistent with CcAttrs name.

Done.

>> void
>> +HttpHdrCc::clear()
>> +{
>> +    mask=0;
>> +    max_age=-1;
>> +    mask=0;
>> +    max_age=-1;
>> +    s_maxage=-1;
>> +    max_stale=-1;
>> +    stale_if_error=0;
>> +    min_fresh=-1;
>> +    other.clean();
>> +}
>
> This duplicates the default constructor. Mask and max_age are cleared
> twice. Consider this implementation instead (which can be inlined):
>
>    *this = HttpHdrCc();

Done. Clever.

>> +    int32_t i;
>> +    /* build lookup and accounting structures */
>> +    for (i=0;i<CC_ENUM_END;i++) {
>
> The "i" variable should be local to the loop. Please consider adding
> spaces around operators.

Done.

>> +        assert(i==CcAttrs[i].id); /* verify assumption: the id is the key into the array */
>> +        HdrCcNameToIdMap[CcAttrs[i].name]=CcAttrs[i].id;
>
> Consider computing CcAttrs[i] once instead of three times and storing it
> as a local const reference.

Done.

>> +        assert(i==CcAttrs[i].id); /* verify assumption: the id is the key into the array */
>
> Where do we actually use that assumption?

HttpHeaderCcFields is not const. Its stat field member can be updated
(look for ++CcAttrs[type].stat.repCount). In other words, it doubles
as an array-type data-store, indexed on the header type numeric value.
This assertion validates that.

>> -        if (EBIT_TEST(cc->mask, type)) {
>> +        if (EBIT_TEST(mask, type)) {
>> -            EBIT_SET(cc->mask, type);
>> +            EBIT_SET(mask, type);
>> -            if (!p || !httpHeaderParseInt(p, &cc->max_age)) {
>> +            if (!p || !httpHeaderParseInt(p, &max_age)) {
>> -                cc->max_age = -1;
>> -                EBIT_CLR(cc->mask, type);
>> +                max_age = -1;
>> +                EBIT_CLR(mask, type);
>> -            if (!p || !httpHeaderParseInt(p, &cc->s_maxage)) {
>> +            if (!p || !httpHeaderParseInt(p, &s_maxage)) {
>
> All of the above (and more) are essentially non-changes. Please consider
> making "cc" equal to "this" to avoid them for the purpose of the review,
> so that we can pinpoint the real changes. That hack should be removed
> before the final commit, of course.

Ok.

>> + *  Created on: Sep 2, 2011
>> + *      Author: Francesco Chemolli
>
> I would recommend removing that because there are many authors for that
> code. You have my permission to remove my Author tag in src/HttpHdrCc.cc
> as well.

Done.

>> +class HttpHdrCc
>> +{
>> +
>> +public:
>> +    int32_t mask;
>> +    int32_t max_age;
>> +    int32_t s_maxage;
>> +    int32_t max_stale;
>> +    int32_t stale_if_error;
>> +    int32_t min_fresh;
>> +    String other;
>> +
>> +    HttpHdrCc(int32_t max_age_=-1, int32_t s_maxage_=-1,
>
> Please move public data member after public methods. It would be nice to
> document them too, especially those that do not correspond to CC
> directives. Also, check whether we really need all of the above,
> especially mask, as public fields (more on that later below).

Done. Documented "mask" and "other", the others should be straightforward.
public mask could be avoided if we do full getters/setters, but doing
that elegantly would be scope-balooning.

>> +    HttpHdrCc(int32_t max_age_=-1, int32_t s_maxage_=-1,
>> +            int32_t max_stale_=-1, int32_t min_fresh_=-1) :
>
> Do we really all five of these constructors, including the one that
> allows for implicit conversion from int32_t to HttpHdrCc? I do not think
> so. Please leave just what we need and use "explicit" to prevent
> implicit conversions as needed.

Overengineering. Changed to only the default contstructor (we need it
to set various fields to "-1" by default).
Ok for explicit.

> As far as I can see, we just need the default constructor, but my quick
> check may be missing some cases:
>
>> $ fgrep -RI httpHdrCcCreate src
>> src/HttpHdrCc.cc:httpHdrCcCreate(void)
>> src/HttpHdrCc.cc:    HttpHdrCc *cc = httpHdrCcCreate();
>> src/HttpHdrCc.cc:    dup = httpHdrCcCreate();
>> src/http.cc:            cc = httpHdrCcCreate();
>> src/protos.h:SQUIDCEXTERN HttpHdrCc *httpHdrCcCreate(void);
>> src/mime.cc:    reply->cache_control = httpHdrCcCreate();

Yes.

>> +    /** reset the structure to a clear state.
>> +     *
>> +     */
>> +    void clear();
>
> The "clear state" is an undefined concept. Consider a more specific
> description such as:
> /// reset to the after-default-construction state
>
> (and there is probably a more elegant way of saying that).

How about "reset all data members to default state"? Otherwise I'll
use your suggestion.

>> +     * \note: internal structures are not cleaned-up beforehand.
>> +     *        caller must explicitly clear() beforehand if he wants that
>> +     */
>> +    bool parseInit(const String &s);
>
> It is wrong that *Init() does not initialize. If this is not your fault
> and you do not want to fix it, then mark with XXX.

That's a direct derivative of the name it has now.
I'll rename it to parse.

>> +    /** set the max_age value
>> +     *
>> +     * \param max_age the new max age. Values <0 clear it.
>> +     */
>> +    void setMaxAge(int32_t max_age);
>> +
>> +    /** set the s-maxage
>> +     *
>> +     * \param s_maxage the new max age. Values <0 clear it.
>> +     */
>> +    void setSMaxAge(int32_t s_maxage);
>
>
> It feels wrong to have many public fields and [only] these two methods.
> We should probably hide all fields and provide the corresponding name()
> and name(int32_t) accessors and setters to those fields that need them.

Direct refactoring artifact.
However, setSMaxAge is not used anywhere, so I'll remove it. Are you
sure it's needed to set up a full getter/setter structure for what is
in essence a glorified struct? It really feels like over-engineering.

>> === modified file 'src/HttpHeader.h'
>> --- src/HttpHeader.h  2011-06-23 08:31:56 +0000
>> +++ src/HttpHeader.h  2011-09-12 22:54:38 +0000
>> @@ -37,6 +37,7 @@
>>  #include "HttpHeaderRange.h"
>>  /* HttpHeader holds a HttpHeaderMask */
>>  #include "HttpHeaderMask.h"
>> +#include "HttpHdrCc.h"
>>
>>
>
> Is this addition needed? Please use "class HttpHdrCc;" pre-declaration
> instead if possible.

Removed the include. Class forward declaration was already in place.
This unveiled a few other needed include-places which were indirectly
included.

Attached is iteration 2 of the patch, build- and run-tested (not ready
for merge, but for a second validation round).

Thanks

-- 
    /kinkie

Received on Wed Sep 14 2011 - 18:30:55 MDT

This archive was generated by hypermail 2.2.0 : Tue Sep 27 2011 - 12:00:03 MDT