Re: [MERGE] c++-refactor HttpHdrCc

From: Kinkie <gkinkie_at_gmail.com>
Date: Thu, 29 Sep 2011 08:53:12 +0200

On Thu, Sep 29, 2011 at 7:04 AM, Alex Rousskov
<rousskov_at_measurement-factory.com> wrote:
> On 09/28/2011 04:43 PM, Kinkie wrote:
>> On Wed, Sep 28, 2011 at 6:02 PM, Alex Rousskov wrote:
>>> On 09/28/2011 07:23 AM, Kinkie wrote:
>
>> For the next class, we may want to do more frequent and smaller
>> merges, even if it means that there will be an interim phase.
>
> Not sure I like the sound of "interim phase", but I am all for smaller
> patches as long as they are self-contained. This project, for example,
> has two nearly independent parts: performance optimization using
> std::map for lookups and general code polishing. And you could have
> split it into even smaller parts.

Yes.
For the next one I'll definitely try to split up more, e.g.
1. c++-ification
2. api changes
3. performance

keeping the first two separate is going to be not really easy.

>> What we _can_ do is make HttpHdrCc::parse explicitly require that all
>> parsed headers be >= 0 . (done that, btw)
>
> I thought httpHeaderParseInt() already guarantees non-negative values,
> but I now see that it does not. You did the right thing by adding more
> parsing checks in v7.

Thanks.

>> +    if (setting) {
>> +        if (new_value < 0)
>> +            debugs(65,3,HERE << "rejecting negative Cache-Control directive " << hdr
>> +                << " value " << new_value );
>> +    } else {
>> +        new_value=-1; //rely on the convention that "unknown" is -1
>> +    }
>> +
>> +    value=new_value;
>> +    setMask(hdr,setting);
>
> The "new_value < 0" check only affects debugging, but the debugging
> message implies that we are _rejecting_ the negative value. AFAICT, we
> still set the mask bit on, even if the value was negative. Should that
> code set "setting" to false in addition to calling debugs()?

In my opinion that's a syntax error on the client and we should ignore
that directive altogether, so the behaviour is buggy.
I've changed it by adding a return statement immediately after, thus
no setting is done.

>>>> +    // max-stale has a special value (MAX_STALE_ALWAYS) which correspond to having
>>>> +    // the directive without a numeric specification, and directs to consider the object
>>>> +    // as always-expired.
>>>
>>> Your always-expired definition is the opposite of what valueless
>>> max-stale means, which is "the client is willing to accept a stale
>>> response of any age."
>>
>> My fault, it's what I had (mis)understood from the code.
>
> That comment still needs fixing. I would do something like:
>
> // We use MAX_STALE_ANY value to mark valueless max-stale directive
> // which instructs us to treat the response of any age as fresh.
>
> And place that comment where MAX_STALE_ANY is declared, replacing the
> current non-explanation there.

Done (with a slight modification). Here's the code:

        ///used to mark a valueless Cache-Control: max-stale directive, which instructs
        /// us to treat responses of any age as fresh
        static const int32_t MAX_STALE_ANY=0x7fffffff;

>>> You may want to set MAX_STALE_ANY to the maximum valid value. This
>>> should not be necessary, but if there are bugs where we use max-stale
>>> and do not check for MAX_STALE_ANY it will help mask some of them.
>>
>> Actually, I believe it would be the perfect solution also from a logic
>> point of view.
>> (implemented). The only reason to keep it alive is this code (from refresh.cc):
>>
>>             if (cc->haveMaxStale() && staleness > -1) {
>>                 if (cc->maxStale()==HttpHdrCc::MAX_STALE_ANY) {
>>                     /* max-stale directive without a value */
>>                     debugs(22, 3, "refreshCheck: NO: max-stale wildcard");
>>                     return FRESH_REQUEST_MAX_STALE_ALL;
>>                 } else if (staleness < cc->maxStale()) {
>>                     debugs(22, 3, "refreshCheck: NO: staleness < max-stale");
>>                     return FRESH_REQUEST_MAX_STALE_VALUE;
>>                 }
>>             }
>>
>> If we have no real need to distinguish the FRESH_REQUEST_MAX_STALE_ALL
>> and FRESH_REQUEST_MAX_STALE_VALUE this piece of code could be
>> simplified.
>>
>> This change also can allow us to change the data type from int32_t to
>> uint32_t and, together with the getters/setters, get rid of all
>> UNKNOWN states (we now consitently use the bitmask to check for
>> set/unset).
>
> We would always need MAX_STALE_ANY or equivalent because we need to
> output max-stale and not max-stale=.... when we got a valueless
> max-stale. However, it is likely that some of the other code can indeed
> be simplified.

You're right.
Nothing else is using max-stale.

>>>> +            if (flag == CC_MAX_STALE && maxStale()!=MAX_STALE_ALWAYS)
>>>> +                packerPrintf(p, "=%d", (int) maxStale());
>>>
>>> Why do we never pack MAX_STALE_ALWAYS? Either pack it or add a comment
>>> explaining why it is parsed but never packed.
>>
>> We reproduce what has come in, I guess.
>
> I found the code that implicitly packs MAX_STALE_ANY! Please add the
> following comment near the "(flag == CC_MAX_STALE &&
> maxStale()!=MAX_STALE_ANY)" packing check:
>
> // for MAX_STALE_ANY, we already packed bare "max-stale" above

Ok. But that is the same behaviour as for all other directives with
values. The only difference is that "=seconds" is not printed in case
max-stale holds the special value MAX_STALE_ANY.
In fact, that code can be optimized by using a switch statement.
Now it looks like this:

    for (flag = CC_PUBLIC; flag < CC_ENUM_END; ++flag) {
        if (isSet(flag) && flag != CC_OTHER) {

            /* print option name for all options */
            packerPrintf(p, (pcount ? ", %s": "%s") , CcAttrs[flag].name);

            /* for all options having values, "=value" after the name */
            switch (flag) {
            case CC_MAX_AGE:
                packerPrintf(p, "=%d", (int) maxAge());
                break;
            case CC_S_MAXAGE:
                packerPrintf(p, "=%d", (int) sMaxAge());
                break;
            case CC_MAX_STALE:
                /* max-stale's value is optional.
                  If we didn't receive it, don't send it */
                if (maxStale()!=MAX_STALE_ANY)
                    packerPrintf(p, "=%d", (int) maxStale());
                break;
            case CC_MIN_FRESH:
                packerPrintf(p, "=%d", (int) minFresh());
                break;
            default:
                /* do nothing, directive was already printed */
            }

            ++pcount;
        }
    }

>>          case CC_MAX_STALE:
>> -
>> -            if (!p || !httpHeaderParseInt(p, &cc->max_stale)) {
>> +            if (!p || !httpHeaderParseInt(p, &max_stale) || max_stale < 0) {
>>                  debugs(65, 2, "cc: max-stale directive is valid without value");
>> -                cc->max_stale = -1;
>> +                maxStale(MAX_STALE_ANY);
>
> The above will treat max-stale=invalid as a valid valueless max-stale. I
> think it would be better to treat malformed max-stale as no max-stale at
> all. How about setting MAX_STALE_ANY _only_ when p is nil?

Fine by me, but from what I can tell I'm preserving the current behaviour.
Also, wouldn't that make the parser unnecessarily strict?

>> I'm testing the patch. However the longer we gon on (and the larger
>> the scope gets) the less I am confident that no bugs were introduced.
>
> Same here, but we "go on" not because I want to enlarge the scope, but
> because there are new bugs introduced. If there are no bugs, any
> polishing can usually be done in one or two iterations.
>
> I do not see any serious problems in v7 though (let's not count the
> wrong always-expired comment as a bug). I think it can be committed with
> the above polishing touches unless others with fresh eyes can find bugs.

If I get no comments by Monday, I'll merge then.

Thanks.

-- 
    /kinkie
Received on Thu Sep 29 2011 - 06:53:19 MDT

This archive was generated by hypermail 2.2.0 : Thu Sep 29 2011 - 12:00:03 MDT