Re: [RFC] Interim review: typedefs.h / protos.h splitting

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Wed, 19 Sep 2012 12:01:26 +1200

On 19/09/2012 6:37 a.m., Alex Rousskov wrote:
> On 09/18/2012 10:36 AM, Kinkie wrote:
>>>> - renames some types to CamelCase convention
>>>> - adds some (for now empty placeholders) code files for c++-ification
>>>> of classes (e.g. CacheMgrPasswd, CustomLog)
>>>> - implements the full set of getters/setters for class RequestFlags -
>>>> the method names are certainly improveable, advice is sought
>>> I recommend undoing changes that do nothing but wrap a boolean field in
>>> two or three methods. They just add more ink and the naming is not
>>> consistent across different fields (e.g., some fields are "set" others
>>> are "marked"; some have default set values, others do not).
>> The names reflect my attempt at highlighting that these flags are not
>> consistent: some seem to be markers that something happened. Some seem
>> to be requests to something at a later processing stage. The names are
>> my attempt at disambiguating those two cases.
> And some are both cases? I doubt we should try to convey so much
> information in a flag name because nobody will notice what we are trying
> to do but many will be confused by apparent inconsistencies.
>
>
>>> Cases are where some additional processing/logic is implemented (e.g.,
>>> nocache_hack) should be wrapped, but why complicate processing for the
>>> vast majority of fields?
>> One possible benefit I saw was to clarify when things are set and when
>> they are tested.
> An assignment versus an if statement clarifies that enough IMO. In fact,
> a method call versus a method call provides a far less obvious visual
> distinction.

+1.

I see no need for setters/getters for most of these which are actual flags.
There are some which need to combine several flags eg. intercepted() {
return isNat() || isTproxy() || sslBumped(); }

>
>>> There are three other issues related to the RequestFlags change:
>>>
>>>> RequestFlags():
>>>> + nocache(false), ims(false), auth_(false), cachable(false),
>>>> + hierarchical_(false), loopdetect(false), proxy_keepalive(false), proxying_(false),
>>>> + refresh_(false), redirected(false), need_validation(false),
>>>> + fail_on_validation_err(false), stale_if_hit(false), nocache_hack(false), accelerated_(false),
>>>> + ignore_cc(false), intercepted_(false), hostVerified_(false), spoof_client_ip(false),
>>>> + internal(false), internalclient(false), must_keepalive(false), connection_auth_wanted(false), connection_auth_disabled(false), connection_proxy_auth(false), pinned_(false),
>>>> + canRePin_(false), authSent_(false), noDirect_(false), chunkedReply_(false),
>>>> + streamError_(false), sslPeek_(false),
>>>> + doneFollowXForwardedFor(!FOLLOW_X_FORWARDED_FOR),
>>>> + sslBumped_(false), destinationIPLookedUp_(false), resetTCP_(false),
>>>> + isRanged_(false)
>>>> + {}
>>> For this discussion, let's assume that we initialize
>>> doneFollowXForwardedFor to false like other fields (and then set it in
>>> the constructor body to a proper value or make a smart getter).
>>>
>>> Can you check whether GCC can optimize the above into some form of
>>> memset(0). I suspect it cannot, especially since correctly setting a
>>> single bit probably requires fetching and preserving the value of nearby
>>> bits. If I am right, I wonder whether we should use memset(0)
>>> explicitly? It feels like we are adding a lot of operations to every
>>> request where one operation would work just fine (I know some compilers
>>> will even unwrap memset for small constant size values but I do not know
>>> whether GCC does that).
>> That's a possible optimization. I wonder however what happens if
>> nothing is specified in the constructor list.
> Nothing happens (except for default data member constructors, if any).
> The generated default constructor does nothing (but execute data member
> constructors, if any).

I think we should design some flag separation structs inside this class.
Similar to how SquidConfig class contains sub-struct for each protocol
with the protocol specific flags broken down. We can memset() these
sub-structs.

This comes to mind after the details being discussed above:
  struct _marks {...} done;
  struct _flags {...} flags;

>
>> Maybe the
>> initializations are silently done? Or a memset is done? I suspect it
>> is undefined in the standard.
> It is very well defined. In general, you do not pay for what you do not
> use in C++, in part to achieve partial backwards compatibility with C.
> There is no magic :-).
>

I thought it was guaranteed that a class was initialized to zero? and
that we only needed to provide explicit constructor details for members
which needed other setup?

>>> Since the old code did the same kind of bit-by-bit initialization, you
>>> can investigate this optimization later. However, I would still change
>>> doneFollowXForwardedFor initialization to follow the pattern (if nothing
>>> else, this not-false exception tempts others to add more initialization
>>> exceptions, complicating future optimizations).
>> I would like to explain a little bit more about this, as it is another
>> bit that was irking me.
>> Some of the processing code is full of
>> struct {
>> #if somefeature
>> int somevalue;
>> #endif
>> //...
>> };
>> int somefunction() {
>> #if somefeature
>> process(struct.somevalue);
>> #endif
>> }
>>
>> This XFF change is an attempt to spark discussion on a pattern for
>> getting rid of this spaghetti (in a similar way like what happened
>> with the initializers).
> Strictly speaking, #if-statements do not create spaghetti code, but I
> think I know what you mean. If we can remove many #if-statements by
> always providing a few methods, we should provide those methods (and
> make them no-op when needed using #if-statements). There are many gray
> areas where the decision is not obvious, unfortunately.
>
>
>> The proposal is to sacrifice a bit of
>> space-efficiency in the data structures and using #ifdefs that cause
>> NOPs in the functions, hopefully containing fetures enabling in the
>> called code, freeing the caller from having to care.
> Right. And whether the data member needs to be #ifdefed also depends on
> several factors. Complex or large members and members requiring complex
> or slow initialization should be #ifdefed (but we sometimes provide a
> whole no-op class to handle such cases better!).
>
>> Whether the effort is useful, I'd like us to spend some time discussing on.
> I doubt we will find a one-size-fits-all approach, unfortunately.
>

NP: we also need to work around uninitialized X and unused X compiler
warnings. Depending on the situation.

>
>>>> + void clearDoneFollowXFF() {
>>>> + /* do not allow clearing if FOLLOW_X_FORWARDED_FOR is unset */
>>>> + doneFollowXForwardedFor = false || !FOLLOW_X_FORWARDED_FOR;
>>>> + }
>>> Something went wrong here. The "false ||" part does not do anything. I
>>> failed to find the corresponding old code so I am not sure what this
>>> should be.
>> Old code didn't care. It's another step at what I was trying to
>> obtained. The invariant is: if FOLLOW_X_FORWARDED_FOR is unset,
>> doneFollowXForwardedFor should always return true.
> Whatever the explanation is, the code looks wrong because the "false ||"
> part does not do anything. It should be removed or replaced. Perhaps you
> meant something like
>
> doneFollowXForwardedFor = !FOLLOW_X_FORWARDED_FOR;
>
> ?

+1 on this.

Amos
Received on Wed Sep 19 2012 - 00:01:39 MDT

This archive was generated by hypermail 2.2.0 : Tue Sep 25 2012 - 12:00:10 MDT