Re: [PATCH] improve hack in src/base/TextException.h

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Tue, 22 Feb 2011 10:06:59 -0700

On 02/22/2011 08:36 AM, Tsantilas Christos wrote:
> On 02/22/2011 04:47 PM, Kinkie wrote:
>> because it involves tricks to embed the same code again and again in
>> different compilation units to perform the caching.

> Well, I do not have strong opinion on that, we can remove caching if
> we do not like. But this is a problem which it is solved and I doubt
> that it can cause more problems. If we believe that every time we are
> going to add a new module or feature we have to spend time for a new
> solution/hack to fix it, then we should remove it. This is just my
> opinion.

I may have been the one who suggested this optimization, but I am not
strongly attached to it, and I am not against removing it if it causes
more problems than it solves.

I do not think adding a few bytes to every .o file to store the computed
hash value is a problem, but others may find bigger flaws with the
approach, of course.

Perhaps others can vote and break the current tie? Current code caches
the source code file name hash to avoid recomputing it for every failed
Must(). Must()s do not fail often, but, unlike assert()s, they may fail
when there are I/O and compatibility failures. The cost of such caching
is a few extra bytes added to every .o file that uses TextException. The
cost of not caching is computing a simple hash for a 10-40 byte string
(depending on the source file and the compiler).

>> IMVHO it is not
>> even worth the time we have collectively spent in this discussion ;)

That is probably true, at least short-term. FWIW, I would not bother
keeping this pandora box open if it were not for the GCC-specific
__attribute__ that was added by those who opened it :-).

Thank you,

Alex.
P.S. Eventually, I think every source file may get a unique name/ID that
does not depend on the compiler, so this code will have to be changed
again, but we are not there yet.

>> On 2/22/11, Tsantilas Christos<chtsanti_at_users.sourceforge.net> wrote:
>>> On 02/22/2011 03:18 PM, Tsantilas Christos wrote:
>>>> On 02/22/2011 02:08 PM, Kinkie wrote:
>>>>> Hi Christos,
>>>>> My question is, why use the cache at all? It seems quite complex for
>>>>> a little gain..
>>>>
>>>> You mean the FileNameHashCached method?
>>>> It is used to compute the hash of the file name to be used inside the
>>>> "Must" function. We may have multiple Must calls inside a simple
>>>> function. We are calling many times the "Must" function for each HTTP
>>>> requests comes through squid.
>>>> If we do not cache the result we are going to compute this hash on
>>>> every
>>>> Must call. It is not normal.
>>>>
>>>> The FileNameHashCached is not complex, it is a very simple function.
>>>> And
>>>> the gain is huge not little.
>>>
>>> Well, with a second view, I saw that the FileNameHashCached called only
>>> if the argument/condition passed to Must is not true (on failures etc).
>>> So yes it is not important. We are expecting to have a small number of
>>> such cases.
>>> Anyway I do not think it is so complex this part of code (just an if),
>>> why should we compute this value again and again?
>>>
>>>>
>>>>
>>>>>
>>>>> kinkie
>>>>> roaming
>>>>>
>>>>> Il giorno 22/feb/2011 11.54, "Tsantilas Christos"
>>>>> <chtsanti_at_users.sourceforge.net<mailto:chtsanti_at_users.sourceforge.net>>
>>>>>
>>>>> ha scritto:
>>>>>> Hi all,
>>>>>>
>>>>>> On 02/21/2011 08:31 PM, Alex Rousskov wrote:
>>>>>>> On 02/21/2011 12:27 AM, Kinkie wrote:
>>>>>>>>> Sorry, I am not sure I understand the above. Should I reverse
>>>>> r11236 and
>>>>>>>>> commit a casting fix? Or do you want be to post a patch that some
>>>>> Hudson
>>>>>>>>> job will pick up and test somehow first?
>>>>>>>>
>>>>>>>> Why?
>>>>>>>> Judging from the comment in there it's just meant to trick the
>>>>>>>> compiler into not (mis)understanding that the FileNameHash
>>>>> function is
>>>>>>>> unused, because in the end it's invoked via trickery.
>>>>>>>
>>>>>>> FileNameHash is _sometimes_ unused. The function is used in
>>>>> compilation
>>>>>>> units where certain macros are used and only in those compilation
>>>>> units.
>>>>>>>
>>>>>>>
>>>>>>>> IMVHO there are two clean ways out of this:
>>>>>>>> - use a compiler-dependent thing such as __attribute__((unused)) to
>>>>>>>> politely inform the compiler that sometimes the programmer knows
>>>>>>>> more
>>>>>>>> about stuff than the compiler
>>>>>>>
>>>>>>> A compiler-specific attribute is not a "clean way", especially
>>>>>>> when it
>>>>>>> is introduced as a fix for a rather different problem and actually
>>>>> kind
>>>>>>> of lies to the compiler in some cases.
>>>>>>>
>>>>>>>
>>>>>>>> - just get rid of this whole FileNameHashCached thing as a
>>>>>>>> premature
>>>>>>>> optimization, and just recalculate the filename hash when we
>>>>>>>> throw a
>>>>>>>> TextException (which is hopefully not that often)
>>>>>>>
>>>>>>> Sure, let's revisit the necessity of this hash caching optimization.
>>>>>>>
>>>>>>> Christos, do you remember what were the key reasons for adding it
>>>>> in the
>>>>>>> first place?
>>>>>>
>>>>>>
>>>>>> OK, with a little research I found it :-)
>>>>>> This is a part of the log error details patch.
>>>>>>
>>>>>> The original comment was:
>>>>>>> ... For the files the Must is not used, I got a compiler error that
>>>>> " the FileNameHashCached defined but not used"
>>>>>>> To solve it I just define a foo_class....
>>>>>>
>>>>>>
>>>>>> I am not sure about the "__attribute__((unused))" definition but
>>>>>> if it
>>>>>> is only suppressing the warning, it is not bad idea...
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thank you,
>>>>>>>
>>>>>>> Alex.
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
Received on Tue Feb 22 2011 - 17:07:13 MST

This archive was generated by hypermail 2.2.0 : Wed Feb 23 2011 - 12:00:05 MST