Re: boolean bit fields

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Thu, 24 Jan 2013 22:43:40 +1300

On 24/01/2013 7:51 p.m., Alex Rousskov wrote:
> On 01/23/2013 07:05 PM, Amos Jeffries wrote:
>> On 24/01/2013 7:20 a.m., Kinkie wrote:
>>> the attached patch turns the unsigned int:1 flags in CachePeer to
>>> bools.
>
>> Please retain the :1 bitmasking. My microbench is showing a consistent
>> ~50ms speed gain on bitmasks over full bool, particularly when there are
>> multiple bools in the structure. We also get some useful object size gains.
> Hello,
>
> FYI: With g++ -O3, there is no measurable performance difference
> between bool and bool:1 in my primitive tests (sources attached). I do
> see that non-bool bit fields are consistently slower though ("foo:0"
> below means type "foo" without bit fields; bool tests are repeated to
> show result variance):

Excellent. Thanks for that. I did not go down to the ASM level for my
benchmarks. Just the 100 million loop iteration timing, run a few
dozen times to get an idea of the variance.
The binary was not built with any -O option, so whatever the G++ default
is (-O0, i.e. no optimization) was used.
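
For anyone wanting to repeat this kind of timing, here is a minimal sketch of such a loop. It is not the code either of us ran: the struct names, the exercise() and timeMs() helpers and the use of C++11 <chrono> are all assumptions made for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical flag holders; the names are illustrative, not from Squid.
struct FullBools   { bool a; bool b; bool c; };
struct BitBools    { bool a:1; bool b:1; bool c:1; };
struct BitUnsigned { unsigned int a:1; unsigned int b:1; unsigned int c:1; };

// Write and read three flags `iterations` times. The returned checksum
// depends on every flag so the loop is not trivially dead code.
template <typename Flags>
std::uint64_t exercise(std::uint64_t iterations)
{
    Flags f = {};
    std::uint64_t hits = 0;
    for (std::uint64_t i = 0; i < iterations; ++i) {
        f.a = (i & 1) != 0;   // alternates false/true
        f.b = !f.a;           // always the opposite of a
        f.c = (f.a != f.b);   // therefore always true
        hits += f.a + f.b + f.c;  // contributes exactly 2 per iteration
    }
    return hits;
}

// Time one run in milliseconds and hand back the checksum.
template <typename Flags>
double timeMs(std::uint64_t iterations, std::uint64_t &checksum)
{
    const auto start = std::chrono::steady_clock::now();
    checksum = exercise<Flags>(iterations);
    const auto stop = std::chrono::steady_clock::now();
    assert(checksum == 2 * iterations);  // sanity check on the loop itself
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling timeMs<FullBools>(), timeMs<BitBools>() and timeMs<BitUnsigned>() with 100000000 iterations, a few dozen times each, gives the kind of spread discussed here. With optimization enabled the compiler may simplify or fold the loop, so the numbers only mean something alongside a look at the generated assembly.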

<snip>
> To me, it looks like bit fields in general may hurt performance where
> memory composition is not important (as expected, I guess), and that
> some compilers remove any difference between full and bit boolean with
> -O3 (that surprised me).
>
> G++ assembly source comparison seems to confirm that -- boolean-based
> full and bit assembly sources are virtually identical with -O3 and newer
> g++ versions, while bit fields show a lot more assembly operations with
> -O0 (both diffs attached). Assembly is well beyond my expertise though.
At -O3 G++ is optimizing for speed at the expense of code size.
-O2 is probably a better comparison level and AFAIK the preferred level
for high-performance builds with small code size.

>
> Am I testing this wrong or is it a case of YMMV? If it is "YMMV", should
> we err on the side of simplicity and use simple bool where memory
> savings are not important or nonexistent?

I think it is YMMV with the run-time measurement. I had to run the tests
many times to get an average variance range on the speed, even at 100M
loops. On some runs the speed was 100ms out in the other direction, but
only some; most were 50ms in favour of bool:1. The results also differed
with flag position, and between a struct 1 byte long and a struct with
enough flags to need 2 bytes.
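
To make the 1-byte versus 2-byte distinction concrete, here is a small sketch. The struct names and flag counts are invented for illustration, and bit-field layout is implementation-defined, so the sizes are what GCC typically produces on x86-64 rather than anything the standard guarantees.

```cpp
#include <cstddef>

// Hypothetical flag sets, named only for this illustration.
struct EightBits {            // eight 1-bit flags: typically 1 byte
    bool f0:1, f1:1, f2:1, f3:1, f4:1, f5:1, f6:1, f7:1;
};

struct NineBits {             // a ninth flag spills over: typically 2 bytes
    bool f0:1, f1:1, f2:1, f3:1, f4:1, f5:1, f6:1, f7:1;
    bool f8:1;
};

struct NineBools {            // plain bools: one whole bool per flag
    bool f0, f1, f2, f3, f4, f5, f6, f7, f8;
};

// Sizes as seen by this compiler; compare them or print them as needed.
constexpr std::size_t eightBitsSize = sizeof(EightBits);
constexpr std::size_t nineBitsSize  = sizeof(NineBits);
constexpr std::size_t nineBoolsSize = sizeof(NineBools);
```

A flag in the second byte of NineBits needs a different load and mask than one in the first, which is one plausible reason timings would move with flag position and flag count.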

I did not have time to look at the ASM, thank you for the details there.
If -O2 shows the same level of cycles reduction I think I will change
tack: we should be letting the compiler handle the bitfields. BUT, we
should still take care to arrange the flags and members such that -O has
an easy job reducing them.
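
As a rough sketch of what "arrange the flags and members" can mean in practice, compare the two layouts below. Both structs and all member names are hypothetical (they are not CachePeer), and the exact sizes depend on the ABI.

```cpp
#include <cstddef>

// Flags interleaved with pointer members: each flag tends to occupy its
// own padded slot, because the following pointer must be aligned.
struct Scattered {
    bool flagA:1;
    void *ptr1;
    bool flagB:1;
    void *ptr2;
    bool flagC:1;
};

// Same members with the flags grouped at the end: the three bits can
// share a single byte, followed by one run of tail padding.
struct Grouped {
    void *ptr1;
    void *ptr2;
    bool flagA:1;
    bool flagB:1;
    bool flagC:1;
};

// Sizes as seen by this compiler.
constexpr std::size_t scatteredSize = sizeof(Scattered);
constexpr std::size_t groupedSize   = sizeof(Grouped);
```

With GCC on x86-64 the grouped layout is typically the smaller of the two. Grouping also keeps flags that are tested together in the same byte, which gives the optimizer a simpler job whether they end up as bool or bool:1.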

Amos
Received on Thu Jan 24 2013 - 09:43:51 MST
