Re: hi-res profiling

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Wed, 2 Oct 2002 11:54:19 +0300

On 2 Oct 2002, at 9:26, Henrik Nordstrom <hno@marasystems.com> wrote:

> > Can probes reenter?
>
> From reading the source it seems nesting is not supported very well
> within the same performance counter. If you do, the readings will be
> somewhat screwed up, usually giving more time spent in the counter
> than actual, counting the time from last start to each stop.

 You mean the opposite: it shows less time than actual. The time between
 successive starts is lost.

> Extending it to support nesting would not be too hard, but involves
> overhead so counters needing nesting should perhaps use other calls.
> Knowing which sections need nesting is not too hard. For such
> counters I propose measuring the time from first start to last stop.

 I don't find support for nesting useful. It would be hard to
 present such data, and I think it would add considerable overhead per
 probe run. The simplest fix is to add a check to Start: if the probe is
 already running, do Stop+Start.
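
 A minimal sketch of that Stop+Start check, under stated assumptions: the
 probe_t struct, probe_start/probe_stop and the fake_ticks clock are
 hypothetical names for illustration, not the actual lib/Profiler.c code.
 The stand-in clock replaces the cpu tick counter so the behaviour is
 deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe record; field names are illustrative only. */
typedef struct {
    uint64_t start;   /* tick value at the most recent Start */
    uint64_t total;   /* ticks accumulated over all completed runs */
    uint64_t count;   /* number of completed runs */
    int running;      /* non-zero between Start and Stop */
} probe_t;

/* Stand-in clock (assumption): a real profiler would read the
 * cpu tick counter here instead of a settable variable. */
static uint64_t fake_ticks = 0;
static uint64_t read_ticks(void) { return fake_ticks; }

static void probe_stop(probe_t *p)
{
    if (!p->running)
        return;                      /* unmatched Stop: nothing to record */
    assert(read_ticks() >= p->start);
    p->total += read_ticks() - p->start;
    p->count++;
    p->running = 0;
}

static void probe_start(probe_t *p)
{
    /* Re-entry: close the run in progress so the time since the
     * previous Start is recorded instead of being overwritten. */
    if (p->running)
        probe_stop(p);
    p->start = read_ticks();
    p->running = 1;
}
```

 With Start at tick 10, a second Start at tick 25 and Stop at tick 40,
 this records 30 ticks in two runs, where a Start that simply overwrites
 the start value would record only 15. A later unmatched Stop is ignored,
 so time after an inner Stop is still not attributed to the probe.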

> Some suggestions for optimizations:
>
> The timer name can be eliminated from the profiling calls and moved to
> xprof_InitLib. It is static for each timer so there is no need of
> sending and assigning it on each call.

 Yes, it can, but the benefit will be very small. That's because the cpu
 tick counter readout is a slow instruction, and during its execution the
 name handling runs concurrently in the cpu. We are talking about saving
 a few (2-6) cpu ticks.

> Maybe consider making the start/stop functions inline, at least up to
> the "nesting / concurrent timers" check. I assume there quite often
> will be two or more concurrent timers running, one large scope
> counter and several small scope counters.

 I tried this. I found that the benefit is either very small or negative.
 It has to do with CPU caches. Inlining makes the measured code larger,
 giving more work to the prefetch queues, i.e. slowing the measured code
 down. This is especially noticeable when many short sections of probed
 code are close together. Also, inlining makes every probe part of the
 code in RAM, extending its overall footprint. Basically, inlining makes
 the profiler more intrusive. Having it in library calls causes some
 overhead on each invocation, but if probes are used frequently, cpu
 caching is better utilised (a single piece of code to cache), and the
 overall result is probably even better. In addition, inlining makes it
 hard to estimate probe overhead. My decision to skip inlining was
 primarily based on the idea that if you can measure overhead, you can
 account for it, and I wanted to keep the additional code as small as
 possible to avoid slowing the actual code down.
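
 One possible way to "measure overhead and account for it", as a rough
 sketch only. The approach (take the smallest delta between back-to-back
 clock reads, then subtract it once per recorded run) is an assumption
 for illustration, not necessarily what the profiler does. All names are
 hypothetical, and fake_read is a deterministic stand-in for the cpu
 tick counter.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t (*tick_fn)(void);

/* Stand-in clock (assumption): every read advances time by 3 ticks,
 * mimicking a fixed cost per tick-counter readout. */
static uint64_t fake_now = 0;
static uint64_t fake_read(void)
{
    fake_now += 3;
    return fake_now;
}

/* Estimate the fixed cost of one start/stop pair as the smallest
 * delta seen between two back-to-back clock reads. */
static uint64_t estimate_overhead(tick_fn read_ticks, int trials)
{
    uint64_t best = UINT64_MAX;
    assert(trials > 0);
    for (int i = 0; i < trials; i++) {
        uint64_t t0 = read_ticks();
        uint64_t t1 = read_ticks();
        if (t1 - t0 < best)
            best = t1 - t0;
    }
    return best;
}

/* Subtract the estimated overhead once per recorded run,
 * clamping at zero so the result never underflows. */
static uint64_t corrected_total(uint64_t total, uint64_t count,
                                uint64_t overhead)
{
    uint64_t cost = count * overhead;
    return total > cost ? total - cost : 0;
}
```

 Because the probe body lives in one non-inlined function, a single
 estimate like this can reasonably be applied to every probe; with
 inlined probes the cost would vary from call site to call site.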

> Consider eliminating the initlib check on each call, and instead have
> it called early from main. The amount of code (including
> xprof_update) is fairly small so inlining should help I think.

 Yes, the initlib check should be possible to omit. Gotta do this. It's
 a historical relic. All the probe code is a few hundred bytes if
 inlined. If used a lot in some func, this adds a considerable amount.
 xprof_update is inlined within lib/Profiler.c.

> As the set of timers is static, make the array statically allocated.
> This saves most of the addressing complexity when updating the
> timers, allowing direct memory access instead of indirect via a
> pointer + offset, especially noticeable in case of inlined start/stop
> functions.

 Yes, agreed. This is also historical. The initial version wanted to make
 it possible for the user to insert probes anywhere in the code with
 free-text names and without a need to update .h files. Now that enums
 give a lot of performance benefit, this should be redone.
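
 A sketch of what an enum-indexed, statically allocated timer table could
 look like. The enum values, struct fields and timer_add are made-up
 names for illustration and do not reflect the real xprof headers. Names
 are assigned once in the initializer, so nothing is passed or stored
 per call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical probe ids; in a real tree these would sit in a shared .h. */
typedef enum {
    XPROF_EXAMPLE_PARSE,
    XPROF_EXAMPLE_IO,
    XPROF_EXAMPLE_LAST   /* number of probes, must stay last */
} xprof_example_id;

typedef struct {
    const char *name;    /* set once at initialization */
    uint64_t total;      /* accumulated ticks */
    uint64_t count;      /* number of recorded runs */
} xprof_example_timer;

/* Fixed-size table: the compiler knows its address, so an update is
 * a direct indexed access rather than pointer + offset. */
static xprof_example_timer timers[XPROF_EXAMPLE_LAST] = {
    [XPROF_EXAMPLE_PARSE] = { "parse", 0, 0 },
    [XPROF_EXAMPLE_IO]    = { "io",    0, 0 },
};

static void timer_add(xprof_example_id id, uint64_t ticks)
{
    assert(id < XPROF_EXAMPLE_LAST);
    timers[id].total += ticks;
    timers[id].count++;
}
```

 The price is the one mentioned above: adding a probe means touching the
 enum in a header instead of just writing a free-text name at the call
 site.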

 There is also mention of a compile-time define to enable/disable
 measuring of unaccounted probe time. Disabling it would notably
 reduce the probe code, and that code should also be enclosed in ifdefs.
 It's left there for now, because I've found that knowing the unaccounted
 time is very useful for detecting pieces of code that were considered
 not to be cpu hogs but unexpectedly became such.
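
 A rough sketch of guarding the unaccounted-time bookkeeping with a
 compile-time define. It assumes "unaccounted" means ticks that pass
 while no probe is running, counted from tick 0. The macro
 EXAMPLE_TRACK_UNACCOUNTED and the ex_start/ex_stop functions are
 hypothetical, and tick values are passed in explicitly to keep the
 example deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical switch; build with -DEXAMPLE_TRACK_UNACCOUNTED=0
 * to compile the extra bookkeeping out entirely. */
#ifndef EXAMPLE_TRACK_UNACCOUNTED
#define EXAMPLE_TRACK_UNACCOUNTED 1
#endif

static uint64_t probe_begin;    /* tick of the current Start */
static uint64_t probed_ticks;   /* ticks spent inside the probe */

#if EXAMPLE_TRACK_UNACCOUNTED
static uint64_t last_stop;          /* tick of the previous Stop */
static uint64_t unaccounted_ticks;  /* ticks spent outside any probe */
#endif

static void ex_start(uint64_t now)
{
#if EXAMPLE_TRACK_UNACCOUNTED
    assert(now >= last_stop);
    unaccounted_ticks += now - last_stop;   /* gap since previous Stop */
#endif
    probe_begin = now;
}

static void ex_stop(uint64_t now)
{
    assert(now >= probe_begin);
    probed_ticks += now - probe_begin;
#if EXAMPLE_TRACK_UNACCOUNTED
    last_stop = now;
#endif
}
```

 With the define set to 0, each Start and Stop shrinks to a single store
 or add, which is the code reduction being traded against the diagnostic
 value of the unaccounted figure.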

------------------------------------
 Andres Kroonmaa <andre@online.ee>
 CTO, Microlink Online
 Tel: 6501 731, Fax: 6501 725
 Pärnu mnt. 158, Tallinn,
 11317 Estonia
Received on Wed Oct 02 2002 - 03:03:33 MDT
