Profiler.h File Reference

Classes

struct  _xprof_stats_data
 
struct  _xprof_stats_node
 

Macros

#define XP_NOBEST   (hrtime_t)-1
 
#define PROF_start(probename)   xprof_start(XPROF_##probename, #probename)
 
#define PROF_stop(probename)   xprof_stop(XPROF_##probename, #probename)
 

Typedefs

typedef struct _xprof_stats_node xprof_stats_node
 
typedef struct _xprof_stats_data xprof_stats_data
 
typedef xprof_stats_node TimersArray [1]
 

Functions

void xprof_start (xprof_type type, const char *timer)
 
void xprof_stop (xprof_type type, const char *timer)
 
void xprof_event (void *data)
 

Variables

TimersArray * xprof_Timers
 

Macro Definition Documentation

#define PROF_start (   probename)    xprof_start(XPROF_##probename, #probename)
#define PROF_stop (   probename)    xprof_stop(XPROF_##probename, #probename)
#define XP_NOBEST   (hrtime_t)-1

Definition at line 29 of file Profiler.h.

Referenced by xprof_reset(), xprof_show_item(), and xprof_summary_item().

Typedef Documentation

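typedef xprof_stats_node TimersArray[1]

Definition at line 52 of file Profiler.h.

typedef struct _xprof_stats_data xprof_stats_data

Definition at line 33 of file Profiler.h.

typedef struct _xprof_stats_node xprof_stats_node

Definition at line 31 of file Profiler.h.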

Function Documentation

Variable Documentation

TimersArray* xprof_Timers

CPU Profiling implementation.

This library implements the Probes needed to gather stats. See src/ProfStats.c which implements historical recording and presentation in CacheMgr.cgi.
For timing we prefer on-CPU operations that read the CPU tick counter. On Intel this is "rdtsc", a 64-bit counter that virtually never wraps. On Alpha this is "rpcc", a 32-bit counter that wraps every few seconds. Currently, no handling of wrapping counters is implemented, and other CPUs are not covered. Potentially all modern CPUs have similar counters.
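As an illustration of the tick-counter approach and of the wrap problem noted above, here is a minimal sketch. It assumes GCC or Clang on a POSIX system; read_ticks() and tick_delta32() are hypothetical helper names and are not part of Profiler.h. The 32-bit delta uses unsigned modular subtraction, which tolerates at most one wrap between start and stop.

```c
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* provides __rdtsc() on GCC and Clang */
#endif

/* Hypothetical helper: read a free-running tick value. */
static uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc();    /* 64-bit time-stamp counter */
#else
    /* Fallback for other CPUs: nanoseconds from a monotonic clock. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Hypothetical helper: elapsed ticks on a 32-bit counter.
 * Unsigned subtraction is modulo 2^32, so the result stays correct
 * as long as the counter wrapped no more than once in between. */
static uint32_t tick_delta32(uint32_t start, uint32_t stop)
{
    return stop - start;
}
```

With a counter that wraps every few seconds, this is only safe for probe sections much shorter than the wrap period.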

Usage. Insert the macros PROF_start(probename) and PROF_stop(probename) at strategic places in the code: PROF_start(probename); ... section of code measured ... PROF_stop(probename);

probename must be added to the xprof_type.h enum list with prepended "XPROF_" string.
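The following sketch shows how the two macros expand for one probe. The macro definitions are the ones listed on this page; everything else is a stand-in so the example compiles on its own: the enum replaces xprof_type.h, the probe name demo_parse is hypothetical, and the xprof_start()/xprof_stop() bodies here only count calls instead of timing anything.

```c
/* Stand-in for the enum in xprof_type.h; demo_parse is a hypothetical probe. */
typedef enum { XPROF_demo_parse, XPROF_DEMO_COUNT } xprof_type;

/* Stub bookkeeping so the expansion can be observed (not the real library). */
static int started[XPROF_DEMO_COUNT];
static int stopped[XPROF_DEMO_COUNT];
static const char *last_name;

void xprof_start(xprof_type type, const char *timer)
{
    started[type]++;
    last_name = timer;
}

void xprof_stop(xprof_type type, const char *timer)
{
    stopped[type]++;
    last_name = timer;
}

/* Macros as documented: ## builds the enum constant, # builds the name string. */
#define PROF_start(probename) xprof_start(XPROF_##probename, #probename)
#define PROF_stop(probename)  xprof_stop(XPROF_##probename, #probename)

/* Measured section: count comma-separated fields in a line. */
static int count_fields(const char *line)
{
    int fields = 1;

    PROF_start(demo_parse);   /* -> xprof_start(XPROF_demo_parse, "demo_parse") */
    for (; *line; ++line)
        if (*line == ',')
            ++fields;
    PROF_stop(demo_parse);    /* -> xprof_stop(XPROF_demo_parse, "demo_parse") */

    return fields;
}
```

Because the enum constant is built by token pasting, a probe name that is missing from the enum list fails at compile time rather than at run time.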

Description.

PROF gathers stats per probename into structures. It indexes these structures by enum type index in an array.
PROF records best, average and worst values for delta time. Also, if UNACCED is defined, it measures "empty" time during which no probes are in the measuring state, which makes it possible to see the time that is "unaccounted" for. If OVERHEAD is defined, additional calculations are made at every probe to measure the approximate overhead of the probe code itself.
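The best/average/worst bookkeeping described above can be sketched as follows. This is an assumption-laden illustration, not the layout of xprof_stats_data: the type hrtime_demo_t, the struct probe_stats_demo and the three functions are hypothetical names. The sentinel mirrors the idea behind XP_NOBEST, where (hrtime_t)-1 marks "no best value recorded yet".

```c
#include <stdint.h>

typedef int64_t hrtime_demo_t;               /* stand-in for hrtime_t */
#define NOBEST_DEMO ((hrtime_demo_t)-1)      /* same idea as XP_NOBEST */

/* Hypothetical per-probe accumulator. */
typedef struct {
    hrtime_demo_t best;     /* shortest delta seen, or NOBEST_DEMO */
    hrtime_demo_t worst;    /* longest delta seen */
    hrtime_demo_t sum;      /* total of all deltas */
    unsigned long count;    /* number of completed start/stop pairs */
} probe_stats_demo;

static void stats_reset(probe_stats_demo *s)
{
    s->best = NOBEST_DEMO;
    s->worst = 0;
    s->sum = 0;
    s->count = 0;
}

/* Fold one measured delta (stop ticks minus start ticks) into the stats. */
static void stats_add(probe_stats_demo *s, hrtime_demo_t delta)
{
    if (s->best == NOBEST_DEMO || delta < s->best)
        s->best = delta;
    if (delta > s->worst)
        s->worst = delta;
    s->sum += delta;
    s->count++;
}

/* Average delta, or 0 when nothing has been recorded. */
static hrtime_demo_t stats_average(const probe_stats_demo *s)
{
    return s->count ? s->sum / (hrtime_demo_t)s->count : 0;
}
```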
Probe data is stored in a linked list, so the more probes you define, the more overhead is added to find the deepest nested probe. To reduce average overhead, the linked list is manipulated each time PROF_start is called, so that the probe just started is moved one position up in the list. This way frequently used probes move closer to the head of the list, reducing average overhead. Note that all overhead is on the order of one hundred CPU clock ticks, which is on the sub-microsecond scale. Yet, to optimise really fast and frequent sections of code, we want to reduce this overhead to the absolute minimum possible.
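Moving a node one position towards the head on each access can be sketched as below. This is a generic singly linked list written for illustration; probe_node_demo and move_up_one() are hypothetical names, and the real list in Profiler.cc may be organised differently.

```c
#include <stddef.h>

/* Hypothetical list node; id stands for the probe's enum value. */
typedef struct probe_node_demo {
    int id;
    struct probe_node_demo *next;
} probe_node_demo;

/* Swap the node carrying `id` with its predecessor and return the
 * (possibly new) head. A node already at the head, or an id that is
 * not in the list, leaves the list unchanged. */
static probe_node_demo *move_up_one(probe_node_demo *head, int id)
{
    probe_node_demo **link = &head;   /* link that points at the predecessor */
    probe_node_demo *prev, *cur;

    if (head == NULL || head->id == id)
        return head;

    /* Advance until the predecessor's successor is the wanted node. */
    while ((*link)->next != NULL && (*link)->next->id != id)
        link = &(*link)->next;

    prev = *link;
    cur = prev->next;
    if (cur == NULL)
        return head;                  /* id not found */

    prev->next = cur->next;           /* unlink cur */
    cur->next = prev;                 /* cur now precedes prev */
    *link = cur;                      /* re-attach in prev's old place */
    return head;
}
```

Each call costs one list walk plus three pointer writes, and a probe that starts often reaches the head after a few calls.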
For actual measurements, probe overhead mostly cancels out. Still, do not take the measured times as facts; view them in relative comparison to overall CPU time, and only on the same platform.
Every second, an event within squid is called that parses the gathered statistics of every probe and accumulates them into historical structures for the last 1, 5 and 30 seconds, 1, 5 and 30 minutes, and 1, 5 and 24 hours. Each second the active probe stats are reset, and only historical data is presented in cachemgr output.
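One way to picture the once-per-second hand-off from live stats to history is the sketch below. It is not the method used in src/ProfStats.c: it swaps in a plain ring buffer of per-second totals, keeps only 60 seconds, and all names (history_demo, history_tick, history_sum) are hypothetical.

```c
#include <stdint.h>

#define HIST_SECONDS_DEMO 60   /* this sketch keeps one minute of history */

/* Hypothetical history for a single probe. */
typedef struct {
    uint64_t per_second[HIST_SECONDS_DEMO];  /* one total per elapsed second */
    unsigned next;                           /* slot for the next sample */
    unsigned filled;                         /* number of valid slots */
} history_demo;

/* Called once per second: store the live total, then reset it. */
static void history_tick(history_demo *h, uint64_t *live_total)
{
    h->per_second[h->next] = *live_total;
    h->next = (h->next + 1) % HIST_SECONDS_DEMO;
    if (h->filled < HIST_SECONDS_DEMO)
        h->filled++;
    *live_total = 0;   /* active stats start from zero each second */
}

/* Sum of the most recent `seconds` samples (clamped to what is stored). */
static uint64_t history_sum(const history_demo *h, unsigned seconds)
{
    uint64_t sum = 0;
    unsigned i;

    if (seconds > h->filled)
        seconds = h->filled;
    for (i = 1; i <= seconds; ++i)
        sum += h->per_second[(h->next + HIST_SECONDS_DEMO - i) % HIST_SECONDS_DEMO];
    return sum;
}
```

Calling history_sum(&h, 1), history_sum(&h, 5) and history_sum(&h, 30) would give the short windows; the minute and hour windows would need a longer buffer or coarser buckets.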

Reading stats.

"Worst case" may be misleading. Anything can happen in any section of code that delays reaching the probe stop. For example, the system may need to service an interrupt routine, a task switch could occur, or a page fault may need to be handled. In this sense, it is a fairly meaningless metric. "Best case" shows the fastest completion of the probe section, and is also somewhat useless unless you know that the amount of work is constant. The best metrics to watch are "average time" and the total cumulated time in a given timeframe, which show the percentage of time spent in a given section of code and its average completion time. This data can be used to detect bottlenecks within squid and optimise them.
TOTALS are quite far from reality. They are there just to summarise the cumulative times and the percent column. Percent values over 100% show that some probes have been nested into each other.
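A short sketch of why nested probes push the summed percent column past 100%. The function probe_percent() and the tick figures used in the note after it are hypothetical and only illustrate the arithmetic.

```c
/* Share of a time window spent inside one probe, in percent.
 * Returns 0 for an empty window to avoid dividing by zero. */
static double probe_percent(double probe_ticks, double window_ticks)
{
    return window_ticks > 0.0 ? 100.0 * probe_ticks / window_ticks : 0.0;
}
```

Suppose a window of 1000 ticks in which an outer probe ran for 800 ticks and an inner probe, nested inside it, ran for 500 of those ticks. The outer probe reports 80% and the inner one 50%, so the column adds up to 130% even though only 80% of the window was measured.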

Definition at line 108 of file Profiler.cc.

Referenced by xprof_average(), xprof_InitLib(), and xprof_stop().

 
