File Reference
#include "squid.h"
#include "profiler/Profiler.h"
#include <cassert>
Include dependency graph for

Go to the source code of this file.


struct  _callstack_entry


#define MAXSTACKDEPTH   512


static void xprof_update (xprof_stats_data *head)
static void xprof_InitLib (void)
void xprof_start (xprof_type type, const char *timer)
void xprof_stop (xprof_type type, const char *timer)


TimersArrayxprof_Timers = NULL
struct _callstack_entry cstack [MAXSTACKDEPTH]
int cstack_head = 0
static xprof_stats_dataxp_UNACCOUNTED
static int xprof_inited = 0

Macro Definition Documentation

#define MAXSTACKDEPTH   512

Definition at line 113 of file

Referenced by xprof_start().

Function Documentation

static void xprof_InitLib ( void  )

Definition at line 143 of file

References cstack_head, xprof_inited, XPROF_LAST, XPROF_PROF_UNACCOUNTED, and xprof_Timers.

Referenced by xprof_start().

static void xprof_update ( xprof_stats_data head)

Variable Documentation

Definition at line 121 of file

Referenced by xprof_start(), and xprof_stop().

int cstack_head = 0

Definition at line 122 of file

Referenced by xprof_InitLib(), xprof_start(), and xprof_stop().

xprof_stats_data* xp_UNACCOUNTED

Definition at line 139 of file

int xprof_inited = 0

Definition at line 140 of file

Referenced by xprof_InitLib(), and xprof_start().

TimersArray* xprof_Timers = NULL

CPU Profiling implementation.

This library implements the Probes needed to gather stats. See src/ProfStats.c which implements historical recording and presentation in CacheMgr.cgi.
For timing we prefer on-CPU ops that retrieve cpu ticks counter. For Intel, this is "rdtsc", which is 64-bit counter that virtually never wraps. For alpha, this is "rpcc" which is 32-bit counter and wraps every few seconds. Currently, no handling of wrapping counters is implemented. Other CPU's are also not covered. Potentially all modern CPU's has similar counters.

Usage. Insert macro PROF_state(probename) in strategic places in code. PROF_start(probename); ... section of code measured ... PROF_stop(probename);

probename must be added to the xprof_type.h enum list with prepended "XPROF_" string.


gathers stats per probename into structures. It indexes these structures by enum type index in an array.
records best, best, average and worst values for delta time, also, if UNACCED is defined, it measures "empty" time during which no probes are in measuring state. This allows to see time "unaccounted" for. If OVERHEAD is defined, additional calculations are made at every probe to measure approximate overhead of the probe code itself.
Probe data is stored in linked-list, so the more probes you define, the more overhead is added to find the deepest nested probe. To reduce average overhead, linked list is manipulated each time PR_start is called, so that probe just started is moved 1 position up in linkedlist. This way frequently used probes are moved closer to the head of list, reducing average overhead. Note that all overhead is on the scale of one hundred of CPU clock ticks, which on the scale of submicroseconds. Yet, to optimise really fast and frequent sections of code, we want to reduce this overhead to absolute minimum possible.
For actual measurements, probe overhead cancels out mostly. Still, do not take the measured times as facts, they should be viewed in relative comparison to overall CPU time and on the same platform.
Every 1 second, Event within squid is called that parses gathered statistics of every probe, and accumulates that into historical structures for last 1,5,30 secs, 1,5,30 mins, and 1,5 and 24 hours. Each second active probe stats are reset, and only historical data is presented in cachemgr output.

Reading stats.

"Worst case" may be misleading. Anything can happen at any section of code that could delay reaching to probe stop. For eg. system may need to service interrupt routine, task switch could occur, or page fault needs to be handled. In this sense, this is quite meaningless metric. "Best case" shows fastest completion of probe section, and is also somewhat useless, unless you know that amount of work is constant. Best metric to watch is "average time" and total cumulated time in given timeframe, which really show percentage of time spent in given section of code, and its average completion time. This data could be used to detect bottlenecks withing squid and optimise them.
TOTALS are quite off reality. Its there just to summarise cumulative times and percent column. Percent values over 100% shows that there have been some probes nested into each other.

Definition at line 108 of file

Referenced by xprof_average(), xprof_InitLib(), and xprof_stop().






Web Site Translations