And what makes it different from simply using gprof to get automatic
profiling of all functions?
/Henrik
Andres Kroonmaa wrote:
> 
>  Disregard the previous patch. I've managed to solve my problems
>  and updated patch is located at:
>     http://www.online.ee/~andre/squid/profil/
>  It integrates cleanly with 2.3S4-hno-CVS, adds configure option
>  --enable-cpu-profiling, and pretty much works ;)
> 
>  I've not heard any comments on it, no good, no bad. please.
> 
> On 22 Sep 2000, at 11:48, Andres Kroonmaa <andre@online.ee> wrote:
> 
> >
> >  I've put together some profiling tools into squid, and I'd
> >  like to through it out for your review.
> >  The patch is clumsy, like my coding skills, and I had some
> >  trouble with integrating it nicely.
> >  It compiled ok when I patched and recompiled in already compiled
> >  tree, but it seems to fail now due to different approach needed
> >  to make both src/ and lib/ happy.
> >  This is something I'm not good at.
> >
> >  Description.
> >  Patch adds new src file, profiling.c where profiling probes are
> >  implemented. include/profiling.h defines a macro, that can be
> >  inserted anywhere in the code, and be used to fire off and stop
> >  a measurement. Precision is ca 1 cpu clock tick. Works best for
> >  Intel (and only pentiums up) (this is where I'm on), I know it
> >  should work with small fixes on alpha, and for sparc some code
> >  needs to be added.
> >
> >  I've added probes in several places in the source, to have some
> >  results. These can be seen at:
> >   http://www.online.ee/~andre/squid/profil/
> >   http://www.online.ee/~andre/squid/profil/sample-longterm.html
> >  and patch itself:
> >   http://www.online.ee/~andre/squid/profil/prof.diff
> >
> >  Patch adds new menu to cachemgr, and keeps history stats for
> >  last 1,5,30 secs and 1,5,30 minutes.
> >
> >  probes are measuring delta ticks between invocations, and
> >  accumulate this time into a structure. Every 1 seconds, from
> >  Event management, summary is generated for each historical
> >  period. Time is measured between event runs, 1sec, and time
> >  accumulated by probes is compared to it, resulting percentage
> >  of CPU use. Best, average and worst cases are also noted. In
> >  general, I believe this allows us to have a look at what is
> >  eating CPU inside squid. CPU clock timer moves on globally
> >  for the whole system, so all non-squid time is unaccounted
> >  and may be attributed wrongly to some probes, upping worst
> >  case dramatically.
> >
> >  This patch seems stable. It has some functional problems: it
> >  cannot handle overlapping probes of same name (it would get timing
> >  wrong), it tries to measure "idle" time between different probes,
> >  (ie time when no probes in running state) and it doesn't workout
> >  well. Also, because several probes I've put into code do nest or
> >  overlap, totals are also off.
> >  Average added overhead seems to be about 650 ticks per probecall,
> >  which can be optimised somewhat, at a cost of smaller convenience.
> >
> >  One more thing I wanted to ask you gurus. I've implemented probe
> >  matching by way of string pointers. To start a probe, you give a
> >  const char string as a parameter to a probe, it records the pointer
> >  to it, and subsequently compares not string itself, but the pointers.
> >  This allows for fast check, and the string itself is used as a
> >  description. Seems also to work ok. I understand that it could
> >  brake if probe with the same name is started in one module, and
> >  stopped in another. Then const char strings would have different
> >  pointers? This could result in the probes with the same name to
> >  be seen as different. This should also be immediately seen in
> >  cachemgr output. Or is such approach acceptable, generally?
> >
> >  Please review the sample output, and the patch itself, and comment.
> >  Also, hints on how to integrate it properly, or (re)design hints
> >  are very welcome.
> >
> > PS. alpha timing macro is there, but I have no acces to alpha right
> >  now, so I cannot test. It would be helpful, if someone with skills
> >  to get it compiled could test it. Also, if other cpu-types can be
> >  added, would be very nice. All that is needed is a func or macro
> >  that is able to read out CPU onchip tick-counter, if supported.
> >  Actually, profiling code doesn't know or care in what units it
> >  is measuring. the only thing it assumes is that counter between
> >  sample is incrementing uniformly. It could be actually microsecs,
> >  nanosec, or whatever. For some CPU's that have 32bit counters
> >  (alpha, isn't it weird that 64bit cpu has 32bit counters?), it
> >  may wrap in few seconds. Currently, no code for handling that.
> >
> > PPS. preliminary results hints that squid spends considerable amount
> >  of time in preparing FD's for poll (setting up pfds.events), near
> >  10% of total CPU time under about 50 reqs/sec load.
> >  Also, ACL matching seems comsuming alot, also approaching 10%.
> >  The rest is open for debate.
> 
> ------------------------------------
>  Andres Kroonmaa <andre@online.ee>
>  Delfi Online
>  Tel: 6501 731, Fax: 6501 708
>  Pärnu mnt. 158, Tallinn,
>  11317 Estonia
Received on Fri Sep 29 2000 - 00:18:20 MDT
This archive was generated by hypermail pre-2.1.9 : Tue Dec 09 2003 - 16:12:38 MST