Re: hi-res profiling

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Wed, 2 Oct 2002 12:44:35 +0300

On 2 Oct 2002, at 17:41, Robert Collins <robertc@squid-cache.org> wrote:

> I came to the same conclusion from the source, and am glad to have it
> confirmed. For now, I agree with you on the timing of such recursion.
> I'm actually concerned about a much more complex case:
> something () {
>   call cb function.
> }
>
> callback() {
> }
>
> where the callback occurs for a different instance of something(). We
> need to be very sure to PROF_STOP (something) before calling out of it
> IMO.

 This depends. At the abstract level, while calling the cb we still have
 the fact that something() hasn't completed yet. So it's not right to stop
 its probe and restart it after cbfunc, or we'll have bad stats saying
 this func has run twice instead of once.
 Probably we'd want to split it into 2 probes then: one for the entry up
 to cbfunc, and another for the part that runs after returning from cbfunc:

 something () {
     PROF_Start(Part1)
     [...]
     PROF_Stop(Part1)
     call cb function
     PROF_Start(Part2)
     [...]
     PROF_Stop(Part2)
 }
 This way you'll have an idea of how many times the recursion happens and
 how much time it adds up to, you avoid nested probes, and in addition you
 can see the parts of something() executing separately. Comparing part2
 and part1 you can immediately decide whether probing part1 even makes
 sense (it doesn't if it always takes <0.1% of cpu).
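
 Roughly, that rule of thumb amounts to a check like the one below (the
 helper name and the exact 0.1% threshold are just for illustration, not
 anything in squid):

   #include <stdint.h>

   /* Keep a fine-grained probe only if its accumulated ticks reach at
    * least 0.1% of the reference total (e.g. all measured cpu ticks). */
   static int
   probe_worth_keeping(uint64_t part_ticks, uint64_t total_ticks)
   {
       return part_ticks * 1000 >= total_ticks;
   }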

 But what would be the goal of recursive probes? Describe it in terms of
 "what do we need to know?" We need to know 1) how long a single instance
 of the cb function runs. We count all invocations of the func, and each
 measured time, so eventually we have the accumulated time for the func
 and its best, avg and worst-case runs. If we have different runs
 depending on the data, then we probably want several probes reflecting
 that fact, like cb_to_read or cb_to_write.
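
 To make that concrete, per-probe accounting could look something like the
 sketch below (names are illustrative, not the actual squid profiler
 structures):

   #include <stdint.h>

   /* Illustrative per-probe record: count every invocation and keep the
    * accumulated, best and worst run times; the average is total/count. */
   typedef struct {
       const char *name;   /* e.g. "cb_to_read" or "cb_to_write" */
       uint64_t count;     /* number of measured runs */
       uint64_t total;     /* accumulated ticks over all runs */
       uint64_t best;      /* shortest single run */
       uint64_t worst;     /* longest single run */
   } probe_stats;

   static void
   probe_record(probe_stats *p, uint64_t ticks)
   {
       p->count++;
       p->total += ticks;
       if (p->count == 1 || ticks < p->best)
           p->best = ticks;
       if (p->count == 1 || ticks > p->worst)
           p->worst = ticks;
   }
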
 2) We want to know how much time it takes altogether per something()
 that might make many calls to cb funcs: enclose it between probes
 outside something(), at its call site. Calling it again from within a
 cb func seems to be asking for trouble. Do we really have such recursion?
 We might need to omit enclosing the invocation of something() with probes
 when it's called from cb funcs, or add probes named something_fromcb_this
 or something_fromcb_that. Or compare the runtime of something() with
 its subpart "part2".
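
 In code, call-site probing could look roughly like this, using the probe
 notation from above (the caller name and the something_direct probe name
 are just examples):

 plain_caller() {
     PROF_Start(something_direct)
     something()
     PROF_Stop(something_direct)
 }

 cb_func() {
     [...]
     PROF_Start(something_fromcb_this)
     something()
     PROF_Stop(something_fromcb_this)
     [...]
 }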

 Given we have the above two measured, it looks like we have all we need
 to optimise. We can't really clutter the cachemgr output for a loaded
 cache with too many probes. There we really want the important areas
 covered, like avg time to service an FD or time to service the disk, but
 not something like "this cb func, recursed 5 levels into itself with 4
 different params, takes 2379 cpu ticks of time". Developers watch how
 much time is wasted in some area of squid, and then add more precise
 probe points to that area to pinpoint the cpu hog. To find the real
 culprit we really don't want probes that handle recursion; we'd rather
 have such code split into nicely readable pieces and find which part is
 used too often, or takes too long to run. After it's optimised, we remove
 the precision probes, or we'll get lost ourselves in the amount of
 cachemgr output.

------------------------------------
 Andres Kroonmaa <andre@online.ee>
 CTO, Microlink Online
 Tel: 6501 731, Fax: 6501 725
 Pärnu mnt. 158, Tallinn,
 11317 Estonia