Re: hi-res profiling

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Wed, 2 Oct 2002 12:44:35 +0300

On 2 Oct 2002, at 17:41, Robert Collins <robertc@squid-cache.org> wrote:

> I came to the same conclusion from the source, and am glad to have it
> confirmed. For now, I agree with you on the timing of such recursion.
> I'm actually concerned about a much more complex case:
> something () {
>   call cb function.
> }
>
> callback() {
> }
>
> where the callback occurs for a different instance of something(). We
> need to be very sure to PROF_STOP (something) before calling out of it
> IMO.

 This depends. At the abstract level, while calling the cb we still have
 the fact that something() hasn't completed yet. So it's not right to stop
 its probe and restart it after cbfunc, or we'll have bad stats saying
 this func has run twice instead of once.
 Probably we'd want to split it into 2 probes then: one for the entry up
 to cbfunc, and another for the part that runs after returning from cbfunc:

 something () {
     PROF_Start(Part1)
     [...]
     PROF_Stop(Part1)
     call cb function
     PROF_Start(Part2)
     [...]
     PROF_Stop(Part2)
 }
 This way you'll have an idea of how many times the recursion happens and
 how much time it adds up to, you avoid nested probes, and in addition you
 can see the parts of something() executing separately. Comparing part2
 and part1 you can immediately decide whether probing part1 even makes
 sense (it doesn't if it always takes <0.1% of cpu).
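
 Roughly, that rule of thumb amounts to a check like the one below (the
 helper name and the exact 0.1% threshold are just for illustration, not
 anything in squid):

   #include <stdint.h>

   /* Keep a fine-grained probe only if its accumulated ticks reach at
    * least 0.1% of the reference total (e.g. all measured cpu ticks). */
   static int
   probe_worth_keeping(uint64_t part_ticks, uint64_t total_ticks)
   {
       return part_ticks * 1000 >= total_ticks;
   }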

 But what would be the goal of recursive probes? Describe it in terms of
 "what do we need to know?" We need to know 1) how long a single instance
 of the cb function runs. We count all invocations of the func, and each
 measured time, so eventually we have the accumulated time for the func
 and its best, avg and worst-case runs. If we have different runs
 depending on the data, then we probably want several probes reflecting
 that fact, like cb_to_read or cb_to_write.
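
 To make that concrete, per-probe accounting could look something like the
 sketch below (names are illustrative, not the actual squid profiler
 structures):

   #include <stdint.h>

   /* Illustrative per-probe record: count every invocation and keep the
    * accumulated, best and worst run times; the average is total/count. */
   typedef struct {
       const char *name;   /* e.g. "cb_to_read" or "cb_to_write" */
       uint64_t count;     /* number of measured runs */
       uint64_t total;     /* accumulated ticks over all runs */
       uint64_t best;      /* shortest single run */
       uint64_t worst;     /* longest single run */
   } probe_stats;

   static void
   probe_record(probe_stats *p, uint64_t ticks)
   {
       p->count++;
       p->total += ticks;
       if (p->count == 1 || ticks < p->best)
           p->best = ticks;
       if (p->count == 1 || ticks > p->worst)
           p->worst = ticks;
   }
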
 2) We want to know how much time it takes altogether per something()
 that might make many calls to cb funcs: enclose it between probes
 outside something(), at its call site. Calling it again from within a
 cb func seems to be asking for trouble. Do we really have such recursion?
 We might need to omit enclosing the invocation of something() with probes
 when it's called from cb funcs, or add probes named something_fromcb_this
 or something_fromcb_that. Or compare the runtime of something() with
 its subpart "part2".
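
 In code, call-site probing could look roughly like this, using the probe
 notation from above (the caller name and the something_direct probe name
 are just examples):

 plain_caller() {
     PROF_Start(something_direct)
     something()
     PROF_Stop(something_direct)
 }

 cb_func() {
     [...]
     PROF_Start(something_fromcb_this)
     something()
     PROF_Stop(something_fromcb_this)
     [...]
 }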

 Given we have the above two measured, it looks like we have all we need
 to optimise. We can't really clutter the cachemgr output for a loaded
 cache with too many probes. There we really want the important areas
 covered, like avg time to service an FD or time to service the disk, but
 not something like "this cb func, recursed 5 levels into itself with 4
 different params, takes 2379 cpu ticks of time". Developers watch how
 much time is wasted in some area of squid, and then add more precise
 probe points to that area to pinpoint the cpu hog. To find the real
 culprit we really don't want probes that handle recursion; we'd rather
 have such code split into nicely readable pieces and find which part is
 used too often, or takes too long to run. After it's optimised, we remove
 the precision probes, or we'll get lost ourselves in the amount of
 cachemgr output.

------------------------------------
 Andres Kroonmaa <andre@online.ee>
 CTO, Microlink Online
 Tel: 6501 731, Fax: 6501 725
 Pärnu mnt. 158, Tallinn,
 11317 Estonia