Re: more profiling from Adrian Chadd on 2006-09-19 (squid-dev)

From: Adrian Chadd <adrian@dont-contact.us>
Date: Tue, 19 Sep 2006 21:10:51 +0800

On Tue, Sep 19, 2006, Gonzalo Arana wrote:

> Is hires profiling *that* heavy? I've used it in my production squids
> (while I've used squid3) and the overhead was negligible.

It doesn't seem to be that heavy.

> There is a comment in profiling.h claiming that rdtsc (for x86 arch)
> stalls CPU pipes. That's not what Intel documentation says (page 213
> -numbered as 4-209- of the Intel Architecture Software Developer
> Manual, volume 2b, Instruction Reference N-Z).
>
> So, it should be harmless to profile as much code as possible, am I right?

That's what I'm thinking! Things like perfsuite seem to do a pretty good job
of it without requiring re-compilation as well.

> This could be automatically done by the compiler, if the profile probe
> was contained in an object. The object will get automatically
> destroyed (and therefore the profiling probe will stop) when the
> function exits.

Cute! It'd still be a good idea to explicitly mark the beginning/end where
appropriate. What might be nice is an "I was deallocated at the end of the
function rather than being deallocated explicitly" counter, so those cases
can be spotted?
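
A rough sketch of how such a probe object might look in C++, purely for
illustration. ScopedProbe and ProbeStats are made-up names, not anything
in the Squid tree, and std::chrono stands in here for whatever hires timer
the real profiling code uses. The probe can be stopped explicitly, and the
destructor counts the cases where it had to do the stopping itself:

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical per-probe statistics; not Squid's actual profiling structures.
struct ProbeStats {
    const char *name;
    std::uint64_t calls = 0;          // completed measurements
    std::uint64_t implicitStops = 0;  // stopped by the destructor, not by stop()
    std::chrono::nanoseconds total{0};

    explicit ProbeStats(const char *n) : name(n) {}
};

// Starts timing on construction. Stops either on an explicit stop() call
// or, failing that, when the object goes out of scope.
class ScopedProbe {
    using Clock = std::chrono::steady_clock;

public:
    explicit ScopedProbe(ProbeStats &s) : stats_(s), start_(Clock::now()) {}

    // A probe measures exactly one scope, so copying makes no sense.
    ScopedProbe(const ScopedProbe &) = delete;
    ScopedProbe &operator=(const ScopedProbe &) = delete;

    // Explicit end of the measured region.
    void stop() { finish(false); }

    // Implicit end: only records anything if stop() was never called.
    ~ScopedProbe() { finish(true); }

private:
    void finish(bool implicit) {
        if (!running_)
            return;  // already stopped explicitly
        running_ = false;
        stats_.total += std::chrono::duration_cast<std::chrono::nanoseconds>(
            Clock::now() - start_);
        ++stats_.calls;
        if (implicit)
            ++stats_.implicitStops;
    }

    ProbeStats &stats_;
    Clock::time_point start_;
    bool running_ = true;
};
```

A function would declare a ScopedProbe at the top and call stop() where the
measured region really ends. A non-zero implicitStops then points at code
paths (early returns, exceptions) that skipped the explicit stop.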

>
> We could build something like gprof call graph (with some
> limitations). Adding this shouldn't be *that* difficult, right?
>
> Is there interest in improving the profiling code this way? (i.e.:
> somewhat automated probe collection & adding call graph support).

It'd be a pretty interesting experiment. gprof seems good enough
to obtain call graph information (and call graph information only)
and I'd rather we put our efforts towards fixing what we can find
and porting over the remaining stuff from 2.6 into 3. We really
need to concentrate on fixing up -3 rather than adding shinier things.
Yet :)

I'm going to continue doing microbenchmarks to tax certain parts of
Squid (request parsing, reply parsing, connection creation/teardown,
storage memory management, small/large object proxying/caching,
probably should do some range request tests as well) to find the really
crinkly points and iron them out before the -3 release.

About the only really crinkly point I see atm is the zero-sized reply
stuff. I have a sneaking sense that the forwarder code is still slightly
broken.

Adrian

Received on Tue Sep 19 2006 - 07:10:47 MDT