Re: squid-3 update from Andres Kroonmaa on 2002-10-30 (squid-dev)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Wed, 30 Oct 2002 09:55:07 +0200

On 29 Oct 2002, at 20:11, Adrian Chadd <adrian@squid-cache.org> wrote:

> > > I have the call graphs available from gprof if anyone is interested.
> >
> > gprof? Isn't it showing you complete beans? That's the whole reason
> > I wrote the profiling branch.
>
> Complete beans? I'll take a look at the profiling branch, but the
> call graphs are the interesting bits I'm after.

Yes. gprof has the granularity of the system tick, 10 ms. It simply counts
the number of times each function is called and divides the 10 ms evenly
between them. So if you have 20k calls to some acl or mempool function that
take about 200 CPU ticks each, and then sleep in poll for 9.8 ms, gprof will
charge 99% of the time to the wrong code. In reality it may try to account
for poll somehow, but with a time granularity of 10 ms it can never say
exactly how long poll took. It has no way to be more precise. Besides,
it adds considerable overhead, since it tracks recursion to know the
parent/child relations. All it gives you is the relative call counts
and nice call graphs. The actual time spent anywhere is wrong.
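
To make the arithmetic above concrete, here is a minimal Python sketch of
the attribution model as described in this mail: one 10 ms tick split by
call count. It is not gprof's real implementation. The function names and
the 0.2 ms / 9.8 ms split of actual time are assumptions for illustration.

```python
TICK_MS = 10.0  # profiler time granularity described above


def split_tick_by_calls(call_counts, tick_ms=TICK_MS):
    """Charge one tick to each function in proportion to its call count."""
    total_calls = sum(call_counts.values())
    return {name: tick_ms * n / total_calls for name, n in call_counts.items()}


# Hypothetical workload within one tick: 20k cheap calls, then one long poll.
call_counts = {"acl_check": 20000, "poll": 1}
actual_ms = {"acl_check": 0.2, "poll": 9.8}  # assumed true split of the tick

charged_ms = split_tick_by_calls(call_counts)
for name in call_counts:
    print(f"{name:10s} charged {charged_ms[name]:7.4f} ms, "
          f"actual {actual_ms[name]:4.1f} ms")
```

Under this model nearly the whole tick is charged to the cheap calls, even
though poll is assumed to hold 98% of the real time.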

Btw, as I understand it, the profiling branch is merged into HEAD now.
Combining gprof and high-res profiling might be a good pick.
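
A rough sketch of what high-resolution probes look like, written in Python
with time.perf_counter_ns standing in for a CPU cycle counter. The names
`probe` and `stats` are hypothetical, and this is not the profiling
branch's actual code. The point is only that time is measured per section
instead of being inferred from a 10 ms tick.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# name -> [call count, total elapsed nanoseconds]
stats = defaultdict(lambda: [0, 0])


@contextmanager
def probe(name):
    """Time one section with a high-resolution clock and accumulate it."""
    start = time.perf_counter_ns()
    try:
        yield
    finally:
        entry = stats[name]
        entry[0] += 1
        entry[1] += time.perf_counter_ns() - start


# Many cheap calls, each timed individually.
for _ in range(1000):
    with probe("cheap_call"):
        sum(range(10))

# One long wait, standing in for a sleep in poll.
with probe("sleep_like_poll"):
    time.sleep(0.01)

for name, (calls, total_ns) in stats.items():
    print(f"{name}: {calls} calls, {total_ns / 1e6:.3f} ms total")
```

Call graphs from gprof could then be read alongside per-section times of
this kind, which is the combination suggested above.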

> > I'd speculate that above says: there are 2-3 fsb requests outstanding per
> > data_mem_refs (p0/p1), which is roughly 25% of cputime for each cpu, and
> > that there is roughly 16-20% cputime spent on all data references, both
> > cache hits and misses.
>
> Hm. I'd probably do it with a single CPU, it might make more sense.
> Squid shouldn't be using an even amount of time per CPU unless the pthreads
> in aufs are taking more CPU than I remember them.

Ah, I didn't mention that I also have 2 squids running. But a single
squid would be shuffled around between the CPUs, and the sum would be the
total for that single squid. Not much difference.
My concern is that, based on the counters, my FSB is utilised 60-150%,
which makes me go "hmm". 16-20% of CPU time is 90-140% of FSB time.
Separate sampling puts the CPU cache hit rate at about 20%; accounting
for that roughly yields 100% sustained FSB utilisation.
Not good at all, but not very surprising either.

> > There is also this funny counter: counter of ticks while cpu is NOT halted.
> > It's at about 50% on my box. So, it's either that we are waiting for io, or
> > cpu is stalled for whatever other reasons.
>
> Curious.

Actually, I recall that the CPU can be halted for only one reason:
no work to do. So it must be iowait or thread scheduling wait.

------------------------------------
Andres Kroonmaa <andre@online.ee>
CTO, Microlink Online
Tel: 6501 731, Fax: 6501 725
Pärnu mnt. 158, Tallinn,
11317 Estonia
Received on Wed Oct 30 2002 - 01:03:47 MST
