Re: 2nd-biggest Squid performance problem? from Andres Kroonmaa on 2001-09-28 (squid-dev)

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Fri, 28 Sep 2001 12:36:13 +0200

(We are talking about the cpuProfiling branch.)

On 27 Sep 2001, at 15:11, Jon Kay <jkay@pushcache.com> wrote:

> Now, I am trying to understand which probe name corresponds to what
> part of Squid. Is there a probe that roughly corresponds with read()
> calls, the rough correspondent of commHandleWrite?

No. I was trying to identify CPU hogs more generally. The closest to read()
is the comm_read_handler probe, but it can also include ACL checks,
DNS socket reads, etc. Mostly it is network reads, though.

> Good business looking at prep_pfds. Is that what it looks like, the
> loops in which the select() fdsets are set up?

Yes, though only poll() here. It is not purely fdset setup, either: it also
includes the overhead of the commDeferRead() checks, and of delay_pools if
compiled in (it isn't in my sample case). Still, it's all prep_pfds overhead.
The worst figure I've seen so far was ~30% CPU. I think that's about the
maximum possible once the CPU flattens out.

> Is handle_ready_fd the corresponding figuring out which fds are ready?

No. handle_ready_fd also includes calling the callbacks for ready FDs,
so it is the total time spent servicing all ready FDs after poll().
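
For readers who want to picture the two phases, here is a rough sketch of one
poll() round split the way the prep_pfds and handle_ready_fd probes split it.
It is not the Squid source: FdEntry, defer_read, comm_poll_once, count_bytes
and always_defer are hypothetical names made up for illustration, and only
poll(), read() and struct pollfd are real POSIX interfaces.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FDS 64

typedef void (*ReadHandler)(int fd, void *data);

/* Hypothetical per-fd record; Squid's real fd table looks different. */
typedef struct {
    int fd;
    ReadHandler read_handler;               /* NULL means no read interest */
    void *data;
    int (*defer_read)(int fd, void *data);  /* optional; nonzero = skip this round */
} FdEntry;

/* Phase 1: build the pollfd array. Every registered fd pays for the
 * deferral check here, whether or not it turns out to be ready. */
static nfds_t prep_pfds(const FdEntry *table, size_t n,
                        struct pollfd *pfds, size_t *index_of)
{
    nfds_t count = 0;
    for (size_t i = 0; i < n && count < MAX_FDS; i++) {
        if (table[i].read_handler == NULL)
            continue;
        if (table[i].defer_read && table[i].defer_read(table[i].fd, table[i].data))
            continue;
        pfds[count].fd = table[i].fd;
        pfds[count].events = POLLIN;
        pfds[count].revents = 0;
        index_of[count] = i;   /* remember which table entry this slot maps to */
        count++;
    }
    return count;
}

/* Phase 2: scan for ready fds AND run their callbacks, so the time spent
 * here is mostly callback time, not the scan itself. */
static int handle_ready_fds(const FdEntry *table, const struct pollfd *pfds,
                            const size_t *index_of, nfds_t count)
{
    int serviced = 0;
    for (nfds_t i = 0; i < count; i++) {
        if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            const FdEntry *e = &table[index_of[i]];
            e->read_handler(e->fd, e->data);
            serviced++;
        }
    }
    return serviced;
}

/* One round: prepare, poll, service. Returns the number of fds serviced
 * (0 on timeout or poll() error). */
static int comm_poll_once(const FdEntry *table, size_t n, int timeout_ms)
{
    struct pollfd pfds[MAX_FDS];
    size_t index_of[MAX_FDS];
    nfds_t count = prep_pfds(table, n, pfds, index_of);
    if (poll(pfds, count, timeout_ms) <= 0)
        return 0;
    return handle_ready_fds(table, pfds, index_of, count);
}

/* Example callback: drain what is available and count the bytes. */
static void count_bytes(int fd, void *data)
{
    char buf[256];
    ssize_t got = read(fd, buf, sizeof buf);
    if (got > 0)
        *(long *)data += (long)got;
}

/* Example deferral check that always asks to be skipped. */
static int always_defer(int fd, void *data)
{
    (void)fd;
    (void)data;
    return 1;
}
```

Timing prep_pfds() and handle_ready_fds() separately in a loop like this gives
figures of the same kind as the two probes, under the assumptions above.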

> What is UNACCOUNTED? Any idea where the extra 30% comes from (I know,
> it's hard to nail that down, I only got it to within 12% after a year
> of work, I'm mostly just randomly curious)?

UNACCOUNTED is time during which no probe is active. It's there just to
show that some overhead is not covered by probes. During startup, for
example, this time is pretty high, at 90%. I haven't bothered to probe it.

There are many overlapping probes. For example, the handle_ready_fd probe is
the sum of comm_write_handler and comm_read_handler. Also, commHandleWrite
is just a subset of comm_write_handler, which may be diskHandleWrite too.
This overlap pushes the totals over 100%.
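
To make the arithmetic concrete, here is a minimal sketch of nested probe
accounting. It is not the cpuProfiling code: the Profiler struct and the
probe_start(), probe_stop(), unaccounted() and sum_of_probes() functions are
hypothetical, and timestamps are passed in as plain tick counts so the numbers
are easy to follow. Only the probe names are borrowed from the discussion.

```c
#include <assert.h>

/* Hypothetical probe ids named after the probes discussed above. */
enum { P_HANDLE_READY_FD, P_COMM_READ_HANDLER, P_COMM_WRITE_HANDLER, P_COUNT };

typedef struct {
    unsigned long start[P_COUNT];  /* tick at which each probe last started */
    unsigned long total[P_COUNT];  /* accumulated ticks per probe */
    int active;                    /* number of probes currently running */
    unsigned long covered;         /* ticks with at least one probe running */
    unsigned long covered_since;   /* tick when active went from 0 to 1 */
} Profiler;

static void probe_start(Profiler *p, int id, unsigned long now)
{
    if (p->active++ == 0)
        p->covered_since = now;    /* first probe in: covered time begins */
    p->start[id] = now;
}

static void probe_stop(Profiler *p, int id, unsigned long now)
{
    assert(p->active > 0);
    p->total[id] += now - p->start[id];
    if (--p->active == 0)
        p->covered += now - p->covered_since;  /* last probe out */
}

/* Time during which no probe at all was active. */
static unsigned long unaccounted(const Profiler *p, unsigned long wall)
{
    return wall - p->covered;
}

/* Naive sum over all probes; nested probes are counted more than once. */
static unsigned long sum_of_probes(const Profiler *p)
{
    unsigned long sum = 0;
    for (int i = 0; i < P_COUNT; i++)
        sum += p->total[i];
    return sum;
}
```

With 100 ticks of wall time, an outer handle_ready_fd probe running from tick
10 to 90, and read and write handler probes nested inside it for 30 ticks
each, the outer probe reports 80, the inner ones 30 + 30, and unaccounted()
reports 20. The naive column total is then 160 ticks out of 100, which is the
over-100% effect described above.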

> Pity we can't get at what part of the select call is overhead and what
> part is idle time.

I can show separate tests for poll() overhead.

> I'm surprised that read_handler is so small, my uneducated guess would
> have pegged it at 50% higher.

It is higher when you have some ACLs. This box has almost no ACLs.
If you mean pure read() times, then things get worse when there are
thousands of concurrent sessions involved. Also, this proxy doesn't
handle any disk I/O in read_handler. Network I/O is fast on average; it's
only when a lot of work is done that it accumulates to notable amounts.

------------------------------------
Andres Kroonmaa <andre@online.ee>
CTO, Microlink Online
Tel: 6501 731, Fax: 6501 725
Pärnu mnt. 158, Tallinn,
11317 Estonia
Received on Fri Sep 28 2001 - 04:43:14 MDT