Re: profiling aufs

From: Andres Kroonmaa <andre@dont-contact.us>
Date: Mon, 7 Oct 2002 15:11:29 +0300

On 7 Oct 2002, at 20:59, Robert Collins <robertc@squid-cache.org> wrote:

> On Mon, 2002-10-07 at 20:34, Henrik Nordstrom wrote:
> > On Monday 07 October 2002 11.00, Robert Collins wrote:
> >
> > > This isn't true. IFF the worker threads do not use -any- common
> > > profiled calls, then it works, otherwise we may get xmalloc timings
> > > trashed (for example).
> >
> > xmalloc isn't exactly thread-safe today if any of the
> > trace/debug/leakcheck xmalloc options is enabled.
>
> Yeah, and this is a minor problem.

 Wait a minute. We use libmalloc, which in 8 cases out of 10 is not
 thread-safe for performance reasons; we use sprintf, fprintf and
 global vars; we use exit(), etc. Making this tiny piece thread-safe
 is no piece of cake, imo.
 The main problem is that the code inside squid as a whole is not
 thread-safe, and there's an unwritten rule that anything used in
 threads is on its own, i.e. you need to write special thread-safe
 versions of whatever you need. That sets thread safety apart from
 the rest of the squid code: it's limited to only the funcs used in
 threads.

> > And I agree that having "threads safe" profiling is a good idea and
> > needs to be investigated. I think a good start would be to allow for
> > separate profiling timers. Having a full profiling block per thread
> > is most likely overkill.
>
> Probably, but the overhead is relatively small, and it seemed simpler to
> simply use a consistent struct across the board.

 I'd agree with this. But pthread_getspecific() would add runtime
 overhead, which may be quite noticeable. Passing a pointer to the
 timers with the threadsp struct at init and keeping it on the stack
 should allow better inlining where possible (global timers).
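
 To make the two options concrete, here is a minimal sketch, not
 actual squid code. The names prof_timers, worker_arg, probe_tsd and
 probe_ptr are made up for illustration; worker_arg merely stands in
 for whatever per-thread struct is handed over at init. Variant A
 pays for a pthread_getspecific() lookup on every probe; variant B
 receives the pointer once and keeps it in a local.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread profiling block (not squid's real struct). */
typedef struct {
    unsigned long calls;
    unsigned long ticks;
} prof_timers;

static pthread_key_t prof_key;
static pthread_once_t prof_once = PTHREAD_ONCE_INIT;

static void prof_key_init(void)
{
    pthread_key_create(&prof_key, NULL);
}

/* Variant A: every probe looks the timers up in thread-specific data. */
static void probe_tsd(unsigned long ticks)
{
    prof_timers *t = pthread_getspecific(prof_key);
    if (t != NULL) {
        t->calls++;
        t->ticks += ticks;
    }
}

/* Variant B: the caller already holds the pointer, so the probe is a
 * plain inlinable update with no library call. */
static inline void probe_ptr(prof_timers *t, unsigned long ticks)
{
    t->calls++;
    t->ticks += ticks;
}

/* Hypothetical init argument handed to the thread at creation. */
typedef struct {
    prof_timers timers;
    int use_tsd;
    int iterations;
} worker_arg;

static void *worker(void *p)
{
    worker_arg *arg = p;
    prof_timers *t = &arg->timers;   /* pointer kept on this thread's stack */
    int i;

    pthread_once(&prof_once, prof_key_init);
    pthread_setspecific(prof_key, t);

    for (i = 0; i < arg->iterations; i++) {
        if (arg->use_tsd)
            probe_tsd(1);
        else
            probe_ptr(t, 1);
    }
    return NULL;
}

/* Runs one worker and returns how many probes it recorded. */
static unsigned long run_worker(int use_tsd, int iterations)
{
    worker_arg arg = { {0, 0}, use_tsd, iterations };
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, &arg) != 0)
        return 0;
    pthread_join(tid, NULL);
    return arg.timers.calls;
}
```

 Both variants record the same counts; run_worker(0, n) and
 run_worker(1, n) each return n. How large the difference in cost is
 depends on the platform's pthread_getspecific() and would have to be
 measured.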

> > An interesting question is how to perform any
> > meaningful statistics on per-thread timers.. the actual timer itself
> > is only a small part, you also need a readout of the timer.
>
> I believe I already addressed this. Unless we are interested in the
> performance within each thread (I don't think we are), we can simply
> collate the statistics each profile event.

 To collate you need exclusive access to thread-specific data.
 Unless you can somehow stop all threads in a known state, that
 requires mutexes for every access to this data. Omitting mutexes
 is thread-unsafe and will regularly lead to garbage results.
 Adding mutexes adds huge runtime overhead. We could add a command
 to read out the timers from each thread, though. That is probably
 even possible, but it would stop the main thread until all queued
 aio ops are completed, i.e. a pretty intrusive mod.
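
 A minimal sketch of the mutex variant, again not squid code: the
 names locked_timers, prof_totals, locked_probe and collate, and the
 constants NTHREADS and NPROBES, are all invented for illustration.
 Every probe takes the slot's mutex, which is exactly the per-access
 cost in question; in exchange, collate() can run while the workers
 are still going and always sees a consistent calls/ticks pair.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 4
#define NPROBES  10000
#define TICKS_PER_PROBE 2UL

/* Hypothetical per-thread timers, each guarded by its own mutex. */
typedef struct {
    pthread_mutex_t lock;
    unsigned long calls;
    unsigned long ticks;
} locked_timers;

typedef struct {
    unsigned long calls;
    unsigned long ticks;
} prof_totals;

static locked_timers slots[NTHREADS];

/* Every single probe pays for a lock/unlock pair. */
static void locked_probe(locked_timers *t, unsigned long ticks)
{
    pthread_mutex_lock(&t->lock);
    t->calls++;
    t->ticks += ticks;
    pthread_mutex_unlock(&t->lock);
}

static void *stat_worker(void *p)
{
    locked_timers *t = p;
    int i;

    for (i = 0; i < NPROBES; i++)
        locked_probe(t, TICKS_PER_PROBE);
    return NULL;
}

/* Sum all slots; each slot is read under its lock, so calls and ticks
 * from one slot always belong together. */
static prof_totals collate(void)
{
    prof_totals sum = { 0, 0 };
    int i;

    for (i = 0; i < NTHREADS; i++) {
        pthread_mutex_lock(&slots[i].lock);
        sum.calls += slots[i].calls;
        sum.ticks += slots[i].ticks;
        pthread_mutex_unlock(&slots[i].lock);
    }
    return sum;
}

static prof_totals run_and_collate(void)
{
    pthread_t tid[NTHREADS];
    prof_totals mid, final;
    int i, started = 0;

    for (i = 0; i < NTHREADS; i++) {
        pthread_mutex_init(&slots[i].lock, NULL);
        slots[i].calls = 0;
        slots[i].ticks = 0;
    }
    for (i = 0; i < NTHREADS; i++) {
        if (pthread_create(&tid[i], NULL, stat_worker, &slots[i]) != 0)
            break;
        started++;
    }

    /* Snapshot while the workers may still be running. */
    mid = collate();
    assert(mid.ticks == TICKS_PER_PROBE * mid.calls);

    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    /* All threads are now stopped in a known state. */
    final = collate();
    for (i = 0; i < NTHREADS; i++)
        pthread_mutex_destroy(&slots[i].lock);
    return final;
}
```

 After the joins the locks in collate() are no longer needed, which
 is the "stop all threads in a known state" case: the readout is
 cheap, but you have to wait for the workers first.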

------------------------------------
 Andres Kroonmaa <andre@online.ee>
 CTO, Microlink Online
 Tel: 6501 731, Fax: 6501 725
 Pärnu mnt. 158, Tallinn,
 11317 Estonia
Received on Mon Oct 07 2002 - 06:20:43 MDT
