Re: profiling aufs

From: Robert Collins <robertc@dont-contact.us>
Date: 07 Oct 2002 23:49:41 +1000

On Mon, 2002-10-07 at 22:11, Andres Kroonmaa wrote:

> > > xmalloc isn't exactly thread safe today if any of the
> > > trace/debug/leakcheck xmalloc options are enabled..
> >
> > Yeah, and this is a minor problem.
>
> Wait a minute. We use libmalloc that's in 8 cases out of 10 not safe
> for performance reasons, we use sprint, fprint and global vars, we
> use exit(), etc.. Making this tiny piece threadsafe is not a piece of
> cake, imo.
> Main problem is that the whole coding inside squid is not threadsafe,
> and there's an unwritten rule that anything that's used in threads is
> on its own, ie. you'd need to write special threadsafe versions of
> what you'd need. That puts thread safeness apart from current squid
> code, it's limited to only funcs used in threads.

Which is why xmalloc thread safeness is a minor problem IMO.

> > Probably, but the overhead is relatively small, and it seemed simpler to
> > simply use a consistent struct across the board.
>
> I'd agree with this. But pthread_getspecific() would add runtime
> overhead, maybe quite notable. Passing a pointer to timers with the
> threadsp struct at init and keeping it on the stack should allow better
> inlining where possible (global timers).

Any sane pthread_getspecific will simply be a lookup into the thread
parameter block, which should be -very- fast as long as it's not
gatewaying into the kernel. This should be easy enough to test also.
Keeping it on the stack will require passing it to every function in the
thread (yuck), or making it a member of an object (requires a struct
common to all in-thread calls either as a has-a or is-a). Either is
doable, but more intrusive into the user code.
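
To make the two options concrete, here is a minimal sketch, not squid
code. The names thread_stats, count_event_tsd, count_event_arg and
run_one_worker are made up for illustration; only the pthread_* calls
are real POSIX API. It shows one counter updated via
pthread_getspecific() and via an explicitly passed pointer, so the two
paths can be timed against each other on a given platform.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical per-thread profiling block; not an actual squid struct. */
typedef struct {
    unsigned long events;
} thread_stats;

static pthread_key_t stats_key;
static pthread_once_t stats_key_once = PTHREAD_ONCE_INIT;

/* Create the thread-specific-data key exactly once per process. */
static void stats_key_init(void)
{
    int rc = pthread_key_create(&stats_key, NULL);
    assert(rc == 0);
    (void) rc;
}

/* Option 1: look the block up through thread-specific data on each call. */
static void count_event_tsd(void)
{
    thread_stats *st = pthread_getspecific(stats_key);
    if (st != NULL)
        st->events++;
}

/* Option 2: the caller passes the block down explicitly. */
static void count_event_arg(thread_stats *st)
{
    st->events++;
}

/* Thread body: registers its block, then counts through both paths. */
static void *worker(void *arg)
{
    thread_stats *st = arg;
    int i;

    pthread_once(&stats_key_once, stats_key_init);
    pthread_setspecific(stats_key, st);
    for (i = 0; i < 1000; i++) {
        count_event_tsd();
        count_event_arg(st);
    }
    return NULL;
}

/* Runs one worker thread and returns the number of events it counted. */
static unsigned long run_one_worker(void)
{
    thread_stats st = { 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, worker, &st) != 0)
        return 0;
    pthread_join(tid, NULL);
    return st.events;
}
```

Wrapping each loop separately with a timer would give a rough idea of
the per-call cost of the lookup on the platform in question.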
 
> > I believe I already addressed this. Unless we are interested in the
> > performance within each thread (I don't think we are), we can simply
> > collate the statistics each profile event.
>
> To collate you need exclusive access to thread-specific data.
> Unless you can somehow stop all threads in a known state, that
> requires mutexes for every access to this data. Omitting mutexes
> is thread unsafe and will lead to regularly garbaged results.

Actually, there are solutions to avoid the need for tight
synchronization. As you suggest, extracting the stats from the thread is
doable. I'd suggest something like this:

Every 5 seconds the thread queue management code switches the stats
struct to a new pointer. The old struct is inserted (thread safe -
interlocked or mutex) into a list of pending structs, with the
timestamp. The new struct is retrieved from a list of used structs.
Then the main event simply grabs the entire pending list (again, a
single interlocked call) and adds to the appropriate stats entries (with
a maximum latency of 10 seconds, unless the disk code really really
sucks :} ). I forget the name for this but it's a variant of code used to
update widely used readonly structs in threaded systems.
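
A rough sketch of that hand-off, using a mutex rather than interlocked
operations. All names here (stats_block, stats_exchange, stats_rotate,
stats_collate) and the two counter fields are invented for
illustration, and the 5 second timer that would call stats_rotate() is
left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stats block; the counter fields are placeholders. */
typedef struct stats_block {
    struct stats_block *next;
    time_t retired_at;          /* when the worker handed the block over */
    unsigned long ops;
    unsigned long busy_usec;
} stats_block;

/* Shared exchange point between worker threads and the main thread. */
typedef struct {
    pthread_mutex_t lock;
    stats_block *pending;       /* retired blocks waiting to be collated */
    stats_block *spare;         /* collated blocks ready for reuse */
} stats_exchange;

static void stats_exchange_init(stats_exchange *ex)
{
    pthread_mutex_init(&ex->lock, NULL);
    ex->pending = NULL;
    ex->spare = NULL;
}

/* Worker side: retire the current block and return the one to use next.
 * If no spare block is available, keep using the current one. */
static stats_block *stats_rotate(stats_exchange *ex, stats_block *cur)
{
    stats_block *fresh;

    pthread_mutex_lock(&ex->lock);
    fresh = ex->spare;
    if (fresh == NULL) {
        pthread_mutex_unlock(&ex->lock);
        return cur;
    }
    ex->spare = fresh->next;
    cur->retired_at = time(NULL);
    cur->next = ex->pending;
    ex->pending = cur;
    pthread_mutex_unlock(&ex->lock);

    fresh->next = NULL;
    fresh->ops = 0;
    fresh->busy_usec = 0;
    return fresh;
}

/* Main side: take the whole pending list in one locked step, add it to
 * the totals without holding the lock, then recycle the blocks. */
static void stats_collate(stats_exchange *ex, stats_block *total)
{
    stats_block *list, *b, *tail = NULL;

    pthread_mutex_lock(&ex->lock);
    list = ex->pending;
    ex->pending = NULL;
    pthread_mutex_unlock(&ex->lock);

    for (b = list; b != NULL; b = b->next) {
        total->ops += b->ops;
        total->busy_usec += b->busy_usec;
        b->ops = 0;
        b->busy_usec = 0;
        tail = b;
    }
    if (tail != NULL) {
        pthread_mutex_lock(&ex->lock);
        tail->next = ex->spare;
        ex->spare = list;
        pthread_mutex_unlock(&ex->lock);
    }
}
```

The worker only touches the block it currently owns, so the counters
themselves are updated without any locking; the mutex is held just for
the pointer swaps.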

Rob

Received on Mon Oct 07 2002 - 07:49:45 MDT
