Re: hi-res profiling

From: Henrik Nordstrom <hno@dont-contact.us>
Date: Wed, 2 Oct 2002 13:36:03 +0200

Robert Collins wrote:

> I disagree. Squid will be running in relocatable in most every OS, so
> making the data static will not enable direct memory access. The best it
> will do is make the lookup DS+constant rather than
> pointervalue+constant. These two ops should be pretty dang close in
> performance, and the tick retrieving call may prevent them having any
> impact anyway. Having a pointervalue lookup is good because then
> essentially no code changes when you introduce thread support - it
> simply becomes the value of the pthread_key. While that particular code
> path will have slightly higher overhead, it shouldn't go into kernel
> space, only into libc to retrieve the thread information block.

At least all UNIXes run with a flat memory space within the process, with
some limited relocation for the application at link time. Libraries are partly
dynamically relocated, but only their code segments (data segments are
relocated at dynamic linking time by rewriting references).

A statically defined array with constant indexes (inlined functions) will
translate to direct memory accesses.

A dynamically allocated array with dynamic indexes (not inlined functions)
translates to an indirect lookup via the pointer address plus calculated entry
offset plus field offset.

Some example gcc -O2 output:

struct some_struct {
        int f1;
        int f2;
};

struct some_struct a1[24];

struct some_struct *a2;

// Direct array access, constant index

        a1[12].f2 = 24;
 8048408: c7 05 44 96 04 08 18 movl $0x18,0x8049644
 804840f: 00 00 00

// Pointer array access, constant index
        a2[12].f2 = 24;
 8048417: a1 a0 96 04 08 mov 0x80496a0,%eax
 804841c: c7 40 64 18 00 00 00 movl $0x18,0x64(%eax)

// Pointer array access, variable index
        a2[i].f2 = 24;
 8048428: a1 a0 96 04 08 mov 0x80496a0,%eax
 804842d: c7 44 d8 04 18 00 00 movl $0x18,0x4(%eax,%ebx,8)
 8048434: 00

Key:
        &a1 = 0x80495e0
        &a2 = 0x80496a0
        24 = 0x18
        (int) 24 = 0x18 0x00 0x00 0x00 (little-endian byte order)
        sizeof (struct some_struct) = 8
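
The displacements and absolute addresses in the listings above follow
directly from the structure layout. As a rough sketch (the helper names
f2_offset and f2_address are made up for illustration, and a 4-byte int
with no padding is assumed, as on the i386 target shown), the arithmetic
can be reproduced like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct some_struct {
    int f1;
    int f2;
};

/* Number of elements in a1, as declared in the example above. */
#define A1_LEN 24

/* Byte offset of element `index`, field f2, from the start of the array.
 * Assumes sizeof(struct some_struct) == 8 (4-byte int, no padding). */
size_t f2_offset(size_t index)
{
    return index * sizeof(struct some_struct)
         + offsetof(struct some_struct, f2);
}

/* Absolute address of a1[index].f2, given the array base address taken
 * from the key. The base is passed in as a number because this sketch
 * only redoes the arithmetic; it does not touch that memory. */
uintptr_t f2_address(uintptr_t base, size_t index)
{
    assert(index < A1_LEN);
    return base + f2_offset(index);
}
```

With index 12 the offset is 12 * 8 + 4 = 0x64, which is the 0x64(%eax)
displacement in the pointer case, and 0x80495e0 + 0x64 = 0x8049644, which
is the constant address stored to in the static case.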

If the size of the structure is not a power of 2, the situation gets worse for
the variable index case:

struct some_struct2 {
        int f1;
        int f2;
        int f3;
};

struct some_struct2 *a3;

        a3[i].f2 = 24;
 8048435: a1 c0 96 04 08 mov 0x80496c0,%eax
 804843a: 8d 1c 5b lea (%ebx,%ebx,2),%ebx
 804843d: c7 44 98 04 18 00 00 movl $0x18,0x4(%eax,%ebx,4)
 8048444: 00

struct some_struct3 {
        int v[13];
};

struct some_struct3 *a4;

        a4[i].v[2] = 24;
 8048428: 8d 04 5b lea (%ebx,%ebx,2),%eax
 804842b: 8d 04 83 lea (%ebx,%eax,4),%eax
 804842e: 8b 15 c8 96 04 08 mov 0x80496c8,%edx
 8048434: c7 44 82 08 18 00 00 movl $0x18,0x8(%edx,%eax,4)
 804843b: 00

While the constant index into a static array is obviously still a direct memory
access:

struct some_struct3 a5[24];

        a5[12].v[2] = 24;
 804843c: c7 05 58 99 04 08 18 movl $0x18,0x8049958
 8048443: 00 00 00
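
To make the extra cost in the variable index cases explicit: x86 addressing
can only scale an index by 1, 2, 4 or 8, so for a 12-byte or 52-byte
element the compiler first builds the multiplier with lea instructions.
The sketch below mirrors those instruction sequences step by step in C.
The function names are invented for illustration, and it assumes a 4-byte
int with no padding, as in the listings above.

```c
#include <stddef.h>

struct some_struct2 {
    int f1;
    int f2;
    int f3;
};

struct some_struct3 {
    int v[13];
};

/* Mirrors: lea (%ebx,%ebx,2),%ebx ; movl $0x18,0x4(%eax,%ebx,4)
 * Returns the byte offset of a3[i].f2 from the pointer value. */
size_t offset_struct2_f2(size_t i)
{
    size_t t = i + i * 2;   /* lea: t = 3 * i */
    return 0x4 + t * 4;     /* displacement 4, scale 4: 12 * i + 4 */
}

/* Mirrors: lea (%ebx,%ebx,2),%eax ; lea (%ebx,%eax,4),%eax ;
 *          movl $0x18,0x8(%edx,%eax,4)
 * Returns the byte offset of a4[i].v[2] from the pointer value. */
size_t offset_struct3_v2(size_t i)
{
    size_t t = i + i * 2;   /* first lea:  t = 3 * i */
    t = i + t * 4;          /* second lea: t = i + 12 * i = 13 * i */
    return 0x8 + t * 4;     /* displacement 8, scale 4: 52 * i + 8 */
}

/* Reference values computed from the structure layout itself. */
size_t layout_struct2_f2(size_t i)
{
    return i * sizeof(struct some_struct2)
         + offsetof(struct some_struct2, f2);
}

size_t layout_struct3_v2(size_t i)
{
    return i * sizeof(struct some_struct3)
         + offsetof(struct some_struct3, v) + 2 * sizeof(int);
}
```

So the 12-byte element costs one extra lea and the 52-byte element two,
on top of the pointer load, while the static constant-index store needs
none of this because the whole address is folded into the instruction.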

Regards
Henrik
Received on Wed Oct 02 2002 - 05:36:07 MDT
