Re: squid-3.HEAD IPAddress leak

From: Amos Jeffries <squid3@dont-contact.us>
Date: Fri, 8 Feb 2008 15:52:58 +1300 (NZDT)

> On Fri, 2008-02-08 at 11:46 +1300, Amos Jeffries wrote:
>
>> The basics:
>>
>> addrinfo* extends the neutral sockaddr* type ( union { sockaddr_in,
>> sockaddr_in6, sockaddr_storage} ) while adding the size values and
>> moving the flags and protocol details.
>
> Hi Amos,
>
> I understand that we need to support IPv4 and IPv6 addresses. I do
> not understand why we are suddenly talking about pointers and trees?
>
> I can understand if wherever Squid2 was using an IPv4 address, Squid3
> would use an (ip4, ip6, whichOneFlag) class with appropriate methods
> that check the flag and return or set the right thing. I know that would
> not be the best way to implement it, but it would work and I would
> understand what is going on.

The protocol extensions defined to replace sockaddr_in and in_addr went
through three generations of 'migration-pathway' attempts.
The first;
  sockaddr* defines a union type that was nasty for casting and did not
include that flag you mention in the structure itself. Most of the
accept(), connect(), bind() etc still use this as their input type.

The second;
 sockaddr_storage (as Husni uses, and Adrian mentioned) was created to
provide a better way of using sockaddr* so the sockaddr_in and
sockaddr_in6 bits could be read-written easily. But the big/litte endian
problems between OS screwed up the sockaddr_in* sa_family field locations
inside it so developers still can't portably use it for the v6/v6 flag
they wanted.

The third;
  addrinfo* defines a whole new type. Wrapping that old sockaddr* mess and
providing in a nice set fo bells and whistles for use. Most importantly
that flag we need to pas to the system calls.

It is in _exactly_ the class you mention. But in struct form so C people
can use it too.

>
> Can you explain AddrInfo in the above simple (ip4, ip6, whichOneFlag)
> class terms?

There are a few extra bits I'v snipped from this explanation.

The important bits for the squid comm code are:

struct addrinfo
{
... here is that flag you want to switch on if its been set properly.
... AF_INET (IPv4) or AF_INET6 (IPv6) or AF_UNSPEC (NULL, error, 'whatever')
  int ai_family;

... some bits needed only by the socket() creation calls
  int ai_socktype;
  int ai_protocol;

... and the actual data for the IP address:

... (1) pointer to somewhere holding the sockaddr_storage/*_in/*_in6
  struct sockaddr *ai_addr;

... (2) and the sizeof() for the (1) object.
... if the flag is not set this _might_ be magic'd instead.
  int ai_addrlen;

... along with some bits set when ia DNS lookup has been done
... _sometimes_ it's the host FQDN
... optional so we never new/free this ourselves in squid.
  char *ai_canonname;

... and because some system lookups in IPv6 can return _multiple_ addresses
... it has a linked-list of branchs for each of the alternatives
  struct addrinfo *ai_next; /* Pointer to next in list. */
};

> Why do we need AddrInfo now

To kill dozens of castings, magic operations on object sizes, and cut down
dozens of lines of 'compatibility code'.

> and why are we talking about a
> "tree" with "pointers" and alloc/free operations?

I see a tree of pointers I call it a tree even if its a class-1 (dynamic
linked-list).

addrinfo - makes a tree node.
   ai_addr (a sockaddr pile of unfortunate structs)
   ai_canonname (a null-terminated string),
   ai_next (another addrinfo .. repeat ad inifinitum)

Together they make a pretty tree. But every used piece is eseentially
another new, memset, free.

HTH.

>
> Did Squid2 use an equivalent of AddrInfo? If yes, did it use it as often
> as Squid3 does now? Did Squid2 allocated and freed it?

What I've seen of Husni's final 2.6s13 patch he used addrinfo in many of
the same system-interface places and uses the naked system
getaddrinfo()/freeaddrinfo() calls where I use my GetAddrInfo() wrappers
to the system ones.
Then where he is passing around either sockaddr* or sockaddr_storage*
where I universally use IPAddress& .

There are a few spots where a 'read-only' variant might be possible. So as
to pretend to allocate when no new/free is actually done, just a pointer
to the IPAddress field and a local addrinfo.
One of them is this connect() IFF that bug can be guaranteed not to
overwrite the internal IPAddress data and warp squid.

I may have some time to do this tonight. Bug me about it if I'm on.

Amos

>
> Thank you,
>
> Alex.
>
>
>> I'm using it in three ways.
>>
>> First Usage: [simplest via InitAddrInfo() paired with an assignment
>> somewhere ]
>> - to create an empty addrinfo tree root and pass to the system to be
>> filled out. The results are copied to the fde or xxData objects for
>> later
>> async use.
>>
>>
>> Second usage: [ hiding complex setup via GetAddrInfo() ]
>> - to create addrinfo tree containing the current IPAddress so system
>> or
>> squid can use the structure fields in some comm call.
>> Variant 1: addrinfo provides: AF_*, sockaddr_storage*, and sizeof()
>> results in one package with fixed field names/locations for squid to
>> send individually to a system call without mode-specific #ifdef's.
>> Variant 2: whole addrinfo goes to system for use.
>>
>>
>> Third Usage: [ hiding complex usage via GetAddrInfo() paired with an
>> assignment somewhere ]
>> - to create addrinfo tree containing the current IP and socket
>> information so system can update/add its knowledge of the FQDN,
>> hostname, alternate IPs, socket options, or original destination (in NAT
>> case).
>> (both above Variantions apply here too)
>> - which then gets saved somewhere for squid to use later.
>>
>>
>> All three uses above end up with a tree of dynamic data squid is
>> expected
>> to cleanup. So all uses are terminated with a FreeAddrInfo() to perform
>> that cleanup.
>>
>> >
>> > There's a netural sock addr size type - its called sockaddr_storage.
>> > addr_info is for return results from hostname<->ip queries..
>> >
>>
>> struct addrinfo extends struct sockaddr_storage to handle multiple IPs,
>> hostnames, and removes the need for many type-castings in user-space
>> code.
>>
>> Amos
>>
>> >
>> > Adrian
>> >
>> >
>> > On Fri, Feb 08, 2008, Amos Jeffries wrote:
>> >> > On Thu, Feb 07, 2008, Robert Collins wrote:
>> >> >
>> >> >> > Still, this is one of those "death of a thousand cuts" method of
>> >> >> > killing performance..
>> >> >>
>> >> >> Right, I haven't seen the commit; care to mail the diff?
>> >> >
>> >> > Which? I just looked at the places where Amos is calling
>> GetAddrInfo()
>> >> > and FreeAddrInfo(); more then one are:
>> >> >
>> >> > * GetAddrInfo(temp, );
>> >> > * F->{something} = temp;
>> >> > * FreeAddrInfo(temp);
>> >>
>> >> Where?!
>> >>
>> >> I hate addrinfo enough that I only added addrinfo where it needed:
>> >>
>> >> GetAddrInfo(temp)
>> >> syscall_needing_neutral_sockaddr_size_family(temp)
>> >> FreeAddrInfo(temp)
>> >>
>> >> OR:
>> >>
>> >> GetAddrInfo(temp)
>> >> syscall_altering_addrinfo(temp)
>> >> F->something = temp;
>> >> FreeAddrInfo(temp)
>> >>
>> >>
>> >> Amos
>> >>
>> >
>> > --
>> > - Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid
>> > Support -
>> > - $25/pm entry-level VPSes w/ capped bandwidth charges available in WA
>> -
>> >
>>
>
>
Received on Thu Feb 07 2008 - 19:53:07 MST

This archive was generated by hypermail pre-2.1.9 : Sat Mar 01 2008 - 12:00:09 MST