Re: [DRAFT][MERGE] Cleanup comm outgoing connections in trunk

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Fri, 25 Jun 2010 15:54:27 +1200

Alex Rousskov wrote:
> On 05/24/2010 03:46 AM, Amos Jeffries wrote:
>> Alex Rousskov wrote:
>>> * Comm::Connection copy constructor does not copy peer correctly OR
>>> Comm::Connection destructor does not clean peer correctly. It may be a
>>> good idea to hide peer_ from public use to ensure it is always
>>> locked/unlocked properly.
>> Okay. Fixed the constructors and hidden the _peer variable behind
>> accessors.
>> One thing though, the set accessor can do cbdata ops fine, is it okay
>> for the retrieve acessor to just return the ptr? leaving
>> cbdataReference*( x.peer() ) to the calling code?
>
> Yes, it is OK. The object storing a pointer should lock/unlock it as
> needed. Accessors should not do that if they just return what was
> already stored.
>
>
>> Which is preferred in this case where the paths is only valid as
>> long as FwdStateData exists? RefCounted or cbdataReference?
>
> I prefer RefCounted if you do not need to separate "valid" from
> "allocated". Looks like that is what your second revision implements.
>
>
>> dialer makes sure the object being dialed still exists, or drops the
>> call. Am I correct?
>
> You are correct. However, when the Call object stores something, we need
> to be careful that this "something" is still valid when the Call object
> accesses it. For example, a Call or Dialer object should not print a
> call argument for debugging if that argument may be invalid. This is a
> general answer to a general question; not specific to your changes.
>
>
>> I may have overlooked somewhere where FwdStateData could be deleted by
>> operations parallel to FwdStateData::start(). Though I believe if that
>> is possible we have a terribly huge race bug somewhere.
>
>
> AsyncJobs should be started using AsyncStart() which never fails. Any
> failures inside AsyncJob::start() should be no different from failures
> inside any other part of the job that is executed asynchronously because
> the initiator code does not really know when start() will be executed.
>
> IIRC, we violate the AsyncStart rule in client_side*.cc (at least).
>
> Again, general answer here, not specific to your code.
>
>
>>> * Please remove ConnectStateData solo connections as
>>> exceptional/different code path. It has already bitten you (see below).
>>> One alternative is to create paths with a single connection instead
>>> (inside the solo constructor which will be the only solo-specific code
>>> then). Another alternative is to add path/solo access methods that hide
>>> the difference from the rest of the ConnectStateData code without the
>>> Paths creation overhead.
>> This was the first model I started with. It runs onto severe problems
>> with connections such as Ident, slow ACL, and DNS where there is no
>> state object to hold the authoritive lifetime of the paths vector.
>
> Neither of the two proposed solutions assume the presence of such an
> object (other than ConnectStateData object itself).
>
>> Its also terribly inefficient allocating a vector, adding the single
>> Comm::Connection object and pushing it, having comm navigate through
>> extra API layers to validate its presence then access it.
>
> The second solution does not allocate a vector.
>
>
>> You will have to straighten me out on this but I think it may be needed
>> by ICAP Xaction when the ICAP is configured with a service name (DNS
>> lookup plus vector of destinations assembled) instead of IP address
>> (singleton).
>
> Not sure. Do "paths" stand for multiple IP addresses with a single DNS
> name? AFAICT, it is not documented or, to be more precise, the paths
> members are documented as being ... paths.

They can do. Paths is at the IP level below the name level of DNS, so
one DNS result in multiple potential connection paths.

We don't yet expand the outgoingAddr set for each of those destinations,
but the potential is becoming present to do so in future.

>
>
>>> * Whenever possible, please declare Pointer parameters and Pointer
>>> return types as const references to avoid unnecessary locking/unlocking.
>>>
>> Unfortunately its not easy to set const on the volatile Comm::Connection
>> objects in outbound connection handling. They get created as
>> might-be-used objects and released on failures. Activating one will
>> shortly result in Comm updating it's properties (transforming wildcard
>> IP to specific on-link IPs etc).
>
> I was not talking about that. I will try to be more specific in the next
> review email.
>
>
>>> * We now have ConnectStateData and ConnStateData public classes.
>>> StateData suffix is already wonderfully descriptive, but this brings
>>> confusion to the next level.
>> Aye. I'm slowly working through comm.cc figuring out what bits are used
>> by which side. ConnStateData will hopefully get a better descriptive
>> name when its scope is clearly known.
>
>
> Let's solve this by renaming ConnectStateData instead. I believe you
> already defined ConnStateData scope. With some polishing touches I would
> restate it as:
>
> /// Async-opener of a Comm connection.
> /// Can find the first "working" among multiple destinations.
>
> If you agree with the above scope, let's s/ConnectStateData/ConnOpener/g
>
>
>>> I think the primary reason for most of these problems is that the new
>>> Connection class has dual personality. It is both holds the information
>>> necessary to establish a connection _and_ the established connection
>>> information. I suggest to split Connection into two classes:
>>>
>>> 1) Create a PeerPort or PeerAddress class that will hold information
>>> necessary to connect to a peer. This can be a cache_peer, an origin
>>> server, and ICAP server, and [in SMP branch] another Squid process.
>>> FwdState and others that need to cycle through peers will create and
>>> pass PeerPorts or PeerAddresses arrays to Comm::Connector class (see
>>> below).
>>>
>>> 2) Connection object will have a peer member of type PeerPort or
>>> PeerAddress. It will also have fd and other fields necessary for an
>>> _established_ connection.
>>>
>>> 3) Rename ConnStateData to Connector and document it as "creates a
>>> Connection to [one of] the supplied PeerPorts". This class will iterate
>>> peer details, try opening connections, and will eventually succeed or
>>> fail. When it succeeds, it will create the corresponding Connection
>>> object and return it to the caller.
>>>
>> This is back to the model we are trying to get away from.
>
> I disagree that what I am proposing is what we should get away from
> (naturally).
>
> The value of your changes is in moving the handling of connection
> retries and multiple destinations from high-level code to a dedicated
> ConnOpener class. I agree that it is the right thing to do. No argument
> there.
>
> I just think it should be done differently when it comes to handling
> possible Connection destinations. I think it is wrong to store a vector
> of unconnected Connections and then use the first element when we can
> store a vector of Destinations and use one resulting Connection instead.
>

Agreed. The Destination being IPs, we end up with an object which stores:
  destination IP + port
  intended local IP + port
  intended TCP / TLS flags
  flags for whether FD is pconn or pinned
  ptr to the peer if a peer link.

... (the fields of Comm::Connection data storage).

using two objects would call for this pathway:
   generated by peer selection, passed to forwarding,
  passed to comm, COPIED to comm with an FD number added, COPIED to
logging, then discarded.

Using Comm::Connection from template stage through to completed
connection with just an update of the FD field (instead of flag + field)
gets rid of one copy already. Next upgrade will be towards getting rid
of the second.
  Along with several other copies on the client conn.

>
>> Your PeerPort class == FwdServer (generated by peer select and handled
>> by FwdStateData when calling ConnectStateData)
>
> They are probably similar, but the current code is not using FwdServer
> for all connection establishment calls. I think it is good to have a

It is. The FwdServer fields get de-constructed into parameters for
multiple comm_* functions in forwarding. Which permeate through comm to
an fde piecemeal.

> single Connection Destination class(es) that all code is using in a
> uniform fashion. Your code does that, but using the wrong class
> (full-blown Connection instead of small and specific Destination).
>
>
>>> * Will Comm::Connection be used for incoming connections as well? If
>>> not, the name is probably too generic. Should be something like
>>> PeerConnection. Or perhaps we do not need this class at all?! If it is
>>> only used as an information holder during callback (after the above
>>> PeerPort changes) then the callback data should be modified to carry
>>> that information instead of adding a new class.
>>>
>> see comm flow description above.
>
>
> I still think the above concern is valid. I hope that the problems with
> the current implementation (a collection of unconnected connections) are
> clear enough. I tried to clarify above that I am not proposing to going
> back to "handle everything in high-level code" model. I only want to
> change the connection destination type (solo and paths) from
> "unconnected Connection" to "Connection Destination".

We have a miss-communication here.
You describe a concern:

>>> * Will Comm::Connection be used for incoming connections as well?

Answer being: Yes it is used for incomming.

>>> If not, the name is probably too generic. Should be something like
>>> PeerConnection.

Therefore renaming (on these particular grounds) is not grounded.

>>> Or perhaps we do not need this class at all?! If it is
>>> only used as an information holder during callback (after the above
>>> PeerPort changes) then the callback data should be modified to carry
>>> that information instead of adding a new class.

There are several different ways this same identical data is being used,
fields of ConnStateData, fde, HttpRequest meta fields, AccessLogEntry
fields. function parameters passed around piece-meal with some complex
lookups needed when any given piece of the data is not passed over some
middle section of code (comm_local_port() etc).

This is an aim to normalize the particular and small TCP link data
end-to-end across the squid processing pathways without copying.

I have not yet fully merged fde in yet, but when done the fd_table can
be a table/vector/map of pointers to Comm::Connection which have been
opened.

>
> Review of your latest patch will follow in a few hours.

Responding to particular changes in that emails response.

Amos

-- 
Please be using
   Current Stable Squid 2.7.STABLE9 or 3.1.4
Received on Fri Jun 25 2010 - 03:54:45 MDT

This archive was generated by hypermail 2.2.0 : Sat Jun 26 2010 - 12:00:07 MDT