Re: [PREVIEW] cache_peer standby=N

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Sat, 08 Feb 2014 22:41:00 -0700

On 02/08/2014 04:34 AM, Amos Jeffries wrote:

>>> My understanding of the plan for runners was that it would be more
>>> appropriate for a pre-config runner which saved the state prior to
>>> config parsing and a post-config runner which sync'd them together.
>>>
>>> I see no need for a sync() method in that model. Two runners with run()
>>> would do it just fine.

>> I doubt it is a good idea to handle [re]configuration of a single module
>> using two different objects: initial configuration (runner1),
>> pre-reconfiguration (runner2), post-reconfiguration (runner2), final
>> shutdown (runner1). Those two objects will need access to the same data
>> in most cases; why separate them? The two objects will also need two
>> different registries, creating two parallel registry sets, even though
>> we will probably keep the calling order in the sets the same.
>>
>> IMO, we should use just one runner object, responsible for module
>> [re]configuration. To use one object, we need at least one new method
>> (or add run() method parameters, but that is icky!). We can convert that
>> into two objects/registry sets later if really needed.

> You mean parameters like which registry/hook is being run?
> virtual void run(const RunnerRegistry &r) = 0;

Not exactly. It is not a very good idea to use the existing registry
parameter for this purpose because the runner object should not really
know or care exactly which registry object it is being run by. The
mapping between runners and registries and the list of available
registries itself is a different layer that individual runner objects
should not deal with. We currently use simple global enum instead of
true registries so it is tempting to start writing "if r is
rrFinalizeConfig" statements, but that kind of knowledge should be kept
at a different layer IMO.

What I meant by that "icky" solution is adding a new parameter that
tells the run() method for which configuration stage it is being called:

    enum CfgStage { cfgInitial, cfgSave, cfgSync };
    virtual void run(const RunnerRegistry &, CfgStage stage) = 0;

This is, of course, very C-ish and icky on many levels. A dedicated
method for each configuration stage is a good way to do it.
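To make the dedicated-method alternative concrete, here is a rough sketch of what such an interface could look like. The class and method names below are mine, chosen for illustration only; they are not the actual Squid RunnersRegistry API:

```cpp
#include <string>
#include <vector>

// Model (A) sketch: one runner object owns a module's configuration
// state and exposes a dedicated virtual method for each stage, instead
// of a single run(stage) entry point or two paired runner objects.
class ConfigRunner {
public:
    virtual ~ConfigRunner() {}
    virtual void useConfig() {}         // after initial configuration
    virtual void startReconfigure() {}  // save state before reparsing
    virtual void syncConfig() {}        // reconcile saved state with new config
};

// A hypothetical module runner: because a single object handles every
// stage, its saved state is naturally available to all of them.
class PeerPoolsRunner: public ConfigRunner {
public:
    void useConfig() override { stages.push_back("initial"); }
    void startReconfigure() override { stages.push_back("save"); }
    void syncConfig() override { stages.push_back("sync"); }

    std::vector<std::string> stages; // records the stages seen, for the demo
};
```

The registry layer would then call the stage method appropriate for the current configuration phase, and the runner object never needs to know which registry invoked it.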

> The child classes could of course have methods handling the different
> registry points differently if they need to. But that is child
> implementation not needing multiple entry points into the same registry
> hook.
>
> The tricky (or not) part would just be ensuring that two or more
> registries contained entries to the same object and that destruction of
> the registries did not step on each other's feet.

I think there are two correct ways to model this:

A) Focus on configuration state stored or managed by the runner. Any
configuration adjustments go through the runner object. The runner class
has a method dedicated to each kind of adjustment. This is the model I
have proposed.

B) Focus on always pairing configuration events: Initial configuration
and final destruction; pre-configuration "save" and post-configuration
"sync". This is the model you will get if you limit runners to a single
run() method (plus destructor) and add reconfiguration-specific registries.

For the current Squid code, I believe (A) has the right focus and is a
more "natural" solution. (B) adds strong pairings that are not really
essential for most of the code and also increases complexity by requiring
more registries, which are already tricky to order and document.

The only potential problem with (A) that I can think of is that it
implies the same registry order for both initial configuration and
reconfiguration. We might eventually run into an unnatural situation
where the initial configuration order for registries has to differ from
their reconfiguration order. Until that happens (and it may never happen
AFAICT), this single-order property should be viewed as a strength: we
only need to get the registry order right once!

I hope the above detailed dissection of the problem convinces you that
the proposed approach with new methods (i.e., A) is a good one. If not,
please explain why you think (B) or any other approach is better.

>>>> Other seemingly unrelated changes triggered by standby=N addition
>>>> -----------------------------------------------------------------
>>>>
>>>> * Removed PconnPool from fde.h. We used to create immortal PconnPool
>>>> objects. Now, standby pools are destroyed when their peer is destroyed.
>>>> Sharing raw pointers to such pools is too dangerous. We could use smart
>>>> pointers, but PconnPools do not really belong to such a low-level object
>>>> like fde IMHO.
>>>
>>> Agreed.
>>>
>>> You can still use smart pointers for the CachePeer members to avoid
>>> explicit memory management code in CachePeer.
>>
>> Yes, the patch already uses a CbcPointer for the manager job. Do you
>> think I should make PconnPool refcountable? I am a little worried about
>> what that would do to the old global pools -- the old code may not be ready
>> for their destruction at Squid exit time...

[...]
> If the global pool is going to be used in smart Pointers or references
> to it are, then it does need to have the global reference be a Pointer
> as well to prevent early destruction.

Exactly! And that global Pointer is going to destruct the object when
Squid exits. As I said, I am worried that the old code may not be ready
for "global" pools destruction at Squid exit time. Currently, we never
destruct them.
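To illustrate the lifetime concern, here is a generic sketch, with std::shared_ptr standing in for Squid's refcounting Pointer and a trivial Pool standing in for PconnPool (both substitutions are mine, for illustration only):

```cpp
#include <memory>

// Stand-in for PconnPool; the real destructor would close idle
// connections and touch Comm state, which may be unsafe at exit time.
struct Pool {
    explicit Pool(bool &flag): destroyed(flag) {}
    ~Pool() { destroyed = true; }
    bool &destroyed;
};

// Demonstrates the two sides of making the global reference a Pointer:
// the pool outlives its non-global users (no early destruction), but the
// global Pointer does destruct the pool when it itself goes away -- which
// for a true global means static destruction at Squid exit.
bool demoGlobalPointerLifetime() {
    bool destroyed = false;
    {
        std::shared_ptr<Pool> global = std::make_shared<Pool>(destroyed);
        {
            std::shared_ptr<Pool> user = global; // some code using the pool
        } // user is gone, but the global reference keeps the pool alive
        if (destroyed)
            return false; // would be an early-destruction bug
    } // the last (global) Pointer destructs the pool, as it would at exit
    return destroyed;
}
```

It is exactly that final, at-exit destruction step that the current never-destructed global pools have never had to survive.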

> NP: at Squid exit time the FD table sockets will have already been
> forcibly closed by the broken shutdown process, invalidating the pool
> entries without freeing their memory or setting their FD values to -1.
> Any use of the pool that late is already seriously broken.

Yes, but since we have all seen many low-level at-exit problems with
various Comm-related calls, I hesitate to open that Pandora's box for
global pconn pools without a pressing need to do so. Conceptually, there
is nothing wrong with using refcounting for pools, of course.

>>>> * Added FwdState::closeServerConnection() to encapsulate server
>>>> connection closing code, including the new noteUses() maintenance. Also
>>>> updated FwdState::serverClosed() to do the same maintenance.
>>>>
>>>> * Encapsulated commonly reused ToS and NfMark code into
>>>> GetMarkingsToServer(). May need more work in FwdState where some
>>>> similar-but-different code remains.
>>>
>>> Please merge that as a separate project.
>>
>> Do you want me to commit the above two changes now? Or do you need me to
>> post them as a stand-alone patch for a separate review?
>
> I don't think another review is necessary. The patch segments related to
> them look okay (sans that noteUses() bit you highlighted as new for the
> pools change of course).

OK, I will work on committing those changes separately per your request.

Thank you,

Alex.
Received on Sun Feb 09 2014 - 05:41:09 MST
