Re: [PREVIEW] cache_peer standby=N

From: Amos Jeffries <squid3_at_treenet.co.nz>
Date: Sun, 09 Feb 2014 00:34:22 +1300

On 8/02/2014 8:57 p.m., Alex Rousskov wrote:
> On 02/07/2014 11:33 PM, Amos Jeffries wrote:
>> On 8/02/2014 10:14 a.m., Alex Rousskov wrote:
>>> The feature focus is to instantly provide a ready-to-use connection to a
>>> cooperating cache peer, virtually at all times. This is useful when
>>> connection establishment is "too slow" and/or when infrequent peer use
>>> prevents Squid from combating slow connection establishment with the
>>> regular idle connection pool.
>
>>> Max-conn=M with standby=N
>>> -------------------------
>>>
>>> The relationship between the max-conn limit and standby/idle connections
>>> is a complex one. After several rewrites and tests, Squid now obeys
>>> max-conn limit when opening new standby connections and accounts for the
>>> standby connections when checking whether to allow peer use. This often
>>> works OK, but leads to standby guarantee violations when non-standby
>>> connections approach the limit.
>>
>> This tells me the "guarantee" is wrong.
>>
>> The statement above "All N standby connections are not opened at once"
>> also indicates the impossible nature of such guarantees.
>
> Indeed. The patch does not use that terminology. I will try to avoid,
> qualify, or at least quote "guarantee" during future discussions. Do you
> know of a guarantee that is and will be always valid? [rhetorical]
>
>
>> A guarantee of "up to N" rather than "N at (virtually) all times" would
>> have neither of these violation problems.
>
> And would not be much of a guarantee either since "up to N" includes
> zero :-).
>
> In practice, the code does provide N standby connections in most cases,
> given a reasonable configuration and other usual preconditions, but no
> firm guarantees of any kind, of course.
>
>

 :-)

>>> Added reconfiguration support to RegisteredRunner API
>>> -----------------------------------------------------
>>>
>>> Added RegisteredRunner::sync() method to use during Squid
>>> reconfiguration. The existing run() method and destructor are great for
>>> the initial configuration and final shutdown, but do not work well for
>>> reconfiguration when you do not want to completely destroy and then
>>> recreate the state. The sync() method (called via SyncRegistered) can be
>>> used for that.
>>>
>>> Eventually, the reconfiguration API should present the old "saved"
>>> config and the new "current" config to RegisteredRunners so that they
>>> can update their modules/features intelligently. For now, they just see
>>> the new config.
>>>
>>> If the feature is merged, this change will go in a separate sub-commit.
>
>
>> My understanding of the plan for runners was that it would be more
>> appropriate for a pre-config runner which saved the state prior to
>> config parsing and a post-config runner which sync'd them together.
>>
>> I see no need for a sync() method in that model. Two runners with run()
>> would do it just fine.
>
> I doubt it is a good idea to handle [re]configuration of a single module
> using two different objects: initial configuration (runner1),
> pre-reconfiguration (runner2), post-reconfiguration (runner2), final
> shutdown (runner1). Those two objects will need access to the same data
> in most cases; why separate them? The two objects will also need two
> different registries, creating two parallel registry sets, even though
> we will probably keep the calling order in the sets the same.
>
> IMO, we should use just one runner object, responsible for module
> [re]configuration. To use one object, we need at least one new method
> (or add run() method parameters, but that is icky!). We can convert that
> into two objects/registry sets later if really needed.

You mean parameters like which registry/hook is being run?
As seen in src/base/RunnersRegistry.h line 51:

    virtual void run(const RunnerRegistry &r) = 0;

The child classes can of course handle the different registry points
differently if they need to. But that is a child implementation detail;
it does not require multiple entry points into the same registry hook.
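A minimal sketch of that single-entry-point pattern: one runner object,
registered at several activation points, dispatching on the registry
parameter. The hook names and the PeerPoolsRunner class below are
hypothetical stand-ins, not Squid's actual code; only the run()
signature comes from src/base/RunnersRegistry.h.

```cpp
#include <cassert>
#include <string>

// Hypothetical registry hooks; Squid's real enum values differ.
enum RunnerRegistry {
    rrAfterConfig,
    rrBeforeShutdown
};

class RegisteredRunner {
public:
    virtual ~RegisteredRunner() {}
    // The single entry point; the registry being activated is the parameter
    // (signature as quoted above from src/base/RunnersRegistry.h).
    virtual void run(const RunnerRegistry &r) = 0;
};

// One child object serving several activation points without
// extra virtual methods: it keys its behaviour off the hook.
class PeerPoolsRunner : public RegisteredRunner {
public:
    std::string lastAction;
    virtual void run(const RunnerRegistry &r) {
        if (r == rrAfterConfig)
            lastAction = "configure"; // e.g., (re)build standby pools
        else
            lastAction = "teardown";
    }
};
```

The cost, as noted below, is making sure the registries that share the
object agree on who destroys it.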

The tricky (or not) part would just be ensuring that two or more
registries contained entries for the same object and that destruction of
the registries did not step on each other's feet.

>
> You are right that when we start to support modules that need to save
> their state before reconfiguration (apart from Squid saving the global
> configuration object), we will need to add another, fourth method. For
> example:
>
> run()
> save()
> sync()
> destructor
>
> I did not want to add save() now because the PconnPool module does not
> need it now (and I prefer not to design code without specific use
> cases), but I did think about this and convinced myself that adding
> sync() is not going to make things worse for those addressing
> pre-reconfiguration needs in the future. They would just add another
> method and the corresponding activation code in main.cc.

Sure.
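The four-step lifecycle above could be sketched roughly like this. The
method names run()/save()/sync() come from the discussion; the class
name and bodies are illustrative stand-ins, not the patch's code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One runner object responsible for a module's whole [re]configuration
// lifecycle: initial configuration, (future) pre-reconfiguration save,
// post-reconfiguration sync, and final shutdown in the destructor.
class ModuleConfigRunner {
public:
    std::vector<std::string> calls; // records activation order for the sketch

    void run()  { calls.push_back("run");  } // initial configuration
    void save() { calls.push_back("save"); } // before reconfiguration (future)
    void sync() { calls.push_back("sync"); } // after reconfiguration
    ~ModuleConfigRunner() {}                 // final shutdown
};
```

With one object, all four steps naturally share the same module state,
which is the argument above against splitting them across two runners.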

>
> Does this explanation address your concern?
>

Not yet.

>
>
>>> Other seemingly unrelated changes triggered by standby=N addition
>>> -----------------------------------------------------------------
>>>
>>> * Removed PconnPool from fde.h. We used to create immortal PconnPool
>>> objects. Now, standby pools are destroyed when their peer is destroyed.
>>> Sharing raw pointers to such pools is too dangerous. We could use smart
>>> pointers, but PconnPools do not really belong to such a low-level object
>>> like fde IMHO.
>>
>> Agreed.
>>
>> You can still use smart pointers for the CachePeer members to avoid
>> explicit memory management code in CachePeer.
>
> Yes, the patch already uses a CbcPointer for the manager job. Do you
> think I should make PconnPool refcountable? I am a little worried what
> that would do to the old global pools -- the old code may not be ready
> for their destruction at Squid exit time...

Being RefCountable (i.e. having un/lock() methods and a counter) imposes
nothing on objects which are not pointed to by RefCount smart pointers
(i.e. globals or local variable instances).

If the global pool is going to be used via smart Pointers, or references
to it are, then the global reference does need to be a Pointer as well,
to prevent early destruction. That is the only implication for globals.
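To illustrate why the global reference must itself be a Pointer: with a
raw global, the last scoped Pointer to go out of scope would delete the
pool out from under the global. This is a minimal reimplementation of
the lock()/unlock() counting idea, not Squid's actual RefCount code.

```cpp
#include <cassert>

// Minimal stand-in for a refcountable base: un/lock() methods and a counter.
class RefCountable_ {
public:
    RefCountable_(): count_(0) {}
    virtual ~RefCountable_() {}
    void lock() { ++count_; }
    bool unlock() { return --count_ == 0; } // true when the last reference is gone
    int refCount() const { return count_; }
private:
    int count_;
};

// Minimal smart Pointer: locks on construction/copy, unlocks (and
// possibly deletes) on destruction.
template <class C>
class Pointer {
public:
    Pointer(C *p = 0): p_(p) { if (p_) p_->lock(); }
    Pointer(const Pointer &o): p_(o.p_) { if (p_) p_->lock(); }
    ~Pointer() { if (p_ && p_->unlock()) delete p_; }
    C *operator->() const { return p_; }
    C *raw() const { return p_; }
private:
    Pointer &operator=(const Pointer &); // not needed for this sketch
    C *p_;
};

class PconnPool_: public RefCountable_ {}; // hypothetical pool stand-in

// The global reference is a Pointer, so scoped Pointers coming and going
// cannot trigger early destruction of the pool.
static Pointer<PconnPool_> TheGlobalPool(new PconnPool_);
```

A global held this way keeps the count at one for the process lifetime;
only the global's own destruction at exit can free the pool.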

NP: at Squid exit time the FD table sockets will have already been
forcibly closed by the broken shutdown process, invalidating the pool
entries without freeing their memory or setting their FD values to -1.
Any use of the pool that late is already seriously broken.

>
>>> * Added FwdState::closeServerConnection() to encapsulate server
>>> connection closing code, including the new noteUses() maintenance. Also
>>> updated FwdState::serverClosed() to do the same maintenance.
>>>
>>> * Encapsulated commonly reused ToS and NfMark code into
>>> GetMarkingsToServer(). May need more work in FwdState where some
>>> similar-but-different code remains.
>>
>> Please merge that as a separate project.
>
> Do you want me to commit the above two changes now? Or do you need me to
> post them as a stand-alone patch for a separate review?

I don't think another review is necessary. The patch segments related to
them look okay (sans that noteUses() bit you highlighted as new for the
pools change, of course).

Amos
Received on Sat Feb 08 2014 - 11:34:42 MST

This archive was generated by hypermail 2.2.0 : Sun Feb 09 2014 - 12:00:10 MST