Re: [RFC PATCH] Re: PAC serving

From: Alex Rousskov <rousskov_at_measurement-factory.com>
Date: Fri, 16 Jul 2010 10:21:28 -0600

On 07/15/2010 08:58 PM, Amos Jeffries wrote:
> Alex Rousskov wrote:
>> On 07/14/2010 08:02 AM, Amos Jeffries wrote:
>>> Since no description beats the real code here is the current patch I
>>> have.
>>>
>>> If you guys can't find any other major issues with this I'll go ahead
>>> and make it build and test it out.
>>
>> I have some comments, but they should not stop you from running and
>> testing the code, of course:
>
> Taking these into account on the cleanup-comm usage of AsyncJob has got
> me around a lot of weird problems.
> I think this list or a very similar should be added to AsyncJob.h class
> documentation. What is there now leaves people like myself using the API
> wrong.

Agreed. I have created src/base/AsyncJobs.dox as a starting point.

>> * Never call swanSong() directly. If you are outside an AsyncCall
>> handler, and want to kill the job, then call deleteThis(). If you are
>> inside an AsyncCall handler, you have several options for job
>> termination:
>>
>> - Call mustStop(reason) for errors that require further processing in
>> the same method(s) chain, below/after the mustStop() call. Efficient.
>>
>> - Throw (via Must or directly) for errors that do not require further
>> processing in the same method(s) chain, below/after the mustStop() call.
>> Inefficient but simple and allows exiting from deeply nested method
>> calls.
>>
>> - Otherwise, just finish the call. Your doneAll() should return true
>> and the job will terminate successfully.
>>
>> swanSong() will be called automatically in all of these cases when the
>> job is being terminated. It is a general cleanup method, like a
>> destructor. The only difference is that a destructor must not throw.
>>
>>
>> * Do not assume swanSong() is called in some perfectly nice job state.
>> The job code or the code it calls may throw at any time after start()
>> was called. The entry may be gone, the Abort may have been called, the
>> fd may have been closed, etc.
>>
>>
>> * Never call deleteThis() in contexts other than those documented above.
>> It is a hack for the old-style code. You can avoid it and other
>> old-style special precautions altogether if you convert sync calls into
>> async ones. This is especially easy for old-style calls that have only
>> one parameter ("data") or two simple parameters.
>>
>>
>> * In swanSong, always call swanSong() of the parent, after you are done
>> cleaning up your job. It does not matter whether the [current] parent
>> swanSong() does nothing.
>>
>>
>> * If a job does Comm I/O, it probably needs a Comm closing handler.
>>
>>
>> * To start a job, use AsyncStart. Do not call start() directly. There
>> are examples in the code.
>
> Is that true in ALL cases both sync and async?

I am not sure I understand the question. The job lifecycle is:

Construction: Setup job parameters in the constructor. Do as little as
possible because any problems may be difficult for the creator to handle
in a sync mode. Construction is done in a sync mode with the caller --
the caller is blocked.

Life: Call AsyncStart to "start" the job. This non-blocking call makes
the job "async". After AsyncStart, the job runs and dies on its own and
it is a bad idea to call the job synchronously. Use AsyncCalls instead.
This applies to Store callback wrappers and other old APIs with sync
callbacks. The wrapper should create an async call to the job.

Death: If the job quits after AsyncStart, then swan song and destruction
is called automatically from JobCall code when the job state changes
(see mustStop(), throw, and doneAll() discussion above).

> In this particular PAC case, we want the code of start() to be run first
> as sync to grab small files as fast as possible only ending with a
> folowup async loop if the first does not complete in time.

In general, "we want sync because it is faster than async" leads to the
wrong design and bugs. Follow the lifecycle model above (see a specific
example of that model below).

Specifically, start() must not be called directly/synchronously. This
rule is a part of the AsyncJob API so you do not have a choice here.
Name your configuration/preparation method differently if you want to
call it before AsyncStart (again, see below).

> The first and subsequent actual Call is scheduled at the end of start.

AsyncJob starts (i.e., becomes async) with AsyncStart(). When is the
first related async call is scheduled is not relevant; it could be
before, during, or after start.

If you really want to do something before starting the job, do it in the
constructor or some custom method that the job creator will call
_before_ calling AsyncStart:

    std::auto_ptr<MyJob> job(new MyJob(...)); // sync/blocking
    job->prepare(...); // sync/blocking
    job->prepareSomethingElse(...); // sync/blocking
    AsyncStart(job.release()); // non-blocking

If you do not need complex preparations, it is better to do this instead:

    AsyncStart(new MyJob(...));

Keep in mind that you have no async debugging, cleanup, and protections
until you call AsyncStart with a job pointer.

>> * Starting a job from a Comm handler is fine, but that handler should
>> not be a part of the job you are starting. Start the job in the caller
>> and let the job register its needs with Comm...
>
> This still general or do you see a particular violation I've done?
> Noting that I call start() sync as above and let the start() routine do
> all the registering bits.

Your specific code violates virtually all the general rules I mentioned
above or I would not have mentioned them. Specifically, job's comm
handler should only be called after the job became async. The call may
be _scheduled_ at any time as long as you can guarantee that the call
will not fire before AsyncStart.

HTH,

Alex.
Received on Fri Jul 16 2010 - 16:22:33 MDT

This archive was generated by hypermail 2.2.0 : Fri Jul 16 2010 - 12:00:08 MDT