|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 6/9] libxl: Asynchronous/long-running operation infrastructure
On Tue, 2012-01-24 at 17:27 +0000, Ian Jackson wrote:
> Thanks for the thorough review.
>
> Ian Campbell writes ("Re: [Xen-devel] [PATCH 6/9] libxl:
> Asynchronous/long-running operation infrastructure"):
> > You've done device removal, do you have a list of other things which
> > should use this? (perhaps with an associated list of people's names...)
>
> No, I don't have such a list, but offhand:
> domain creation (bootloader, etc.)
> qmp
> device attach
> domain destruction
> bootloader
> vncviewer exec (or vncviewer parameter fetching - api should change)
> (these may overlap).
Thanks.
> > Is there a possibility that libxl might decide that the operation isn't
> > actually going to take all the long and do things synchronously,
> > returning normal success (e.g. 0)? Is that the reason for the separate
> > return code for this "I did what you asked me" case?
>
> No, even if the operation completes quickly, the same exit path is
> taken. So if you ask for a callback or an event, you get that even if
> the callback or event happened before your initiating function
> returns. As I say:
>
> * Note that it is possible for an asynchronous operation which is to
> * result in a callback to complete during its initiating function
> * call. In this case the initating function will return
> * ERROR_ASYNC_INPROGRESS, even though by the time it returns the
> * operation is complete and the callback has already happened.
Thanks, I think I simply hadn't got to that comment when I wrote the
question.
If this is a direct cut-n-paste then presumably there is a patch
somewhere where "initiating" is spelled "initating" as above. Hah, I've
just spotted where I pointed it out last time and you corrected it
below... At least I'm consistent.
> If ao_how is non-NULL then these functions cannot return 0.
> If it is NULL they cannot return ASYNC_INPROGRESS.
>
> I chose to use a new exit status because it seemed safer but that's a
> matter of taste and if you prefer I could return 0 for that case.
I'm undecided (plus it seems a bit like bikeshedding). I certainly
prefer either 0 or {LIBXL_}ASYNC_IN_PROGRESS to ERROR_ASYNC_IN_PROGRESS
though.
> > Can we drop the ERROR_ prefix? I know that's inconsistent with the other
> > return codes but those actually are errors.
>
> I guess we could but isn't this going to become a proper IDL enum at
> some point ?
At which point it would become LIBXL_ERROR_{FOOS} and
LIBXL_ASYNC_IN_PROGRESS?
[...]
> > > + * "disaster", except that libxl calls ao_how->callback instead of
> > > + * libxl_event_hooks.event_occurs.
> > > + *
> > > + * If ao_how->callback==NULL, a libxl_event will be generated which
> > > + * can be obtained from libxl_event_wait or libxl_event_check.
> >
> > Or be delivered via event_occurs?
>
> Yes, in principle, although that would be a silly thing to ask for.
> Why would you want your ao completions delivered via some central
> callback function just so that you could split them up again ?
True.
> > > + * call. In this case the initating function will return
> >
> > initiating
>
> Fixed.
>
>
> > > +typedef struct {
> > > + void (*callback)(libxl_ctx *ctx, int rc, void *for_callback);
> > > + union {
> > > + libxl_ev_user for_event; /* used if callback==NULL */
> > > + void *for_callback; /* passed to callback */
> >
> > Why void * for one bit of "closure" but an explicit uint64_t for the
> > other. I nearly commented on the use of uint64_t previously -- void *,
> > or perhaps (u)intptr_t is more normal.
>
> The context value in an event needs to be marshallable to a foreign
> language or a foreign process. So it can't be a pointer. Or at
> least, it has to be a type which can contain integers as well as
> pointers, and an integer type is better for that.
Fair enough.
>
> The context value for a callback function can be a void* because you
> don't ever need to "marshal" the "callback", or if you do you can wrap
> up the context appropriately.
>
> > > + * Completion after initiator return (asynch. only):
> ...
> > > + * * initiator calls ao_complete: |
> > > + * - observes event not net done, |
> > > + * - returns to caller |
> > > + * |
> > > + * - ao completes on some thread
> > > + * - generate the event or call the callback
> > > + * - destroy the ao
> >
> > Where does ao_inprogress fit into these diagrams?
>
> There's a mistake in the diagrams: where it says on the left
> "initiator calls ao_complete" it should read "... ao_inprogress", and
> likewise for "ao_complete takes the lock". I will fix this.
Thanks. While rereading with that substitution (and finding that it made
sense) I noticed:
+ * * initiator calls ao_complete: |
+ * - observes event not net done, |
You want s/net/yet/.
> > > +void libxl__ao_complete(libxl__egc *egc, libxl__ao *ao, int rc) {
> > > + assert(ao->magic == LIBXL__AO_MAGIC);
> > > + assert(!ao->complete);
> > > + ao->complete = 1;
> > > + ao->rc = rc;
> > > +
> > > + if (ao->poller) {
> > > + assert(ao->in_initiator);
> > > + libxl__poller_wakeup(egc, ao->poller);
> > > + } else if (ao->how.callback) {
> > > + LIBXL_TAILQ_INSERT_TAIL(&egc->aos_for_callback, ao,
> > > entry_for_callback);
> > > + } else {
> > > + libxl_event *ev;
> > > + ev = NEW_EVENT(egc, OPERATION_COMPLETE, ao->domid);
> > > + if (ev) {
> > > + ev->for_user = ao->how.u.for_event;
> > > + ev->u.operation_complete.rc = ao->rc;
> > > + libxl__event_occurred(egc, ev);
> > > + }
> > > + ao->notified = 1;
> > > + }
> > > + if (!ao->in_initiator && ao->notified)
> > > + libxl__ao__destroy(libxl__gc_owner(&egc->gc), ao);
> >
> > You added a helper for this libxl__gc_owner(&egc..) construct.
>
> You mean EGC_GC and CTX. I don't think that's a good idea here
> because it obscures exactly what's going on. In particular, there are
> two gcs here - the ao's and the egc's - and one of them may be about
> to evaporate.
OK.
> > > + rc = eventloop_iteration(&egc,ao->poller);
> > > + if (rc) {
> > > + /* Oh dear, this is quite unfortunate. */
> > > + LIBXL__LOG(CTX, LIBXL__LOG_ERROR, "Error waiting for"
> > > + " event during long-running operation
> > > (rc=%d)", rc);
> > > + sleep(1);
> > > + /* It's either this or return ERROR_I_DONT_KNOW_WHETHER
> > > + * _THE_THING_YOU_ASKED_FOR_WILL_BE_DONE_LATER_WHEN
> > > + * _YOU_DIDNT_EXPECT_IT, since we don't have any kind of
> > > + * cancellation ability. */
> >
> > Does this constitute a "disaster" (in the special hook sense)?
>
> No, disaster just lets us say that some events may be lost and an ao
> completion might not be an event. disaster doesn't let us randomly
> store up ongoing activity and have it happen when not expected.
> For example, a caller asking a synchronous operation does not expect
> to get an error code and then have the operation continue in the
> background anyway.
OK.
> > > + * Usually callback function should use GET_CONTAINING_STRUCT
> >
> > Now called CONTAINER_OF
>
> Fixed. Sometimes I wish the compiler could look into comments and
> spot when I've done this kind of thing.
Or read the comments and DWIM so we just need to write the comments ;-)
> > > + * to obtain its own structure, containing a pointer to the ao,
> > > + * and then use the gc from that ao.
> > > + */
> > > +
> > > +#define AO_CREATE(ctx, domid, ao_how) \
> > > + libxl__ao *ao = libxl__ao_create(ctx, domid, ao_how); \
> > > + if (!ao) return ERROR_NOMEM; \
> > > + AO_GC; \
> > > + CTX_LOCK;
> >
> > Where does the unlock which balances this come from? The only unlock I
> > see in this patch is the temporary drop in libxl__ao_inprogress which is
> > matched by another lock.
>
> The AO_INPROGRESS macro is supposed to unlock it, but locks it again
> by mistake. Well spotted.
>
> > > +#define AO_INPROGRESS do{ \
> > > + libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \
> > > + int ao__rc = libxl__ao_inprogress(ao); \
> > > + libxl__ctx_lock(ao__ctx); /* gc is now invalid */ \
> >
> > Is this supposed to be unlock answering my question above? Likewise in
> > ABORT?
>
> Yes. Indeed the comment above agrees:
>
> * - If initiation is successful, the initiating function needs
> * to run libxl__ao_inprogress right before unlocking and
> * returning, and return whatever it returns (AO_INPROGRESS macro).
> *
> * - If the initiation is unsuccessful, the initiating function must
> * call libxl__ao_abort before unlocking and returning whatever
> * error code is appropriate (AO_ABORT macro).
Right.
I think this highlights my concern about some code paths not being run
by the in-xl users...
> > > + return ao__rc; \
> > > + }while(0)
> >
> > Can we arrange for AO_INPROGRESS and AO_ABORT to return the rc? So it
> > would become
> > return AO_INPROGRESS;
>
> That would be possible. I wasn't sure whether to do it like that.
> Note that AO_CREATE already might return; doing it the way I have it
> now seems more symmetrical.
>
> But perhaps it would make things clearer to have the return outside
> the macro.
I thought there was a general preference for that sort of thing, but I
suppose it depends on the required macro contortions to make it happen.
Ian.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |