
Re: [Xen-devel] [PATCH] Safely finish closing protocol when guest fails in blkfront

  • To: "Glauber de Oliveira Costa" <gcosta@xxxxxxxxxx>
  • From: "Glauber de Oliveira Costa" <glommer@xxxxxxxxx>
  • Date: Tue, 5 Dec 2006 18:25:24 -0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir@xxxxxxxxxxxxx>
  • Delivery-date: Tue, 05 Dec 2006 12:25:27 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

> Assignment and unassignment of physical resources is really a tools issue.
> Tools should really be integrated with device-hotplug success/failure anyway
> -- for example, it is likely the initiator would like confirmation of
> success/failure in most cases.
Agreed. But what if, after proper initiation, the frontend finds an error
and starts the Closing protocol? What will happen is that the test

        if (xenbus_dev_is_online(dev))

will cause the device not to be unregistered. At this point, it does not
see frontend changes. (Putting the backend in Closing leads to the frontend
going to Closing, then Closed, but the backend never sees the frontend closing, never going to

Given that, what can the tools do? At the current point, this is what leads
me to believe that arbitrary frontend-failure cases should be handled in the 


Let me just try to clarify this (after all, I just realised that even
if this is the right path, there's a piece missing).

Right now, I think that handling failures in the frontend code is the
correct choice, because failures can happen at pretty much any time.
According to the diagram at
http://wiki.xensource.com/xenwiki/XenSplitDrivers, a closedown
initiated by the frontend should end in the device being unregistered,
and I don't think tools will _ever_ be able to do it.  The best they
can do is wait to see if the device is properly connected, but what if
the error happens after it?  If this is indeed the real scenario, the
missing piece would be to delete the error message, to avoid
unregistering devices that should not be unregistered.
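
To make the behaviour described above concrete, here is a toy model in
plain C. It is only a sketch, not the kernel code: the toy_* names, the
struct and the reduced state set are hypothetical stand-ins, and the only
thing taken from the discussion is the "do not unregister while online"
test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy subset of the xenbus states discussed in this thread. */
enum toy_state { TOY_INITIALISING, TOY_CONNECTED, TOY_CLOSING, TOY_CLOSED };

/* Hypothetical stand-in for a backend device; not the kernel's struct. */
struct toy_backend {
    bool online;            /* models the "online" flag */
    bool registered;        /* true until the device is unregistered */
    enum toy_state state;   /* backend's own state */
};

/* Models the backend's reaction to a frontend state change. */
void toy_frontend_changed(struct toy_backend *be, enum toy_state frontend)
{
    assert(be != NULL);

    switch (frontend) {
    case TOY_CLOSING:
        /* Backend follows the frontend into Closing. */
        be->state = TOY_CLOSING;
        break;
    case TOY_CLOSED:
        be->state = TOY_CLOSED;
        /* The test under discussion: an online device stays registered. */
        if (be->online)
            break;
        be->registered = false;
        break;
    default:
        break;
    }
}
```

With online set, a frontend that walks through Closing and Closed leaves
registered true in this model, which is the stuck case described above;
with online clear, the same sequence unregisters the device.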

If you can assure that, now and always, errors on the frontend side will
_always_ be constrained to the pre-Connect steps, then my proposal is
to set the online flag just after the device is connected. It would
ensure that the device is properly unregistered, and tools would have a
way to know if the process was successful (online = 1). Any comments
on that?
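
As a rough illustration of the proposal, the sketch below models a device
whose online flag starts at 0 and is only set once the frontend reports
Connected. Everything here (the sketch_* names, the struct, the reduced
state set) is hypothetical and exists only for this example; it is not the
existing xenbus code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical state set for the illustration. */
enum sketch_state { SKETCH_INITIALISING, SKETCH_CONNECTED, SKETCH_CLOSED };

/* Hypothetical device record; not the kernel's struct. */
struct sketch_dev {
    bool online;      /* proposal: set only after Connected */
    bool registered;  /* true until the device is unregistered */
};

/* A freshly created device is registered but not yet online. */
void sketch_dev_init(struct sketch_dev *dev)
{
    assert(dev != NULL);
    dev->online = false;
    dev->registered = true;
}

/* Models the backend's reaction to a frontend state change. */
void sketch_frontend_changed(struct sketch_dev *dev, enum sketch_state frontend)
{
    assert(dev != NULL);

    switch (frontend) {
    case SKETCH_CONNECTED:
        /* The proposal: mark the device online only at this point. */
        dev->online = true;
        break;
    case SKETCH_CLOSED:
        /* A pre-Connect failure arrives here with online still 0. */
        if (!dev->online)
            dev->registered = false;
        break;
    default:
        break;
    }
}

/* What tools could check to learn whether setup succeeded. */
bool sketch_setup_succeeded(const struct sketch_dev *dev)
{
    assert(dev != NULL);
    return dev->online;
}
```

In this model a frontend that fails before Connect gets its device
unregistered, and online = 1 tells the tools that setup succeeded. A
failure after Connect still leaves the device registered, so the sketch
only helps under the pre-Connect assumption stated above.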

I admit that I don't understand exactly the purpose of online. At
first I thought it was save & restore related, but I'm currently able
to save & restore with online always being 0. Can you shed some light
on it?

As soon as you answer those, I'll proceed with the right approach to fix this.

Glauber de Oliveira Costa.
"Free as in Freedom"
