xen-devel

Re: [Xen-devel] pci device hotplug, race accessing xenstore

To: Phung Te Ha <phungte@xxxxxxxxx>
Subject: Re: [Xen-devel] pci device hotplug, race accessing xenstore
From: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
Date: Wed, 14 Oct 2009 14:34:35 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "horms@xxxxxxxxxxxx" <horms@xxxxxxxxxxxx>
Delivery-date: Wed, 14 Oct 2009 06:34:34 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <f6cf36180910131213x21218f2am7ed55c0a8a381312@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <f6cf36180910131213x21218f2am7ed55c0a8a381312@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Alpine 2.00 (DEB 1167 2008-08-23)
On Tue, 13 Oct 2009, Phung Te Ha wrote:
> Hello Simon,
> 
> I took the source as per your message:
> http://marc.info/?l=xen-devel&m=124748015304566&w=4
> 
> compiled and run it on an Intel-DQ35JO, Fedora-10.
> 
> When I try to pass a PCI device through at boot time via the configuration
> file, there's a race between xend and qemu accessing xenstore.
> 
> Xend waits in signalDeviceModel(...) for qemu to declare 'running', then
> writes the devices to be passed through to the dm-command pipe.
> 
> On the qemu side, it sets a watch on /local/domain/0/device-model/2/command
> by calling xs_watch(...) and expects the dm-command to arrive there.
> xs_watch(...) causes xenstored to run do_watch(...), which at the end runs
> add_event(...) with the following comment:
>           /* We fire once up front: simplifies clients and restart. */
> 
> 
> The problem shows up when xend is faster: it detects qemu's 'running' state,
> calls xstransact.Store and writes to the command pipe before qemu can call
> main_loop_wait(...) and run one empty loop on the command pipe. This write
> causes xenstored to run a fires_watch, and thus another add_event(...).
> The problem shows up in the qemu log as an extra dm-command, which uses the
> wrong parameter and fails to initialize, for instance:
> 
> ...
> xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> read_message: msg type reply pci-ins
> dm-command: hot insert pass-through pci dev
> read_message: msg type reply 0000:00:1b.0@100
> register_real_device: Assigning real physical device 00:1b.0 ...
> pt_register_regions: IO region registered (size=0x00004000 base_addr=0x90420004)
> pt_msi_setup: msi mapped with pirq ff
> register_real_device: Real physical device 00:1b.0 registered successfuly!
> IRQ type = MSI-INTx
> read_message: msg type reply OK
> read_message: msg type reply OK
> xs_read_watch: msg type 15 body /local/domain/0/device-model/3/command
> read_message: msg type reply pci-ins
> dm-command: hot insert pass-through pci dev
> read_message: msg type reply 0x20
> hot add pci devfn -1 exceed.
> read_message: msg type reply OK
> ...
> 
> On the xend side:
> 
> ...
>     (bdf_str, vdevfn))
> VmError: Cannot pass-through PCI function '0000:00:1b.0@100'. Device model reported an error: no free hotplug devfn
> [2009-10-13 10:45:10 4174] ERROR (XendDomainInfo:471) VM start failed
> Traceback (most recent call last):
> ...
> 
> 

I think we should take this chance to make the pci-insert protocol more
reliable.
In particular we are missing the following things:

- qemu shouldn't accept any dm-command unless it is in state "running";

- xend should remove the command node on xenstore after reading
state "pci-inserted" and before writing state "running" again.

This way when the second xenstore watch fires the pci-ins command is
never executed for a second time because either qemu is not in the right
state (pci-inserted instead of running) or the command node doesn't
contain any data (it has been removed by xend).
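
To make the two rules above concrete, here is a minimal toy model of the
handshake. It is only a sketch: a plain dict stands in for xenstore, and
every name in it (ToyDeviceModel, toy_xend_send, toy_xend_finish, the
"state"/"command"/"parameter" keys) is hypothetical and not taken from the
xend or qemu sources.

```python
# Toy model of the hardened pci-ins handshake. A dict stands in for
# xenstore; all class, function and key names are hypothetical.

class ToyDeviceModel:
    def __init__(self, store):
        self.store = store
        self.store["state"] = "running"
        self.inserted = []

    def on_command_watch(self):
        """Called every time the watch on the command node fires."""
        # Rule 1: accept a dm-command only while in state "running".
        if self.store.get("state") != "running":
            return "ignored: not running"
        command = self.store.get("command")
        # A removed (or empty) command node means there is nothing to do.
        if not command:
            return "ignored: no command"
        if command == "pci-ins":
            self.inserted.append(self.store.get("parameter"))
            self.store["state"] = "pci-inserted"
            return "inserted"
        return "ignored: unknown command"


def toy_xend_send(store, bdf):
    """Write the parameter first, then the command that triggers the watch."""
    store["parameter"] = bdf
    store["command"] = "pci-ins"


def toy_xend_finish(store):
    """Rule 2: after reading "pci-inserted", remove the command node
    before writing state "running" again."""
    if store.get("state") != "pci-inserted":
        raise RuntimeError("device model did not report pci-inserted")
    del store["command"]
    store["state"] = "running"
```

With this model, a duplicate watch event is harmless at either point in the
sequence: if it arrives before xend cleans up, the state is "pci-inserted"
and the command is ignored; if it arrives afterwards, the command node is
gone.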

Another problem is that nothing else can happen while xend waits for the
device model to reach state "running". This also prevents pci coldplug
from working with stubdoms.
Is it possible to run signalDeviceModel in a new xend Thread?
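
As a rough illustration of the idea only (whether xend's locking allows it
is exactly the open question), the general pattern would look something
like the sketch below. The names toy_signal_device_model and run_demo are
hypothetical, and a threading.Event stands in for the device model
reaching state "running".

```python
import threading


def toy_signal_device_model(dm_running, log):
    # Hypothetical stand-in for signalDeviceModel: block until the
    # device model is running, then send the pci-ins command.
    dm_running.wait()
    log.append("worker: pci-ins sent")


def run_demo():
    log = []
    dm_running = threading.Event()
    worker = threading.Thread(
        target=toy_signal_device_model, args=(dm_running, log)
    )
    worker.start()
    # The main thread is not blocked by the wait and can carry on.
    log.append("main: continued other work")
    # Later, the device model reaches state "running".
    dm_running.set()
    worker.join(timeout=5)
    return log
```

Only the worker thread blocks on the wait, so the rest of domain creation
could proceed in the meantime.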