


To: Jan Beulich <JBeulich@xxxxxxxxxx>, "Han, Weidong" <weidong.han@xxxxxxxxx>
Subject: [Xen-devel] RE: VT-d device assignment may fail (regression from Xen c/s 19805:2f1fa2215e60)
From: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Date: Mon, 25 Oct 2010 15:05:10 +0800
Accept-language: en-US
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Mon, 25 Oct 2010 00:07:16 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CC17FE5020000780001E91F@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4CC17FE5020000780001E91F@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Actx0cVsgiWGq/+9SEOrY0XhWXBdrACPAImQ
Thread-topic: VT-d device assignment may fail (regression from Xen c/s 19805:2f1fa2215e60)
>-----Original Message-----
>From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
>Sent: Friday, October 22, 2010 6:13 PM
>To: Han, Weidong
>Cc: Jiang, Yunhong; xen-devel@xxxxxxxxxxxxxxxxxxx
>Subject: VT-d device assignment may fail (regression from Xen c/s
>in this patch you removed a bus/devfn check around an invocation of
>domain_context_mapping_one() avoiding the attempt to call the
>function again if it was already called for this very device. This
>removal, however, conflicts with the context_present() check at the
>top of domain_context_mapping_one() - in particular, pdev->domain
>isn't set to the new owner yet, and hence the function fails.

Weidong is traveling and may comment further when he is back tomorrow.

Which removed bus/devfn check do you mean? I could not find it in the patch.

>The question now is whether some similar check should be restored,
>or whether pdev->domain should get updated earlier. This may
>need some additional consideration, since from looking at the code
>I would say that reassign_device_ownership() needs some error
>handling improvements too: Currently, partial failure isn't being
>handled properly at all (respective devices are left in a half way
>state - no longer properly assigned to Dom0, but also not yet
>assigned to DomU).
>I also wonder what guarantee there is for a device to exist at
><secbus>:00.0 (since if there is none, the same context_present()
>check could at least theoretically again lead to problems as it
>checks for pci_get_pdev() returning non-NULL).

Hmm, function 0 should always exist, but I did not find anything in the spec 
saying that device 0 must always be populated.

>Finally, isn't it inconsistent that only the original device gets its
>->domain set to the new owner and gets moved to that domain's
>device list, but neither the upstream bridge nor that bridge's
><secbus>:00.0 get handled the same way? What if below that

As I understand it, the bridge and <secbus>:00.0 only matter for conventional 
PCI devices, because all PCI devices behind the same pcie2pci bridge should be 
assigned to the same domain. So once a device is assigned to a domain, the 
bridge and <secbus>:00.0 should belong to that same domain, and it is not 
really necessary to keep that information separately for the bridge and 
<secbus>:00.0.

But the current implementation seems to miss something. Weidong, please 
correct me if I'm wrong.
1) Currently the Xen hypervisor does not guarantee "atomic" assignment of a 
device. I assume the tools take care of this today. But if the tools do not 
guard against it, it may cause problems in the hypervisor. For example, if the 
tools assign PCI device A to domain A and then try to assign PCI device B (on 
the same bus as device A) to domain B, the second assignment (to domain B) 
will fail because the assignment of the PCI bridge fails, leaving device B in 
a half-way state, as Jan stated above.

2) If a device is hot-added, it is owned by domain0 by default, which may 
cause issues.
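
To make point 1) concrete, here is a minimal sketch of a "validate first, then 
commit" assignment. It is only a toy model, not the actual Xen code: 
struct toy_pdev, bus_assignable_to() and toy_assign() are hypothetical names, 
and the "bus" field simply stands for the secondary bus behind one pcie2pci 
bridge. The idea is that an assignment which would split a bus between two 
domains is refused before any state is changed, so no device is left half-way.

```c
#include <stddef.h>
#include <stdbool.h>

#define DOM0 0

/* Hypothetical toy model of a PCI device; this is not Xen's struct pci_dev. */
struct toy_pdev {
    unsigned int bus;    /* secondary bus behind the pcie2pci bridge */
    unsigned int devfn;  /* device/function number on that bus */
    int owner;           /* id of the domain that currently owns the device */
};

/*
 * Return true if every device on 'bus' is owned by dom0 or by 'domid',
 * i.e. giving one of them to 'domid' cannot split the bus between guests.
 */
static bool bus_assignable_to(const struct toy_pdev *devs, size_t n,
                              unsigned int bus, int domid)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].bus == bus &&
            devs[i].owner != DOM0 && devs[i].owner != domid)
            return false;
    }
    return true;
}

/*
 * Validate first, then commit. Returns 0 on success; returns -1 and leaves
 * every device untouched if the assignment would conflict.
 */
static int toy_assign(struct toy_pdev *devs, size_t n, size_t idx, int domid)
{
    if (idx >= n || !bus_assignable_to(devs, n, devs[idx].bus, domid))
        return -1;

    devs[idx].owner = domid;
    return 0;
}
```

With two devices on the same bus, assigning the first to domain 1 succeeds, 
and a later attempt to assign the second to domain 2 is rejected while the 
second device stays with dom0. Whether such a check belongs in the hypervisor 
or in the tools is the open question here.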

>bridge a device gets hot-added? Wouldn't that device
>incorrectly end up in Dom0, with no failures because the bridge
>still appears to be owned by Dom0 while it really isn't?

Yes, that is a problem. The device should be hidden from dom0.


