xen-devel

To: "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] IOMMU: improve the FLR logic and move it from hypervisor to Control Panel?
From: "Cui, Dexuan" <dexuan.cui@xxxxxxxxx>
Date: Thu, 19 Jun 2008 13:13:38 +0800

Currently, when creating or destroying an HVM guest with assigned devices, we
perform FLR for the devices in the hypervisor:
xen/drivers/passthrough/vtd/utils.c: pdev_flr().
The logic is:
a) if the device is a PCIe endpoint and it supports FLR, use that;
b) in all other cases, use a D3hot/D0 transition as a substitute for FLR.
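
For illustration, the D3hot/D0 fallback in case b) can be sketched roughly as
below. This is not the code in pdev_flr(); FakeConfigSpace and the function
names are hypothetical stand-ins, and only the register offsets (capability
list, Power Management capability, PMCSR power-state field) come from the PCI
specifications.

```python
import time

PCI_STATUS = 0x06            # status register (16 bit)
PCI_STATUS_CAP_LIST = 0x10   # bit 4: capability list is present
PCI_CAPABILITY_LIST = 0x34   # pointer to the first capability
PCI_CAP_ID_PM = 0x01         # Power Management capability ID
PCI_PM_CTRL = 4              # PMCSR offset inside the PM capability
PCI_PM_CTRL_STATE_MASK = 0x3 # PowerState field, bits 1:0
PCI_D0, PCI_D3HOT = 0, 3


class FakeConfigSpace:
    """Hypothetical stand-in for a function's 256-byte config space."""

    def __init__(self):
        self.regs = bytearray(256)

    def read8(self, off):
        return self.regs[off]

    def read16(self, off):
        return int.from_bytes(self.regs[off:off + 2], "little")

    def write16(self, off, val):
        self.regs[off:off + 2] = val.to_bytes(2, "little")


def find_capability(cfg, cap_id):
    """Walk the capability list; return the capability offset, or 0."""
    if not cfg.read16(PCI_STATUS) & PCI_STATUS_CAP_LIST:
        return 0
    pos = cfg.read8(PCI_CAPABILITY_LIST) & 0xFC
    for _ in range(48):          # bound the walk in case the list loops
        if pos < 0x40:
            break
        if cfg.read8(pos) == cap_id:
            return pos
        pos = cfg.read8(pos + 1) & 0xFC
    return 0


def dstate_reset(cfg, settle=0.01):
    """Cycle a function through D3hot and back to D0.

    Returns False when the function has no PM capability.
    """
    pm = find_capability(cfg, PCI_CAP_ID_PM)
    if not pm:
        return False
    for state in (PCI_D3HOT, PCI_D0):
        ctrl = cfg.read16(pm + PCI_PM_CTRL) & ~PCI_PM_CTRL_STATE_MASK
        cfg.write16(pm + PCI_PM_CTRL, ctrl | state)
        time.sleep(settle)       # 10 ms settle time around D3hot
    return True
```

On real hardware the accessor would have to read and write the device's actual
configuration space; the in-memory version here only exercises the control flow.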

There are some issues:

1) It looks like few PCIe devices support FLR at present, so almost all
PCIe devices and all PCI devices currently use the D3hot/D0 method.
However, a D-state transition is not guaranteed to properly clear the
device state;

2) In case a), the current implementation is actually buggy:
Transaction_Pending_bit==0 does not mean that FLR has completed. Checking
it is only a way to ensure there is no pending transaction when we are
about to issue FLR (so we can be sure there is no data corruption).
And according to the PCIe spec, after issuing FLR we should wait at least
100ms, but "mdelay(100)" is not acceptable in Xen...
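
The ordering described in 2) could look roughly like the sketch below when
done from user space. It assumes the offset of the PCI Express capability
(exp) has already been located; FakeConfigSpace, pcie_flr() and the retry
count are hypothetical, and only the register offsets and bit positions are
taken from the PCIe specification.

```python
import time

PCI_EXP_DEVCAP = 0x04            # Device Capabilities (32 bit)
PCI_EXP_DEVCAP_FLR = 1 << 28     # Function Level Reset Capability
PCI_EXP_DEVCTL = 0x08            # Device Control (16 bit)
PCI_EXP_DEVCTL_FLR = 1 << 15     # Initiate Function Level Reset
PCI_EXP_DEVSTA = 0x0A            # Device Status (16 bit)
PCI_EXP_DEVSTA_TRPND = 1 << 5    # Transactions Pending


class FakeConfigSpace:
    """Hypothetical stand-in for a function's config space."""

    def __init__(self):
        self.regs = bytearray(256)

    def read(self, off, size):
        return int.from_bytes(self.regs[off:off + size], "little")

    def write(self, off, size, val):
        self.regs[off:off + size] = val.to_bytes(size, "little")


def pcie_flr(cfg, exp, sleep=time.sleep):
    """Issue PCIe FLR; return False if the function does not support it."""
    if not cfg.read(exp + PCI_EXP_DEVCAP, 4) & PCI_EXP_DEVCAP_FLR:
        return False

    # Before FLR: give outstanding transactions a chance to drain.
    # A clear Transactions Pending bit says nothing about FLR completion.
    for _ in range(4):
        if not cfg.read(exp + PCI_EXP_DEVSTA, 2) & PCI_EXP_DEVSTA_TRPND:
            break
        sleep(0.1)
    # Assumption of this sketch: proceed even if the bit never clears.

    ctl = cfg.read(exp + PCI_EXP_DEVCTL, 2)
    cfg.write(exp + PCI_EXP_DEVCTL, 2, ctl | PCI_EXP_DEVCTL_FLR)

    # After FLR: wait 100 ms before touching the function again.
    sleep(0.1)
    return True
```

The sleep parameter is injectable only so the sequence can be exercised without
real delays; in a user-space tool the default time.sleep is all that is needed,
which is exactly what is hard to do inside the hypervisor.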

To resolve the issues, I propose to change the FLR logic to:

1) If the device is a PCIe endpoint and supports PCIe FLR, use that;
2) Else, if the device is a PCIe endpoint and all functions on the device
are assigned to the same guest, use the immediate parent bus's
"Secondary Bus Reset" to reset all functions of the device (in this case
we require that all functions of the device be assigned to the same
guest);
3) Else, if the device is a PCI endpoint on a host bus (e.g. an
integrated device) and it supports PCI "Advanced Capabilities", use
that for FLR;
4) Else, if the device is a vendor-integrated PCI device with a "known"
vendor/device ID, use the vendor-defined method of issuing
FLR. For instance, for VendorID=0x8086, we can use the method
defined in the Intel ICH9 Datasheet to perform FLR;
5) Else, use "Secondary Bus Reset" (we must ensure that all the PCI devices
behind the bridge are assigned to the same guest).
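
The five-step fallback order above can be written down as a small decision
function. This is only a sketch of the selection logic: the PciDev fields, the
returned method names and the known_vendor_flr table are hypothetical, and the
table would have to be filled from vendor datasheets.

```python
from dataclasses import dataclass


@dataclass
class PciDev:
    """Hypothetical summary of what the Control Panel knows about a device."""
    is_pcie: bool
    has_pcie_flr: bool = False
    has_pci_af_flr: bool = False          # PCI "Advanced Capabilities" FLR
    on_host_bus: bool = False
    vendor_id: int = 0
    device_id: int = 0
    all_functions_same_guest: bool = False
    all_behind_bridge_same_guest: bool = False


def choose_reset_method(dev, known_vendor_flr=frozenset()):
    """Return the name of the reset method to use, or None if none is safe."""
    # 1) PCIe endpoint with FLR support
    if dev.is_pcie and dev.has_pcie_flr:
        return "pcie_flr"
    # 2) PCIe endpoint, every function assigned to the same guest
    if dev.is_pcie and dev.all_functions_same_guest:
        return "parent_secondary_bus_reset"
    # 3) PCI endpoint on a host bus with "Advanced Capabilities"
    if not dev.is_pcie and dev.on_host_bus and dev.has_pci_af_flr:
        return "pci_advanced_capabilities_flr"
    # 4) known vendor/device ID with a vendor-defined reset method
    if (dev.vendor_id, dev.device_id) in known_vendor_flr:
        return "vendor_specific_flr"
    # 5) Secondary Bus Reset, only if everything behind the bridge goes to
    #    the same guest. Assumption: a host-bus device has no parent bridge.
    if not dev.on_host_bus and dev.all_behind_bridge_same_guest:
        return "secondary_bus_reset"
    return None
```

Returning None when nothing applies leaves it to the caller to decide whether
to refuse the assignment; the proposal itself does not say what should happen
in that case.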

And I propose to move the FLR logic to the Control Panel.
The benefits are:
1) It's natural, and keeps the hypervisor thin;
2) The 100ms delay can be implemented easily in the Control Panel, but not
easily in the hypervisor;
3) Some logic, like looking up a device's parent BDF from its own BDF, can
be done more easily in the Control Panel.
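
As an example of benefit 3), the parent lookup can be a few lines of user-space
code. The sketch below assumes a Linux dom0 where /sys/bus/pci/devices/<BDF> is
a symlink whose resolved path lists the chain of bridges above the device; the
function names are hypothetical.

```python
import os
import re

# Full "domain:bus:device.function" form, e.g. 0000:02:00.0
BDF_RE = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def parent_bdf_from_path(resolved):
    """Return the BDF of the bridge directly above the device, or None.

    `resolved` is the device's real sysfs path, for example
    /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0
    """
    parent = os.path.basename(os.path.dirname(resolved.rstrip("/")))
    # A device on a root bus sits directly under "pciDDDD:BB", not a bridge.
    return parent if BDF_RE.match(parent) else None


def parent_bdf(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Resolve the sysfs symlink for `bdf` and return its parent's BDF."""
    return parent_bdf_from_path(os.path.realpath(os.path.join(sysfs_root, bdf)))
```

For a device directly on a root bus the function returns None, which matches
the case where there is no parent bridge on which to issue a Secondary Bus Reset.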

Comments are appreciated.

Thanks,
-- Dexuan
