[Xen-devel] Re: [RFC][patch 0/7] Enable PCIE-AER support for XEN

To:	"Ke, Liping" <liping.ke@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject:	[Xen-devel] Re: [RFC][patch 0/7] Enable PCIE-AER support for XEN
From:	Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date:	Fri, 14 Nov 2008 14:47:44 +0000
Cc:	"Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>

Need proper changeset comments and signed-off-by lines from you for all
these patches (especially the new ones, which don't have any upstream
comments or sign offs). Putting the domAction node in /local/domain/x/ is
dubious: it doesn't need to be accessible outside of dom0. How about
sticking it in pciback's directory, and have the watch set up from pciif.py?

 -- Keir

On 14/11/08 07:33, "Ke, Liping" <liping.ke@xxxxxxxxx> wrote:

> The following 7 patches add PCIE AER (Advanced Error Reporting) support to
> XEN.
> ---------------------------------------------------------------------------
> Patches 1~4 are backported from the Linux kernel and enable kernel AER support.
> 
> These patches give DOM0 PCIE error handling capability. When a device sends
> a PCIE error message to the root port, it triggers an interrupt. The irq
> handler then collects the root error status register and schedules work to
> process the error based on its type (correctable/non-fatal/fatal).
> 
> For a correctable error, the error status register of the device is cleared.
> For a non-fatal error, the callback functions of the endpoint's driver are
> called. A bridge broadcasts the error to its downstream ports. In dom0, this
> means the pciback driver is called accordingly.
> For a fatal error, the process is the same as for a non-fatal error, except
> that the pcie link is additionally reset.
> ----------------------------------------------------------------------------
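The per-severity handling described above can be sketched as a small dispatch function. This is only an illustrative userspace model, not code from the patches: the names `aer_severity`, `aer_actions` and `handle_aer_error` are hypothetical, and the real work (clearing registers, resetting the link, calling driver callbacks) is reduced to flags.

```c
/* Hypothetical severity classes, matching the three types named in the mail. */
enum aer_severity { AER_CORRECTABLE, AER_NONFATAL, AER_FATAL };

/* Hypothetical record of the actions taken, for illustration only. */
struct aer_actions {
    int status_cleared;   /* device error status register was cleared */
    int link_reset;       /* pcie link was reset (fatal errors only) */
    int driver_notified;  /* endpoint driver callbacks were invoked */
};

/* Decide which actions apply to an error of the given severity. */
static struct aer_actions handle_aer_error(enum aer_severity sev)
{
    struct aer_actions a = { 0, 0, 0 };

    switch (sev) {
    case AER_CORRECTABLE:
        /* Correctable: only clear the device's error status register. */
        a.status_cleared = 1;
        break;
    case AER_FATAL:
        /* Fatal: reset the link, then handle exactly like non-fatal. */
        a.link_reset = 1;
        /* fall through */
    case AER_NONFATAL:
        /* Non-fatal: call the endpoint driver's error callbacks. */
        a.driver_notified = 1;
        break;
    }
    return a;
}
```

The fall-through from the fatal case mirrors the statement that a fatal error follows the non-fatal path with a link reset as the only extra step.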
> Patches 5~7: AER error handler implementation in pciback and pcifront. This is
> the main work we have done.
> 
> As mentioned above, the pciback pci error handler is scheduled by the root
> port AER service. Pciback then asks pcifront to call the end-device driver to
> complete the related pci error handling.
> 
> We noticed there might be race conditions between pciback ops (such as the pci
> error handling we are working on, or other configuration ops) and pci-hotplug.
> These issues will be solved before the patch is sent.
> ---------------------------------------------------------------------------
> Test: 
> We have tested the patches on an IPF Hitachi machine, where an Unsupported
> Request non-fatal AER error can be triggered by reading or writing a
> non-existent function on a pci device which supports AER. (The end device, any
> intermediate bridge, and the root port must all support AER.)
> We also tested on x86 to make sure the patches do not break the current code path.
> ---------------------------------------------------------------------------
> Below is an example workflow which might be helpful:
> 1) Assign an AER-capable network device to a PV driver domain (VT-d is not
> supported on the Hitachi machine).
> 2) Install a network device driver that supports pci error handling in the
> PV guest.
> 3) If no device driver is installed in the PV guest, or the driver does not
> support the pci error recovery functions, the guest is killed directly (its
> devices will be FLRed). An HVM guest is always killed.
> 4) Trigger AER with a test driver; an interrupt is generated and caught by the
> root port.
> 5) The AER service driver for the root port in DOM0 performs the recovery
> steps in the bottom half of the aer interrupt context.
> For each recovery step (error_detected, mmio_enabled, slot_reset,
> error_resume), the aer core cooperates with each device below the root port
> that has registered pci_error_handlers to finish the step. For details, please
> see the related docs in the kernel (attached aer_doc.patch).
> 6) pciback_error_handler is then called by the AER core for each of the above
> four steps. Pciback sends the notification to pcifront, which then tries to
> call the corresponding device driver if that driver has a pci_error_handler.
> If every recovery step succeeds, the pcie error has been successfully
> recovered. Otherwise, the impacted guest is killed.
> 
> Thanks & Regards,
> Criping
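
The four-step recovery in the quoted workflow can be modelled roughly as follows. This is a simplified sketch under stated assumptions, not the pciback/pcifront code: `error_handlers`, `run_recovery`, `always_ok` and `always_fail` are hypothetical names, every step is run unconditionally in order, and a missing handler or any failed step maps to "guest killed". The kernel's own pci_error_handlers interface uses richer result codes and may skip steps depending on earlier results.

```c
#include <stddef.h>

enum step_result { STEP_OK, STEP_FAILED };
enum recovery_outcome { RECOVERED, GUEST_KILLED };

/* Hypothetical per-driver callbacks, one for each step named in the mail. */
struct error_handlers {
    enum step_result (*error_detected)(void);
    enum step_result (*mmio_enabled)(void);
    enum step_result (*slot_reset)(void);
    enum step_result (*error_resume)(void);
};

/* Sample callbacks used only to exercise the sketch. */
static enum step_result always_ok(void)   { return STEP_OK; }
static enum step_result always_fail(void) { return STEP_FAILED; }

/*
 * Run the four steps in order. A driver without handlers, a missing
 * callback, or a failing step ends with the guest being killed.
 */
static enum recovery_outcome run_recovery(const struct error_handlers *h)
{
    if (h == NULL)
        return GUEST_KILLED;    /* driver has no error handlers at all */

    enum step_result (*steps[4])(void) = {
        h->error_detected, h->mmio_enabled, h->slot_reset, h->error_resume
    };

    for (size_t i = 0; i < 4; i++) {
        if (steps[i] == NULL || steps[i]() != STEP_OK)
            return GUEST_KILLED;
    }
    return RECOVERED;
}
```

In this model, a handler set with all four callbacks returning `STEP_OK` yields `RECOVERED`; passing `NULL` or a set with a failing `slot_reset` yields `GUEST_KILLED`.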


