Need proper changeset comments and signed-off-by lines from you for all
these patches (especially the new ones, which don't have any upstream
comments or sign offs). Putting the domAction node in /local/domain/x/ is
dubious: it doesn't need to be accessible outside of dom0. How about
sticking it in pciback's directory, and have the watch set up from pciif.py?
-- Keir
On 14/11/08 07:33, "Ke, Liping" <liping.ke@xxxxxxxxx> wrote:
> The following 7 patches add PCIe AER (Advanced Error Reporting) support to
> Xen.
> ---------------------------------------------------------------------------
> Patches 1~4 are backported from the Linux kernel and enable kernel AER support.
>
> Those patches give DOM0 the ability to handle PCIe errors. When a device sends
> a PCIe error message to the root port, it triggers an interrupt. The IRQ
> handler then collects the root error status register and schedules work to
> process the error based on its type (correctable/non-fatal/fatal).
>
> For correctable errors, the error status register of the device is cleared.
> For non-fatal errors, the callback functions of the endpoint's driver are
> called. A bridge broadcasts the error to its downstream ports. In dom0, this
> means the pciback driver will be called accordingly.
> Fatal errors are processed in the same way as non-fatal errors, except that
> the PCIe link is additionally reset.
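The per-severity handling described above can be summarised in a small Python model. This is only a sketch of the description in this mail, not the kernel code: the `Severity` enum, the `recovery_actions` function and the action names are hypothetical labels, and no ordering between the actions is implied.

```python
from enum import Enum


class Severity(Enum):
    CORRECTABLE = "correctable"
    NONFATAL = "non-fatal"
    FATAL = "fatal"


def recovery_actions(severity):
    """Return the set of actions the mail describes for one error severity.

    The action names are illustrative labels, not kernel symbols.
    """
    if severity is Severity.CORRECTABLE:
        # Correctable: only the device's error status register is cleared.
        return {"clear_device_error_status"}
    # Non-fatal and fatal: the driver callbacks are invoked (pciback in dom0).
    actions = {"call_driver_error_callbacks"}
    if severity is Severity.FATAL:
        # Fatal: same processing, plus a PCIe link reset.
        actions.add("reset_pcie_link")
    return actions
```

For example, `recovery_actions(Severity.FATAL)` yields both the callback action and the link reset, while a correctable error yields only the status clear.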
> ----------------------------------------------------------------------------
> Patches 5~7: AER error handler implementation in pciback and pcifront. This
> is the main work we have done.
>
> As mentioned above, the pciback PCI error handler is scheduled by the root
> port AER service. Pciback then asks pcifront to call the end-device driver,
> which completes the related PCI error handling.
>
> We noticed there might be race conditions between pciback ops (such as the
> PCI error handling we are working on, or other configuration ops) and PCI
> hotplug. Those issues will be solved before sending the patch.
> ---------------------------------------------------------------------------
> Test:
> We have tested the patches on an IPF Hitachi machine, which can trigger an
> Unsupported Request non-fatal AER error by reading/writing a non-existent
> function on a PCI device which supports AER. (The end device, the
> intermediate bridge and the root port must all support AER.)
> We also tested them on x86 to make sure they do not break the current code
> path.
> ---------------------------------------------------------------------------
> Below is an example workflow, which might be helpful:
> 1) Assign an AER-capable network device to a PV driver domain (VT-d is not
> supported on the Hitachi machine).
> 2) Install a network device driver which supports PCI error handling in the
> PV guest.
> 3) If no device driver is installed in the PV guest, or the driver does not
> support the PCI error recovery functions, the guest will be killed directly
> (the devices will be FLRed). An HVM guest will simply be killed.
> 4) Trigger AER with a test driver; an interrupt will be generated and caught
> by the root port.
> 5) The AER service driver below the root port in DOM0 performs the recovery
> steps in the bottom half of the AER interrupt context.
> For each recovery step (error_detected, mmio_enabled, slot_reset,
> error_resume), the AER core cooperates with each device below that has
> registered pci_error_handlers to finish the step. For details, please see
> the related docs in the kernel (attached aer_doc.patch).
> 6) pciback_error_handler is then called by the AER core for each of the above
> four steps. Pciback sends the notification to pcifront, which then tries to
> call the corresponding device driver if that driver has a pci_error_handler.
> If every recovery step succeeds, the PCIe error should have been handled and
> successfully recovered. Otherwise, the impacted guest will be killed.
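The outcome rule in steps 3) to 6) can be sketched as a short Python model. It is a simplification under stated assumptions: `run_recovery`, the `handlers` dict and the result strings are hypothetical, every step is assumed to be attempted in the listed order, and a step without a handler is treated as a failure. The real AER core chooses steps based on handler return codes, which this sketch does not model.

```python
# The four recovery steps named in the workflow, in the order listed there.
RECOVERY_STEPS = ("error_detected", "mmio_enabled", "slot_reset", "error_resume")


def run_recovery(handlers):
    """Return "recovered" if every step succeeds, otherwise "guest_killed".

    `handlers` is a hypothetical dict mapping a step name to a callable that
    returns True on success. None models a guest whose driver registered no
    pci_error_handlers at all (step 3 of the workflow).
    """
    if handlers is None:
        return "guest_killed"
    for step in RECOVERY_STEPS:
        handler = handlers.get(step)
        # Assumption: a missing or failing step aborts the recovery.
        if handler is None or not handler():
            return "guest_killed"
    return "recovered"
```

A driver whose four callbacks all succeed gives `"recovered"`; a guest with no handlers, or with any failing step, gives `"guest_killed"`.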
>
> Thanks & Regards,
> Criping