WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

[Xen-changelog] [linux-2.6.18-xen] merge with linux-2.6.18-xen.hg

To: xen-changelog@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-changelog] [linux-2.6.18-xen] merge with linux-2.6.18-xen.hg
From: "Xen patchbot-linux-2.6.18-xen" <patchbot-linux-2.6.18-xen@xxxxxxxxxxxxxxxxxxx>
Date: Wed, 26 Nov 2008 12:00:28 -0800
Delivery-date: Wed, 26 Nov 2008 12:02:22 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-changelog-request@lists.xensource.com?subject=help>
List-id: BK change log <xen-changelog.lists.xensource.com>
List-post: <mailto:xen-changelog@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-changelog>, <mailto:xen-changelog-request@lists.xensource.com?subject=unsubscribe>
Reply-to: xen-devel@xxxxxxxxxxxxxxxxxxx
Sender: xen-changelog-bounces@xxxxxxxxxxxxxxxxxxx
# HG changeset patch
# User Isaku Yamahata <yamahata@xxxxxxxxxxxxx>
# Date 1227662655 -32400
# Node ID 6591b4869889bdd705267c47ac855f0c4bb7d75b
# Parent  61d1f2810617b4d0a15ba192fcf679918d5b0438
# Parent  f236d7def9944909bf40015ff4a08817b0803ed9
merge with linux-2.6.18-xen.hg
---
 Documentation/pcieaer-howto.txt             |  252 +++++++++
 arch/i386/kernel/io_apic-xen.c              |    5 
 arch/i386/pci/irq.c                         |    9 
 arch/x86_64/kernel/io_apic-xen.c            |    5 
 drivers/i2c/busses/Kconfig                  |    1 
 drivers/i2c/busses/i2c-i801.c               |    2 
 drivers/pci/msi-xen.c                       |  177 ++++--
 drivers/pci/pcie/Kconfig                    |    1 
 drivers/pci/pcie/Makefile                   |    3 
 drivers/pci/pcie/aer/Kconfig                |   12 
 drivers/pci/pcie/aer/Makefile               |    8 
 drivers/pci/pcie/aer/aerdrv.c               |  346 ++++++++++++
 drivers/pci/pcie/aer/aerdrv.h               |  125 ++++
 drivers/pci/pcie/aer/aerdrv_acpi.c          |   68 ++
 drivers/pci/pcie/aer/aerdrv_core.c          |  757 ++++++++++++++++++++++++++++
 drivers/pci/pcie/aer/aerdrv_errprint.c      |  248 +++++++++
 drivers/pci/pcie/portdrv_bus.c              |    1 
 drivers/pci/pcie/portdrv_pci.c              |  224 ++++++--
 drivers/pci/search.c                        |   31 +
 drivers/scsi/ahci.c                         |   25 
 drivers/scsi/ata_piix.c                     |   12 
 drivers/xen/balloon/balloon.c               |    4 
 drivers/xen/balloon/sysfs.c                 |    9 
 drivers/xen/blkback/blkback.c               |   44 -
 drivers/xen/blktap/blktap.c                 |   43 +
 drivers/xen/core/evtchn.c                   |  583 ++++++++++-----------
 drivers/xen/core/pci.c                      |    8 
 drivers/xen/fbfront/xenfb.c                 |    8 
 drivers/xen/pciback/controller.c            |   35 +
 drivers/xen/pciback/passthrough.c           |   10 
 drivers/xen/pciback/pci_stub.c              |  319 +++++++++++
 drivers/xen/pciback/pciback.h               |   15 
 drivers/xen/pciback/pciback_ops.c           |   25 
 drivers/xen/pciback/slot.c                  |   30 +
 drivers/xen/pciback/vpci.c                  |   30 +
 drivers/xen/pciback/xenbus.c                |    9 
 drivers/xen/pcifront/pci_op.c               |  115 ++++
 drivers/xen/pcifront/pcifront.h             |   13 
 drivers/xen/pcifront/xenbus.c               |   14 
 include/asm-i386/mach-xen/asm/hypercall.h   |    7 
 include/asm-x86_64/mach-xen/asm/hypercall.h |    7 
 include/linux/aer.h                         |   24 
 include/linux/pci.h                         |    6 
 include/linux/pci_ids.h                     |    3 
 include/linux/pcieport_if.h                 |    6 
 include/xen/interface/features.h            |    6 
 include/xen/interface/grant_table.h         |    9 
 include/xen/interface/io/pciif.h            |   35 +
 include/xen/interface/kexec.h               |   21 
 include/xen/interface/trace.h               |    2 
 kernel/kexec.c                              |    3 
 sound/pci/hda/hda_intel.c                   |    2 
 52 files changed, 3285 insertions(+), 472 deletions(-)

diff -r 61d1f2810617 -r 6591b4869889 Documentation/pcieaer-howto.txt
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/Documentation/pcieaer-howto.txt   Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,253 @@
+   The PCI Express Advanced Error Reporting Driver Guide HOWTO
+               T. Long Nguyen  <tom.l.nguyen@xxxxxxxxx>
+               Yanmin Zhang    <yanmin.zhang@xxxxxxxxx>
+                               07/29/2006
+
+
+1. Overview
+
+1.1 About this guide
+
+This guide describes the basics of the PCI Express Advanced Error
+Reporting (AER) driver and provides information on how to use it, as
+well as how to enable the drivers of endpoint devices to conform with
+the PCI Express AER driver.
+
+1.2 Copyright © Intel Corporation 2006.
+
+1.3 What is the PCI Express AER Driver?
+
+PCI Express error signaling can occur on the PCI Express link itself
+or on behalf of transactions initiated on the link. PCI Express
+defines two error reporting paradigms: the baseline capability and
+the Advanced Error Reporting capability. The baseline capability is
+required of all PCI Express components providing a minimum defined
+set of error reporting requirements. Advanced Error Reporting
+capability is implemented with a PCI Express advanced error reporting
+extended capability structure providing more robust error reporting.
+
+The PCI Express AER driver provides the infrastructure to support PCI
+Express Advanced Error Reporting capability. The PCI Express AER
+driver provides three basic functions:
+
+-      Gathers comprehensive error information when errors occur.
+-      Reports errors to the user.
+-      Performs error recovery actions.
+
+The AER driver attaches only to root ports that support the PCI Express
+AER capability.
+
+
+2. User Guide
+
+2.1 Include the PCI Express AER Root Driver into the Linux Kernel
+
+The PCI Express AER Root driver is a Root Port service driver attached
+to the PCI Express Port Bus driver. If a user wants to use it, the driver
+has to be compiled. Option CONFIG_PCIEAER supports this capability. It
+depends on CONFIG_PCIEPORTBUS, so please set CONFIG_PCIEPORTBUS=y and
+CONFIG_PCIEAER=y.
+
+2.2 Load PCI Express AER Root Driver
+Some systems have AER support in the BIOS. Enabling the AER Root driver
+while the BIOS also handles AER may result in unpredictable
+behavior. To avoid this conflict, a successful load of the AER Root driver
+requires ACPI _OSC support in the BIOS to allow the AER Root driver to
+request native control of AER. See the PCI FW 3.0 Specification for
+details regarding _OSC usage. Currently, much firmware does not provide
+_OSC support even though it uses PCI Express. To support such firmware,
+forceload, a parameter of type bool, allows AER initialization to
+continue even when the firmware has no _OSC support. To enable the
+workaround, please add aerdriver.forceload=y to the kernel boot parameter
+line. Note that forceload=n by default.
+
+2.3 AER error output
+When a PCI-E AER error is captured, an error message is printed to the
+console. If it's a correctable error, it is printed as a warning.
+Otherwise, it is printed as an error. Users can therefore choose a
+log level that filters out correctable error messages.
+
+An example is shown below.
++------ PCI-Express Device Error -----+
+Error Severity          : Uncorrected (Fatal)
+PCIE Bus Error type     : Transaction Layer
+Unsupported Request     : First
+Requester ID            : 0500
+VendorID=8086h, DeviceID=0329h, Bus=05h, Device=00h, Function=00h
+TLB Header:
+04000001 00200a03 05010000 00050100
+
+In the example, 'Requester ID' means the ID of the device that sent
+the error message to the root port. Please refer to the PCI Express
+specifications for the other fields.
+
+
+3. Developer Guide
+
+Enabling AER-aware support requires a software driver to configure
+the AER capability structure within its device and to provide callbacks.
+
+To support AER well, developers first need to understand how AER
+works.
+
+PCI Express errors are classified into two types: correctable errors
+and uncorrectable errors. This classification is based on the impact
+of those errors, which may result in degraded performance or function
+failure.
+
+Correctable errors have no impact on the functionality of the
+interface. The PCI Express protocol can recover without any software
+intervention or any loss of data. These errors are detected and
+corrected by hardware. Unlike correctable errors, uncorrectable
+errors impact functionality of the interface. Uncorrectable errors
+can cause a particular transaction or a particular PCI Express link
+to be unreliable. Depending on those error conditions, uncorrectable
+errors are further classified into non-fatal errors and fatal errors.
+Non-fatal errors cause the particular transaction to be unreliable,
+but the PCI Express link itself is fully functional. Fatal errors, on
+the other hand, cause the link to be unreliable.
+
+When AER is enabled, a PCI Express device will automatically send an
+error message to the PCIE root port above it when the device captures
+an error. The Root Port, upon receiving an error reporting message,
+internally processes and logs the error message in its PCI Express
+capability structure. Error information being logged includes storing
+the error reporting agent's requester ID into the Error Source
+Identification Registers and setting the error bits of the Root Error
+Status Register accordingly. If AER error reporting is enabled in Root
+Error Command Register, the Root Port generates an interrupt if an
+error is detected.
+
+Note that the errors as described above are related to the PCI Express
+hierarchy and links. These errors do not include any device specific
+errors because device specific errors will still get sent directly to
+the device driver.
+
+3.1 Configure the AER capability structure
+
+AER-aware drivers of PCI Express components need to change the device
+control registers to enable AER. They may also change AER registers,
+including the mask and severity registers. The helper function
+pci_enable_pcie_error_reporting can be used to enable AER. See
+section 3.3.
+
+3.2. Provide callbacks
+
+3.2.1 callback reset_link to reset pci express link
+
+This callback is used to reset the PCI Express physical link when a
+fatal error happens. The root port AER service driver provides a
+default reset_link function, but different upstream ports might
+have different requirements for resetting the PCI Express link, so all
+upstream ports should provide their own reset_link functions.
+
+In struct pcie_port_service_driver, a new pointer, reset_link, is
+added.
+
+pci_ers_result_t (*reset_link) (struct pci_dev *dev);
+
+Section 3.2.2.2 provides more detailed info on when to call
+reset_link.
+
+3.2.2 PCI error-recovery callbacks
+
+The PCI Express AER Root driver uses error callbacks to coordinate
+with downstream device drivers associated with a hierarchy in question
+when performing error recovery actions.
+
+The pci_driver structure has a pointer, err_handler, to a
+pci_error_handlers structure, which consists of several callback function
+pointers. The AER driver follows the rules defined in
+pci-error-recovery.txt except for the PCI Express specific parts (e.g.
+reset_link). Please refer to pci-error-recovery.txt for detailed
+definitions of the callbacks.
+
+The sections below specify when the error callback functions are called.
+
+3.2.2.1 Correctable errors
+
+Correctable errors have no impact on the functionality of
+the interface. The PCI Express protocol can recover without any
+software intervention or any loss of data. These errors do not
+require any recovery actions. The AER driver clears the device's
+correctable error status register accordingly and logs these errors.
+
+3.2.2.2 Non-correctable (non-fatal and fatal) errors
+
+If an error message indicates a non-fatal error, performing link reset
+at upstream is not required. The AER driver calls error_detected(dev,
+pci_channel_io_normal) on all drivers associated with the hierarchy in
+question. For example, consider
+EndPoint<==>DownstreamPort B<==>UpstreamPort A<==>RootPort.
+If Upstream port A captures an AER error, the hierarchy consists of
+Downstream port B and EndPoint.
+
+A driver may return PCI_ERS_RESULT_CAN_RECOVER, PCI_ERS_RESULT_DISCONNECT,
+or PCI_ERS_RESULT_NEED_RESET, depending on whether it can recover. After
+PCI_ERS_RESULT_CAN_RECOVER, the AER driver calls mmio_enabled next.
+
+If an error message indicates a fatal error, the kernel broadcasts
+error_detected(dev, pci_channel_io_frozen) to all drivers within
+the hierarchy in question. Then, performing link reset at upstream is
+necessary. As different kinds of devices might use different approaches
+to reset the link, the AER port service driver is required to provide the
+function to reset the link. First, the kernel checks whether the upstream
+component has an AER driver. If it has, the kernel uses the reset_link
+callback of that AER driver. If the upstream component has no AER driver
+and the port is a downstream port, the kernel uses the AER driver of the
+root port that reported the AER error. As for upstream ports,
+they should provide their own AER service drivers with a reset_link
+function. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER and
+reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
+to mmio_enabled.
+
+3.3 helper functions
+
+3.3.1 int pci_find_aer_capability(struct pci_dev *dev);
+pci_find_aer_capability locates the PCI Express AER capability
+in the device configuration space. If the device doesn't support
+PCI-Express AER, the function returns 0.
+
+3.3.2 int pci_enable_pcie_error_reporting(struct pci_dev *dev);
+pci_enable_pcie_error_reporting enables the device to send error
+messages to the root port when an error is detected. Note that devices
+don't enable error reporting by default, so device drivers need to
+call this function to enable it.
+
+3.3.3 int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+pci_disable_pcie_error_reporting stops the device from sending error
+messages to the root port when an error is detected.
+
+3.3.4 int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+pci_cleanup_aer_uncorrect_error_status clears the uncorrectable
+error status register.
+
+3.4 Frequently Asked Questions
+
+Q: What happens if a PCI Express device driver does not provide an
+error recovery handler (pci_driver->err_handler is equal to NULL)?
+
+A: The devices attached to the driver won't be recovered. If the
+error is fatal, the kernel will print warning messages. Please refer
+to section 3 for more information.
+
+Q: What happens if an upstream port service driver does not provide
+callback reset_link?
+
+A: Fatal error recovery will fail if the errors are reported by the
+upstream ports to which the service driver is attached.
+
+Q: How does this infrastructure deal with a driver that is not PCI
+Express aware?
+
+A: This infrastructure calls the error callback functions of the
+driver when an error happens. But if the driver is not aware of
+PCI Express, the device might not report its own errors to the root
+port.
+
+Q: What modifications will that driver need to make it compatible
+with the PCI Express AER Root driver?
+
+A: It could call the helper functions to enable AER in devices and
+clear the uncorrectable error status register. Please refer to section 3.3.
+
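
The recovery rules in sections 3.2.2.1 and 3.2.2.2 of the HOWTO above can be summarized as a small decision function. The following is a userspace sketch only, written from the prose and not taken from the AER driver: the names aer_sketch_severity, sketch_channel_state, recovery_plan and plan_recovery are hypothetical and do not exist in the kernel.

```c
#include <stdbool.h>

/* Hypothetical model types for illustration; not kernel identifiers. */
enum aer_sketch_severity {
    AER_SKETCH_CORRECTABLE,
    AER_SKETCH_NONFATAL,
    AER_SKETCH_FATAL
};

/* Stand-ins for pci_channel_io_normal / pci_channel_io_frozen. */
enum sketch_channel_state {
    SKETCH_CHANNEL_NONE,
    SKETCH_CHANNEL_IO_NORMAL,
    SKETCH_CHANNEL_IO_FROZEN
};

struct recovery_plan {
    bool call_error_detected;        /* are driver callbacks invoked? */
    enum sketch_channel_state state; /* state passed to error_detected */
    bool reset_link;                 /* is an upstream link reset needed? */
};

/* Map an error severity to the actions the HOWTO describes. */
struct recovery_plan plan_recovery(enum aer_sketch_severity sev)
{
    struct recovery_plan plan = { false, SKETCH_CHANNEL_NONE, false };

    switch (sev) {
    case AER_SKETCH_CORRECTABLE:
        /* Status is cleared and the error logged; no recovery action. */
        break;
    case AER_SKETCH_NONFATAL:
        /* Link is still functional: notify drivers, no link reset. */
        plan.call_error_detected = true;
        plan.state = SKETCH_CHANNEL_IO_NORMAL;
        break;
    case AER_SKETCH_FATAL:
        /* Link is unreliable: notify drivers, then reset the link. */
        plan.call_error_detected = true;
        plan.state = SKETCH_CHANNEL_IO_FROZEN;
        plan.reset_link = true;
        break;
    }
    return plan;
}
```

For a fatal error, plan_recovery(AER_SKETCH_FATAL) yields the frozen channel state with reset_link set, which matches the order given in section 3.2.2.2: broadcast error_detected first, then reset the link.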
diff -r 61d1f2810617 -r 6591b4869889 arch/i386/kernel/io_apic-xen.c
--- a/arch/i386/kernel/io_apic-xen.c    Tue Nov 04 12:43:37 2008 +0900
+++ b/arch/i386/kernel/io_apic-xen.c    Wed Nov 26 10:24:15 2008 +0900
@@ -1216,6 +1216,9 @@ int assign_irq_vector(int irq)
 
        BUG_ON(irq != AUTO_ASSIGN && (unsigned)irq >= NR_IRQ_VECTORS);
 
+       if (irq < PIRQ_BASE || irq - PIRQ_BASE > NR_PIRQS)
+               return -EINVAL;
+
        spin_lock_irqsave(&vector_lock, flags);
 
        if (irq != AUTO_ASSIGN && IO_APIC_VECTOR(irq) > 0) {
@@ -2567,8 +2570,10 @@ static int ioapic_resume(struct sys_devi
 
 static struct sysdev_class ioapic_sysdev_class = {
        set_kset_name("ioapic"),
+#ifndef CONFIG_XEN
        .suspend = ioapic_suspend,
        .resume = ioapic_resume,
+#endif
 };
 
 static int __init ioapic_init_sysfs(void)
diff -r 61d1f2810617 -r 6591b4869889 arch/i386/pci/irq.c
--- a/arch/i386/pci/irq.c       Tue Nov 04 12:43:37 2008 +0900
+++ b/arch/i386/pci/irq.c       Wed Nov 26 10:24:15 2008 +0900
@@ -558,6 +558,15 @@ static __init int intel_router_probe(str
                        r->set = pirq_piix_set;
                        return 1;
        }
+
+       if ((device >= PCI_DEVICE_ID_INTEL_PCH_LPC_MIN) && 
+               (device <= PCI_DEVICE_ID_INTEL_PCH_LPC_MAX)) {
+               r->name = "PIIX/ICH";
+               r->get = pirq_piix_get;
+               r->set = pirq_piix_set;
+               return 1;
+       }
+
        return 0;
 }
 
diff -r 61d1f2810617 -r 6591b4869889 arch/x86_64/kernel/io_apic-xen.c
--- a/arch/x86_64/kernel/io_apic-xen.c  Tue Nov 04 12:43:37 2008 +0900
+++ b/arch/x86_64/kernel/io_apic-xen.c  Wed Nov 26 10:24:15 2008 +0900
@@ -895,6 +895,9 @@ int assign_irq_vector(int irq)
   
        BUG_ON(irq != AUTO_ASSIGN && (unsigned)irq >= NR_IRQ_VECTORS);
 
+       if (irq < PIRQ_BASE || irq - PIRQ_BASE > NR_PIRQS)
+               return -EINVAL;
+
        spin_lock_irqsave(&vector_lock, flags);
 
        if (irq != AUTO_ASSIGN && IO_APIC_VECTOR(irq) > 0) {
@@ -2104,8 +2107,10 @@ static int ioapic_resume(struct sys_devi
 
 static struct sysdev_class ioapic_sysdev_class = {
        set_kset_name("ioapic"),
+#ifndef CONFIG_XEN
        .suspend = ioapic_suspend,
        .resume = ioapic_resume,
+#endif
 };
 
 static int __init ioapic_init_sysfs(void)
diff -r 61d1f2810617 -r 6591b4869889 drivers/i2c/busses/Kconfig
--- a/drivers/i2c/busses/Kconfig        Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/i2c/busses/Kconfig        Wed Nov 26 10:24:15 2008 +0900
@@ -127,6 +127,7 @@ config I2C_I801
            ICH8
            ICH9
            ICH10
+            PCH
 
          This driver can also be built as a module.  If so, the module
          will be called i2c-i801.
diff -r 61d1f2810617 -r 6591b4869889 drivers/i2c/busses/i2c-i801.c
--- a/drivers/i2c/busses/i2c-i801.c     Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/i2c/busses/i2c-i801.c     Wed Nov 26 10:24:15 2008 +0900
@@ -36,6 +36,7 @@
     ICH9               2930
     ICH10              3A30
     ICH10              3A60
+    PCH                        3B30
     This driver supports several versions of Intel's I/O Controller Hubs (ICH).
     For SMBus support, they are similar to the PIIX4 and are part
     of Intel's '810' and other chipsets.
@@ -463,6 +464,7 @@ static struct pci_device_id i801_ids[] =
        { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH9_6) },
        { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH10_4) },
        { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_ICH10_5) },
+       { PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PCH_SMBUS) },
        { 0, }
 };
 
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/msi-xen.c
--- a/drivers/pci/msi-xen.c     Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/pci/msi-xen.c     Wed Nov 26 10:24:15 2008 +0900
@@ -48,6 +48,13 @@ struct msi_pirq_entry {
        struct list_head list;
        int pirq;
        int entry_nr;
+#ifdef CONFIG_PM
+       /* PM save area for MSIX address/data */
+       void __iomem *mask_base;
+       u32     address_hi_save;
+       u32     address_lo_save;
+       u32     data_save;
+#endif
 };
 
 static struct msi_dev_list *get_msi_dev_pirq_list(struct pci_dev *dev)
@@ -83,7 +90,7 @@ static struct msi_dev_list *get_msi_dev_
        return ret;
 }
 
-static int attach_pirq_entry(int pirq, int entry_nr,
+static int attach_pirq_entry(int pirq, int entry_nr, u64 table_base,
                              struct msi_dev_list *msi_dev_entry)
 {
        struct msi_pirq_entry *entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
@@ -93,6 +100,9 @@ static int attach_pirq_entry(int pirq, i
                return -ENOMEM;
        entry->pirq = pirq;
        entry->entry_nr = entry_nr;
+#ifdef CONFIG_PM
+       entry->mask_base = table_base;
+#endif
        spin_lock_irqsave(&msi_dev_entry->pirq_list_lock, flags);
        list_add_tail(&entry->list, &msi_dev_entry->pirq_list_head);
        spin_unlock_irqrestore(&msi_dev_entry->pirq_list_lock, flags);
@@ -299,104 +309,173 @@ static void enable_msi_mode(struct pci_d
 #ifdef CONFIG_PM
 int pci_save_msi_state(struct pci_dev *dev)
 {
-       int pos;
+       int pos, i = 0;
+       u16 control;
+       struct pci_cap_saved_state *save_state;
+       u32 *cap;
 
        pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
        if (pos <= 0 || dev->no_msi)
                return 0;
 
-       if (!dev->msi_enabled)
+       pci_read_config_word(dev, msi_control_reg(pos), &control);
+       if (!(control & PCI_MSI_FLAGS_ENABLE))
                return 0;
 
-       /* Restore dev->irq to its default pin-assertion vector */
-       msi_unmap_pirq(dev, dev->irq);
-       /* Disable MSI mode */
-       disable_msi_mode(dev, pos, PCI_CAP_ID_MSI);
-       /* Set the flags for use of restore */
-       dev->msi_enabled = 1;
+       save_state = kzalloc(sizeof(struct pci_cap_saved_state) + sizeof(u32) * 5,
+               GFP_KERNEL);
+       if (!save_state) {
+               printk(KERN_ERR "Out of memory in pci_save_msi_state\n");
+               return -ENOMEM;
+       }
+       cap = &save_state->data[0];
+
+       pci_read_config_dword(dev, pos, &cap[i++]);
+       control = cap[0] >> 16;
+       pci_read_config_dword(dev, pos + PCI_MSI_ADDRESS_LO, &cap[i++]);
+       if (control & PCI_MSI_FLAGS_64BIT) {
+               pci_read_config_dword(dev, pos + PCI_MSI_ADDRESS_HI, &cap[i++]);
+               pci_read_config_dword(dev, pos + PCI_MSI_DATA_64, &cap[i++]);
+       } else
+               pci_read_config_dword(dev, pos + PCI_MSI_DATA_32, &cap[i++]);
+       if (control & PCI_MSI_FLAGS_MASKBIT)
+               pci_read_config_dword(dev, pos + PCI_MSI_MASK_BIT, &cap[i++]);
+       save_state->cap_nr = PCI_CAP_ID_MSI;
+       pci_add_saved_cap(dev, save_state);
        return 0;
 }
 
 void pci_restore_msi_state(struct pci_dev *dev)
 {
-       int pos, pirq;
-
+       int i = 0, pos;
+       u16 control;
+       struct pci_cap_saved_state *save_state;
+       u32 *cap;
+
+       save_state = pci_find_saved_cap(dev, PCI_CAP_ID_MSI);
        pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
-       if (pos <= 0)
-               return;
-
-       if (!dev->msi_enabled)
-               return;
-
-       pirq = msi_map_pirq_to_vector(dev, dev->irq, 0, 0);
-       if (pirq < 0)
-               return;
+       if (!save_state || pos <= 0)
+               return;
+       cap = &save_state->data[0];
+
+       control = cap[i++] >> 16;
+       pci_write_config_dword(dev, pos + PCI_MSI_ADDRESS_LO, cap[i++]);
+       if (control & PCI_MSI_FLAGS_64BIT) {
+               pci_write_config_dword(dev, pos + PCI_MSI_ADDRESS_HI, cap[i++]);
+               pci_write_config_dword(dev, pos + PCI_MSI_DATA_64, cap[i++]);
+       } else
+               pci_write_config_dword(dev, pos + PCI_MSI_DATA_32, cap[i++]);
+       if (control & PCI_MSI_FLAGS_MASKBIT)
+               pci_write_config_dword(dev, pos + PCI_MSI_MASK_BIT, cap[i++]);
+       pci_write_config_word(dev, pos + PCI_MSI_FLAGS, control);
        enable_msi_mode(dev, pos, PCI_CAP_ID_MSI);
+       pci_remove_saved_cap(save_state);
+       kfree(save_state);
 }
 
 int pci_save_msix_state(struct pci_dev *dev)
 {
        int pos;
+       u16 control;
+       struct pci_cap_saved_state *save_state;
        unsigned long flags;
        struct msi_dev_list *msi_dev_entry;
-       struct msi_pirq_entry *pirq_entry, *tmp;
+       struct msi_pirq_entry *pirq_entry;
 
        pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
        if (pos <= 0 || dev->no_msi)
                return 0;
 
+       printk(KERN_CRIT "Saving MSIX cap\n");
+
        /* save the capability */
-       if (!dev->msix_enabled)
+       pci_read_config_word(dev, msi_control_reg(pos), &control);
+       if (!(control & PCI_MSIX_FLAGS_ENABLE))
                return 0;
+       save_state = kzalloc(sizeof(struct pci_cap_saved_state) + sizeof(u16),
+               GFP_KERNEL);
+       if (!save_state) {
+               printk(KERN_ERR "Out of memory in pci_save_msix_state\n");
+               return -ENOMEM;
+       }
+       *((u16 *)&save_state->data[0]) = control;
 
        msi_dev_entry = get_msi_dev_pirq_list(dev);
 
        spin_lock_irqsave(&msi_dev_entry->pirq_list_lock, flags);
-        list_for_each_entry_safe(pirq_entry, tmp,
-                                 &msi_dev_entry->pirq_list_head, list)
-               msi_unmap_pirq(dev, pirq_entry->pirq);
+       list_for_each_entry(pirq_entry, &msi_dev_entry->pirq_list_head, list) {
+               int j;
+               void __iomem *base;
+
+               /* save the table */
+               base = pirq_entry->mask_base;
+               j = pirq_entry->entry_nr;
+               printk(KERN_CRIT "Save msix table entry %d pirq %x base %p\n",
+                      j, pirq_entry->pirq, base);
+
+               pirq_entry->address_lo_save =
+                       readl(base + j * PCI_MSIX_ENTRY_SIZE +
+                             PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
+               pirq_entry->address_hi_save =
+                       readl(base + j * PCI_MSIX_ENTRY_SIZE +
+                             PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
+               pirq_entry->data_save =
+                       readl(base + j * PCI_MSIX_ENTRY_SIZE +
+                             PCI_MSIX_ENTRY_DATA_OFFSET);
+       }
        spin_unlock_irqrestore(&msi_dev_entry->pirq_list_lock, flags);
 
-       disable_msi_mode(dev, pos, PCI_CAP_ID_MSIX);
-       /* Set the flags for use of restore */
-       dev->msix_enabled = 1;
-
+       save_state->cap_nr = PCI_CAP_ID_MSIX;
+       pci_add_saved_cap(dev, save_state);
        return 0;
 }
 
 void pci_restore_msix_state(struct pci_dev *dev)
 {
-       int pos;
+       u16 save;
+       int pos, j;
+       void __iomem *base;
+       struct pci_cap_saved_state *save_state;
        unsigned long flags;
-       u64 table_base;
        struct msi_dev_list *msi_dev_entry;
-       struct msi_pirq_entry *pirq_entry, *tmp;
+       struct msi_pirq_entry *pirq_entry;
+
+       save_state = pci_find_saved_cap(dev, PCI_CAP_ID_MSIX);
+       if (!save_state)
+               return;
+       printk(KERN_CRIT "Restoring MSIX cap\n");
+
+       save = *((u16 *)&save_state->data[0]);
+       pci_remove_saved_cap(save_state);
+       kfree(save_state);
 
        pos = pci_find_capability(dev, PCI_CAP_ID_MSIX);
        if (pos <= 0)
                return;
 
-       if (!dev->msix_enabled)
-               return;
-
        msi_dev_entry = get_msi_dev_pirq_list(dev);
-       table_base = find_table_base(dev, pos);
-       if (!table_base)
-               return;
 
        spin_lock_irqsave(&msi_dev_entry->pirq_list_lock, flags);
-       list_for_each_entry_safe(pirq_entry, tmp,
-                                &msi_dev_entry->pirq_list_head, list) {
-               int rc = msi_map_pirq_to_vector(dev, pirq_entry->pirq,
-                                               pirq_entry->entry_nr, table_base);
-               if (rc < 0)
-                       printk(KERN_WARNING
-                              "%s: re-mapping irq #%d (pirq%d) failed: %d\n",
-                              pci_name(dev), pirq_entry->entry_nr,
-                              pirq_entry->pirq, rc);
+       list_for_each_entry(pirq_entry, &msi_dev_entry->pirq_list_head, list) {
+               /* route the table */
+               base = pirq_entry->mask_base;
+               j = pirq_entry->entry_nr;
+
+               printk(KERN_CRIT "Restore msix table entry %d pirq %x base %p\n",
+                      j, pirq_entry->pirq, base);
+               writel(pirq_entry->address_lo_save,
+                       base + j * PCI_MSIX_ENTRY_SIZE +
+                       PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET);
+               writel(pirq_entry->address_hi_save,
+                       base + j * PCI_MSIX_ENTRY_SIZE +
+                       PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET);
+               writel(pirq_entry->data_save,
+                       base + j * PCI_MSIX_ENTRY_SIZE +
+                       PCI_MSIX_ENTRY_DATA_OFFSET);
        }
        spin_unlock_irqrestore(&msi_dev_entry->pirq_list_lock, flags);
 
+       pci_write_config_word(dev, msi_control_reg(pos), save);
        enable_msi_mode(dev, pos, PCI_CAP_ID_MSIX);
 }
 #endif
@@ -475,7 +554,7 @@ static int msix_capability_init(struct p
                pirq = msi_map_vector(dev, entries[i].entry, table_base);
                if (pirq < 0)
                        break;
-               attach_pirq_entry(pirq, entries[i].entry, msi_dev_entry);
+               attach_pirq_entry(pirq, entries[i].entry, table_base, msi_dev_entry);
                (entries + i)->vector = pirq;
        }
 
@@ -660,7 +739,7 @@ int pci_enable_msix(struct pci_dev* dev,
                        if (mapped)
                                continue;
                        irq = evtchn_map_pirq(-1, entries[i].vector);
-                       attach_pirq_entry(irq, entries[i].entry, msi_dev_entry);
+                       attach_pirq_entry(irq, entries[i].entry, 0, msi_dev_entry);
                        entries[i].vector = irq;
                }
         return 0;
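
In the drivers/pci/msi-xen.c changes above, pci_save_msi_state allocates room for five u32 values. The sketch below shows where that number appears to come from by counting the config-space reads the function performs for a given MSI control word. It is a userspace illustration, not kernel code: msi_saved_dwords is a hypothetical helper, and the two flag values are assumed to match the kernel's include/linux/pci_regs.h.

```c
#include <stdint.h>

/* Assumed to match include/linux/pci_regs.h in the kernel tree. */
#define PCI_MSI_FLAGS_64BIT   0x80   /* 64-bit message address supported */
#define PCI_MSI_FLAGS_MASKBIT 0x100  /* per-vector mask register present */

/*
 * Hypothetical helper: number of 32-bit words pci_save_msi_state stores
 * for a capability whose Message Control register equals 'control'.
 */
unsigned int msi_saved_dwords(uint16_t control)
{
    unsigned int n = 2;      /* capability header dword + address low */

    if (control & PCI_MSI_FLAGS_64BIT)
        n += 2;              /* address high + data (64-bit layout) */
    else
        n += 1;              /* data (32-bit layout) */

    if (control & PCI_MSI_FLAGS_MASKBIT)
        n += 1;              /* mask bits register */

    return n;
}
```

The count peaks at five when both the 64-bit and mask-bit flags are set, so a buffer of sizeof(u32) * 5 covers every layout the save path reads.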
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/Kconfig
--- a/drivers/pci/pcie/Kconfig  Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/pci/pcie/Kconfig  Wed Nov 26 10:24:15 2008 +0900
@@ -34,3 +34,4 @@ config HOTPLUG_PCI_PCIE_POLL_EVENT_MODE
           
          When in doubt, say N.
 
+source "drivers/pci/pcie/aer/Kconfig"
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/Makefile
--- a/drivers/pci/pcie/Makefile Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/pci/pcie/Makefile Wed Nov 26 10:24:15 2008 +0900
@@ -5,3 +5,6 @@ pcieportdrv-y                   := portdrv_core.o portdr
 pcieportdrv-y                  := portdrv_core.o portdrv_pci.o portdrv_bus.o
 
 obj-$(CONFIG_PCIEPORTBUS)      += pcieportdrv.o
+
+# Build PCI Express AER if needed
+obj-$(CONFIG_PCIEAER)          += aer/
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/aer/Kconfig
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/drivers/pci/pcie/aer/Kconfig      Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,12 @@
+#
+# PCI Express Root Port Device AER Configuration
+#
+
+config PCIEAER
+       boolean "Root Port Advanced Error Reporting support"
+       depends on PCIEPORTBUS && ACPI
+       default y
+       help
+         This enables PCI Express Root Port Advanced Error Reporting
+         (AER) driver support. Error reporting messages sent to Root
+         Port will be handled by PCI Express AER driver.
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/aer/Makefile
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/drivers/pci/pcie/aer/Makefile     Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,8 @@
+#
+# Makefile for PCI-Express Root Port Advanced Error Reporting Driver
+#
+
+obj-$(CONFIG_PCIEAER) += aerdriver.o
+
+aerdriver-objs := aerdrv_errprint.o aerdrv_core.o aerdrv.o aerdrv_acpi.o
+
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/aer/aerdrv.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/drivers/pci/pcie/aer/aerdrv.c     Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,346 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file implements the AER root port service driver. The driver will
+ * register an irq handler. When root port triggers an AER interrupt, the irq
+ * handler will collect root port status and schedule a work item.
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@xxxxxxxxx)
+ *     Zhang Yanmin (yanmin.zhang@xxxxxxxxx)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/pcieport_if.h>
+
+#include "aerdrv.h"
+
+/*
+ * Version Information
+ */
+#define DRIVER_VERSION "v1.0"
+#define DRIVER_AUTHOR "tom.l.nguyen@xxxxxxxxx"
+#define DRIVER_DESC "Root Port Advanced Error Reporting Driver"
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION(DRIVER_DESC);
+MODULE_LICENSE("GPL");
+
+static int __devinit aer_probe (struct pcie_device *dev,
+       const struct pcie_port_service_id *id );
+static void aer_remove(struct pcie_device *dev);
+static int aer_suspend(struct pcie_device *dev, pm_message_t state)
+{return 0;}
+static int aer_resume(struct pcie_device *dev) {return 0;}
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+       enum pci_channel_state error);
+static void aer_error_resume(struct pci_dev *dev);
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
+
+/*
+ * PCI Express bus's AER Root service driver data structure
+ */
+static struct pcie_port_service_id aer_id[] = {
+       {
+       .vendor         = PCI_ANY_ID,
+       .device         = PCI_ANY_ID,
+       .port_type      = PCIE_RC_PORT,
+       .service_type   = PCIE_PORT_SERVICE_AER,
+       },
+       { /* end: all zeroes */ }
+};
+
+static struct pci_error_handlers aer_error_handlers = {
+       .error_detected = aer_error_detected,
+       .resume = aer_error_resume,
+};
+
+static struct pcie_port_service_driver aerdrv = {
+       .name           = "aer",
+       .id_table       = &aer_id[0],
+
+       .probe          = aer_probe,
+       .remove         = aer_remove,
+
+       .suspend        = aer_suspend,
+       .resume         = aer_resume,
+
+       .err_handler    = &aer_error_handlers,
+
+       .reset_link     = aer_root_reset,
+};
+
+/**
+ * aer_irq - Root Port's ISR
+ * @irq: IRQ assigned to Root Port
+ * @context: pointer to Root Port data structure
+ * @r: pointer to struct pt_regs
+ *
+ * Invoked when Root Port detects AER messages.
+ **/
+static irqreturn_t aer_irq(int irq, void *context, struct pt_regs * r)
+{
+       unsigned int status, id;
+       struct pcie_device *pdev = (struct pcie_device *)context;
+       struct aer_rpc *rpc = get_service_data(pdev);
+       int next_prod_idx;
+       unsigned long flags;
+       int pos;
+
+       pos = pci_find_aer_capability(pdev->port);
+       /*
+        * Must lock access to Root Error Status Reg, Root Error ID Reg,
+        * and Root error producer/consumer index
+        */
+       spin_lock_irqsave(&rpc->e_lock, flags);
+
+       /* Read error status */
+       pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, &status);
+       if (!(status & ROOT_ERR_STATUS_MASKS)) {
+               spin_unlock_irqrestore(&rpc->e_lock, flags);
+               return IRQ_NONE;
+       }
+
+       /* Read error source and clear error status */
+       pci_read_config_dword(pdev->port, pos + PCI_ERR_ROOT_COR_SRC, &id);
+       pci_write_config_dword(pdev->port, pos + PCI_ERR_ROOT_STATUS, status);
+
+       /* Store error source for later DPC handler */
+       next_prod_idx = rpc->prod_idx + 1;
+       if (next_prod_idx == AER_ERROR_SOURCES_MAX)
+               next_prod_idx = 0;
+       if (next_prod_idx == rpc->cons_idx) {
+               /*
+                * Error Storm Condition - possibly the same error occurred.
+                * Drop the error.
+                */
+               spin_unlock_irqrestore(&rpc->e_lock, flags);
+               return IRQ_HANDLED;
+       }
+       rpc->e_sources[rpc->prod_idx].status =  status;
+       rpc->e_sources[rpc->prod_idx].id = id;
+       rpc->prod_idx = next_prod_idx;
+       spin_unlock_irqrestore(&rpc->e_lock, flags);
+
+       /*  Invoke DPC handler */
+       schedule_work(&rpc->dpc_handler);
+
+       return IRQ_HANDLED;
+}
+
+/**
+ * aer_alloc_rpc - allocate Root Port data structure
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when Root Port's AER service is loaded.
+ **/
+static struct aer_rpc* aer_alloc_rpc(struct pcie_device *dev)
+{
+       struct aer_rpc *rpc;
+
+       if (!(rpc = (struct aer_rpc *)kmalloc(sizeof(struct aer_rpc),
+               GFP_KERNEL)))
+               return NULL;
+
+       memset(rpc, 0, sizeof(struct aer_rpc));
+       /*
+        * Initialize Root lock access, e_lock, to Root Error Status Reg,
+        * Root Error ID Reg, and Root error producer/consumer index.
+        */
+       rpc->e_lock = SPIN_LOCK_UNLOCKED;
+
+       rpc->rpd = dev;
+       INIT_WORK(&rpc->dpc_handler, aer_isr, (void *)dev);
+       rpc->prod_idx = rpc->cons_idx = 0;
+       mutex_init(&rpc->rpc_mutex);
+       init_waitqueue_head(&rpc->wait_release);
+
+       /* Use PCIE bus function to store rpc into PCIE device */
+       set_service_data(dev, rpc);
+
+       return rpc;
+}
+
+/**
+ * aer_remove - clean up resources
+ * @dev: pointer to the pcie_dev data structure
+ *
+ * Invoked when PCI Express bus unloads or AER probe fails.
+ **/
+static void aer_remove(struct pcie_device *dev)
+{
+       struct aer_rpc *rpc = get_service_data(dev);
+
+       if (rpc) {
+               /* If the interrupt handler was registered, free it. */
+               if (rpc->isr)
+                       free_irq(dev->irq, dev);
+
+               wait_event(rpc->wait_release, rpc->prod_idx == rpc->cons_idx);
+
+               aer_delete_rootport(rpc);
+               set_service_data(dev, NULL);
+       }
+}
+
+/**
+ * aer_probe - initialize resources
+ * @dev: pointer to the pcie_dev data structure
+ * @id: pointer to the service id data structure
+ *
+ * Invoked when PCI Express bus loads AER service driver.
+ **/
+static int __devinit aer_probe (struct pcie_device *dev,
+                               const struct pcie_port_service_id *id )
+{
+       int status;
+       struct aer_rpc *rpc;
+       struct device *device = &dev->device;
+
+       /* Init */
+       if ((status = aer_init(dev)))
+               return status;
+
+       /* Alloc rpc data structure */
+       if (!(rpc = aer_alloc_rpc(dev))) {
+               printk(KERN_DEBUG "%s: Alloc rpc fails on PCIE device[%s]\n",
+                       __FUNCTION__, device->bus_id);
+               aer_remove(dev);
+               return -ENOMEM;
+       }
+
+       /* Request IRQ ISR */
+       if ((status = request_irq(dev->irq, aer_irq, SA_SHIRQ, "aerdrv",
+                               dev))) {
+               printk(KERN_DEBUG "%s: Request ISR fails on PCIE device[%s]\n",
+                       __FUNCTION__, device->bus_id);
+               aer_remove(dev);
+               return status;
+       }
+
+       rpc->isr = 1;
+
+       aer_enable_rootport(rpc);
+
+       return status;
+}
+
+/**
+ * aer_root_reset - reset link on Root Port
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver when performing link reset at Root Port.
+ **/
+static pci_ers_result_t aer_root_reset(struct pci_dev *dev)
+{
+       u16 p2p_ctrl;
+       u32 status;
+       int pos;
+
+       pos = pci_find_aer_capability(dev);
+
+       /* Disable Root's interrupt in response to error messages */
+       pci_write_config_dword(dev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+       /* Assert Secondary Bus Reset */
+       pci_read_config_word(dev, PCI_BRIDGE_CONTROL, &p2p_ctrl);
+       p2p_ctrl |= PCI_CB_BRIDGE_CTL_CB_RESET;
+       pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+       /* De-assert Secondary Bus Reset */
+       p2p_ctrl &= ~PCI_CB_BRIDGE_CTL_CB_RESET;
+       pci_write_config_word(dev, PCI_BRIDGE_CONTROL, p2p_ctrl);
+
+       /*
+        * System software must wait for at least 100ms from the end
+        * of a reset of one or more devices before it is permitted
+        * to issue Configuration Requests to those devices.
+        */
+       msleep(200);
+       printk(KERN_DEBUG "Complete link reset at Root[%s]\n", dev->dev.bus_id);
+
+       /* Enable Root Port's interrupt in response to error messages */
+       pci_read_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, &status);
+       pci_write_config_dword(dev, pos + PCI_ERR_ROOT_STATUS, status);
+       pci_write_config_dword(dev,
+               pos + PCI_ERR_ROOT_COMMAND,
+               ROOT_PORT_INTR_ON_MESG_MASK);
+
+       return PCI_ERS_RESULT_RECOVERED;
+}
+
+/**
+ * aer_error_detected - update severity status
+ * @dev: pointer to Root Port's pci_dev data structure
+ * @error: error severity being notified by port bus
+ *
+ * Invoked by Port Bus driver during error recovery.
+ **/
+static pci_ers_result_t aer_error_detected(struct pci_dev *dev,
+                       enum pci_channel_state error)
+{
+       /* Root Port has no impact. Always recovers. */
+       return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+/**
+ * aer_error_resume - clean up corresponding error status bits
+ * @dev: pointer to Root Port's pci_dev data structure
+ *
+ * Invoked by Port Bus driver during nonfatal recovery.
+ **/
+static void aer_error_resume(struct pci_dev *dev)
+{
+       int pos;
+       u32 status, mask;
+       u16 reg16;
+
+       /* Clean up Root device status */
+       pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+       pci_read_config_word(dev, pos + PCI_EXP_DEVSTA, &reg16);
+       pci_write_config_word(dev, pos + PCI_EXP_DEVSTA, reg16);
+
+       /* Clean AER Root Error Status */
+       pos = pci_find_aer_capability(dev);
+       pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+       pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+       if (dev->error_state == pci_channel_io_normal)
+               status &= ~mask; /* Clear corresponding nonfatal bits */
+       else
+               status &= mask; /* Clear corresponding fatal bits */
+       pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+}
+
+/**
+ * aer_service_init - register AER root service driver
+ *
+ * Invoked when AER root service driver is loaded.
+ **/
+static int __init aer_service_init(void)
+{
+       return pcie_port_service_register(&aerdrv);
+}
+
+/**
+ * aer_service_exit - unregister AER root service driver
+ *
+ * Invoked when AER root service driver is unloaded.
+ **/
+static void __exit aer_service_exit(void)
+{
+       pcie_port_service_unregister(&aerdrv);
+}
+
+module_init(aer_service_init);
+module_exit(aer_service_exit);
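
Note on the error-source queue used by aer_irq() above: the handler stores
each (status, id) pair in a fixed-size ring indexed by prod_idx and cons_idx,
and drops the new error when advancing prod_idx would make it equal to
cons_idx. The following is a minimal user-space sketch of that indexing
scheme, not the driver code. The names err_ring, ring_push, ring_pop and
RING_SLOTS are hypothetical, the consumer side is an assumption (the real
consumer is aer_isr(), which this file does not contain), and the e_lock
spinlock is left out.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 4	/* hypothetical; the driver uses AER_ERROR_SOURCES_MAX */

struct err_source {
	unsigned int status;
	unsigned int id;
};

struct err_ring {
	struct err_source slots[RING_SLOTS];
	unsigned short prod_idx;	/* next slot the producer writes */
	unsigned short cons_idx;	/* next slot the consumer reads */
};

/* Producer side, mirroring aer_irq(): returns 1 if stored, 0 if dropped. */
static int ring_push(struct err_ring *r, unsigned int status, unsigned int id)
{
	unsigned short next = r->prod_idx + 1;

	if (next == RING_SLOTS)
		next = 0;
	if (next == r->cons_idx)
		return 0;	/* full: one slot is always left unused */
	r->slots[r->prod_idx].status = status;
	r->slots[r->prod_idx].id = id;
	r->prod_idx = next;
	return 1;
}

/* Consumer side (assumed shape): returns 1 if an entry was copied to *out. */
static int ring_pop(struct err_ring *r, struct err_source *out)
{
	assert(out != NULL);
	if (r->cons_idx == r->prod_idx)
		return 0;	/* empty */
	*out = r->slots[r->cons_idx];
	if (++r->cons_idx == RING_SLOTS)
		r->cons_idx = 0;
	return 1;
}
```

Because the full test compares the advanced producer index with the consumer
index, a ring of N slots holds at most N - 1 pending errors. The same
prod_idx == cons_idx condition is what aer_remove() waits on before tearing
the root port down.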
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/aer/aerdrv.h
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/drivers/pci/pcie/aer/aerdrv.h     Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,125 @@
+/*
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@xxxxxxxxx)
+ *     Zhang Yanmin (yanmin.zhang@xxxxxxxxx)
+ *
+ */
+
+#ifndef _AERDRV_H_
+#define _AERDRV_H_
+
+#include <linux/pcieport_if.h>
+#include <linux/aer.h>
+
+#define AER_NONFATAL                   0
+#define AER_FATAL                      1
+#define AER_CORRECTABLE                        2
+#define AER_UNCORRECTABLE              4
+#define AER_ERROR_MASK                 0x001fffff
+#define AER_ERROR(d)                   ((d) & AER_ERROR_MASK)
+
+#define OSC_METHOD_RUN_SUCCESS         0
+#define OSC_METHOD_NOT_SUPPORTED       1
+#define OSC_METHOD_RUN_FAILURE         2
+
+/* Root Error Status Register Bits */
+#define ROOT_ERR_STATUS_MASKS                  0x0f
+
+#define SYSTEM_ERROR_INTR_ON_MESG_MASK (PCI_EXP_RTCTL_SECEE|   \
+                                       PCI_EXP_RTCTL_SENFEE|   \
+                                       PCI_EXP_RTCTL_SEFEE)
+#define ROOT_PORT_INTR_ON_MESG_MASK    (PCI_ERR_ROOT_CMD_COR_EN|       \
+                                       PCI_ERR_ROOT_CMD_NONFATAL_EN|   \
+                                       PCI_ERR_ROOT_CMD_FATAL_EN)
+#define ERR_COR_ID(d)                  ((d) & 0xffff)
+#define ERR_UNCOR_ID(d)                        ((d) >> 16)
+
+#define AER_SUCCESS                    0
+#define AER_UNSUCCESS                  1
+#define AER_ERROR_SOURCES_MAX          100
+
+#define AER_LOG_TLP_MASKS              (PCI_ERR_UNC_POISON_TLP|        \
+                                       PCI_ERR_UNC_ECRC|               \
+                                       PCI_ERR_UNC_UNSUP|              \
+                                       PCI_ERR_UNC_COMP_ABORT|         \
+                                       PCI_ERR_UNC_UNX_COMP|           \
+                                       PCI_ERR_UNC_MALF_TLP)
+
+/* AER Error Info Flags */
+#define AER_TLP_HEADER_VALID_FLAG      0x00000001
+#define AER_MULTI_ERROR_VALID_FLAG     0x00000002
+
+#define ERR_CORRECTABLE_ERROR_MASK     0x000031c1
+#define ERR_UNCORRECTABLE_ERROR_MASK   0x001ff010
+
+struct header_log_regs {
+       unsigned int dw0;
+       unsigned int dw1;
+       unsigned int dw2;
+       unsigned int dw3;
+};
+
+struct aer_err_info {
+       int severity;                   /* 0:NONFATAL | 1:FATAL | 2:COR */
+       int flags;
+       unsigned int status;            /* COR/UNCOR Error Status */
+       struct header_log_regs tlp;     /* TLP Header */
+};
+
+struct aer_err_source {
+       unsigned int status;
+       unsigned int id;
+};
+
+struct aer_rpc {
+       struct pcie_device *rpd;        /* Root Port device */
+       struct work_struct dpc_handler;
+       struct aer_err_source e_sources[AER_ERROR_SOURCES_MAX];
+       unsigned short prod_idx;        /* Error Producer Index */
+       unsigned short cons_idx;        /* Error Consumer Index */
+       int isr;
+       spinlock_t e_lock;              /*
+                                        * Lock access to Error Status/ID Regs
+                                        * and error producer/consumer index
+                                        */
+       struct mutex rpc_mutex;         /*
+                                        * only one thread could do
+                                        * recovery on the same
+                                        * root port hierarchy
+                                        */
+       wait_queue_head_t wait_release;
+};
+
+struct aer_broadcast_data {
+       enum pci_channel_state state;
+       enum pci_ers_result result;
+};
+
+static inline pci_ers_result_t merge_result(enum pci_ers_result orig,
+               enum pci_ers_result new)
+{
+       switch (orig) {
+       case PCI_ERS_RESULT_CAN_RECOVER:
+       case PCI_ERS_RESULT_RECOVERED:
+               orig = new;
+               break;
+       case PCI_ERS_RESULT_DISCONNECT:
+               if (new == PCI_ERS_RESULT_NEED_RESET)
+                       orig = new;
+               break;
+       default:
+               break;
+       }
+
+       return orig;
+}
+
+extern struct bus_type pcie_port_bus_type;
+extern void aer_enable_rootport(struct aer_rpc *rpc);
+extern void aer_delete_rootport(struct aer_rpc *rpc);
+extern int aer_init(struct pcie_device *dev);
+extern void aer_isr(void *context);
+extern void aer_print_error(struct pci_dev *dev, struct aer_err_info *info);
+extern int aer_osc_setup(struct pci_dev *dev);
+
+#endif /* _AERDRV_H_ */
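
Note on merge_result() above: each driver callback returns a vote, and the
votes are folded into one result as the bus is walked. The sketch below
restates the same switch in user space so the folding can be tried out. The
enum ers_result, merge_vote and fold_votes names are hypothetical stand-ins;
the real code uses enum pci_ers_result, whose numeric values are not
reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the PCI_ERS_RESULT_* values used above. */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED
};

/* Same decision table as merge_result() in aerdrv.h. */
static enum ers_result merge_vote(enum ers_result orig, enum ers_result vote)
{
	switch (orig) {
	case ERS_CAN_RECOVER:
	case ERS_RECOVERED:
		orig = vote;		/* any vote replaces these */
		break;
	case ERS_DISCONNECT:
		if (vote == ERS_NEED_RESET)
			orig = vote;	/* only a reset request overrides */
		break;
	default:
		break;			/* NEED_RESET is never replaced */
	}
	return orig;
}

/* Fold a sequence of votes, as the bus walk does one device at a time. */
static enum ers_result fold_votes(enum ers_result start,
				  const enum ers_result *votes, size_t n)
{
	size_t i;

	assert(votes != NULL || n == 0);
	for (i = 0; i < n; i++)
		start = merge_vote(start, votes[i]);
	return start;
}
```

Two properties follow from the table: once any device asks for a reset the
merged result stays at "need reset", and a "disconnect" vote can only be
replaced by a later "need reset" vote.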
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/aer/aerdrv_acpi.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/drivers/pci/pcie/aer/aerdrv_acpi.c        Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,68 @@
+/*
+ * Access ACPI _OSC method
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@xxxxxxxxx)
+ *     Zhang Yanmin (yanmin.zhang@xxxxxxxxx)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+/**
+ * aer_osc_setup - run ACPI _OSC method
+ *
+ * Return:
+ *     Zero on success, nonzero otherwise.
+ *
+ * Invoked when the PCIe bus loads the AER service driver. To avoid conflicts
+ * with BIOS AER support, the BIOS must yield AER control to the OS driver.
+ **/
+int aer_osc_setup(struct pci_dev *dev)
+{
+       int retval = OSC_METHOD_RUN_SUCCESS;
+       acpi_status status;
+       acpi_handle handle = DEVICE_ACPI_HANDLE(&dev->dev);
+       struct pci_dev *pdev = dev;
+       struct pci_bus *parent;
+
+       while (!handle) {
+               if (!pdev || !pdev->bus->parent)
+                       break;
+               parent = pdev->bus->parent;
+               if (!parent->self)
+                       /* Parent must be a host bridge */
+                       handle = acpi_get_pci_rootbridge_handle(
+                                       pci_domain_nr(parent),
+                                       parent->number);
+               else
+                       handle = DEVICE_ACPI_HANDLE(
+                                       &(parent->self->dev));
+               pdev = parent->self;
+       }
+
+       if (!handle)
+               return OSC_METHOD_NOT_SUPPORTED;
+
+       pci_osc_support_set(OSC_EXT_PCI_CONFIG_SUPPORT);
+       status = pci_osc_control_set(handle, OSC_PCI_EXPRESS_AER_CONTROL |
+               OSC_PCI_EXPRESS_CAP_STRUCTURE_CONTROL);
+       if (ACPI_FAILURE(status)) {
+               if (status == AE_SUPPORT)
+                       retval = OSC_METHOD_NOT_SUPPORTED;
+               else
+                       retval = OSC_METHOD_RUN_FAILURE;
+       }
+
+       return retval;
+}
+
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/aer/aerdrv_core.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/drivers/pci/pcie/aer/aerdrv_core.c        Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,757 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv_core.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * This file implements the core part of PCI Express AER. When a PCI Express
+ * error is delivered, an error message is collected and printed to the
+ * console; an error recovery procedure is then executed according to the
+ * PCI error recovery rules.
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@xxxxxxxxx)
+ *     Zhang Yanmin (yanmin.zhang@xxxxxxxxx)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+#include <linux/acpi.h>
+#include <linux/pci-acpi.h>
+#include <linux/delay.h>
+#include "aerdrv.h"
+
+static int forceload;
+module_param(forceload, bool, 0);
+
+#define PCI_CFG_SPACE_SIZE     (0x100)
+int pci_find_aer_capability(struct pci_dev *dev)
+{
+       int pos;
+       u32 reg32 = 0;
+
+       /* Check if it's a pci-express device */
+       pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+       if (!pos)
+               return 0;
+
+       /* Check if it supports pci-express AER */
+       pos = PCI_CFG_SPACE_SIZE;
+       while (pos) {
+               if (pci_read_config_dword(dev, pos, &reg32))
+                       return 0;
+
+               /* some broken boards return ~0 */
+               if (reg32 == 0xffffffff)
+                       return 0;
+
+               if (PCI_EXT_CAP_ID(reg32) == PCI_EXT_CAP_ID_ERR)
+                       break;
+
+               pos = reg32 >> 20;
+       }
+
+       return pos;
+}
+
+int pci_enable_pcie_error_reporting(struct pci_dev *dev)
+{
+       u16 reg16 = 0;
+       int pos;
+
+       pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+       if (!pos)
+               return -EIO;
+
+       pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+       reg16 = reg16 |
+               PCI_EXP_DEVCTL_CERE |
+               PCI_EXP_DEVCTL_NFERE |
+               PCI_EXP_DEVCTL_FERE |
+               PCI_EXP_DEVCTL_URRE;
+       pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+                       reg16);
+       return 0;
+}
+
+int pci_disable_pcie_error_reporting(struct pci_dev *dev)
+{
+       u16 reg16 = 0;
+       int pos;
+
+       pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
+       if (!pos)
+               return -EIO;
+
+       pci_read_config_word(dev, pos+PCI_EXP_DEVCTL, &reg16);
+       reg16 = reg16 & ~(PCI_EXP_DEVCTL_CERE |
+                       PCI_EXP_DEVCTL_NFERE |
+                       PCI_EXP_DEVCTL_FERE |
+                       PCI_EXP_DEVCTL_URRE);
+       pci_write_config_word(dev, pos+PCI_EXP_DEVCTL,
+                       reg16);
+       return 0;
+}
+
+int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
+{
+       int pos;
+       u32 status, mask;
+
+       pos = pci_find_aer_capability(dev);
+       if (!pos)
+               return -EIO;
+
+       pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, &status);
+       pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_SEVER, &mask);
+       if (dev->error_state == pci_channel_io_normal)
+               status &= ~mask; /* Clear corresponding nonfatal bits */
+       else
+               status &= mask; /* Clear corresponding fatal bits */
+       pci_write_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS, status);
+
+       return 0;
+}
+
+static int find_device_iter(struct device *device, void *data)
+{
+       struct pci_dev *dev;
+       u16 id = *(unsigned long *)data;
+       u8 secondary, subordinate, d_bus = id >> 8;
+
+       if (device->bus == &pci_bus_type) {
+               dev = to_pci_dev(device);
+               if (id == ((dev->bus->number << 8) | dev->devfn)) {
+                       /*
+                        * Device ID match
+                        */
+                       *(unsigned long*)data = (unsigned long)device;
+                       return 1;
+               }
+
+               /*
+                * If device is a P2P bridge, is it upstream of the error source?
+                */
+               if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+                       pci_read_config_byte(dev, PCI_SECONDARY_BUS,
+                               &secondary);
+                       pci_read_config_byte(dev, PCI_SUBORDINATE_BUS,
+                               &subordinate);
+                       if (d_bus >= secondary && d_bus <= subordinate) {
+                               *(unsigned long*)data = (unsigned long)device;
+                               return 1;
+                       }
+               }
+       }
+
+       return 0;
+}
+
+/**
+ * find_source_device - search through device hierarchy for source device
+ * @parent: pointer to Root Port pci_dev data structure
+ * @id: device ID of agent who sends an error message to this Root Port
+ *
+ * Invoked when error is detected at the Root Port.
+ **/
+static struct device* find_source_device(struct pci_dev *parent, u16 id)
+{
+       struct pci_dev *dev = parent;
+       struct device *device;
+       unsigned long device_addr;
+       int status;
+
+       /* Is Root Port an agent that sends error message? */
+       if (id == ((dev->bus->number << 8) | dev->devfn))
+               return &dev->dev;
+
+       do {
+               device_addr = id;
+               if ((status = device_for_each_child(&dev->dev,
+                       &device_addr, find_device_iter))) {
+                       device = (struct device*)device_addr;
+                       dev = to_pci_dev(device);
+                       if (id == ((dev->bus->number << 8) | dev->devfn))
+                               return device;
+               }
+       } while (status);
+
+       return NULL;
+}
+
+static void report_error_detected(struct pci_dev *dev, void *data)
+{
+       pci_ers_result_t vote;
+       struct pci_error_handlers *err_handler;
+       struct aer_broadcast_data *result_data;
+       result_data = (struct aer_broadcast_data *) data;
+
+       dev->error_state = result_data->state;
+
+       if (!dev->driver ||
+               !dev->driver->err_handler ||
+               !dev->driver->err_handler->error_detected) {
+               if (result_data->state == pci_channel_io_frozen &&
+                       !(dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)) {
+                       /*
+                        * During fatal recovery, if a downstream
+                        * device has no driver, we might be unable
+                        * to recover, because a driver loaded later
+                        * for this device would be unaware of its
+                        * hardware state.
+                        */
+                       printk(KERN_DEBUG "Device ID[%s] has %s\n",
+                                       dev->dev.bus_id, (dev->driver) ?
+                                       "no AER-aware driver" : "no driver");
+               }
+               return;
+       }
+
+       err_handler = dev->driver->err_handler;
+       vote = err_handler->error_detected(dev, result_data->state);
+       result_data->result = merge_result(result_data->result, vote);
+       return;
+}
+
+static void report_mmio_enabled(struct pci_dev *dev, void *data)
+{
+       pci_ers_result_t vote;
+       struct pci_error_handlers *err_handler;
+       struct aer_broadcast_data *result_data;
+       result_data = (struct aer_broadcast_data *) data;
+
+       if (!dev->driver ||
+               !dev->driver->err_handler ||
+               !dev->driver->err_handler->mmio_enabled)
+               return;
+
+       err_handler = dev->driver->err_handler;
+       vote = err_handler->mmio_enabled(dev);
+       result_data->result = merge_result(result_data->result, vote);
+       return;
+}
+
+static void report_slot_reset(struct pci_dev *dev, void *data)
+{
+       pci_ers_result_t vote;
+       struct pci_error_handlers *err_handler;
+       struct aer_broadcast_data *result_data;
+       result_data = (struct aer_broadcast_data *) data;
+
+       if (!dev->driver ||
+               !dev->driver->err_handler ||
+               !dev->driver->err_handler->slot_reset)
+               return;
+
+       err_handler = dev->driver->err_handler;
+       vote = err_handler->slot_reset(dev);
+       result_data->result = merge_result(result_data->result, vote);
+       return;
+}
+
+static void report_resume(struct pci_dev *dev, void *data)
+{
+       struct pci_error_handlers *err_handler;
+
+       dev->error_state = pci_channel_io_normal;
+
+       if (!dev->driver ||
+               !dev->driver->err_handler ||
+               !dev->driver->err_handler->resume)
+               return;
+
+       err_handler = dev->driver->err_handler;
+       err_handler->resume(dev);
+       return;
+}
+
+/**
+ * broadcast_error_message - handle message broadcast to downstream drivers
+ * @dev: device in the hierarchy from which the message is broadcast down
+ * @state: error state
+ * @error_mesg: name of the callback being broadcast, used for logging
+ * @cb: callback invoked for each device reached by the broadcast
+ *
+ * Invoked during error recovery. The error state is broadcast to all
+ * downstream drivers in the hierarchy in question.
+ **/
+static pci_ers_result_t broadcast_error_message(struct pci_dev *dev,
+       enum pci_channel_state state,
+       char *error_mesg,
+       void (*cb)(struct pci_dev *, void *))
+{
+       struct aer_broadcast_data result_data;
+
+       printk(KERN_DEBUG "Broadcast %s message\n", error_mesg);
+       result_data.state = state;
+       if (cb == report_error_detected)
+               result_data.result = PCI_ERS_RESULT_CAN_RECOVER;
+       else
+               result_data.result = PCI_ERS_RESULT_RECOVERED;
+
+       if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE) {
+               /*
+                * If the error is reported by a bridge, we think this error
+                * is related to the downstream link of the bridge, so we
+                * do error recovery on all subordinates of the bridge instead
+                * of the bridge and clear the error status of the bridge.
+                */
+               if (cb == report_error_detected)
+                       dev->error_state = state;
+               pci_walk_bus(dev->subordinate, cb, &result_data);
+               if (cb == report_resume) {
+                       pci_cleanup_aer_uncorrect_error_status(dev);
+                       dev->error_state = pci_channel_io_normal;
+               }
+       }
+       else {
+               /*
+                * If the error is reported by an end point, we think this
+                * error is related to the upstream link of the end point.
+                */
+               pci_walk_bus(dev->bus, cb, &result_data);
+       }
+
+       return result_data.result;
+}
+
+struct find_aer_service_data {
+       struct pcie_port_service_driver *aer_driver;
+       int is_downstream;
+};
+
+static int find_aer_service_iter(struct device *device, void *data)
+{
+       struct device_driver *driver;
+       struct pcie_port_service_driver *service_driver;
+       struct pcie_device *pcie_dev;
+       struct find_aer_service_data *result;
+
+       result = (struct find_aer_service_data *) data;
+
+       if (device->bus == &pcie_port_bus_type) {
+               pcie_dev = to_pcie_device(device);
+               if (pcie_dev->id.port_type == PCIE_SW_DOWNSTREAM_PORT)
+                       result->is_downstream = 1;
+
+               driver = device->driver;
+               if (driver) {
+                       service_driver = to_service_driver(driver);
+                       if (service_driver->id_table->service_type ==
+                                       PCIE_PORT_SERVICE_AER) {
+                               result->aer_driver = service_driver;
+                               return 1;
+                       }
+               }
+       }
+
+       return 0;
+}
+
+static void find_aer_service(struct pci_dev *dev,
+               struct find_aer_service_data *data)
+{
+       device_for_each_child(&dev->dev, data, find_aer_service_iter);
+}
+
+static pci_ers_result_t reset_link(struct pcie_device *aerdev,
+               struct pci_dev *dev)
+{
+       struct pci_dev *udev;
+       pci_ers_result_t status;
+       struct find_aer_service_data data;
+
+       if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE)
+               udev = dev;
+       else
+               udev = dev->bus->self;
+
+       data.is_downstream = 0;
+       data.aer_driver = NULL;
+       find_aer_service(udev, &data);
+
+       /*
+        * Use the AER driver of the error agent first.
+        * If it has no AER driver, use the Root Port's.
+        */
+       if (!data.aer_driver || !data.aer_driver->reset_link) {
+               if (data.is_downstream &&
+                       aerdev->device.driver &&
+                       to_service_driver(aerdev->device.driver)->reset_link) {
+                       data.aer_driver =
+                               to_service_driver(aerdev->device.driver);
+               } else {
+                       printk(KERN_DEBUG "No link-reset support to Device ID"
+                               "[%s]\n",
+                               dev->dev.bus_id);
+                       return PCI_ERS_RESULT_DISCONNECT;
+               }
+       }
+
+       status = data.aer_driver->reset_link(udev);
+       if (status != PCI_ERS_RESULT_RECOVERED) {
+               printk(KERN_DEBUG "Link reset at upstream Device ID"
+                       "[%s] failed\n",
+                       udev->dev.bus_id);
+               return PCI_ERS_RESULT_DISCONNECT;
+       }
+
+       return status;
+}
+
+/**
+ * do_recovery - handle nonfatal/fatal error recovery process
+ * @aerdev: pointer to a pcie_device data structure of root port
+ * @dev: pointer to a pci_dev data structure of agent detecting an error
+ * @severity: error severity type
+ *
+ * Invoked when an error is nonfatal/fatal. Once invoked, broadcasts the
+ * error_detected message to all downstream drivers within the hierarchy in
+ * question and returns the resulting status code.
+ **/
+static pci_ers_result_t do_recovery(struct pcie_device *aerdev,
+               struct pci_dev *dev,
+               int severity)
+{
+       pci_ers_result_t status, result = PCI_ERS_RESULT_RECOVERED;
+       enum pci_channel_state state;
+
+       if (severity == AER_FATAL)
+               state = pci_channel_io_frozen;
+       else
+               state = pci_channel_io_normal;
+
+       status = broadcast_error_message(dev,
+                       state,
+                       "error_detected",
+                       report_error_detected);
+
+       if (severity == AER_FATAL) {
+               result = reset_link(aerdev, dev);
+               if (result != PCI_ERS_RESULT_RECOVERED) {
+                       /* TODO: Should panic here? */
+                       return result;
+               }
+       }
+
+       if (status == PCI_ERS_RESULT_CAN_RECOVER)
+               status = broadcast_error_message(dev,
+                               state,
+                               "mmio_enabled",
+                               report_mmio_enabled);
+
+       if (status == PCI_ERS_RESULT_NEED_RESET) {
+               /*
+                * TODO: Should call platform-specific
+                * functions to reset slot before calling
+                * drivers' slot_reset callbacks?
+                */
+               status = broadcast_error_message(dev,
+                               state,
+                               "slot_reset",
+                               report_slot_reset);
+       }
+
+       if (status == PCI_ERS_RESULT_RECOVERED)
+               broadcast_error_message(dev,
+                               state,
+                               "resume",
+                               report_resume);
+
+       return status;
+}
+
+/**
+ * handle_error_source - handle logging error into an event log
+ * @aerdev: pointer to pcie_device data structure of the root port
+ * @dev: pointer to pci_dev data structure of error source device
+ * @info: comprehensive error information
+ *
+ * Invoked when an error is detected by the Root Port.
+ **/
+static void handle_error_source(struct pcie_device * aerdev,
+       struct pci_dev *dev,
+       struct aer_err_info info)
+{
+       pci_ers_result_t status = 0;
+       int pos;
+
+       if (info.severity == AER_CORRECTABLE) {
+               /*
+                * Correctable errors do not need software intervention.
+                * No need to go through error recovery process.
+                */
+               pos = pci_find_aer_capability(dev);
+               if (pos)
+                       pci_write_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+                                       info.status);
+       } else {
+               status = do_recovery(aerdev, dev, info.severity);
+               if (status == PCI_ERS_RESULT_RECOVERED) {
+                       printk(KERN_DEBUG "AER driver successfully recovered\n");
+               } else {
+                       /* TODO: Should kernel panic here? */
+                       printk(KERN_DEBUG "AER driver didn't recover\n");
+               }
+       }
+}
+
+/**
+ * aer_enable_rootport - enable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus loads AER service driver.
+ **/
+void aer_enable_rootport(struct aer_rpc *rpc)
+{
+       struct pci_dev *pdev = rpc->rpd->port;
+       int pos, aer_pos;
+       u16 reg16;
+       u32 reg32;
+
+       pos = pci_find_capability(pdev, PCI_CAP_ID_EXP);
+       /* Clear PCIE Capability's Device Status */
+       pci_read_config_word(pdev, pos+PCI_EXP_DEVSTA, &reg16);
+       pci_write_config_word(pdev, pos+PCI_EXP_DEVSTA, reg16);
+
+       /* Disable system error generation in response to error messages */
+       pci_read_config_word(pdev, pos + PCI_EXP_RTCTL, &reg16);
+       reg16 &= ~(SYSTEM_ERROR_INTR_ON_MESG_MASK);
+       pci_write_config_word(pdev, pos + PCI_EXP_RTCTL, reg16);
+
+       aer_pos = pci_find_aer_capability(pdev);
+       /* Clear error status */
+       pci_read_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, &reg32);
+       pci_write_config_dword(pdev, aer_pos + PCI_ERR_ROOT_STATUS, reg32);
+       pci_read_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, &reg32);
+       pci_write_config_dword(pdev, aer_pos + PCI_ERR_COR_STATUS, reg32);
+       pci_read_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, &reg32);
+       pci_write_config_dword(pdev, aer_pos + PCI_ERR_UNCOR_STATUS, reg32);
+
+       /* Enable Root Port device reporting error itself */
+       pci_read_config_word(pdev, pos+PCI_EXP_DEVCTL, &reg16);
+       reg16 = reg16 |
+               PCI_EXP_DEVCTL_CERE |
+               PCI_EXP_DEVCTL_NFERE |
+               PCI_EXP_DEVCTL_FERE |
+               PCI_EXP_DEVCTL_URRE;
+       pci_write_config_word(pdev, pos+PCI_EXP_DEVCTL,
+               reg16);
+
+       /* Enable Root Port's interrupt in response to error messages */
+       pci_write_config_dword(pdev,
+               aer_pos + PCI_ERR_ROOT_COMMAND,
+               ROOT_PORT_INTR_ON_MESG_MASK);
+}
+
+/**
+ * disable_root_aer - disable Root Port's interrupts when receiving messages
+ * @rpc: pointer to a Root Port data structure
+ *
+ * Invoked when PCIE bus unloads AER service driver.
+ **/
+static void disable_root_aer(struct aer_rpc *rpc)
+{
+       struct pci_dev *pdev = rpc->rpd->port;
+       u32 reg32;
+       int pos;
+
+       pos = pci_find_aer_capability(pdev);
+       /* Disable Root's interrupt in response to error messages */
+       pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_COMMAND, 0);
+
+       /* Clear Root's error status reg */
+       pci_read_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, &reg32);
+       pci_write_config_dword(pdev, pos + PCI_ERR_ROOT_STATUS, reg32);
+}
+
+/**
+ * get_e_source - retrieve an error source
+ * @rpc: pointer to the root port which holds an error
+ *
+ * Invoked by DPC handler to consume an error.
+ **/
+static struct aer_err_source* get_e_source(struct aer_rpc *rpc)
+{
+       struct aer_err_source *e_source;
+       unsigned long flags;
+
+       /* Lock access to Root error producer/consumer index */
+       spin_lock_irqsave(&rpc->e_lock, flags);
+       if (rpc->prod_idx == rpc->cons_idx) {
+               spin_unlock_irqrestore(&rpc->e_lock, flags);
+               return NULL;
+       }
+       e_source = &rpc->e_sources[rpc->cons_idx];
+       rpc->cons_idx++;
+       if (rpc->cons_idx == AER_ERROR_SOURCES_MAX)
+               rpc->cons_idx = 0;
+       spin_unlock_irqrestore(&rpc->e_lock, flags);
+
+       return e_source;
+}
+
+static int get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
+{
+       int pos;
+
+       pos = pci_find_aer_capability(dev);
+
+       /* The device might not support AER */
+       if (!pos)
+               return AER_SUCCESS;
+
+       if (info->severity == AER_CORRECTABLE) {
+               pci_read_config_dword(dev, pos + PCI_ERR_COR_STATUS,
+                       &info->status);
+               if (!(info->status & ERR_CORRECTABLE_ERROR_MASK))
+                       return AER_UNSUCCESS;
+       } else if (dev->hdr_type & PCI_HEADER_TYPE_BRIDGE ||
+               info->severity == AER_NONFATAL) {
+
+               /* Link is still healthy for IO reads */
+               pci_read_config_dword(dev, pos + PCI_ERR_UNCOR_STATUS,
+                       &info->status);
+               if (!(info->status & ERR_UNCORRECTABLE_ERROR_MASK))
+                       return AER_UNSUCCESS;
+
+               if (info->status & AER_LOG_TLP_MASKS) {
+                       info->flags |= AER_TLP_HEADER_VALID_FLAG;
+                       pci_read_config_dword(dev,
+                               pos + PCI_ERR_HEADER_LOG, &info->tlp.dw0);
+                       pci_read_config_dword(dev,
+                               pos + PCI_ERR_HEADER_LOG + 4, &info->tlp.dw1);
+                       pci_read_config_dword(dev,
+                               pos + PCI_ERR_HEADER_LOG + 8, &info->tlp.dw2);
+                       pci_read_config_dword(dev,
+                               pos + PCI_ERR_HEADER_LOG + 12, &info->tlp.dw3);
+               }
+       }
+
+       return AER_SUCCESS;
+}
+
+/**
+ * aer_isr_one_error - consume an error detected by root port
+ * @p_device: pointer to error root port service device
+ * @e_src: pointer to an error source
+ **/
+static void aer_isr_one_error(struct pcie_device *p_device,
+               struct aer_err_source *e_src)
+{
+       struct device *s_device;
+       struct aer_err_info e_info = {0, 0, 0,};
+       int i;
+       u16 id;
+
+       /*
+        * Both a correctable and an uncorrectable error may have been
+        * logged. Report the correctable error first.
+        */
+       for (i = 1; i & ROOT_ERR_STATUS_MASKS ; i <<= 2) {
+               if (i > 4)
+                       break;
+               if (!(e_src->status & i))
+                       continue;
+
+               /* Init comprehensive error information */
+               if (i & PCI_ERR_ROOT_COR_RCV) {
+                       id = ERR_COR_ID(e_src->id);
+                       e_info.severity = AER_CORRECTABLE;
+               } else {
+                       id = ERR_UNCOR_ID(e_src->id);
+                       e_info.severity = ((e_src->status >> 6) & 1);
+               }
+               if (e_src->status &
+                       (PCI_ERR_ROOT_MULTI_COR_RCV |
+                        PCI_ERR_ROOT_MULTI_UNCOR_RCV))
+                       e_info.flags |= AER_MULTI_ERROR_VALID_FLAG;
+               if (!(s_device = find_source_device(p_device->port, id))) {
+                       printk(KERN_DEBUG "%s->can't find device of ID%04x\n",
+                               __FUNCTION__, id);
+                       continue;
+               }
+               if (get_device_error_info(to_pci_dev(s_device), &e_info) ==
+                               AER_SUCCESS) {
+                       aer_print_error(to_pci_dev(s_device), &e_info);
+                       handle_error_source(p_device,
+                               to_pci_dev(s_device),
+                               e_info);
+               }
+       }
+}
+
+/**
+ * aer_isr - consume errors detected by root port
+ * @context: pointer to a private data of pcie device
+ *
+ * Invoked, as DPC, when root port records new detected error
+ **/
+void aer_isr(void *context)
+{
+       struct pcie_device *p_device = (struct pcie_device *) context;
+       struct aer_rpc *rpc = get_service_data(p_device);
+       struct aer_err_source *e_src;
+
+       mutex_lock(&rpc->rpc_mutex);
+       e_src = get_e_source(rpc);
+       while (e_src) {
+               aer_isr_one_error(p_device, e_src);
+               e_src = get_e_source(rpc);
+       }
+       mutex_unlock(&rpc->rpc_mutex);
+
+       wake_up(&rpc->wait_release);
+}
+
+/**
+ * aer_delete_rootport - disable root port aer and delete service data
+ * @rpc: pointer to a root port device being deleted
+ *
+ * Invoked when AER service unloaded on a specific Root Port
+ **/
+void aer_delete_rootport(struct aer_rpc *rpc)
+{
+       /* Disable root port AER itself */
+       disable_root_aer(rpc);
+
+       kfree(rpc);
+}
+
+/**
+ * aer_init - provide AER initialization
+ * @dev: pointer to AER pcie device
+ *
+ * Invoked when AER service driver is loaded.
+ **/
+int aer_init(struct pcie_device *dev)
+{
+       int status;
+
+       /* Run _OSC Method */
+       status = aer_osc_setup(dev->port);
+
+       if (status != OSC_METHOD_RUN_SUCCESS) {
+               printk(KERN_DEBUG "%s: AER service init fails - %s\n",
+               __FUNCTION__,
+               (status == OSC_METHOD_NOT_SUPPORTED) ?
+                       "No ACPI _OSC support" : "Run ACPI _OSC fails");
+
+               if (!forceload)
+                       return status;
+       }
+
+       return AER_SUCCESS;
+}
+
+EXPORT_SYMBOL_GPL(pci_find_aer_capability);
+EXPORT_SYMBOL_GPL(pci_enable_pcie_error_reporting);
+EXPORT_SYMBOL_GPL(pci_disable_pcie_error_reporting);
+EXPORT_SYMBOL_GPL(pci_cleanup_aer_uncorrect_error_status);
+
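The staging in do_recovery() above can be followed outside the kernel with a small user-space sketch. It is not part of the patch: the enum, the recovery_ops structure and the stub callbacks are hypothetical stand-ins for pci_ers_result_t, broadcast_error_message() and reset_link(), and only the order of the stages is taken from the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's pci_ers_result_t values. */
enum ers_result {
	ERS_NONE = 1,
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED
};

/* Hypothetical callbacks, one per broadcast stage plus the link reset. */
struct recovery_ops {
	enum ers_result (*error_detected)(void);
	enum ers_result (*reset_link)(void);
	enum ers_result (*mmio_enabled)(void);
	enum ers_result (*slot_reset)(void);
	void (*resume)(void);
};

/* Same stage ordering as do_recovery(): detect, reset link if fatal,
 * then mmio_enabled, slot_reset and resume depending on the status. */
enum ers_result run_recovery(const struct recovery_ops *ops, int fatal)
{
	enum ers_result status, result;

	assert(ops != NULL);

	status = ops->error_detected();
	if (fatal) {
		result = ops->reset_link();
		if (result != ERS_RECOVERED)
			return result;	/* link reset failed: give up early */
	}
	if (status == ERS_CAN_RECOVER)
		status = ops->mmio_enabled();
	if (status == ERS_NEED_RESET)
		status = ops->slot_reset();
	if (status == ERS_RECOVERED)
		ops->resume();
	return status;
}

/* Stub drivers used only to exercise the sketch. */
int resume_calls;
static enum ers_result say_can_recover(void) { return ERS_CAN_RECOVER; }
static enum ers_result say_need_reset(void) { return ERS_NEED_RESET; }
static enum ers_result say_recovered(void) { return ERS_RECOVERED; }
static enum ers_result say_disconnect(void) { return ERS_DISCONNECT; }
static void count_resume(void) { resume_calls++; }

/* Nonfatal error that walks through every stage and ends in resume. */
enum ers_result demo_nonfatal(void)
{
	struct recovery_ops ops = {
		say_can_recover, say_recovered, say_need_reset,
		say_recovered, count_resume
	};
	return run_recovery(&ops, 0);
}

/* Fatal error whose link reset fails: no later stage runs. */
enum ers_result demo_fatal_link_down(void)
{
	struct recovery_ops ops = {
		say_can_recover, say_disconnect, say_need_reset,
		say_recovered, count_resume
	};
	return run_recovery(&ops, 1);
}
```

In this sketch demo_nonfatal() passes through mmio_enabled and slot_reset before resume is called once, while demo_fatal_link_down() returns the failed link-reset result without calling resume, which matches the early return in the fatal branch above.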
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/aer/aerdrv_errprint.c
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/drivers/pci/pcie/aer/aerdrv_errprint.c    Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,248 @@
+/*
+ * drivers/pci/pcie/aer/aerdrv_errprint.c
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Format error messages and print them to console.
+ *
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@xxxxxxxxx)
+ *     Zhang Yanmin (yanmin.zhang@xxxxxxxxx)
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/pm.h>
+#include <linux/suspend.h>
+
+#include "aerdrv.h"
+
+#define AER_AGENT_RECEIVER             0
+#define AER_AGENT_REQUESTER            1
+#define AER_AGENT_COMPLETER            2
+#define AER_AGENT_TRANSMITTER          3
+
+#define AER_AGENT_REQUESTER_MASK       (PCI_ERR_UNC_COMP_TIME| \
+                                       PCI_ERR_UNC_UNSUP)
+
+#define AER_AGENT_COMPLETER_MASK       PCI_ERR_UNC_COMP_ABORT
+
+#define AER_AGENT_TRANSMITTER_MASK(t, e) (e & (PCI_ERR_COR_REP_ROLL| \
+       ((t == AER_CORRECTABLE) ? PCI_ERR_COR_REP_TIMER: 0)))
+
+#define AER_GET_AGENT(t, e)                                            \
+       ((e & AER_AGENT_COMPLETER_MASK) ? AER_AGENT_COMPLETER :         \
+       (e & AER_AGENT_REQUESTER_MASK) ? AER_AGENT_REQUESTER :          \
+       (AER_AGENT_TRANSMITTER_MASK(t, e)) ? AER_AGENT_TRANSMITTER :    \
+       AER_AGENT_RECEIVER)
+
+#define AER_PHYSICAL_LAYER_ERROR_MASK  PCI_ERR_COR_RCVR
+#define AER_DATA_LINK_LAYER_ERROR_MASK(t, e)   \
+               (PCI_ERR_UNC_DLP|               \
+               PCI_ERR_COR_BAD_TLP|            \
+               PCI_ERR_COR_BAD_DLLP|           \
+               PCI_ERR_COR_REP_ROLL|           \
+               ((t == AER_CORRECTABLE) ?       \
+               PCI_ERR_COR_REP_TIMER: 0))
+
+#define AER_PHYSICAL_LAYER_ERROR       0
+#define AER_DATA_LINK_LAYER_ERROR      1
+#define AER_TRANSACTION_LAYER_ERROR    2
+
+#define AER_GET_LAYER_ERROR(t, e)                              \
+       ((e & AER_PHYSICAL_LAYER_ERROR_MASK) ?                  \
+       AER_PHYSICAL_LAYER_ERROR :                              \
+       (e & AER_DATA_LINK_LAYER_ERROR_MASK(t, e)) ?            \
+               AER_DATA_LINK_LAYER_ERROR :                     \
+               AER_TRANSACTION_LAYER_ERROR)
+
+/*
+ * AER error strings
+ */
+static char* aer_error_severity_string[] = {
+       "Uncorrected (Non-Fatal)",
+       "Uncorrected (Fatal)",
+       "Corrected"
+};
+
+static char* aer_error_layer[] = {
+       "Physical Layer",
+       "Data Link Layer",
+       "Transaction Layer"
+};
+static char* aer_correctable_error_string[] = {
+       "Receiver Error        ",       /* Bit Position 0       */
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       "Bad TLP               ",       /* Bit Position 6       */
+       "Bad DLLP              ",       /* Bit Position 7       */
+       "REPLAY_NUM Rollover   ",       /* Bit Position 8       */
+       NULL,
+       NULL,
+       NULL,
+       "Replay Timer Timeout  ",       /* Bit Position 12      */
+       "Advisory Non-Fatal    ",       /* Bit Position 13      */
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+};
+
+static char* aer_uncorrectable_error_string[] = {
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       "Data Link Protocol    ",       /* Bit Position 4       */
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       "Poisoned TLP          ",       /* Bit Position 12      */
+       "Flow Control Protocol ",       /* Bit Position 13      */
+       "Completion Timeout    ",       /* Bit Position 14      */
+       "Completer Abort       ",       /* Bit Position 15      */
+       "Unexpected Completion ",       /* Bit Position 16      */
+       "Receiver Overflow     ",       /* Bit Position 17      */
+       "Malformed TLP         ",       /* Bit Position 18      */
+       "ECRC                  ",       /* Bit Position 19      */
+       "Unsupported Request   ",       /* Bit Position 20      */
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+       NULL,
+};
+
+static char* aer_agent_string[] = {
+       "Receiver ID",
+       "Requester ID",
+       "Completer ID",
+       "Transmitter ID"
+};
+
+static char * aer_get_error_source_name(int severity,
+                       unsigned int status,
+                       char errmsg_buff[])
+{
+       int i;
+       char * errmsg = NULL;
+
+       for (i = 0; i < 32; i++) {
+               if (!(status & (1 << i)))
+                       continue;
+
+               if (severity == AER_CORRECTABLE)
+                       errmsg = aer_correctable_error_string[i];
+               else
+                       errmsg = aer_uncorrectable_error_string[i];
+
+               if (!errmsg) {
+                       sprintf(errmsg_buff, "Unknown Error Bit %2d  ", i);
+                       errmsg = errmsg_buff;
+               }
+
+               break;
+       }
+
+       return errmsg;
+}
+
+static DEFINE_SPINLOCK(logbuf_lock);
+static char errmsg_buff[100];
+void aer_print_error(struct pci_dev *dev, struct aer_err_info *info)
+{
+       char * errmsg;
+       int err_layer, agent;
+       char * loglevel;
+
+       if (info->severity == AER_CORRECTABLE)
+               loglevel = KERN_WARNING;
+       else
+               loglevel = KERN_ERR;
+
+       printk("%s+------ PCI-Express Device Error ------+\n", loglevel);
+       printk("%sError Severity\t\t: %s\n", loglevel,
+               aer_error_severity_string[info->severity]);
+
+       if (info->status == 0) {
+               printk("%sPCIE Bus Error type\t: (Unaccessible)\n", loglevel);
+               printk("%sUnaccessible Received\t: %s\n", loglevel,
+                       info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+                               "Multiple" : "First");
+               printk("%sUnregistered Agent ID\t: %04x\n", loglevel,
+                       (dev->bus->number << 8) | dev->devfn);
+       } else {
+               err_layer = AER_GET_LAYER_ERROR(info->severity, info->status);
+               printk("%sPCIE Bus Error type\t: %s\n", loglevel,
+                       aer_error_layer[err_layer]);
+
+               spin_lock(&logbuf_lock);
+               errmsg = aer_get_error_source_name(info->severity,
+                               info->status,
+                               errmsg_buff);
+               printk("%s%s\t: %s\n", loglevel, errmsg,
+                       info->flags & AER_MULTI_ERROR_VALID_FLAG ?
+                               "Multiple" : "First");
+               spin_unlock(&logbuf_lock);
+
+               agent = AER_GET_AGENT(info->severity, info->status);
+               printk("%s%s\t\t: %04x\n", loglevel,
+                       aer_agent_string[agent],
+                       (dev->bus->number << 8) | dev->devfn);
+
+               printk("%sVendorID=%04xh, DeviceID=%04xh,"
+                       " Bus=%02xh, Device=%02xh, Function=%02xh\n",
+                       loglevel,
+                       dev->vendor,
+                       dev->device,
+                       dev->bus->number,
+                       PCI_SLOT(dev->devfn),
+                       PCI_FUNC(dev->devfn));
+
+               if (info->flags & AER_TLP_HEADER_VALID_FLAG) {
+                       unsigned char *tlp = (unsigned char *) &info->tlp;
+                       printk("%sTLP Header:\n", loglevel);
+                       printk("%s%02x%02x%02x%02x %02x%02x%02x%02x"
+                               " %02x%02x%02x%02x %02x%02x%02x%02x\n",
+                               loglevel,
+                               *(tlp + 3), *(tlp + 2), *(tlp + 1), *tlp,
+                               *(tlp + 7), *(tlp + 6), *(tlp + 5), *(tlp + 4),
+                               *(tlp + 11), *(tlp + 10), *(tlp + 9),
+                               *(tlp + 8), *(tlp + 15), *(tlp + 14),
+                               *(tlp + 13), *(tlp + 12));
+               }
+       }
+}
+
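The layer classification done by AER_GET_LAYER_ERROR() in aerdrv_errprint.c can be written as a plain function, which makes the role of the severity argument easier to see. This is a user-space sketch, not part of the patch: the names classify_layer, enum layer and enum severity are hypothetical, the bit values are the ones pci_regs.h assigns to the corresponding PCI_ERR_COR_* and PCI_ERR_UNC_* constants, and the severity numbering follows the order of aer_error_severity_string[].

```c
#include <assert.h>
#include <stdint.h>

/* Bit values of the AER status bits used by the layer masks. */
#define ERR_COR_RCVR		0x00000001u	/* Receiver Error */
#define ERR_COR_BAD_TLP		0x00000040u	/* Bad TLP */
#define ERR_COR_BAD_DLLP	0x00000080u	/* Bad DLLP */
#define ERR_COR_REP_ROLL	0x00000100u	/* REPLAY_NUM Rollover */
#define ERR_COR_REP_TIMER	0x00001000u	/* Replay Timer Timeout */
#define ERR_UNC_DLP		0x00000010u	/* Data Link Protocol */

/* Hypothetical names; order matches the string tables in the patch. */
enum layer { LAYER_PHYSICAL, LAYER_DATA_LINK, LAYER_TRANSACTION };
enum severity { SEV_NONFATAL, SEV_FATAL, SEV_CORRECTABLE };

enum layer classify_layer(enum severity sev, uint32_t status)
{
	uint32_t dll_mask = ERR_UNC_DLP | ERR_COR_BAD_TLP |
			    ERR_COR_BAD_DLLP | ERR_COR_REP_ROLL;

	assert(status != 0);	/* aer_print_error() handles 0 separately */

	/*
	 * Bit 12 means Replay Timer Timeout in the correctable status
	 * register but Poisoned TLP in the uncorrectable one, so it only
	 * counts as a data link layer error for correctable severity.
	 */
	if (sev == SEV_CORRECTABLE)
		dll_mask |= ERR_COR_REP_TIMER;

	if (status & ERR_COR_RCVR)
		return LAYER_PHYSICAL;
	if (status & dll_mask)
		return LAYER_DATA_LINK;
	return LAYER_TRANSACTION;
}
```

With these definitions, classify_layer(SEV_CORRECTABLE, 0x1000) yields the data link layer, whereas the same status value with SEV_NONFATAL falls through to the transaction layer.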
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/portdrv_bus.c
--- a/drivers/pci/pcie/portdrv_bus.c    Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/pci/pcie/portdrv_bus.c    Wed Nov 26 10:24:15 2008 +0900
@@ -24,6 +24,7 @@ struct bus_type pcie_port_bus_type = {
        .suspend        = pcie_port_bus_suspend,
        .resume         = pcie_port_bus_resume, 
 };
+EXPORT_SYMBOL_GPL(pcie_port_bus_type);
 
 static int pcie_port_bus_match(struct device *dev, struct device_driver *drv)
 {
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/pcie/portdrv_pci.c
--- a/drivers/pci/pcie/portdrv_pci.c    Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/pci/pcie/portdrv_pci.c    Wed Nov 26 10:24:15 2008 +0900
@@ -14,8 +14,10 @@
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/pcieport_if.h>
+#include <linux/aer.h>
 
 #include "portdrv.h"
+#include "aer/aerdrv.h"
 
 /*
  * Version Information
@@ -29,6 +31,43 @@ MODULE_LICENSE("GPL");
 
 /* global data */
 static const char device_name[] = "pcieport-driver";
+
+static int pcie_portdrv_save_config(struct pci_dev *dev)
+{
+       return pci_save_state(dev);
+}
+
+static int pcie_portdrv_restore_config(struct pci_dev *dev)
+{
+       int retval;
+
+       pci_restore_state(dev);
+       retval = pci_enable_device(dev);
+       if (retval)
+               return retval;
+       pci_set_master(dev);
+       return 0;
+}
+
+#ifdef CONFIG_PM
+static int pcie_portdrv_suspend(struct pci_dev *dev, pm_message_t state)
+{
+       int ret = pcie_port_device_suspend(dev, state);
+
+       if (!ret)
+               ret = pcie_portdrv_save_config(dev);
+       return ret;
+}
+
+static int pcie_portdrv_resume(struct pci_dev *dev)
+{
+       pcie_portdrv_restore_config(dev);
+       return pcie_port_device_resume(dev);
+}
+#else
+#define pcie_portdrv_suspend NULL
+#define pcie_portdrv_resume NULL
+#endif
 
 /*
  * pcie_portdrv_probe - Probe PCI-Express port devices
@@ -61,6 +100,10 @@ static int __devinit pcie_portdrv_probe 
                return -ENOMEM;
        }
 
+       pcie_portdrv_save_config(dev);
+
+       pci_enable_pcie_error_reporting(dev);
+
        return 0;
 }
 
@@ -70,39 +113,143 @@ static void pcie_portdrv_remove (struct 
        kfree(pci_get_drvdata(dev));
 }
 
-#ifdef CONFIG_PM
-static int pcie_portdrv_save_config(struct pci_dev *dev)
-{
-       return pci_save_state(dev);
-}
-
-static int pcie_portdrv_restore_config(struct pci_dev *dev)
-{
-       int retval;
-
-       pci_restore_state(dev);
-       retval = pci_enable_device(dev);
-       if (retval)
-               return retval;
-       pci_set_master(dev);
-       return 0;
-}
-
-static int pcie_portdrv_suspend (struct pci_dev *dev, pm_message_t state)
-{
-       int ret = pcie_port_device_suspend(dev, state);
-
-       if (!ret)
-               ret = pcie_portdrv_save_config(dev);
-       return ret;
-}
-
-static int pcie_portdrv_resume (struct pci_dev *dev)
-{
-       pcie_portdrv_restore_config(dev);
-       return pcie_port_device_resume(dev);
-}
-#endif
+static int error_detected_iter(struct device *device, void *data)
+{
+       struct pcie_device *pcie_device;
+       struct pcie_port_service_driver *driver;
+       struct aer_broadcast_data *result_data;
+       pci_ers_result_t status;
+
+       result_data = (struct aer_broadcast_data *) data;
+
+       if (device->bus == &pcie_port_bus_type && device->driver) {
+               driver = to_service_driver(device->driver);
+               if (!driver ||
+                       !driver->err_handler ||
+                       !driver->err_handler->error_detected)
+                       return 0;
+
+               pcie_device = to_pcie_device(device);
+
+               /* Forward error detected message to service drivers */
+               status = driver->err_handler->error_detected(
+                       pcie_device->port,
+                       result_data->state);
+               result_data->result =
+                       merge_result(result_data->result, status);
+       }
+
+       return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
+                                       enum pci_channel_state error)
+{
+       struct aer_broadcast_data result_data =
+                       {error, PCI_ERS_RESULT_CAN_RECOVER};
+
+       device_for_each_child(&dev->dev, &result_data, error_detected_iter);
+
+       return result_data.result;
+}
+
+static int mmio_enabled_iter(struct device *device, void *data)
+{
+       struct pcie_device *pcie_device;
+       struct pcie_port_service_driver *driver;
+       pci_ers_result_t status, *result;
+
+       result = (pci_ers_result_t *) data;
+
+       if (device->bus == &pcie_port_bus_type && device->driver) {
+               driver = to_service_driver(device->driver);
+               if (driver &&
+                       driver->err_handler &&
+                       driver->err_handler->mmio_enabled) {
+                       pcie_device = to_pcie_device(device);
+
+                       /* Forward error message to service drivers */
+                       status = driver->err_handler->mmio_enabled(
+                                       pcie_device->port);
+                       *result = merge_result(*result, status);
+               }
+       }
+
+       return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_mmio_enabled(struct pci_dev *dev)
+{
+       pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+       device_for_each_child(&dev->dev, &status, mmio_enabled_iter);
+       return status;
+}
+
+static int slot_reset_iter(struct device *device, void *data)
+{
+       struct pcie_device *pcie_device;
+       struct pcie_port_service_driver *driver;
+       pci_ers_result_t status, *result;
+
+       result = (pci_ers_result_t *) data;
+
+       if (device->bus == &pcie_port_bus_type && device->driver) {
+               driver = to_service_driver(device->driver);
+               if (driver &&
+                       driver->err_handler &&
+                       driver->err_handler->slot_reset) {
+                       pcie_device = to_pcie_device(device);
+
+                       /* Forward error message to service drivers */
+                       status = driver->err_handler->slot_reset(
+                                       pcie_device->port);
+                       *result = merge_result(*result, status);
+               }
+       }
+
+       return 0;
+}
+
+static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
+{
+       pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
+
+       /* If fatal, restore cfg space for possible link reset at upstream */
+       if (dev->error_state == pci_channel_io_frozen) {
+               pcie_portdrv_restore_config(dev);
+               pci_enable_pcie_error_reporting(dev);
+       }
+
+       device_for_each_child(&dev->dev, &status, slot_reset_iter);
+
+       return status;
+}
+
+static int resume_iter(struct device *device, void *data)
+{
+       struct pcie_device *pcie_device;
+       struct pcie_port_service_driver *driver;
+
+       if (device->bus == &pcie_port_bus_type && device->driver) {
+               driver = to_service_driver(device->driver);
+               if (driver &&
+                       driver->err_handler &&
+                       driver->err_handler->resume) {
+                       pcie_device = to_pcie_device(device);
+
+                       /* Forward error message to service drivers */
+                       driver->err_handler->resume(pcie_device->port);
+               }
+       }
+
+       return 0;
+}
+
+static void pcie_portdrv_err_resume(struct pci_dev *dev)
+{
+       device_for_each_child(&dev->dev, NULL, resume_iter);
+}
 
 /*
  * LINUX Device Driver Model
@@ -114,6 +261,13 @@ static const struct pci_device_id port_p
 };
 MODULE_DEVICE_TABLE(pci, port_pci_ids);
 
+static struct pci_error_handlers pcie_portdrv_err_handler = {
+               .error_detected = pcie_portdrv_error_detected,
+               .mmio_enabled = pcie_portdrv_mmio_enabled,
+               .slot_reset = pcie_portdrv_slot_reset,
+               .resume = pcie_portdrv_err_resume,
+};
+
 static struct pci_driver pcie_portdrv = {
        .name           = (char *)device_name,
        .id_table       = &port_pci_ids[0],
@@ -121,10 +275,10 @@ static struct pci_driver pcie_portdrv = 
        .probe          = pcie_portdrv_probe,
        .remove         = pcie_portdrv_remove,
 
-#ifdef CONFIG_PM
        .suspend        = pcie_portdrv_suspend,
        .resume         = pcie_portdrv_resume,
-#endif /* PM */
+
+       .err_handler    = &pcie_portdrv_err_handler,
 };
 
 static int __init pcie_portdrv_init(void)
diff -r 61d1f2810617 -r 6591b4869889 drivers/pci/search.c
--- a/drivers/pci/search.c      Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/pci/search.c      Wed Nov 26 10:24:15 2008 +0900
@@ -380,6 +380,36 @@ exit:
        up_read(&pci_bus_sem);
        return found;
 }
+
+/**
+ * pci_get_bus_and_slot - locate PCI device from a given PCI bus & slot
+ * @bus: number of PCI bus on which desired PCI device resides
+ * @devfn: encodes number of PCI slot in which the desired PCI
+ * device resides and the logical device number within that slot
+ * in case of multi-function devices.
+ *
+ * Note: the bus/slot search is limited to PCI domain (segment) 0.
+ *
+ * Given a PCI bus and slot/function number, the desired PCI device
+ * is located in system global list of PCI devices.  If the device
+ * is found, a pointer to its data structure is returned.  If no
+ * device is found, %NULL is returned. The returned device has its
+ * reference count bumped by one.
+ */
+
+struct pci_dev * pci_get_bus_and_slot(unsigned int bus, unsigned int devfn)
+{
+       struct pci_dev *dev = NULL;
+
+       while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL) {
+               if (pci_domain_nr(dev->bus) == 0 &&
+                  (dev->bus->number == bus && dev->devfn == devfn))
+                       return dev;
+       }
+       return NULL;
+}
+
+
 EXPORT_SYMBOL(pci_dev_present);
 
 EXPORT_SYMBOL(pci_find_bus);
@@ -390,4 +420,5 @@ EXPORT_SYMBOL(pci_get_device);
 EXPORT_SYMBOL(pci_get_device);
 EXPORT_SYMBOL(pci_get_subsys);
 EXPORT_SYMBOL(pci_get_slot);
+EXPORT_SYMBOL(pci_get_bus_and_slot);
 EXPORT_SYMBOL(pci_get_class);
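The devfn argument of pci_get_bus_and_slot() packs the slot and function numbers into one byte, and the AER code above combines it with the bus number into a 16-bit agent ID via (bus << 8) | devfn. A user-space sketch of that encoding follows. It is not part of the patch: DEVFN, SLOT and FUNC mirror the kernel's PCI_DEVFN, PCI_SLOT and PCI_FUNC macros, and make_agent_id is a hypothetical helper name.

```c
#include <assert.h>

/* Same encoding as the kernel's PCI_DEVFN/PCI_SLOT/PCI_FUNC macros:
 * 5 bits of slot (device) number above 3 bits of function number. */
#define DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define SLOT(devfn)		(((devfn) >> 3) & 0x1f)
#define FUNC(devfn)		((devfn) & 0x07)

/* Hypothetical helper: the ID value printed by aer_print_error(). */
unsigned int make_agent_id(unsigned int bus, unsigned int devfn)
{
	assert(bus <= 0xff && devfn <= 0xff);
	return (bus << 8) | devfn;
}
```

For example, slot 0x1c function 0 on bus 0 encodes to devfn 0xe0. In kernel code a caller would pass such a value as pci_get_bus_and_slot(bus, PCI_DEVFN(slot, func)) and, because the returned device has its reference count raised, release it with pci_dev_put() when done.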
diff -r 61d1f2810617 -r 6591b4869889 drivers/scsi/ahci.c
--- a/drivers/scsi/ahci.c       Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/scsi/ahci.c       Wed Nov 26 10:24:15 2008 +0900
@@ -370,6 +370,31 @@ static const struct pci_device_id ahci_p
          board_ahci }, /* ICH10 */
        { PCI_VENDOR_ID_INTEL, 0x3a25, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
          board_ahci }, /* ICH10 */
+       /* SATA Controller AHCI (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b22, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+       /* SATA Controller AHCI (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b23, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+       /* SATA Controller RAID (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b24, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+       /* SATA Controller RAID (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b25, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+       /* SATA Controller Mobile AHCI (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b29, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+       /* SATA Controller Mobile AHCI (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b2f, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+       /* SATA Controller Mobile RAID (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b2b, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+       /* SATA Controller Mobile RAID (PCH) */
+       { PCI_VENDOR_ID_INTEL, 0x3b2c, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+        board_ahci },
+
 
        /* JMicron */
        { 0x197b, 0x2360, PCI_ANY_ID, PCI_ANY_ID, 0, 0,
diff -r 61d1f2810617 -r 6591b4869889 drivers/scsi/ata_piix.c
--- a/drivers/scsi/ata_piix.c   Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/scsi/ata_piix.c   Wed Nov 26 10:24:15 2008 +0900
@@ -220,6 +220,18 @@ static const struct pci_device_id piix_p
        { 0x8086, 0x3a20, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
        /* SATA Controller IDE (ICH10) */
        { 0x8086, 0x3a26, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata },
+       /* SATA Controller IDE (PCH) */
+       { 0x8086, 0x3b20, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
+       /* SATA Controller IDE (PCH) */
+       { 0x8086, 0x3b21, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata },
+       /* SATA Controller IDE (PCH) */
+       { 0x8086, 0x3b26, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata },
+       /* SATA Controller IDE (PCH) */
+       { 0x8086, 0x3b28, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
+       /* SATA Controller IDE (PCH) */
+       { 0x8086, 0x3b2d, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_2port_sata },
+       /* SATA Controller IDE (PCH) */
+       { 0x8086, 0x3b2e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ich8_sata_ahci },
 
        { }     /* terminate list */
 };
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/balloon/balloon.c
--- a/drivers/xen/balloon/balloon.c     Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/balloon/balloon.c     Wed Nov 26 10:24:15 2008 +0900
@@ -577,8 +577,8 @@ subsys_initcall(balloon_init);
 
 static void __exit balloon_exit(void)
 {
-    /* XXX - release balloon here */
-    return; 
+       balloon_sysfs_exit();
+       /* XXX - release balloon here */
 }
 
 module_exit(balloon_exit); 
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/balloon/sysfs.c
--- a/drivers/xen/balloon/sysfs.c       Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/balloon/sysfs.c       Wed Nov 26 10:24:15 2008 +0900
@@ -30,6 +30,7 @@
 
 #include <linux/capability.h>
 #include <linux/errno.h>
+#include <linux/init.h>
 #include <linux/stat.h>
 #include <linux/string.h>
 #include <linux/sysdev.h>
@@ -111,7 +112,7 @@ static struct sysdev_class balloon_sysde
 
 static struct sys_device balloon_sysdev;
 
-static int register_balloon(struct sys_device *sysdev)
+static int __init register_balloon(struct sys_device *sysdev)
 {
        int i, error;
 
@@ -148,7 +149,7 @@ static int register_balloon(struct sys_d
        return error;
 }
 
-static void unregister_balloon(struct sys_device *sysdev)
+static __exit void unregister_balloon(struct sys_device *sysdev)
 {
        int i;
 
@@ -159,12 +160,12 @@ static void unregister_balloon(struct sy
        sysdev_class_unregister(&balloon_sysdev_class);
 }
 
-int balloon_sysfs_init(void)
+int __init balloon_sysfs_init(void)
 {
        return register_balloon(&balloon_sysdev);
 }
 
-void balloon_sysfs_exit(void)
+void __exit balloon_sysfs_exit(void)
 {
        unregister_balloon(&balloon_sysdev);
 }
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/blkback/blkback.c
--- a/drivers/xen/blkback/blkback.c     Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/blkback/blkback.c     Wed Nov 26 10:24:15 2008 +0900
@@ -151,9 +151,9 @@ static void unplug_queue(blkif_t *blkif)
        blkif->plug = NULL;
 }
 
-static void plug_queue(blkif_t *blkif, struct bio *bio)
-{
-       request_queue_t *q = bdev_get_queue(bio->bi_bdev);
+static void plug_queue(blkif_t *blkif, struct block_device *bdev)
+{
+       request_queue_t *q = bdev_get_queue(bdev);
 
        if (q == blkif->plug)
                return;
@@ -389,8 +389,8 @@ static void dispatch_rw_block_io(blkif_t
                unsigned long buf; unsigned int nsec;
        } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
        unsigned int nseg;
-       struct bio *bio = NULL, *biolist[BLKIF_MAX_SEGMENTS_PER_REQUEST];
-       int ret, i, nbio = 0;
+       struct bio *bio = NULL;
+       int ret, i;
        int operation;
 
        switch (req->operation) {
@@ -477,6 +477,10 @@ static void dispatch_rw_block_io(blkif_t
                goto fail_flush;
        }
 
+       plug_queue(blkif, preq.bdev);
+       atomic_set(&pending_req->pendcnt, 1);
+       blkif_get(blkif);
+
        for (i = 0; i < nseg; i++) {
                if (((int)preq.sector_number|(int)seg[i].nsec) &
                    ((bdev_hardsect_size(preq.bdev) >> 9) - 1)) {
@@ -490,7 +494,12 @@ static void dispatch_rw_block_io(blkif_t
                                     virt_to_page(vaddr(pending_req, i)),
                                     seg[i].nsec << 9,
                                     seg[i].buf & ~PAGE_MASK) == 0)) {
-                       bio = biolist[nbio++] = bio_alloc(GFP_KERNEL, nseg-i);
+                       if (bio) {
+                               atomic_inc(&pending_req->pendcnt);
+                               submit_bio(operation, bio);
+                       }
+
+                       bio = bio_alloc(GFP_KERNEL, nseg-i);
                        if (unlikely(bio == NULL))
                                goto fail_put_bio;
 
@@ -505,7 +514,7 @@ static void dispatch_rw_block_io(blkif_t
 
        if (!bio) {
                BUG_ON(operation != WRITE_BARRIER);
-               bio = biolist[nbio++] = bio_alloc(GFP_KERNEL, 0);
+               bio = bio_alloc(GFP_KERNEL, 0);
                if (unlikely(bio == NULL))
                        goto fail_put_bio;
 
@@ -515,12 +524,7 @@ static void dispatch_rw_block_io(blkif_t
                bio->bi_sector  = -1;
        }
 
-       plug_queue(blkif, bio);
-       atomic_set(&pending_req->pendcnt, nbio);
-       blkif_get(blkif);
-
-       for (i = 0; i < nbio; i++)
-               submit_bio(operation, biolist[i]);
+       submit_bio(operation, bio);
 
        if (operation == READ)
                blkif->st_rd_sect += preq.nr_sects;
@@ -529,16 +533,22 @@ static void dispatch_rw_block_io(blkif_t
 
        return;
 
- fail_put_bio:
-       for (i = 0; i < (nbio-1); i++)
-               bio_put(biolist[i]);
  fail_flush:
        fast_flush_area(pending_req);
  fail_response:
        make_response(blkif, req->id, req->operation, BLKIF_RSP_ERROR);
        free_req(pending_req);
        msleep(1); /* back off a bit */
-} 
+       return;
+
+ fail_put_bio:
+       __end_block_io_op(pending_req, -EINVAL);
+       if (bio)
+               bio_put(bio);
+       unplug_queue(blkif);
+       msleep(1); /* back off a bit */
+       return;
+}
 
 
 
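
Editorial note on the blkback.c change above: the patch starts pendcnt at 1, takes one extra count each time an earlier bio is submitted from inside the loop, and lets the final submit_bio() use the initial count. On the error path, __end_block_io_op() drops that initial count, so the response goes out only after every bio already in flight has completed. The user-space sketch below models just this counting scheme with C11 atomics. The names toy_req, toy_end_io and toy_dispatch, and the fail_at parameter, are hypothetical; no block-layer behaviour is reproduced.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for pending_req_t; only the counter is modelled. */
struct toy_req {
	atomic_int pendcnt;	/* completions still owed before responding */
	bool responded;
	int status;
};

/* Models __end_block_io_op(): the last completion sends the response. */
static void toy_end_io(struct toy_req *req, int error)
{
	if (error)
		req->status = error;
	if (atomic_fetch_sub(&req->pendcnt, 1) == 1)
		req->responded = true;
}

/*
 * Build and submit nchunks chunks (assumed >= 1). If fail_at >= 0,
 * allocating chunk number fail_at fails. Returns the number of chunks
 * that were actually submitted.
 */
static int toy_dispatch(struct toy_req *req, int nchunks, int fail_at)
{
	bool have_chunk = false;
	int i, submitted = 0;

	atomic_init(&req->pendcnt, 1);	/* count owned by the final chunk */
	req->responded = false;
	req->status = 0;

	for (i = 0; i < nchunks; i++) {
		if (have_chunk) {
			/* Submitting an earlier chunk needs its own count. */
			atomic_fetch_add(&req->pendcnt, 1);
			submitted++;
		}
		if (i == fail_at) {
			/* Error path: give back the initial count. */
			toy_end_io(req, -1);
			return submitted;
		}
		have_chunk = true;
	}

	submitted++;	/* the final chunk consumes the initial count */
	return submitted;
}
```

With this scheme the counter never reaches zero while the dispatcher is still building chunks, which is the property the old "set pendcnt to nbio after the loop" code obtained by deferring every submit.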
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/blktap/blktap.c
--- a/drivers/xen/blktap/blktap.c       Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/blktap/blktap.c       Wed Nov 26 10:24:15 2008 +0900
@@ -611,9 +611,13 @@ static int blktap_release(struct inode *
 
        /* Clear any active mappings and free foreign map table */
        if (info->vma) {
+               struct mm_struct *mm = info->vma->vm_mm;
+
+               down_write(&mm->mmap_sem);
                zap_page_range(
                        info->vma, info->vma->vm_start, 
                        info->vma->vm_end - info->vma->vm_start, NULL);
+               up_write(&mm->mmap_sem);
 
                kfree(info->vma->vm_private_data);
 
@@ -992,12 +996,13 @@ static void fast_flush_area(pending_req_
                            int tapidx)
 {
        struct gnttab_unmap_grant_ref unmap[BLKIF_MAX_SEGMENTS_PER_REQUEST*2];
-       unsigned int i, invcount = 0;
+       unsigned int i, invcount = 0, locked = 0;
        struct grant_handle_pair *khandle;
        uint64_t ptep;
        int ret, mmap_idx;
        unsigned long kvaddr, uvaddr;
        tap_blkif_t *info;
+       struct mm_struct *mm;
        
 
        info = tapfds[tapidx];
@@ -1007,13 +1012,15 @@ static void fast_flush_area(pending_req_
                return;
        }
 
+       mm = info->vma ? info->vma->vm_mm : NULL;
+
        if (info->vma != NULL &&
            xen_feature(XENFEAT_auto_translated_physmap)) {
-               down_write(&info->vma->vm_mm->mmap_sem);
+               down_write(&mm->mmap_sem);
                zap_page_range(info->vma, 
                               MMAP_VADDR(info->user_vstart, u_idx, 0), 
                               req->nr_pages << PAGE_SHIFT, NULL);
-               up_write(&info->vma->vm_mm->mmap_sem);
+               up_write(&mm->mmap_sem);
                return;
        }
 
@@ -1038,10 +1045,13 @@ static void fast_flush_area(pending_req_
 
                if (khandle->user != INVALID_GRANT_HANDLE) {
                        BUG_ON(xen_feature(XENFEAT_auto_translated_physmap));
+                       if (!locked++)
+                               down_write(&mm->mmap_sem);
                        if (create_lookup_pte_addr(
-                               info->vma->vm_mm,
+                               mm,
                                MMAP_VADDR(info->user_vstart, u_idx, i),
                                &ptep) !=0) {
+                               up_write(&mm->mmap_sem);
                                WPRINTK("Couldn't get a pte addr!\n");
                                return;
                        }
@@ -1060,10 +1070,17 @@ static void fast_flush_area(pending_req_
                GNTTABOP_unmap_grant_ref, unmap, invcount);
        BUG_ON(ret);
        
-       if (info->vma != NULL && !xen_feature(XENFEAT_auto_translated_physmap))
+       if (info->vma != NULL &&
+           !xen_feature(XENFEAT_auto_translated_physmap)) {
+               if (!locked++)
+                       down_write(&mm->mmap_sem);
                zap_page_range(info->vma, 
                               MMAP_VADDR(info->user_vstart, u_idx, 0), 
                               req->nr_pages << PAGE_SHIFT, NULL);
+       }
+
+       if (locked)
+               up_write(&mm->mmap_sem);
 }
 
 /******************************************************************
@@ -1346,6 +1363,7 @@ static void dispatch_rw_block_io(blkif_t
        int pending_idx = RTN_PEND_IDX(pending_req,pending_req->mem_idx);
        int usr_idx;
        uint16_t mmap_idx = pending_req->mem_idx;
+       struct mm_struct *mm;
 
        if (blkif->dev_num < 0 || blkif->dev_num > MAX_TAP_DEV)
                goto fail_response;
@@ -1389,6 +1407,9 @@ static void dispatch_rw_block_io(blkif_t
        pending_req->status    = BLKIF_RSP_OKAY;
        pending_req->nr_pages  = nseg;
        op = 0;
+       mm = info->vma->vm_mm;
+       if (!xen_feature(XENFEAT_auto_translated_physmap))
+               down_write(&mm->mmap_sem);
        for (i = 0; i < nseg; i++) {
                unsigned long uvaddr;
                unsigned long kvaddr;
@@ -1407,9 +1428,9 @@ static void dispatch_rw_block_io(blkif_t
 
                if (!xen_feature(XENFEAT_auto_translated_physmap)) {
                        /* Now map it to user. */
-                       ret = create_lookup_pte_addr(info->vma->vm_mm, 
-                                                    uvaddr, &ptep);
+                       ret = create_lookup_pte_addr(mm, uvaddr, &ptep);
                        if (ret) {
+                               up_write(&mm->mmap_sem);
                                WPRINTK("Couldn't get a pte addr!\n");
                                goto fail_flush;
                        }
@@ -1431,6 +1452,8 @@ static void dispatch_rw_block_io(blkif_t
        BUG_ON(ret);
 
        if (!xen_feature(XENFEAT_auto_translated_physmap)) {
+               up_write(&mm->mmap_sem);
+
                for (i = 0; i < (nseg*2); i+=2) {
                        unsigned long uvaddr;
                        unsigned long kvaddr;
@@ -1504,7 +1527,7 @@ static void dispatch_rw_block_io(blkif_t
                goto fail_flush;
 
        if (xen_feature(XENFEAT_auto_translated_physmap))
-               down_write(&info->vma->vm_mm->mmap_sem);
+               down_write(&mm->mmap_sem);
        /* Mark mapped pages as reserved: */
        for (i = 0; i < req->nr_segments; i++) {
                unsigned long kvaddr;
@@ -1518,13 +1541,13 @@ static void dispatch_rw_block_io(blkif_t
                                             MMAP_VADDR(info->user_vstart,
                                                        usr_idx, i), pg);
                        if (ret) {
-                               up_write(&info->vma->vm_mm->mmap_sem);
+                               up_write(&mm->mmap_sem);
                                goto fail_flush;
                        }
                }
        }
        if (xen_feature(XENFEAT_auto_translated_physmap))
-               up_write(&info->vma->vm_mm->mmap_sem);
+               up_write(&mm->mmap_sem);
        
        /*record [mmap_idx,pending_idx] to [usr_idx] mapping*/
        info->idx_map[usr_idx] = MAKE_ID(mmap_idx, pending_idx);
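
Editorial note on the blktap.c change above: fast_flush_area() now takes mmap_sem lazily, using "if (!locked++)" so that only the first code path that needs the semaphore acquires it, and a single "if (locked)" at the end releases it. The user-space sketch below isolates that idiom. toy_lock, toy_down_write, toy_up_write and toy_flush are hypothetical names, and the lock merely counts operations rather than providing real mutual exclusion.

```c
#include <assert.h>

/* Hypothetical lock that only records operations; stands in for mmap_sem. */
struct toy_lock {
	int held;
	int acquisitions;
};

static void toy_down_write(struct toy_lock *l)
{
	assert(!l->held);	/* a second acquire would self-deadlock */
	l->held = 1;
	l->acquisitions++;
}

static void toy_up_write(struct toy_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/*
 * Walk n entries; a non-zero needs_lock[i] marks an entry that must be
 * handled under the lock. Returns how many entries were handled that way.
 */
static int toy_flush(struct toy_lock *l, const int *needs_lock, int n)
{
	unsigned int locked = 0;
	int i, touched = 0;

	for (i = 0; i < n; i++) {
		if (!needs_lock[i])
			continue;
		if (!locked++)		/* only the first user acquires */
			toy_down_write(l);
		touched++;
	}

	if (locked)			/* release once, and only if taken */
		toy_up_write(l);
	return touched;
}
```

The cost of the idiom is that every early return between the first acquire and the final release must drop the lock itself, which is why the patch adds an up_write() before the "Couldn't get a pte addr!" return.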
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/core/evtchn.c
--- a/drivers/xen/core/evtchn.c Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/core/evtchn.c Wed Nov 26 10:24:15 2008 +0900
@@ -756,18 +756,281 @@ static struct hw_interrupt_type dynirq_t
        .retrigger = resend_irq_on_evtchn,
 };
 
-void evtchn_register_pirq(int irq)
-{
-       struct irq_desc *desc;
-       unsigned long flags;
-
-       irq_info[irq] = mk_irq_info(IRQT_PIRQ, irq, 0);
-
-       /* Cannot call set_irq_probe(), as that's marked __init. */
-       desc = irq_desc + irq;
-       spin_lock_irqsave(&desc->lock, flags);
-       desc->status &= ~IRQ_NOPROBE;
-       spin_unlock_irqrestore(&desc->lock, flags);
+static inline void pirq_unmask_notify(int irq)
+{
+       struct physdev_eoi eoi = { .irq = evtchn_get_xen_pirq(irq) };
+       if (unlikely(test_bit(irq - PIRQ_BASE, pirq_needs_eoi)))
+               VOID(HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi));
+}
+
+static inline void pirq_query_unmask(int irq)
+{
+       struct physdev_irq_status_query irq_status;
+       irq_status.irq = evtchn_get_xen_pirq(irq);
+       if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
+               irq_status.flags = 0;
+       clear_bit(irq - PIRQ_BASE, pirq_needs_eoi);
+       if (irq_status.flags & XENIRQSTAT_needs_eoi)
+               set_bit(irq - PIRQ_BASE, pirq_needs_eoi);
+}
+
+/*
+ * On startup, if there is no action associated with the IRQ then we are
+ * probing. In this case we should not share with others as it will confuse us.
+ */
+#define probing_irq(_irq) (irq_desc[(_irq)].action == NULL)
+
+static unsigned int startup_pirq(unsigned int irq)
+{
+       struct evtchn_bind_pirq bind_pirq;
+       int evtchn = evtchn_from_irq(irq);
+
+       if (VALID_EVTCHN(evtchn))
+               goto out;
+
+       bind_pirq.pirq = evtchn_get_xen_pirq(irq);
+       /* NB. We are happy to share unless we are probing. */
+       bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE;
+       if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq) != 0) {
+               if (!probing_irq(irq))
+                       printk(KERN_INFO "Failed to obtain physical IRQ %d\n",
+                              irq);
+               return 0;
+       }
+       evtchn = bind_pirq.port;
+
+       pirq_query_unmask(irq);
+
+       evtchn_to_irq[evtchn] = irq;
+       bind_evtchn_to_cpu(evtchn, 0);
+       irq_info[irq] = mk_irq_info(IRQT_PIRQ, bind_pirq.pirq, evtchn);
+
+ out:
+       unmask_evtchn(evtchn);
+       pirq_unmask_notify(irq);
+
+       return 0;
+}
+
+static void shutdown_pirq(unsigned int irq)
+{
+       struct evtchn_close close;
+       int evtchn = evtchn_from_irq(irq);
+
+       if (!VALID_EVTCHN(evtchn))
+               return;
+
+       mask_evtchn(evtchn);
+
+       close.port = evtchn;
+       if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+               BUG();
+
+       bind_evtchn_to_cpu(evtchn, 0);
+       evtchn_to_irq[evtchn] = -1;
+       irq_info[irq] = mk_irq_info(IRQT_PIRQ, index_from_irq(irq), 0);
+}
+
+static void enable_pirq(unsigned int irq)
+{
+       startup_pirq(irq);
+}
+
+static void disable_pirq(unsigned int irq)
+{
+}
+
+static void ack_pirq(unsigned int irq)
+{
+       int evtchn = evtchn_from_irq(irq);
+
+       move_native_irq(irq);
+
+       if (VALID_EVTCHN(evtchn)) {
+               mask_evtchn(evtchn);
+               clear_evtchn(evtchn);
+       }
+}
+
+static void end_pirq(unsigned int irq)
+{
+       int evtchn = evtchn_from_irq(irq);
+
+       if ((irq_desc[irq].status & (IRQ_DISABLED|IRQ_PENDING)) ==
+           (IRQ_DISABLED|IRQ_PENDING)) {
+               shutdown_pirq(irq);
+       } else if (VALID_EVTCHN(evtchn)) {
+               unmask_evtchn(evtchn);
+               pirq_unmask_notify(irq);
+       }
+}
+
+static struct hw_interrupt_type pirq_type = {
+       .typename = "Phys-irq",
+       .startup  = startup_pirq,
+       .shutdown = shutdown_pirq,
+       .enable   = enable_pirq,
+       .disable  = disable_pirq,
+       .ack      = ack_pirq,
+       .end      = end_pirq,
+#ifdef CONFIG_SMP
+       .set_affinity = set_affinity_irq,
+#endif
+       .retrigger = resend_irq_on_evtchn,
+};
+
+int irq_ignore_unhandled(unsigned int irq)
+{
+       struct physdev_irq_status_query irq_status = { .irq = irq };
+
+       if (!is_running_on_xen())
+               return 0;
+
+       if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
+               return 0;
+       return !!(irq_status.flags & XENIRQSTAT_shared);
+}
+
+void notify_remote_via_irq(int irq)
+{
+       int evtchn = evtchn_from_irq(irq);
+
+       if (VALID_EVTCHN(evtchn))
+               notify_remote_via_evtchn(evtchn);
+}
+EXPORT_SYMBOL_GPL(notify_remote_via_irq);
+
+int irq_to_evtchn_port(int irq)
+{
+       return evtchn_from_irq(irq);
+}
+EXPORT_SYMBOL_GPL(irq_to_evtchn_port);
+
+void mask_evtchn(int port)
+{
+       shared_info_t *s = HYPERVISOR_shared_info;
+       synch_set_bit(port, s->evtchn_mask);
+}
+EXPORT_SYMBOL_GPL(mask_evtchn);
+
+void unmask_evtchn(int port)
+{
+       shared_info_t *s = HYPERVISOR_shared_info;
+       unsigned int cpu = smp_processor_id();
+       vcpu_info_t *vcpu_info = &s->vcpu_info[cpu];
+
+       BUG_ON(!irqs_disabled());
+
+       /* Slow path (hypercall) if this is a non-local port. */
+       if (unlikely(cpu != cpu_from_evtchn(port))) {
+               struct evtchn_unmask unmask = { .port = port };
+               VOID(HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask));
+               return;
+       }
+
+       synch_clear_bit(port, s->evtchn_mask);
+
+       /* Did we miss an interrupt 'edge'? Re-fire if so. */
+       if (synch_test_bit(port, s->evtchn_pending) &&
+           !synch_test_and_set_bit(port / BITS_PER_LONG,
+                                   &vcpu_info->evtchn_pending_sel))
+               vcpu_info->evtchn_upcall_pending = 1;
+}
+EXPORT_SYMBOL_GPL(unmask_evtchn);
+
+void disable_all_local_evtchn(void)
+{
+       unsigned i, cpu = smp_processor_id();
+       shared_info_t *s = HYPERVISOR_shared_info;
+
+       for (i = 0; i < NR_EVENT_CHANNELS; ++i)
+               if (cpu_from_evtchn(i) == cpu)
+                       synch_set_bit(i, &s->evtchn_mask[0]);
+}
+
+static void restore_cpu_virqs(unsigned int cpu)
+{
+       struct evtchn_bind_virq bind_virq;
+       int virq, irq, evtchn;
+
+       for (virq = 0; virq < NR_VIRQS; virq++) {
+               if ((irq = per_cpu(virq_to_irq, cpu)[virq]) == -1)
+                       continue;
+
+               BUG_ON(irq_info[irq] != mk_irq_info(IRQT_VIRQ, virq, 0));
+
+               /* Get a new binding from Xen. */
+               bind_virq.virq = virq;
+               bind_virq.vcpu = cpu;
+               if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+                                               &bind_virq) != 0)
+                       BUG();
+               evtchn = bind_virq.port;
+
+               /* Record the new mapping. */
+               evtchn_to_irq[evtchn] = irq;
+               irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
+               bind_evtchn_to_cpu(evtchn, cpu);
+
+               /* Ready for use. */
+               unmask_evtchn(evtchn);
+       }
+}
+
+static void restore_cpu_ipis(unsigned int cpu)
+{
+       struct evtchn_bind_ipi bind_ipi;
+       int ipi, irq, evtchn;
+
+       for (ipi = 0; ipi < NR_IPIS; ipi++) {
+               if ((irq = per_cpu(ipi_to_irq, cpu)[ipi]) == -1)
+                       continue;
+
+               BUG_ON(irq_info[irq] != mk_irq_info(IRQT_IPI, ipi, 0));
+
+               /* Get a new binding from Xen. */
+               bind_ipi.vcpu = cpu;
+               if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
+                                               &bind_ipi) != 0)
+                       BUG();
+               evtchn = bind_ipi.port;
+
+               /* Record the new mapping. */
+               evtchn_to_irq[evtchn] = irq;
+               irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
+               bind_evtchn_to_cpu(evtchn, cpu);
+
+               /* Ready for use. */
+               unmask_evtchn(evtchn);
+
+       }
+}
+
+void irq_resume(void)
+{
+       unsigned int cpu, irq, evtchn;
+
+       init_evtchn_cpu_bindings();
+
+       /* New event-channel space is not 'live' yet. */
+       for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++)
+               mask_evtchn(evtchn);
+
+       /* Check that no PIRQs are still bound. */
+       for (irq = PIRQ_BASE; irq < (PIRQ_BASE + NR_PIRQS); irq++)
+               BUG_ON(irq_info[irq] != IRQ_UNBOUND);
+
+       /* No IRQ <-> event-channel mappings. */
+       for (irq = 0; irq < NR_IRQS; irq++)
+               irq_info[irq] &= ~((1U << _EVTCHN_BITS) - 1);
+       for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++)
+               evtchn_to_irq[evtchn] = -1;
+
+       for_each_possible_cpu(cpu) {
+               restore_cpu_virqs(cpu);
+               restore_cpu_ipis(cpu);
+       }
+
 }
 
 #if defined(CONFIG_X86_IO_APIC)
@@ -777,6 +1040,15 @@ void evtchn_register_pirq(int irq)
 #else
 #define identity_mapped_irq(irq) (1)
 #endif
+
+void evtchn_register_pirq(int irq)
+{
+       BUG_ON(irq < PIRQ_BASE || irq - PIRQ_BASE >= NR_PIRQS);
+       if (identity_mapped_irq(irq))
+               return;
+       irq_info[irq] = mk_irq_info(IRQT_PIRQ, irq, 0);
+       irq_desc[irq].chip = &pirq_type;
+}
 
 int evtchn_map_pirq(int irq, int xen_pirq)
 {
@@ -798,9 +1070,11 @@ int evtchn_map_pirq(int irq, int xen_pir
                spin_unlock(&irq_alloc_lock);
                if (irq < PIRQ_BASE)
                        return -ENOSPC;
+               irq_desc[irq].chip = &pirq_type;
        } else if (!xen_pirq) {
                if (unlikely(type_from_irq(irq) != IRQT_PIRQ))
                        return -EINVAL;
+               irq_desc[irq].chip = &no_irq_type;
                irq_info[irq] = IRQ_UNBOUND;
                return 0;
        } else if (type_from_irq(irq) != IRQT_PIRQ
@@ -821,283 +1095,6 @@ int evtchn_get_xen_pirq(int irq)
        return index_from_irq(irq);
 }
 
-static inline void pirq_unmask_notify(int irq)
-{
-       struct physdev_eoi eoi = { .irq = evtchn_get_xen_pirq(irq) };
-       if (unlikely(test_bit(irq - PIRQ_BASE, pirq_needs_eoi)))
-               VOID(HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi));
-}
-
-static inline void pirq_query_unmask(int irq)
-{
-       struct physdev_irq_status_query irq_status;
-       irq_status.irq = evtchn_get_xen_pirq(irq);
-       if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
-               irq_status.flags = 0;
-       clear_bit(irq - PIRQ_BASE, pirq_needs_eoi);
-       if (irq_status.flags & XENIRQSTAT_needs_eoi)
-               set_bit(irq - PIRQ_BASE, pirq_needs_eoi);
-}
-
-/*
- * On startup, if there is no action associated with the IRQ then we are
- * probing. In this case we should not share with others as it will confuse us.
- */
-#define probing_irq(_irq) (irq_desc[(_irq)].action == NULL)
-
-static unsigned int startup_pirq(unsigned int irq)
-{
-       struct evtchn_bind_pirq bind_pirq;
-       int evtchn = evtchn_from_irq(irq);
-
-       if (VALID_EVTCHN(evtchn))
-               goto out;
-
-       bind_pirq.pirq = evtchn_get_xen_pirq(irq);
-       /* NB. We are happy to share unless we are probing. */
-       bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE;
-       if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq) != 0) {
-               if (!probing_irq(irq))
-                       printk(KERN_INFO "Failed to obtain physical IRQ %d\n",
-                              irq);
-               return 0;
-       }
-       evtchn = bind_pirq.port;
-
-       pirq_query_unmask(irq);
-
-       evtchn_to_irq[evtchn] = irq;
-       bind_evtchn_to_cpu(evtchn, 0);
-       irq_info[irq] = mk_irq_info(IRQT_PIRQ, bind_pirq.pirq, evtchn);
-
- out:
-       unmask_evtchn(evtchn);
-       pirq_unmask_notify(irq);
-
-       return 0;
-}
-
-static void shutdown_pirq(unsigned int irq)
-{
-       struct evtchn_close close;
-       int evtchn = evtchn_from_irq(irq);
-
-       if (!VALID_EVTCHN(evtchn))
-               return;
-
-       mask_evtchn(evtchn);
-
-       close.port = evtchn;
-       if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
-               BUG();
-
-       bind_evtchn_to_cpu(evtchn, 0);
-       evtchn_to_irq[evtchn] = -1;
-       irq_info[irq] = mk_irq_info(IRQT_PIRQ, index_from_irq(irq), 0);
-}
-
-static void enable_pirq(unsigned int irq)
-{
-       startup_pirq(irq);
-}
-
-static void disable_pirq(unsigned int irq)
-{
-}
-
-static void ack_pirq(unsigned int irq)
-{
-       int evtchn = evtchn_from_irq(irq);
-
-       move_native_irq(irq);
-
-       if (VALID_EVTCHN(evtchn)) {
-               mask_evtchn(evtchn);
-               clear_evtchn(evtchn);
-       }
-}
-
-static void end_pirq(unsigned int irq)
-{
-       int evtchn = evtchn_from_irq(irq);
-
-       if ((irq_desc[irq].status & (IRQ_DISABLED|IRQ_PENDING)) ==
-           (IRQ_DISABLED|IRQ_PENDING)) {
-               shutdown_pirq(irq);
-       } else if (VALID_EVTCHN(evtchn)) {
-               unmask_evtchn(evtchn);
-               pirq_unmask_notify(irq);
-       }
-}
-
-static struct hw_interrupt_type pirq_type = {
-       .typename = "Phys-irq",
-       .startup  = startup_pirq,
-       .shutdown = shutdown_pirq,
-       .enable   = enable_pirq,
-       .disable  = disable_pirq,
-       .ack      = ack_pirq,
-       .end      = end_pirq,
-#ifdef CONFIG_SMP
-       .set_affinity = set_affinity_irq,
-#endif
-       .retrigger = resend_irq_on_evtchn,
-};
-
-int irq_ignore_unhandled(unsigned int irq)
-{
-       struct physdev_irq_status_query irq_status = { .irq = irq };
-
-       if (!is_running_on_xen())
-               return 0;
-
-       if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
-               return 0;
-       return !!(irq_status.flags & XENIRQSTAT_shared);
-}
-
-void notify_remote_via_irq(int irq)
-{
-       int evtchn = evtchn_from_irq(irq);
-
-       if (VALID_EVTCHN(evtchn))
-               notify_remote_via_evtchn(evtchn);
-}
-EXPORT_SYMBOL_GPL(notify_remote_via_irq);
-
-int irq_to_evtchn_port(int irq)
-{
-       return evtchn_from_irq(irq);
-}
-EXPORT_SYMBOL_GPL(irq_to_evtchn_port);
-
-void mask_evtchn(int port)
-{
-       shared_info_t *s = HYPERVISOR_shared_info;
-       synch_set_bit(port, s->evtchn_mask);
-}
-EXPORT_SYMBOL_GPL(mask_evtchn);
-
-void unmask_evtchn(int port)
-{
-       shared_info_t *s = HYPERVISOR_shared_info;
-       unsigned int cpu = smp_processor_id();
-       vcpu_info_t *vcpu_info = &s->vcpu_info[cpu];
-
-       BUG_ON(!irqs_disabled());
-
-       /* Slow path (hypercall) if this is a non-local port. */
-       if (unlikely(cpu != cpu_from_evtchn(port))) {
-               struct evtchn_unmask unmask = { .port = port };
-               VOID(HYPERVISOR_event_channel_op(EVTCHNOP_unmask, &unmask));
-               return;
-       }
-
-       synch_clear_bit(port, s->evtchn_mask);
-
-       /* Did we miss an interrupt 'edge'? Re-fire if so. */
-       if (synch_test_bit(port, s->evtchn_pending) &&
-           !synch_test_and_set_bit(port / BITS_PER_LONG,
-                                   &vcpu_info->evtchn_pending_sel))
-               vcpu_info->evtchn_upcall_pending = 1;
-}
-EXPORT_SYMBOL_GPL(unmask_evtchn);
-
-void disable_all_local_evtchn(void)
-{
-       unsigned i, cpu = smp_processor_id();
-       shared_info_t *s = HYPERVISOR_shared_info;
-
-       for (i = 0; i < NR_EVENT_CHANNELS; ++i)
-               if (cpu_from_evtchn(i) == cpu)
-                       synch_set_bit(i, &s->evtchn_mask[0]);
-}
-
-static void restore_cpu_virqs(unsigned int cpu)
-{
-       struct evtchn_bind_virq bind_virq;
-       int virq, irq, evtchn;
-
-       for (virq = 0; virq < NR_VIRQS; virq++) {
-               if ((irq = per_cpu(virq_to_irq, cpu)[virq]) == -1)
-                       continue;
-
-               BUG_ON(irq_info[irq] != mk_irq_info(IRQT_VIRQ, virq, 0));
-
-               /* Get a new binding from Xen. */
-               bind_virq.virq = virq;
-               bind_virq.vcpu = cpu;
-               if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
-                                               &bind_virq) != 0)
-                       BUG();
-               evtchn = bind_virq.port;
-
-               /* Record the new mapping. */
-               evtchn_to_irq[evtchn] = irq;
-               irq_info[irq] = mk_irq_info(IRQT_VIRQ, virq, evtchn);
-               bind_evtchn_to_cpu(evtchn, cpu);
-
-               /* Ready for use. */
-               unmask_evtchn(evtchn);
-       }
-}
-
-static void restore_cpu_ipis(unsigned int cpu)
-{
-       struct evtchn_bind_ipi bind_ipi;
-       int ipi, irq, evtchn;
-
-       for (ipi = 0; ipi < NR_IPIS; ipi++) {
-               if ((irq = per_cpu(ipi_to_irq, cpu)[ipi]) == -1)
-                       continue;
-
-               BUG_ON(irq_info[irq] != mk_irq_info(IRQT_IPI, ipi, 0));
-
-               /* Get a new binding from Xen. */
-               bind_ipi.vcpu = cpu;
-               if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_ipi,
-                                               &bind_ipi) != 0)
-                       BUG();
-               evtchn = bind_ipi.port;
-
-               /* Record the new mapping. */
-               evtchn_to_irq[evtchn] = irq;
-               irq_info[irq] = mk_irq_info(IRQT_IPI, ipi, evtchn);
-               bind_evtchn_to_cpu(evtchn, cpu);
-
-               /* Ready for use. */
-               unmask_evtchn(evtchn);
-
-       }
-}
-
-void irq_resume(void)
-{
-       unsigned int cpu, irq, evtchn;
-
-       init_evtchn_cpu_bindings();
-
-       /* New event-channel space is not 'live' yet. */
-       for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++)
-               mask_evtchn(evtchn);
-
-       /* Check that no PIRQs are still bound. */
-       for (irq = PIRQ_BASE; irq < (PIRQ_BASE + NR_PIRQS); irq++)
-               BUG_ON(irq_info[irq] != IRQ_UNBOUND);
-
-       /* No IRQ <-> event-channel mappings. */
-       for (irq = 0; irq < NR_IRQS; irq++)
-               irq_info[irq] &= ~((1U << _EVTCHN_BITS) - 1);
-       for (evtchn = 0; evtchn < NR_EVENT_CHANNELS; evtchn++)
-               evtchn_to_irq[evtchn] = -1;
-
-       for_each_possible_cpu(cpu) {
-               restore_cpu_virqs(cpu);
-               restore_cpu_ipis(cpu);
-       }
-
-}
-
 void __init xen_init_IRQ(void)
 {
        unsigned int i;
@@ -1126,16 +1123,16 @@ void __init xen_init_IRQ(void)
        for (i = PIRQ_BASE; i < (PIRQ_BASE + NR_PIRQS); i++) {
                irq_bindcount[i] = 1;
 
+               if (!identity_mapped_irq(i))
+                       continue;
+
 #ifdef RTC_IRQ
                /* If not domain 0, force our RTC driver to fail its probe. */
-               if (identity_mapped_irq(i) && ((i - PIRQ_BASE) == RTC_IRQ)
-                   && !is_initial_xendomain())
+               if (i - PIRQ_BASE == RTC_IRQ && !is_initial_xendomain())
                        continue;
 #endif
 
                irq_desc[i].status = IRQ_DISABLED;
-               if (!identity_mapped_irq(i))
-                       irq_desc[i].status |= IRQ_NOPROBE;
                irq_desc[i].action = NULL;
                irq_desc[i].depth = 1;
                irq_desc[i].chip = &pirq_type;
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/core/pci.c
--- a/drivers/xen/core/pci.c    Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/core/pci.c    Wed Nov 26 10:24:15 2008 +0900
@@ -23,14 +23,6 @@ static int pci_bus_probe_wrapper(struct 
                return r;
 
        r = pci_bus_probe(dev);
-       if (r) {
-               int ret;
-
-               ret = HYPERVISOR_physdev_op(PHYSDEVOP_manage_pci_remove,
-                                           &manage_pci);
-               WARN_ON(ret && ret != -ENOSYS);
-       }
-
        return r;
 }
 
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/fbfront/xenfb.c
--- a/drivers/xen/fbfront/xenfb.c       Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/fbfront/xenfb.c       Wed Nov 26 10:24:15 2008 +0900
@@ -662,6 +662,10 @@ static int __devinit xenfb_probe(struct 
        }
        info->fb_info = fb_info;
 
+       ret = xenfb_connect_backend(dev, info);
+       if (ret < 0)
+               goto error;
+
        /* FIXME should this be delayed until backend XenbusStateConnected? */
        info->kthread = kthread_run(xenfb_thread, info, "xenfb thread");
        if (IS_ERR(info->kthread)) {
@@ -670,10 +674,6 @@ static int __devinit xenfb_probe(struct 
                xenbus_dev_fatal(dev, ret, "register_framebuffer");
                goto error;
        }
-
-       ret = xenfb_connect_backend(dev, info);
-       if (ret < 0)
-               goto error;
 
        return 0;
 
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/controller.c
--- a/drivers/xen/pciback/controller.c  Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/controller.c  Wed Nov 26 10:24:15 2008 +0900
@@ -406,3 +406,38 @@ void pciback_release_devices(struct pcib
        kfree(dev_data);
        pdev->pci_dev_data = NULL;
 }
+
+int pciback_get_pcifront_dev(struct pci_dev *pcidev, 
+               struct pciback_device *pdev, 
+               unsigned int *domain, unsigned int *bus, unsigned int *devfn)
+{
+       struct controller_dev_data *dev_data = pdev->pci_dev_data;
+       struct controller_dev_entry *dev_entry;
+       struct controller_list_entry *cntrl_entry;
+       unsigned long flags;
+       int found = 0;
+       spin_lock_irqsave(&dev_data->lock, flags);
+
+       list_for_each_entry(cntrl_entry, &dev_data->list, list) {
+               list_for_each_entry(dev_entry, &cntrl_entry->dev_list, list) {
+                       if ( (dev_entry->dev->bus->number == 
+                                       pcidev->bus->number) &&
+                               (dev_entry->dev->devfn ==
+                                       pcidev->devfn) &&
+                               (pci_domain_nr(dev_entry->dev->bus) ==
+                                       pci_domain_nr(pcidev->bus)))
+                       {
+                               found = 1;
+                               *domain = cntrl_entry->domain;
+                               *bus = cntrl_entry->bus;
+                               *devfn = dev_entry->devfn;
+                               goto out;
+                       }
+               }
+       }
+out:
+       spin_unlock_irqrestore(&dev_data->lock, flags);
+       return found;
+
+}
+
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/passthrough.c
--- a/drivers/xen/pciback/passthrough.c Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/passthrough.c Wed Nov 26 10:24:15 2008 +0900
@@ -164,3 +164,13 @@ void pciback_release_devices(struct pcib
        kfree(dev_data);
        pdev->pci_dev_data = NULL;
 }
+
+int pciback_get_pcifront_dev(struct pci_dev *pcidev, struct pciback_device *pdev,
+               unsigned int *domain, unsigned int *bus, unsigned int *devfn)
+
+{
+       *domain = pci_domain_nr(pcidev->bus);
+       *bus = pcidev->bus->number;
+       *devfn = pcidev->devfn;
+       return 1;
+}
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/pci_stub.c
--- a/drivers/xen/pciback/pci_stub.c    Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/pci_stub.c    Wed Nov 26 10:24:15 2008 +0900
@@ -6,15 +6,24 @@
  */
 #include <linux/module.h>
 #include <linux/init.h>
+#include <linux/rwsem.h>
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/kref.h>
+#include <linux/pci.h>
+#include <linux/wait.h>
 #include <asm/atomic.h>
+#include <xen/evtchn.h>
 #include "pciback.h"
 #include "conf_space.h"
 #include "conf_space_quirks.h"
 
 static char *pci_devs_to_hide = NULL;
+wait_queue_head_t aer_wait_queue;
+/* Add sem to sync AER handling with pciback remove/reconfigure ops.
+ * We want to avoid pciback devices being removed in the middle of AER ops.
+ */
+static DECLARE_RWSEM(pcistub_sem);
 module_param_named(hide, pci_devs_to_hide, charp, 0444);
 
 struct pcistub_device_id {
@@ -207,6 +216,10 @@ void pcistub_put_pci_dev(struct pci_dev 
 
        spin_unlock_irqrestore(&pcistub_devices_lock, flags);
 
+       /* Hold this lock so that the link between pcistub and pciback
+        * is not broken while AER processing is in progress.
+        */
+       down_write(&pcistub_sem);
        /* Cleanup our device
         * (so it's ready for the next domain)
         */
@@ -219,6 +232,7 @@ void pcistub_put_pci_dev(struct pci_dev 
        spin_unlock_irqrestore(&found_psdev->lock, flags);
 
        pcistub_device_put(found_psdev);
+       up_write(&pcistub_sem);
 }
 
 static int __devinit pcistub_match_one(struct pci_dev *dev,
@@ -279,6 +293,8 @@ static int __devinit pcistub_init_device
        pci_set_drvdata(dev, dev_data);
 
        dev_dbg(&dev->dev, "initializing config\n");
+
+       init_waitqueue_head(&aer_wait_queue);
        err = pciback_config_init_dev(dev);
        if (err)
                goto out;
@@ -477,6 +493,308 @@ static const struct pci_device_id pcistu
        {0,},
 };
 
+static void kill_domain_by_device(struct pcistub_device *psdev)
+{
+       struct xenbus_transaction xbt;
+       int err;
+       char nodename[1024];
+
+       if (!psdev) {
+               printk(KERN_ERR "pciback: psdev is NULL in kill_domain\n");
+               return;
+       }
+       snprintf(nodename, sizeof(nodename), "/local/domain/0/backend/pci/%d/0",
+               psdev->pdev->xdev->otherend_id);
+
+again:
+       err = xenbus_transaction_start(&xbt);
+       if (err)
+       {
+               dev_err(&psdev->dev->dev,
+                       "error %d when start xenbus transaction\n", err);
+               return;
+       }
+       /*PV AER handlers will set this flag*/
+       xenbus_printf(xbt, nodename, "aerState" , "aerfail" );
+       err = xenbus_transaction_end(xbt, 0);
+       if (err)
+       {
+               if (err == -EAGAIN)
+                       goto again;
+               dev_err(&psdev->dev->dev,
+                       "error %d when end xenbus transaction\n", err);
+               return;
+       }
+}
+
+/* For each aer recovery step error_detected, mmio_enabled, etc, front_end and
+ * backend need to have cooperation. In pciback, those steps will do similar
+ * jobs: send service request and waiting for front_end response. 
+*/
+static pci_ers_result_t common_process(struct pcistub_device *psdev, 
+               pci_channel_state_t state, int aer_cmd, pci_ers_result_t result)
+{
+       pci_ers_result_t res = result;
+       struct xen_pcie_aer_op *aer_op;
+       int ret;
+
+       /*with PV AER drivers*/
+       aer_op = &(psdev->pdev->sh_info->aer_op);
+       aer_op->cmd = aer_cmd ;
+       /*useful for error_detected callback*/
+       aer_op->err = state;
+       /*pcifront_end BDF*/
+       ret = pciback_get_pcifront_dev(psdev->dev, psdev->pdev,
+               &aer_op->domain, &aer_op->bus, &aer_op->devfn);
+       if (!ret) {
+               dev_err(&psdev->dev->dev,
+                       "pciback: failed to get pcifront device\n");
+               return PCI_ERS_RESULT_NONE; 
+       }
+       wmb();
+
+       dev_dbg(&psdev->dev->dev, 
+                       "pciback: aer_op %x dom %x bus %x devfn %x\n",  
+                       aer_cmd, aer_op->domain, aer_op->bus, aer_op->devfn);
+       /*local flag to mark there's aer request, pciback callback will use this
+       * flag to judge whether we need to check pci-front give aer service
+       * ack signal
+       */
+       set_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+
+       /*It is possible that a pcifront conf_read_write ops request invokes
+       * the callback which cause the spurious execution of wake_up. 
+       * Yet it is harmless and better than a spinlock here
+       */
+       set_bit(_XEN_PCIB_active, 
+               (unsigned long *)&psdev->pdev->sh_info->flags);
+       wmb();
+       notify_remote_via_irq(psdev->pdev->evtchn_irq);
+
+       ret = wait_event_timeout(aer_wait_queue, !(test_bit(_XEN_PCIB_active,
+                (unsigned long *)&psdev->pdev->sh_info->flags)), 300*HZ);
+
+       if (!ret) {
+               if (test_bit(_XEN_PCIB_active, 
+                       (unsigned long *)&psdev->pdev->sh_info->flags)) {
+                       dev_err(&psdev->dev->dev, 
+                               "pcifront aer process not responding!\n");
+                       clear_bit(_XEN_PCIB_active,
+                         (unsigned long *)&psdev->pdev->sh_info->flags);
+                       aer_op->err = PCI_ERS_RESULT_NONE;
+                       return res;
+               }
+       }
+       clear_bit(_PCIB_op_pending, (unsigned long *)&psdev->pdev->flags);
+
+       if ( test_bit( _XEN_PCIF_active,
+               (unsigned long*)&psdev->pdev->sh_info->flags)) {
+               dev_dbg(&psdev->dev->dev, 
+                       "schedule pci_conf service in pciback \n");
+               test_and_schedule_op(psdev->pdev);
+       }
+
+       res = (pci_ers_result_t)aer_op->err;
+       return res;
+} 
+
+/*
+* pciback_slot_reset: it will send the slot_reset request to  pcifront in case
+* of the device driver could provide this service, and then wait for pcifront
+* ack.
+* @dev: pointer to PCI devices
+* return value is used by aer_core do_recovery policy
+*/
+static pci_ers_result_t pciback_slot_reset(struct pci_dev *dev)
+{
+       struct pcistub_device *psdev;
+       pci_ers_result_t result;
+
+       result = PCI_ERS_RESULT_RECOVERED;
+       dev_dbg(&dev->dev, "pciback_slot_reset(bus:%x,devfn:%x)\n",
+               dev->bus->number, dev->devfn);
+
+       down_write(&pcistub_sem);
+       psdev = pcistub_device_find(pci_domain_nr(dev->bus),
+                               dev->bus->number,
+                               PCI_SLOT(dev->devfn),
+                               PCI_FUNC(dev->devfn));
+       if ( !psdev || !psdev->pdev || !psdev->pdev->sh_info )
+       {
+               dev_err(&dev->dev, 
+                       "pciback device is not found/in use/connected!\n");
+               goto end;
+       }
+       if ( !test_bit(_XEN_PCIB_AERHANDLER, 
+               (unsigned long *)&psdev->pdev->sh_info->flags) ) {
+               dev_err(&dev->dev, 
+                       "guest with no AER driver should have been killed\n");
+               goto release;
+       }
+       result = common_process(psdev, 1, XEN_PCI_OP_aer_slotreset, result);
+
+       if (result == PCI_ERS_RESULT_NONE ||
+               result == PCI_ERS_RESULT_DISCONNECT) {
+               dev_dbg(&dev->dev, 
+                       "No AER slot_reset service or disconnected!\n");
+               kill_domain_by_device(psdev);
+       }
+release:
+       pcistub_device_put(psdev);
+end:
+       up_write(&pcistub_sem);
+       return result;
+
+}
+
+
+/*pciback_mmio_enabled: it will send the mmio_enabled request to  pcifront 
+* in case of the device driver could provide this service, and then wait 
+* for pcifront ack.
+* @dev: pointer to PCI devices
+* return value is used by aer_core do_recovery policy
+*/
+
+static pci_ers_result_t pciback_mmio_enabled(struct pci_dev *dev)
+{
+       struct pcistub_device *psdev;
+       pci_ers_result_t result;
+
+       result = PCI_ERS_RESULT_RECOVERED;
+       dev_dbg(&dev->dev, "pciback_mmio_enabled(bus:%x,devfn:%x)\n",
+               dev->bus->number, dev->devfn);
+
+       down_write(&pcistub_sem);
+       psdev = pcistub_device_find(pci_domain_nr(dev->bus),
+                               dev->bus->number,
+                               PCI_SLOT(dev->devfn),
+                               PCI_FUNC(dev->devfn));
+       if ( !psdev || !psdev->pdev || !psdev->pdev->sh_info)
+       {
+               dev_err(&dev->dev, 
+                       "pciback device is not found/in use/connected!\n");
+               goto end;
+       }
+       if ( !test_bit(_XEN_PCIB_AERHANDLER, 
+               (unsigned long *)&psdev->pdev->sh_info->flags) ) {
+               dev_err(&dev->dev, 
+                       "guest with no AER driver should have been killed\n");
+               goto release;
+       }
+       result = common_process(psdev, 1, XEN_PCI_OP_aer_mmio, result);
+
+       if (result == PCI_ERS_RESULT_NONE ||
+               result == PCI_ERS_RESULT_DISCONNECT) {
+               dev_dbg(&dev->dev, 
+                       "No AER mmio_enabled service or disconnected!\n");
+               kill_domain_by_device(psdev);
+       }
+release:
+       pcistub_device_put(psdev);
+end:
+       up_write(&pcistub_sem);
+       return result;
+}
+
+/*pciback_error_detected: it will send the error_detected request to  pcifront 
+* in case of the device driver could provide this service, and then wait 
+* for pcifront ack.
+* @dev: pointer to PCI devices
+* @error: the current PCI connection state
+* return value is used by aer_core do_recovery policy
+*/
+
+static pci_ers_result_t pciback_error_detected(struct pci_dev *dev,
+       pci_channel_state_t error)
+{
+       struct pcistub_device *psdev;
+       pci_ers_result_t result;
+
+       result = PCI_ERS_RESULT_CAN_RECOVER;
+       dev_dbg(&dev->dev, "pciback_error_detected(bus:%x,devfn:%x)\n",
+               dev->bus->number, dev->devfn);
+
+       down_write(&pcistub_sem);
+       psdev = pcistub_device_find(pci_domain_nr(dev->bus),
+                               dev->bus->number,
+                               PCI_SLOT(dev->devfn),
+                               PCI_FUNC(dev->devfn));
+       if ( !psdev || !psdev->pdev || !psdev->pdev->sh_info)
+       {
+               dev_err(&dev->dev, 
+                       "pciback device is not found/in use/connected!\n");
+               goto end;
+       }
+       /*Guest owns the device yet no aer handler registered, kill guest*/
+       if ( !test_bit(_XEN_PCIB_AERHANDLER, 
+               (unsigned long *)&psdev->pdev->sh_info->flags) ) {
+               dev_dbg(&dev->dev, "guest may have no aer driver, kill it\n");
+               kill_domain_by_device(psdev);
+               goto release;
+       }
+       result = common_process(psdev, error, XEN_PCI_OP_aer_detected, result);
+
+       if (result == PCI_ERS_RESULT_NONE ||
+               result == PCI_ERS_RESULT_DISCONNECT) {
+               dev_dbg(&dev->dev, 
+                       "No AER error_detected service or disconnected!\n");
+               kill_domain_by_device(psdev);
+       }
+release:
+       pcistub_device_put(psdev);
+end:
+       up_write(&pcistub_sem);
+       return result;
+}
+
+/*pciback_error_resume: it will send the error_resume request to  pcifront 
+* in case of the device driver could provide this service, and then wait 
+* for pcifront ack.
+* @dev: pointer to PCI devices
+*/
+
+static void pciback_error_resume(struct pci_dev *dev)
+{
+       struct pcistub_device *psdev;
+
+       dev_dbg(&dev->dev, "pciback_error_resume(bus:%x,devfn:%x)\n",
+               dev->bus->number, dev->devfn);
+
+       down_write(&pcistub_sem);
+       psdev = pcistub_device_find(pci_domain_nr(dev->bus),
+                               dev->bus->number,
+                               PCI_SLOT(dev->devfn),
+                               PCI_FUNC(dev->devfn));
+       if ( !psdev || !psdev->pdev || !psdev->pdev->sh_info)
+       {
+               dev_err(&dev->dev, 
+                       "pciback device is not found/in use/connected!\n");
+               goto end;
+       }
+
+       if ( !test_bit(_XEN_PCIB_AERHANDLER, 
+               (unsigned long *)&psdev->pdev->sh_info->flags) ) {
+               dev_err(&dev->dev, 
+                       "guest with no AER driver should have been killed\n");
+               kill_domain_by_device(psdev);
+               goto release;
+       }
+       common_process(psdev, 1, XEN_PCI_OP_aer_resume, PCI_ERS_RESULT_RECOVERED);
+release:
+       pcistub_device_put(psdev);
+end:
+       up_write(&pcistub_sem);
+       return;
+}
+
+/*add pciback AER handling*/
+static struct pci_error_handlers pciback_error_handler = {
+       .error_detected = pciback_error_detected,
+       .mmio_enabled = pciback_mmio_enabled,
+       .slot_reset = pciback_slot_reset,
+       .resume = pciback_error_resume,
+};
+
 /*
  * Note: There is no MODULE_DEVICE_TABLE entry here because this isn't
  * for a normal device. I don't want it to be loaded automatically.
@@ -487,6 +805,7 @@ static struct pci_driver pciback_pci_dri
        .id_table = pcistub_ids,
        .probe = pcistub_probe,
        .remove = pcistub_remove,
+       .err_handler = &pciback_error_handler,
 };
 
 static inline int str_to_slot(const char *buf, int *domain, int *bus,
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/pciback.h
--- a/drivers/xen/pciback/pciback.h     Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/pciback.h     Wed Nov 26 10:24:15 2008 +0900
@@ -22,6 +22,8 @@ struct pci_dev_entry {
 
 #define _PDEVF_op_active       (0)
 #define PDEVF_op_active        (1<<(_PDEVF_op_active))
+#define _PCIB_op_pending       (1)
+#define PCIB_op_pending                (1<<(_PCIB_op_pending))
 
 struct pciback_device {
        void *pci_dev_data;
@@ -81,6 +83,16 @@ struct pci_dev *pciback_get_pci_dev(stru
 struct pci_dev *pciback_get_pci_dev(struct pciback_device *pdev,
                                    unsigned int domain, unsigned int bus,
                                    unsigned int devfn);
+
+/** 
+* Add for domain0 PCIE-AER handling. Get guest domain/bus/devfn in pciback
+* before sending aer request to pcifront, so that guest could identify 
+* device, cooperate with pciback to finish aer recovery job if device driver
+* has the capability
+*/
+
+int pciback_get_pcifront_dev(struct pci_dev *pcidev, struct pciback_device *pdev,
+                               unsigned int *domain, unsigned int *bus, unsigned int *devfn);
 int pciback_init_devices(struct pciback_device *pdev);
 int pciback_publish_pci_roots(struct pciback_device *pdev,
                              publish_pci_root_cb cb);
@@ -108,4 +120,7 @@ int pciback_disable_msix(struct pciback_
                         struct pci_dev *dev, struct xen_pci_op *op);
 #endif
 extern int verbose_request;
+
+void test_and_schedule_op(struct pciback_device *pdev);
 #endif
+
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/pciback_ops.c
--- a/drivers/xen/pciback/pciback_ops.c Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/pciback_ops.c Wed Nov 26 10:24:15 2008 +0900
@@ -4,6 +4,7 @@
  *   Author: Ryan Wilson <hap9@xxxxxxxxxxxxxx>
  */
 #include <linux/module.h>
+#include <linux/wait.h>
 #include <asm/bitops.h>
 #include <xen/evtchn.h>
 #include "pciback.h"
@@ -37,14 +38,29 @@ void pciback_reset_device(struct pci_dev
                }
        }
 }
-
-static inline void test_and_schedule_op(struct pciback_device *pdev)
+extern wait_queue_head_t aer_wait_queue;
+extern struct workqueue_struct *pciback_wq;
+/*
+* Now the same evtchn is used for both pcifront conf_read_write request
+* as well as pcie aer front end ack. We use a new work_queue to schedule
+* pciback conf_read_write service for avoiding conflict with aer_core 
+* do_recovery job which also use the system default work_queue
+*/
+void test_and_schedule_op(struct pciback_device *pdev)
 {
        /* Check that frontend is requesting an operation and that we are not
         * already processing a request */
        if (test_bit(_XEN_PCIF_active, (unsigned long *)&pdev->sh_info->flags)
            && !test_and_set_bit(_PDEVF_op_active, &pdev->flags))
-               schedule_work(&pdev->op_work);
+       {
+               queue_work(pciback_wq, &pdev->op_work);
+       }
+       /*_XEN_PCIB_active should have been cleared by pcifront. And also make
+       sure pciback is waiting for ack by checking _PCIB_op_pending*/
+       if (!test_bit(_XEN_PCIB_active,(unsigned long *)&pdev->sh_info->flags)
+           &&test_bit(_PCIB_op_pending, &pdev->flags)) {
+               wake_up(&aer_wait_queue);
+       }
 }
 
 /* Performing the configuration space reads/writes must not be done in atomic
@@ -103,7 +119,8 @@ void pciback_do_op(void *data)
        smp_mb__after_clear_bit(); /* /before/ final check for work */
 
        /* Check to see if the driver domain tried to start another request in
-        * between clearing _XEN_PCIF_active and clearing _PDEVF_op_active. */
+        * between clearing _XEN_PCIF_active and clearing _PDEVF_op_active. 
+       */
        test_and_schedule_op(pdev);
 }
 
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/slot.c
--- a/drivers/xen/pciback/slot.c        Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/slot.c        Wed Nov 26 10:24:15 2008 +0900
@@ -155,3 +155,33 @@ void pciback_release_devices(struct pcib
        kfree(slot_dev);
        pdev->pci_dev_data = NULL;
 }
+
+int pciback_get_pcifront_dev(struct pci_dev *pcidev, struct pciback_device *pdev,
+               unsigned int *domain, unsigned int *bus, unsigned int *devfn)
+{
+       int slot, busnr;
+       struct slot_dev_data *slot_dev = pdev->pci_dev_data;
+       struct pci_dev *dev;
+       int found = 0;
+       unsigned long flags;
+
+       spin_lock_irqsave(&slot_dev->lock, flags);
+
+       for (busnr = 0; busnr < PCI_BUS_NBR; busnr++)
+               for (slot = 0; slot < PCI_SLOT_MAX; slot++) {
+                       dev = slot_dev->slots[busnr][slot];
+                       if (dev && dev->bus->number == pcidev->bus->number
+                               && dev->devfn == pcidev->devfn
+                               && pci_domain_nr(dev->bus) == pci_domain_nr(pcidev->bus)) {
+                               found = 1;
+                               *domain = 0;
+                               *bus = busnr;
+                               *devfn = PCI_DEVFN(slot,0);
+                               goto out;
+                       }
+               }
+out:
+       spin_unlock_irqrestore(&slot_dev->lock, flags);
+       return found;
+
+}
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/vpci.c
--- a/drivers/xen/pciback/vpci.c        Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/vpci.c        Wed Nov 26 10:24:15 2008 +0900
@@ -210,3 +210,33 @@ void pciback_release_devices(struct pcib
        kfree(vpci_dev);
        pdev->pci_dev_data = NULL;
 }
+
+int pciback_get_pcifront_dev(struct pci_dev *pcidev, struct pciback_device *pdev,
+               unsigned int *domain, unsigned int *bus, unsigned int *devfn)
+{
+       struct pci_dev_entry *entry;
+       struct pci_dev *dev = NULL;
+       struct vpci_dev_data *vpci_dev = pdev->pci_dev_data;
+       unsigned long flags;
+       int found = 0, slot;
+
+       spin_lock_irqsave(&vpci_dev->lock, flags);
+       for (slot = 0; slot < PCI_SLOT_MAX; slot++) {
+               list_for_each_entry(entry,
+                           &vpci_dev->dev_list[slot],
+                           list) {
+                       dev = entry->dev;
+                       if (dev && dev->bus->number == pcidev->bus->number
+                               && pci_domain_nr(dev->bus) == pci_domain_nr(pcidev->bus)
+                               && dev->devfn == pcidev->devfn)
+                       {
+                               found = 1;
+                               *domain = 0;
+                               *bus = 0;
+                               *devfn = PCI_DEVFN(slot, PCI_FUNC(pcidev->devfn));
+                       }
+               }               
+       }
+       spin_unlock_irqrestore(&vpci_dev->lock, flags);
+       return found;
+}
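A minimal userspace sketch of the devfn arithmetic used by the vpci mapping above, assuming the usual one-byte devfn layout (slot in bits 7:3, function in bits 2:0). The macros are redefined locally so the snippet compiles outside the kernel; guest_devfn() and guest_devfn_selfcheck() are hypothetical helper names, not something this changeset adds.

```c
#include <assert.h>

/* Local copies of the conventional PCI devfn helpers: a devfn is one byte,
 * with the slot number in the upper 5 bits and the function in the lower 3. */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)       ((devfn) & 0x07)

/* Hypothetical helper: the guest sees the device in a virtual slot but
 * keeps the function number of the physical device, as in the vpci code. */
static unsigned int guest_devfn(unsigned int virtual_slot,
                                unsigned int host_devfn)
{
    return PCI_DEVFN(virtual_slot, PCI_FUNC(host_devfn));
}

/* Hypothetical self-check: host device 03.1 placed in virtual slot 0. */
static void guest_devfn_selfcheck(void)
{
    unsigned int host = PCI_DEVFN(3, 1);

    assert(PCI_SLOT(guest_devfn(0, host)) == 0);
    assert(PCI_FUNC(guest_devfn(0, host)) == 1);
}
```

The function number is carried over unchanged so that a multi-function device keeps its function layout inside the guest; only the slot is virtualized.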
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pciback/xenbus.c
--- a/drivers/xen/pciback/xenbus.c      Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pciback/xenbus.c      Wed Nov 26 10:24:15 2008 +0900
@@ -12,6 +12,7 @@
 #include "pciback.h"
 
 #define INVALID_EVTCHN_IRQ  (-1)
+struct workqueue_struct *pciback_wq;
 
 static struct pciback_device *alloc_pdev(struct xenbus_device *xdev)
 {
@@ -694,11 +695,17 @@ int __init pciback_xenbus_register(void)
 {
        if (!is_running_on_xen())
                return -ENODEV;
-
+       pciback_wq = create_workqueue("pciback_workqueue");
+       if (!pciback_wq) {
+               printk(KERN_ERR "pciback_xenbus_register: create "
+                       "pciback_workqueue failed\n");
+               return -EFAULT;
+       }
        return xenbus_register_backend(&xenbus_pciback_driver);
 }
 
 void __exit pciback_xenbus_unregister(void)
 {
+       destroy_workqueue(pciback_wq);
        xenbus_unregister_driver(&xenbus_pciback_driver);
 }
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pcifront/pci_op.c
--- a/drivers/xen/pcifront/pci_op.c     Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pcifront/pci_op.c     Wed Nov 26 10:24:15 2008 +0900
@@ -8,6 +8,7 @@
 #include <linux/init.h>
 #include <linux/pci.h>
 #include <linux/spinlock.h>
+#include <asm/bitops.h>
 #include <linux/time.h>
 #include <xen/evtchn.h>
 #include "pcifront.h"
@@ -151,6 +152,15 @@ static int errno_to_pcibios_err(int errn
                return PCIBIOS_SET_FAILED;
        }
        return errno;
+}
+
+static inline void schedule_pcifront_aer_op(struct pcifront_device *pdev)
+{
+       if (test_bit(_XEN_PCIB_active, (unsigned long *)&pdev->sh_info->flags)
+               && !test_and_set_bit(_PDEVB_op_active, &pdev->flags)) {
+               dev_dbg(&pdev->xdev->dev, "schedule aer frontend job\n");
+               schedule_work(&pdev->op_work);
+       }
 }
 
 static int do_pci_op(struct pcifront_device *pdev, struct xen_pci_op *op)
@@ -199,6 +209,18 @@ static int do_pci_op(struct pcifront_dev
                }
        }
 
+       /*
+       * We might lose backend service request since we 
+       * reuse same evtchn with pci_conf backend response. So re-schedule
+       * aer pcifront service.
+       */
+       if (test_bit(_XEN_PCIB_active, 
+                       (unsigned long*)&pdev->sh_info->flags)) {
+               dev_err(&pdev->xdev->dev, 
+                       "schedule aer pcifront service\n");
+               schedule_pcifront_aer_op(pdev);
+       }
+
        memcpy(op, active_op, sizeof(struct xen_pci_op));
 
        err = op->err;
@@ -549,3 +571,96 @@ void pcifront_free_roots(struct pcifront
                kfree(bus_entry);
        }
 }
+
+static pci_ers_result_t pcifront_common_process( int cmd, struct pcifront_device *pdev,
+       pci_channel_state_t state)
+{
+       pci_ers_result_t result;
+       struct pci_driver *pdrv;
+       int bus = pdev->sh_info->aer_op.bus;
+       int devfn = pdev->sh_info->aer_op.devfn;
+       struct pci_dev *pcidev;
+       int flag = 0;
+
+       dev_dbg(&pdev->xdev->dev, 
+               "pcifront AER process: cmd %x (bus:%x, devfn:%x)\n",
+               cmd, bus, devfn);
+       result = PCI_ERS_RESULT_NONE;
+
+       pcidev = pci_get_bus_and_slot(bus, devfn);
+       if (!pcidev || !pcidev->driver){
+               dev_err(&pdev->xdev->dev,
+                       "device or driver is NULL\n");
+               return result;
+       }
+       pdrv = pcidev->driver;
+
+       if (get_driver(&pdrv->driver)) {
+               if (pdrv->err_handler && pdrv->err_handler->error_detected) {
+                       dev_dbg(&pcidev->dev,
+                               "trying to call AER service\n");
+                       if (pcidev) {
+                               flag = 1;
+                               switch(cmd) {
+                               case XEN_PCI_OP_aer_detected:
+                                       result = pdrv->err_handler->error_detected(pcidev, state);
+                                       break;
+                               case XEN_PCI_OP_aer_mmio:
+                                       result = pdrv->err_handler->mmio_enabled(pcidev);
+                                       break;
+                               case XEN_PCI_OP_aer_slotreset:
+                                       result = pdrv->err_handler->slot_reset(pcidev);
+                                       break;
+                               case XEN_PCI_OP_aer_resume:
+                                       pdrv->err_handler->resume(pcidev);
+                                       break;
+                               default:
+                                       dev_err(&pdev->xdev->dev,
+                                               "bad request in aer recovery operation!\n");
+
+                               }
+                       }
+               }
+               put_driver(&pdrv->driver);
+       }
+       if (!flag)
+               result = PCI_ERS_RESULT_NONE;
+
+       return result;
+}
+
+
+void pcifront_do_aer(void *data)
+{
+       struct pcifront_device *pdev = data;
+       int cmd = pdev->sh_info->aer_op.cmd;
+       pci_channel_state_t state = 
+               (pci_channel_state_t)pdev->sh_info->aer_op.err;
+
+       /*If a pci_conf op is in progress, 
+               we have to wait until it is done before service aer op*/
+       dev_dbg(&pdev->xdev->dev, 
+               "pcifront service aer bus %x devfn %x\n", pdev->sh_info->aer_op.bus,
+               pdev->sh_info->aer_op.devfn);
+
+       pdev->sh_info->aer_op.err = pcifront_common_process(cmd, pdev, state);
+
+       wmb();
+       clear_bit(_XEN_PCIB_active, (unsigned long*)&pdev->sh_info->flags);
+       notify_remote_via_evtchn(pdev->evtchn);
+
+       /*in case of we lost an aer request in four lines time_window*/
+       smp_mb__before_clear_bit();
+       clear_bit( _PDEVB_op_active, &pdev->flags);
+       smp_mb__after_clear_bit();
+
+       schedule_pcifront_aer_op(pdev);
+
+}
+
+irqreturn_t pcifront_handler_aer(int irq, void *dev, struct pt_regs *regs)
+{
+       struct pcifront_device *pdev = dev;
+       schedule_pcifront_aer_op(pdev);
+       return IRQ_HANDLED;
+}
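The request/ack handshake between pciback and pcifront can be modelled in a few lines of userspace C. This is only a single-threaded sketch under stated assumptions: struct shared_page stands in for the shared info page, the bit number and result values are illustrative rather than taken from pciif.h, and memory barriers, event-channel notification and the wait queue are left out. All names below are hypothetical.

```c
#include <assert.h>

enum { PCIB_ACTIVE_BIT = 1 };                 /* illustrative bit number */
enum { AER_NONE = 1, AER_RECOVERED = 5 };     /* illustrative result codes */

struct shared_page {                          /* stand-in for the shared info page */
    unsigned long flags;
    int aer_cmd;
    int aer_err;
};

/* Backend: publish the request, then raise the "backend active" bit.
 * The patch places wmb() between these stores and then notifies the
 * frontend over the event channel. */
static void backend_post(struct shared_page *sh, int cmd, int state)
{
    sh->aer_cmd = cmd;
    sh->aer_err = state;
    sh->flags |= 1UL << PCIB_ACTIVE_BIT;
}

/* Frontend: run the driver's handler, store its verdict, drop the bit. */
static void frontend_service(struct shared_page *sh,
                             int (*handler)(int cmd, int state))
{
    if (!(sh->flags & (1UL << PCIB_ACTIVE_BIT)))
        return;                               /* no request pending */
    sh->aer_err = handler(sh->aer_cmd, sh->aer_err);
    sh->flags &= ~(1UL << PCIB_ACTIVE_BIT);
}

/* Backend: the request is complete once the bit has been cleared. */
static int backend_collect(const struct shared_page *sh, int *result)
{
    if (sh->flags & (1UL << PCIB_ACTIVE_BIT))
        return 0;
    *result = sh->aer_err;
    return 1;
}

static int demo_handler(int cmd, int state)
{
    (void)cmd;
    (void)state;
    return AER_RECOVERED;
}

/* Walk through one full round trip and return the collected result. */
static int run_demo(void)
{
    struct shared_page sh = { 0, 0, 0 };
    int result = AER_NONE;

    backend_post(&sh, 1, 1);
    assert(!backend_collect(&sh, &result));   /* still waiting for the ack */
    frontend_service(&sh, demo_handler);
    assert(backend_collect(&sh, &result));
    return result;
}
```

Reusing one flag bit as both "request posted" and "not yet acknowledged" is what lets the same event channel carry the ack: the backend only has to re-test the bit when it is woken.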
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pcifront/pcifront.h
--- a/drivers/xen/pcifront/pcifront.h   Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pcifront/pcifront.h   Wed Nov 26 10:24:15 2008 +0900
@@ -10,12 +10,18 @@
 #include <linux/pci.h>
 #include <xen/xenbus.h>
 #include <xen/interface/io/pciif.h>
+#include <linux/interrupt.h>
 #include <xen/pcifront.h>
+#include <asm/atomic.h>
+#include <linux/workqueue.h>
 
 struct pci_bus_entry {
        struct list_head list;
        struct pci_bus *bus;
 };
+
+#define _PDEVB_op_active               (0)
+#define PDEVB_op_active                (1 << (_PDEVB_op_active))
 
 struct pcifront_device {
        struct xenbus_device *xdev;
@@ -28,6 +34,9 @@ struct pcifront_device {
        /* Lock this when doing any operations in sh_info */
        spinlock_t sh_info_lock;
        struct xen_pci_sharedinfo *sh_info;
+       struct work_struct op_work;
+       unsigned long flags;
+
 };
 
 int pcifront_connect(struct pcifront_device *pdev);
@@ -39,4 +48,8 @@ int pcifront_rescan_root(struct pcifront
                         unsigned int domain, unsigned int bus);
 void pcifront_free_roots(struct pcifront_device *pdev);
 
+void pcifront_do_aer( void *data);
+
+irqreturn_t pcifront_handler_aer(int irq, void *dev, struct pt_regs *regs);
+
 #endif /* __XEN_PCIFRONT_H__ */
diff -r 61d1f2810617 -r 6591b4869889 drivers/xen/pcifront/xenbus.c
--- a/drivers/xen/pcifront/xenbus.c     Tue Nov 04 12:43:37 2008 +0900
+++ b/drivers/xen/pcifront/xenbus.c     Wed Nov 26 10:24:15 2008 +0900
@@ -7,6 +7,7 @@
 #include <linux/init.h>
 #include <linux/mm.h>
 #include <xen/xenbus.h>
+#include <xen/evtchn.h>
 #include <xen/gnttab.h>
 #include "pcifront.h"
 
@@ -34,6 +35,9 @@ static struct pcifront_device *alloc_pde
        }
        pdev->sh_info->flags = 0;
 
+       /*Flag for registering PV AER handler*/
+       set_bit(_XEN_PCIB_AERHANDLER, (void*)&pdev->sh_info->flags);
+
        xdev->dev.driver_data = pdev;
        pdev->xdev = xdev;
 
@@ -45,6 +49,8 @@ static struct pcifront_device *alloc_pde
        pdev->evtchn = INVALID_EVTCHN;
        pdev->gnt_ref = INVALID_GRANT_REF;
 
+       INIT_WORK(&pdev->op_work, pcifront_do_aer, pdev);
+
        dev_dbg(&xdev->dev, "Allocated pdev @ 0x%p pdev->sh_info @ 0x%p\n",
                pdev, pdev->sh_info);
       out:
@@ -56,6 +62,11 @@ static void free_pdev(struct pcifront_de
        dev_dbg(&pdev->xdev->dev, "freeing pdev @ 0x%p\n", pdev);
 
        pcifront_free_roots(pdev);
+
+       /*For PCIE_AER error handling job*/
+       cancel_delayed_work(&pdev->op_work);
+       flush_scheduled_work();
+       unbind_from_irqhandler(pdev->evtchn, pdev);
 
        if (pdev->evtchn != INVALID_EVTCHN)
                xenbus_free_evtchn(pdev->xdev, pdev->evtchn);
@@ -83,6 +94,9 @@ static int pcifront_publish_info(struct 
        err = xenbus_alloc_evtchn(pdev->xdev, &pdev->evtchn);
        if (err)
                goto out;
+
+       bind_caller_port_to_irqhandler(pdev->evtchn, pcifront_handler_aer, 
+               SA_SAMPLE_RANDOM, "pcifront", pdev); 
 
       do_publish:
        err = xenbus_transaction_start(&trans);
diff -r 61d1f2810617 -r 6591b4869889 include/asm-i386/mach-xen/asm/hypercall.h
--- a/include/asm-i386/mach-xen/asm/hypercall.h Tue Nov 04 12:43:37 2008 +0900
+++ b/include/asm-i386/mach-xen/asm/hypercall.h Wed Nov 26 10:24:15 2008 +0900
@@ -280,13 +280,6 @@ HYPERVISOR_event_channel_op(
 }
 
 static inline int __must_check
-HYPERVISOR_acm_op(
-       int cmd, void *arg)
-{
-       return _hypercall2(int, acm_op, cmd, arg);
-}
-
-static inline int __must_check
 HYPERVISOR_xen_version(
        int cmd, void *arg)
 {
diff -r 61d1f2810617 -r 6591b4869889 include/asm-x86_64/mach-xen/asm/hypercall.h
--- a/include/asm-x86_64/mach-xen/asm/hypercall.h       Tue Nov 04 12:43:37 2008 +0900
+++ b/include/asm-x86_64/mach-xen/asm/hypercall.h       Wed Nov 26 10:24:15 2008 +0900
@@ -278,13 +278,6 @@ HYPERVISOR_event_channel_op(
 }
 
 static inline int __must_check
-HYPERVISOR_acm_op(
-       int cmd, void *arg)
-{
-       return _hypercall2(int, acm_op, cmd, arg);
-}
-
-static inline int __must_check
 HYPERVISOR_xen_version(
        int cmd, void *arg)
 {
diff -r 61d1f2810617 -r 6591b4869889 include/linux/aer.h
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/include/linux/aer.h       Wed Nov 26 10:24:15 2008 +0900
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2006 Intel Corp.
+ *     Tom Long Nguyen (tom.l.nguyen@xxxxxxxxx)
+ *     Zhang Yanmin (yanmin.zhang@xxxxxxxxx)
+ */
+
+#ifndef _AER_H_
+#define _AER_H_
+
+#if defined(CONFIG_PCIEAER)
+/* pci-e port driver needs this function to enable aer */
+extern int pci_enable_pcie_error_reporting(struct pci_dev *dev);
+extern int pci_find_aer_capability(struct pci_dev *dev);
+extern int pci_disable_pcie_error_reporting(struct pci_dev *dev);
+extern int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev);
+#else
+#define pci_enable_pcie_error_reporting(dev)           do { } while (0)
+#define pci_find_aer_capability(dev)                   do { } while (0)
+#define pci_disable_pcie_error_reporting(dev)          do { } while (0)
+#define pci_cleanup_aer_uncorrect_error_status(dev)    do { } while (0)
+#endif
+
+#endif //_AER_H_
+
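Note on the !CONFIG_PCIEAER stubs above: a `do { } while (0)` macro is a statement, so a caller that uses the return value (for example `pos = pci_find_aer_capability(dev);`) would not compile when AER is configured out. A minimal sketch of an alternative using static inline stubs follows. The return values (-EINVAL and 0), the stand-in `struct pci_dev`, and the caller `example_probe()` are assumptions made for illustration and are not part of this changeset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for the kernel's struct pci_dev so the sketch compiles alone. */
struct pci_dev {
        unsigned int devfn;
};

/*
 * Hypothetical replacements for the !CONFIG_PCIEAER stubs.
 * Assumed return values: -EINVAL means "AER support is not built in",
 * 0 means "no AER capability found".
 */
static inline int pci_enable_pcie_error_reporting(struct pci_dev *dev)
{
        (void)dev;
        return -EINVAL;
}

static inline int pci_find_aer_capability(struct pci_dev *dev)
{
        (void)dev;
        return 0;
}

static inline int pci_disable_pcie_error_reporting(struct pci_dev *dev)
{
        (void)dev;
        return -EINVAL;
}

static inline int pci_cleanup_aer_uncorrect_error_status(struct pci_dev *dev)
{
        (void)dev;
        return -EINVAL;
}

/*
 * Hypothetical caller that consumes the return values.  This shape is
 * what the statement-style macros cannot support.
 */
static int example_probe(struct pci_dev *dev)
{
        int pos;

        assert(dev != NULL);
        pos = pci_find_aer_capability(dev);
        if (!pos)
                return 0;       /* nothing to enable */
        return pci_enable_pcie_error_reporting(dev);
}
```

With stubs of this shape, callers need no #ifdef of their own and the compiler still type-checks the argument.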
diff -r 61d1f2810617 -r 6591b4869889 include/linux/pci.h
--- a/include/linux/pci.h       Tue Nov 04 12:43:37 2008 +0900
+++ b/include/linux/pci.h       Wed Nov 26 10:24:15 2008 +0900
@@ -456,6 +456,7 @@ struct pci_dev *pci_get_subsys (unsigned
                                unsigned int ss_vendor, unsigned int ss_device,
                                struct pci_dev *from);
 struct pci_dev *pci_get_slot (struct pci_bus *bus, unsigned int devfn);
+struct pci_dev *pci_get_bus_and_slot(unsigned int bus, unsigned int devfn);
 struct pci_dev *pci_get_class (unsigned int class, struct pci_dev *from);
 int pci_dev_present(const struct pci_device_id *ids);
 
@@ -655,6 +656,11 @@ static inline struct pci_dev *pci_find_s
 static inline struct pci_dev *pci_find_slot(unsigned int bus, unsigned int devfn)
 { return NULL; }
 
+
+static inline struct pci_dev *pci_get_bus_and_slot(unsigned int bus, unsigned int devfn)
+{
+       return NULL;
+}
 static inline struct pci_dev *pci_get_device (unsigned int vendor, unsigned int device, struct pci_dev *from)
 { return NULL; }
 
diff -r 61d1f2810617 -r 6591b4869889 include/linux/pci_ids.h
--- a/include/linux/pci_ids.h   Tue Nov 04 12:43:37 2008 +0900
+++ b/include/linux/pci_ids.h   Wed Nov 26 10:24:15 2008 +0900
@@ -2219,6 +2219,9 @@
 #define PCI_DEVICE_ID_INTEL_ICH10_3    0x3a1a
 #define PCI_DEVICE_ID_INTEL_ICH10_4    0x3a30
 #define PCI_DEVICE_ID_INTEL_ICH10_5    0x3a60
+#define PCI_DEVICE_ID_INTEL_PCH_LPC_MIN        0x3b00
+#define PCI_DEVICE_ID_INTEL_PCH_LPC_MAX        0x3b1f
+#define PCI_DEVICE_ID_INTEL_PCH_SMBUS  0x3b30
 #define PCI_DEVICE_ID_INTEL_82371SB_0  0x7000
 #define PCI_DEVICE_ID_INTEL_82371SB_1  0x7010
 #define PCI_DEVICE_ID_INTEL_82371SB_2  0x7020
diff -r 61d1f2810617 -r 6591b4869889 include/linux/pcieport_if.h
--- a/include/linux/pcieport_if.h       Tue Nov 04 12:43:37 2008 +0900
+++ b/include/linux/pcieport_if.h       Wed Nov 26 10:24:15 2008 +0900
@@ -62,6 +62,12 @@ struct pcie_port_service_driver {
        int (*suspend) (struct pcie_device *dev, pm_message_t state);
        int (*resume) (struct pcie_device *dev);
 
+       /* Service Error Recovery Handler */
+       struct pci_error_handlers *err_handler;
+
+       /* Link Reset Capability - AER service driver specific */
+       pci_ers_result_t (*reset_link) (struct pci_dev *dev);
+
        const struct pcie_port_service_id *id_table;
        struct device_driver driver;
 };
diff -r 61d1f2810617 -r 6591b4869889 include/xen/interface/features.h
--- a/include/xen/interface/features.h  Tue Nov 04 12:43:37 2008 +0900
+++ b/include/xen/interface/features.h  Wed Nov 26 10:24:15 2008 +0900
@@ -62,6 +62,12 @@
 /* x86: Does this Xen host support the MMU_{CLEAR,COPY}_PAGE hypercall? */
 #define XENFEAT_highmem_assist             6
 
+/*
+ * If set, GNTTABOP_map_grant_ref honors flags to be placed into guest kernel
+ * available pte bits.
+ */
+#define XENFEAT_gnttab_map_avail_bits      7
+
 #define XENFEAT_NR_SUBMAPS 1
 
 #endif /* __XEN_PUBLIC_FEATURES_H__ */
diff -r 61d1f2810617 -r 6591b4869889 include/xen/interface/grant_table.h
--- a/include/xen/interface/grant_table.h       Tue Nov 04 12:43:37 2008 +0900
+++ b/include/xen/interface/grant_table.h       Wed Nov 26 10:24:15 2008 +0900
@@ -360,7 +360,7 @@ DEFINE_XEN_GUEST_HANDLE(gnttab_unmap_and
 
 
 /*
- * Bitfield values for update_pin_status.flags.
+ * Bitfield values for gnttab_map_grant_ref.flags.
  */
  /* Map the grant entry for access by I/O devices. */
 #define _GNTMAP_device_map      (0)
@@ -388,6 +388,13 @@ DEFINE_XEN_GUEST_HANDLE(gnttab_unmap_and
 #define GNTMAP_contains_pte     (1<<_GNTMAP_contains_pte)
 
 /*
+ * Bits to be placed in guest kernel available PTE bits (architecture
+ * dependent; only supported when XENFEAT_gnttab_map_avail_bits is set).
+ */
+#define _GNTMAP_guest_avail0    (16)
+#define GNTMAP_guest_avail_mask ((uint32_t)~0 << _GNTMAP_guest_avail0)
+
+/*
  * Values for error status returns. All errors are -ve.
  */
 #define GNTST_okay             (0)  /* Normal return.                        */
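Note on GNTMAP_guest_avail_mask above: with _GNTMAP_guest_avail0 at 16, the mask evaluates to 0xffff0000, so the upper 16 bits of the flags word carry the guest-available bits. A minimal sketch of packing and unpacking that field follows. The helpers `gntmap_set_avail()` and `gntmap_get_avail()` are hypothetical names for illustration only. How these bits map onto real PTE bits is architecture dependent and is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the header hunks above. */
#define _GNTMAP_device_map      (0)
#define GNTMAP_device_map       (1 << _GNTMAP_device_map)
#define _GNTMAP_guest_avail0    (16)
#define GNTMAP_guest_avail_mask ((uint32_t)~0 << _GNTMAP_guest_avail0)

/* Hypothetical helper: place 'avail' into the upper 16 bits of 'flags'. */
static uint32_t gntmap_set_avail(uint32_t flags, uint32_t avail)
{
        assert(avail <= 0xffffu);       /* must fit in the 16-bit field */
        return (flags & ~GNTMAP_guest_avail_mask) |
               ((avail << _GNTMAP_guest_avail0) & GNTMAP_guest_avail_mask);
}

/* Hypothetical helper: read the guest-available bits back out. */
static uint32_t gntmap_get_avail(uint32_t flags)
{
        return (flags & GNTMAP_guest_avail_mask) >> _GNTMAP_guest_avail0;
}
```

A caller would be expected to check XENFEAT_gnttab_map_avail_bits first, since the header comment says the bits are only honoured when that feature is set.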
diff -r 61d1f2810617 -r 6591b4869889 include/xen/interface/io/pciif.h
--- a/include/xen/interface/io/pciif.h  Tue Nov 04 12:43:37 2008 +0900
+++ b/include/xen/interface/io/pciif.h  Wed Nov 26 10:24:15 2008 +0900
@@ -30,14 +30,22 @@
 /* xen_pci_sharedinfo flags */
 #define _XEN_PCIF_active     (0)
 #define XEN_PCIF_active      (1<<_XEN_PCI_active)
+#define _XEN_PCIB_AERHANDLER (1)
+#define XEN_PCIB_AERHANDLER  (1<<_XEN_PCIB_AERHANDLER)
+#define _XEN_PCIB_active     (2)
+#define XEN_PCIB_active      (1<<_XEN_PCIB_active)
 
 /* xen_pci_op commands */
-#define XEN_PCI_OP_conf_read    (0)
-#define XEN_PCI_OP_conf_write   (1)
-#define XEN_PCI_OP_enable_msi   (2)
-#define XEN_PCI_OP_disable_msi  (3)
-#define XEN_PCI_OP_enable_msix  (4)
-#define XEN_PCI_OP_disable_msix (5)
+#define XEN_PCI_OP_conf_read           (0)
+#define XEN_PCI_OP_conf_write          (1)
+#define XEN_PCI_OP_enable_msi          (2)
+#define XEN_PCI_OP_disable_msi         (3)
+#define XEN_PCI_OP_enable_msix         (4)
+#define XEN_PCI_OP_disable_msix        (5)
+#define XEN_PCI_OP_aer_detected        (6)
+#define XEN_PCI_OP_aer_resume          (7)
+#define XEN_PCI_OP_aer_mmio            (8)
+#define XEN_PCI_OP_aer_slotreset       (9)
 
 /* xen_pci_op error numbers */
 #define XEN_PCI_ERR_success          (0)
@@ -82,10 +90,25 @@ struct xen_pci_op {
     struct xen_msix_entry msix_entries[SH_INFO_MAX_VEC];
 };
 
+/*used for pcie aer handling*/
+struct xen_pcie_aer_op
+{
+
+    /* IN: what action to perform: XEN_PCI_OP_* */
+    uint32_t cmd;
+    /*IN/OUT: return aer_op result or carry error_detected state as input*/
+    int32_t err;
+
+    /* IN: which device to touch */
+    uint32_t domain; /* PCI Domain/Segment*/
+    uint32_t bus;
+    uint32_t devfn;
+};
 struct xen_pci_sharedinfo {
     /* flags - XEN_PCIF_* */
     uint32_t flags;
     struct xen_pci_op op;
+    struct xen_pcie_aer_op aer_op;
 };
 
 #endif /* __XEN_PCI_COMMON_H__ */
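Note on the AER additions to pciif.h above: the backend fills in aer_op and sets XEN_PCIB_active, the frontend services the request, writes its result into aer_op.err, and clears the bit again (see pcifront_do_aer() earlier in this changeset). A minimal single-threaded sketch of that handshake follows. It omits the memory barriers, atomic bit operations, and event-channel notification that the real driver uses. `struct demo_sharedinfo`, the `DEMO_RESULT_*` values, and the `demo_*` functions are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from the pciif.h hunks above. */
#define _XEN_PCIB_AERHANDLER (1)
#define XEN_PCIB_AERHANDLER  (1<<_XEN_PCIB_AERHANDLER)
#define _XEN_PCIB_active     (2)
#define XEN_PCIB_active      (1<<_XEN_PCIB_active)

#define XEN_PCI_OP_aer_detected        (6)
#define XEN_PCI_OP_aer_resume          (7)
#define XEN_PCI_OP_aer_mmio            (8)
#define XEN_PCI_OP_aer_slotreset       (9)

struct xen_pcie_aer_op {
        uint32_t cmd;           /* IN: XEN_PCI_OP_aer_* */
        int32_t err;            /* IN/OUT: channel state in, result out */
        uint32_t domain;
        uint32_t bus;
        uint32_t devfn;
};

/* Hypothetical cut-down shared page: flags plus the AER slot only. */
struct demo_sharedinfo {
        uint32_t flags;
        struct xen_pcie_aer_op aer_op;
};

/* Placeholder results; the real driver returns pci_ers_result_t values. */
#define DEMO_RESULT_HANDLED  (1)
#define DEMO_RESULT_BAD_CMD  (-1)

static int32_t demo_dispatch(uint32_t cmd)
{
        switch (cmd) {
        case XEN_PCI_OP_aer_detected:
        case XEN_PCI_OP_aer_mmio:
        case XEN_PCI_OP_aer_slotreset:
        case XEN_PCI_OP_aer_resume:
                return DEMO_RESULT_HANDLED;
        default:
                return DEMO_RESULT_BAD_CMD;
        }
}

/* Returns 1 if a pending request was serviced, 0 if none was pending. */
static int demo_service_aer(struct demo_sharedinfo *sh)
{
        assert(sh != NULL);
        if (!(sh->flags & XEN_PCIB_active))
                return 0;
        sh->aer_op.err = demo_dispatch(sh->aer_op.cmd);
        /* The real code issues wmb() before clearing the bit. */
        sh->flags &= ~(uint32_t)XEN_PCIB_active;
        return 1;
}
```

The ordering matters in the real driver: the result must be visible in aer_op.err before the backend can observe XEN_PCIB_active as clear, which is why the patch places wmb() ahead of clear_bit().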
diff -r 61d1f2810617 -r 6591b4869889 include/xen/interface/kexec.h
--- a/include/xen/interface/kexec.h     Tue Nov 04 12:43:37 2008 +0900
+++ b/include/xen/interface/kexec.h     Wed Nov 26 10:24:15 2008 +0900
@@ -155,27 +155,6 @@ typedef struct xen_kexec_range {
     unsigned long start;
 } xen_kexec_range_t;
 
-/* vmcoreinfo stuff */
-#define VMCOREINFO_BYTES           (4096)
-#define VMCOREINFO_NOTE_NAME       "VMCOREINFO_XEN"
-void arch_crash_save_vmcoreinfo(void);
-void vmcoreinfo_append_str(const char *fmt, ...)
-       __attribute__ ((format (printf, 1, 2)));
-#define VMCOREINFO_PAGESIZE(value) \
-       vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
-#define VMCOREINFO_SYMBOL(name) \
-       vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
-#define VMCOREINFO_SYMBOL_ALIAS(alias, name) \
-       vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #alias, (unsigned long)&name)
-#define VMCOREINFO_STRUCT_SIZE(name) \
-       vmcoreinfo_append_str("SIZE(%s)=%zu\n", #name, sizeof(struct name))
-#define VMCOREINFO_OFFSET(name, field) \
-       vmcoreinfo_append_str("OFFSET(%s.%s)=%lu\n", #name, #field, \
-                             (unsigned long)offsetof(struct name, field))
-#define VMCOREINFO_OFFSET_ALIAS(name, field, alias) \
-       vmcoreinfo_append_str("OFFSET(%s.%s)=%lu\n", #name, #alias, \
-                             (unsigned long)offsetof(struct name, field))
-
 #endif /* _XEN_PUBLIC_KEXEC_H */
 
 /*
diff -r 61d1f2810617 -r 6591b4869889 include/xen/interface/trace.h
--- a/include/xen/interface/trace.h     Tue Nov 04 12:43:37 2008 +0900
+++ b/include/xen/interface/trace.h     Wed Nov 26 10:24:15 2008 +0900
@@ -142,7 +142,9 @@
 #define TRC_HVM_INVLPG64        (TRC_HVM_HANDLER + TRC_64_FLAG + 0x14)
 #define TRC_HVM_MCE             (TRC_HVM_HANDLER + 0x15)
 #define TRC_HVM_IO_ASSIST       (TRC_HVM_HANDLER + 0x16)
+#define TRC_HVM_IO_ASSIST64     (TRC_HVM_HANDLER + TRC_64_FLAG + 0x16)
 #define TRC_HVM_MMIO_ASSIST     (TRC_HVM_HANDLER + 0x17)
+#define TRC_HVM_MMIO_ASSIST64   (TRC_HVM_HANDLER + TRC_64_FLAG + 0x17)
 #define TRC_HVM_CLTS            (TRC_HVM_HANDLER + 0x18)
 #define TRC_HVM_LMSW            (TRC_HVM_HANDLER + 0x19)
 #define TRC_HVM_LMSW64          (TRC_HVM_HANDLER + TRC_64_FLAG + 0x19)
diff -r 61d1f2810617 -r 6591b4869889 kernel/kexec.c
--- a/kernel/kexec.c    Tue Nov 04 12:43:37 2008 +0900
+++ b/kernel/kexec.c    Wed Nov 26 10:24:15 2008 +0900
@@ -368,9 +368,6 @@ static void kimage_free_pages(struct pag
        count = 1 << order;
        for (i = 0; i < count; i++)
                ClearPageReserved(page + i);
-#ifdef CONFIG_XEN
-       xen_destroy_contiguous_region((unsigned long)page_address(page), order);
-#endif
        __free_pages(page, order);
 }
 
diff -r 61d1f2810617 -r 6591b4869889 sound/pci/hda/hda_intel.c
--- a/sound/pci/hda/hda_intel.c Tue Nov 04 12:43:37 2008 +0900
+++ b/sound/pci/hda/hda_intel.c Wed Nov 26 10:24:15 2008 +0900
@@ -82,6 +82,7 @@ MODULE_SUPPORTED_DEVICE("{{Intel, ICH6},
                         "{Intel, ICH8},"
                         "{Intel, ICH9},"
                         "{Intel, ICH10},"
+                        "{Intel, PCH},"
                         "{ATI, SB450},"
                         "{ATI, SB600},"
                         "{ATI, RS600},"
@@ -1640,6 +1641,7 @@ static struct pci_device_id azx_ids[] = 
        { 0x8086, 0x293f, PCI_ANY_ID, PCI_ANY_ID, 0, 0, AZX_DRIVER_ICH }, /* ICH9 */
        { 0x8086, 0x3a3e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, AZX_DRIVER_ICH }, /* ICH10 */
        { 0x8086, 0x3a6e, PCI_ANY_ID, PCI_ANY_ID, 0, 0, AZX_DRIVER_ICH }, /* ICH10 */
+       { 0x8086, 0x3b56, PCI_ANY_ID, PCI_ANY_ID, 0, 0, AZX_DRIVER_ICH }, /* PCH */
        { 0x1002, 0x437b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, AZX_DRIVER_ATI }, /* ATI SB450 */
        { 0x1002, 0x4383, PCI_ANY_ID, PCI_ANY_ID, 0, 0, AZX_DRIVER_ATI }, /* ATI SB600 */
        { 0x1002, 0x793b, PCI_ANY_ID, PCI_ANY_ID, 0, 0, AZX_DRIVER_ATIHDMI }, /* ATI RS600 HDMI */

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-changelog