This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] current xen/stable failed upgrade from

To: Josip Rodin <joy@xxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] current xen/stable failed upgrade from
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Tue, 23 Mar 2010 22:10:48 -0700
Cc: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20100323232223.GA22681@xxxxxxxxxxxxxxx>
References: <20100306115833.GA28039@xxxxxxxxxxxxxxx> <20100306132711.GK2580@xxxxxxxxxxx> <20100307233147.GA20068@xxxxxxxxxxxxxxx> <20100311150823.GA9011@xxxxxxxxxxxxxxx> <20100311192456.GY1878@xxxxxxxxxxx> <20100312114139.GA4067@xxxxxxxxxxxxxxx> <20100312120914.GA15561@xxxxxxxxxxxxxxx> <20100323231853.GA21109@xxxxxxxxxxxxxxx> <20100323232223.GA22681@xxxxxxxxxxxxxxx>
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20100301 Fedora/3.0.3-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.3
On 03/23/2010 04:22 PM, Josip Rodin wrote:
On Wed, Mar 24, 2010 at 12:18:53AM +0100, Josip Rodin wrote:
On Fri, Mar 12, 2010 at 01:09:14PM +0100, Josip Rodin wrote:
On Fri, Mar 12, 2010 at 12:41:39PM +0100, Josip Rodin wrote:
And now here goes the whole output preceding the 2.6.32 crash:
In the meantime there was another update to the stable branch, I'll go
compile that...
The symptoms remained the same, only the CPU MHz calculation and some memory
offsets are different.

(XEN) mm.c:720:d0 Bad L1 flags 800000
(XEN) mm.c:4221:d0 ptwr_emulate: could not get_page_from_l1e()
(XEN) d0:v0: unhandled page fault (ec=0003)
(XEN) Pagetable walk from ffff8800014fdfd8:
(XEN)  L4[0x110] = 0000000115002067 0000000000001002
(XEN)  L3[0x000] = 0000000115006067 0000000000001006
(XEN)  L2[0x00a] = 0000000116c8a067 0000000000002c8a
(XEN)  L1[0x0fd] = 00100001154fd065 00000000000014fd
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-3.4  x86_64  debug=n  Not tainted ]----
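For readers following the page-table walk above, here is a minimal sketch of how the logged indices fall out of the faulting address, and of what the low bits of the error code mean. It assumes standard x86-64 4-level paging with 4 KiB pages; the function names are made up for this illustration and are not taken from Xen or Linux.

```python
# Sketch only: split an x86-64 virtual address into 4-level page-table
# indices (9 bits per level, 12-bit page offset) and decode the low bits
# of a page-fault error code. Helper names are hypothetical.

def pagetable_indices(vaddr):
    """Return (l4, l3, l2, l1, page_offset) for a canonical 48-bit address."""
    return (
        (vaddr >> 39) & 0x1FF,  # L4 index, bits 47..39
        (vaddr >> 30) & 0x1FF,  # L3 index, bits 38..30
        (vaddr >> 21) & 0x1FF,  # L2 index, bits 29..21
        (vaddr >> 12) & 0x1FF,  # L1 index, bits 20..12
        vaddr & 0xFFF,          # byte offset within the 4 KiB page
    )


def decode_pf_error_code(ec):
    """Decode the three low bits of an x86 page-fault error code."""
    return {
        "present": bool(ec & 0x1),  # set: fault on a present page (protection)
        "write": bool(ec & 0x2),    # set: the access was a write
        "user": bool(ec & 0x4),     # set: the access came from user mode
    }


if __name__ == "__main__":
    l4, l3, l2, l1, off = pagetable_indices(0xFFFF8800014FDFD8)
    print("L4[0x%03x] L3[0x%03x] L2[0x%03x] L1[0x%03x] offset 0x%03x"
          % (l4, l3, l2, l1, off))
    print(decode_pf_error_code(0x0003))
```

For the address in the log this gives indices 0x110, 0x000, 0x00a and 0x0fd, matching the walk, and ec=0003 decodes as a supervisor-mode write to a present page. That is at least consistent with the logged L1 entry, whose low flag bits (0x065) have the writable bit clear.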
FWIW I tried to git bisect this in the last couple of days, but the result
turned out to be fairly obvious and useless as after 14 bisections
I only came to this:

commit 18ecfad3aaeead019b0e07078f643deaa7d10d44
     x86: make /dev/mem mappings _PAGE_IOMAP
commit 56f27a6d47275f6dc94adf3ecc5fe958cdcdebee
     xen/dom0: add XEN_DOM0 config option

I didn't follow through with the last bisection, it had seemed increasingly
futile for a while now... :)

I saw a peculiar side effect at one point, when I went back to a random
working dom0, all userland processes started crashing with Illegal
instruction. One iLO reset later, it's all good again. I'm guessing it was
a transient broken state.

And then when I gave up and updated to latest xen/stable for one last try,
that was the biggest d'oh moment - it's fixed :) Was it de67ec8b?


BTW with the working .32 kernel, the log says:

[    0.000000] ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 8, version 0, address 0xfec00000, GSI 0-0
[    0.000000] ACPI: IOAPIC (id[0x09] address[0xfec80000] gsi_base[24])
[    0.000000] IOAPIC[1]: apic_id 9, version 0, address 0xfec80000, GSI 24-24
[    0.000000] ACPI: IOAPIC (id[0x0a] address[0xfec80400] gsi_base[48])
[    0.000000] IOAPIC[2]: apic_id 10, version 0, address 0xfec80400, GSI 48-48
[    0.000000] ACPI: IOAPIC (id[0x0b] address[0xfec84000] gsi_base[72])
[    0.000000] IOAPIC[3]: apic_id 11, version 0, address 0xfec84000, GSI 72-72
[    0.000000] ACPI: IOAPIC (id[0x0c] address[0xfec84400] gsi_base[96])
[    0.000000] IOAPIC[4]: apic_id 12, version 0, address 0xfec84400, GSI 96-96
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 2
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ERROR: Unable to locate IOAPIC for GSI 9


[    0.023694] ACPI: bus type pci registered
[    0.023915] PCI: Found Intel Corporation E7520 Memory Controller Hub with MMCONFIG support.
[    0.023935] PCI: MCFG configuration 0: base e0000000 segment 0 buses 0 - 255
[    0.023942] PCI: Not using MMCONFIG.
[    0.023948] PCI: Using configuration type 1 for base access
[    0.023959] PCI: HP ProLiant DL380 detected, enabling pci=bfsort.
[    0.028634] bio: create slab<bio-0>  at 0
[    0.030115] ERROR: Unable to locate IOAPIC for GSI 9

Is there anything I can do to avoid these?

These are just noise; the kernel thinks it can poke at the IO APICs, but they're owned by Xen and so don't exist for the kernel; instead some alternate mechanisms come into play to keep the interrupts flowing. At some point I hope we can completely remove all trace of the APICs from the kernel's sight, so it won't even try to access them and print these confused messages.
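To see why the messages single out GSI 2 and GSI 9, the following sketch parses the IOAPIC lines quoted above and looks each GSI up in the logged ranges. It only illustrates the range check implied by the log; it is not the kernel's actual lookup code, and the helper names are invented for this example.

```python
import re

# IOAPIC lines copied from the boot log quoted above (timestamps dropped).
LOG = """\
IOAPIC[0]: apic_id 8, version 0, address 0xfec00000, GSI 0-0
IOAPIC[1]: apic_id 9, version 0, address 0xfec80000, GSI 24-24
IOAPIC[2]: apic_id 10, version 0, address 0xfec80400, GSI 48-48
IOAPIC[3]: apic_id 11, version 0, address 0xfec84000, GSI 72-72
IOAPIC[4]: apic_id 12, version 0, address 0xfec84400, GSI 96-96
"""

IOAPIC_RE = re.compile(r"IOAPIC\[(\d+)\]: .*GSI (\d+)-(\d+)")


def parse_gsi_ranges(log):
    """Return a list of (ioapic_index, first_gsi, last_gsi) from log text."""
    return [(int(idx), int(lo), int(hi))
            for idx, lo, hi in IOAPIC_RE.findall(log)]


def find_ioapic(ranges, gsi):
    """Return the index of the IOAPIC whose range covers gsi, else None."""
    for idx, lo, hi in ranges:
        if lo <= gsi <= hi:
            return idx
    return None


if __name__ == "__main__":
    ranges = parse_gsi_ranges(LOG)
    for gsi in (0, 2, 9, 24):
        print("GSI %d -> IOAPIC %s" % (gsi, find_ioapic(ranges, gsi)))
```

Each IOAPIC is logged with a range of a single GSI (0-0, 24-24, and so on), so the interrupt source overrides for GSI 2 and GSI 9 fall into no range at all, which is what the "Unable to locate IOAPIC" lines report.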

