This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] L1[0x1fb] = 0000000000000000 which faults on one type of

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] L1[0x1fb] = 0000000000000000 which faults on one type of machine but on another works?
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Thu, 17 Mar 2011 11:52:12 -0400
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, andrew.thomas@xxxxxxxxxx, Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>, keir.xen@xxxxxxxxx, swente@xxxxxxxxxxxxx, gianni.tedesco@xxxxxxxxxx
Delivery-date: Thu, 17 Mar 2011 08:53:14 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4D81EF97020000780003706E@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20110316221912.GA13035@xxxxxxxxxxxx> <4D81EF97020000780003706E@xxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
On Thu, Mar 17, 2011 at 10:25:11AM +0000, Jan Beulich wrote:
> >>> On 16.03.11 at 23:19, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> 
> >>> wrote:
> > But one thing I can't understand is why on one machine (IBM x3850)
> > I get this crash, while another one with the same pagetable contents
> > (L1 has nothing for 0x1fb) it works just fine? I added a panic and used
> > the Xen hypervisor kdb to manually inspect the pagetable, and it has
> > the same contents as the IBM x3850 -but it boots fine with this invalid 
> > value.
> > Any ideas?
> Without seeing the full stack trace it's hard to tell. To me, it looks
> like a mistake for native_apic_read() to be called at all under Xen,
> and perhaps there's one lurking somewhere that gets hit only on
> those IBM (Summit?) machines.

That was it. When we boot up we call 'set_xen_basic_apic_ops', which
sets apic->read to xen_apic_read. The default 'apic' is apic_flat,
so in essence we change apic_flat->read from native_apic_read
to xen_apic_read.

During bootup, default_acpi_madt_oem_check runs through the whole
apic_probe[] array, the last entry of which is apic_physflat. And
apic_physflat->probe() returns true on this IBM Summit box (and on
ES7000 boxes, and on anything else whose FADT has
ACPI_FADT_APIC_PHYSICAL set), so 'apic' now points to apic_physflat,
whose read op was never patched, and apic->read ends up being
native_apic_read again.
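
To make the ordering problem concrete, here is a minimal user-space
sketch of it. It is not kernel code: struct apic_sketch, the *_sketch
functions, boot_and_read() and the marker values are all made up for
illustration, and the real struct apic and probe logic are considerably
more involved. It only models "patch the current driver's read op, then
let the probe loop switch drivers".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Markers so a caller can tell which read op was reached (hypothetical). */
enum { NATIVE_MARK = 1, XEN_MARK = 2 };

/* Simplified stand-in for the kernel's struct apic; not the real layout. */
struct apic_sketch {
    const char *name;
    bool (*probe)(void);
    uint32_t (*read)(uint32_t reg);
};

static uint32_t native_apic_read_sketch(uint32_t reg) { (void)reg; return NATIVE_MARK; }
static uint32_t xen_apic_read_sketch(uint32_t reg)    { (void)reg; return XEN_MARK; }

/* Models the FADT having ACPI_FADT_APIC_PHYSICAL set. */
static bool fadt_apic_physical;

static bool flat_probe_sketch(void)     { return false; }
static bool physflat_probe_sketch(void) { return fadt_apic_physical; }

static struct apic_sketch flat_sketch =
    { "flat", flat_probe_sketch, native_apic_read_sketch };
static struct apic_sketch physflat_sketch =
    { "physical flat", physflat_probe_sketch, native_apic_read_sketch };

/* Probe order: physflat is last, as in the description above. */
static struct apic_sketch *probe_list[] = { &flat_sketch, &physflat_sketch };
static struct apic_sketch *apic = &flat_sketch;

/* Patches only the driver that is current at the time of the call. */
static void set_xen_ops_sketch(void) { apic->read = xen_apic_read_sketch; }

/* Switches 'apic' to the first driver whose probe succeeds. */
static void madt_oem_check_sketch(void)
{
    for (size_t i = 0; i < sizeof(probe_list) / sizeof(probe_list[0]); i++) {
        if (probe_list[i]->probe()) {
            apic = probe_list[i];
            break;
        }
    }
}

/* Replays the boot order and reports which read op ends up being used. */
static uint32_t boot_and_read(bool physical)
{
    flat_sketch.read = native_apic_read_sketch;
    physflat_sketch.read = native_apic_read_sketch;
    apic = &flat_sketch;
    fadt_apic_physical = physical;

    set_xen_ops_sketch();      /* early: patches flat only */
    madt_oem_check_sketch();   /* later: may replace 'apic' */
    assert(apic != NULL);
    return apic->read(0x20);
}
```

In this model boot_and_read(false) reaches the Xen read op, while
boot_and_read(true) lands back on the native one, which is the
difference between the two machines.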

2.6.38 fixes this by allowing the set_fixmap_nocache(FIX_APIC_BASE,
address) call in acpi_register_lapic_address to go through, so we
can provide it with a dummy page and native_apic_read can happily
read from that fake page.
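
The effect of the dummy page can be sketched the same way. Again this
is a user-space model under loud assumptions, not the kernel
implementation: apic_base_slot stands in for the FIX_APIC_BASE mapping,
a NULL slot stands in for the empty L1 entry, and a -1 return stands in
for the page fault. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE_SKETCH 4096u

/* Zero-filled page handed out in place of the real local APIC registers. */
static uint8_t dummy_page[PAGE_SIZE_SKETCH];

/* Models the FIX_APIC_BASE slot; NULL means nothing is mapped there. */
static uint8_t *apic_base_slot;

/* Models set_fixmap_nocache(FIX_APIC_BASE, ...) installing a mapping. */
static void set_fixmap_sketch(uint8_t *page) { apic_base_slot = page; }

/*
 * Models native_apic_read: a plain load from the mapped APIC page.
 * Returns -1 where the real code would take a page fault.
 */
static int native_apic_read_sketch(uint32_t reg, uint32_t *out)
{
    if (apic_base_slot == NULL || reg > PAGE_SIZE_SKETCH - sizeof(*out))
        return -1;
    memcpy(out, apic_base_slot + reg, sizeof(*out));
    return 0;
}
```

With the slot empty the read fails; once set_fixmap_sketch(dummy_page)
has run, the same read succeeds and simply returns zeroes from the fake
page.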
