WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
xen-devel

Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Wed, 1 Sep 2010 09:49:18 +0100
Cc: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>, xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 01 Sep 2010 01:50:14 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4C7E24BE02000078000139EC@xxxxxxxxxxxxxxxxxx>
On 01/09/2010 09:02, "Jan Beulich" <JBeulich@xxxxxxxxxx> wrote:

>> Well I agree with your logic anyway. So I don't see that this can be the
>> cause of MaoXiaoyun's bug. At least not directly. But then I'm stumped as to
>> why the page arithmetic and checks in free_heap_pages are (apparently)
>> resulting in a page pointer way outside the frame-table region and actually
>> in the directmap region.
> 
> There must be some unchecked use of PAGE_LIST_NULL, i.e.
> running off a list end without taking notice (0xffff8315ffffffe4
> exactly corresponds with that).
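
The correspondence Jan mentions can be checked with a little arithmetic. The sketch below is only illustrative: the frame-table base address, the 32-byte entry size, and the 32-bit all-ones value for PAGE_LIST_NULL are assumptions about x86-64 Xen of this era, not figures stated in this thread, and the helper names are made up. It treats the list terminator as if it were a real frame-table index and computes the address that results.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; none of them are stated in the thread. */
#define FRAMETABLE_VIRT_START 0xffff82f600000000ULL /* assumed frame-table base */
#define PAGE_INFO_SIZE        32ULL                 /* assumed entry size in bytes */
#define PAGE_LIST_NULL        0xffffffffULL         /* assumed 32-bit list terminator */

/* Address obtained by treating a list index as a real frame-table entry. */
static uint64_t index_to_page_addr(uint64_t idx)
{
    assert(idx <= PAGE_LIST_NULL); /* list indices are assumed to be 32 bits wide */
    return FRAMETABLE_VIRT_START + idx * PAGE_INFO_SIZE;
}

/* Address of a field at byte offset `off` inside that (bogus) entry. */
static uint64_t faulting_addr(uint64_t idx, uint64_t off)
{
    return index_to_page_addr(idx) + off;
}
```

Under these assumptions, faulting_addr(PAGE_LIST_NULL, 4) evaluates to 0xffff8315ffffffe4, i.e. an access 4 bytes into the entry that the list terminator would index, which lies well beyond the frame table.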

Okay, my next guess then is that we are deleting a chunk from the wrong list
head. I don't see any check that the adjacent chunks we are considering
merging are from the same node and zone. I suppose the zone logic does just
work, as we're dealing with 2**x aligned and sized regions. But shouldn't
the merging logic in free_heap_pages be checking that the merging candidate
is from the same NUMA node? I see I have an ASSERTion later in the same
function, but I suspect it's too weak and wishful.
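
To make the suspected gap concrete, here is a toy buddy-merge loop with the node check included. It is a sketch under stated assumptions, not the code in free_heap_pages and not the attached patch: struct toy_page, buddy_of, can_merge and toy_free are hypothetical names, and the "free list" is reduced to a flag per page.

```c
#include <assert.h>

/* Hypothetical, simplified page descriptor; not Xen's struct page_info. */
struct toy_page {
    unsigned int order;   /* order of the free chunk headed by this page */
    unsigned int node;    /* NUMA node the page belongs to */
    int          is_free; /* nonzero if this page heads a free chunk */
};

/* Start of the buddy of the 2**order-sized chunk that starts at pfn. */
static unsigned long buddy_of(unsigned long pfn, unsigned int order)
{
    return pfn ^ (1UL << order);
}

/* A buddy is a merge candidate only if it is free, has the same order,
 * and (the check in question) sits on the same NUMA node. */
static int can_merge(const struct toy_page *pages, unsigned long pfn,
                     unsigned long buddy, unsigned int order)
{
    return pages[buddy].is_free &&
           pages[buddy].order == order &&
           pages[buddy].node == pages[pfn].node;
}

/* Free the chunk at pfn and merge upward; returns the start of the
 * resulting chunk. `pages` must hold 2**max_order entries. */
static unsigned long toy_free(struct toy_page *pages, unsigned long pfn,
                              unsigned int order, unsigned int max_order)
{
    assert(pfn < (1UL << max_order));
    while (order < max_order) {
        unsigned long buddy = buddy_of(pfn, order);
        if (!can_merge(pages, pfn, buddy, order))
            break;
        pages[buddy].is_free = 0; /* stands in for unlinking from its free list */
        if (buddy < pfn)
            pfn = buddy;          /* merged chunk starts at the lower half */
        order++;
    }
    pages[pfn].is_free = 1;
    pages[pfn].order = order;
    return pfn;
}
```

With four pages where pages 0 and 1 are on node 0 and pages 2 and 3 are on node 1, freeing all four one at a time leaves two order-1 chunks rather than one order-2 chunk, because the final merge would cross the node boundary. Dropping the node comparison from can_merge would let that last merge proceed, and in a real allocator the buddy would then be unlinked via a list head it was never on.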

MaoXiaoyun: can you please test with the attached patch? If I'm right, you
will crash on one of the BUG_ON checks that I added, rather than crashing on
a pointer dereference. You may even crash during boot. Anyhow, what is
interesting is whether this patch always makes you crash on a BUG_ON before
you would normally crash on the pointer dereference. If so, this is trivial
to fix.

 Thanks,
 Keir

Attachment: 00-bugcheck
Description: Binary data

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel