Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT

To: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>, "jbeulich@xxxxxxxxxx" <jbeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Wed, 1 Sep 2010 08:40:55 +0100
Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
On 01/09/2010 08:17, "MaoXiaoyun" <tinnycloud@xxxxxxxxxxx> wrote:

> As I go through the chunk merge code in free_heap_pages, one thing I'd like
> to mention is that I previously printed out all domain pages when they were
> allocated, and I found that the order in assign_pages in
> /xen-4.0.0/xen/common/page_alloc.c:1087 is either 0 or 9; later I learned
> that this is because domain U populates the physmap 2MB at a time.
>  
> And here in the while statement, the order is compared with MAX_ORDER, which
> is 20. I wonder if this might offer some clues.
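
For reference, the order-9 figure follows directly from the 2MB populate
size: 2MB / 4KB = 512 = 2^9 pages. A minimal stand-alone sketch of that
arithmetic (assuming 4KB pages; illustrative, not Xen code):

    #include <stdio.h>

    int main(void)
    {
        unsigned long extent_bytes = 2UL << 20;          /* 2MB populate */
        unsigned long page_bytes   = 4096;               /* 4KB page */
        unsigned long pages = extent_bytes / page_bytes; /* 512 */
        unsigned int order = 0;

        /* Smallest order such that 2^order covers the extent. */
        while ((1UL << order) < pages)
            order++;
        printf("order = %u\n", order);                   /* prints 9 */
        return 0;
    }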

Xen's buddy allocator merges pairs of adjacent free chunks up to a maximum
size of 2**20 pages. That merging needs to be careful not to merge off the
end of RAM. I'm just guessing that there may be an issue with that on your
fairly large-memory system.
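
A minimal sketch of such a merge loop with an end-of-RAM guard (illustrative
only; max_page and the elided free-list checks are simplified stand-ins, not
Xen's actual free_heap_pages code):

    #define MAX_ORDER 20

    /* Merge a freed chunk starting at 'pfn' of size 2^order pages with
     * its buddies, without walking past the end of RAM at 'max_page'. */
    static void merge_chunk(unsigned long pfn, unsigned int order,
                            unsigned long max_page)
    {
        while (order < MAX_ORDER)
        {
            unsigned long mask = 1UL << order;
            unsigned long buddy_pfn = pfn ^ mask;  /* buddy's first pfn */

            /* The buddy must lie entirely within RAM. */
            if (buddy_pfn + mask > max_page)
                break;

            /* ... the buddy must also be free and of the same order;
             * otherwise stop merging (checks elided here) ... */

            pfn &= ~mask;  /* merged chunk starts at the lower buddy */
            order++;
        }
        /* queue (pfn, order) on the appropriate free list */
    }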

 -- Keir

> Thanks.
> -------------------------------
>     /* Merge chunks as far as possible. */
>     while ( order < MAX_ORDER )
>     {
>         mask = 1UL << order;
>  
>> Date: Tue, 31 Aug 2010 18:03:41 +0100
>> Subject: Re: [Xen-devel] Xen-unstable panic: FATAL PAGE FAULT
>> From: keir.fraser@xxxxxxxxxxxxx
>> To: JBeulich@xxxxxxxxxx
>> CC: tinnycloud@xxxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>> 
>> On 31/08/2010 17:35, "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx> wrote:
>> 
>>>> That's somewhat implicit: srat_parse_regions() gets passed an
>>>> address that is at least BOOTSTRAP_DIRECTMAP_END (i.e. 4G).
>>>> Thus srat_parse_regions() starts off with a mask with the lower
>>>> 32 bits all set (only more bits can get set subsequently). Thus
>>>> the earliest zero bit pfn_pdx_hole_setup() can find is bit 20
>>>> (due to the >> PAGE_SHIFT in the invocation). Consequently
>>>> the smallest chunk where arithmetic is valid really is 4GB, not
>>>> 256MB as I first wrote.
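
That bit argument can be checked mechanically; a small stand-alone sketch
(assuming a PAGE_SHIFT of 12; illustrative only):

    #include <stdio.h>

    #define PAGE_SHIFT 12

    int main(void)
    {
        unsigned long mask = 0xFFFFFFFFUL;            /* low 32 address bits set */
        unsigned long pfn_mask = mask >> PAGE_SHIFT;  /* low 20 bits set */
        unsigned int bit = 0;

        while (pfn_mask & (1UL << bit))               /* find first zero bit */
            bit++;
        printf("first zero bit = %u\n", bit);         /* prints 20 */
        return 0;
    }

So the first hole pfn_pdx_hole_setup() can find starts at bit 20, i.e. at a
granularity of 2^20 pages = 4GB.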
>>> 
>>> Well, that's a bit too implicit for me. How about we initialise 'j' to
>>> MAX_ORDER in pfn_pdx_hole_setup() with a comment about supporting page_info
>>> pointer arithmetic within allocatable multi-page regions?
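
A minimal illustration of that proposal (purely a sketch; the real
pfn_pdx_hole_setup() scans the mask differently):

    #define MAX_ORDER 20

    /* Start the zero-bit search at MAX_ORDER so any pdx hole is at
     * least 2^MAX_ORDER pages wide, keeping page_info pointer
     * arithmetic valid within any allocatable multi-page region. */
    static unsigned int first_hole_bit(unsigned long mask)
    {
        unsigned int j = MAX_ORDER;

        while (j < 8 * sizeof(mask) && (mask & (1UL << j)))
            j++;
        return j;
    }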
>> 
>> Well I agree with your logic anyway. So I don't see that this can be the
>> cause of MaoXiaoyun's bug. At least not directly. But then I'm stumped as to
>> why the page arithmetic and checks in free_heap_pages are (apparently)
>> resulting in a page pointer way outside the frame-table region and actually
>> in the directmap region.
>> 
>> I think an obvious next step would be to get your boot output, MaoXiaoyun.
>> Can you please post it? And you may as well stop your memtest if you haven't
>> already. If you've seen the issue on more than one machine then it certainly
>> isn't due to that kind of hardware failure.
>> 
>> -- Keir
>> 
>> 



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel