WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] PV domU with 255GB boot failure

To: Jan Beulich <jbeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] PV domU with 255GB boot failure
From: Mukesh Rathor <mukesh.rathor@xxxxxxxxxx>
Date: Wed, 18 Feb 2009 11:12:26 -0800
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Wed, 18 Feb 2009 11:13:08 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <499C0B42.76E4.0078.0@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Organization: Oracle Corp
References: <C5C01F7D.2CAC%keir.fraser@xxxxxxxxxxxxx> <499B82D8.7060606@xxxxxxxxxx> <499C0B42.76E4.0078.0@xxxxxxxxxx>
Reply-to: mukesh.rathor@xxxxxxxxxx
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 2.0.0.5 (X11/20070719)


Jan Beulich wrote:
>>>> Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> 18.02.09 04:39 >>>
>
> First a general remark: You're doing this patch to support 256G domains,
> but by keeping extend_init_mapping() there'll continue to be no way to
> support domains with close to or above 512G (or, if making use of
> XEN_ELFNOTE_INIT_P2M, 1T). This function, rather than needing fixes,
> really just needs to go away.
>
> I've done this in our forward ported 2.6.27+ kernels, but unfortunately
> can't really contribute the changes to the 2.6.18 tree, as there are too
> many differences, and I'm simply lacking the time (and, honestly, interest)
> to work out all the issues. I could post the respective patch if you (or
> someone else) care(s).
>

I came up with this patch trying to fix the hang on less than 256 GB.
With 256 GB it's not even coming this far, pl see another thread. Since
256 GB was the original bug, we definitely need to support that. So please
post your patches with any relevant pointers and I'll take a crack at it...

thanks,
Mukesh



>> --- init-xen.c.orig    2009-02-17 18:58:58.716954000 -0800
>> +++ init-xen.c 2009-02-17 19:33:57.310074000 -0800
>> @@ -587,35 +587,45 @@ void __init xen_init_pt(void)
>> static void __init extend_init_mapping(unsigned long tables_space)
>> {
>>        unsigned long va = __START_KERNEL_map;
>> -      unsigned long phys, addr, *pte_page;
>> +      unsigned long phys, addr, *pte_page, *pmd_page;
>> +      pud_t *pud;
>>        pmd_t *pmd;
>>        pte_t *pte, new_pte;
>> -      unsigned long *page = (unsigned long *)init_level4_pgt;
>> +      unsigned long *pud_page = (unsigned long *)init_level4_pgt;
>
> This is certainly confusing. Please use typeless names here, or names
> that properly reflect what they're used for.
>
>> -      addr = page[pgd_index(va)];
>> -      addr_to_page(addr, page);
>> -      addr = page[pud_index(va)];
>> -      addr_to_page(addr, page);
>> -
>> -      /* Kill mapping of low 1MB. */
>> +      /* Kill low mappings */
>>        while (va < (unsigned long)&_text) {
>>                if (HYPERVISOR_update_va_mapping(va, __pte_ma(0), 0))
>>                        BUG();
>>                va += PAGE_SIZE;
>>        }
>> +      addr = pud_page[pgd_index(va)];    /* get pud entry from pgd tbl */
>> +      addr_to_page(addr, pud_page);      /* pud_page now va of pud tbl */
>
> If you follow above request for using proper (or type-neutral) names, you
> won't need extra comments here.
>
> Also, the code could stay at the place it was so far (since clearly 
pgd_index()
> can never be different for __START_KERNEL_map and _text).
>
>>        /* Ensure init mappings cover kernel text/data and initial tables. */
>>        while (va < (__START_KERNEL_map
>>                     + (start_pfn << PAGE_SHIFT)
>>                     + tables_space)) {
>> -              pmd = (pmd_t *)&page[pmd_index(va)];
>> +
>> + pud = (pud_t *)&pud_page[pud_index(va)]; /* get pud entry */
>> +                if (pud_none(*pud)) {
>> +                        pmd_page = alloc_static_page(&phys);
>> +                        early_make_page_readonly(
>> +                                pmd_page, XENFEAT_writable_page_tables);
>> +                        set_pud(pud, __pud(phys | _KERNPG_TABLE));
>> +                } else {
>> +                        addr = pud_page[pud_index(va)];
>> +                        addr_to_page(addr, pmd_page);
>> +                }
>> +
>> +              pmd = (pmd_t *)&pmd_page[pmd_index(va)];
>>                if (pmd_none(*pmd)) {
>>                        pte_page = alloc_static_page(&phys);
>>                        early_make_page_readonly(
>>                                pte_page, XENFEAT_writable_page_tables);
>>                        set_pmd(pmd, __pmd(phys | _KERNPG_TABLE));
>>                } else {
>> -                      addr = page[pmd_index(va)];
>> +                      addr = pmd_page[pmd_index(va)];
>>                        addr_to_page(addr, pte_page);
>>                }
>>                pte = (pte_t *)&pte_page[pte_index(va)];
>> @@ -630,7 +640,7 @@ static void __init extend_init_mapping(u
>>
>>        /* Finally, blow away any spurious initial mappings. */
>>        while (1) {
>
> Didn't you indicate in an earlier mail that we're lacking a pud_none() check
> here? Just by adding the pud allocation above, you're not excluding to
> run over a pud entry boundary here.
>
> Likewise would pmd_page need to be updated here when you cross a
> pud entry boundary.
>
>> -              pmd = (pmd_t *)&page[pmd_index(va)];
>> +              pmd = (pmd_t *)&pmd_page[pmd_index(va)];
>>                if (pmd_none(*pmd))
>>                        break;
>>                if (HYPERVISOR_update_va_mapping(va, __pte_ma(0), 0))
>> @@ -719,6 +729,7 @@ static void xen_finish_init_mapping(void
>>        table_end = start_pfn;
>> }
>>
>> +
>> /* Setup the direct mapping of the physical memory at PAGE_OFFSET.
>>    This runs before bootmem is initialized and gets pages directly from the
>>    physical memory. To access them they are temporarily mapped. */
>
> Please omit this last hunk altogether.
>
> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel