On 02/11/2011 11:00 AM, Kay, Allen M wrote:
> The code for memblock_x86_reserve_range() does not exist in 2.6.32.27 pvops
> dom0.
No, the function changed name, but the concept is the same.
> I did find it in Konrad's tree at
> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git.
>
> So is this a problem for 2.6.32.27 stable tree? If so, which pvops dom0 tree
> should I be using?
I *just* pushed .32.27 and haven't had a chance to test it. The
xen/stable-2.6.32.x branch contains the version of xen/next-2.6.32 which
has at least passed some amount of testing (ie, boots on something at the
very least).
J
> Allen
>
> -----Original Message-----
> From: Jeremy Fitzhardinge [mailto:jeremy@xxxxxxxx]
> Sent: Friday, February 11, 2011 9:07 AM
> To: Kay, Allen M
> Cc: Konrad Rzeszutek Wilk; Stefano Stabellini; xen-devel; Keir Fraser
> Subject: Re: [Xen-devel] 2.6.32.27 dom0 + latest xen staging boot failure
>
> On 02/10/2011 07:07 PM, Kay, Allen M wrote:
>>> That "extra memory" stuff is reserving some physical address space for
>>> ballooning. It should be completely unused (and unbacked by any pages)
>>> until the balloon driver populates it; it is reserved memory in the
>>> meantime.
>> On my system, the entire chunk is marked as usable memory:
>>
>> 0000000100000000 - 000000023a6f4000 (usable)
>>
>> When you said it is reserved memory, are you saying it should be marked as
>> "reserved" or is there somewhere else in the code that keeps track of which
>> portion of this e820 chunk is backed by real memory and which chunk is "extra
>> memory"?
> Yes, it is marked as usable in the E820 so that the kernel will allocate
> page structures for it. But then the extra part is reserved with
> memblock_x86_reserve_range(), which should prevent the kernel from ever
> trying to use that memory (ie, it will never get added to the pools of
> memory the allocator allocates from). The balloon driver backs these
> pseudo-physical pageframes with real memory pages, and then releases
> them into the pool for allocation.
>
> J
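The usable-then-reserved-then-ballooned sequence described above can be sketched as a toy model. This is not kernel code: the frame counts, the struct, and every function name here are hypothetical, chosen only to show why a frame that is "usable" in the E820 is still not allocatable until the balloon driver backs it and releases it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not kernel code. All names and sizes are
 * illustrative assumptions. */
enum { NR_FRAMES = 8, NR_BACKED = 5 };

struct frame {
    bool usable;    /* covered by a "usable" E820 entry, so it gets a page structure */
    bool reserved;  /* withheld from the allocator's pools */
    bool backed;    /* has a real memory page behind it */
};

static struct frame frames[NR_FRAMES];

/* Mark every frame usable, then reserve the unbacked "extra" tail. */
static void setup_memory(void)
{
    for (int i = 0; i < NR_FRAMES; i++) {
        frames[i].usable = true;
        frames[i].backed = i < NR_BACKED;
        frames[i].reserved = !frames[i].backed;
    }
}

/* Stand-in for the balloon driver: back one reserved frame with real
 * memory, then release it to the allocator. Returns false if the frame
 * is out of range or was not reserved. */
static bool balloon_populate(int i)
{
    if (i < 0 || i >= NR_FRAMES || !frames[i].reserved)
        return false;
    frames[i].backed = true;
    frames[i].reserved = false;
    return true;
}

/* Count the frames the allocator is allowed to hand out. */
static int allocatable_frames(void)
{
    int n = 0;
    for (int i = 0; i < NR_FRAMES; i++) {
        if (frames[i].usable && !frames[i].reserved) {
            assert(frames[i].backed);  /* invariant: never allocate unbacked frames */
            n++;
        }
    }
    return n;
}
```

In this model, calling setup_memory() leaves only the backed frames allocatable, and each successful balloon_populate() call adds exactly one more. A boot failure like the one reported would correspond to something touching a frame while it is still reserved and unbacked.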
>
>> -----Original Message-----
>> From: Jeremy Fitzhardinge [mailto:jeremy@xxxxxxxx]
>> Sent: Thursday, February 10, 2011 6:56 PM
>> To: Kay, Allen M
>> Cc: Konrad Rzeszutek Wilk; Stefano Stabellini; xen-devel; Keir Fraser
>> Subject: Re: [Xen-devel] 2.6.32.27 dom0 + latest xen staging boot failure
>>
>> On 02/10/2011 05:03 PM, Kay, Allen M wrote:
>>> Konrad/Stefano,
>>>
>>> Getting back to the xen/dom0 boot failure on my Sandybridge SDP I reported
>>> a few weeks ago.
>>>
>>> I finally got around to narrowing the problem down to the call to
>>> xen_add_extra_mem() in arch/x86/xen/setup.c/xen_memory_setup(). This call
>>> increases the top of E820 memory in dom0 beyond what is actually available.
>>>
>>> Before xen_add_extra_mem() is called, the last entry of dom0 e820 table is:
>>>
>>> 0000000100000000 - 000000016b45a000 (usable)
>>>
>>> After xen_add_extra_mem() is called, the last entry of dom0 e820 table
>>> becomes:
>>>
>>> 0000000100000000 - 000000023a6f4000 (usable)
>>>
>>> This pushes the top of RAM beyond what was reported by Xen's e820 table,
>>> which is:
>>>
>>> (XEN) 0000000100000000 - 00000001de600000 (usable)
>>>
>>> AFAICT, the failure is caused by dom0 accessing non-existent physical
>>> memory. The failure went away after I removed the call to
>>> xen_add_extra_mem().
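The three end addresses quoted above can be compared with a little arithmetic. The sketch below is only a calculator over the values in this message; the constant and function names are made up for illustration, and it assumes 4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                      /* assumption: 4 KiB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* End addresses of the last "usable" entry, copied from the message. */
static const uint64_t DOM0_END_BEFORE = 0x16b45a000ULL;  /* before xen_add_extra_mem() */
static const uint64_t DOM0_END_AFTER  = 0x23a6f4000ULL;  /* after xen_add_extra_mem() */
static const uint64_t XEN_HOST_END    = 0x1de600000ULL;  /* Xen's own e820 table */

/* Bytes in [lo, hi), or 0 if hi does not lie above lo. */
static uint64_t bytes_between(uint64_t lo, uint64_t hi)
{
    return hi > lo ? hi - lo : 0;
}

/* Convert a page-aligned byte count to a page count. */
static uint64_t bytes_to_pages(uint64_t bytes)
{
    assert((bytes & (PAGE_SIZE - 1)) == 0);  /* all values in the message are page aligned */
    return bytes >> PAGE_SHIFT;
}
```

bytes_between(DOM0_END_BEFORE, DOM0_END_AFTER) gives the size of the extra region that was added, and bytes_between(XEN_HOST_END, DOM0_END_AFTER) gives how far the dom0 E820 now extends past the top of RAM that Xen itself reported.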
>> That "extra memory" stuff is reserving some physical address space for
>> ballooning. It should be completely unused (and unbacked by any pages)
>> until the balloon driver populates it; it is reserved memory in the
>> meantime.
>>
>> How is that memory getting referenced in your case?
>>
>>> Another potential problem I noticed with e820 processing is that there is a
>>> discrepancy between how Xen processes e820 and how dom0 does it. In Xen
>>> (arch/x86/setup.c/start_xen()), e820 entries are aligned on an
>>> L2_PAGETABLE_SHIFT boundary, while the dom0 e820 code does not align them.
>>> As a result, one of my e820 entries that is 1 page in size got dropped by
>>> Xen but got picked up in dom0. This does not cause a problem in my case, but
>>> the inconsistency in how memory is used by Xen and dom0 can potentially be a
>>> problem.
>> I don't think that matters. Xen can choose not to use non-2M aligned
>> pieces of memory if it wants, but that doesn't really affect the dom0
>> kernel's use of the host E820, because dom0 is only looking for possible
>> device memory, rather than RAM.
>>
>> J
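To see why a 1-page entry disappears under 2M alignment, here is a minimal sketch of round-the-start-up, round-the-end-down trimming. It is not Xen's actual code: the function name is hypothetical, and the shift value of 21 is an assumption taken from the "2M" figure above.

```c
#include <assert.h>
#include <stdint.h>

#define L2_SHIFT 21                          /* assumption: 2 MiB, per "non-2M aligned" */
#define L2_MASK  ((1ULL << L2_SHIFT) - 1)

/* Bytes of an e820 entry that survive when the start is rounded up and
 * the end is rounded down to a 2 MiB boundary. Returns 0 when nothing
 * is left, which is the "dropped" case. Hypothetical helper. */
static uint64_t aligned_bytes(uint64_t start, uint64_t size)
{
    assert(size <= UINT64_MAX - start);      /* the entry must not wrap around */

    uint64_t end = start + size;
    uint64_t s = (start + L2_MASK) & ~L2_MASK;  /* round start up */
    uint64_t e = end & ~L2_MASK;                /* round end down */

    return s < e ? e - s : 0;
}
```

Any entry smaller than 2 MiB that does not happen to span a full aligned 2 MiB block trims to nothing, so a single 4 KiB page at, say, the made-up address 0x9f000 is always dropped, while an entry that is already aligned at both ends keeps its full size.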
>>> Allen
>>>
>>> -----Original Message-----
>>> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
>>> Sent: Friday, January 28, 2011 7:48 AM
>>> To: Kay, Allen M
>>> Cc: xen-devel; Stefano Stabellini
>>> Subject: Re: [Xen-devel] 2.6.32.27 dom0 + latest xen staging boot failure
>>>
>>> On Fri, Jan 28, 2011 at 10:28:43AM -0500, Konrad Rzeszutek Wilk wrote:
>>>> On Thu, Jan 27, 2011 at 10:51:42AM -0800, Kay, Allen M wrote:
>>>>> Following are the brief error messages from the serial console log. I
>>>>> have also attached the full serial console log and dom0 system map.
>>>>>
>>>>> (XEN) mm.c:802:d0 Bad L1 flags 400000
>>>> On a second look, this is a different issue than I had encountered.
>>>>
>>>> The 400000 translates to Xen thinking you had PAGE_GNTTAB set, but that
>>>> is not right. Googling for this shows that I had fixed this with a
>>>> Xorg server at some point, but I can't remember the details so that is not
>>>> that useful :-(
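For reference, the flags value in the log is a single bit, which is easy to check. The helpers below are a hypothetical sketch that only decodes the number; which page flag that bit corresponds to is taken from the message above, not verified here against the Xen headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Value from the log line "Bad L1 flags 400000" (hexadecimal). */
#define BAD_L1_FLAGS 0x400000ULL

/* True if the given bit position is set in flags. */
static bool flag_is_set(uint64_t flags, unsigned bit)
{
    assert(bit < 64);
    return (flags >> bit) & 1;
}

/* Index of the lowest set bit, or -1 if no bit is set. */
static int lowest_set_bit(uint64_t flags)
{
    for (int bit = 0; bit < 64; bit++)
        if (flag_is_set(flags, (unsigned)bit))
            return bit;
    return -1;
}

/* True if exactly one bit is set. */
static bool is_single_bit(uint64_t flags)
{
    return flags != 0 && (flags & (flags - 1)) == 0;
}
```

For BAD_L1_FLAGS these report that only bit 22 is set, so Xen rejected the L1 entry because of that one flag rather than a combination of flags.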
>>>>
>>>> You said it works if you give the domain 1024MB, but I wonder if
>>>> it also works if you disable the IOMMU? What happens then?
>>> Can you also patch your Xen hypervisor with this patch? It will print out
>>> the other 89 entries so we can see what type of values they have. You might
>>> need to move it a bit as this is for xen-unstable.
>>>
>>> diff -r 003acf02d416 xen/arch/x86/mm.c
>>> --- a/xen/arch/x86/mm.c Thu Jan 20 17:04:06 2011 +0000
>>> +++ b/xen/arch/x86/mm.c Fri Jan 28 10:46:13 2011 -0500
>>> @@ -1201,11 +1201,12 @@
>>> return 0;
>>>
>>> fail:
>>> - MEM_LOG("Failure in alloc_l1_table: entry %d", i);
>>> + MEM_LOG("Failure in alloc_l1_table: entry %d of L1 (mfn: %lx). Other L1 values:", i, pfn);
>>> while ( i-- > 0 )
>>> - if ( is_guest_l1_slot(i) )
>>> + if ( is_guest_l1_slot(i) ) {
>>> + MEM_LOG("L1[%d] = %lx", i, (unsigned long)l1e_get_intpte(pl1e[i]));
>>> put_page_from_l1e(pl1e[i], d);
>>> -
>>> + }
>>> unmap_domain_page(pl1e);
>>> return -EINVAL;
>>> }
>>>
>>>>> (XEN) mm.c:1204:d0 Failure in alloc_l1_table: entry 90
>>>>> (XEN) mm.c:2142:d0 Error while validating mfn 1d7e97 (pfn 3d69) for type 1000000000000000: caf=8000000000000003 taf=1000000000000001
>>>>> (XEN) mm.c:2965:d0 Error while pinning mfn 1d7e97
>>>>> (XEN) traps.c:451:d0 Unhandled invalid opcode fault/trap [#6] on VCPU 0 [ec=0000]
>>>>> (XEN) domain_crash_sync called from entry.S
>>>>> (XEN) Domain 0 (vcpu#0) crashed on cpu#0:
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>>> http://lists.xensource.com/xen-devel
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel