On 10/27/2010 10:39 AM, Ian Campbell wrote:
> On Wed, 2010-10-27 at 18:20 +0100, Jeremy Fitzhardinge wrote:
>> On 10/27/2010 09:04 AM, Ian Campbell wrote:
>>> On Wed, 2010-10-27 at 14:18 +0100, Ian Campbell wrote:
>>>> IOW we are faulting on precisely one of the PFNs which we earlier
>>>> released. We released these because of a hole in the e820 between
>>>> 0xa0000-0x100000 which dom0 apparently manufactured.
>>>>
>>>> IIRC in a PV domU we reserve some of the legacy address space to stop
>>>> old ISA drivers etc from getting at it. I suspect this is wrong for a
>>>> dom0 where we want the e820 to be more or less unmolested. I'll track
>>>> down the code in question and see if removing it for dom0 helps.
>>> More than the issue with unnecessarily reserving memory we simply can't
>>> trust the e820 to cover all the firmware tables here. I think it is
>>> better to simply not give any memory under 1M back to Xen in the domain
>>> 0 case.
>>>
>>> 8<-----------
>>>
>>> >From e25932748e354f5da3fbb5719fb17f9ca2e35f09 Mon Sep 17 00:00:00 2001
>>> From: Ian Campbell <ian.campbell@xxxxxxxxxx>
>>> Date: Wed, 27 Oct 2010 17:02:25 +0100
>>> Subject: [PATCH] xen: do not release any memory under 1M in domain 0
>>>
>>> We already deliberately setup a 1-1 P2M for the region up to 1M in
>>> order to allow code which assumes this region is already mapped to
>>> work without having to convert everything to ioremap.
>>>
>> Looks good, but:
>>
>>> Domain 0 should not return any apparently unused memory regions
>>> (reserved or otherwise) in this region to Xen since the e820 may not
>>> accurately reflect what the BIOS has stashed in this region.
>>>
>>> Similarly do not reserve the ISA range if we are domain 0 since it is
>>> not necessarily normal, usable memory in that case.
>>>
>>> Since we now assume that we have a (reasonably) sensible e820 map in
>>> domain 0 make a failure to obtain a machine memory map from the
>>> hypervisor fatal rather than struggling on with a made up map and
>>> suffering all the potential fallout.
>>>
>>> Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
>>> ---
>>> arch/x86/xen/setup.c | 32 +++++++++++++++++++++++++-------
>>> 1 files changed, 25 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
>>> index 915b0c3..baad88c 100644
>>> --- a/arch/x86/xen/setup.c
>>> +++ b/arch/x86/xen/setup.c
>>> @@ -84,6 +84,22 @@ static unsigned long __init
>>> xen_release_chunk(phys_addr_t start_addr,
>>> start = PFN_UP(start_addr);
>>> end = PFN_DOWN(end_addr);
>>>
>>> + /*
>>> + * Domain 0 maintains a 1-1 P2M mapping for the first megabyte
>>> + * so do not return such memory to the hypervisor.
>>> + *
>>> + * This region can contain various firmware tables and the
>>> + * like which are often assumed to be always mapped and
>>> + * available via phys_to_virt.
>>> + */
>>> + if (xen_initial_domain()) {
>>> + if (end < PFN_DOWN(0x100000UL))
>>> + return 0;
>>> +
>>> + if (start < PFN_DOWN(0x100000UL))
>>> + start = PFN_DOWN(0x100000UL);
>> Use ISA_END_ADDRESS I think.
> Yes.
>
> I notice that the 1-1 P2M only seems to extend from ISA_START_ADDRESS
> (0xa0000) but the range missing from the e820 here starts at 0x9e000.
> Should we be doing the 1-1 P2M over the whole lower 1M?
I suppose. The lower 640k really should be RAM so you can boot DOS, but
I think Xen (Win7 too, I gather) just leaves it completely unused just
in case BIOS uses/trashes it.
>>> + }
>>> +
>>> if (end <= start)
>>> return 0;
>>>
>>> @@ -163,6 +179,7 @@ char * __init xen_memory_setup(void)
>>> XENMEM_memory_map;
>>> rc = HYPERVISOR_memory_op(op, &memmap);
>>> if (rc == -ENOSYS) {
>>> + BUG_ON(xen_initial_domain());
>>> memmap.nr_entries = 1;
>>> map[0].addr = 0ULL;
>>> map[0].size = mem_end;
>>> @@ -200,15 +217,16 @@ char * __init xen_memory_setup(void)
>>> }
>>>
>>> /*
>>> - * Even though this is normal, usable memory under Xen, reserve
>>> - * ISA memory anyway because too many things think they can poke
>>> - * about in there.
>>> + * Even though this is normal, usable memory in a Xen domU,
>>> + * reserve ISA memory anyway because too many things think
>>> + * they can poke about in there.
>>> *
>>> - * In a dom0 kernel, this region is identity mapped with the
>>> - * hardware ISA area, so it really is out of bounds.
>>> + * In dom0 we use the host e820 and therefore do not need to
>>> + * specially reserved anything.
>>> */
>>> - e820_add_region(ISA_START_ADDRESS, ISA_END_ADDRESS -
>>> ISA_START_ADDRESS,
>>> - E820_RESERVED);
>>> + if (!xen_initial_domain())
>>> + e820_add_region(ISA_START_ADDRESS, ISA_END_ADDRESS -
>>> ISA_START_ADDRESS,
>>> + E820_RESERVED);
>> What's the problem with making it unconditionally reserved even if the
>> host E820 doesn't?
> Perhaps nothing, but if it was necessary wouldn't it be needed on native
> too?
I don't think its *necessary* (BIOS would have to be desperately broken
to mark anything in that range as E820_RAM), but I'd prefer to avoid
unneeded conditionals just to try and make testing simpler.
J
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|