>-----Original Message-----
>From: Tim Deegan [mailto:Tim.Deegan@xxxxxxxxxx]
>Sent: Wednesday, July 14, 2010 5:28 PM
>To: Jiang, Yunhong
>Cc: Keir Fraser; xen-devel
>Subject: Re: [PATCH] Unmmap guest's EPT mapping for poison memory
>
>Hi,
>
>At 08:41 +0100 on 14 Jul (1279096872), Jiang, Yunhong wrote:
>> diff -r bf51b671f269 xen/arch/x86/cpu/mcheck/vmce.c
>> --- a/xen/arch/x86/cpu/mcheck/vmce.c Mon Jul 12 13:59:39 2010 +0800
>> +++ b/xen/arch/x86/cpu/mcheck/vmce.c Mon Jul 12 14:30:21 2010 +0800
>> @@ -558,3 +558,28 @@ int is_vmce_ready(struct mcinfo_bank *ba
>>
>> return 0;
>> }
>> +
>> +/* Now we only have support for HAP guest */
>> +int unmmap_broken_page(struct domain *d, unsigned long mfn, unsigned long gfn)
>> +{
>> + /* Always trust dom0 */
>> + if ( d == dom0 )
>> + return 0;
>> +
>> + if (is_hvm_domain(d) && (paging_mode_hap(d)) )
>> + {
>> + p2m_type_t pt;
>> +
>> + gfn_to_mfn_query(d, gfn, &pt);
>> + /* What will happen if is paging-in? */
>> + if ( pt == p2m_ram_rw )
>
>Or any of the other types? This should be called for ram_ro, and
>ram_logdirty certainly, and probably mmio_direct too.
Yes, we need to consider rw/ro/logdirty. Thanks for the reminder; I will fix
it. But why should we cover mmio_direct? Can you please give some hints?
ram_shared deserves more consideration; it seems the shared-memory case is
currently not handled anywhere in the page-offline flow.
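To make the intended check concrete, here is a minimal sketch of a predicate
covering rw/ro/logdirty. The enum is only a stand-in for Xen's p2m_type_t
(its members and ordering are assumptions), and the helper name
sketch_p2m_type_needs_unmap is hypothetical. mmio_direct and ram_shared are
deliberately left out because they are still under discussion.

```c
#include <stdbool.h>

/* Stand-in for Xen's p2m_type_t. This is NOT the real definition: only the
 * type names mentioned in this thread are modelled, and their values and
 * ordering are assumptions. */
typedef enum {
    p2m_ram_rw,
    p2m_ram_ro,
    p2m_ram_logdirty,
    p2m_mmio_direct,
    p2m_ram_shared,
    p2m_invalid
} sketch_p2m_type_t;

/* Hypothetical helper: returns true for the ordinary guest RAM types that
 * the review asks the broken-page path to cover. Every other type falls
 * through to false until its handling is agreed. */
static bool sketch_p2m_type_needs_unmap(sketch_p2m_type_t pt)
{
    switch ( pt )
    {
    case p2m_ram_rw:
    case p2m_ram_ro:
    case p2m_ram_logdirty:
        return true;
    default:
        return false;
    }
}
```

With something like this, the `pt == p2m_ram_rw` test in the patch would
become a call to the helper on the queried type.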
>
>I'm not sure that it's safe to nobble other types - e.g. changing from
>grant_map_*, paging_* or ram_shared might break state-machines/refcounts
>elsewhere.
I think this code does not change any refcounts; we simply destroy the
guest.
Or do you mean a race when other components are changing the p2m table at
the same time? I assume that should be OK, since we only query the type and
destroy the guest.
Did I miss anything?
>
>Actually wouldn't it be better to encode brokenness in the frametable
Encoding brokenness in the frametable is already done. But that is only a
mark, which ensures the page will not be allocated again. If the page is in
use by a guest, we also need to unmap it from that guest, so that the guest
can't access the memory anymore.
The background here is: on some platforms, the system can find poisoned
memory through mechanisms such as memory scrubbing or explicit L3 cache
write-back (i.e. asynchronous memory checking, not in the current context).
However, whenever the poisoned memory is accessed, it causes a fatal MCE and
a system crash. So we need to make sure the guest can't access the broken
memory.
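A toy model of the difference between marking and unmapping, as a sketch
only. None of this is Xen's real frametable or p2m code: struct sketch_page,
the flat gfn-to-mfn array, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_GFNS     4
#define SKETCH_INVALID_MFN (~0UL)

/* Toy per-frame state standing in for a frametable entry (assumption). */
struct sketch_page {
    bool broken;   /* set once the frame is known to be poisoned */
    bool in_use;   /* frame is currently owned by a guest */
};

/* The broken mark only affects future allocations of the frame. */
static bool sketch_can_allocate(const struct sketch_page *pg)
{
    return !pg->broken && !pg->in_use;
}

/* Existing mappings survive the mark, so they have to be removed
 * explicitly. Clears every entry of the toy gfn -> mfn table that points
 * at mfn and returns how many entries were cleared. */
static size_t sketch_unmap_mfn(unsigned long *p2m, size_t nr,
                               unsigned long mfn)
{
    size_t removed = 0;

    for ( size_t i = 0; i < nr; i++ )
    {
        if ( p2m[i] == mfn )
        {
            p2m[i] = SKETCH_INVALID_MFN;
            removed++;
        }
    }

    return removed;
}
```

In this model, setting `broken` alone leaves the table entries untouched,
which is exactly the window in which a guest access would hit the poisoned
frame.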
>instead of the P2M and then forbid new mappings of broken MFNs? It's
>not really a property of the PFN (wasn't there a patch series a while
>ago that swapped broken MFNs under a VM's feet?).
Swapping a broken MFN is in fact for the case where the page is likely to
become broken (i.e. the page can still be accessed). For example, for a page
with ECC support, when too many corrected errors (i.e. 1-bit errors) happen
on a page, we assume the page is fragile and may have an uncorrectable error
(a 2-bit error) in the future, so swapping it with a new page keeps things
running. However, if the page is already broken, we can't access it anymore
(doing so usually causes an MCE); in that situation we can't swap the page,
only unmap it.
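The swap-versus-unmap decision described above could be sketched roughly as
follows. The enum, the function name, and especially the corrected-error
threshold are assumptions for illustration; they are not taken from the
patch or from the earlier swap series.

```c
#include <stdbool.h>

/* Hypothetical actions for a page that has reported memory errors. */
enum sketch_action {
    SKETCH_NONE,   /* leave the page alone */
    SKETCH_SWAP,   /* copy contents to a fresh page and remap */
    SKETCH_UNMAP   /* remove the guest mapping without touching contents */
};

/* Assumed threshold of corrected (1-bit) errors before a page is treated
 * as fragile. The real policy value is not given in this thread. */
#define SKETCH_CE_THRESHOLD 8

static enum sketch_action sketch_choose_action(unsigned int corrected_errors,
                                               bool uncorrectable)
{
    /* Already broken: reading the page to copy it would itself raise an
     * MCE, so swapping is not possible and only unmapping is left. */
    if ( uncorrectable )
        return SKETCH_UNMAP;

    /* Fragile but still readable: swap before it gets worse. */
    if ( corrected_errors > SKETCH_CE_THRESHOLD )
        return SKETCH_SWAP;

    return SKETCH_NONE;
}
```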
Hope this makes things clear.
Thanks
--jyh
>
>Cheers,
>
>Tim.
>
>--
>Tim Deegan <Tim.Deegan@xxxxxxxxxx>
>Principal Software Engineer, XenServer Engineering
>Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)