Hi,
At 14:56 +0100 on 14 Jul (1279119388), Jiang, Yunhong wrote:
> >Or any of the other types? This should be called for ram_ro, and
> >ram_logdirty certainly, and probably mmio_direct too.
>
> Yes, we need consider rw/ro/logdirty. Thanks for remind and will fix
> it. But why should we cover mmio_direct? Can you please give some
> hints?
I've seen cases where people use mmio_direct to point at actual memory,
in order to allow uncached mappings.
> For ram_shared, it deserve more consideration, seems currently the
> shared memory situation is not handled in the whole offline page flow.
Or vice versa. I'm happy to ack an initial patch that deals with the
easy cases, though.
> >I'm not sure that it's safe to nobble other types - e.g. changing from
> >grant_map_*, paging_* or ram_shared might break state-machines/refcounts
> >elsewhere.
>
> I think this code does not change anything for the refcounts, we simply
> destroy the guest.
> Or you mean race happens when other components is changing the p2m table
> also? I assume that should be ok since we only query the type and destroy the
> guest.
> Did I missed anything?
No, I was just suggesting that if you do handle other p2m types here
it might not be safe to change a page from shared, grant-map &c to
broken because it would cause bugs in the sharing/granting code.
> The background here is: In some platform, system can find poison
> memory through like memory scrubbing or L3 cache explicit write back
> (i.e. async memory checking, not in current context). However,
> whenenever the poison memory is accessed, it will cause fatal MCE and
> system crash. So we need make sure the guest can't access the broken
> memory.
OK - you're protecting against a _host_ crash here? Now I understand,
thanks.
In that case I definitely suggest that you move the domain_crash()
into the p2m lookup functions - all p2m lookups of a broken page should
return type=broken, and non-"query" p2m lookups should call
domain_crash() too. That will catch all the MMIO-emulator and shadow
paths for free.
If you also signal qemu-dm to blow its mapcache that will catch DMA too
(since it won't be able to re-map the broken page) though it's
unfortunate to have to rely on good behaviour from qemu-dm for safety.
Presumably the PV patches will solve that in a better way.
Cheers,
Tim.
P.S. Another thing I forgot - please wrap the code that sets the type to
broken in a "#ifdef __x86_64__"; it won't work on 32-bit.
--
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|