> For virtual MCEs that is ok. But note, for unmodified guests, the MC
> handler is written with the assumption that the CPU powers off when an
> #MCE happens before the handler cleared the MCIP bit in the MCG_STATUS
> MSR.
That should depend on the implementation. For example, we can inject the vMCEs
one by one, i.e. only inject the next one after the previous one has been handled.
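
To make the one-by-one idea concrete, here is a minimal sketch of such a queue. It assumes the guest sees a virtual MCG_STATUS and that a pending vMCE is only delivered while the virtual MCIP bit (bit 2) is clear. The structure `vmce_state`, the functions `vmce_queue`, `vmce_try_inject` and `vmce_wrmsr_mcg_status`, and the queue depth are all hypothetical names for illustration, not existing Xen code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MCG_STATUS_MCIP (1ULL << 2)  /* machine check in progress */
#define VMCE_QUEUE_LEN  8            /* hypothetical queue depth */

/* Hypothetical per-guest vMCE state; not an existing Xen structure. */
struct vmce_state {
    uint64_t mcg_status;               /* virtual MCG_STATUS seen by the guest */
    uint64_t pending[VMCE_QUEUE_LEN];  /* queued MCi_STATUS values */
    size_t head, count;                /* ring buffer position and fill level */
    uint64_t delivered;                /* last status injected (stand-in for real injection) */
};

/* Deliver the oldest pending vMCE, but only if the guest is not in its handler. */
static bool vmce_try_inject(struct vmce_state *s)
{
    if (s->count == 0 || (s->mcg_status & MCG_STATUS_MCIP))
        return false;
    s->delivered = s->pending[s->head];
    s->head = (s->head + 1) % VMCE_QUEUE_LEN;
    s->count--;
    s->mcg_status |= MCG_STATUS_MCIP;  /* guest handler is now running */
    return true;
}

/* Queue a new vMCE; it is injected at once only if nothing is in flight. */
static bool vmce_queue(struct vmce_state *s, uint64_t status)
{
    if (s->count == VMCE_QUEUE_LEN)
        return false;                  /* queue full, caller must decide what to do */
    s->pending[(s->head + s->count) % VMCE_QUEUE_LEN] = status;
    s->count++;
    vmce_try_inject(s);
    return true;
}

/* Called when the guest writes its virtual MCG_STATUS (e.g. to clear MCIP). */
static void vmce_wrmsr_mcg_status(struct vmce_state *s, uint64_t val)
{
    s->mcg_status = val;
    vmce_try_inject(s);                /* next vMCE goes in once MCIP is clear */
}
```

With this shape an unmodified guest never sees a second #MC while its virtual MCIP is still set, so its shutdown assumption is never triggered by Xen itself.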
>> For the contiguous pages, I agree with Gavin that such a contiguous page
>> error should be triggered as multiple #MCs, and so it is ok.
>> For the PCI config space issue, Christoph, can you please share more
>> information on it (or provide some document as Frank suggested), like: is it
>> for CE (Correctable Error) or UC (UnCorrectable Error), is it in the PCI
>> range or the PCI-E range (i.e. through 0xCF8/CFC or through MMCONFIG), how
>> is the device's BDF calculated, etc. Following is some of my understanding.
> I would like to see a generic solution that works with any feature
> requiring access to the pci space rather than a per-feature solution.
I think the solution is: Xen takes care of MCEs while dom0 takes care of CE
errors. Another solution is that all PCI access for CPU RAS is done by Xen,
since Xen owns the CPU. Some information, like how the PCI config space is
arranged, would be helpful, I think.
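
As a rough sketch of the first option (CE goes to dom0, UC stays in Xen), the decision could key off the UC bit (bit 61) of MCi_STATUS. The enum `mc_owner` and the function `mc_route` are hypothetical names used only for illustration; a real policy would likely need to look at more than this one bit.

```c
#include <stdint.h>

#define MCI_STATUS_VAL (1ULL << 63)  /* register contains a valid error */
#define MCI_STATUS_UC  (1ULL << 61)  /* error was not corrected by hardware */

/* Hypothetical: who is responsible for acting on a logged error. */
enum mc_owner { MC_OWNER_NONE, MC_OWNER_DOM0, MC_OWNER_XEN };

/* Route an error by severity: uncorrectable to Xen, correctable to dom0. */
static enum mc_owner mc_route(uint64_t mci_status)
{
    if (!(mci_status & MCI_STATUS_VAL))
        return MC_OWNER_NONE;        /* nothing logged in this bank */
    return (mci_status & MCI_STATUS_UC) ? MC_OWNER_XEN : MC_OWNER_DOM0;
}
```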
>> Firstly, if it is a CE, Xen will do nothing and dom0 will take the recovery
>> action. If it is a UC, Xen will take action when all CPUs are in SoftIRQ
>> context, and dom0 will not take action, so it should be ok.
>> Secondly, in the Xen environment, per my understanding, the CPU is owned by
>> the Xen HV, so I'm not sure whether Xen should be aware when dom0 disables
>> the L3 cache (if it is a CE). That is, should dom0 disable the cache
>> directly, or should it use a hypercall to ask Xen to do that? Keir can give
>> us more suggestions.
>> For item C, currently Xen and dom0 can both access configuration space,
>> while domU does that through PCI_frontend/backend. Because the PCI backend
>> only covers devices assigned to domU, we don't need to worry about domU, and
>> dom0 should be trusted. However, one thing left is that if this range is
>> beyond 0x100 (i.e. in the PCI-E range), we need to add MMCONFIG support in
>> Xen, although it can be added simply.
>> -- Yunhong Jiang
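
For reference, the 0x100 boundary mentioned above comes from how the two access methods encode the register offset. The sketch below assumes the standard 0xCF8/0xCFC mechanism (8-bit register offset) and the standard MMCONFIG layout (12-bit register offset). The helper names `pci_cf8_address` and `pci_mmcfg_offset` are made up for illustration, and the MMCONFIG base address is left out because it is platform specific.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_CF8_ENABLE 0x80000000u   /* enable bit of the 0xCF8 address register */

/* Value to write to port 0xCF8; only the first 0x100 bytes are reachable. */
static bool pci_cf8_address(uint8_t bus, uint8_t dev, uint8_t fn,
                            uint16_t reg, uint32_t *out)
{
    if (dev > 31 || fn > 7 || reg >= 0x100)
        return false;                /* beyond 0x100 needs MMCONFIG */
    *out = PCI_CF8_ENABLE | ((uint32_t)bus << 16) | ((uint32_t)dev << 11) |
           ((uint32_t)fn << 8) | (reg & 0xFCu);  /* dword aligned */
    return true;
}

/* Byte offset from the MMCONFIG base; covers the full 4 KiB per function. */
static bool pci_mmcfg_offset(uint8_t bus, uint8_t dev, uint8_t fn,
                             uint16_t reg, uint64_t *out)
{
    if (dev > 31 || fn > 7 || reg >= 0x1000)
        return false;
    *out = ((uint64_t)bus << 20) | ((uint64_t)dev << 15) |
           ((uint64_t)fn << 12) | reg;
    return true;
}
```

So a register at, say, offset 0x1BC of some function cannot be encoded in the first form at all, which is why Xen would need the second path for anything in the PCI-E range.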
>>> As for the Shanghai feature: Christoph, are there any documents
>>> available on that feature? What kind of errors are delivered
>>> - Frank
Xen-devel mailing list