I think the major differences include: a) how to handle the #MC itself, i.e.
whether to reset the system, how to decide which components are impacted, and
what recovery action to take (page offlining etc.); b) how to handle errors
that impact a guest. As for other items like log/telemetry, I don't think our
implementation differs much from the current one.
As for how to handle the #MC, we think keeping the #MC handling in the
hypervisor has the following benefits:
a) When a #MC happens, we need to take action to reduce the severity of the
error as soon as possible. After all, a #MC is quite different from a normal
interrupt.
b) Even if Dom0 takes the central decision, most of the work will still be
invoking hypercalls to the Xen HV to carry out the action (see the sketch
below).
c) Currently every #MC first goes through Dom0 before being injected into a
DomU, but we don't see much benefit in that path, since the HV already knows
the guest quite well.
These are the main reasons we keep the #MC handling in the Xen HV.
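
To illustrate point b): even in a dom0-centered design, the final action still
ends up as a hypercall. Below is a minimal sketch of such a dom0-side flow;
the HYPERVISOR_mca_action() wrapper and the argument structure are hypothetical
placeholders for illustration, not a real Xen interface.

#include <stdint.h>

/* Hypothetical argument structure for a "dom0 decides, Xen acts" model. */
struct mca_page_offline {
    uint64_t mfn;    /* machine frame dom0 has decided to retire */
};

/* Hypothetical hypercall wrapper; the real interface differs. */
extern long HYPERVISOR_mca_action(struct mca_page_offline *op);

static int dom0_offline_bad_page(uint64_t mfn)
{
    struct mca_page_offline op = { .mfn = mfn };

    /*
     * The high-level decision was made in dom0, but the actual work
     * (pulling the page from the allocator, unmapping it from guests)
     * can only be done by the hypervisor.
     */
    return HYPERVISOR_mca_action(&op) ? -1 : 0;
}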
As for how to handle errors that impact a guest, I tried to describe the
options in
http://lists.xensource.com/archives/html/xen-devel/2008-12/msg00643.html;
basically we have 3 options (please refer to the URL above for more
information):
1) A PV #MC handler is implemented in the guest. This PV handler gets the MCA
information from the Xen HV through a hypercall; this is what is currently
implemented (see the sketch after this list);
2) Xen provides MCA MSR virtualization so that the guest's native #MC handler
can run without changes;
3) A PV #MC handler is used in the guest as in option 1, but the interface
between Xen and the guest consists of abstract events, like offlining the
offending page, terminating the current execution context, etc.
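
For reference, here is the rough shape of option 1's guest side: a PV #MC
handler that pulls the telemetry from Xen instead of touching MSRs itself.
This is only a sketch around the XEN_MC_fetch command from xen-mca.h; it
assumes the HYPERVISOR_mca() wrapper, and the field names may differ between
interface versions.

#include <xen/interface/arch-x86/xen-mca.h>

/* Sketch only: constant and field names follow xen-mca.h from memory
 * and may not match the exact interface version. */
static void pv_machine_check(void)
{
    struct xen_mc mc = {
        .cmd               = XEN_MC_fetch,
        .interface_version = XEN_MCA_INTERFACE_VERSION,
    };

    mc.u.mc_fetch.flags = XEN_MC_URGENT;   /* uncorrected-error path */

    if (HYPERVISOR_mca(&mc) == 0) {
        /* Decode the returned telemetry and decide: recover, kill the
         * affected process, or panic the guest. */
    }
}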
We selected option 2 in our current implementation, with the following
considerations:
1) With this method we can re-use the native MCE handler, which may be more
widely tested;
2) We benefit from future improvements to the native MCE handler;
3) It supports HVM guests better; in particular, this method can support HVM
and PV guests at the same time;
4) We no longer need to maintain a PV handler for each guest type.
One disadvantage of this option is that the guest (dom0) misses the physical
CPU information.
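
The core of option 2 is sketched below: Xen answers the guest's RDMSR of the
architectural MCA registers from a per-domain set of virtual banks instead of
from the hardware, so only errors Xen chose to forward to that guest are
visible. The MSR numbers are architectural; the structure and function names
are illustrative, not our actual code.

#include <stdint.h>

#define MSR_IA32_MCG_CAP     0x179
#define MSR_IA32_MCG_STATUS  0x17a
#define MSR_IA32_MC0_CTL     0x400   /* banks of CTL/STATUS/ADDR/MISC */

struct vmce_bank {
    uint64_t ctl, status, addr, misc;
};

/* Illustrative per-domain RDMSR intercept for the MCA register range. */
static int vmce_rdmsr(const struct vmce_bank *banks, unsigned int nr_banks,
                      uint32_t msr, uint64_t *val)
{
    unsigned int bank, reg;

    switch (msr) {
    case MSR_IA32_MCG_CAP:
        *val = nr_banks;     /* advertise the virtual bank count */
        return 0;
    case MSR_IA32_MCG_STATUS:
        *val = 0;            /* set when Xen injects a virtual #MC */
        return 0;
    }

    if (msr < MSR_IA32_MC0_CTL || msr >= MSR_IA32_MC0_CTL + 4 * nr_banks)
        return -1;           /* not an MCA bank MSR: handle elsewhere */

    bank = (msr - MSR_IA32_MC0_CTL) / 4;
    reg  = (msr - MSR_IA32_MC0_CTL) % 4;

    switch (reg) {
    case 0: *val = banks[bank].ctl;    break;
    case 1: *val = banks[bank].status; break;
    case 2: *val = banks[bank].addr;   break;
    case 3: *val = banks[bank].misc;   break;
    }
    return 0;
}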
We think it would be much better if we could define a clean abstract interface
between Xen and the guest, i.e. option 3; but even then, the current
implementation can serve as the method of last resort when the guest has no PV
abstract event handler installed.
In particular, we apply this method to Dom0, because once all #MC handling is
placed in the Xen HV, dom0's MCE handler is the same as a normal guest's and
we no longer need to differentiate it; you can see that the changes to dom0
for MCA are very small now. BTW, one assumption here is that all of dom0's
log/telemetry goes through the VIRQ handler, while Dom0's own #MC handling is
just for its recovery.
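
A minimal sketch of that VIRQ telemetry path, assuming the Linux event-channel
helper bind_virq_to_irqhandler() and VIRQ_MCA (defined as VIRQ_ARCH_0 in
xen-mca.h); the handler body is just a stub.

#include <linux/init.h>
#include <linux/interrupt.h>
#include <xen/events.h>
#include <xen/interface/arch-x86/xen-mca.h>

static irqreturn_t mca_virq_handler(int irq, void *dev_id)
{
    /* Fetch and log the telemetry Xen queued for dom0, e.g. via the
     * XEN_MC_fetch hypercall shown earlier. */
    return IRQ_HANDLED;
}

static int __init mca_virq_init(void)
{
    /* Bind VIRQ_MCA on CPU 0; Xen raises it whenever new machine
     * check telemetry is available for dom0 to collect. */
    int irq = bind_virq_to_irqhandler(VIRQ_MCA, 0, mca_virq_handler,
                                      0, "mce", NULL);

    return irq < 0 ? irq : 0;
}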
Of course, currently keeping the system running is far more important than
guest #MC handling, and we can simply kill the impacted guest. We implemented
the virtual MSR read/write mainly for Dom0 support (though maybe even dom0
could simply be killed for now, since it still can't do much recovery).
Thanks
Yunhong Jiang
>Today is a holiday here in the US, so I have only taken a superficial
>look at the patches.
>
>However, my initial impression is that I share Christoph's concern. I
>like the original design, where the hypervisor deals with low-level
>information collection, passes it on to dom0, which then can make a
>high-level decision and instructs the hypervisor to take high-level
>action via a hypercall. The hypervisor does the actual MSR reads and
>writes, dom0 only acts on the values provided via hypercalls.
>
>We added the physcpuinfo hypercall to stay in this framework: get
>physical information needed for analysis, but don't access any
>registers directly.
>
>It seems that these new patches blur this distinction, especially the
>virtualized msr reads/writes. I am not sure what added value they have,
>except for being able to run an unmodified MCA handler. However, I think
>that any active MCA decision making should be centralized, and that
>centralized place would be dom0. Dom0 is already very much aware of the
>hypervisor, so I don't see the advantage of having an unmodified MCA
>handler there (our MCA handlers are virtually unmodified, it's just that
>the part where the telemetry is collected is inside Xen for the dom0
>case).
>
>I also agree that different behavior for AMD and Intel chips would not
>be good.
>
>Perhaps the Intel folks can explain what the advantages of their
>approach are, and give some scenarios where their approach would be
>better? My first impression is that staying within the general framework
>as provided by Christoph's original work is the better option.
>
>- Frank
>
>