WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler

>>> On 30.09.11 at 04:51, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:

> 
>> -----Original Message-----
>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx] 
>> Sent: Thursday, September 29, 2011 11:42 PM
>> To: Liu, Jinsong
>> Cc: Keir Fraser; Jiang, Yunhong; xen-devel@xxxxxxxxxxxxxxxxxxx 
>> Subject: Re: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
>> 
>> >>> On 29.09.11 at 17:20, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx> wrote:
>> > @@ -782,8 +821,12 @@ static void intel_default_mce_uhandler(
>> >
>> >      switch (type)
>> >      {
>> > -    /* Panic if no handler for SRAR error */
>> >      case intel_mce_ucr_srar:
>> > +        if ( !guest_mode(regs) )
>> > +            *result = MCER_RESET;
>> > +        else
>> > +            *result = MCER_CONTINUE;
>> > +        break;
>> >      case intel_mce_fatal:
>> >          *result = MCER_RESET;
>> >          break;
>> 
>> Using the stack based registers for any decision in an MCE handler
>> seems bogus to me - without knowing that the error occurred
>> synchronously, the result is meaningless. Unfortunately I wasn't
> 
> I think the usage of stack in MCE handler should be case by case. 
> For example, it's ok to use it at data load instruction since the EIPV is
> valid for it.

According to the table in the manual, this is only the case on the local
thread.

> For the instruction load, not that sure and I will check it internally.
> 
> But agree that we should not do this depends on the error type (like SRAR),
> but should depends on the specific error code.
> 
>> able to spot - throughout your patch - what SRAR actually stands
>> for, and the manual is no help in that respect either. It does state,
>> however, that EIPV in three of four cases would be clear for these,
>> so using the registers on stack is likely wrong here.
>> 
>> This made me look at the current source, and there I see in
>> mce_urgent_action()
>> 
>>     if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))
>>         return -1;
>> 
>> which I think should say ... _EIPV and use || instead. Thoughts?
> 
> I think this code means, if the error happens in hypervisor mode (i.e.
> !guest_mode()), and RIPV indicate the RIP in stack can't be restarted, we
> have to panic.

Then the guest_mode() check still lacks an extra check of EIPV, like

     if ( !(gstatus & MCG_STATUS_RIPV) &&
          (!(gstatus & MCG_STATUS_EIPV) || !guest_mode(regs)))
         return -1;

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel