WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler

To: Jan Beulich <JBeulich@xxxxxxxx>, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
From: "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
Date: Fri, 30 Sep 2011 15:44:22 +0800
Accept-language: en-US
Acceptlanguage: en-US
Cc: "keir.xen@xxxxxxxxx" <keir.xen@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Fri, 30 Sep 2011 00:45:21 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4E858AF30200007800058A64@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <BC00F5384FCFC9499AF06F92E8B78A9E263B557B77@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E84ADF70200007800058882@xxxxxxxxxxxxxxxxxxxx> <789F9655DD1B8F43B48D77C5D306597312D2366A9B@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E858AF30200007800058A64@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acx/QhdPC4ZtXKfdTlGx7gQZNw7G3wAALQVQ
Thread-topic: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
Jan Beulich wrote:
>>>> On 30.09.11 at 04:51, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
>>>> wrote: 
> 
>> 
>>> -----Original Message-----
>>> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
>>> Sent: Thursday, September 29, 2011 11:42 PM
>>> To: Liu, Jinsong
>>> Cc: Keir Fraser; Jiang, Yunhong; xen-devel@xxxxxxxxxxxxxxxxxxx
>>> Subject: Re: [Xen-devel] [PATCH] X86 MCE: Add SRAR handler
>>> 
>>>>>> On 29.09.11 at 17:20, "Liu, Jinsong" <jinsong.liu@xxxxxxxxx>
>>>>>> wrote: 
>>>> @@ -782,8 +821,12 @@ static void intel_default_mce_uhandler(
>>>> 
>>>>      switch (type)
>>>>      {
>>>> -    /* Panic if no handler for SRAR error */
>>>>      case intel_mce_ucr_srar:
>>>> +        if ( !guest_mode(regs) )
>>>> +            *result = MCER_RESET;
>>>> +        else
>>>> +            *result = MCER_CONTINUE;
>>>> +        break;
>>>>      case intel_mce_fatal:
>>>>          *result = MCER_RESET;
>>>>          break;
>>> 
>>> Using the stack based registers for any decision in an MCE handler
>>> seems bogus to me - without knowing that the error occurred
>>> synchronously, the result is meaningless. Unfortunately I wasn't
>> 
>> I think the usage of stack in MCE handler should be case by case.
>> For example, it's ok to use it at data load instruction since the
>> EIPV is valid for it.
> 
> According to the table in the manual, this is only the case on the
> local thread.
> 

OK, we can remove srar check at mce isr 'uhandler'.
so at mce isr, the check
        if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs)) 
                return -1;
is a total insurance for the error which triggered at hypervisor while cannot 
restart from RIP.


>> For the instruction load, not that sure and I will check it
>> internally. 
>> 
>> But agree that we should not do this depends on the error type (like
>> SRAR), but should depends on the specific error code.
>> 
>>> able to spot - throughout your patch - what SRAR actually stands
>>> for, and the manual is no help in that respect either. It does
>>> state, however, that EIPV in three of four cases would be clear for
>>> these, so using the registers on stack is likely wrong here.
>>> 
>>> This made me look at the current source, and there I see in
>>> mce_urgent_action() 
>>> 
>>>     if ( !(gstatus & MCG_STATUS_RIPV) && !guest_mode(regs))        
>>> return -1; 
>>> 
>>> which I think should say ... _EIPV and use || instead. Thoughts?
>> 
>> I think this code means, if the error happens in hypervisor mode
>> (i.e. !guest_mode()), and RIPV indicate the RIP in stack can't be
>> restarted, we have to panic.
> 
> Then the guest_mode() check still lacks an extra check of EIPV, like
> 
>      if ( !(gstatus & MCG_STATUS_RIPV) &&
>           (!(gstatus & MCG_STATUS_EIPV) || !guest_mode(regs)))
>          return -1;
> 

That would be overkilled.
Considering instruction fetch error occur at guest context, hypervisor deliver 
to guest to handle the error is perfer, not panic all system.

Thanks,
Jinsong
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel