   
 
 
 
   
 

xen-devel

RE: [Xen-devel] expose MWAIT to dom0

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: RE: [Xen-devel] expose MWAIT to dom0
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Mon, 15 Aug 2011 16:09:35 +0800
Accept-language: en-US
Cc: "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, "Wei, Gang" <gang.wei@xxxxxxxxx>
Delivery-date: Mon, 15 Aug 2011 01:14:09 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4E48EEB50200007800051398@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <625BA99ED14B2D499DC4E29D8138F15062D2E80C3A@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <4E48EEB50200007800051398@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcxbIbLP/ZBaSIAcSnus6w6MAL3QqAAAFYAw
Thread-topic: [Xen-devel] expose MWAIT to dom0
> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> Sent: Monday, August 15, 2011 4:02 PM
> 
> >>> On 15.08.11 at 07:35, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > There are basically two methods to enter a given C-state: legacy (hlt + I/O
> > read) and native (using mwait). MWAIT is always preferred when both the
> > underlying CPU and the OS support it, as it is a more efficient way to
> > conduct C-state transitions.
> >
> > Xen PM relies on Dom0 to parse ACPI Cx/Px information, which involves one
> > step to notify the BIOS about the set of capabilities supported by OSPM.
> > One capability is mwait support: if it is set, the ACPI Cx entry contains
> > entry parameters for mwait; otherwise I/O port information is provided.
> > Xen PM later decides the entry method (I/O or mwait) based on the ACPI
> > information parsed by dom0.
> >
> > However Xen doesn't expose MWAIT capability to dom0 due to changeset 17573:
> >
> > This then causes a problem for Dom0, which thinks the underlying CPU
> > doesn't report mwait, and thus notifies the BIOS to use the old I/O based
> > method.
> >
> > Later a trick is integrated in Jeremy's pvops tree:
> >
> > --- a/arch/x86/kernel/acpi/processor.c
> > +++ b/arch/x86/kernel/acpi/processor.c
> > @@ -60,7 +60,7 @@ static void init_intel_pdc(struct acpi_processor *pr, struct cpuinfo_x86 *c)
> >         /*
> >          * If mwait/monitor is unsupported, C2/C3_FFH will be disabled
> >          */
> > -       if (!cpu_has(c, X86_FEATURE_MWAIT))
> > +       if (!cpu_has(c, X86_FEATURE_MWAIT) && !xen_initial_domain())
> >                 buf[2] &= ~(ACPI_PDC_C_C2C3_FFH);
> >
> >         obj->type = ACPI_TYPE_BUFFER;
> >
> > The above trick is ugly and error-prone, since it always enables mwait
> > regardless of actual CPU capability.
> 
> 3.x (and later 2.6.3x) don't look at the CPUID flag anymore, they just
> check boot_option_idle_override, which is being controlled from the
> command line or enforced for some particular systems based on DMI
> data.

I don't think so. "boot_option_idle_override" controls how the idle loop
is implemented, which has the side effect of disabling MWAIT when CPUID
reports it but "idle=nomwait" is specified. But it cannot enable MWAIT if
Xen doesn't tell dom0 about it.

Check arch_acpi_set_pdc_bits under x86:
static inline void arch_acpi_set_pdc_bits(u32 *buf)
{
        struct cpuinfo_x86 *c = &cpu_data(0);

        buf[2] |= ACPI_PDC_C_CAPABILITY_SMP;

        if (cpu_has(c, X86_FEATURE_EST))
                buf[2] |= ACPI_PDC_EST_CAPABILITY_SWSMP;
 
        if (cpu_has(c, X86_FEATURE_ACPI))
                buf[2] |= ACPI_PDC_T_FFH;
        
        /*
         * If mwait/monitor is unsupported, C2/C3_FFH will be disabled
         */
        if (!cpu_has(c, X86_FEATURE_MWAIT))
                buf[2] &= ~(ACPI_PDC_C_C2C3_FFH);
}       
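
To make the point concrete, here is a minimal standalone sketch of the
logic above. It is not kernel code: the demo_* names, the struct, and the
bit values are placeholders made up for illustration, and it assumes the
idle override is only ever used to clear the C2/C3 FFH bit. Under those
assumptions, no override setting can produce the FFH bit when CPUID
(as seen by dom0) hides MWAIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; not taken from kernel headers. */
#define DEMO_PDC_C_C2C3_FFH    0x0200u
#define DEMO_PDC_C_CAPABILITY  (0x011au | DEMO_PDC_C_C2C3_FFH)

/* Hypothetical stand-in for what dom0 knows about the CPU and command line. */
struct demo_cpu {
    bool cpuid_mwait;   /* MWAIT feature bit as visible to dom0 via CPUID */
    bool idle_nomwait;  /* "idle=nomwait" style override requested */
};

/* Compute the capability word that would be reported to the BIOS. */
uint32_t demo_pdc_bits(const struct demo_cpu *c)
{
    uint32_t bits = DEMO_PDC_C_CAPABILITY;

    /* CPUID check: without MWAIT, mwait-based C2/C3 entry is withdrawn. */
    if (!c->cpuid_mwait)
        bits &= ~DEMO_PDC_C_C2C3_FFH;

    /* Assumed override behaviour: it can only clear the bit, never set it. */
    if (c->idle_nomwait)
        bits &= ~DEMO_PDC_C_C2C3_FFH;

    return bits;
}

/* Walk all four combinations and check that a hidden MWAIT always wins. */
void demo_check_override_never_enables(void)
{
    for (int mwait = 0; mwait <= 1; mwait++) {
        for (int nomwait = 0; nomwait <= 1; nomwait++) {
            struct demo_cpu c = { .cpuid_mwait = mwait, .idle_nomwait = nomwait };
            bool ffh = (demo_pdc_bits(&c) & DEMO_PDC_C_C2C3_FFH) != 0;

            /* FFH is advertised only with MWAIT visible and no override. */
            assert(ffh == (mwait && !nomwait));
        }
    }
}
```

Calling demo_check_override_never_enables() from a small main() exercises
all four cases; the only combination that keeps the FFH bit is MWAIT
visible with no override, which is why the CPUID exposure has to come
from Xen.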

> 
> > It's unlikely to make it upstream, and it also gets lost in
> > some distros such as SLES11.
> 
> We can certainly fix it there.
> 

That'd be great. The I/O method has an observable impact on power efficiency,
and the fix would be very welcome. :-)

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel