WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

RE: [Xen-devel] expose MWAIT to dom0

To: Jan Beulich <JBeulich@xxxxxxxxxx>
Subject: RE: [Xen-devel] expose MWAIT to dom0
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Fri, 19 Aug 2011 09:31:03 +0800
Accept-language: en-US
Acceptlanguage: en-US
Cc: "Zhang, Yang Z" <yang.z.zhang@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>, "Wei, Gang" <gang.wei@xxxxxxxxx>
Delivery-date: Thu, 18 Aug 2011 18:32:28 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4E4D23370200007800051D2C@xxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4E4D23370200007800051D2C@xxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcxdozTgTc5N7HzuQiqq5gIzujcFmQAak3Iw
Thread-topic: [Xen-devel] expose MWAIT to dom0
> From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> Sent: Thursday, August 18, 2011 8:36 PM
> 
> >>> On 16.08.11 at 10:29, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> >>  From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> >> Sent: Tuesday, August 16, 2011 4:13 PM
> >>
> >> >>> On 16.08.11 at 08:53, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> >> >>  From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> >> >> Sent: Tuesday, August 16, 2011 2:42 PM
> >> >>
> >> >> >>> On 16.08.11 at 08:03, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> >> >> >>  From: Jan Beulich [mailto:JBeulich@xxxxxxxxxx]
> >> >> >> Sent: Monday, August 15, 2011 6:29 PM
> >> >> >>
> >> >> >> that while improving the situation on CPUs that support the break-on-
> >> >> >> interrupt extension to mwait, it would result in C2/C3 not being 
> >> >> >> usable
> >> >> >> at all on CPUs that don't (but support mwait in its simpler form and
> >> >> >> have ACPI tables specifying FFH as address space id). Is that only a
> >> >> >> theoretical concern (i.e. is there an implicit guarantee that for 
> >> >> >> other
> >> >> >> than C1 FFH wouldn't be specified without that extension being
> >> >> >> available)? I thinks it's a practical one, or otherwise there 
> >> >> >> wouldn't
> >> >> >> be a point in removing the ACPI_PDC_C_C2C3_FFH bit prior to _PDC
> >> >> >> evaluation.
> >> >> >
> >> >> > Yes, this is a practical one, though I don't know any box doing that. 
> >> >> > In
> >> > all
> >> >> > the boxes I've been using so far, all the extensions are available. 
> >> >> > But
> >> >> > since
> >> >> > BIOS vendor may also impact the availability of CPUID bits, I think we
> >> >> > should do it right by strictly conforming to theSDM. I.e. we need 
> >> >> > check
> >> >> > CPUID leafs and then verify all Cx states propagated from dom0,
> instead
> >> >> > of blindly following its info. Will work a patch for that.
> >> >>
> >> >> You're getting it sort of wrong way round: What I don't want to do (but
> >> >> seemingly being necessary) is mimic the decision logic the hypervisor
> >> >> uses (i.e. require the break-on-interrupt extension for C2/C3 entering
> >> >> through MWAIT) in Dom0 when deciding about the bits to pass to
> >> >
> >> > break-on-interrupt is not a hard requirement to use MWAIT. Even when
> >> > that extension is not available, MWAIT can be still used to enter C2/C3,
> >> > just with interrupt enabled.
> >>
> >> And that's why this implementation detail should be confined to the
> >> hypervisor - Dom0 should not care about this if at all possible.
> >>
> >> >> _PDC. That ought to be an implementation detail (subject to change)
> >> >> in the hypervisor alone. The hypervisor itself, otoh, already properly
> >> >> checks CPUID leaf 5 (and that's what might cause it to not use mwait
> >> >> despite the bit in CPUID leaf 1 being set, which should be all Dom0
> >> >> ought to look at for deciding whether to clear ACPI_PDC_C_C2C3_FFH).
> >> >>
> >> >
> >> > I made a mistake, that currently CPUID leaf 5 is already checked in
> >> > check_cx in hypervisor, so it should be sane. However I still fail to 
> >> > catch
> >> > your real concern here. :/
> >>
> >> If Dom0 finds (real) CPUID leaf 1 report MWAIT to be available, it
> >> will (with the logic outlined above) call _PDC without clearing
> >> ACPI_PDC_C_C2C3_FFH. If now the break-on-interrupt extension
> >> is not present, but the address space ID for C2 or C3 is set to FFH,
> >> then Xen (in acpi_processor_ffh_cstate_probe()) will reject the
> >> Cx entry (and hence refrain from using the respective C-state),
> >> whereas if Dom0 cleared ACPI_PDC_C_C2C3_FFH in that case,
> >> firmware would (normally) have converted the address space ID to
> >> SYSTEM_IO, and hence Xen would have decided to use C2/C3 with
> >> the SYSIO entry method.
> >>
> >> So this is only acceptable if there are *no* production CPUs of any
> >> vendor that would support MWAIT without the break-on-interrupt
> >> extension.
> >>
> >
> > yes, that's also the way that native Linux code currently uses:
> >     - notify BIOS ACPI_PDC_C_C2C3_FFH if cpu has mwait
> >     - reject Cx entry if break-on-interrupt extension is not present later
> > in acpi processor driver when parsing Cx entries.
> >
> > From BIOS ACPI p.o.v, OSPM can notify BIOS about FFH style if following
> > conditions are true:
> >     a) cpu supports mwait
> >     b) OSPM itself supports mwait
> >
> > a) is architectural, but b) is implementation specific regarding to what
> > can be called "support". Obviously both Xen and Linux here use an
> > inconsistent way between the place notifying BIOS and the point parsing
> > ACPI Cx entry. So your conclusion is correct that C2/C3 would be rejected
> > on the CPU which doesn't support MWAIT with break-on-interrupt
> > extension. But it should be fine in the real world, and we may consider
> > whether to do something when a real case is encountered in the future.
> 
> Actually, I just checked, and I have two systems that have MWAIT but
> no extensions to it. While one is an old SDV of yours, the other is a
> production Dell system (which I only run Windows on, so I can't really
> check how Xen would behave on it prior to and with the discussed
> adjustment).

OK, to avoid the regression, if it's really cared, then we may change Xen 
to support entering C-state by mwait with interrupt enabled.

But the next question is whether it's really worthy of enabling Xen PM 
such way:
        - I think native Linux only supports mwait with break-on-interrupt
extension too. You may confirm on such machines which I think should 
have no C2/C3 available. It's less likely for a customer to try PM on a 
platform where native Linux fails to do that
        - using mwait with interrupt enabled lacks the trace capability, 
while w/o trace I don't think any customer would enable Xen PM w/o a
verification process.

Another approach, if we really want to keep original I/O style, is to
reveal Xen's mwait related conditions in shared info page, say a simple
flag to indicate whether mwait bit should be set by pvops cpuid hook.
Xen will check mwait extension earlier before dom0 is launched, instead
of current point where dom0 registers cx info. This way there's no Xen
implementation detail encoded in dom0, while concerned regression 
could be removed.

You're OSV guys. So let's draw a conclusion from your expertise. If you
think such regression is important and should be avoided, and the 2nd
approach about share info is reasonable, I'll go for it. :-)

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel