xen-devel

RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR

To: "Xu, Dongxiao" <dongxiao.xu@xxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Thu, 10 Dec 2009 18:00:18 -0800 (PST)
Cc: "Dugger, Donald D" <donald.d.dugger@xxxxxxxxx>
Delivery-date: Thu, 10 Dec 2009 18:02:07 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <EADF0A36011179459010BDF5142A457501D13FE61A@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> However, for HVM we should keep its behavior the same as
> on a native machine. So if the hardware supports rdtscp, we will also
> support it in HVM; if not, we will not expose that bit in cpuid
> to the guest.

As I said, I think this is a very bad idea because there
is no way to ensure that an app/OS in a VM gets the
same results as on a physical machine.
So I think the cpuid rdtscp bit should always be off.
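
To make the concern concrete, here is a minimal sketch (not
code from RHEL or Xen) of how a native application might use
the TSC_AUX value that rdtscp returns.  The struct and function
names are hypothetical, and a real application would fill the
samples from the rdtscp instruction rather than by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One rdtscp result: the 64-bit counter plus the 32-bit TSC_AUX value. */
struct tsc_sample {
    uint64_t tsc;
    uint32_t aux;   /* on a native OS: the CPU number the OS stored there */
};

/*
 * Store the elapsed ticks in *delta and return true only when both
 * samples carry the same TSC_AUX value.  Otherwise leave *delta
 * untouched and return false, so the caller can retry the measurement.
 */
static bool tsc_delta_same_cpu(struct tsc_sample start,
                               struct tsc_sample end,
                               uint64_t *delta)
{
    assert(delta != NULL);
    if (start.aux != end.aux)
        return false;           /* samples were tagged by different CPUs */
    *delta = end.tsc - start.tsc;
    return true;
}
```

On a physical machine this check rejects a delta taken across
two CPUs.  In a guest, equal values only say that the virtual
CPU number was the same, so the check can pass even though the
physical CPU changed between the two samples.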

> increases at save/restore/migration). This makes HVM support a bit
> tricky because we need to save/restore guest/host TSC_AUX at every
> VMEXIT/VMENTRY. If both PV and HVM could put the TSC_AUX write in
> context_switch(), then things would become easier for HVM support.

If you are doing a full faithful implementation of
rdtscp (i.e. with the cpuid rdtscp bit on), I agree this
is a problem.  If not, and the only use of TSC_AUX
is for the pvrdtscp algorithm, I think setting
TSC_AUX in __update_vcpu_system_time() is fine
because TSC_AUX is not part of a VM's context;
it is a channel for passing information from system
software (Xen) to applications.
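
As a rough illustration of that "communication" role, the
sketch below assumes an application treats a changed TSC_AUX
value (the domain incarnation) as a signal that whatever time
data it has cached is stale.  The names are hypothetical and
the refresh step itself is left out; this is not the pvrdtscp
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-application cache of time-conversion state. */
struct pv_time_cache {
    uint32_t incarnation;   /* TSC_AUX value the cached data belongs to */
    bool     valid;         /* false until the first refresh */
    unsigned refreshes;     /* how many times the data was re-fetched */
};

/*
 * Call with the TSC_AUX value returned by an rdtscp.  Returns true when
 * the value differs from the cached incarnation (or nothing is cached
 * yet), meaning the cached data had to be refreshed.
 */
static bool pv_time_check(struct pv_time_cache *c, uint32_t aux)
{
    assert(c != NULL);
    if (c->valid && c->incarnation == aux)
        return false;       /* same incarnation: cached data still usable */

    /* A real application would re-query the hypervisor here. */
    c->incarnation = aux;
    c->valid = true;
    c->refreshes++;
    return true;
}
```

Because the value is the same for every vcpu of the domain and
only changes at save/restore/migration, the application never
needs the guest OS to manage TSC_AUX for this to work.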

I expect that Keir will not support putting TSC_AUX
in the context switch code unless it is absolutely
necessary, as it is certainly expensive to read and
write to TSC_AUX and this cost will add to every
context switch of every VM even though very few will
actually use rdtscp/TSC_AUX.
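
One way to keep the cost down, wherever the write ends up, is
to remember the last value written on each physical CPU and
skip the MSR write when nothing changed.  The sketch below is
only an illustration of that idea: the table size, the names
and the counting stand-in for wrmsr are all hypothetical, and
none of this is existing Xen code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4    /* hypothetical; sized for illustration only */

/* Last TSC_AUX value written on each physical CPU. */
struct aux_shadow {
    uint32_t value;
    bool     valid;
};

static struct aux_shadow shadow[NR_CPUS_SKETCH];
static unsigned long msr_writes;    /* counts simulated MSR writes */

/* Stand-in for wrmsr(0xc0000103, val, 0); it only counts the call. */
static void fake_write_tsc_aux(unsigned cpu, uint32_t val)
{
    (void)cpu;
    (void)val;
    msr_writes++;
}

/* Write TSC_AUX on this CPU only if the value differs from the shadow. */
static void set_tsc_aux_lazy(unsigned cpu, uint32_t val)
{
    assert(cpu < NR_CPUS_SKETCH);
    if (shadow[cpu].valid && shadow[cpu].value == val)
        return;             /* unchanged: skip the expensive write */
    fake_write_tsc_aux(cpu, val);
    shadow[cpu].value = val;
    shadow[cpu].valid = true;
}
```

This only holds while Xen is the sole writer of TSC_AUX.  If a
guest were allowed to write the MSR directly, as in a full
faithful implementation, the shadow could no longer be trusted.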

So I think we need to decide first about approach (1),
the full faithful implementation of rdtscp.

> -----Original Message-----
> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
> Sent: Thursday, December 10, 2009 6:23 PM
> To: Dan Magenheimer; Nakajima, Jun; 
> xen-devel@xxxxxxxxxxxxxxxxxxx; Keir
> Fraser
> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> 
> 
> Dan, 
>       Thanks for reply, some comments below. 
> 
> Best Regards,
> -- Dongxiao
> 
> Dan Magenheimer wrote:
> > Hi Dongxiao --
> > 
> > There are two approaches to adding rdtscp support:
> > 
> > 1) Faithful full implementation of rdtscp instruction
> > 2) Support pvrdtscp algorithm
> > 
> > For (1), you would enable the rdtscp bit in cpuid.  Then
> > on hardware that supports rdtscp, you would do context
> > switching of TSC_AUX.  On hardware that doesn't support
> > rdtscp, you would intercept the illegal instruction trap
> > and emulate the instruction.  (TSC_AUX emulation
> > could be handled "lazily", no need to do context
> > switch for that.)
> > 
> > BUT if you look at how TSC_AUX is used by a native
> > OS**, the OS sets TSC_AUX to each physical CPU number
> > so an application can easily determine if successive
> > rdtscp instructions were not executed on the same
> > processor.  (This was important on older processors
> > that did not have invariant TSC.)  Unfortunately,
> > on Xen, this mechanism is worthless and misleading
> > because the OS believes it is setting TSC_AUX to
> > a physical CPU number but it is actually setting
> > it to a virtual CPU number, and the physical CPU
> > number may change at any time due to scheduling
> > or migration.  So an app using rdtscp will get a
> > wrong answer.
> 
> However, for HVM we should keep its behavior the same as
> on a native machine. So if the hardware supports rdtscp, we will also
> support it in HVM; if not, we will not expose that bit in cpuid
> to the guest.
> 
> > 
> > As a result, I do NOT recommend (1) and do recommend
> > that Xen should continue to return zero for the rdtscp
> > bit in cpuid.
> > 
> > For (2), setting TSC_AUX in __update_vcpu_system_time()
> > is fine (I think).  On hardware that supports rdtscp, for HVM
> > you would need to ensure that the rdtscp instruction
> > works natively (even though the rdtscp bit in cpuid
> > is not turned on for the guest).  On hardware that
> > does not support rdtscp, you would intercept the illegal
> > instruction trap and call the existing code in
> > pv_soft_rdtsc().
> 
> Putting the write of the TSC_AUX MSR in __update_vcpu_system_time()
> has a problem: the hypervisor will overwrite the value from time to time
> (for example, at do_softirq()->local_time_calibration()), even if the
> value didn't change (currently the domain incarnation value only
> increases at save/restore/migration). This makes HVM support a bit
> tricky because we need to save/restore guest/host TSC_AUX at every
> VMEXIT/VMENTRY. If both PV and HVM could put the TSC_AUX write in
> context_switch(), then things would become easier for HVM support.
> Do you have any thoughts on this? Thanks!  :-)
> 
> > 
> > Does that make sense?
> > 
> > Thanks,
> > Dan
> > 
> > ** I've looked at RHEL5.  Windows actually always
> > returns 0 for TSC_AUX.
> > 
> >> -----Original Message-----
> >> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
> >> Sent: Thursday, December 10, 2009 4:22 AM
> >> To: Dan Magenheimer; Nakajima, Jun;
> >> xen-devel@xxxxxxxxxxxxxxxxxxx; Keir
> >> Fraser
> >> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
> >> 
> >> 
> >> Hi, Dan,
> >>    I am now trying to add rdtscp support for Xen HVM guests.
> >>    I have some questions about your pvrdtscp patch. See below.
> >> 
> >> Dan Magenheimer wrote:
> >>> Hi Jun --
> >>> 
> >>>> But it's possible that multiple domains use the pvrdtscp
> >>>> algorithm, and the incarnation number is domain specific.
> >>> 
> >>> OK, I see.  The code for writing TSC_AUX is in
> >>> __update_vcpu_system_time() not in context switch.
> >> 
> >> Will you modify the place where the hypervisor writes the TSC_AUX MSR?
> >> In the current pvrdtscp logic, I think this MSR should be written
> >> at vcpu context switch. Also, this will make HVM support much easier
> >> because that MSR would not be modified by the hypervisor from time
> >> to time.
> >> 
> >>> 
> >>>> We also have the issue when adding RDTSCP support for
> >>>> HVM guests.
> >>> 
> >>> Only if you expose the rdtscp bit via cpuid.  This could
> >>> certainly be done but, as I said, is probably pointless.
> >>> (The pvrdtscp algorithm uses the instruction whether or
> >>> not the rdtscp bit is set in cpuid, since Xen emulates
> >>> it -- for PV domains only now -- if the physical machine
> >>> doesn't support the instruction.)
> >> 
> >> We are planning to add HVM support for RDTSCP, and the behavior of
> >> this instruction will follow the native way.
> >> This causes a problem: an application using the RDTSCP instruction
> >> sees different behavior on PV and HVM domains. Do you have any
> >> comments on this? Thanks!
> >> 
> >> Thanks!
> >> Dongxiao
> >> 
> >>> 
> >>> Dan
> >>> 
> >>>> -----Original Message-----
> >>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
> >>>> Sent: Wednesday, December 09, 2009 10:08 AM
> >>>> To: Dan Magenheimer; xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>> Subject: RE: Saving/Restoring IA32_TSC_AUX MSR
> >>>> 
> >>>> 
> >>>> Dan Magenheimer wrote on Wed, 9 Dec 2009 at 08:59:59:
> >>>> 
> >>>>> Hi Jun --
> >>>>> 
> >>>> 
> >>>> Dan,
> >>>> 
> >>>>> Xen doesn't expose the TSC rdtscp bit so assumes that
> >>>>> no guests depend on it.  So no save/restore of TSC_AUX
> >>>>> is necessary.  Xen could provide support for the TSC
> >>>> 
> >>>> But it's possible that multiple domains use the pvrdtscp
> >>>> algorithm, and the incarnation number is domain specific. We
> >>>> also have the issue when adding RDTSCP support for HVM guests.
> >>>> 
> >>>>> rdtscp bit and allow a guest OS to manage TSC_AUX, but
> >>>>> the existing use of TSC_AUX by Linux would fail to
> >>>>> provide the desired result across migration, so there's
> >>>>> little point.  Also the pvrdtscp algorithm now assumes
> >>>>> that Xen itself is responsible for updating TSC_AUX
> >>>>> whenever a migration (across physical machines) occurs.
> >>>>> 
> >>>>> The #define for write_rdtscp_aux is from Linux source,
> >>>>> so I didn't change the code and define the constant.
> >>>>> 
> >>>>> Dan
> >>>>> 
> >>>>>> -----Original Message-----
> >>>>>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
> >>>>>> Sent: Wednesday, December 09, 2009 9:42 AM
> >>>>>> To: xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>>>> Cc: Dan Magenheimer
> >>>>>> Subject: Saving/Restoring IA32_TSC_AUX MSR
> >>>>>> 
> >>>>>> 
> >>>>>> I see code like the following (in arch/x86/time.c), and I am
> >>>>>> wondering how the IA32_TSC_AUX MSR is saved/restored at domain
> >>>>>> switch time.
> >>>>>> 
> >>>>>>     if ( (d->arch.tsc_mode ==  TSC_MODE_PVRDTSCP) &&
> >>>>>>          boot_cpu_has(X86_FEATURE_RDTSCP) )
> >>>>>>         write_rdtscp_aux(d->arch.incarnation);
> >>>>>> 
> >>>>>> BTW,
> >>>>>> 
> >>>>>> include/asm-x86/msr.h
> >>>>>> #define write_rdtscp_aux(val) wrmsr(0xc0000103, (val), 0)
> >>>>>> 
> >>>>>> We should write this as wrmsr(MSR_TSC_AUX, (val), 0) by adding
> >>>>>> +#define MSR_TSC_AUX           0xc0000103 /* Auxiliary TSC */
> >>>>>> to include/asm-x86/msr-index.h
> >>>>>> 
> >>>>>> Thanks,
> >>>>>> Jun
> >>>>>> ---
> >>>>>> Intel Open Source Technology Center
> >>>>>> 
> >>>>>> 
> >>>> 
> >>>> Jun
> >>>> ___
> >>>> Intel Open Source Technology Center
> >>>> 
> >>>> 
> >>>> 
> >>>> 
> >>> 
> >>> _______________________________________________
> >>> Xen-devel mailing list
> >>> Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>> http://lists.xensource.com/xen-devel
