Dan Magenheimer wrote:
> Well, although it might be nice to be able to use
> rdtscp and TSC_AUX to determine pcpu/vcpu/pnode/vnode
> information, I think Jeremy and Jan convinced me in
> another thread a couple of months ago that in userland:
> x = vgetcpu()
> y = vgetcpu()
> if x==1 and y==2, there's no way to determine that
> do_other_stuff() was executed on cpu 1 vs cpu 2,
> or (though unlikely) even on cpu 3. And if
> x==y==4, there's no guarantee that do_other_stuff()
> is executed on cpu 4.
> If this is true the only safe use of TSC_AUX is for
> its originally designed intent: To determine if two
> successive rdtscp instructions were or were not
> executed on the same processor. Since this cannot
> be guaranteed in a VM, that's a reasonable argument
> that TSC_AUX shouldn't be exposed at all (meaning the
> rdtscp bit in cpuid should be turned off by Xen).
Why do you think this is the design intent of this instruction?
For guest NUMA support, each vcpu of a VM should be pinned to logical
processors belonging to one specific node (disabling vcpu migration between
nodes), I think; otherwise, virtual NUMA may suffer a performance loss. For
example, take a NUMA system with two nodes, where each node has 4G of memory
and 8 logical processors. If we create a VM with 2G of memory and 4 vcpus on
this Xen system, Xen may allocate 1G of memory from physical node 0 and
another 1G from physical node 1. In this case, if we virtualize NUMA for this
VM, vcpu0 and vcpu1 can be assigned to virtual node 0, and vcpu2 and vcpu3 to
virtual node 1; we can then safely pin vcpu0 and vcpu1 to physical node 0's 8
logical processors and, accordingly, pin vcpu2 and vcpu3 to physical node 1's
8 logical processors. Since TSC_AUX is virtualized per vcpu, and its value is
saved/restored when the vcpu migrates, an application that always runs on one
virtual processor should get a fixed value from vgetcpu, even if that vcpu
often migrates among the logical processors of one node.
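As a hedged illustration, the pinning scheme above might look like this in a guest config file. The exact syntax varies across toolstack versions, and the mapping of pcpus 0-7 to node 0 and 8-15 to node 1 is assumed from the example:

```
# Hypothetical config for the 2-node example above
memory = 2048
vcpus  = 4
# vcpu0/vcpu1 -> node 0's pcpus, vcpu2/vcpu3 -> node 1's pcpus
cpus   = ["0-7", "0-7", "8-15", "8-15"]
```

Each vcpu is then free to migrate within its node's processors but never across nodes, which is what keeps the per-vcpu TSC_AUX value stable from the application's point of view.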
Back to this topic: in short, we can't mix the guest's virtual TSC_AUX with
the host's TSC_AUX. On a switch into an HVM vcpu's context, that vcpu's
virtual TSC_AUX value is loaded into the physical TSC_AUX MSR, and when the
vcpu is scheduled out, the host's TSC_AUX value (which may be used for PV
guests) is loaded back.
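The save/restore described above can be sketched as follows. This is a simulation, not Xen's actual code: the struct layout and function names are illustrative, and `wrmsr_tsc_aux()` stands in for a real `wrmsr` of IA32_TSC_AUX (MSR 0xc0000103).

```c
#include <stdint.h>

#define MSR_TSC_AUX 0xc0000103  /* IA32_TSC_AUX */

/* Hypothetical per-vcpu state; not Xen's actual struct layout. */
struct vcpu_state {
    uint32_t virt_tsc_aux;   /* guest-visible TSC_AUX value */
};

static uint32_t host_tsc_aux;  /* host value, may be used for PV guests */
static uint32_t phys_tsc_aux;  /* stand-in for the physical MSR */

/* In real code this would be wrmsr(MSR_TSC_AUX, v, 0). */
static void wrmsr_tsc_aux(uint32_t v)
{
    phys_tsc_aux = v;
}

/* Switching *to* an HVM vcpu: load its virtual TSC_AUX. */
void ctxt_switch_to(const struct vcpu_state *v)
{
    wrmsr_tsc_aux(v->virt_tsc_aux);
}

/* Scheduling the vcpu *out*: restore the host's value. */
void ctxt_switch_from(const struct vcpu_state *v)
{
    (void)v;
    wrmsr_tsc_aux(host_tsc_aux);
}
```

The invariant is simply that the physical MSR always holds the virtual value while the HVM vcpu runs, and the host value otherwise, so the two are never mixed.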
> True, as long as the information is ONLY used
> heuristically to obtain pcpu/vcpu/pnode/vnode info,
> and no guarantee of correctness is implied or expected,
> it might be useful some of the time.
> But frankly, if "performance sucks" when the heuristic
> fails due to the fact that the app is running on
> a VM instead of native OS, I'd see that as a problem
> and suggest the proper way to fix that is to define
> more App-to-Xen ABIs so that the app can get the
> real information, not a heuristic. Which also argues
> for Xen leaving the rdtscp bit in cpuid turned off
>> -----Original Message-----
>> From: Nakajima, Jun [mailto:jun.nakajima@xxxxxxxxx]
>> Sent: Friday, December 11, 2009 12:30 PM
>> To: Jeremy Fitzhardinge; Dan Magenheimer
>> Cc: Keir Fraser; Zhang, Xiantao; Xu, Dongxiao;
>> xen-devel@xxxxxxxxxxxxxxxxxxx; Dugger, Donald D
>> Subject: RE: [Xen-devel] RE: Saving/Restoring IA32_TSC_AUX MSR
>> Jeremy Fitzhardinge wrote on Fri, 11 Dec 2009 at 10:50:29:
>>> On 12/11/09 10:35, Dan Magenheimer wrote:
>>>>> However, the vcpu number is definitely useful to usermode apps,
>>>>> so they can get some idea how they're moved between (v)cpus. I
>>>>> don't think it will matter to them that it isn't pcpu.
>>>> My point is that an app running on native Linux can
>>>> safely assume that, if TSC_AUX==3 at time T1 and
>>>> TSC_AUX is still 3 at time T2, it is running
>>>> on the same processor and the same node at both T1
>>>> and T2. In a virtual environment it cannot even
>>>> assume it is running on the same machine.
>>>> Further if the app sees that TSC_AUX==2 at time T3
>>>> and TSC_AUX==3 at time T4, on native Linux it
>>>> can safely assume that it is running on a different
>>>> processor. While rarer, in a virtual environment,
>>>> this may also be a false assumption.
>>>> That's why I say the information is misleading.
>>> Sure, but that info is, at best, of heuristic value, and won't
>>> cause any correctness problems if it is wrong. The performance may
>>> suck, but that's part of the larger problem of running NUMA-aware
>>> code in a virtual environment.
>> And to utilize various NUMA optimizations in the kernel/apps
>> in the guest, we need "the virtual numa info bears some vague
>> resemblance to the real topology" (from Jeremy's email) with
>> the vcpus bound to the CPU/node.
>> I understand that enabling RDTSCP in HVM will disable the
>> pvrdtscp algorithm if used by the kernel. One way is to mask
>> off the feature in CPUID (by default). Then kernel won't use it.
>> Intel Open Source Technology Center