This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/

[Xen-devel] RE: Live migration fails due to c/s 20627

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Subject: [Xen-devel] RE: Live migration fails due to c/s 20627
From: "Xu, Dongxiao" <dongxiao.xu@xxxxxxxxx>
Date: Wed, 16 Dec 2009 00:10:11 +0800
Accept-language: en-US
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "kurt.hackel@xxxxxxxxxx" <kurt.hackel@xxxxxxxxxx>, "Dugger, Donald D" <donald.d.dugger@xxxxxxxxx>, "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>, "Zhang, Xiantao" <xiantao.zhang@xxxxxxxxx>
Delivery-date: Tue, 15 Dec 2009 08:10:29 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <bcd71c68-8c44-40f7-a4f8-8e6102af2bee@default>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <6CADD16F56BC954D8E28F3836FA7ED7105AC8685A3@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> <bcd71c68-8c44-40f7-a4f8-8e6102af2bee@default>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acp9n0Y7Z1EfbupYS/u3VNWDydN1PAAAHgtw
Thread-topic: Live migration fails due to c/s 20627
Dan Magenheimer wrote:
> Hi Dongxiao --
> Why would you disable live migration between two
> very widely used Intel processors for ALL HVM domains
> just because some domains use the rdtscp instruction?

Dan, I won't disable migration. As Keir said, I will move
the cpuid logic into xc_cpuid_x86.c so that the admin can use
the configuration file to mask the rdtscp feature through cpuid.
This is the common usage model for live migration
between two different hosts.
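
To illustrate what masking the feature amounts to, here is a minimal sketch. It is not the patch itself: the helper names are hypothetical and do not come from xc_cpuid_x86.c. The only fact relied on is that RDTSCP is advertised in CPUID leaf 0x80000001, EDX bit 27.

```c
#include <stdbool.h>
#include <stdint.h>

/* RDTSCP is advertised in CPUID leaf 0x80000001, EDX bit 27. */
#define CPUID_EXT_FEATURE_LEAF  0x80000001u
#define CPUID_EDX_RDTSCP_BIT    27u

/*
 * Hypothetical helper (not a libxc function): compute the EDX value the
 * guest should see for the extended feature leaf. If the admin asked to
 * hide RDTSCP, clear its bit; all other host bits pass through untouched.
 */
static uint32_t level_ext_edx(uint32_t host_edx, bool hide_rdtscp)
{
    if (hide_rdtscp)
        host_edx &= ~(UINT32_C(1) << CPUID_EDX_RDTSCP_BIT);
    return host_edx;
}

/* What a well-behaved guest checks before it executes rdtscp. */
static bool guest_sees_rdtscp(uint32_t guest_edx)
{
    return ((guest_edx >> CPUID_EDX_RDTSCP_BIT) & 1u) != 0;
}
```

With the bit cleared on the more-featured host, a guest that checks cpuid before using rdtscp sees the same feature set on both machines.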

> Why not just add the code to do rdtscp emulation,
> which would NOT break live migration?

The problem with adding rdtscp emulation is that, in Intel VMX, the
vmexit control for rdtsc and rdtscp is the same, so if we trap
rdtscp for emulation, the OS will suffer from lots of rdtsc vmexits,
which will degrade performance.
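
The shared control can be modelled roughly as below. This is a simplified sketch of the behaviour as I read the Intel SDM, not hypervisor code: the function and enum names are made up for illustration, and it assumes the secondary controls are activated. The bit positions are "RDTSC exiting" (primary processor-based controls, bit 12) and "enable RDTSCP" (secondary controls, bit 3).

```c
#include <stdint.h>

#define CPU_BASED_RDTSC_EXITING       0x00001000u  /* primary controls, bit 12 */
#define SECONDARY_EXEC_ENABLE_RDTSCP  0x00000008u  /* secondary controls, bit 3 */

enum tsc_insn    { INSN_RDTSC, INSN_RDTSCP };
enum tsc_outcome { RUNS_NATIVELY, VMEXIT, INVALID_OPCODE };

/*
 * Simplified model (hypothetical helper): what happens when the guest
 * executes rdtsc or rdtscp under the given VM-execution controls.
 */
static enum tsc_outcome tsc_insn_outcome(enum tsc_insn insn,
                                         uint32_t primary,
                                         uint32_t secondary)
{
    /* With "enable RDTSCP" clear, rdtscp raises #UD in the guest. */
    if (insn == INSN_RDTSCP && !(secondary & SECONDARY_EXEC_ENABLE_RDTSCP))
        return INVALID_OPCODE;

    /* One bit decides the exit for both instructions. */
    if (primary & CPU_BASED_RDTSC_EXITING)
        return VMEXIT;

    return RUNS_NATIVELY;
}
```

In this model, once rdtscp is enabled for the guest there is no setting that makes rdtscp exit while rdtsc keeps running natively.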

> There are many cases where rdtsc/rdtscp instructions
> are emulated and so most of the code is already there.
> You only need to intercept illegal instruction traps,
> so there is not a significant performance issue.
> And the code to do the emulation is necessary
> to implement the pvrdtscp algorithm on hvm anyway

I think that in an HVM environment we should respect the native
behavior. Moreover, it would be valuable for the
guest if it could get node/cpu info that reflects the
hardware topology.
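
As an example of the node/cpu info in question: rdtscp returns the IA32_TSC_AUX value in ECX, and the meaning of that value is up to the OS. The sketch below assumes the Linux x86-64 convention of packing it as (node << 12) | cpu; the struct and function names are hypothetical and only for illustration.

```c
#include <stdint.h>

/* Assumed layout (Linux x86-64 convention): bits 0-11 cpu, bits 12+ node. */
#define TSC_AUX_CPU_MASK    0xfffu
#define TSC_AUX_NODE_SHIFT  12u

struct cpu_node {
    uint32_t cpu;
    uint32_t node;
};

/* Pack cpu and node the way the OS would program IA32_TSC_AUX. */
static uint32_t encode_tsc_aux(uint32_t cpu, uint32_t node)
{
    return (node << TSC_AUX_NODE_SHIFT) | (cpu & TSC_AUX_CPU_MASK);
}

/* Unpack the value that rdtscp hands back in ECX. */
static struct cpu_node decode_tsc_aux(uint32_t aux)
{
    struct cpu_node r;
    r.cpu  = aux & TSC_AUX_CPU_MASK;
    r.node = aux >> TSC_AUX_NODE_SHIFT;
    return r;
}
```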


> (which I think was the reason this whole discussion
> started).
> Dan
>> -----Original Message-----
>> From: Xu, Dongxiao [mailto:dongxiao.xu@xxxxxxxxx]
>> Sent: Monday, December 14, 2009 9:40 PM
>> To: Keir Fraser; Dan Magenheimer
>> Cc: Jeremy Fitzhardinge; xen-devel@xxxxxxxxxxxxxxxxxxx; Kurt Hackel;
>> Dugger, Donald D; Nakajima, Jun; Zhang, Xiantao
>> Subject: RE: Live migration fails due to c/s 20627
>> Keir Fraser wrote:
>>> On 14/12/2009 18:02, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx>
>>> wrote: 
>>>> This may be true in concept, but existing tools (including
>>>> the default xm tools) do NOT check for this... I just
>>>> tested a live migration between a Nehalem (which supports
>>>> rdtscp) and a Conroe (which does not).  The live migration
>>>> works fine and the app using rdtscp runs fine on the
>>>> Nehalem and then crashes when the live migration completes
>>>> on the Conroe.  I *know* of existing code in Oracle
>>>> that will be broken by this!
>>> This is a general problem for migration between dissimilar
>>> processors. The solution is to 'level' the feature sets, by masking
>>> CPUID flags from the more-featured processor. In this case you would
>>> mask out RDTSCP (and perhaps others too). This does need the RDTSCP
>>> flag setting/clearing to be moved to xc_cpuid_x86.c, as currently
>>> the user cannot override the policy wedged into the hypervisor
>>> itself. That's an easy thing to fix.
>> I will write a patch for this. Thanks!
>> -- Dongxiao
>>>  -- Keir
>>> _______________________________________________
>>> Xen-devel mailing list
>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>> http://lists.xensource.com/xen-devel