This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] [RFC] Physical hot-add cpus and TSC

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Xen-Devel (xen-devel@xxxxxxxxxxxxxxxxxxx)" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [RFC] Physical hot-add cpus and TSC
From: "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Date: Mon, 31 May 2010 15:54:53 +0800
Accept-language: en-US
Delivery-date: Mon, 31 May 2010 00:56:17 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <2d0b4e25-6fa9-4d66-9efe-a1b9e27612f5@default>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <789F9655DD1B8F43B48D77C5D30659731E78D370@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <C823EF64.1603B%keir.fraser@xxxxxxxxxxxxx> <789F9655DD1B8F43B48D77C5D30659731E78D500@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <3c99c55d-68ce-4150-b895-72fda1ff3b89@default> <789F9655DD1B8F43B48D77C5D30659731E78D89D@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <26342d1d-2141-4fb1-94ac-a398d7f553d6@default 789F9655DD1B8F43B48D77C5D30659731E78DA70@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <2d0b4e25-6fa9-4d66-9efe-a1b9e27612f5@default>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acr+o/c6aubIrcwBS8SSB5Q7oe4rLwB7UlZQ
Thread-topic: [Xen-devel] [RFC] Physical hot-add cpus and TSC

>-----Original Message-----
>From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>Sent: Saturday, May 29, 2010 4:25 AM
>To: Jiang, Yunhong; Keir Fraser; Xen-Devel (xen-devel@xxxxxxxxxxxxxxxxxxx); Ian
>Subject: RE: [Xen-devel] [RFC] Physical hot-add cpus and TSC
>> >> b) With the patch:
>> >> After add:
>> >> (XEN) TSC marked as reliable, warp = 407 (count=12)
>> >> (XEN) TSC marked as reliable, warp = 444 (count=17)
>> >> (XEN) TSC marked as reliable, warp = 525 (count=19)
>> >
>> >Hi Yunhong --
>> >
>> >Does this continue to grow?  I'm concerned that
>> >the hot-added CPU might be skewing as well?
>> >I didn't think this was possible with an Invariant
>> >TSC machine but maybe something in the hot-add
>> >isolation electronics changes the characteristics
>> >of the clock signal?
>> >Please try:
>> >
>> >for i in {0..1000}; do xm debug-key t; sleep 3; done; \
>> >xm dmesg | tail
>> >
>> >then wait an hour, and see how large the warp is.
>> >Hopefully the trend (407,444,525) is a coincidence.
>> I just looped 10 times; I will try with 1000 loops next Monday.
>> I doubt there are any isolation electronics for hot-add. Basically
>> each CPU can be hot-added except socket 0. But yes, I can have a look
>> at it.
>Hmmm... I'm not a system hardware expert, but the more I think
>about this, the more likely it seems that any hot-plug
>board must have separate QPI buses that are driven by separate
>crystals.  And there would be some kind of bus bridge/repeater
>to connect the two with a forwarding protocol.  That would
>certainly explain a growing TSC skew.

Hmm, neither am I.
I don't know how hot-plug boards are designed; our current system is in fact
NOT physically hot-add. Instead, a hardware switch will turn the CPU on/off and
trigger the CPU hotplug. And at least on this platform, there is only one 

But I doubt that a hot-plug CPU board really needs separate crystals. Is there
any special reason for that? Of course, it depends entirely on the system/board
design.

The test results follow.
(XEN) TSC marked as reliable, warp = 203 (count=155) ---->Before the insert

(XEN) TSC marked as reliable, warp = 637 (count=156) --> After hotadd

(XEN) TSC marked as reliable, warp = 644 (count=165) 

(XEN) TSC marked as reliable, warp = 652 (count=311)

(XEN) TSC marked as reliable, warp = 652 (count=609)

(XEN) TSC marked as reliable, warp = 655 (count=1206)

So the warp increases somewhat in the early stage, and then it is stable from
count 609 to count 1206.
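
To make the trend easier to read than by eyeballing the console lines, the
warp readings can be pulled out of the log and printed with the change since
the previous reading. This is only a rough sketch: warp_trend is a
hypothetical helper name, not part of Xen or xm, and it assumes the console
lines keep the exact "warp = N (count=M)" format shown above.

```shell
# Hypothetical helper (not part of Xen or xm): reads console log text on
# stdin and prints "count warp delta" for each TSC warp line it finds.
# Assumes the line format "... warp = N (count=M) ..." seen in this thread.
warp_trend() {
    # Keep only the warp lines and reduce each one to "count warp".
    sed -n 's/.*warp = \([0-9][0-9]*\) (count=\([0-9][0-9]*\)).*/\2 \1/p' |
    # Append the change in warp since the previous reading (0 for the first).
    awk 'NR == 1 { prev = $2 }
         { print $1, $2, $2 - prev; prev = $2 }'
}
```

It could be fed from "xm dmesg | warp_trend". For the six readings above the
deltas come out as 0, 434, 7, 8, 0 and 3, i.e. one jump at the hot-add and
only small changes afterwards.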

BTW, I noticed one more thing: when the system boots without hotplug, the warp
is 0. However, when I came back after the weekend, the warp was 182. Because
I had done a hotplug operation before reading the warp, I'm not sure whether it
was caused by the hotplug operation or whether the system TSC drifts very slowly.
 (XEN) TSC marked as reliable, warp = 182 (count=2)

>If so, even two single socket boards connected like that at
>initial boot (no hot-add) are really a "big NUMA" system due
>to higher cross-node latencies and might deserve a separate
>boot option anyway... this is really NOT a single system...
>it is multiple systems glued together with a fast interconnect.
>Xen (and users) should be warned that there is no free
>lunch here and the performance degradation from TSC emulation
>may be only a small part of the problem.
>Some boot option like "multiboard_interconnect" (but shorter)
>might be appropriate?  Or is there some way at boot-time
>to determine that this box does, or might (via hot-add), or
>definitely does not, go beyond point-to-point interconnect?
>A boot-time decision on TSC emulation could be driven off
>of that if it existed.

If no hot-plug happens, this should already be detectable at boot, so there is
no difference.
When hot-plug does happen, it should make no difference either, unless we can
provide a better software algorithm that can solve the one-crystal situation
but can't resolve this one.

