This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


RE: [Xen-devel] Time skew on HP DL785 (and possibly other boxes)

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Subject: RE: [Xen-devel] Time skew on HP DL785 (and possibly other boxes)
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Sun, 5 Apr 2009 20:59:23 +0800
Accept-language: en-US
Cc: "john.v.morris@xxxxxx" <john.v.morris@xxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Sun, 05 Apr 2009 06:01:06 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <13d04f6f-9469-4644-a735-0ec846433397@default>
References: <49CD550B.1070908@xxxxxxxx> <13d04f6f-9469-4644-a735-0ec846433397@default>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acm0q2WP6PCpxPlrTuO7JQEdaR4e0ABQN5WA
Thread-topic: [Xen-devel] Time skew on HP DL785 (and possibly other boxes)
>From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx] 
>Sent: April 4, 2009 6:23
>I think I still have a real concern here.  Let me see if
>I can explain.
>The goal for Xen timekeeping is to ensure that if a guest
>could somehow magically read any of its virtual clocks
>(tsc, pit, hpet, pmtimer, ??) on all its virtual processors
>simultaneously, the values read must always obey this
>"virtual clock law":
>   max - min < delta
>We can argue how large that delta can reasonably be and it
>may vary depending on what the workload is, but
>it's certainly under a millisecond, ten microseconds
>might not be a bad starting point, and it is getting
>smaller as processors get faster.
>If xen can't guarantee that, then it must turn on "numa"
>mode, which appears to me to be extremely restrictive
>and no system vendor could honestly sell the true
>promise of virtualization on such a box.  So we'd like
>to avoid that if possible.
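
To make the "virtual clock law" quoted above concrete, here is a
minimal sketch of the check max - min < delta. It is not Xen code:
the function name and the sample readings are hypothetical, and it
simply assumes we already have one reading per virtual processor
taken at the same instant, in nanoseconds.

```python
def obeys_virtual_clock_law(readings_ns, delta_ns):
    """Return True if simultaneous per-vcpu readings satisfy max - min < delta."""
    if not readings_ns:
        raise ValueError("need at least one reading")
    return max(readings_ns) - min(readings_ns) < delta_ns


# Hypothetical snapshot: one virtual clock read on four vcpus at once (ns).
snapshot_ns = [1_000_000_000, 1_000_004_000, 1_000_001_500, 1_000_012_000]

TEN_MICROSECONDS = 10_000      # the suggested starting point for delta
ONE_MILLISECOND = 1_000_000    # the upper bound mentioned for delta

# The spread here is 12 us, so it passes the 1 ms bound but not the 10 us one.
print(obeys_virtual_clock_law(snapshot_ns, TEN_MICROSECONDS))
print(obeys_virtual_clock_law(snapshot_ns, ONE_MILLISECOND))
```

The point of the sketch is only that the law is a bound on the spread
across vcpus, not on the accuracy of any single clock.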

I have also heard the concern that completely random load
balancing may work suboptimally on large-scale systems because
of fierce contention on shared data structures. Some
coarse-grained soft partitioning or limits are therefore
welcome, both to control accurately the resources assigned to
a given VM and to avoid cross-node traffic where possible. In
that case enabling 'numa' could serve the purpose to some
extent: it simply confines a given VM's activity to one node,
while still allowing administrative tools to move the VM
across nodes at their discretion. I once heard that typical
deployed VMs nowadays are provisioned with 1 - 4 vcpus, which
normally fit in one node, but this may not be true in all cases.
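
As a rough illustration of that kind of soft partitioning, the
sketch below picks a single node for a VM using a best-fit rule.
This is not how Xen places VMs; pick_node, the best-fit policy and
the per-node free cpu counts are all assumptions made up for the
example.

```python
def pick_node(free_cpus_per_node, vcpus):
    """Return the index of the fullest node that still fits the VM, or None.

    Best fit keeps larger holes free for larger VMs. A result of None
    means the VM cannot be confined to a single node.
    """
    candidates = [
        (free, node)
        for node, free in enumerate(free_cpus_per_node)
        if free >= vcpus
    ]
    if not candidates:
        return None
    return min(candidates)[1]


# Hypothetical machine: four nodes with 4, 2, 0 and 3 free physical cpus.
free_cpus = [4, 2, 0, 3]

print(pick_node(free_cpus, 2))  # a 2-vcpu VM
print(pick_node(free_cpus, 8))  # an 8-vcpu VM does not fit in any one node
```

A VM that gets None back is the "not true in all cases" situation:
it has to span nodes, and cross-node TSC variance matters for it
regardless of the placement policy.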

Well, my point is a bit off topic here. Of course your
concern about cross-node TSC variance still makes sense
whether or not node affinity is enforced, as long as a VM
may be migrated across nodes. My point is just that turning
on 'numa' is not in itself an 'extremely restrictive' thing. :-)

>Note that the Linux approach doesn't work here
>because: 1) a guest's clocks might obey the "virtual clock
>law" at one moment on one set of physical processors
>and not at the next moment; 2) guests access to all
>clocks (except the tsc) is emulated so even if a guest
>decides the tsc is unreliable, that just doesn't help
>if the alternate clock it chooses (e.g. HPET) is silently
>emulated on top of xen system time using the physical tsc.

As Keir said, Xen system time itself is implemented in
a stable way. So as long as HVM timer virtualization
ultimately goes through the emulation path, it should be
stable too, at the cost of some overhead on top of the
current tsc virtualization path.
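
To show what "emulated on top of system time" could look like, here
is a minimal sketch. The scaling step is modeled loosely on the
fields of Xen's public vcpu_time_info record (tsc_timestamp,
system_time, tsc_to_system_mul, tsc_shift). Everything else is an
assumption for illustration: the function names, the sample values
and the way the emulated counter is derived are not taken from the
hypervisor source.

```python
from dataclasses import dataclass


@dataclass
class TimeInfo:
    tsc_timestamp: int      # TSC value when this record was last updated
    system_time: int        # system time in ns at that same moment
    tsc_to_system_mul: int  # 32-bit fixed-point fraction: ns per shifted tick
    tsc_shift: int          # power-of-two pre-scaling applied to the TSC delta


def read_system_time_ns(info, tsc_now):
    """Extrapolate system time (ns) from the TSC ticks since the last update."""
    delta = tsc_now - info.tsc_timestamp
    if info.tsc_shift >= 0:
        delta <<= info.tsc_shift
    else:
        delta >>= -info.tsc_shift
    return info.system_time + ((delta * info.tsc_to_system_mul) >> 32)


def emulated_counter(info, tsc_now, freq_hz):
    """Hypothetical emulated timer: a counter ticking at freq_hz of system time."""
    return read_system_time_ns(info, tsc_now) * freq_hz // 1_000_000_000


# Assumed 2 GHz TSC: 0.5 ns per tick, i.e. mul = 0.5 * 2**32 with shift 0.
info = TimeInfo(tsc_timestamp=1_000, system_time=5_000_000_000,
                tsc_to_system_mul=1 << 31, tsc_shift=0)

now_ns = read_system_time_ns(info, 1_000 + 2_000_000)  # 2,000,000 ticks later
print(now_ns)
print(emulated_counter(info, 1_000 + 2_000_000, 1_000_000))  # 1 MHz counter
```

Under these assumptions every emulated clock inherits whatever
stability system time has, since the guest never sees the raw
physical tsc on this path.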
