WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] [PATCH] Fix hvm guest time to be more accurate

To: Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] Fix hvm guest time to be more accurate
From: Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>
Date: Mon, 29 Oct 2007 15:55:08 -0400
Cc: haitao.shan@xxxxxxxxx, Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, Ben Guthro <bguthro@xxxxxxxxxxxxxxx>
Delivery-date: Tue, 30 Oct 2007 09:55:00 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <C34BC908.1795D%Keir.Fraser@xxxxxxxxxxxx>
References: <C34BC908.1795D%Keir.Fraser@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla Thunderbird 1.0.7-1.1.fc4 (X11/20050929)
Keir,

I think it's a good idea to have other modes.
However, I don't believe that the mode checked in to the staging
tree will keep good time for a 64-bit Linux guest, if that was what was intended.

Here's why:
The guest running under the new option gets a clock interrupt
after being de-scheduled for a while. It calculates missed_ticks
and bumps jiffies by missed_ticks. Jiffies is now correct.
Then, with the new mode as submitted, the guest will get missed_ticks
additional interrupts. For each, the guest will add 1 to jiffies.
The guest is now missed_ticks * clock_period ahead of where it should be.

Under the old/other option, the guest tsc is continuous after a de-scheduled
period, and thus the missed_ticks calculation in the guest results in zero.
Then missed_ticks interrupts are delivered and jiffies is correct.
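
The double counting described above can be sketched with a toy model. This is only an illustration of the argument, not guest or hypervisor code: the function name ticks_accounted and its two flags are hypothetical, and the guest is assumed to compensate for exactly missed_ticks on the first interrupt after the gap, as the description says.

```python
def ticks_accounted(missed_ticks, tsc_continuous, replay_missed):
    """Ticks the guest adds to jiffies for a gap of `missed_ticks` periods.

    tsc_continuous: the guest TSC looks continuous across the gap (old option),
                    so the guest's own missed-tick calculation yields zero.
    replay_missed:  the hypervisor delivers one interrupt per missed tick.
    """
    # Compensation the guest applies itself from the TSC jump it observes.
    compensated = 0 if tsc_continuous else missed_ticks
    # Each replayed interrupt adds one more tick to jiffies.
    replayed = missed_ticks if replay_missed else 0
    return compensated + replayed


missed = 50
# Mode as submitted to staging: constant TSC offset, missed ticks replayed.
staging = ticks_accounted(missed, tsc_continuous=False, replay_missed=True)
# Old/other option: TSC continuous after the gap, missed ticks replayed.
old = ticks_accounted(missed, tsc_continuous=True, replay_missed=True)
# Constant TSC offset without replay of missed ticks.
no_replay = ticks_accounted(missed, tsc_continuous=False, replay_missed=False)

# Excess ticks relative to the true gap, per policy.
print(staging - missed, old - missed, no_replay - missed)  # prints "50 0 0"
```

In this model only the first combination ends up ahead, by missed_ticks ticks, that is missed_ticks * clock_period of guest time. The model does not try to reproduce the measured drift figures.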

I just ran a test with two 64-bit Linux guests, one Red Hat and one SLES,
under load. The hypervisor has constant tsc offset per the code submitted to the
staging tree. In each 5 sec period the guest gained 6-10 seconds against
ntp time, an error of almost 200%.

[root@vs079 ~]# while :; do ntpdate -q 0.us.pool.ntp.org; sleep 5; done
server 8.15.10.42, stratum 2, offset -0.061007, delay 0.04959
29 Oct 15:21:21 ntpdate[3892]: adjust time server 8.15.10.42 offset -0.061007 sec
server 8.15.10.42, stratum 2, offset -0.077763, delay 0.07129
29 Oct 15:21:28 ntpdate[3894]: adjust time server 8.15.10.42 offset -0.077763 sec
server 8.15.10.42, stratum 2, offset -1.733141, delay 0.20813

(load started here.)

29 Oct 15:21:35 ntpdate[3968]: step time server 8.15.10.42 offset -1.733141 sec
server 8.15.10.42, stratum 2, offset -9.648700, delay 0.04861
29 Oct 15:21:54 ntpdate[4002]: step time server 8.15.10.42 offset -9.648700 sec
server 8.15.10.42, stratum 2, offset -22.872883, delay 0.05319
29 Oct 15:22:21 ntpdate[4027]: step time server 8.15.10.42 offset -22.872883 sec
server 8.15.10.42, stratum 2, offset -29.036008, delay 0.19337
29 Oct 15:22:38 ntpdate[4039]: step time server 8.15.10.42 offset -29.036008 sec
server 8.15.10.42, stratum 2, offset -34.880845, delay 0.04944
29 Oct 15:22:46 ntpdate[4058]: step time server 8.15.10.42 offset -34.880845 sec
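
One way to read the per-sample gain off a log like the one above is to diff successive offsets. The sketch below assumes the offsets are cumulative, since ntpdate -q only queries and does not set the clock; the helper names offsets and gains are made up for this example, and the log lines are the post-load samples copied from above.

```python
import re

LOG = """\
29 Oct 15:21:35 ntpdate[3968]: step time server 8.15.10.42 offset -1.733141 sec
29 Oct 15:21:54 ntpdate[4002]: step time server 8.15.10.42 offset -9.648700 sec
29 Oct 15:22:21 ntpdate[4027]: step time server 8.15.10.42 offset -22.872883 sec
29 Oct 15:22:38 ntpdate[4039]: step time server 8.15.10.42 offset -29.036008 sec
29 Oct 15:22:46 ntpdate[4058]: step time server 8.15.10.42 offset -34.880845 sec
"""

# Matches only the summary lines, which carry the ntpdate[pid] tag.
PATTERN = re.compile(
    r"ntpdate\[\d+\]: \w+ time server \S+ offset (-?\d+\.\d+) sec"
)


def offsets(log):
    """Extract the reported offsets (seconds) in log order."""
    return [float(m.group(1)) for m in PATTERN.finditer(log)]


def gains(offs):
    """Seconds the local clock gained between consecutive samples.

    A more negative offset means the local clock moved further ahead
    of the server, so the gain is previous minus current.
    """
    return [prev - cur for prev, cur in zip(offs, offs[1:])]


for g in gains(offsets(LOG)):
    print(f"{g:.3f} s gained since previous sample")
```

The interval between samples is nominally the 5 second sleep plus the time each query takes, so these differences are a rough per-interval gain rather than an exact rate.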



With these three changes to the constant tsc offset policy in staging,
the error compared to ntp is about 0.02% under this load.

> 1. Since you are in missed_ticks(), why not increase the threshold
>     to 10 sec?
>
> 2. In missed_ticks() you should only increment pending_intr_nr by missed_ticks
>     calculated when pt_support_time_frozen(domain).
>
> 3. You might as well fix this one too since it's what we discussed and is so
>     related to constant tsc offset:
>       In pt_timer_fn, if !pt_support_time_frozen(domain) then
>       pending_intr_nr should end up with a maximum value of one.
>
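
As a rough illustration of the three changes listed above, here is a small Python model of the periodic-timer bookkeeping. It is a sketch under stated assumptions, not the vpt.c code: the names missed_ticks, pt_timer_fn, pending_intr_nr and pt_support_time_frozen come from the list, while the PeriodicTimer class, the time_frozen field, the nanosecond units, and the behaviour once the 10 sec threshold is exceeded (queue nothing extra) are assumptions made for the example.

```python
from dataclasses import dataclass

# Change 1: threshold raised to 10 sec. What happens beyond it is an
# assumption here: the model simply does not queue the missed ticks.
THRESHOLD_NS = 10 * 10**9


@dataclass
class PeriodicTimer:
    period_ns: int
    scheduled_ns: int           # time the next tick was due
    pending_intr_nr: int = 0    # interrupts waiting to be injected
    time_frozen: bool = False   # stands in for pt_support_time_frozen(domain)


def missed_ticks(pt, now_ns):
    """Advance the next timeout past now_ns; return the number of ticks missed."""
    late_ns = now_ns - pt.scheduled_ns
    if late_ns <= 0:
        return 0
    missed = late_ns // pt.period_ns + 1
    # Change 2: only queue the missed ticks when guest time is frozen.
    if pt.time_frozen and late_ns <= THRESHOLD_NS:
        pt.pending_intr_nr += missed
    # The new timeout is always updated, whatever the policy.
    pt.scheduled_ns += missed * pt.period_ns
    return missed


def pt_timer_fn(pt, now_ns):
    """Model of a timer expiry observed at now_ns."""
    if pt.time_frozen:
        pt.pending_intr_nr += 1
    else:
        # Change 3: without frozen time, at most one interrupt is pending.
        pt.pending_intr_nr = min(pt.pending_intr_nr + 1, 1)
    pt.scheduled_ns += pt.period_ns
    missed_ticks(pt, now_ns)


# A 10 ms timer whose expiry is handled 55 ms after the tick was due.
constant_offset = PeriodicTimer(period_ns=10_000_000, scheduled_ns=0)
pt_timer_fn(constant_offset, 55_000_000)
frozen = PeriodicTimer(period_ns=10_000_000, scheduled_ns=0, time_frozen=True)
pt_timer_fn(frozen, 55_000_000)
print(constant_offset.pending_intr_nr, frozen.pending_intr_nr)  # prints "1 6"
```

With the constant offset policy the guest gets a single interrupt and is expected to work out the elapsed ticks itself; with frozen time all six ticks (due at 0, 10, 20, 30, 40 and 50 ms) are queued for delivery.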

So, I think these changes are necessary for a 64-bit Linux policy. If you agree,
should they go in as fixes to the constant tsc offset policy in staging now, or
as a new policy?

thanks,
Dave



Keir Fraser wrote:

I thought the point of the mode in Haitao's patch was to still deliver the
'right' number of pending interrupts, but not stall the guest TSC while
delivering them? That's what I checked in as c/s 16237 (in staging tree). If
we want other modes too they can be added to the enumeration that c/s
defines.

-- Keir

On 29/10/07 15:00, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:

Eddie, Haitao:

The patch looks good with the following comments.

1. Since you are in missed_ticks(), why not increase the threshold
   to 10 sec?

2. In missed_ticks() you should only increment pending_intr_nr by missed_ticks
   calculated when pt_support_time_frozen(domain).

3. You might as well fix this one too since it's what we discussed and is so
   related to constant tsc offset:
     In pt_timer_fn, if !pt_support_time_frozen(domain) then
     pending_intr_nr should end up with a maximum value of one.

regards,
Dave


Dong, Eddie wrote:

Dave Winchell wrote:


Eddie,

I implemented #2B and ran a three hour test
with sles9-64 and rh4u4-64 guests. Each guest had 8 vcpus
and the box was Intel with 2 physical processors.
The guests were running large loads.
The clock was the PIT. This is my usual test setup, except that I just
as often used AMD nodes with more processors.

The time error was 0.02%, good enough for ntpd.

The implementation keeps a constant guest tsc offset.
There is no pending_nr cancellation.
When the vpt.c timer expires, it only increments pending_nr
if its value is zero.
Missed_ticks() is still calculated, but only to update the new
timeout value. There is no adjustment to the tsc offset (set_guest_time())
at clock interrupt delivery time nor at re-scheduling time.
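
The behaviour described above can be sketched as a tick-by-tick simulation. This is a simplified model and not the implementation that was tested: the simulate function and its schedule argument are hypothetical, time is counted in whole timer periods, the guest TSC is assumed to track real time (constant offset), and the guest is assumed to account for all elapsed ticks on each interrupt it receives.

```python
def simulate(schedule):
    """Return the guest's jiffies after running through `schedule`.

    schedule: one bool per timer period, True if the vcpu runs in that
              period and can take a pending clock interrupt.
    """
    pending_nr = 0
    jiffies = 0
    last_accounted = 0  # guest TSC, in tick units, at the last accounted tick

    for now, running in enumerate(schedule, start=1):
        # Timer expiry: pending_nr is only incremented if it is zero,
        # so missed ticks are never queued up for replay.
        if pending_nr == 0:
            pending_nr += 1

        if running and pending_nr:
            # Interrupt delivery: with a constant TSC offset the guest sees
            # the real elapsed time and accounts for every tick in the gap.
            pending_nr -= 1
            jiffies += now - last_accounted
            last_accounted = now

    return jiffies


# Two running periods, three descheduled periods, then two running periods.
schedule = [True, True, False, False, False, True, True]
print(simulate(schedule), len(schedule))  # prints "7 7"
```

In this model jiffies matches the number of elapsed periods as soon as the guest runs again, with no tsc offset adjustment and no pending_nr cancellation.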

So, I like this method better than the pending_nr subtract.
I'm going to work on this some more and, if all goes well,
propose a new code submission soon.
I'll put some kind of policy switch in too, which we can discuss
and modify, but it will be along the lines of what we discussed below.

Thanks for your input!

-Dave

Haitao Shan may have posted his patch; can you check whether anything is
missing?
thx, eddie


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel



