Keir,
I think it's a good idea to have other modes.
However, I don't believe that the mode checked in to the staging tree
will keep good time for a 64-bit Linux guest, if that was what was intended.
Here's why:
The guest running under the new option gets a clock interrupt
after being de-scheduled for a while. It calculates missed_ticks
and bumps jiffies by missed_ticks, so jiffies is now correct.
Then, with the new mode as submitted, the guest gets missed_ticks
additional interrupts, and for each one it adds 1 to jiffies.
The guest is now missed_ticks * clock_period ahead of where it should be.
Under the old/other option, the guest tsc is continuous after a de-scheduled
period, and thus the missed_ticks calculation in the guest results in zero.
Then the missed_ticks interrupts are delivered and jiffies is correct.
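To make the arithmetic concrete, here's a toy model of the bookkeeping
(not actual Linux or Xen code; the function and parameter names are made up):

/* n = ticks that elapsed while the vcpu was de-scheduled. */
unsigned long jiffies_gained(unsigned long n, int tsc_runs_while_descheduled)
{
    unsigned long jiffies = 0;

    /*
     * First interrupt after the vcpu runs again: the guest's lost-tick
     * code compares the (virtual) tsc with the last tick it handled.
     * With a constant tsc offset the tsc has run ahead, so it sees n
     * missed ticks; with the frozen-time policy it sees zero.
     */
    jiffies += (tsc_runs_while_descheduled ? n : 0) + 1;

    /*
     * The hypervisor then injects the n backlogged interrupts, and the
     * guest adds one jiffy for each.
     */
    jiffies += n;

    /* Correct total is n + 1; constant offset plus backlog gives 2n + 1. */
    return jiffies;
}

So for every n ticks spent de-scheduled, the guest's clock moves ahead by
roughly an extra n ticks.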
I just ran a test with two 64-bit Linux guests, one Red Hat and one SLES,
under load. The hypervisor has a constant tsc offset per the code submitted
to the staging tree. In each 5 sec period the guest gained 6-10 seconds
against NTP time, an error of almost 200%.
[root@vs079 ~]# while :; do ntpdate -q 0.us.pool.ntp.org; sleep 5; done
server 8.15.10.42, stratum 2, offset -0.061007, delay 0.04959
29 Oct 15:21:21 ntpdate[3892]: adjust time server 8.15.10.42 offset
-0.061007 sec
server 8.15.10.42, stratum 2, offset -0.077763, delay 0.07129
29 Oct 15:21:28 ntpdate[3894]: adjust time server 8.15.10.42 offset
-0.077763 sec
server 8.15.10.42, stratum 2, offset -1.733141, delay 0.20813
(load started here.)
29 Oct 15:21:35 ntpdate[3968]: step time server 8.15.10.42 offset
-1.733141 sec
server 8.15.10.42, stratum 2, offset -9.648700, delay 0.04861
29 Oct 15:21:54 ntpdate[4002]: step time server 8.15.10.42 offset
-9.648700 sec
server 8.15.10.42, stratum 2, offset -22.872883, delay 0.05319
29 Oct 15:22:21 ntpdate[4027]: step time server 8.15.10.42 offset
-22.872883 sec
server 8.15.10.42, stratum 2, offset -29.036008, delay 0.19337
29 Oct 15:22:38 ntpdate[4039]: step time server 8.15.10.42 offset
-29.036008 sec
server 8.15.10.42, stratum 2, offset -34.880845, delay 0.04944
29 Oct 15:22:46 ntpdate[4058]: step time server 8.15.10.42 offset
-34.880845 sec
With these three changes to the constant tsc offset policy in staging,
the error compared to NTP is about 0.02% under this load:
> 1. Since you are in missed_ticks(), why not increase the threshold
>    to 10 sec?
>
> 2. In missed_ticks() you should only increment pending_intr_nr by
>    missed_ticks calculated when pt_support_time_frozen(domain).
>
> 3. You might as well fix this one too since it's what we discussed and
>    is so related to constant tsc offset:
>    In pt_timer_fn, if !pt_support_time_frozen(domain) then
>    pending_intr_nr should end up with a maximum value of one.
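Roughly what I have in mind is sketched below. This is not a patch against
vpt.c; the struct fields, the SECONDS()/threshold handling, and the
time_frozen flag (standing in for pt_support_time_frozen(domain)) are
simplified so the fragment stands on its own, and the call sites would be
as in the existing code.

#include <stdint.h>
#include <stdbool.h>

typedef int64_t s_time_t;                      /* ns */
#define SECONDS(s)  ((s_time_t)(s) * 1000000000LL)
#define MISSED_TICKS_THRESHOLD  SECONDS(10)    /* (1): raise the threshold to 10 sec */

struct periodic_time {
    s_time_t period;                           /* tick period, ns */
    s_time_t scheduled;                        /* next expiry */
    uint64_t pending_intr_nr;                  /* ticks queued for injection */
    bool     time_frozen;                      /* stand-in for pt_support_time_frozen(d) */
};

static void missed_ticks(struct periodic_time *pt, s_time_t now)
{
    s_time_t missed = now - pt->scheduled;
    uint64_t n;

    if ( missed <= 0 )
        return;

    n = (uint64_t)(missed / pt->period) + 1;
    if ( missed > MISSED_TICKS_THRESHOLD )
        n = 1;                                 /* behaviour at the threshold is a guess */

    /*
     * (2): only accumulate a backlog when guest time is frozen while
     * de-scheduled; with a constant tsc offset the guest's own lost-tick
     * code already advances jiffies, so a backlog double-counts.
     */
    if ( pt->time_frozen )
        pt->pending_intr_nr += n;

    /* the next timeout has to move forward either way */
    pt->scheduled += (s_time_t)n * pt->period;
}

static void pt_timer_fn(struct periodic_time *pt)
{
    /* (3): with a constant tsc offset, never queue more than one tick */
    if ( pt->time_frozen )
        pt->pending_intr_nr++;
    else if ( pt->pending_intr_nr == 0 )
        pt->pending_intr_nr = 1;
}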
So, I think these changes are necessary for a 64-bit Linux policy. If you
agree, should they go in as fixes to the constant tsc offset policy in
staging now, or as a new policy?
thanks,
Dave
Keir Fraser wrote:
I thought the point of the mode in Haitao's patch was to still deliver the
'right' number of pending interrupts, but not stall the guest TSC while
delivering them? That's what I checked in as c/s 16237 (in staging tree). If
we want other modes too they can be added to the enumeration that c/s
defines.
-- Keir
On 29/10/07 15:00, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
Eddie, Haitao:
The patch looks good with the following comments.
1. Since you are in missed_ticks(), why not increase the threshold
   to 10 sec?
2. In missed_ticks() you should only increment pending_intr_nr by
   missed_ticks calculated when pt_support_time_frozen(domain).
3. You might as well fix this one too since it's what we discussed and is so
   related to constant tsc offset:
   In pt_timer_fn, if !pt_support_time_frozen(domain) then
   pending_intr_nr should end up with a maximum value of one.
regards,
Dave
Dong, Eddie wrote:
Dave Winchell wrote:
Eddie,
I implemented #2B and ran a three hour test
with sles9-64 and rh4u4-64 guests. Each guest had 8 vcpus
and the box was Intel with 2 physical processors.
The guests were running large loads.
Clock was pit. This is my usual test setup, except that I just
as often used AMD nodes with more processors.
The time error was .02%, good enough for ntpd.
The implementation keeps a constant guest tsc offset.
There is no pending_nr cancellation.
When the vpt.c timer expires, it only increments pending_nr
if its value is zero.
Missed_ticks() is still calculated, but only to update the new timeout
value. There is no adjustment to the tsc offset (set_guest_time()) at
clock interrupt delivery time or at re-scheduling time.
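In outline, the timer expiry path looks something like this (a minimal
sketch only, using simplified periodic_time fields, period, scheduled and
pending_intr_nr, in place of the real vpt.c structure):

static void pt_timer_fn_2b(struct periodic_time *pt, s_time_t now)
{
    /*
     * No backlog: with a constant guest tsc offset the guest's own
     * lost-tick code catches jiffies up from the running tsc.
     */
    if ( pt->pending_intr_nr == 0 )
        pt->pending_intr_nr = 1;

    /*
     * missed_ticks is still computed, but only to push the next timeout
     * forward; the tsc offset (set_guest_time()) is left alone both here
     * and on reschedule.
     */
    if ( now >= pt->scheduled )
        pt->scheduled += (((now - pt->scheduled) / pt->period) + 1) * pt->period;
}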
So, I like this method better than the pending_nr subtract.
I'm going to work on this some more and, if all goes well,
propose a new code submission soon.
I'll put some kind of policy switch in too, which we can discuss
and modify, but it will be along the lines of what we discussed below.
Thanks for your input!
-Dave
Haitao Shai may have posted his patch; can you check if anything was
missed?
thx,eddie
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel