Hi Keir,
You can help me analyze the Linux code.
Here is the code from sles (linux-2.6.5-7.244) x86_64.
static irqreturn_t timer_interrupt(int irq, void *dev_id, struct pt_regs *regs)
    spin_lock(&i8253_lock);
    outb_p(0x00, 0x43);
    delay = inb_p(0x40);
    delay |= inb(0x40) << 8;
    spin_unlock(&i8253_lock);
    delay = LATCH - 1 - delay;
    rdtscll_sync(&tsc);
    offset = (((tsc - vxtime.last_tsc) *
               vxtime.tsc_quot) >> 32) - (USEC_PER_SEC / HZ);
    if (offset < 0)
        offset = 0;
    if (offset > (USEC_PER_SEC / HZ)) {
        lost = offset / (USEC_PER_SEC / HZ);
        offset %= (USEC_PER_SEC / HZ);
    }
    monotonic_base += (tsc - vxtime.last_tsc) * 1000000 / cpu_khz;
    vxtime.last_tsc = tsc - vxtime.quot * delay / vxtime.tsc_quot;
    if ((((tsc - vxtime.last_tsc) *
          vxtime.tsc_quot) >> 32) < offset)
        vxtime.last_tsc = tsc -
            (((long) offset << 32) / vxtime.tsc_quot) - 1;
    if (lost > 0) {
        if (report_lost_ticks) {
            printk(KERN_WARNING "time.c: Lost %ld timer "
                   "tick(s)! ", lost);
            print_symbol("rip %s)\n", regs->rip);
        }
        jiffies += lost;
    }
    do_timer(regs);
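To make the arithmetic in the excerpt easier to follow, here is a minimal standalone sketch of just the offset/lost split. It is not the kernel code: HZ = 1000, the helper names (make_tsc_quot, split_elapsed, struct tick_split) and the reading of tsc_quot as microseconds per TSC cycle in 32.32 fixed point are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000L
#define HZ 1000L                 /* assumed tick rate */

/* Hypothetical result type: whole lost ticks plus the sub-tick remainder. */
struct tick_split {
    long lost;
    long offset;
};

/* Assumed meaning of tsc_quot: usec per TSC cycle, 32.32 fixed point. */
static uint64_t make_tsc_quot(uint64_t cpu_khz)
{
    assert(cpu_khz > 0);
    return (1000ULL << 32) / cpu_khz;
}

/* Mirrors the offset/lost lines of the excerpt, nothing else. */
static struct tick_split split_elapsed(uint64_t tsc, uint64_t last_tsc,
                                       uint64_t tsc_quot)
{
    const long tick_usec = USEC_PER_SEC / HZ;
    struct tick_split r = { 0, 0 };

    /* usec elapsed since last_tsc, minus the one tick we expected */
    r.offset = (long)(((tsc - last_tsc) * tsc_quot) >> 32) - tick_usec;
    if (r.offset < 0)
        r.offset = 0;
    if (r.offset > tick_usec) {
        r.lost = r.offset / tick_usec;   /* whole ticks missed */
        r.offset %= tick_usec;           /* remainder, in usec */
    }
    return r;
}
```

With an assumed 2.048 GHz TSC (cpu_khz = 2048000, chosen so the fixed-point factor is exact), an interrupt whose TSC reading is 3500 usec past last_tsc yields lost = 2 and offset = 500.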
Once 'offset' has been modified, it holds the remainder of the lost-tick
calculation. Our data indicates that timekeeping is more accurate when
offset is close to zero. So this leads us to this line of code:
    if ((((tsc - vxtime.last_tsc) *
          vxtime.tsc_quot) >> 32) < offset)
        vxtime.last_tsc = tsc -
            (((long) offset << 32) / vxtime.tsc_quot) - 1;
Thus the problem seems to be the way the code switches from adjusting
last_tsc by the delay, which is the normal mode when interrupts are timely
or SYNCed, to handling offsets larger than the delay.
I'm concerned about the way delay is ignored in the final calculation:
    vxtime.last_tsc = tsc -
        (((long) offset << 32) / vxtime.tsc_quot) - 1;
If an interrupt is right on time, and the delay is the same as last time,
then the offset should be zero in my mind. In the code above, offset will
equal delay for this example.
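The on-time case described above can be checked with a small simulation. This is only a sketch under stated assumptions: HZ = 1000, a TSC of 2048 cycles per usec, and a delay_cycles parameter standing in for the result of vxtime.quot * delay / vxtime.tsc_quot. The names struct vx, usec_since_last and tick are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_USEC 1000L   /* USEC_PER_SEC / HZ, assuming HZ = 1000 */

/* Hypothetical stand-in for the two vxtime fields used here. */
struct vx {
    uint64_t last_tsc;
    uint64_t tsc_quot;    /* assumed: usec per TSC cycle, 32.32 fixed point */
};

static long usec_since_last(const struct vx *v, uint64_t tsc)
{
    return (long)(((tsc - v->last_tsc) * v->tsc_quot) >> 32);
}

/*
 * One pass of the offset / last_tsc logic from the excerpt.
 * delay_cycles is the PIT delay already converted to TSC cycles.
 * Returns the sub-tick offset in usec.
 */
static long tick(struct vx *v, uint64_t tsc, uint64_t delay_cycles)
{
    long offset = usec_since_last(v, tsc) - TICK_USEC;

    if (offset < 0)
        offset = 0;
    if (offset > TICK_USEC)
        offset %= TICK_USEC;

    /* normal mode: back-date last_tsc by the measured delay */
    v->last_tsc = tsc - delay_cycles;

    /* fallback: offset exceeds the delay, so back-date by offset instead */
    if (usec_since_last(v, tsc) < offset)
        v->last_tsc = tsc -
            (uint64_t)(((int64_t)offset << 32) / (int64_t)v->tsc_quot) - 1;

    return offset;
}
```

Starting from last_tsc = 0, with the handler reading the TSC 20 usec after each timer expiry, two consecutive on-time ticks both report offset = 20, which is the delay rather than zero. If instead the TSC shows 1300 usec elapsed while the delay is still 20 usec, the fallback branch fires and last_tsc is back-dated by the 300 usec offset, with delay playing no part.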
What do you think?
thanks,
Dave
Keir Fraser wrote:
Oh dear, that was a silly typo on my part. Thanks for tracking it down!
Well, I'm surprised that SYNC vs ASYNC differs, if the guest is really
tracking time via the TSC value. But it sounds like the actual time-tracking
algorithm in the guest must be a bit more complicated than that? I'd be very
happy to help with any further analysis.
-- Keir
On 7/11/07 14:39, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
Keir,
Attached is a fix for pt_process_missed_ticks().
Without this fix, as you can imagine, systems run very erratically
and some guests panic on boot complaining that timer interrupts
are not working.
Also, I have some longer term measurements of the
accuracy of the sync and async methods.
The hardware is an eight-cpu AMD box running two eight-vcpu
guests, rh4u4-64 and sles9sp3-64, with all vcpus running loads.
Method   Test duration   Clock errors
SYNC     56000 sec       6.4, 6.7 sec (.012%)
ASYNC    52000 sec       13, 19 sec (.036%)
More testing should be done to validate the significance
of this difference.
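For reference, the percentages in the table can be reproduced as error divided by test duration. The sketch below assumes the quoted percentage refers to the larger of the two guests' errors in each row; the function name is hypothetical.

```c
#include <assert.h>

/* Clock error expressed as a percentage of the test duration. */
static double clock_error_percent(double error_sec, double duration_sec)
{
    assert(duration_sec > 0.0);
    return 100.0 * error_sec / duration_sec;
}
```

Under that assumption, clock_error_percent(6.7, 56000.0) is about 0.012 and clock_error_percent(19.0, 52000.0) is about 0.0365, in line with the SYNC and ASYNC rows.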
regards,
Dave
Dave Winchell wrote:
Keir Fraser wrote:
On 3/11/07 21:17, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
Thanks for applying the fixes in the last submit.
In moving the test for no_missed_tick_accounting into
pt_process_missed_ticks() you forgot to add the scheduling part.
Actually it was deliberate, but clearly it was one code simplification too
far: thanks for spotting it! I'll go the async route, but we do need to set
pending_intr_nr to 1. We can't leave that out -- the point of the async
route is to send a tick to the vcpu immediately, since it hasn't had one for
more than a tick period. If we wait for the timeout to do that then we have
to wait a whole extra period after the vcpu is re-scheduled.
Attached is my proposed patch. I think it's quite neat. :-)
It looks good to me.
thanks,
Dave
-- Keir
diff -r dfe9c0c10a2c xen/arch/x86/hvm/vpt.c
--- a/xen/arch/x86/hvm/vpt.c Mon Nov 05 13:23:55 2007 +0000
+++ b/xen/arch/x86/hvm/vpt.c Wed Nov 07 08:55:45 2007 -0500
@@ -59,7 +59,7 @@ static void pt_process_missed_ticks(stru
     if ( mode_is(pt->vcpu->domain, no_missed_tick_accounting) )
     {
         pt->pending_intr_nr = 1;
-        pt->scheduled = now + pt->scheduled;
+        pt->scheduled = now + pt->period;
     }
     else
     {
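The effect of the one-line change can be seen in isolation with a small sketch. The struct below is a hypothetical subset of the timer state (only the three fields touched in the hunk), s_time_t is a stand-in typedef for a signed nanosecond time, and the two function names are invented for the comparison. It is not the Xen source.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s_time_t;        /* stand-in: signed time in nanoseconds */

/* Hypothetical subset of the periodic timer state used by the hunk. */
struct periodic_time_sketch {
    s_time_t scheduled;          /* absolute time of the next tick */
    s_time_t period;             /* tick period */
    unsigned int pending_intr_nr;
};

/* Before the fix: adds two absolute times together. */
static void resched_before_fix(struct periodic_time_sketch *pt, s_time_t now)
{
    pt->pending_intr_nr = 1;
    pt->scheduled = now + pt->scheduled;
}

/* After the fix: next tick is one period from now. */
static void resched_after_fix(struct periodic_time_sketch *pt, s_time_t now)
{
    assert(pt->period > 0);
    pt->pending_intr_nr = 1;
    pt->scheduled = now + pt->period;
}
```

With a 1 ms period, a tick originally scheduled at 5.000 s, and now = 5.003 s, the old line pushes the next tick out to 10.003 s, roughly five seconds away, while the fixed line schedules it for 5.004 s. That gap is consistent with guests complaining that timer interrupts are not working.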
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel