WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel


To: Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables pending missed ticks
From: Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>
Date: Thu, 08 Nov 2007 09:57:57 -0500
Cc: "Shan, Haitao" <haitao.shan@xxxxxxxxx>, Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>
Delivery-date: Mon, 19 Nov 2007 10:21:00 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <47332084.8090305@xxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <C3587454.101C7%Keir.Fraser@xxxxxxxxxxxx> <47332084.8090305@xxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla Thunderbird 1.0.7-1.1.fc4 (X11/20050929)
Keir,

I ran a 24 hour (23hr:40min) test. The usual setup. Protocol was ASYNC.

Errors:
 sles9sp3-64:  -4.96 sec  -.0058%
 rh4u4-64:     +4.42 sec  +.0052%
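
For readers checking the percentages above, they follow from dividing each drift by the run length of 23 hours 40 minutes (85200 seconds). The following is only an illustrative sketch; the function names are made up for this note and are not part of the test setup.

```c
#include <assert.h>

/* Length of the test run in seconds: 23 hours 40 minutes. */
double run_length_sec(void)
{
    return 23.0 * 3600.0 + 40.0 * 60.0;
}

/* Relative clock error, in percent, for a drift of drift_sec seconds
 * accumulated over a run of run_sec seconds. */
double drift_percent(double drift_sec, double run_sec)
{
    assert(run_sec > 0.0);
    return 100.0 * drift_sec / run_sec;
}
```

Calling drift_percent(-4.96, run_length_sec()) gives roughly -0.0058, and drift_percent(4.42, run_length_sec()) gives roughly +0.0052, matching the figures reported for the two guests.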

So, let's leave it ASYNC unless someone produces some test cases
where the error approaches .05%.
I'll do some testing here with overnight runs or, perhaps,
different loads.

thanks,
Dave

Dave Winchell wrote:

Hi Keir,

I've added comments below.
See my next mail on some interesting performance numbers.

thanks,
Dave

Keir Fraser wrote:

On 7/11/07 19:38, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:

My feeling is that we should go full SYNC. Yes, in theory the
guests should be able to handle ASYNC, but in reality it appears that
some do not. Since it is easy for us to give them SYNC,
let's just do it and not stress them out.


One problem with pure SYNC is there's a fair chance you won't deliver any
ticks at all for a long time, if the guest only runs in short bursts (e.g.,
I/O bound) and happens not to be running on any tick boundary. I'm not sure
how much that matters. It could cause time to go backwards if the time
extrapolation via the TSC is not perfectly accurate, or cause problems if
there are any assumptions that TSC delta since last tick fits in 32 bits
(less likely in x64 code I suppose). Anyway, my point is that only testing
VCPUs under full load may cause us to optimise in ways that have nasty
unexpected effects for other workloads.

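
To give a feel for the 32-bit concern, here is a rough calculation of how long a gap between ticks can be before the TSC delta stops fitting in 32 bits. This is a sketch only: the function names are invented for this note, and the 2.4 GHz figure used below is a hypothetical TSC rate, not a number from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a TSC delta no longer fits in an unsigned 32-bit value,
 * for a TSC that ticks at tsc_hz cycles per second. */
double seconds_until_u32_overflow(double tsc_hz)
{
    assert(tsc_hz > 0.0);
    return 4294967296.0 / tsc_hz;   /* 2^32 cycles */
}

/* Returns 1 if a gap of gap_sec seconds between ticks yields a TSC
 * delta that cannot be held in 32 bits, otherwise 0. */
int delta_overflows_u32(double gap_sec, double tsc_hz)
{
    assert(gap_sec >= 0.0 && tsc_hz > 0.0);
    uint64_t delta = (uint64_t)(gap_sec * tsc_hz);
    return delta > UINT32_MAX;
}
```

With an assumed 2.4 GHz TSC, seconds_until_u32_overflow(2.4e9) is a little under 1.8 seconds, so a guest that goes a couple of seconds without a tick would already exceed a 32-bit delta.
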
I agree that this could be a problem. I have an idea that could give us full
SYNC and eliminate the long periods without clock interrupts.
In pt_process_missed_ticks() when missed_ticks > 0 set pt->run_timer = 1.
In pt_save_timer():

   list_for_each_entry ( pt, head, list )
       if ( !pt->run_timer )
           stop_timer(&pt->timer);

And in pt_timer_fn():

   pt->run_timer = 0;

So, for a guest that misses a tick, we will interrupt him once from the
descheduled state and then leave him alone in the descheduled state.
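
The intended behaviour can be walked through with a small stand-alone model. This is a toy sketch, not the Xen code: struct toy_pt and the toy_* functions are invented stand-ins for the periodic timer and for the three functions named above, and the timer_armed and ticks_fired fields exist only for illustration.

```c
#include <assert.h>

/* Toy stand-in for a periodic timer; NOT Xen's real structure. */
struct toy_pt {
    int run_timer;    /* keep the timer armed across one deschedule */
    int timer_armed;  /* 1 while the underlying timer is running */
    int ticks_fired;  /* how many times the timer function has run */
};

/* Models the proposed change in pt_process_missed_ticks(). */
void toy_process_missed_ticks(struct toy_pt *pt, int missed_ticks)
{
    if (missed_ticks > 0)
        pt->run_timer = 1;
}

/* Models pt_save_timer(), called when the VCPU is descheduled. */
void toy_save_timer(struct toy_pt *pt)
{
    if (!pt->run_timer)
        pt->timer_armed = 0;  /* stands in for stop_timer() */
}

/* Models pt_timer_fn(): the timer fires once and clears the flag. */
void toy_timer_fn(struct toy_pt *pt)
{
    assert(pt->timer_armed);
    pt->run_timer = 0;
    pt->ticks_fired++;
}
```

In this model, a timer that has missed a tick stays armed through the next toy_save_timer() call, fires once, and is then stopped by the following toy_save_timer() call because run_timer has been cleared.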

For the default mode as it is checked into unstable now,
64-bit guests should run quite fast, as the missed ticks are calculated and
then a bunch of additional interrupts are delivered. On the other hand,
32-bit guests run very well in default mode.

For the original code, before we put in the constant tsc offset business,
64-bit guests run poorly and 32-bit guests very well time-wise.


The default mode hasn't changed. Are you under the impression that
missed-ticks-but-no-delay-of-tsc is the default mode now? I know x64 guests
run badly with that because they treat every one of the missed ticks they
receive as a full tick.

Sorry, I was confused.
However, the default mode will still run poorly for 64 bit guests because
of the pending_nr's accumulated while the guest has interrupts disabled.
As I recall, the effect is quite large, on the order of 10% error.
I'll get you a number later today.

-- Keir

Or is the lack of
synchronization of TSCs across VCPUs causing issues that you're trying to
avoid?

This does cause issues, but it's not the only contributor to poor timing.
Having TSCs synchronized across vcpus will help with some of the
time-going-backwards problems we have seen, I think.

Regards,
Dave

Keir Fraser wrote:

On 7/11/07 17:29, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx> wrote:



So, you can see we send an interrupt immediately (and ASYNC) if any ticks have been missed, but then successive ticks are delivered 'on the beat'. A possible middle ground? Or perhaps we should just go with SYNC after all...
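
One way to picture 'on the beat' delivery is to compute the next tick time as a whole number of periods from the last scheduled tick, regardless of when the catch-up interrupt was sent. The following is a sketch under that assumption only; the function names and the nanosecond units are chosen for this note and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Whole periods that have elapsed between the last scheduled tick
 * and the current time. */
uint64_t missed_ticks(uint64_t last_tick_ns, uint64_t now_ns,
                      uint64_t period_ns)
{
    assert(period_ns > 0 && now_ns >= last_tick_ns);
    return (now_ns - last_tick_ns) / period_ns;
}

/* Next tick time that stays aligned to the original tick grid. */
uint64_t next_on_beat_ns(uint64_t last_tick_ns, uint64_t now_ns,
                         uint64_t period_ns)
{
    uint64_t missed = missed_ticks(last_tick_ns, now_ns, period_ns);
    return last_tick_ns + (missed + 1) * period_ns;
}
```

For example, with a period of 10 and the last tick at 0, a guest that resumes at 35 has missed three ticks; it would get one interrupt right away and the next tick at 40, back on the original grid.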

How do these Linux x64 guests fare with the original and default timer mode,
by the way? I would expect that time should be accounted pretty accurately in
that mode, albeit with more interrupts than you'd like. Or is the lack of
synchronisation of TSCs across VCPUs causing issues that you're trying to
avoid?

-- Keir









