This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] Re: [PATCH] rendezvous-based local time calibration WOW!

To: "dan.magenheimer@xxxxxxxxxx" <dan.magenheimer@xxxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Re: [PATCH] rendezvous-based local time calibration WOW!
From: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date: Mon, 04 Aug 2008 20:47:09 +0100
Cc: Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>, Dave Winchell <dwinchell@xxxxxxxxxxxxxxx>
Delivery-date: Mon, 04 Aug 2008 12:48:13 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20080804134006640.00000008444@djm-pc>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: Acj1iQfsUvemUZOnSa6LjZrOeI3kWAABMseMAC2VPzAABDI6swAAyxFAAAQsP4AAAITTQg==
Thread-topic: [PATCH] rendezvous-based local time calibration WOW!
User-agent: Microsoft-Entourage/
Thanks, Dan! Of course, there are new features since 3.2 that I did not
include in my version-number-change announcement email. I'll make a suitably
updated list for the actual 4.0 release announcement.

 -- Keir

On 4/8/08 20:40, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:

> After two hours of constant samples with c/s 18229, max
> skew is at 251ns!  That's 70-150x better than I was
> measuring just a couple of weeks ago.  YMMV of course.
> If you are looking for another marketing-speak bullet for
> the 4.0 release announcement, you can call this:
> * Greatly improved precision for time-sensitive SMP VMs
> or as I am subject to American hyperbole:
> * Dramatically improved precision for time-sensitive SMP VMs
> Thanks again!
> Dan
>> -----Original Message-----
>> From: Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
>> Sent: Monday, August 04, 2008 11:37 AM
>> To: 'Keir Fraser'; 'Xen-Devel (E-mail)'
>> Cc: 'Ian Pratt'; 'Dave Winchell'
>> Subject: RE: [PATCH] rendezvous-based local time calibration WOW!
>> Looks good to me (and much cleaner).  I've booted it and
>> will leave it running for a few hours.
>> Thanks!
>> Dan
>>> -----Original Message-----
>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>>> Sent: Monday, August 04, 2008 11:10 AM
>>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
>>> Cc: Ian Pratt; Dave Winchell
>>> Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
>>> Applied as c/s 18229. I rewrote it quite a bit, although the principle
>>> remains the same.
>>>  -- Keir
>>> On 4/8/08 16:24, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
>>>> OK, how about this version.  The rendezvous only collects
>>>> the key per-cpu time data then sets up a per-cpu 1ms timer
>>>> to later update the timestamp record and vcpu system time,
>>>> so neither should have racing issues.
>>>> I've only run it for about an hour but still haven't seen
>>>> any skew over 600nsec so apparently it is the collection of
>>>> the key time data that must be closely synchronized (probably
>>>> to ensure the slope is correct) while exact synchronization
>>>> of setting the timestamp records is less important.
>>>> Note that I'm not positive I got the clocksource=tsc part
>>>> correct... but am interested in your opinion on whether
>>>> clocksource=tsc can now be eliminated anyway (as the
>>>> main reason I pushed for it was because of unacceptable
>>>> skew which with this patch appears to be fixed).
>>>> Signed-off-by: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
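The two-phase scheme described in the message above can be sketched roughly as follows. This is a single-threaded illustration under stated assumptions, not the code from the patch or from c/s 18229: `struct cpu_calib`, `rendezvous_collect()`, `deferred_update()` and the driver helpers are hypothetical names, the per-CPU TSC values are passed in as plain arrays, and the slope is kept as TSC ticks per millisecond purely for readability.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-CPU calibration state; field names are illustrative. */
struct cpu_calib {
    /* Phase 1: raw samples captured while all CPUs are rendezvoused. */
    uint64_t sampled_tsc;    /* this CPU's TSC at the rendezvous */
    uint64_t sampled_stime;  /* platform time (ns), read once by the master */
    int pending;             /* set when a sample awaits publication */
    /* Phase 2: record published later, outside the rendezvous. */
    uint64_t tsc_stamp;
    uint64_t stime_stamp;
    uint64_t ticks_per_ms;   /* slope derived from the last two samples */
};

/* Phase 1: only record the time-critical data; no record is published here. */
static void rendezvous_collect(struct cpu_calib *c, uint64_t tsc_now,
                               uint64_t master_stime)
{
    c->sampled_tsc = tsc_now;
    c->sampled_stime = master_stime;
    c->pending = 1;
}

/* Phase 2: stands in for the deferred per-CPU timer callback. */
static void deferred_update(struct cpu_calib *c)
{
    if (!c->pending)
        return;
    uint64_t dtsc = c->sampled_tsc - c->tsc_stamp;
    uint64_t dns = c->sampled_stime - c->stime_stamp;
    if (dns != 0)
        c->ticks_per_ms = dtsc * 1000000u / dns; /* ticks per 1e6 ns */
    c->tsc_stamp = c->sampled_tsc;
    c->stime_stamp = c->sampled_stime;
    c->pending = 0;
}

/* Driver: every CPU receives the same platform-time reading. */
static void rendezvous_all(struct cpu_calib *cpus, int n,
                           const uint64_t *tsc, uint64_t master_stime)
{
    for (int i = 0; i < n; i++)
        rendezvous_collect(&cpus[i], tsc[i], master_stime);
}

static void run_deferred(struct cpu_calib *cpus, int n)
{
    for (int i = 0; i < n; i++)
        deferred_update(&cpus[i]);
}
```

The point of the split is that phase 1 does nothing but store a few values, so it can run at the tightly synchronized moment, while the arithmetic and the publication of the record happen later in a context where they cannot interrupt a reader.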
>>>>> -----Original Message-----
>>>>> From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>>>>> Sent: Sunday, August 03, 2008 11:25 AM
>>>>> To: dan.magenheimer@xxxxxxxxxx; Xen-Devel (E-mail)
>>>>> Cc: Ian Pratt; Dave Winchell
>>>>> Subject: Re: [PATCH] rendezvous-based local time calibration WOW!
>>>>> It's not safe to poke a new timestamp record from an interrupt handler
>>>>> (which is what the smp_call_function() callback functions are). Users of the
>>>>> timestamp records (e.g., get_s_time) need local_irq_save/restore() or an
>>>>> equivalent of the Linux seqlock. The latter is likely faster. I'm dubious
>>>>> about update_vcpu_system_time() from an interrupt handler too. It needs
>>>>> thought about how it might race with a context switch (change of 'current')
>>>>> or if it interrupts an existing invocation of update_vcpu_system_time().
>>>>>  -- Keir
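For readers unfamiliar with the seqlock alternative mentioned above, here is a minimal sketch of the pattern using C11 atomics. It is an illustration only, not Xen or Linux code: `struct ts_record`, `ts_write()` and `ts_read_stime()` are hypothetical names, a single writer per record is assumed, and one TSC tick is treated as one nanosecond to keep the arithmetic trivial.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical timestamp record; the layout is illustrative, not Xen's. */
struct ts_record {
    atomic_uint seq;                   /* odd while an update is in progress */
    _Atomic uint64_t local_tsc_stamp;  /* TSC at the last calibration */
    _Atomic uint64_t stime_stamp;      /* system time (ns) at that moment */
};

/* Writer: bump seq to odd, update the fields, bump seq back to even. */
static void ts_write(struct ts_record *r, uint64_t tsc, uint64_t stime)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&r->local_tsc_stamp, tsc, memory_order_relaxed);
    atomic_store_explicit(&r->stime_stamp, stime, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 2, memory_order_release);
}

/* Reader: retry if an update was in progress or completed mid-read. */
static uint64_t ts_read_stime(struct ts_record *r, uint64_t now_tsc)
{
    unsigned s0, s1;
    uint64_t tsc, stime;
    do {
        s0 = atomic_load_explicit(&r->seq, memory_order_acquire);
        tsc = atomic_load_explicit(&r->local_tsc_stamp, memory_order_relaxed);
        stime = atomic_load_explicit(&r->stime_stamp, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        s1 = atomic_load_explicit(&r->seq, memory_order_relaxed);
    } while ((s0 & 1) || s0 != s1);
    /* Assumption for the sketch: 1 TSC tick == 1 ns. */
    return stime + (now_tsc - tsc);
}
```

If the writer runs in an interrupt that lands in the middle of a read, the reader sees the sequence number change and simply retries, so readers do not need to disable interrupts. The reverse arrangement (a reader running in an interrupt that preempts the writer on the same CPU) would spin forever, so this pattern only fits when the writer is the interrupting side.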
>>>>> On 3/8/08 17:50, "Dan Magenheimer" <dan.magenheimer@xxxxxxxxxx> wrote:
>>>>>> The synchronization of local_time_calibration (l_t_c) via
>>>>>> round-to-nearest-epoch provided some improvement, but I was
>>>>>> still seeing skew up to 16usec and higher.  I measured the
>>>>>> temporal distance between the rounded epoch and when l_t_c
>>>>>> was actually running to ensure there wasn't some kind of
>>>>>> bug and found that l_t_c was running up to 150us after the
>>>>>> round-epoch and sometimes up to 50us before.  I guess this
>>>>>> is the granularity of setting a Xen timer.  While it seemed
>>>>>> that +/- 100us shouldn't cause that much skew, I finally
>>>>>> decided to try synchronization-via-rendezvous, as suggested
>>>>>> by Ian here:
>>>>>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01074.html
>>>>>> http://lists.xensource.com/archives/html/xen-devel/2008-07/msg01080.html
>>>>>> The result is phenomenal... using this approach (in attached
>>>>>> patch), I have yet to see a skew exceed 1usec!!!  So this is
>>>>>> about a 10-fold increase in accuracy vs the rounded-epoch
>>>>>> method and about 20-fold over the one-epoch-from-NOW() method.
>>>>>> The platform time is now read once for all processors rather
>>>>>> than once per processor.  (Actually, it is read once again
>>>>>> in platform_time_calibration()... by "inlining" that routine
>>>>>> into master_local_time_calibration() that extra read can
>>>>>> be -- and probably should be -- avoided too.)
>>>>>> It may be too late to get this into 3.3.0 but, if so, please
>>>>>> consider it asap for 3.3.1 rather than just xen-unstable/3.4.
>>>>>> Dan
>>>>>> ===================================
>>>>>> Thanks... for the memory
>>>>>> I really could use more / My throughput's on the floor
>>>>>> The balloon is flat / My swap disk's fat / I've OOM's in store
>>>>>> Overcommitted so much
>>>>>> (with apologies to the late great Bob Hope)
