OK, Deepak repeated the test without ntpd and using ntpdate -b before
the test.
The attached graph shows his results: el5u1-64 (best=~0.07%),
el4u5-64 (middle=~0.2%), and el4u5-32 (worst=~0.3%).
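
(For reference, the skew percentages are offset drift divided by
elapsed wall time. A hypothetical example, an offset growing by
1.5 sec over a 2000 sec run:

  # skew% = (offset_end - offset_start) / elapsed_sec * 100
  awk 'BEGIN { printf "%.3f%%\n", (1.5 - 0.0) / 2000 * 100 }'   # 0.075%

works out to roughly the el5u1-64 number above.)
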
We will continue to look at LTP to try to isolate the cause.
Thanks,
Dan
P.S. elXuY is essentially RHEL XuY with some patches.
> -----Original Message-----
> From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
> Sent: Wednesday, January 30, 2008 2:45 PM
> To: Deepak Patel
> Cc: dan.magenheimer@xxxxxxxxxx; Keir Fraser;
> xen-devel@xxxxxxxxxxxxxxxxxxx; akira.ijuin@xxxxxxxxxx; Dave Winchell
> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
> disables pending
> missed ticks
>
>
> Dan, Deepak,
>
> It may be that the underlying clock error is too great for ntp
> to handle. It would be useful if you did not run ntpd
> and, instead, did ntpdate -b <timeserver> at the start of the test
> for each guest. Then capture the data as you have been doing.
> If the drift is greater than .05%, then we need to address that.
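>
> Something like this in each guest would do (a sketch; the
> timeserver name and the 300 second interval are placeholders):
>
>   #!/bin/sh
>   # Step the clock once at test start, then sample the offset
>   # periodically without ever setting the clock again.
>   ntpdate -b timeserver                    # one-time step; ntpd not running
>   while true; do
>       echo "$(date +%s) $(ntpdate -q timeserver | tail -1)"
>       sleep 300
>   done >> skew.log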
>
> Another option is, when running ntpd, to enable loop statistics in
> /etc/ntp.conf
> by adding this to the file:
>
> statistics loopstats
> statsdir /var/lib/ntp/
>
> Then you will see loop data in that directory.
> Correlating the data in the loopstats files with the
> peaks in skew would be interesting. You will see entries of the form
>
> 54495 76787.701 -0.045153303 -132.569229 0.020806776 239.735511 10
>
> Where the second to last column is the Allan Deviation. When that
> gets over 1000, ntpd is working pretty hard. However, I have not
> seen ntpd completely lose it like you have.
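>
> A quick way to scan for that (a sketch; field 6 is the second to
> last column in the entries above):
>
>   # print loopstats entries whose Allan deviation exceeds 1000
>   awk '$6 > 1000' /var/lib/ntp/loopstats*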
>
> I'm on vacation until Monday, and won't be reading
> email.
>
> Thanks for all your work on this!
>
> -Dave
>
> Deepak Patel wrote:
>
> >
> >>
> >> Is the graph for RHEL5u1-64? (I've never tested this one.)
> >
> >
> > I do not know which graph was attached with this, but I saw this
> > behavior in EL4u5-32, EL4u5-64, and EL5u1-64 hvm guests when I
> > was running ltp tests continuously.
> >
> >> What was the behaviour of the other guests running?
> >
> >
> > All pvm guests are fine, but the behavior of most of the hvm
> > guests was as described.
> >
> >> If they had spikes, were they at the same wall time?
> >
> >
> > No. They are not at the same wall time.
> >
> >> Were the other guests running ltp as well?
> >>
> > Yes, all 6 guests (4 hvm and 2 pvm) are running ltp
> > continuously.
> >
> >> How are you measuring skew?
> >
> >
> > I was collecting the output of "ntpdate -q <timeserver>" every
> > 300 seconds (5 minutes) and created the graph based on that.
> >
> >>
> >> Are you running ntpd?
> >>
> > Yes. ntp was running on all the guests.
> >
> > I am investigating what causes these spikes and will let
> > everyone know my findings.
> >
> > Thanks,
> > Deepak
> >
> >> Anything that you can discover that would be in sync with
> >> the spikes would be very helpful!
> >>
> >> The code that I test with is our product code, which is based
> >> on 3.1. So it is possible that something in 3.2 other than vpt.c
> >> is the cause. I can test with 3.2, if necessary.
> >>
> >> thanks,
> >> Dave
> >>
> >>
> >>
> >> Dan Magenheimer wrote:
> >>
> >>> Hi Dave (Keir, see suggestion below) --
> >>>
> >>> Thanks!
> >>>
> >>> Turning off vhpet certainly helps a lot (though see below).
> >>>
> >>> I wonder if timekeeping with vhpet is so bad that it should be
> >>> turned off by default (in 3.1, 3.2, and unstable) until it is
> >>> fixed? (I have a patch that defaults it off, can post it if
> >>> there is agreement on the above point.) The whole point of an
> >>> HPET is to provide more precise timekeeping and if vhpet is
> >>> worse than vpit, it can only confuse users. Comments?
> >>>
> >>>
> >>> In your testing, are you just measuring % skew over a long
> >>> period of time?
> >>> We are graphing the skew continuously and
> >>> seeing periodic behavior that is unsettling, even with pit.
> >>> See attached. Though your algorithm recovers, the "cliffs"
> >>> could still cause real user problems. I wonder if there is
> >>> anything that can be done to make the "recovery" more
> >>> responsive?
> >>>
> >>> We are looking into what part(s) of LTP are causing the cliffs.
> >>>
> >>> Thanks,
> >>> Dan
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
> >>>> Sent: Monday, January 28, 2008 8:21 AM
> >>>> To: dan.magenheimer@xxxxxxxxxx
> >>>> Cc: Keir Fraser; xen-devel@xxxxxxxxxxxxxxxxxxx;
> >>>> deepak.patel@xxxxxxxxxx;
> >>>> akira.ijuin@xxxxxxxxxx; Dave Winchell
> >>>> Subject: Re: [Xen-devel] [PATCH] Add a timer mode that disables
> >>>> pending
> >>>> missed ticks
> >>>>
> >>>>
> >>>> Dan,
> >>>>
> >>>> I guess I'm a bit out of date calling for clock= usage.
> >>>> Looking at linux 2.6.20.4 sources, I think you should specify
> >>>> "clocksource=pit nohpet" on the linux guest bootline.
> >>>>
> >>>> You can leave the xen and dom0 bootlines as they are.
> >>>> The xen and guest clocksources do not need to be the same.
> >>>> In my tests, xen is using the hpet for its timekeeping and
> >>>> that appears to be the default.
> >>>>
> >>>> When you boot the guests you should see
> >>>> time.c: Using PIT/TSC based timekeeping.
> >>>> on the rh4u5-64 guest, and something similar on the others.
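> >>>>
> >>>> For example, on a grub-based guest the kernel line might look
> >>>> like this (a sketch; kernel version and device names are
> >>>> placeholders):
> >>>>
> >>>>   # /boot/grub/menu.lst in the guest
> >>>>   kernel /vmlinuz-2.6.20.4 ro root=/dev/sda1 clocksource=pit nohpet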
> >>>>
> >>>> > (xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer
> >>>> > 14.318MHz HPET.)
> >>>>
> >>>> This appears to be the xen state, which is fine.
> >>>> I was wrongly assuming that this was the guest state.
> >>>> You might want to look in your guest logs and see what they were
> >>>> picking
> >>>> for a clock source.
> >>>>
> >>>> Regards,
> >>>> Dave
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> Dan Magenheimer wrote:
> >>>>
> >>>>
> >>>>
> >>>>> Thanks, I hadn't realized that! No wonder we didn't see
> >>>>> the same improvement you saw!
> >>>>>
> >>>>>
> >>>>>
> >>>>>> Try specifying clock=pit on the linux boot line...
> >>>>>>
> >>>>>
> >>>>>
> >>>>> I'm confused... do you mean "clocksource=pit" on the Xen
> >>>>> command line or "nohpet" / "clock=pit" / "clocksource=pit"
> >>>>> on the guest (or dom0?) command line? Or both places? Since
> >>>>> the tests take awhile, it would be nice to get this right
> >>>>> the first time. Do the Xen and guest clocksources need to
> >>>>> be the same?
> >>>>>
> >>>>> Thanks,
> >>>>> Dan
> >>>>>
> >>>>> -----Original Message-----
> >>>>> *From:* Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
> >>>>> *Sent:* Sunday, January 27, 2008 2:22 PM
> >>>>> *To:* dan.magenheimer@xxxxxxxxxx; Keir Fraser
> >>>>> *Cc:* xen-devel@xxxxxxxxxxxxxxxxxxx; deepak.patel@xxxxxxxxxx;
> >>>>> akira.ijuin@xxxxxxxxxx; Dave Winchell
> >>>>> *Subject:* RE: [Xen-devel] [PATCH] Add a timer mode that
> >>>>> disables pending missed ticks
> >>>>>
> >>>>> Hi Dan,
> >>>>>
> >>>>> Hpet timer does have a fairly large error, as I was trying
> >>>>> this one recently. I don't remember what I got for error,
> >>>>> but 1% sounds about right.
> >>>>>
> >>>>> The problem is that hpet is not built on top of vpt.c, the
> >>>>> module Keir and I did all the recent work in, for its
> >>>>> periodic timer needs. Try specifying clock=pit on the linux
> >>>>> boot line. If it still picks the hpet, which it might, let
> >>>>> me know and I'll tell you how to get around this.
> >>>>>
> >>>>> Regards,
> >>>>> Dave
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> ------------------------------------------------------------------------
> >>>>
> >>>>
> >>>>> *From:* Dan Magenheimer [mailto:dan.magenheimer@xxxxxxxxxx]
> >>>>> *Sent:* Fri 1/25/2008 6:50 PM
> >>>>> *To:* Dave Winchell; Keir Fraser
> >>>>> *Cc:* xen-devel@xxxxxxxxxxxxxxxxxxx; deepak.patel@xxxxxxxxxx;
> >>>>> akira.ijuin@xxxxxxxxxx
> >>>>> *Subject:* RE: [Xen-devel] [PATCH] Add a timer mode that
> >>>>> disables pending missed ticks
> >>>>>
> >>>>> Sorry for the very late followup on this but we finally
> >>>>> were able to get our testing set up again on stable 3.1
> >>>>> bits and have seen some very bad results on 3.1.3-rc1, on
> >>>>> the order of 1%.
> >>>>>
> >>>>> Test environment was a 4-socket dual core machine with 24GB
> >>>>> of memory running six two-vcpu 2GB domains, four hvm plus
> >>>>> two pv. All six guests were running LTP simultaneously. The
> >>>>> four hvm guests were: RHEL5u1-64, RHEL4u5-32, RHEL5-64, and
> >>>>> RHEL4u5-64. Timer_mode was set to 2 for 64-bit guests and 0
> >>>>> for 32-bit guests. All four hvm guests experienced skew
> >>>>> around -1%, even the 32-bit guest. Less intensive testing
> >>>>> didn't exhibit much skew at all.
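> >>>>>
> >>>>> For reference, timer_mode is set per guest in the hvm config
> >>>>> file. A minimal sketch, with hypothetical names and paths:
> >>>>>
> >>>>>   # /etc/xen/rhel5u1-64.hvm
> >>>>>   kernel = "/usr/lib/xen/boot/hvmloader"
> >>>>>   builder = "hvm"
> >>>>>   memory = 2048
> >>>>>   vcpus = 2
> >>>>>   timer_mode = 2   # 2 for the 64-bit guests, 0 for the 32-bit ones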
> >>>>>
> >>>>> A representative graph is attached.
> >>>>>
> >>>>> Dave, I wonder if some portion of your patches didn't end
> >>>>> up in the xen trees?
> >>>>>
> >>>>> (xm dmesg shows 8x Xeon 3.2GHz stepping 04, Platform timer
> >>>>> 14.318MHz HPET.)
> >>>>>
> >>>>> Thanks,
> >>>>> Dan
> >>>>>
> >>>>> P.S. Many thanks to Deepak and Akira for running tests.
> >>>>>
> >>>>> > -----Original Message-----
> >>>>> > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >>>>> > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of
> >>>>> > Dave Winchell
> >>>>> > Sent: Wednesday, January 09, 2008 9:53 AM
> >>>>> > To: Keir Fraser
> >>>>> > Cc: dan.magenheimer@xxxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx;
> >>>>> > Dave Winchell
> >>>>> > Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
> >>>>> > disables pending
> >>>>> > missed ticks
> >>>>> >
> >>>>> >
> >>>>> > Hi Keir,
> >>>>> >
> >>>>> > The latest change, c/s 16690, looks fine.
> >>>>> > I agree that the code in c/s 16690 is equivalent to
> >>>>> > the code I submitted. Also, your version is more
> >>>>> > concise.
> >>>>> >
> >>>>> > The error tests confirm the equivalence. With overnight
> >>>>> > cpu loads, the checked in version was accurate to +.048%
> >>>>> > for sles and +.038% for red hat. My version was +.046%
> >>>>> > and +.032% in a 2 hour test.
> >>>>> > I don't think the difference is significant.
> >>>>> >
> >>>>> > i/o loads produced errors of +.01%.
> >>>>> >
> >>>>> > Thanks for all your efforts on this issue.
> >>>>> >
> >>>>> > Regards,
> >>>>> > Dave
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > Keir Fraser wrote:
> >>>>> >
> >>>>> > >Applied as c/s 16690, although the checked-in patch is
> >>>>> > >smaller. I think the only important fix is to pt_intr_post()
> >>>>> > >and the only bit of the patch I totally omitted was the
> >>>>> > >change to pt_process_missed_ticks(). I don't think that
> >>>>> > >change can be important, but let's see what happens to the
> >>>>> > >error percentage...
> >>>>> > >
> >>>>> > > -- Keir
> >>>>> > >
> >>>>> > >On 4/1/08 23:24, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
> >>>>
> >>>>
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > >>Hi Dan and Keir,
> >>>>> > >>
> >>>>> > >>Attached is a patch that fixes some issues with the SYNC
> >>>>> > >>policy (no_missed_ticks_pending). I have not tried to make
> >>>>> > >>the change the minimal one, but, rather, just ported into
> >>>>> > >>the new code what I know to work well. The error for
> >>>>> > >>no_missed_ticks_pending goes from over 3% to .03% with
> >>>>> > >>this change according to my testing.
> >>>>> > >>
> >>>>> > >>Regards,
> >>>>> > >>Dave
> >>>>> > >>
> >>>>> > >>Dan Magenheimer wrote:
> >>>>> > >>
> >>>>> > >>
> >>>>> > >>
> >>>>> > >>>Hi Dave --
> >>>>> > >>>
> >>>>> > >>>Did you get your correction ported? If so, it would be
> >>>>> > >>>nice to see this get into 3.1.3.
> >>>>> > >>>
> >>>>> > >>>Note that I just did some very limited testing with
> >>>>> > >>>timer_mode=2 (=SYNC=no missed ticks pending) on tip of
> >>>>> > >>>xen-3.1-testing (64-bit Linux hvm guest) and the worst
> >>>>> > >>>error I've seen so far is 0.012%. But I haven't tried
> >>>>> > >>>any exotic loads, just LTP.
> >>>>> > >>>
> >>>>> > >>>Thanks,
> >>>>> > >>>Dan
> >>>>> > >>>
> >>>>> > >>>
> >>>>> > >>>
> >>>>> > >>>
> >>>>> > >>>
> >>>>> > >>>>-----Original Message-----
> >>>>> > >>>>From: Dave Winchell [mailto:dwinchell@xxxxxxxxxxxxxxx]
> >>>>> > >>>>Sent: Wednesday, December 19, 2007 12:33 PM
> >>>>> > >>>>To: dan.magenheimer@xxxxxxxxxx
> >>>>> > >>>>Cc: Keir Fraser; Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx;
> >>>>> > >>>>Dong, Eddie; Jiang, Yunhong; Dave Winchell
> >>>>> > >>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
> >>>>> > >>>>disables pending
> >>>>> > >>>>missed ticks
> >>>>> > >>>>
> >>>>> > >>>>
> >>>>> > >>>>Dan,
> >>>>> > >>>>
> >>>>> > >>>>I did some testing with the constant tsc offset SYNC
> >>>>> > >>>>method (now called no_missed_ticks_pending) and found
> >>>>> > >>>>the error to be very high, much larger than 1%, as I
> >>>>> > >>>>recall. I have not had a chance to submit a correction.
> >>>>> > >>>>I will try to do it later this week or the first week
> >>>>> > >>>>in January. My version of the constant tsc offset SYNC
> >>>>> > >>>>method produces .02% error, so I just need to port that
> >>>>> > >>>>into the current code.
> >>>>> > >>>>
> >>>>> > >>>>The error you got for both of those kernels is what I
> >>>>> > >>>>would expect for the default mode, delay_for_missed_ticks.
> >>>>> > >>>>
> >>>>> > >>>>I'll let Keir answer on how to set the time mode.
> >>>>> > >>>>
> >>>>> > >>>>Regards,
> >>>>> > >>>>Dave
> >>>>> > >>>>
> >>>>> > >>>>Dan Magenheimer wrote:
> >>>>> > >>>>
> >>>>> > >>>>
> >>>>> > >>>>
> >>>>> > >>>>
> >>>>> > >>>>
> >>>>> > >>>>>Anyone make measurements on the final patch?
> >>>>> > >>>>>
> >>>>> > >>>>>I just ran a 64-bit RHEL5.1 pvm kernel and saw a loss
> >>>>> > >>>>>of about 0.2% with no load. This was xen-unstable tip
> >>>>> > >>>>>today with no options specified. 32-bit was about 0.01%.
> >>>>> > >>>>>
> >>>>> > >>>>>I think I missed something... how do I run the various
> >>>>> > >>>>>accounting choices and which ones are known to be
> >>>>> > >>>>>appropriate for which kernels?
> >>>>> > >>>>>
> >>>>> > >>>>>Thanks,
> >>>>> > >>>>>Dan
> >>>>> > >>>>>
> >>>>> > >>>>>
> >>>>> > >>>>>
> >>>>> > >>>>>
> >>>>> > >>>>>
> >>>>> > >>>>>
> >>>>> > >>>>>
> >>>>> > >>>>>>-----Original Message-----
> >>>>> > >>>>>>From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> >>>>> > >>>>>>[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of
> >>>>> > >>>>>>Keir Fraser
> >>>>> > >>>>>>Sent: Thursday, December 06, 2007 4:57 AM
> >>>>> > >>>>>>To: Dave Winchell
> >>>>> > >>>>>>Cc: Shan, Haitao; xen-devel@xxxxxxxxxxxxxxxxxxx;
> >>>>> > >>>>>>Dong, Eddie; Jiang, Yunhong
> >>>>> > >>>>>>Subject: Re: [Xen-devel] [PATCH] Add a timer mode that
> >>>>> > >>>>>>disables pending missed ticks
> >>>>> > >>>>>>
> >>>>> > >>>>>>
> >>>>> > >>>>>>Please take a look at xen-unstable changeset 16545.
> >>>>> > >>>>>>
> >>>>> > >>>>>>-- Keir
> >>>>> > >>>>>>
> >>>>> > >>>>>>On 26/11/07 20:57, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
> >>>>> > >>>>>>
> >>>>> > >>>>>>>Keir,
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>The accuracy data I've collected for i/o loads for the
> >>>>> > >>>>>>>various time protocols follows. In addition, the data
> >>>>> > >>>>>>>for cpu loads is shown.
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>The loads labeled cpu and i/o-8 are on an 8 processor
> >>>>> > >>>>>>>AMD box. Two guests, red hat and sles 64 bit, 8 vcpu
> >>>>> > >>>>>>>each. The cpu load is usex -e36 on each guest. (usex is
> >>>>> > >>>>>>>available at http://people.redhat.com/anderson/usex.)
> >>>>> > >>>>>>>The i/o load is 8 instances of dd if=/dev/hda6 of=/dev/null.
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>The loads labeled i/o-32 are 32 instances of dd. Also,
> >>>>> > >>>>>>>these are run on a 4 cpu AMD box. In addition, there is
> >>>>> > >>>>>>>an idle rh-32bit guest. All three guests are 8vcpu.
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>The loads labeled i/o-4/32 are the same as i/o-32 except
> >>>>> > >>>>>>>that the redhat-64 guest has 4 instances of dd.
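> >>>>> > >>>>>>>
> >>>>> > >>>>>>>For reference, a sketch of how such an i/o load can be
> >>>>> > >>>>>>>launched (instance count as an argument):
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>  #!/bin/sh
> >>>>> > >>>>>>>  # start N background dd readers (N=8 for the i/o-8 load)
> >>>>> > >>>>>>>  N=${1:-8}; i=0
> >>>>> > >>>>>>>  while [ $i -lt $N ]; do
> >>>>> > >>>>>>>      dd if=/dev/hda6 of=/dev/null &
> >>>>> > >>>>>>>      i=$((i+1))
> >>>>> > >>>>>>>  done
> >>>>> > >>>>>>>  wait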
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>Date Duration Protocol sles, rhat error load
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/07 23 hrs 40 min ASYNC -4.96 sec, +4.42 sec -.006%, +.005% cpu
> >>>>> > >>>>>>>11/09 3 hrs 19 min ASYNC -.13 sec, +1.44 sec, -.001%, +.012% cpu
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/08 2 hrs 21 min SYNC -.80 sec, -.34 sec, -.009%, -.004% cpu
> >>>>> > >>>>>>>11/08 1 hr 25 min SYNC -.24 sec, -.26 sec, -.005%, -.005% cpu
> >>>>> > >>>>>>>11/12 65 hrs 40 min SYNC -18 sec, -8 sec, -.008%, -.003% cpu
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/08 28 min MIXED -.75 sec, -.67 sec -.045%, -.040% cpu
> >>>>> > >>>>>>>11/08 15 hrs 39 min MIXED -19. sec,-17.4 sec, -.034%, -.031% cpu
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/14 17 hrs 17 min ASYNC -6.1 sec,-55.7 sec, -.01%, -.09% i/o-8
> >>>>> > >>>>>>>11/15 2 hrs 44 min ASYNC -1.47 sec,-14.0 sec, -.015% -.14% i/o-8
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/13 15 hrs 38 min SYNC -9.7 sec,-12.3 sec, -.017%, -.022% i/o-8
> >>>>> > >>>>>>>11/14 48 min SYNC - .46 sec, - .48 sec, -.017%, -.018% i/o-8
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/14 4 hrs 2 min MIXED -2.9 sec, -4.15 sec, -.020%, -.029% i/o-8
> >>>>> > >>>>>>>11/20 16 hrs 2 min MIXED -13.4 sec,-18.1 sec, -.023%, -.031% i/o-8
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/21 28 min MIXED -2.01 sec, -.67 sec, -.12%, -.04% i/o-32
> >>>>> > >>>>>>>11/21 2 hrs 25 min SYNC -.96 sec, -.43 sec, -.011%, -.005% i/o-32
> >>>>> > >>>>>>>11/21 40 min ASYNC -2.43 sec, -2.77 sec -.10%, -.11% i/o-32
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>11/26 113 hrs 46 min MIXED -297. sec, 13. sec -.07%, .003% i/o-4/32
> >>>>> > >>>>>>>11/26 4 hrs 50 min SYNC -3.21 sec, 1.44 sec, -.017%, .01% i/o-4/32
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>Overhead measurements:
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>Progress in terms of number of passes through a
> >>>>> > >>>>>>>fixed system workload on an 8 vcpu red hat with an
> >>>>> > >>>>>>>8 vcpu sles idle. The workload was usex -b48.
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>ASYNC 167 min 145 passes .868 passes/min
> >>>>> > >>>>>>>SYNC 167 min 144 passes .862 passes/min
> >>>>> > >>>>>>>SYNC 1065 min 919 passes .863 passes/min
> >>>>> > >>>>>>>MIXED 221 min 196 passes .887 passes/min
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>Conclusions:
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>The only protocol which meets the .05% accuracy
> >>>>> > >>>>>>>requirement for ntp tracking under the loads above
> >>>>> > >>>>>>>is the SYNC protocol. The worst case accuracies for
> >>>>> > >>>>>>>SYNC, MIXED, and ASYNC are .022%, .12%, and .14%,
> >>>>> > >>>>>>>respectively.
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>We could reduce the cost of the SYNC method by only
> >>>>> > >>>>>>>scheduling the extra wakeups if a certain number of
> >>>>> > >>>>>>>ticks are missed.
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>Regards,
> >>>>> > >>>>>>>Dave
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>Keir Fraser wrote:
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>
> >>>>> > >>>>>>>>On 9/11/07 19:22, "Dave Winchell" <dwinchell@xxxxxxxxxxxxxxx> wrote:
> >>>>> > >>>>>>>>
> >>>>> > >>>>>>>>>Since I had a high error (~.03%) for the ASYNC
> >>>>> > >>>>>>>>>method a couple of days ago, I ran another ASYNC
> >>>>> > >>>>>>>>>test. I think there may have been something wrong
> >>>>> > >>>>>>>>>with the code I used a couple of days ago for
> >>>>> > >>>>>>>>>ASYNC. It may have been missing the immediate
> >>>>> > >>>>>>>>>delivery of interrupt after context switch in.
> >>>>> > >>>>>>>>>
> >>>>> > >>>>>>>>>My results indicate that either SYNC or ASYNC give
> >>>>> > >>>>>>>>>acceptable accuracy, each running consistently
> >>>>> > >>>>>>>>>around or under .01%. MIXED has a fairly high error
> >>>>> > >>>>>>>>>of greater than .03%. Probably too close to .05%
> >>>>> > >>>>>>>>>ntp threshold for comfort.
> >>>>> > >>>>>>>>>
> >>>>> > >>>>>>>>>I don't have an overnight run with SYNC. I plan to
> >>>>> > >>>>>>>>>leave SYNC running over the weekend. If you'd rather
> >>>>> > >>>>>>>>>I can leave MIXED running instead.
> >>>>> > >>>>>>>>>
> >>>>> > >>>>>>>>>It may be too early to pick the protocol and I can
> >>>>> > >>>>>>>>>run more overnight tests next week.
> >>>>> > >>>>>>>>
> >>>>> > >>>>>>>>I'm a bit worried about any unwanted side effects
> >>>>> > >>>>>>>>of the SYNC+run_timer approach -- e.g., whether
> >>>>> > >>>>>>>>timer wakeups will cause higher system-wide CPU
> >>>>> > >>>>>>>>contention. I find it easier to think through the
> >>>>> > >>>>>>>>implications of ASYNC. I'm surprised that MIXED
> >>>>> > >>>>>>>>loses time, and is less accurate than ASYNC.
> >>>>> > >>>>>>>>Perhaps it delivers more timer interrupts than the
> >>>>> > >>>>>>>>other approaches, and each interrupt event causes a
> >>>>> > >>>>>>>>small accumulated error?
> >>>>> > >>>>>>>>
> >>>>> > >>>>>>>>Overall I would consider MIXED and ASYNC as
> >>>>> > >>>>>>>>favourites and if the latter is actually more
> >>>>> > >>>>>>>>accurate then I can simply revert the changeset
> >>>>> > >>>>>>>>that implemented MIXED.
> >>>>> > >>>>>>>>
> >>>>> > >>>>>>>>Perhaps rather than running more of the same
> >>>>> > >>>>>>>>workloads you could try idle VCPUs and I/O bound
> >>>>> > >>>>>>>>VCPUs (e.g., repeated large disc reads to /dev/null)?
> >>>>> > >>>>>>>>We don't have any data on workloads that aren't CPU
> >>>>> > >>>>>>>>bound, so that's really an obvious place to put any
> >>>>> > >>>>>>>>further effort imo.
> >>>>> > >>>>>>>>
> >>>>> > >>>>>>>>-- Keir
> >>>>> > >>>>>>
> >>>>> > >>>>>>_______________________________________________
> >>>>> > >>>>>>Xen-devel mailing list
> >>>>> > >>>>>>Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>>> > >>>>>>http://lists.xensource.com/xen-devel
> >>>>> > >>
> >>>>> > >>diff -r cfdbdca5b831 xen/arch/x86/hvm/vpt.c
> >>>>> > >>--- a/xen/arch/x86/hvm/vpt.c Thu Dec 06 15:36:07 2007 +0000
> >>>>> > >>+++ b/xen/arch/x86/hvm/vpt.c Fri Jan 04 17:58:16 2008 -0500
> >>>>> > >>@@ -58,7 +58,7 @@ static void pt_process_missed_ticks(stru
> >>>>> > >>
> >>>>> > >>     missed_ticks = missed_ticks / (s_time_t) pt->period + 1;
> >>>>> > >>     if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) )
> >>>>> > >>-        pt->do_not_freeze = !pt->pending_intr_nr;
> >>>>> > >>+        pt->do_not_freeze = 1;
> >>>>> > >>     else
> >>>>> > >>         pt->pending_intr_nr += missed_ticks;
> >>>>> > >>     pt->scheduled += missed_ticks * pt->period;
> >>>>> > >>@@ -127,7 +127,12 @@ static void pt_timer_fn(void *data)
> >>>>> > >>
> >>>>> > >>     pt_lock(pt);
> >>>>> > >>
> >>>>> > >>-    pt->pending_intr_nr++;
> >>>>> > >>+    if ( mode_is(pt->vcpu->domain, no_missed_ticks_pending) ) {
> >>>>> > >>+        pt->pending_intr_nr = 1;
> >>>>> > >>+        pt->do_not_freeze = 0;
> >>>>> > >>+    }
> >>>>> > >>+    else
> >>>>> > >>+        pt->pending_intr_nr++;
> >>>>> > >>
> >>>>> > >>     if ( !pt->one_shot )
> >>>>> > >>     {
> >>>>> > >>@@ -221,8 +226,6 @@ void pt_intr_post(struct vcpu *v, struct
> >>>>> > >>         return;
> >>>>> > >>     }
> >>>>> > >>
> >>>>> > >>-    pt->do_not_freeze = 0;
> >>>>> > >>-
> >>>>> > >>     if ( pt->one_shot )
> >>>>> > >>     {
> >>>>> > >>         pt->enabled = 0;
> >>>>> > >>@@ -235,6 +238,10 @@ void pt_intr_post(struct vcpu *v, struct
> >>>>> > >>         pt->last_plt_gtime = hvm_get_guest_time(v);
> >>>>> > >>         pt->pending_intr_nr = 0; /* 'collapse' all missed ticks */
> >>>>> > >>     }
> >>>>> > >>+    else if ( mode_is(v->domain, no_missed_ticks_pending) ) {
> >>>>> > >>+        pt->pending_intr_nr--;
> >>>>> > >>+        pt->last_plt_gtime = hvm_get_guest_time(v);
> >>>>> > >>+    }
> >>>>> > >>     else
> >>>>> > >>     {
> >>>>> > >>         pt->last_plt_gtime += pt->period_cycles;
> >>>>> > >>
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> > >
> >>>>> >
> >>>>> >
> >>>>> > _______________________________________________
> >>>>> > Xen-devel mailing list
> >>>>> > Xen-devel@xxxxxxxxxxxxxxxxxxx
> >>>>> > http://lists.xensource.com/xen-devel
> >>>>> >
>
hvm-compare.png
Description: Binary data
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel