xen-devel
RE: Reply: Re: [Xen-devel] when timer go back in dom0 save and restore ormi
Sorry for the typo: I meant domU, not dom0. :-) The point here is that
time_resume will sync to the new system time and wall clock at restore, and
thus the PV guest should be able to continue... Xen system time is not
wallclock time; it just counts up from power-up. As Keir points out, only
its progress is used to drive internal jiffies.

Then what do you mean by "system time stop" here? The TOD at user level, or
do you observe within the kernel that Xen system time never changes?

Thanks,
Kevin
Hi, yes, there was an earlier patch to fix the wc_sec/wc_nsec problem in
xc_domain_restore.c, but it still missed something. When constructing dom0
or restoring a PV domain, the guest OS reads the local wc_sec from Xen as
its base time. wc_sec is initialized from CMOS data. There are some cases
in which wc_sec changes. One is setting dom0's system time back: that
changes dom0's time and makes wc_sec smaller, for both the guest OS and Xen.
Actually, we can do a simple test: start a PV domain, then change dom0's
time, and you will find that the system time of the guest OS has stopped.
That is because you changed the wc_sec of both Xen and the guest OS.

This patch only considers the case of save/restore. I am still not sure
about the policy for the case where dom0's system time goes back: what
should the VMs do? So, I have added this case to this patch.

By the way, Kevin, it is the guest OS that hangs, not dom0 ;-) and the
duration of the hang is just equivalent to the time interval you go back by
in dom0, or on the new machine you migrate to.

Thanks
-- James

>>> Keir Fraser 08-11-26 22:58 >>>
So what happens if someone changes wallclock using 'date'? That's basically
kind of what will appear to happen when s/r occurs.

-- Keir
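
To make the wc_sec discussion above concrete, here is a minimal sketch of the
arithmetic, assuming the guest derives wall-clock time as "wallclock base plus
Xen system time". The struct and function names (wallclock_base,
guest_wallclock_ns, set_wallclock) are hypothetical and only model the idea;
they are not the Xen or Linux code. The base computation mirrors the
"secs * 1000000000ULL + nsecs - system_time_base" line visible in the patch
context for xen/arch/x86/time.c.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical, simplified stand-in for the wc_sec/wc_nsec pair. */
struct wallclock_base {
    uint32_t wc_sec;   /* assumed: wall-clock seconds at system time zero */
    uint32_t wc_nsec;  /* nanosecond remainder of the base */
};

/* Wall-clock time in ns as a guest would derive it: base + system time. */
static uint64_t guest_wallclock_ns(const struct wallclock_base *wc,
                                   uint64_t system_time_ns)
{
    return (uint64_t)wc->wc_sec * NSEC_PER_SEC + wc->wc_nsec + system_time_ns;
}

/* Recompute the base when the host clock is set to new_wall_ns while the
 * system time reads system_time_ns. Setting the clock back shrinks wc_sec. */
static struct wallclock_base set_wallclock(uint64_t new_wall_ns,
                                           uint64_t system_time_ns)
{
    uint64_t x = new_wall_ns - system_time_ns;
    struct wallclock_base wc = {
        .wc_sec  = (uint32_t)(x / NSEC_PER_SEC),
        .wc_nsec = (uint32_t)(x % NSEC_PER_SEC),
    };
    return wc;
}
```

With system time held fixed, calling set_wallclock with an earlier wall-clock
value yields a smaller wc_sec, so a guest that re-reads the base sees its
wall clock step backwards by the same interval.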
On 26/11/08 14:32, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:

hrtimer supports two timer bases: CLOCK_MONOTONIC and CLOCK_REALTIME.
wall_to_monotonic is only added in the former case; in the latter, TOD is
used directly, per my reading. I did a quick search, and it looks like futex
and ntp are using CLOCK_REALTIME. Also there's one vsyscall gate that can
pass CLOCK_REALTIME from the caller too.

Thanks,
Kevin
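
The difference between the two hrtimer bases mentioned above can be sketched
with a toy clock. This is only an illustration under assumptions: toy_clock
and its helpers are hypothetical names, not kernel structures, and the sketch
does not model whatever a real kernel does to re-evaluate CLOCK_REALTIME
timers after the clock is set. It only shows that a monotonic reading built
as xtime + wall_to_monotonic survives a wall-clock step if the offset is
compensated, while an absolute wall-clock deadline does not.

```c
#include <stdint.h>

/* Hypothetical toy clock; not the kernel's data structures. */
struct toy_clock {
    int64_t xtime_ns;             /* wall clock (CLOCK_REALTIME-like) */
    int64_t wall_to_monotonic_ns; /* offset added to xtime for monotonic */
};

static int64_t toy_realtime(const struct toy_clock *c)
{
    return c->xtime_ns;
}

static int64_t toy_monotonic(const struct toy_clock *c)
{
    return c->xtime_ns + c->wall_to_monotonic_ns;
}

/* Step the wall clock (as 'date' would) and compensate the offset so that
 * the monotonic reading does not warp. */
static void toy_settime(struct toy_clock *c, int64_t new_wall_ns)
{
    int64_t delta = new_wall_ns - c->xtime_ns;
    c->xtime_ns = new_wall_ns;
    c->wall_to_monotonic_ns -= delta;
}

/* Let elapsed_ns of real time pass. */
static void toy_tick(struct toy_clock *c, int64_t elapsed_ns)
{
    c->xtime_ns += elapsed_ns;
}
```

For example, arm one deadline 50 ns ahead on each base, set the wall clock
back, then tick 50 ns: the monotonic deadline has been reached, while the
wall-clock deadline is still short by the size of the backward step.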
From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
Sent: Wednesday, November 26, 2008 10:26 PM
To: Tian, Kevin; 'James Song'; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] when timer go back in dom0 save and restore or migrate, PV domain hung

hrtimers add wall_to_monotonic to xtime to get a timesource that doesn't
(or shouldn't!) warp.

-- Keir

On 26/11/08 14:20, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:

How about hrtimers? One mode is CLOCK_REALTIME, which uses getnstimeofday
for expiration. Once system time is changed, either on the local or the new
machine, that expiration can't be adjusted. But I'm not sure whether it
still makes sense to try hrtimers in a guest.

Thanks,
Kevin
From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
Sent: Wednesday, November 26, 2008 10:11 PM
To: Tian, Kevin; 'James Song'; xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] when timer go back in dom0 save and restore or migrate, PV domain hung

The problem hasn't been fully explained, but I can say that PV guests expect
system time to jump across s/r and deal with that. For example, Linux doesn't
use Xen system time internally, but uses its progress to periodically update
jiffies, which does not warp across s/r.

We have had problems corrupting wc_sec/wc_nsec in xc_domain_restore.c, but
that was fixed some time ago.

-- Keir
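
A rough sketch of "use the progress of system time to update jiffies", and of
the resync that time_resume is described as doing elsewhere in this thread,
might look like the following. Everything here is an assumption made for
illustration: toy_guest_time, toy_timer_interrupt, toy_time_resume and the
100 Hz tick length are hypothetical and are not taken from the guest kernel
source.

```c
#include <stdint.h>

#define NS_PER_TICK 10000000LL /* assumed 100 Hz tick, for illustration */

/* Hypothetical guest-side bookkeeping. */
struct toy_guest_time {
    uint64_t processed_system_time; /* system time already accounted, ns */
    uint64_t jiffies;
};

/* Timer interrupt: advance jiffies by the whole ticks that have elapsed
 * since the last processed system time. A negative delta (system time
 * went backwards) accounts nothing. */
static void toy_timer_interrupt(struct toy_guest_time *g,
                                uint64_t system_time_ns)
{
    int64_t delta = (int64_t)(system_time_ns - g->processed_system_time);

    while (delta >= NS_PER_TICK) {
        delta -= NS_PER_TICK;
        g->processed_system_time += (uint64_t)NS_PER_TICK;
        g->jiffies++;
    }
}

/* After restore: resync to the new host's system time; jiffies is left
 * untouched, so it does not warp. */
static void toy_time_resume(struct toy_guest_time *g,
                            uint64_t system_time_ns)
{
    g->processed_system_time = system_time_ns;
}
```

If the restored guest lands on a host whose system time is far smaller and
the resync is skipped, delta stays negative and jiffies stops advancing until
the new host's system time catches up with the old value, which would look
like a hang lasting roughly the size of the backward jump.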
On 26/11/08 14:00, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:

This is not an s/r- or lm-specific issue. For example, system time can be
changed even while the PV guest is running. Your patch only hacks the restore
point once, and wc_sec can still change later when system time is changed
on the fly again.

IIRC, a PV guest can catch up with a wall clock change in the timer
interrupt, and time_resume will sync the internally processed system time
with the new system time after restore. But I'm not sure whether that's
enough. Actually, the more interesting part is the uptime difference. For
example, a timer whose expiration was calculated from the previous system
time may wait nearly forever if the uptimes of the two boxes differ a lot.
But I think such an issue should have been considered already, e.g. with some
user tool assistance. I think Keir can comment better here.

BTW, do you happen to know what exactly dom0 hangs on? Some busy loop to
catch up time, or a long delay until some critical timer expiration?

Thanks,
Kevin
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of James Song
Sent: Tuesday, November 25, 2008 4:02 PM
To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] when timer go back in dom0 save and restore or migrate, PV domain hung

Hi,

I find that a PV domain hangs when we take these steps:

1, save a PV domain
2, change the system time of the PV domain back
3, restore the PV domain

or

1, migrate a PV domain from Machine A to Machine B
2, the system time of Machine B is slower than Machine A.

The problem is that wc_sec changes when the system time is changed in dom0,
or on restore on a machine with a slower system time, but when restoring, Xen
doesn't restore the wc_sec of shared_info from xenstore and uses the native
one. So the guest OS will hang. This patch will work for this issue.

Thanks
-- Song Wei

diff -r a5ed0dbc829f tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c  Tue Nov 18 14:34:14 2008 +0800
+++ b/tools/libxc/xc_domain_restore.c  Fri Nov 21 17:34:15 2008 +0800
@@ -328,6 +328,16 @@
     /* For info only */
     nr_pfns = 0;
+    //jsong@xxxxxxxxxx, james song
+    memset(&domctl, 0, sizeof(domctl));
+    domctl.domain = dom;
+    domctl.cmd = XEN_DOMCTL_restoredomain;
+    frc = do_domctl(xc_handle, &domctl);
+    if ( frc != 0 )
+    {
+        ERROR("Unable to set flag of restore.");
+        goto out;
+    }
     if ( read_exact(io_fd, &p2m_size, sizeof(unsigned long)) )
     {
@@ -1120,6 +1130,8 @@
     /* restore saved vcpu_info and arch specific info */
     MEMCPY_FIELD(new_shared_info, old_shared_info, vcpu_info);
+    MEMCPY_FIELD(new_shared_info, old_shared_info, wc_nsec);
+    MEMCPY_FIELD(new_shared_info, old_shared_info, wc_sec);
     MEMCPY_FIELD(new_shared_info, old_shared_info, arch);
     /* clear any pending events and the selector */
diff -r a5ed0dbc829f xen/arch/x86/time.c
--- a/xen/arch/x86/time.c  Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/arch/x86/time.c  Fri Nov 21 17:34:15 2008 +0800
@@ -689,7 +689,6 @@
     wmb();
     (*version)++;
 }
-
 void update_vcpu_system_time(struct vcpu *v)
 {
     struct cpu_time *t;
@@ -703,7 +702,6 @@
     if ( u->tsc_timestamp == t->local_tsc_stamp )
         return;
-
     version_update_begin(&u->version);
     u->tsc_timestamp = t->local_tsc_stamp;
@@ -713,14 +711,19 @@
     version_update_end(&u->version);
 }
-
 void update_domain_wallclock_time(struct domain *d)
 {
     spin_lock(&wc_lock);
+    if(d->after_restore )
+    {
+        d->after_restore = 0;
+        goto out; //jsong@xxxxxxxxxx
+    }
     version_update_begin(&shared_info(d, wc_version));
     shared_info(d, wc_sec) = wc_sec + d->time_offset_seconds;
     shared_info(d, wc_nsec) = wc_nsec;
     version_update_end(&shared_info(d, wc_version));
+out:
     spin_unlock(&wc_lock);
 }
@@ -751,7 +754,6 @@
     u64 x;
     u32 y, _wc_sec, _wc_nsec;
     struct domain *d;
-
     x = (secs * 1000000000ULL) + (u64)nsecs - system_time_base;
     y = do_div(x, 1000000000);
@@ -1050,7 +1052,6 @@
 struct tm wallclock_time(void)
 {
     uint64_t seconds;
-
     if ( !wc_sec )
         return (struct tm) { 0 };
diff -r a5ed0dbc829f xen/common/domctl.c
--- a/xen/common/domctl.c  Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/common/domctl.c  Fri Nov 21 17:34:15 2008 +0800
@@ -24,7 +24,6 @@
 #include <asm/current.h>
 #include <public/domctl.h>
 #include <xsm/xsm.h>
-
 extern long arch_do_domctl(
     struct xen_domctl *op, XEN_GUEST_HANDLE(xen_domctl_t) u_domctl);
@@ -315,6 +314,16 @@
         ret = 0;
     }
     break;
+    case XEN_DOMCTL_restoredomain:
+    {
+        struct domain *d;
+        if ( (d = rcu_lock_domain_by_id(op->domain)) == NULL )
+            break;
+
+        d->after_restore = 1;
+        rcu_unlock_domain(d);
+        break;
+    }
     case XEN_DOMCTL_createdomain:
     {
diff -r a5ed0dbc829f xen/include/public/domctl.h
--- a/xen/include/public/domctl.h  Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/include/public/domctl.h  Fri Nov 21 17:34:15 2008 +0800
@@ -61,6 +61,7 @@
 #define XEN_DOMCTL_destroydomain      2
 #define XEN_DOMCTL_pausedomain        3
 #define XEN_DOMCTL_unpausedomain      4
+#define XEN_DOMCTL_restoredomain      51
 #define XEN_DOMCTL_resumedomain      27
 #define XEN_DOMCTL_getdomaininfo      5
diff -r a5ed0dbc829f xen/include/xen/sched.h
--- a/xen/include/xen/sched.h  Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/include/xen/sched.h  Fri Nov 21 17:34:15 2008 +0800
@@ -231,6 +231,7 @@
      * cause a deadlock. Acquirers don't spin waiting; they preempt.
      */
     spinlock_t hypercall_deadlock_mutex;
+    int after_restore; //jsong@xxxxxxxxxx
 };
 struct domain_setup_info

---------------------------------------------------------------------------------------------
Thanks
-- Song Wei

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel