xen-users
RE: [Xen-users] Live migration problem
Spoke too soon: it failed on about the 20th migration. But it is more
stable than it was.
-- Ray
-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of Cole, Ray
Sent: Wednesday, August 31, 2005 4:14 PM
To: Steven Hand
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-users] Live migration problem
I think I have it fixed, but I'm not sure why :-)
I modified the shutdown_handler routine in reboot.c so that it does NOT call
ctrl_if_send_response(). This appears to make live migration rock solid on my
machines. It looks as though the xenU kernel risks locking up when it tries to
respond to the suspend command. I have very little knowledge of the Xen code,
but if migration works with the response removed, then either nothing on the
other end of the conversation is expecting a response, or a response is already
being sent from somewhere else. I realize that commenting this call out also
stops responses from being sent for SYSRQ commands and the like, so this is by
no means a proper 'fix'. Still, I think it may point to the root cause of my
problem, where live migration periodically fails with an error that it cannot
suspend. I've now performed a live migration about 14 times with this change in
place without it failing.
Is this enough information for someone to figure out what the real cure should
be? I'm starting to think that shutdown_handler should skip
ctrl_if_send_response when the message is a suspend request and no earlier
suspend request is pending, and call it in every other case. But I'd just be
guessing.
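
The rule guessed at above can be written down as a small user-space model. This is only a sketch of the idea, not the kernel code: the message subtypes, the counters, and the mock_* helpers are all hypothetical stand-ins for whatever reboot.c really uses, and nothing here shows that the rule is the correct fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical message subtypes; the real kernel has its own constants. */
enum sketch_subtype { SKETCH_SUSPEND, SKETCH_SYSRQ, SKETCH_POWEROFF };

struct sketch_msg { enum sketch_subtype subtype; };

static bool sketch_suspend_pending = false; /* set once a suspend is accepted */
static int  sketch_responses_sent  = 0;     /* how often the mock response ran */
static int  sketch_work_scheduled  = 0;     /* how often suspend work was queued */

/* Mock stand-in for ctrl_if_send_response(); it only counts calls. */
static void mock_send_response(const struct sketch_msg *msg)
{
    assert(msg != NULL);
    sketch_responses_sent++;
}

/* Mock stand-in for queueing the suspend work; it only counts calls. */
static void mock_schedule_suspend(void)
{
    sketch_work_scheduled++;
}

/* The guessed rule: stay silent for the first suspend request,
 * respond to everything else (SYSRQ, repeated suspend, and so on). */
static void shutdown_handler_sketch(const struct sketch_msg *msg)
{
    assert(msg != NULL);
    if (msg->subtype == SKETCH_SUSPEND && !sketch_suspend_pending) {
        sketch_suspend_pending = true;
        mock_schedule_suspend();
        return; /* no response for the first suspend request */
    }
    mock_send_response(msg);
}
```

Feeding the model one suspend message, then a second suspend, then a SYSRQ should leave exactly one piece of work scheduled and two responses sent, which is the behaviour the paragraph above describes.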
-- Ray
-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of Cole, Ray
Sent: Wednesday, August 31, 2005 3:25 PM
To: Steven Hand
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-users] Live migration problem
Looks like the suspend message is received in the shutdown handler.
schedule_work is called to queue the work but, sporadically, that work is
never executed. It is as if schedule_work either does not really queue it or
the queued work never gets a chance to run.
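
One way to tell those two cases apart, offered as a suggestion rather than a diagnosis: as far as I recall, schedule_work() in 2.6 kernels returns 0 when the work item is already pending and non-zero when it was newly queued, so logging the return value would show whether the request was dropped at queueing time. The user-space model below only illustrates that pending-flag behaviour; the struct and the *_model functions are hypothetical and are not the kernel's workqueue implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a work item: a pending flag plus a callback. */
struct work_item_model {
    bool pending;
    void (*fn)(void);
};

static int suspend_runs_model = 0;

/* Stand-in for the suspend routine; it only counts invocations. */
static void do_suspend_model(void)
{
    suspend_runs_model++;
}

/* Returns 1 if the item was newly queued, 0 if it was already pending. */
static int schedule_work_model(struct work_item_model *w)
{
    assert(w != NULL);
    if (w->pending)
        return 0; /* second request is silently merged with the first */
    w->pending = true;
    return 1;
}

/* Stand-in for the worker thread: runs the callback if one is pending. */
static void run_pending_model(struct work_item_model *w)
{
    assert(w != NULL);
    if (!w->pending)
        return;
    w->pending = false; /* cleared before the callback, so it may requeue */
    w->fn();
}
```

In this model a return value of 1 followed by no call to the callback would point at the worker never running, while a return value of 0 would mean an earlier request was still pending.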
-----Original Message-----
From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx]On Behalf Of Cole, Ray
Sent: Wednesday, August 31, 2005 12:41 PM
To: Steven Hand
Cc: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: RE: [Xen-users] Live migration problem
I decided to put some printk's into __do_suspend in reboot.c. During a
"good" live migration run I see the printk output show up on the console. In
a bad run I see that __do_suspend never gets called :-(
I'll continue to follow it up the chain to see if it never gets the message to
suspend at all or if something is going bad between getting the message and
suspending.
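
The tracing approach described above can be sketched in user space as follows. In the kernel the output call would be printk rather than fprintf, and the function names and the work_runs flag here are hypothetical; the point is only that one trace line per stage shows how far a given run got.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char trace_last_msg[128]; /* most recent trace line, for inspection */
static int  trace_hits = 0;      /* number of trace points reached */

/* Record and print one trace line tagged with function name and line. */
static void trace_point(const char *func, int line, const char *tag)
{
    assert(func != NULL && tag != NULL);
    snprintf(trace_last_msg, sizeof trace_last_msg, "%s:%d %s",
             func, line, tag);
    fprintf(stderr, "TRACE %s\n", trace_last_msg);
    trace_hits++;
}

#define TRACE(tag) trace_point(__func__, __LINE__, (tag))

/* Hypothetical model of the suspend path: the message always arrives,
 * but the queued work only runs when work_runs is non-zero. */
static void suspend_path_traced(int work_runs)
{
    TRACE("suspend message received");
    if (!work_runs)
        return; /* models the bad run: the suspend routine is never reached */
    TRACE("__do_suspend reached");
}
```

A bad run then shows the first trace line without the second, which narrows the failure to the step between receiving the message and running the suspend routine.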
I'm running xen-2.0-testing with the xen-2.0 2.6.11.12-xenU kernel BTW.
-- Ray
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users