xen-users
Re: [Xen-users] "xm save" only works once...
On Friday, 19 August 2005 04:14, Steven Hand wrote:
> >On Monday, 15 August 2005 23:29, Anthony Liguori wrote:
> >> Steven Hand wrote:
> >> >>I am using Xen-2.0.7 on a Dual Intel Xeon 2.8GHz system with 4GB of
> >> >> ram. I am using 2.6.11 as kernel for my domain 0. Domain 0 uses
> >> >> Debian Sarge with a backported Xen 2.0.7 package (only little changes
> >> >> to the debian 2.0.6 package, nothing important enough to get
> >> >> mentioned). All kernels were compiled against vanilla kernels with
> >> >> xen-patch. The domain U's are using 2.6.11 or 2.4.30 (debian, suse).
> >> >>
> >> >>I have no problems within domains and everything is running very
> >> >> smoothly, except one thing (which was also not working correctly in
> >> >> xen-2.0.6 for me): I can save a domain with "xm save <domainname>
> >> >> <suspendfile>" once and I can restore this domain again, but if I try
> >> >> a second "xm save ..." it simply seems to hang. Nothing happens and
> >> >> the last thing in the logs are these lines:
> >> >
> >> >Is this the same with both 2.4 and 2.6 domUs? I've noticed something
> >> > similar with 2.0.7 but only with 2.4 domUs ... it would be useful to
> >> > know if it affects 2.6 also - I'm trying to track it down.
> >
> >yes, it's the same with 2.4 and 2.6 domUs...
> >
> >> There's a very similar problem in 3.0 that has to do with a race
> >> condition with the xc_save/Xend interaction. xc_save thinks it has sent
> >> the "suspend" command over the pipe and Xend is waiting for it to
> >> arrive.
> >
> >... but after some more testing I noticed another interesting thing. "xm
> >save" hangs if the suspend file doesn't exist. The first time after a
> >dom0 reboot it's normally no problem, but if I delete the file and try
> >"xm save" again, it fails about 95% of the time.
> >
> >If I keep the save file and then do an "xm save" and an "xm restore",
> >there seems to be no problem. I ran 10 tests and all worked.
>
> Fix attached below - it's actually nothing to do with whether the file
> exists or not. Rather the problem is that on occasion xfrd sends a response
> and a request in the same 'message', and Xend only deals with the first.
>
> The below fixes this for me - please let me know if it works for you,
I can't test it right now, because the server is in production use now. I have
to schedule a maintenance window to reboot the system (a reboot is needed if
the problem is not fixed and an "xm save" crashes).
I tried to reproduce the bug on my notebook and on a normal desktop PC, but
there I don't have any problems with "xm save" at all.
The only difference between my notebook/desktop system and the production
system is that the production system is an SMP system (2x Xeon CPUs) with
hyperthreading enabled.
And there was definitely a difference depending on whether I deleted the file
every time before running "xm save" or not. I am not saying that the bug has
something to do with the file itself, but maybe it just triggers the error
(because creating a file takes longer than overwriting?!?). Maybe that's why
the problem occurs one time and not the next.
I will let you know once I can test the patch on the production system (or
another SMP/HT system), but that may take a few more days... sorry.
Thanks for your help,
--Ralph
>
> cheers,
>
> S.
>
>
>
> diff -r 973a2d3c7a63 tools/python/xen/xend/XendMigrate.py
> --- a/tools/python/xen/xend/XendMigrate.py Wed Aug 3 23:24:27 2005
> +++ b/tools/python/xen/xend/XendMigrate.py Thu Aug 18 19:14:42 2005
> @@ -54,7 +54,7 @@
>
>      def dataReceived(self, data):
>          self.parser.input(data)
> -        if self.parser.ready():
> +        while(self.parser.ready()):
>              val = self.parser.get_val()
>              self.xinfo.dispatch(self, val)
>          if self.parser.at_eof():
>
>
>
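To illustrate why changing "if" to "while" matters, here is a minimal,
self-contained sketch. It is not the actual Xend code: ToyParser is a
hypothetical stand-in that treats each newline-terminated line as one
message, and the message strings are made up. Only the method names
input(), ready() and get_val() are taken from the patch above.

```python
class ToyParser:
    """Hypothetical parser: one newline-terminated line = one message.

    This only mimics the input()/ready()/get_val() names seen in the
    patch; the real Xend parser and wire format are not reproduced here.
    """

    def __init__(self):
        self.buffer = ""
        self.values = []

    def input(self, data):
        # Accumulate raw data and split off every complete message.
        self.buffer += data
        while "\n" in self.buffer:
            line, self.buffer = self.buffer.split("\n", 1)
            self.values.append(line)

    def ready(self):
        return bool(self.values)

    def get_val(self):
        return self.values.pop(0)


def data_received_if(parser, data, dispatched):
    # Pre-patch shape: at most one message is handled per call.
    parser.input(data)
    if parser.ready():
        dispatched.append(parser.get_val())


def data_received_while(parser, data, dispatched):
    # Post-patch shape: drain every message that is ready.
    parser.input(data)
    while parser.ready():
        dispatched.append(parser.get_val())


# Two messages arriving in a single chunk (made-up message names).
chunk = "response-ok\nrequest-suspend\n"

seen_if = []
data_received_if(ToyParser(), chunk, seen_if)

seen_while = []
data_received_while(ToyParser(), chunk, seen_while)

print(seen_if)     # only the first message was dispatched
print(seen_while)  # both messages were dispatched
```

If the two messages happen to arrive in separate chunks, both variants
dispatch them, which would be consistent with a hang that only shows up
some of the time.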
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users