WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
xen-users

Re: [Xen-users] "xm save" only works once...

To: xen-users@xxxxxxxxxxxxxxxxxxx, Steven.Hand@xxxxxxxxxxxx
Subject: Re: [Xen-users] "xm save" only works once...
From: Ralph Passgang <ralph@xxxxxxxxxxxxx>
Date: Mon, 22 Aug 2005 17:19:42 +0200
Delivery-date: Mon, 22 Aug 2005 15:17:56 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <E1E5wPK-00015M-00@xxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <E1E5wPK-00015M-00@xxxxxxxxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: KMail/1.8.1
On Friday, 19 August 2005 04:14, Steven Hand wrote:
> >On Monday, 15 August 2005 23:29, Anthony Liguori wrote:
> >> Steven Hand wrote:
> >> >>I am using Xen-2.0.7 on a dual Intel Xeon 2.8GHz system with 4GB of
> >> >> RAM. I am using 2.6.11 as the kernel for my domain 0. Domain 0 runs
> >> >> Debian Sarge with a backported Xen 2.0.7 package (only small changes
> >> >> to the Debian 2.0.6 package, nothing important enough to be
> >> >> mentioned). All kernels were built from vanilla kernels with the
> >> >> Xen patch. The domUs are using 2.6.11 or 2.4.30 (Debian, SUSE).
> >> >>
> >> >>I have no problems within the domains and everything runs very
> >> >> smoothly, except for one thing (which also did not work correctly in
> >> >> xen-2.0.6 for me): I can save a domain with "xm save <domainname>
> >> >> <suspendfile>" once and I can restore this domain again, but if I try
> >> >> a second "xm save ..." it simply seems to hang. Nothing happens and
> >> >> the last entries in the logs are these lines:
> >> >
> >> >Is this the same with both 2.4 and 2.6 domUs? I've noticed something
> >> > similar with 2.0.7 but only with 2.4 domUs ... it would be useful to
> >> > know if it affects 2.6 also - I'm trying to track it down.
> >
> >yes, it's the same with 2.4 and 2.6 domUs...
> >
> >> There's a very similar problem in 3.0 that has to do with a race
> >> condition in the xc_save/Xend interaction.  xc_save thinks it has sent
> >> the "suspend" command over the pipe and Xend is waiting for it to
> >> arrive.
> >
> >... but after some more testing I noticed another interesting thing. "xm
> >save" hangs if the suspend file doesn't exist. The first time after a
> >dom0 reboot it's normally no problem, but if I delete the file and try
> > "xm save" again, it fails about 95% of the time.
> >
> >If I keep the save file and then do an "xm save" and an "xm restore",
> > there seems to be no problem. I ran 10 tests and all of them worked.
>
> Fix attached below - it's actually nothing to do with whether the file
> exists or not. Rather the problem is that on occasion xfrd sends a response
> and a request in the same 'message', and Xend only deals with the first.
>
> The below fixes this for me - please let me know if it works for you,

I can't test it right now, because the server is in production use. I have 
to schedule a maintenance window to reboot the system (a reboot is needed if 
the problem is not fixed and an "xm save" hangs).

I tried to reproduce the bug on my notebook and on a normal desktop PC, but 
I don't have any problems with "xm save" there at all.

The only difference between my notebook/desktop systems and the production 
system is that the production system is an SMP system (2x Xeon CPUs) with 
hyperthreading enabled.

And it definitely made a difference whether or not I deleted the file every 
time before running "xm save". I am not saying that the bug has anything to 
do with the file itself, but maybe deleting it just triggers the error 
(because creating a file takes longer than overwriting one?). Maybe that's 
why the problem shows up one time and not the next.
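
For anyone who wants to compare the two cases (save file deleted first vs. kept), the manual test could be scripted roughly as below. This is only a sketch, written in present-day Python 3: the helper names, the timeout value and the injectable `run` parameter are made up for illustration, and the only commands it issues are "xm save <domainname> <suspendfile>" and "xm restore <suspendfile>". A timeout is counted as a failure, since the symptom here is a hang.

```python
import os
import subprocess


def run_xm(args, timeout):
    """Run one xm command; True on exit code 0, False on error or hang."""
    try:
        return subprocess.run(["xm", *args], timeout=timeout).returncode == 0
    except subprocess.TimeoutExpired:
        # "xm save" hanging is the symptom under test, so treat it as failure.
        return False


def save_restore_cycles(domain, save_file, cycles=10, delete_first=True,
                        timeout=120, run=run_xm):
    """Return how many save/restore cycles completed before the first failure.

    delete_first=True removes the save file before every "xm save";
    delete_first=False keeps it, so each save overwrites the old file.
    """
    for done in range(cycles):
        if delete_first and os.path.exists(save_file):
            os.remove(save_file)
        if not run(["save", domain, save_file], timeout):
            return done
        if not run(["restore", save_file], timeout):
            return done
    return cycles
```

Running it once with delete_first=True and once with delete_first=False on the same domain would show whether deleting the file really changes the failure rate. Note that after a hung "xm save" the system may still need the reboot mentioned above.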

I will let you know once I can test the patch on the production system (or 
another SMP/HT system), but that may take a few more days... sorry.

thx for your help,
--Ralph

>
> cheers,
>
> S.
>
>
>
> diff -r 973a2d3c7a63 tools/python/xen/xend/XendMigrate.py
> --- a/tools/python/xen/xend/XendMigrate.py      Wed Aug  3 23:24:27 2005
> +++ b/tools/python/xen/xend/XendMigrate.py      Thu Aug 18 19:14:42 2005
> @@ -54,7 +54,7 @@
>
>      def dataReceived(self, data):
>          self.parser.input(data)
> -        if self.parser.ready():
> +        while(self.parser.ready()):
>              val = self.parser.get_val()
>              self.xinfo.dispatch(self, val)
>          if self.parser.at_eof():
>
>
>
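
The effect of the one-line change in the patch can be illustrated with a toy model. This is a minimal sketch and not the Xend code: `LineParser` is a hypothetical stand-in that treats each newline-terminated line as one message (the real parser's message format is not shown in this thread), and the message strings are invented. It only mirrors the `input`/`ready`/`get_val` calls visible in the diff, and it is written in present-day Python 3.

```python
from collections import deque


class LineParser:
    """Hypothetical parser: every newline-terminated line is one message."""

    def __init__(self):
        self._buffer = ""
        self._values = deque()

    def input(self, data):
        # Buffer the incoming data and queue every complete message in it.
        self._buffer += data
        while "\n" in self._buffer:
            line, self._buffer = self._buffer.split("\n", 1)
            self._values.append(line)

    def ready(self):
        return bool(self._values)

    def get_val(self):
        return self._values.popleft()


def receive_once(parser, data, dispatch):
    """Pre-patch shape: at most one message is dispatched per call."""
    parser.input(data)
    if parser.ready():
        dispatch(parser.get_val())


def receive_all(parser, data, dispatch):
    """Patched shape: every complete message is dispatched."""
    parser.input(data)
    while parser.ready():
        dispatch(parser.get_val())


def demo():
    # A response and a request arriving in the same chunk of data.
    chunk = "response ok\nrequest suspend\n"
    before, after = [], []
    receive_once(LineParser(), chunk, before.append)
    receive_all(LineParser(), chunk, after.append)
    return before, after
```

With the `if` version the second message stays queued in the parser until more data arrives; if the sender is itself waiting for a reply to that message, both sides wait, which would match the hang described in this thread.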
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
