WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel

Re: [Xen-devel] "xm save" trouble -- deadlock?

To: Gerd Knorr <kraxel@xxxxxxx>
Subject: Re: [Xen-devel] "xm save" trouble -- deadlock?
From: Ewan Mellor <ewan@xxxxxxxxxxxxx>
Date: Tue, 1 Nov 2005 18:54:56 +0000
Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 01 Nov 2005 18:52:04 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4367A2AF.2090707@xxxxxxx>
References: <43679B4C.3030804@xxxxxxx> <4367A2AF.2090707@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.9i
On Tue, Nov 01, 2005 at 06:15:27PM +0100, Gerd Knorr wrote:

> >xend in turn doesn't read from the pipe but is waiting for some lock:
> >
> >  master-xen root /vm/ttylinux# strace -p6567
> >  Process 6567 attached - interrupt to quit
> >  futex(0x8087370, FUTEX_WAIT, 0, NULL <unfinished ...>
> >  Process 6567 detached
> 
> Oh, xend is multithreaded:
> 
>   master-xen root /vm/ttylinux# ls /proc/6567/task
>   .  ..  6567  6568  6569  6570  6571  6581  7977
> 
> 7977 seems to be responsible for the xc_save and does this:
> 
>   master-xen root /vm/ttylinux# strace -p7977
>   Process 7977 attached - interrupt to quit
>   read(20,  <unfinished ...>
>   Process 7977 detached
> 
> fd 20 is the other end of the *stdout* pipe, whereas xc_save writes 
> stuff to *stderr*.  Hmm.  Maybe xend causes the deadlock by simply 
> reading from the wrong file handle?

The code that does this is in XendCheckpoint.py:forkHelper.  It's using
select.poll() and file.readline() to read from both stdout and stderr.
This is a pretty daft thing to do -- readline() can block on one pipe while
the child is stuck writing to the other, so there's definitely potential for
deadlock here.
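
For illustration only, here is a rough sketch of the kind of pattern being described. It is not the actual forkHelper code; the function name read_both_with_poll and its structure are assumptions made up for this example. The hazard is that poll() only reports that some bytes are available, while readline() then blocks until a whole line (or EOF) arrives. If the child meanwhile fills the other pipe, neither side can make progress.

```python
import select
import subprocess
import sys


def read_both_with_poll(cmd):
    """Hypothetical sketch of a deadlock-prone reader (not the xend code).

    poll() says a pipe has *some* data; readline() then blocks until a
    complete line or EOF. While it blocks, the other pipe is not drained.
    """
    child = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    streams = {
        child.stdout.fileno(): ("stdout", child.stdout),
        child.stderr.fileno(): ("stderr", child.stderr),
    }
    poller = select.poll()
    for fd in streams:
        poller.register(fd, select.POLLIN | select.POLLHUP)

    lines = []
    while streams:
        for fd, _event in poller.poll():
            name, stream = streams[fd]
            line = stream.readline()  # may block with the other pipe full
            if line:
                lines.append((name, line.rstrip("\n")))
            else:
                # Empty string means EOF on this pipe.
                poller.unregister(fd)
                del streams[fd]
    child.wait()
    return lines
```

With a well-behaved child that writes short, complete lines this works, which is probably why such code can survive for a while before a chatty helper exposes the problem.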

I'll rewrite this to use a separate thread to pull the data from stderr, which
should solve the problem.
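
A minimal sketch of that approach, assuming a helper started with subprocess (the names run_helper and drain_stderr are hypothetical and not taken from xend): a dedicated thread drains stderr, so the child can never block on a full stderr pipe while the main thread reads stdout.

```python
import subprocess
import sys
import threading


def run_helper(cmd):
    """Run cmd, reading stdout in this thread and stderr in a second one."""
    child = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    err_lines = []

    def drain_stderr():
        # Runs in its own thread; keeps the stderr pipe from filling up.
        for line in child.stderr:
            err_lines.append(line.rstrip("\n"))

    reader = threading.Thread(target=drain_stderr, daemon=True)
    reader.start()

    # The main thread is free to block on stdout for as long as it likes.
    out_lines = [line.rstrip("\n") for line in child.stdout]

    reader.join()
    status = child.wait()
    return status, out_lines, err_lines


if __name__ == "__main__":
    # Child writes well over a typical pipe buffer to stderr, then one
    # line to stdout.
    demo = (
        "import sys; "
        "[print('e' * 99, file=sys.stderr) for _ in range(2000)]; "
        "print('done')"
    )
    status, out, err = run_helper([sys.executable, "-c", demo])
    print(status, out, len(err))
```

The trade-off is one extra thread per helper, in exchange for not having to reason about which pipe might fill first.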

Thanks for your diagnostic efforts,

Ewan.
