xen-devel

RE: [Xen-devel] can't create any more pv-on-hvm domains after~38under3.3

To: "Steve Ofsthun" <sofsthun@xxxxxxxxxxxxxxx>, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxxx>
Subject: RE: [Xen-devel] can't create any more pv-on-hvm domains after~38under3.3-testing
From: "Ian Pratt" <Ian.Pratt@xxxxxxxxxxxxx>
Date: Wed, 3 Dec 2008 17:23:02 -0000
Cc: Ian Pratt <Ian.Pratt@xxxxxxxxxxxxx>, James Harper <james.harper@xxxxxxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Wed, 03 Dec 2008 09:23:59 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4936BBA5.80600@xxxxxxxxxxxxxxx>
References: <C55C1F6F.77A%keir.fraser@xxxxxxxxxxxxx> <4936BBA5.80600@xxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AclVaQjcV5LlytChQRSmTvXPYE53AQAAoE8g
Thread-topic: [Xen-devel] can't create any more pv-on-hvm domains after~38under3.3-testing
> In our case, the foreign pages actually originate from blkback, are
> passed to iSCSI for processing, and are abused by the ref manipulation
> in the dom0 network stack.  On return to blkback, the page refs are
> off.  What we haven't been able to do yet is identify the exact
> circumstances that trigger the issue.  We have a fairly elaborate
> reproducer involving running a pool of domains and continuously
> rebooting them.  Eventually, one domain will hang on exit with a stuck
> page with elevated ref counts.
> 
> In our case, the stuck page is always a blkback I/O page.
> 
> Running the same test on a FC SAN or local SCSI backend device doesn't
> hang.


I'd be inclined to investigate this by hacking the start_xmit function
of the NIC driver to randomly corrupt 1 in 100 packets. That's usually a
good way of exercising some of the darker corners of the networking
stack. (Better than creating a netfilter DROP rule.)

Ian
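
The suggestion above amounts to a small fault-injection policy: on each
transmit, decide with probability 1/100 whether to damage the outgoing
buffer. The following is only a user-space sketch of that policy, not
driver code. Every name in it (struct fault_injector, fi_init,
fi_maybe_corrupt) is hypothetical, and the choice of a seeded xorshift32
generator and of inverting a single byte are assumptions made so that a
run is repeatable. In a real driver the equivalent check would sit in the
transmit routine and act on the packet data before it is handed to the
hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fault-injection state; not part of any real driver. */
struct fault_injector {
    uint32_t rng_state;      /* xorshift32 state, must be non-zero */
    uint32_t one_in;         /* corrupt roughly 1 packet in this many */
    unsigned long seen;      /* packets inspected so far */
    unsigned long corrupted; /* packets actually damaged */
};

/* Plain xorshift32 step: deterministic for a given seed, so a failing
 * run can be replayed with the same corruption pattern. */
static uint32_t fi_next(struct fault_injector *fi)
{
    uint32_t x = fi->rng_state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    fi->rng_state = x;
    return x;
}

static void fi_init(struct fault_injector *fi, uint32_t seed, uint32_t one_in)
{
    assert(one_in > 0);
    fi->rng_state = seed ? seed : 1u; /* xorshift state may not be zero */
    fi->one_in = one_in;
    fi->seen = 0;
    fi->corrupted = 0;
}

/* Inverts one randomly chosen byte of the packet about once every
 * fi->one_in calls. Returns 1 if the packet was damaged, 0 if it was
 * left untouched. Empty packets are never damaged. */
static int fi_maybe_corrupt(struct fault_injector *fi, uint8_t *pkt, size_t len)
{
    size_t off;

    fi->seen++;
    if (len == 0 || fi_next(fi) % fi->one_in != 0)
        return 0;

    off = fi_next(fi) % len;
    pkt[off] ^= 0xFF;
    fi->corrupted++;
    return 1;
}
```

A caller would run fi_init(&fi, seed, 100) once and then call
fi_maybe_corrupt(&fi, data, len) on each outgoing buffer. The seen and
corrupted counters make it easy to confirm that the injected rate is near
1 in 100 before reading anything into the results of a test run.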



