xen-devel

Re: [Xen-devel] i/o scheduler deadlocks with loopback devices

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] i/o scheduler deadlocks with loopback devices
From: Nathan Gamber <ngamber@xxxxxxxxxxxxx>
Date: Wed, 20 Oct 2010 10:30:17 -0400
Delivery-date: Wed, 20 Oct 2010 07:30:46 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <4CBDDAED.5070503@xxxxxxxxxxxxx>
References: <4CBDDAED.5070503@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4
Oddly enough, this only occurs on Intel hardware (Core i5s, Xeon boxen) and not on Opteron/Phenom systems.

On 10/19/10 13:52, Nathan Gamber wrote:
 Hello all,

I'm able to consistently reproduce lockups in my domU with heavy I/O with the following error:

[36841.420662] INFO: task rsyslogd:15014 blocked for more than 120 seconds.
[36841.420843] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

The task varies between any of the tasks that might be active (kjournald, loop0, etc.)
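
Since the blocked task differs from run to run, it can help to pull every hung-task warning out of a saved kernel log and compare them. The following is a minimal sketch, not something from the original report: the function name find_hung_tasks and the regular expression are illustrative assumptions, based only on the message format quoted above.

```python
import re

# Assumed pattern: the hung-task warning quoted above, with an optional
# printk timestamp in front of it.
HUNG_TASK_RE = re.compile(
    r"(?:\[\s*(?P<ts>\d+\.\d+)\]\s*)?"
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (timestamp, task name, pid, seconds) for each hung-task warning.

    The timestamp is None when the line carries no printk timestamp.
    """
    hits = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match is None:
            continue
        ts = float(match.group("ts")) if match.group("ts") else None
        hits.append(
            (ts, match.group("comm"), int(match.group("pid")), int(match.group("secs")))
        )
    return hits
```

Feeding it the text of a saved dmesg dump gives one tuple per warning, which makes it easy to see which tasks (rsyslogd, kjournald, loop0, and so on) were blocked and when.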

My setup is:
dom0: Xen version 3.4.2.
domU: Ubuntu 10.04, 2.6.36-rc6 based on Stefano Stabellini's v2.6.36-rc6-urgent-fixes tree.
Paravirtual disks and network interfaces.
Root filesystem on /dev/xvda3, formatted ext3, mounted with default options.
Both dom0 and domU are using the CFQ i/o scheduler.

The virtual block device (xvda) is backed by an LVM volume on top of a local SATA RAID array.
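
To confirm which I/O scheduler is actually active in dom0 or domU, the kernel exposes it in /sys/block/<device>/queue/scheduler, with the active one shown in square brackets. A small sketch of reading it follows; the helper names active_scheduler and read_scheduler are illustrative, and the device name passed in (for example xvda) is an assumption about the local setup.

```python
from pathlib import Path


def active_scheduler(sysfs_text):
    """Return the scheduler shown in [brackets], or None if none is marked."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_scheduler(device, sysfs_root="/sys/block"):
    """Read the active scheduler for a whole-disk device name such as 'xvda'.

    Partitions (e.g. xvda3) have no queue directory of their own, so pass
    the parent device.
    """
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return active_scheduler(path.read_text())
```

Running read_scheduler on both sides before and after switching schedulers verifies that the change took effect on the device under test.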


To reproduce this, I can do one of the following:

Set up domU as a primary drbd node, with my drbd volume on top of a local loopback device, and then rsync many files to the volume, delete them, and repeat until the crash.

Mount a Linux ISO via loopback on /mnt/test, rsync /mnt/test/ to another directory on xvda3, delete the files, and repeat until the crash.
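
The loopback variant above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names, the fixed cycle count (the original procedure simply repeats until the hang), the rsync -a flag, and any paths passed in are illustrative choices, not the exact commands used. It must run as root, and the destination directory is removed with rm -rf on every cycle, so it should point at a scratch location.

```python
import subprocess


def repro_commands(iso_path, mount_point, dest_dir, cycles):
    """Build the command list: mount once, then copy and delete repeatedly."""
    src = mount_point.rstrip("/") + "/"  # trailing slash: copy the contents
    cmds = [["mount", "-o", "loop,ro", iso_path, mount_point]]
    for _ in range(cycles):
        cmds.append(["rsync", "-a", src, dest_dir])
        cmds.append(["rm", "-rf", dest_dir])  # dest_dir must be a scratch path
    cmds.append(["umount", mount_point])
    return cmds


def run(cmds, runner=subprocess.check_call):
    """Execute each command in order; any failure raises and stops the run."""
    for argv in cmds:
        runner(argv)
```

Building the command list separately from running it lets the sequence be printed and checked before anything touches the disk, for example by passing print as the runner.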

This is very similar to the following situation:

http://www.amailbox.org/mailarchive/linux-kernel/2010/9/1/4614107

Jeremy Fitzhardinge replied to that thread, indicating that his "xen: use percpu interrupts for IPIs and VIRQs" and "xen: handle events as edge-triggered" patches should fix the issue. I believe these were merged in 2.6.36-rc3, yet the issue persists. Disabling irqbalanced in dom0, which he suggested as a workaround, has no effect. I've also tried changing the I/O scheduler and reducing the number of vcpus from 4 to 1; neither had any effect.

Regards,

Nathan Gamber

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

