xen-users

Re: [Xen-users] blktap and file-backed qcow: crashes and bad performance?

To: Christoph Dwertmann <lists.cd@xxxxxxxxx>
Subject: Re: [Xen-users] blktap and file-backed qcow: crashes and bad performance?
From: Brian Kosick <bkosick@xxxxxxxxxxx>
Date: Fri, 11 Aug 2006 12:38:06 -0600
Cc: Xen-users@xxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 11 Aug 2006 11:38:48 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <3d9676cb0608110759j3f173553o1af5c1d0b3ab3ac5@xxxxxxxxxxxxxx>
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
References: <3d9676cb0608110759j3f173553o1af5c1d0b3ab3ac5@xxxxxxxxxxxxxx>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
No, I think that we have a few weeks left in our league. Debbie plays
2-3 nights a week and is in 2 different leagues.

On Fri, 2006-08-11 at 16:59 +0200, Christoph Dwertmann wrote:
> Hi!
> 
> I'm running the latest Xen unstable x86_64 on a Dell PowerEdge 1950
> Dual CPU Dual Core Xeon with 16GB RAM. I'm using file-backed sparse
> qcow images as root filesystems for the Xen guests. All qcow images
> are backed by the same image file (a 32bit Debian sid installation).
> The Xen disk config looks like this:
> 
> disk   = [ 'tap:qcow:/home/images/%s.%d.qcow,xvda1,w' % (vmname, vmid)]
> 
> Beforehand, I use the qcow-create tool to create those qcow files.
> 
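For reference, a minimal sketch of how such a batch of backed images
could be created. The positional usage of qcow-create (size in MB,
target file, optional backing file) matches the blktap tools of that
era as far as I can tell, but check its help output first; the base
image path and the 4096 MB size below are assumptions:

  BASE=/home/images/sid-base.img   # assumed path to the shared base image
  for i in $(seq 1 100); do
      # each guest gets its own sparse qcow overlay, backed by the same base
      qcow-create 4096 /home/images/vm.$i.qcow $BASE
  done
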
> I use grub to boot Xen like this:
> root    (hd0,0)
> kernel /boot/xen-3.0-unstable.gz com2=57600,8n1 console=com2
> dom0_mem=4097152 noreboot xenheap_megabytes=32
> module /boot/xen0-linux root=/dev/sda1 ro noapic console=tty0
> xencons=ttyS1 console=ttyS1
> module /boot/xen0-linux-initrd
> 
> My goal is to run 100+ Xen guests, but this seems impossible. I
> observe several things:
> 
> - after creating a few Xen guests (and even after shutting them down),
> my process list is cluttered with "tapdisk" processes that put full
> load on all 8 virtual CPUs on the dom0. The system becomes unusable.
> Killing the tapdisk processes also apparently destroys the qcow
> images. (A ps one-liner for inspecting these is sketched after the
> crash trace below.)
> 
> - I (randomly?) get the messages "Error: (28, 'No space left on
> device')", "Error: Device 0 (vif) could not be connected. Hotplug
> scripts not working." or even "Error: (12, 'Cannot allocate memory')"
> on domU creation, even though there is plenty of disk space and RAM
> available at that time. This mostly happens when creating more than 80
> guests. (A few diagnostic commands are sketched at the end of this
> message.)
> 
> - the dom0 will sooner or later crash with a message like this:
> 
> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at fs/aio.c:511
> invalid opcode: 0000 [1] SMP
> CPU 0
> Modules linked in: ipt_MASQUERADE iptable_nat ip_nat ip_conntrack
> nfnetlink ip_tables x_tables bridge dm_snapshot dm_mirror dm_mod
> usbhid ide_cd sers
> Pid: 46, comm: kblockd/0 Not tainted 2.6.16.13-xen-kasuari-dom0 #1
> RIP: e030:[<ffffffff8018f8ee>] <ffffffff8018f8ee>{__aio_put_req+39}
> RSP: e02b:ffffffff803a89c8  EFLAGS: 00010086
> RAX: 00000000ffffffff RBX: ffff8800f43d7a80 RCX: 00000000f3bdc000
> RDX: 0000000000001458 RSI: ffff8800f43d7a80 RDI: ffff8800f62d1c80
> RBP: ffff8800f62d1c80 R08: 6db6db6db6db6db7 R09: ffff88000193d000
> R10: 0000000000000000 R11: ffffffff80153e48 R12: ffff8800f62d1ce8
> R13: 0000000000000200 R14: 0000000000000000 R15: 0000000000000000
> FS:  00002b9bf01bccb0(0000) GS:ffffffff80472000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000
> Process kblockd/0 (pid: 46, threadinfo ffff8800005e4000, task 
> ffff8800005c57e0)
> Stack: ffff8800f43d7a80 ffff8800f62d1c80 ffff8800f62d1ce8 ffffffff80190082
>        ffff880004e83d10 ffff8800f4db7400 0000000000000200 ffff8800f4db7714
>        ffff8800f4db7400 0000000000000001
> Call Trace: <IRQ> <ffffffff80190082>{aio_complete+297}
>        <ffffffff80195b0b>{finished_one_bio+159}
> <ffffffff80195be8>{dio_bio_complete+150}
>        <ffffffff80195d24>{dio_bio_end_aio+32}
> <ffffffff801cf1b7>{__end_that_request_first+328}
>        <ffffffff801d00ca>{blk_run_queue+50}
> <ffffffff8800524d>{:scsi_mod:scsi_end_request+40}
>        <ffffffff880054fe>{:scsi_mod:scsi_io_completion+525}
>        <ffffffff880741ce>{:sd_mod:sd_rw_intr+598}
> <ffffffff88005792>{:scsi_mod:scsi_device_unbusy+85}
>        <ffffffff801d1534>{blk_done_softirq+175}
> <ffffffff80132544>{__do_softirq+122}
>        <ffffffff8010bada>{call_softirq+30} <ffffffff8010d231>{do_softirq+73}
>        <ffffffff8010d626>{do_IRQ+65} <ffffffff8023bf5a>{evtchn_do_upcall+134}
>        <ffffffff801d8a66>{cfq_kick_queue+0}
> <ffffffff8010b60a>{do_hypervisor_callback+30} <EOI>
>        <ffffffff801d8a66>{cfq_kick_queue+0}
> <ffffffff8010722a>{hypercall_page+554}
>        <ffffffff8010722a>{hypercall_page+554} 
> <ffffffff801dac97>{kobject_get+18}
>        <ffffffff8023b7aa>{force_evtchn_callback+10}
> <ffffffff8800641d>{:scsi_mod:scsi_request_fn+935}
>        <ffffffff801d8adc>{cfq_kick_queue+118}
> <ffffffff8013d3e6>{run_workqueue+148}
>        <ffffffff8013db18>{worker_thread+0}
> <ffffffff80140abd>{keventd_create_kthread+0}
>        <ffffffff8013dc08>{worker_thread+240}
> <ffffffff80125cdb>{default_wake_function+0}
>        <ffffffff80140abd>{keventd_create_kthread+0}
> <ffffffff80140abd>{keventd_create_kthread+0}
>        <ffffffff80140d61>{kthread+212} <ffffffff8010b85e>{child_rip+8}
>        <ffffffff80140abd>{keventd_create_kthread+0}
> <ffffffff80140c8d>{kthread+0}
>        <ffffffff8010b856>{child_rip+0}
> 
> Code: 0f 0b 68 c3 9b 2f 80 c2 ff 01 85 c0 74 07 31 c0 e9 09 01 00
> RIP <ffffffff8018f8ee>{__aio_put_req+39} RSP <ffffffff803a89c8>
>  <0>Kernel panic - not syncing: Aiee, killing interrupt handler!
>  (XEN) Domain 0 crashed: 'noreboot' set - not rebooting.
> 
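The orphaned tapdisk processes can at least be inspected without
killing them (plain ps/grep, nothing Xen-specific; the bracketed first
letter keeps grep from matching its own process):

  ps -eo pid,pcpu,etime,args | grep '[t]apdisk'

Given the report above that killing tapdisk corrupts the qcow images,
this is only useful for counting the leftovers and seeing how much CPU
they consume.
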
> Is it just my setup, or:
> - does Xen not scale at all to 100+ machines?
> - does blktap not scale at all?
> - is blktap with qcow very unstable right now?
> 
> Thank you for any pointers,
> 
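
When domU creation fails with hotplug or "no space" errors like the
ones above, the usual first stops on a Xen 3.0-era dom0 are the
hotplug log and the hypervisor log. The log path below is the default
used by the xen-unstable scripts and may differ on other installs:

  tail /var/log/xen/xen-hotplug.log   # output of the hotplug scripts
  xm dmesg | tail -20                 # recent hypervisor messages
  xm list                             # domains the daemon still tracks
  xenstore-ls                         # look for stale device entries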

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
