
[Xen-users] with heavy VM IO, clocksource causes random dom0 reboots

To: "xen-users@xxxxxxxxxxxxxxxxxxx" <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-users] with heavy VM IO, clocksource causes random dom0 reboots
From: Benjamin Weaver <benjamin.weaver@xxxxxxxxxxxxx>
Date: Mon, 29 Aug 2011 13:37:37 +0100
Delivery-date: Mon, 29 Aug 2011 05:36:45 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-users-request@lists.xensource.com?subject=help>
List-id: Xen user discussion <xen-users.lists.xensource.com>
List-post: <mailto:xen-users@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-users>, <mailto:xen-users-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-users-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110617 Thunderbird/3.1.11
On Debian Squeeze (kernel 2.6.32-5) running Xen 4.0, I have created two Ubuntu Lucid Lynx (10.04) VMs. In a stress test, the VMs pass a large file between them via NFS file sharing. A previous thread on this list helped establish that some Ethernet cards improve VM I/O performance.

However, our box, fitted with better Intel NICs, is still rebooting under heavy VM I/O load. The kernel call trace is copied below. Note the first entry in the trace, a reference to pvclock_clocksource_read (the PV clocksource).

Note this bug report:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=603727

and this suggestion page:

http://wiki.xensource.com/xenwiki/xenpm#head-253cbbe6cf12fa31e10022610cd7090aa980921f


These pages (the last in particular) describe I/O jitter related to the clocksource, which would be consistent with the interrupt problems we have been experiencing.


1. I have been told to set clocksource=pit on dom0. How and where is this clocksource parameter set in Squeeze? At compile time, or at boot time? And if at boot time, in /etc/default/grub?

If it is set in /etc/default/grub, how exactly is this done? The existing lines in that file are all uppercase GRUB_ variables:

GRUB_TIMEOUT=5
GRUB_CMDLINE_XEN="com1=9600,8n1 console=com1,vga noreboot"
GRUB_CMDLINE_LINUX="console=tty0 console=hvc0"
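
For reference, my untested guess, which I would like confirmed, is that the option is appended to the dom0 kernel line and the boot config then regenerated:

GRUB_CMDLINE_LINUX="console=tty0 console=hvc0 clocksource=pit"

update-grub   # rewrites /boot/grub/grub.cfg

Or does it instead belong on the Xen line (GRUB_CMDLINE_XEN), if the hypervisor rather than the dom0 kernel consumes it?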


2. Is an additional package necessary for pit?

3. Should clocksource=pit be set on the domUs as well?
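
Related: am I right that the active clocksource can be inspected, and perhaps switched, at runtime through sysfs? Something like:

cat /sys/devices/system/clocksource/clocksource0/available_clocksource
# as root, and only if "pit" is listed as available:
echo pit > /sys/devices/system/clocksource/clocksource0/current_clocksource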

Aug 29 06:28:53 vm2 kernel: [53400.204119] updatedb.mloc D 0000000000000000 0 3605 3601 0x00000000
Aug 29 06:28:53 vm2 kernel: [53400.204125] ffff8802ed071530 0000000000000286 0000000000000000 ffff88009babd260
Aug 29 06:28:53 vm2 kernel: [53400.204132] ffff8802e95f0000 0000000000000088 000000000000f9e0 ffff88009e399fd8
Aug 29 06:28:53 vm2 kernel: [53400.204138] 0000000000015780 0000000000015780 ffff8802e95e1530 ffff8802e95e1828
Aug 29 06:28:53 vm2 kernel: [53400.204144] Call Trace:
Aug 29 06:28:53 vm2 kernel: [53400.204154] [<ffffffff8102ddcc>] ? pvclock_clocksource_read+0x3a/0x8b
Aug 29 06:28:53 vm2 kernel: [53400.204160] [<ffffffff8110f19a>] ? sync_buffer+0x0/0x40
Aug 29 06:28:53 vm2 kernel: [53400.204166] [<ffffffff8130c16a>] ? io_schedule+0x73/0xb7
Aug 29 06:28:53 vm2 kernel: [53400.204169] [<ffffffff8110f1d5>] ? sync_buffer+0x3b/0x40
Aug 29 06:28:53 vm2 kernel: [53400.204174] [<ffffffff8130d42a>] ? _spin_unlock_irqrestore+0xd/0xe
Aug 29 06:28:53 vm2 kernel: [53400.204178] [<ffffffff8130c677>] ? __wait_on_bit+0x41/0x70
Aug 29 06:28:53 vm2 kernel: [53400.204181] [<ffffffff8110f19a>] ? sync_buffer+0x0/0x40
Aug 29 06:28:53 vm2 kernel: [53400.204185] [<ffffffff8130c711>] ? out_of_line_wait_on_bit+0x6b/0x77
Aug 29 06:28:53 vm2 kernel: [53400.204190] [<ffffffff81065f34>] ? wake_bit_function+0x0/0x23
Aug 29 06:28:53 vm2 kernel: [53400.204202] [<ffffffffa04bf824>] ? ocfs2_read_blocks+0x55d/0x6c2 [ocfs2]
Aug 29 06:28:53 vm2 kernel: [53400.204208] [<ffffffff8100eccf>] ? xen_restore_fl_direct_end+0x0/0x1
Aug 29 06:28:53 vm2 kernel: [53400.204217] [<ffffffffa04db9a1>] ? ocfs2_validate_inode_block+0x0/0x1ab [ocfs2]
Aug 29 06:28:53 vm2 kernel: [53400.204226] [<ffffffffa04db58c>] ? ocfs2_read_inode_block_full+0x37/0x51 [ocfs2]
Aug 29 06:28:53 vm2 kernel: [53400.204235] [<ffffffffa04d135b>] ? ocfs2_inode_lock_atime+0x73/0x23f [ocfs2]
Aug 29 06:28:53 vm2 kernel: [53400.204243] [<ffffffffa04c8564>] ? ocfs2_dir_foreach_blk+0x48/0x435 [ocfs2]
Aug 29 06:28:53 vm2 kernel: [53400.204248] [<ffffffff810fc34c>] ? filldir+0x0/0xb7
Aug 29 06:28:53 vm2 kernel: [53400.204257] [<ffffffffa04d13d5>] ? ocfs2_inode_lock_atime+0xed/0x23f [ocfs2]
Aug 29 06:28:53 vm2 kernel: [53400.204261] [<ffffffff810fc34c>] ? filldir+0x0/0xb7
Aug 29 06:28:53 vm2 kernel: [53400.204269] [<ffffffffa04c9af3>] ? ocfs2_readdir+0x161/0x1d0 [ocfs2]
Aug 29 06:28:53 vm2 kernel: [53400.204273] [<ffffffff810fc34c>] ? filldir+0x0/0xb7
Aug 29 06:28:53 vm2 kernel: [53400.204277] [<ffffffff810fc51c>] ? vfs_readdir+0x75/0xa7
Aug 29 06:28:53 vm2 kernel: [53400.204281] [<ffffffff810fc686>] ? sys_getdents+0x7a/0xc7
Aug 29 06:28:53 vm2 kernel: [53400.204285] [<ffffffff81011b42>] ? system_call_fastpath+0x16/0x1b

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
