WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-users

[Xen-users] DomU crashing in CPU hotplug after migration

To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] DomU crashing in CPU hotplug after migration
From: Tim Evers <it@xxxxxxxxxx>
Date: Sun, 13 Nov 2011 10:42:44 +0100
I have the following setup:

Two dom0 hosts running Debian Lenny with kernel 2.6.32.21 (64-bit) on top
of the Xen 4.0.1 hypervisor. Storage is iSCSI (EqualLogic). The dom0
hardware is Dell M610 with 5640 CPUs.

I am trying to implement migration and CPU/memory hotplug with kernel
2.6.32.48 (32-bit PAE) running in a Debian 6 domU. After creating the
domU I can hot-add and hot-remove CPUs and RAM without problems (once I
had added udev rules that bring hotplugged CPUs online). But if I migrate
the domU from one dom0 to the other and then hotplug CPUs via xm
vcpu-set, the domU crashes with an error similar to this:

[49525.372432] installing Xen timer for CPU 1
[49525.372469] SMP alternatives: switching to SMP code
[49451.900582] Initializing CPU#1
[2575388.916907] CPU: L1 I cache: 32K, L1 D cache: 32K
[2575388.916907] CPU: L2 cache: 256K
[2575388.916907] CPU: L3 cache: 12288K
[2575388.916907] CPU: Unsupported number of siblings 32
[2575388.944891] BUG: soft lockup - CPU#0 stuck for 2352393s! [bash:7130]
[2575388.944900] Modules linked in: iptable_filter ip_tables x_tables
nfsd exportfs nfs lockd fscache nfs_acl auth_rpcgss sunrpc loop evdev
snd_pcm snd_timer snd soundcore snd_page_alloc pcspkr ext3 jbd mbcache
raid1 md_mod xen_netfront xen_blkfront
[2575388.944978]
[2575388.944985] Pid: 7130, comm: bash Not tainted (2.6.32.48 #5)
[2575388.944993] EIP: 0061:[<c1002227>] EFLAGS: 00000246 CPU: 0
[2575388.945004] EIP is at hypercall_page+0x227/0x1001
[2575388.945011] EAX: 00040000 EBX: 00000000 ECX: 00000000 EDX: cf235500
[2575388.945020] ESI: 7fffffff EDI: cf235500 EBP: d5d1be6c ESP: d5d1be00
[2575388.945028]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[2575388.945036] CR0: 8005003b CR2: 083e27d4 CR3: 1fbb9000 CR4: 00002660
[2575388.945046] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[2575388.945054] DR6: ffff0ff0 DR7: 00000400
[2575388.945060] Call Trace:
[2575388.945071]  [<c100570c>] ? xen_force_evtchn_callback+0xc/0x10
[2575388.945082]  [<c1005d58>] ? check_events+0x8/0xc
[2575388.945092]  [<c1005d17>] ? xen_irq_enable_direct_end+0x0/0x1
[2575388.945104]  [<c127666d>] ? wait_for_common+0xac/0x113
[2575388.945116]  [<c10324fe>] ? default_wake_function+0x0/0x8
[2575388.945127]  [<c1047b16>] ? synchronize_sched+0x3e/0x43
[2575388.945137]  [<c1047b1b>] ? wakeme_after_rcu+0x0/0x9
[2575388.945146]  [<c102f02f>] ? free_rootdomain+0x8/0x18
[2575388.945155]  [<c102f5ee>] ? cpu_attach_domain+0x11f/0x159
[2575388.945165]  [<c102ec07>] ? sd_free_ctl_entry+0x35/0x3e
[2575388.945176]  [<c10b4175>] ? kfree+0xa5/0xaa
[2575388.945185]  [<c102fe17>] ? partition_sched_domains+0xed/0x257
[2575388.945195]  [<c10324f4>] ? try_to_wake_up+0x282/0x28c
[2575388.945206]  [<c1068cf4>] ? cpuset_track_online_cpus+0x6b/0x77
[2575388.945217]  [<c127919c>] ? notifier_call_chain+0x23/0x46
[2575388.945227]  [<c104cb7c>] ? raw_notifier_call_chain+0x9/0xc
[2575388.945237]  [<c1273efb>] ? _cpu_up+0xba/0xf8
[2575388.945246]  [<c1273f7d>] ? cpu_up+0x44/0x52
[2575388.945256]  [<c1267d32>] ? store_online+0x37/0x54
[2575388.945265]  [<c1267cfb>] ? store_online+0x0/0x54
[2575388.945275]  [<c11bc195>] ? sysdev_store+0x19/0x1d
[2575388.945285]  [<c10f8774>] ? sysfs_write_file+0xb8/0xe5
[2575388.945295]  [<c10f86bc>] ? sysfs_write_file+0x0/0xe5
[2575388.945305]  [<c10b9d0c>] ? vfs_write+0x7f/0xda
[2575388.945314]  [<c10b9dfa>] ? sys_write+0x3c/0x60
[2575388.945324]  [<c100801b>] ? sysenter_do_call+0x12/0x28

This happens after I bring cpu1 online by issuing

echo 1 > /sys/devices/system/cpu/cpu1/online

either by udev script or by hand.
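
A minimal sketch of the same step applied to every CPU that the kernel reports as offline is shown below. It is not the udev rule used in this setup. The function name online_all_cpus and its optional directory argument are hypothetical, added only so the loop can be tried against a scratch directory, and the sketch assumes the usual /sys/devices/system/cpu/cpuN/online layout.

```shell
#!/bin/sh
# Sketch: write 1 to the "online" file of every CPU currently marked offline.
# The optional first argument is a hypothetical override of the sysfs CPU
# directory (handy for a dry run); it defaults to the real sysfs path.
online_all_cpus() {
    root="${1:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/online; do
        # If the glob matched nothing, $f is the literal pattern: skip it.
        [ -e "$f" ] || continue
        state=$(cat "$f")
        if [ "$state" = "0" ]; then
            echo "bringing ${f%/online} online"
            echo 1 > "$f"
        fi
    done
}
```

Called without an argument as root inside the domU, this does for each offline CPU what the echo above does for cpu1, so after a migration it would be expected to hit the same code path as in the trace.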

I've searched the net up and down and tried various ACPI and timer
settings, but found nothing that has any impact on this error. The error
also appears with the stock Debian 6 kernel 2.6.32-5-*.

Any ideas, anyone?

regards

tim
