WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

xen-devel

Re: [Xen-devel] soft lockups during live migrate..

On Fri, 23 Oct 2009 15:16:51 -0700
Mukesh Rathor <mukesh.rathor@xxxxxxxxxx> wrote:

> On Fri, 23 Oct 2009 11:09:36 +0100
> Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
> 
> > At 05:21 +0100 on 23 Oct (1256275309), Mukesh Rathor wrote:
> > > Trying to migrate a 64-bit PV guest with 64GB running medium to
> > > heavy load on Xen 3.4.0, it is showing a lot of soft lockups. The
> > > soft lockups are causing the cluster FS to reboot dom0. The
> > > hardware has 256GB and 32 CPUs.
> > > 
> > > Looking into the hypervisor thru kdb, I see one cpu in
> > > sh_resync_all() while all other 31 appear spinning on the
> > > shadow_lock.
> > 
> > How many vcpus does the guest have?  Scalability issues in the OOS
> > shadow code are more related to number of VCPUs than amount of RAM.
> 
> Actually, things are fine with 32GB/32vcpus. Problem happens with
> 64GB/32vcpus. Trying the unstable version now.

Nah, with c/s 20365 and oos=0 in vm.cfg, it fails right away:

[root@OVM_EL5U3_X86_64_PVM_4GB]# xm migrate -l 3 vega7183
Error: /usr/lib/xen/bin/xc_save 82 3 0 0 1 failed


On source xend.log:

[2009-09-23 16:22:33 16993] DEBUG (balloon:181) Balloon: 199147540 KiB free; need 16384; done.
[2009-09-23 16:22:34 16993] DEBUG (XendCheckpoint:110) [xc_save]: /usr/lib/xen/bin/xc_save 82 3 0 0 1
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) xc_save: failed to get the suspend evtchn port
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) 
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error: xc_get_m2p_mfns
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error: Failed to map live M2P table
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) Save exit rc=1
[2009-09-23 16:22:34 16993] ERROR (XendCheckpoint:164) Save failed on domain OVM_EL5U3_X86_64_PVM_4GB (3) - resuming.
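
For anyone sifting through a longer xend.log for the same failure, here is a minimal Python sketch that pulls out the "ERROR Internal error:" messages relayed by XendCheckpoint. The entry layout (timestamp, pid, level, source) is inferred only from the lines quoted above, and the helper names `unwrap` and `internal_errors` are made up for this illustration. It accepts entries that are wrapped across lines as well as unwrapped ones.

```python
import re

# Excerpt from the source-side xend.log; some entries are deliberately
# left wrapped across two lines to exercise the joining step.
SAMPLE = """\
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) xc_save: failed to get
the suspend evtchn port
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418)
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error:
xc_get_m2p_mfns
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) ERROR Internal error:
Failed to map live M2P table
[2009-09-23 16:22:34 16993] INFO (XendCheckpoint:418) Save exit rc=1
"""

# Assumed layout: [<timestamp> <pid>] <LEVEL> (<module>:<line>) <message>
ENTRY = re.compile(
    r"^\[(?P<ts>[^\]]*) (?P<pid>\d+)\] (?P<level>\w+) \((?P<src>[^)]*)\) ?(?P<msg>.*)$"
)
MARKER = "ERROR Internal error:"


def unwrap(text):
    """Join continuation lines (those not starting with '[') onto the previous entry."""
    entries = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue  # skip blank lines between entries
        if line.startswith("[") or not entries:
            entries.append(line)
        else:
            entries[-1] = entries[-1] + " " + line
    return entries


def internal_errors(text):
    """Return the text following each 'ERROR Internal error:' marker, in order."""
    errors = []
    for entry in unwrap(text):
        match = ENTRY.match(entry)
        if match and MARKER in match.group("msg"):
            errors.append(match.group("msg").split(MARKER, 1)[1].strip())
    return errors


if __name__ == "__main__":
    for err in internal_errors(SAMPLE):
        print(err)
```

Note that a Python traceback in the log would also be treated as continuation lines of the preceding entry, which is harmless for this purpose.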


On the target it looks pretty screwy:

domain', ['domid', '3'], ['on_crash', 'restart'], ['uuid', 
'b990db11-57f4-a553-5ee0-c022234f3dd5'], ['bootloader_args', '-q'], ['vcpus', 
'32'], ['name', 'OVM_EL5U3_X86_64_PVM_4GB'], ['on_poweroff', 'destroy'], 
['on_reboot', 'restart'], ['cpus', [['0', '1', '2', '3', '4', '5', '6', '7', 
'8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', 
'21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31'], ['0', '1', 
'2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', 
'16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', 
'29', '30', '31'], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', 
'11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', 
'24', '25', '26', '27', '28', '29', '30', '31'], ['0', '1', '2', '3', '4', '5', 
'6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', 
'20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31'], ['0', 
'1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', 
'15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', 
'28', '29', '30', '31'], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 
'10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20', '21', '22', 
'23', '24', '25', '26', '27', '28', '29', '30', '31'], ['0', '1', '2', '3', 
'4', '5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', 
'18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', 
'31'], ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10', '11', '12', 
'13', '14', '15', '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', 
'26', '27', '28', '29', '30', '31'] .......
..........

[2009-10-23 16:21:15 11795] DEBUG (image:319) No VNC passwd configured for vfb access
[2009-10-23 16:21:15 11795] DEBUG (XendCheckpoint:261) restore:shadow=0x0, _static_max=0xfa0000000, _static_min=0x0,
[2009-10-23 16:21:15 11795] DEBUG (balloon:181) Balloon: 264942044 KiB free; need 65536000; done.
[2009-10-23 16:21:15 11795] DEBUG (XendCheckpoint:278) [xc_restore]: /usr/lib/xen/bin/xc_restore 4 3 1 2 0 0 0
[2009-10-23 16:21:15 11795] INFO (XendCheckpoint:418) ERROR Internal error: read: p2m_size
[2009-10-23 16:21:15 11795] INFO (XendCheckpoint:418) Restore exit with rc=1
[2009-10-23 16:21:15 11795] DEBUG (XendDomainInfo:2748) XendDomainInfo.destroy: domid=3
[2009-10-23 16:21:15 11795] ERROR (XendDomainInfo:2762) XendDomainInfo.destroy: domain destruction failed.
Traceback (most recent call last):
  File "/usr/lib/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2755, in destroy
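
As a sanity check on the numbers in the target log, the short Python sketch below compares the `_static_max` value from the restore line with the "need" figure from the balloon line. It assumes `_static_max` is expressed in bytes and the balloon figure in KiB (the log prints "KiB" only next to the free amount, so the unit of "need" is an assumption). The helper names are made up for this illustration.

```python
# Values copied from the target-side xend.log above.
STATIC_MAX_BYTES = 0xfa0000000   # _static_max, assumed to be in bytes
BALLOON_NEED_KIB = 65536000      # balloon "need", assumed to be in KiB


def bytes_to_kib(n_bytes):
    """Convert a byte count to whole KiB."""
    return n_bytes // 1024


def kib_to_gib(n_kib):
    """Convert KiB to GiB as a float."""
    return n_kib / (1024 * 1024)


if __name__ == "__main__":
    need_from_static_max = bytes_to_kib(STATIC_MAX_BYTES)
    print("static_max in KiB:", need_from_static_max)
    print("matches balloon need:", need_from_static_max == BALLOON_NEED_KIB)
    print("guest size in GiB:", kib_to_gib(BALLOON_NEED_KIB))
```

Under those unit assumptions the two figures agree exactly, which suggests the memory reservation on the target was computed as expected and the failure lies in the later `read: p2m_size` step.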







_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel