xen-devel

[Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] Begging for help with cloned OpenSuse 10.3 DomU on OpenSuse 11.1 Dom0
From: Glen <gb2@xxxxxxx>
Date: Wed, 24 Nov 2010 08:49:31 -0800
Delivery-date: Wed, 24 Nov 2010 08:50:43 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20101121000829.GA32177@xxxxxxx>
References: <20101121000829.GA32177@xxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.12) Gecko/20101027 Lightning/1.0b2 Thunderbird/3.1.6
Dear Xen Developers:

Please forgive me in advance, but I've exhausted every other option,
read every manual I can find, and got no response on the Xen-Users list,
so before I give up completely may I please beg for any insight any of
you might have into a strange problem I'm experiencing...

I'm seeing some really strange behavior with an OpenSuse 10.3 guest
running in Xen.  Every 48-72 hours, the machine starts running at a very
high load average and dumping tons of messages into the message log, and
it finally becomes completely inaccessible.  When the guest becomes unusable,
the host "xm top" display shows 399% CPU utilization and constant NET
and VBD activity, but the host cannot even "shutdown" the guest - I have
to destroy it to make it stop.

The host machine is a Dell PowerEdge 2950 III server, running OpenSuse 11.1,
64 bit, kernel 2.6.27.45-0.1-xen, and Xen package xen-3.3.1_18546_24-0.4.13.
It has 20GB of RAM, a quad-core 2GHz Intel CPU, and a Dell PERC 5 RAID.  It
runs other guest machines with no problem.

The guest machine is running OpenSuse 10.3, kernel 2.6.22.19-0.4-xenpae, in
32 bit mode, with Xen package xen-3.1.0_15042-51.3.

The guest machine is a clone of a running physical machine that I'm trying
to virtualize.  I created the drive, attached it, and so forth on the Xen
host, then I did an rsync of the 10.3 physical machine's filesystems onto
the 11.1 host.  I removed and reinstalled the Xen kernel package as
suggested on the net, and, against even my predictions, got the guest to
boot.  And it works great... for a few days or so.

But then the guest starts to go crazy.  I see rapidly repeating messages
like this start to appear in the syslog (/var/log/messages):

Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096
Nov 20 15:35:55 guestc kernel: device blocksize: 4096
Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=210137505, b_blocknr=20676879

Occasionally these messages show up garbled, like this:

Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed.
block=21_f__f__f__
f___f__f_f_e_f_f____f_f_f_f_____f___f_f_____f__f__f_f___f__f__f_f__f__f____f__f_
f_f___f_f__f_____f__f__f__f__f_f_____f_f_f____f______f__f__f__f____f__f____f__f_
f__f___f__f___f__f__f__f_f_f__f__f____f__f____f__f___f___f__f_f___f__f__f_f_f__f
_f___f___f__f__f__f_f___f___f__f__f___f__f_e_f__f_f__f__f__f______f__f______f__f
__f__f_f___f_f___f_f_____f__f_f__f___f__f_f____f_f__f__f_f___f__f___f__f__f_f___
f__f_____f__f__f__f___state_f__f___f_f___f______f_fe___f___f_____f___f____f_____
f__f__f_f__f__f___f__f__f_____f______f__f____f_f___f_f_f____f___f__f___f____f__f
__f____f__f_____f___f_f_____f__f_____f__f__f_f_f________f___f___f_f__f__f__f__f_
f_f_____f_f_f__e_f__f___f__f__f__f_f_f___f___f___f__f__f__state=0x000000__f__f_s
tate=0x00000029, b_size=4096
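
To pin down exactly when the flood starts, the failure messages can be counted per minute. The following is only a minimal sketch: it assumes the syslog line layout shown in the excerpts above ("Mon DD HH:MM:SS host kernel: ..."), and the function name `count_failures_per_minute` is made up for illustration. It matches on the start of the message text, so garbled lines like the one above are still counted as long as the message prefix survives.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts above:
#   "Nov 20 15:35:55 guestc kernel: <message>"
# Group 1 keeps the timestamp truncated to the minute.
LINE_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+kernel:\s+(.*)$")
MARKER = "__find_get_block_slow() failed"

def count_failures_per_minute(lines):
    """Return a Counter mapping 'Mon DD HH:MM' to the number of failure lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and MARKER in m.group(2):
            counts[m.group(1)] += 1
    return counts

# Sample input built from the log excerpts quoted in this message.
sample = [
    "Nov 20 15:35:55 guestc kernel: b_state=0x00000029, b_size=4096",
    "Nov 20 15:35:55 guestc kernel: device blocksize: 4096",
    "Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. "
    "block=210137505, b_blocknr=20676879",
    "Nov 20 15:35:55 guestc kernel: __find_get_block_slow() failed. block=21_f__f__f__",
]

counts = count_failures_per_minute(sample)
for minute, n in sorted(counts.items()):
    print(minute, n)
```

On a real system the same function could be fed `open("/var/log/messages")` instead of the sample list; the first minute with a non-zero count marks the onset.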

And then, of course, I can't even get in to the guest at all, via network
or xm console.  xm shutdown does nothing, and I must xm destroy the guest.

After re-creating the guest, everything runs fine again, until another few
days have passed.

Today I was actually in the guest when this happened.  An rsync was running,
and that process was pegged, with the guest showing a load average of 5.0
from within the guest, and "xm top" showing a usage of 199% (2 of the 4
CPUs?).  I couldn't kill the rsync process, and the messages above were
flooding into the syslog.  The guest could not shut all the way down even
with "init 0", and, eventually, I had to destroy it again.
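
The arithmetic behind the "2 of the 4 CPUs?" guess can be written out as a short sketch. It assumes that the CPU(%) column in "xm top" is summed across a domain's VCPUs, so that 100% corresponds to one fully busy VCPU and a 4-VCPU guest tops out near 400%. The helper name `busy_vcpus` is hypothetical.

```python
def busy_vcpus(cpu_percent, vcpus):
    """Estimate how many VCPUs are saturated for a given CPU(%) reading.

    Assumption: 100% means one fully busy VCPU, so the reading is
    capped at the number of VCPUs configured for the guest.
    """
    return min(float(vcpus), cpu_percent / 100.0)

# The two readings reported in this message, for the 4-VCPU guest.
for reading in (199, 399):
    print(reading, "% ->", busy_vcpus(reading, 4), "of 4 VCPUs busy")
```

Under that assumption, 199% is roughly two spinning VCPUs and 399% is all four.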

Here is the machine config:

name="guestc"
uuid="91919191-3676-3f68-bada-993e5adb1088"
memory=8192
maxmem=8192
vcpus=4
on_poweroff="destroy"
on_reboot="restart"
on_crash="destroy"
localtime=0
keymap="en-us"
builder="linux"
bootloader="/usr/lib/xen/boot/domUloader.py"
bootargs="--entry=xvda2:/boot/vmlinuz-xenpae,/boot/initrd-xenpae"
extra=" "
disk=[ 'file:/a/disks/guestc/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]
vif=[ 'mac=00:16:3e:52:f9:96,bridge=br0', ]
vfb=['type=vnc,vncunused=1']
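
Since xm config files use Python assignment syntax, the disk entries above can be pulled apart with a few lines of Python to see which backend each guest device maps to. This is a rough sketch rather than how xm itself reads the file: it assumes every value is a plain literal and that each disk string has the three-field "backend:target,guest-device,mode" form used here. `load_config` and `parse_disk` are illustrative names, and only part of the config is repeated.

```python
import ast

# Subset of the config above, copied verbatim.
CONFIG_TEXT = """
name="guestc"
memory=8192
maxmem=8192
vcpus=4
disk=[ 'file:/a/disks/guestc/disk0,xvda,w', 'phy:sdc1,sdc1,w', ]
vif=[ 'mac=00:16:3e:52:f9:96,bridge=br0', ]
"""

def load_config(text):
    """Collect simple 'name = literal' assignments into a dict."""
    cfg = {}
    for node in ast.parse(text).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            cfg[node.targets[0].id] = ast.literal_eval(node.value)
    return cfg

def parse_disk(spec):
    """Split 'backend:target,guest-device,mode' into its parts."""
    backend_target, guest_dev, mode = spec.split(",")
    backend, _, target = backend_target.partition(":")
    return {"backend": backend, "target": target,
            "guest_dev": guest_dev, "mode": mode}

cfg = load_config(CONFIG_TEXT)
disks = [parse_disk(d) for d in cfg["disk"]]
for d in disks:
    print(d["guest_dev"], "<-", d["backend"], d["target"], "(" + d["mode"] + ")")
```

For this config the output shows one file-backed disk exposed as xvda and one physical partition (sdc1) passed through under the same name, both writable.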

Now, I get that I'm doing some unorthodox things here.  Cloning a physical
machine into a virtual machine.  Running 10.3 as a guest under an 11.1 host.
A 32-bit guest on a 64-bit host.  But the thing DOES run, and I feel like
I'm SO CLOSE to making this work, so I'm really hopeful that someone can
recognize these symptoms and help me find a solution.

Is there any way this can be made to work?  Or am I totally out of luck?
(Or just crazy to even try?)

Any ideas or guidance would be greatly appreciated!

Thank you!
Glen
Glen Barney
