Hi folks,
I wanted to do a stress test on a domU (Debian 3.1, 2.6.11.12-xenU).
running on dom0 (Debian 3.1, 2.6.11.12-xen0), Xen-2.0.7, IA32.
Similar problems occured to me running Xen-devel from last week. Since it is
under heavy development, I thought testing it with Xen-2.0.7 stable should
work for sure.
Scenario:
2 of theses hosts (IPs: 192.168.112.17 and 192.168.112.18) are running a
script.
The script checks if the domain is on the host, if yes, it will be migrated to
the other one, if no, the script will sleep for a few minutes.
It's kind of a domU-pingpong.
domU has a file-backed VBD lying on a 3rd host (NFS Server).
It worked for several migrations, but suddenly, the domU died.
Hopefully, someone could help me find out what the problem is.
script pingpong:
----------------------------------------------------------------------------------
SCRIPTNAME=`basename $0`
HOSTNAME=`hostname`
LOGFILE=$SCRIPTNAME.$HOSTNAME.log
XENHOST1="192.168.112.17"
XENHOST2="192.168.112.18"
XENGUEST1="debian1"
# determine the xen target host
MYIP=`ifconfig eth0 | grep "inet" | awk '{print $2}' | cut -d: -f 2`
if [ $MYIP = $XENHOST1 ]
then
TARGETHOST=$XENHOST2
else
TARGETHOST=$XENHOST1
fi
log() {
echo `date '+[%y/%m/%d %H:%M:%S]'` - $1 >> $LOGFILE
}
while [ 1 -eq 1 ]
do
RUNNINGGUESTS=`xm list | grep -v Name | grep -v Domain-0 | awk '{print
$1}'`
if `echo $RUNNINGGUESTS | grep $XENGUEST1 > /dev/null`
then
XENGUESTID=`xm list | grep $XENGUEST1 | awk '{print $2}'`
log "domain $XENGUEST1 (ID=$XENGUESTID) is running here"
if `xm list | grep $XENGUEST1 | awk '{print $5}' | grep p > /dev/null`
then
log "domain $XENGUEST1 is currently paused and hence cannot be
migrated!"
else
log "START migration of $XENGUEST1 to $TARGETHOST..."
if `xm migrate -l $XENGUEST1 $TARGETHOST`
then
log "END migration of domain $XENGUEST1 successfull!"
else
log "ERROR while migrating $XENGUEST1!"
fi
fi
else
log "domain $XENGUEST1 is not here. I'll go back to sleep."
fi
sleep `expr $RANDOM % 1200 + 120`
done
-----------------------------------------------------------------------------------------
logfile snippets of the last migration (from 192.168.112.18 to 192.168.112.17:
192.168.112.17:
--------------------
xend.log:
------------
[2005-11-28 01:14:30 xend] DEBUG (blkif:203) Connecting blkif to event channel
<BlkifBackendInterface 24 0> ports=16:4
[2005-11-28 01:14:30 xend] DEBUG (XendDomain:244) XendDomain>reap> domain died
name=debian1 id=24
[2005-11-28 01:14:30 xend] INFO (XendDomain:568) Destroying domain:
name=debian1
[2005-11-28 01:14:30 xend] DEBUG (XendDomainInfo:665) Destroying vifs for
domain 24
[2005-11-28 01:14:30 xend] DEBUG (netif:305) Destroying vif domain=24 vif=0
[2005-11-28 01:14:30 xend] DEBUG (XendDomainInfo:674) Destroying vbds for
domain 24
[2005-11-28 01:14:30 xend] DEBUG (blkif:552) Destroying blkif domain=24
[2005-11-28 01:14:30 xend] DEBUG (blkif:408) Destroying vbd domain=24 idx=0
[2005-11-28 01:14:30 xend] DEBUG (blkif:408) Destroying vbd domain=24 idx=1
[2005-11-28 01:14:30 xend] DEBUG (XendDomainInfo:634) Closing console, domain
24
[2005-11-28 01:14:30 xend] DEBUG (XendDomainInfo:622) Closing channel to
domain 24
[2005-11-28 01:14:30 xend] INFO (XendRoot:113) EVENT> xend.virq 4
[2005-11-28 01:14:30 xend] DEBUG (blkif:363) Unbinding vbd (type file)
from /dev/loop0
[2005-11-28 01:14:30 xend] DEBUG (blkif:363) Unbinding vbd (type file)
from /dev/loop1
[2005-11-28 01:14:30 xend] INFO (XendRoot:113) EVENT> xend.domain.exit
['debian1', '24', 'crash']
[2005-11-28 01:14:30 xend] INFO (XendRoot:113) EVENT> xend.domain.destroy
['debian1', '24']
[2005-11-28 01:14:31 xend] INFO (XendRoot:113) EVENT> xend.domain.died
['debian1', '24']
---------------------
xfrd.log:
------------
3488 [INF] XFRD> Forked child pid=5047
3488 [INF] XFRD> Accepted connection from 192.168.112.18:1777 on 2
5069 [INF] XFRD> Xfr service for 192.168.112.18:1777
[DEBUG] Conn_init> flags=1
[DEBUG] Conn_init> write stream...
[DEBUG] stream_init>mode=w flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_init> read stream...
[DEBUG] stream_init>mode=r flags=1 compress=0
[DEBUG] stream_init> unbuffer...
[DEBUG] stream_init< err=0
[DEBUG] Conn_sxpr>
(xfr.hello 1 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_sxpr>
(xfr.xfr 16)[DEBUG] Conn_sxpr< err=0
[1133136756.095291] xc_linux_restore start
xc_linux_restore start
[1133136756.108828] Created domain 24
Created domain 24
(Domain-0 Domain-24)'domain id=24 name=debian1 memory=256 console=9624
image=/boot/vmlinuz-2.6.11.12-xenU'[1133136756.570952] Reloading memory
pages: 0%
Reloading memory pages: 6%
337%[1133136870.173691] Received all pages
Received all pages
100%
100%
[1133136870.183801] Memory reloaded.
Memory reloaded.
Decreased reservation by 7 pages
[1133136870.184758] Domain ready to be built.
Domain ready to be built.
[1133136870.184844] Domain ready to be unpaused
Domain ready to be unpaused
[1133136870.185234] DOM=24
DOM=24
[DEBUG] Conn_sxpr>
(xfr.err 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_sxpr>
(xfr.err 0)[DEBUG] Conn_sxpr< err=0
5069 [INF] XFRD> Transfer complete in 114 seconds
5069 [INF] XFRD> Xfr service err=0
192.168.112.18:
----------------------------------------------
xend.log:
------------
[2005-11-28 01:14:16 xend] DEBUG (XendDomain:487) domain_restart_schedule> 16
suspend 1
[2005-11-28 01:14:16 xend] INFO (XendRoot:113) EVENT> xend.domain.shutdown
['debian1', '16', 'suspend']
[2005-11-28 01:14:16 xend] DEBUG (XendDomain:244) XendDomain>reap> domain died
name=debian1 id=16
[2005-11-28 01:14:16 xend] DEBUG (XendDomain:247) XendDomain>reap> shutdown
id=16 reason=suspend
[2005-11-28 01:14:16 xend] INFO (XendRoot:113) EVENT> xend.virq 4
[2005-11-28 01:14:16 xend] INFO (XendRoot:113) EVENT> xend.domain.suspended
['debian1', '16']
[2005-11-28 01:14:25 xend] INFO (XendDomain:568) Destroying domain:
name=debian1
[2005-11-28 01:14:25 xend] DEBUG (XendDomainInfo:665) Destroying vifs for
domain 16
[2005-11-28 01:14:25 xend] DEBUG (netif:305) Destroying vif domain=16 vif=0
[2005-11-28 01:14:26 xend] DEBUG (XendDomainInfo:674) Destroying vbds for
domain 16
[2005-11-28 01:14:26 xend] DEBUG (blkif:552) Destroying blkif domain=16
[2005-11-28 01:14:26 xend] DEBUG (blkif:408) Destroying vbd domain=16 idx=0
[2005-11-28 01:14:26 xend] DEBUG (blkif:408) Destroying vbd domain=16 idx=1
[2005-11-28 01:14:26 xend] DEBUG (XendDomainInfo:634) Closing console, domain
16
[2005-11-28 01:14:26 xend] DEBUG (XendDomainInfo:622) Closing channel to
domain 16
[2005-11-28 01:14:26 xend] INFO (XendRoot:113) EVENT> xend.domain.destroy
['debian1', '16']
[2005-11-28 01:14:26 xend] DEBUG (blkif:363) Unbinding vbd (type file)
from /dev/loop0
[2005-11-28 01:14:26 xend] DEBUG (blkif:363) Unbinding vbd (type file)
from /dev/loop1
[2005-11-28 01:14:27 xend] INFO (XendDomain:568) Destroying domain:
name=debian1
[2005-11-28 01:14:27 xend] DEBUG (XendDomainInfo:634) Closing console, domain
16
[2005-11-28 01:14:27 xend] ERROR (XendDomainInfo:627) Domain destroy failed:
debian1
Traceback (most recent call last):
File
"/usr/src/xen-2.0/dist/install/usr/lib/python/xen/xend/XendDomainInfo.py",
line 625, in destroy_domain
return xc.domain_destroy(dom=self.dom)
error: (3, 'No such process')
[2005-11-28 01:14:27 xend] INFO (XendRoot:113) EVENT> xend.domain.destroy
['debian1', '16']
[2005-11-28 01:14:27 xend] INFO (XendMigrate:345) Migrate OK: ['migrate',
['id', '16'], ['state', 'ok'], ['live', 1], ['resource', 0], ['src', ['host',
'testpc-018'], ['domain', '16']], ['dst', ['host', '192.168.112.17'],
['domain', 24]]]
[2005-11-28 01:14:27 xend] INFO (XendRoot:113) EVENT> xend.domain.died
['debian1', '16']
[2005-11-28 01:14:27 xend] INFO (XendRoot:113) EVENT> xend.domain.migrate
['debian1', '16', 'ok', ['migrate', ['id', '16'], ['state', 'ok'], ['live',
1], ['resource', 0], ['src', ['host', 'testpc-018'], ['domain', '16']],
['dst', ['host', '192.168.112.17'], ['domain', 24]]]]
xfrd.log:
------------
1: sent 37192, skipped 28339, delta 27582ms, dom0 44%, target 4%, sent
44Mb/s, dirtied 59Mb/s 49812 pages
[1133136779.371602] Saving memory pages: iter 2 0%
2: sent 38822, skipped 10990, %
96%
2: sent 38822, skipped 10990, delta 18590ms, dom0 33%, target 5%, sent
68Mb/s, dirtied 50Mb/s 28781 pages
[1133136797.962478] Saving memory pages: iter 3 0%
3: sent 26495, skipped 2285, 7%
92%
3: sent 26495, skipped 2285, delta 9687ms, dom0 36%, target 8%, sent 89Mb/s,
dirtied 62Mb/s 18477 pages
[1133136807.650291] Saving memory pages: iter 4 0%
4: sent 16933, skipped 1544, 5%
96%
4: sent 16933, skipped 1544, delta 6007ms, dom0 38%, target 9%, sent 92Mb/s,
dirtied 63Mb/s 11599 pages
[1133136813.658026] Saving memory pages: iter 5 0%
5: sent 10261, skipped 1338, 8%
99%
5: sent 10261, skipped 1338, delta 3642ms, dom0 39%, target 9%, sent 92Mb/s,
dirtied 72Mb/s 8112 pages
[1133136817.300454] Saving memory pages: iter 6 0%
6: sent 6907, skipped 1205, 11%
87%
6: sent 6907, skipped 1205, delta 2451ms, dom0 36%, target 17%, sent 92Mb/s,
dirtied 82Mb/s 6143 pages
[1133136819.752019] Saving memory pages: iter 7 0%
7: sent 5127, skipped 1011, 16%
99%
7: sent 5127, skipped 1011, delta 1849ms, dom0 47%, target 13%, sent 90Mb/s,
dirtied 173Mb/s 9795 pages
[1133136821.601429] Saving memory pages: iter 8 0%
8: sent 8308, skipped 1487, 10%
98%
8: sent 8308, skipped 1487, delta 4210ms, dom0 35%, target 7%, sent 64Mb/s,
dirtied 86Mb/s 11162 pages
[1133136825.811665] Saving memory pages: iter 9 0%
9: sent 9356, skipped 1805, 8%
98%
9: sent 9356, skipped 1805, delta 4216ms, dom0 38%, target 18%, sent 72Mb/s,
dirtied 98Mb/s 12663 pages
[1133136830.028020] Saving memory pages: iter 10 0%
10: sent 11249, skipped 1410, 7%
90%
10: sent 11249, skipped 1410, delta 8518ms, dom0 26%, target 5%, sent 43Mb/s,
dirtied 74Mb/s 19253 pages
[1133136838.546941] Saving memory pages: iter 11 0%
11: sent 17449, skipped 1803, 5%
99%
11: sent 17449, skipped 1803, delta 10874ms, dom0 30%, target 4%, sent
52Mb/s, dirtied 50Mb/s 16807 pages
[1133136849.421847] Saving memory pages: iter 12 0%
12: sent 14596, skipped 2211, 6%
98%
12: sent 14596, skipped 2211, [DEBUG] Conn_sxpr>
(debian1 16)[DEBUG] Conn_sxpr< err=0
[1133136856.172376] SUSPEND flags 00020004 shinfo 00000be8 eip c01068fe esi
00035a84
SUSPEND flags 00020004 shinfo 00000be8 eip c01068fe esi 00035a84
delta 6750ms, dom0 39%, target 8%, sent 70Mb/s, dirtied 102Mb/s 21143 pages
[1133136856.172581] Saving memory pages: iter 13 0%
13: sent 21143, skipped 0, 9%
97%
13: sent 21143, skipped 0, delta 9709ms, dom0 22%, target 0%, sent 71Mb/s,
dirtied 71Mb/s 21143 pages
[1133136865.882150] Total pages sent= 223838 (3.42x)
Total pages sent= 223838 (3.42x)
[1133136865.882212] (of which 0 were fixups)
(of which 0 were fixups)
[DEBUG] Conn_sxpr>
(xfr.err 0)[DEBUG] Conn_sxpr< err=0
[DEBUG] Conn_sxpr>
(xfr.xfr.ok 24)[DEBUG] Conn_sxpr< err=0
3555 [INF] XFRD> Transfer complete in 115 seconds
3555 [WRN] XFRD> Transfer OK
3555 [INF] XFRD> Xfr service err=0
Michael
--
----------------------------------------------------------------------------------------
Michael Mey
Thinking Objects Software GmbH | mailto: michael.mey@xxxxxx
Lilienthalstrasse 2/1 | phone: +49 711 88770-147
70825 Stuttgart-Korntal, Germany | fax: +49 711 88770-449
----------------------------------------------------------------------------------------
pgpnBFZMyBsnF.pgp
Description: PGP signature
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
|