Hey Brendan & all,
> > I ran into some problems trying remus on xen4.0.1rc4 with the 2.6.31.13
> > dom0 (checkout from yesterday):
>
> What's your domU kernel? pvops support was recently added to dom0, but
> still doesn't work for domU.
Ah, that explains a few things. However, similar behaviour occurs with an
HVM domU. Remus starts and spits out the following output:
qemu logdirty mode: enable
1: sent 267046, skipped 218, delta 8962ms, dom0 68%, target 0%, sent 976Mb/s, dirtied 1Mb/s 290 pages
2: sent 290, skipped 0, delta 12ms, dom0 66%, target 0%, sent 791Mb/s, dirtied 43Mb/s 16 pages
3: sent 16, skipped 0, Start last iteration
PROF: suspending at 1278503125.101352
issuing HVM suspend hypercall
suspend hypercall returned 0
pausing QEMU
SUSPEND shinfo 000fffff
delta 11ms, dom0 18%, target 0%, sent 47Mb/s, dirtied 47Mb/s 16 pages
4: sent 16, skipped 0, delta 5ms, dom0 20%, target 0%, sent 104Mb/s, dirtied 104Mb/s 16 pages
Total pages sent= 267368 (0.25x)
(of which 0 were fixups)
All memory is saved
PROF: resumed at 1278503125.111614
resuming QEMU
Sending 6017 bytes of QEMU state
PROF: flushed memory at 1278503125.112014
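As a sanity check on those numbers: the per-iteration bandwidth figures are consistent with 4 KiB pages over the reported delta (assuming "Mb/s" means 10^6 bits per second and the tool truncates the value), e.g. in Python:

PAGE_BITS = 4096 * 8   # assuming 4 KiB pages

def mbit_per_s(pages, delta_ms):
    return pages * PAGE_BITS / (delta_ms / 1000.0) / 1e6

# iteration 1: 267046 pages sent in 8962 ms, 290 pages dirtied meanwhile
print(int(mbit_per_s(267046, 8962)))  # 976 -> "sent 976Mb/s"
print(int(mbit_per_s(290, 8962)))     # 1   -> "dirtied 1Mb/s"
# iteration 2: 290 pages sent in 12 ms, 16 pages dirtied
print(int(mbit_per_s(290, 12)))       # 791 -> "sent 791Mb/s"
print(int(mbit_per_s(16, 12)))        # 43  -> "dirtied 43Mb/s"

So the transfer itself looks fine and the first checkpoint appears to complete.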
After printing that, remus seems to become inactive. The ps tree looks like this:
root 4756 0.4 0.1 82740 11040 pts/0 SLl+ 13:45 0:03 /usr/bin/python /usr/bin/remus --no-net remus1 backup
According to strace, it is stuck in a read() on fd 6, which is a FIFO:
/var/run/tap/remus_nas1_9000.msg
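To illustrate what that looks like, here is a small self-contained Python sketch of the same shape of hang: a reader blocked on a FIFO whose writer side (presumably the tapdisk process, though that is only my guess) is attached but never sends the expected message. The path below is a throwaway temp file, not the real remus FIFO:

import os
import select
import tempfile

path = os.path.join(tempfile.mkdtemp(), "example.msg")
os.mkfifo(path)

rfd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)  # reader end
wfd = os.open(path, os.O_WRONLY)                  # writer end, stays silent

# With a writer attached but no data, the FIFO never becomes readable;
# a plain blocking read() would sit here forever, which is what strace
# shows the stuck remus process doing.
readable, _, _ = select.select([rfd], [], [], 2.0)
print("readable after 2s: %s" % bool(readable))   # False

os.write(wfd, b"ack\n")                           # once the writer speaks...
print(os.read(rfd, 4096))                         # ...the reader gets its message

os.close(rfd)
os.close(wfd)
os.unlink(path)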
The domU comes up in a blocked state on the backup machine and seems to
run fine there. However, xm list on the primary shows no state flags at all:
Domain-0 0 10208 12 r----- 468.6
remus1 1 1024 1 ------ 41.8
After a Ctrl-C, remus segfaults:
remus[4756]: segfault at 0 ip 00007f3f49cc7376 sp 00007fffec999fd8 error 4 in libc-2.11.1.so[7f3f49ba1000+178000]
> Are these in dom0 or the primary domU? Looks a bit like dom0, but I
> haven't seen these before.
Those were in dom0. This time dmesg shows output after destroying
the domU on the primary:
[ 1920.059226] INFO: task xenwatch:55 blocked for more than 120 seconds.
[ 1920.059262] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1920.059315] xenwatch D 0000000000000000 0 55 2 0x00000000
[ 1920.059363] ffff8802e2e656c0 0000000000000246 0000000000011200 0000000000000000
[ 1920.059439] ffff8802e2e65720 0000000000000000 ffff8802d55d20c0 00000001001586b3
[ 1920.059520] ffff8802e2e683b0 000000000000f668 00000000000153c0 ffff8802e2e683b0
[ 1920.059592] Call Trace:
[ 1920.059626] [<ffffffff8157553d>] io_schedule+0x2d/0x40
[ 1920.059661] [<ffffffff812afbc9>] get_request_wait+0xe9/0x1c0
[ 1920.059695] [<ffffffff810af240>] ? autoremove_wake_function+0x0/0x40
[ 1920.059732] [<ffffffff812a3e87>] ? elv_merge+0x37/0x200
[ 1920.059765] [<ffffffff812afd41>] __make_request+0xa1/0x470
[ 1920.059800] [<ffffffff810389ff>] ? xen_restore_fl_direct_end+0x0/0x1
[ 1920.059837] [<ffffffff8103ed5d>] ? retint_restore_args+0x5/0x6
[ 1920.059874] [<ffffffff812ae5dc>] generic_make_request+0x17c/0x4a0
[ 1920.059909] [<ffffffff8111bdf6>] ? mempool_alloc+0x56/0x140
[ 1920.059946] [<ffffffff8103819d>] ? xen_force_evtchn_callback+0xd/0x10
[ 1920.059979] [<ffffffff812ae978>] submit_bio+0x78/0xf0
[ 1920.060013] [<ffffffff81180489>] submit_bh+0xf9/0x140
[ 1920.060046] [<ffffffff81182600>] __block_write_full_page+0x1e0/0x3a0
[ 1920.060080] [<ffffffff811819c0>] ? end_buffer_async_write+0x0/0x1f0
[ 1920.060116] [<ffffffff81186980>] ? blkdev_get_block+0x0/0x70
[ 1920.060151] [<ffffffff81186980>] ? blkdev_get_block+0x0/0x70
[ 1920.060186] [<ffffffff811819c0>] ? end_buffer_async_write+0x0/0x1f0
[ 1920.060222] [<ffffffff81182ec1>] block_write_full_page_endio+0xe1/0x120
[ 1920.060259] [<ffffffff81038a12>] ? check_events+0x12/0x20
[ 1920.060294] [<ffffffff81182f15>] block_write_full_page+0x15/0x20
[ 1920.060330] [<ffffffff81187928>] blkdev_writepage+0x18/0x20
[ 1920.060365] [<ffffffff81120937>] __writepage+0x17/0x40
[ 1920.060399] [<ffffffff81121897>] write_cache_pages+0x227/0x4d0
[ 1920.060434] [<ffffffff81120920>] ? __writepage+0x0/0x40
[ 1920.060469] [<ffffffff810389ff>] ? xen_restore_fl_direct_end+0x0/0x1
[ 1920.060504] [<ffffffff81121b64>] generic_writepages+0x24/0x30
[ 1920.060539] [<ffffffff81121b9d>] do_writepages+0x2d/0x50
[ 1920.060576] [<ffffffff81119beb>] __filemap_fdatawrite_range+0x5b/0x60
[ 1920.060613] [<ffffffff8111a1ff>] filemap_fdatawrite+0x1f/0x30
[ 1920.060646] [<ffffffff8111a245>] filemap_write_and_wait+0x35/0x50
[ 1920.060681] [<ffffffff81187ba4>] __sync_blockdev+0x24/0x50
[ 1920.060716] [<ffffffff81187be3>] sync_blockdev+0x13/0x20
[ 1920.060748] [<ffffffff81187cc8>] __blkdev_put+0xa8/0x1a0
[ 1920.060784] [<ffffffff81187dd0>] blkdev_put+0x10/0x20
[ 1920.060819] [<ffffffff81344fea>] vbd_free+0x2a/0x40
[ 1920.060851] [<ffffffff81344499>] blkback_remove+0x59/0x90
[ 1920.060885] [<ffffffff8133e890>] xenbus_dev_remove+0x50/0x70
[ 1920.060921] [<ffffffff8138b9d8>] __device_release_driver+0x58/0xb0
[ 1920.060956] [<ffffffff8138bb4d>] device_release_driver+0x2d/0x40
[ 1920.060991] [<ffffffff8138ac0a>] bus_remove_device+0x9a/0xc0
[ 1920.061027] [<ffffffff81388da7>] device_del+0x127/0x1d0
[ 1920.061061] [<ffffffff81388e66>] device_unregister+0x16/0x30
[ 1920.061095] [<ffffffff813441a0>] frontend_changed+0x90/0x2a0
[ 1920.061131] [<ffffffff8133eb82>] xenbus_otherend_changed+0xb2/0xc0
[ 1920.061167] [<ffffffff81577aa7>] ? _spin_unlock_irqrestore+0x37/0x60
[ 1920.061209] [<ffffffff8133f150>] frontend_changed+0x10/0x20
[ 1920.061243] [<ffffffff8133c794>] xenwatch_thread+0xb4/0x190
[ 1920.061281] [<ffffffff810af240>] ? autoremove_wake_function+0x0/0x40
[ 1920.061314] [<ffffffff8133c6e0>] ? xenwatch_thread+0x0/0x190
[ 1920.061349] [<ffffffff810aecb6>] kthread+0xa6/0xb0
[ 1920.061383] [<ffffffff8103f3ea>] child_rip+0xa/0x20
[ 1920.061415] [<ffffffff8103e5d7>] ? int_ret_from_sys_call+0x7/0x1b
[ 1920.061451] [<ffffffff8103ed5d>] ? retint_restore_args+0x5/0x6
[ 1920.061485] [<ffffffff8103f3e0>] ? child_rip+0x0/0x20
Any idea what's going wrong? Thanks!
Cheers,
NN