On Mon, 2011-07-11 at 13:09 -0400, Sébastien Riccio wrote:
> On 10.07.2011 20:19, Daniel Stodden wrote:
> > Okay, that needs to get fixed, but I don't know where. In XCP that's
> > how it's exclusively done, because it's the most general approach.
>
> My guess at the moment is that it might be a problem with blktap and
> vhd, or blktap and my kernel combo, or blktap and my kernel combo and
> vhd :)
>
> The kernel I'm currently playing with is 2.6.39.2-xen-stable + blktap,
> built on a Debian squeeze box like this:
>
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git linux_xen_2.6.39.x-stable
> cd linux_xen_2.6.39.x-stable
> git checkout -b stable/2.6.39.x origin/stable/2.6.39.x
> git remote add daniel git://xenbits.xensource.com/people/dstodden/linux.git
> git fetch daniel
> git merge daniel/blktap/next-2.6.39
> make menuconfig (removing useless stuff, activating the needed ones)
> make bzImage -j9 ; make modules -j9 ; make modules_install
> [ etc ... ]
I'm not familiar with potential open issues in konrad's 2.6.39 tree, but
I wouldn't expect that to be the problem.
> Playing more with it this morning, I managed to start a vm with a vhd
> file for the disk (after provisioning it with some files and
> reboots...), with this config:
>
> box# cat /cloud/data2/configs/vm1.test.cfg
> bootloader = "/usr/bin/pygrub"
> memory = 1024
> name = "vm1"
> vcpus = 4
> #vif = [ 'ip=10.111.5.10, bridge=trunk0, vifname=vm1.0' ]
> disk = [ 'tap2:vhd:/cloud/data2/machines/vm1.vhd,xvda,w' ]
> root = "/dev/xvda1"
> extra = "fastboot"
> on_poweroff = 'destroy'
> on_reboot = 'restart'
> on_crash = 'restart'
Looks good, although I'm not too sure about the 'disk' line syntax
myself. E.g. mine just say tap:aio, tap:vhd, etc.
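For reference, with the older syntax a working line would look
something along these lines (paths made up, just to show the format):

disk = [ 'tap:aio:/path/to/vm1.img,xvda,w' ]
disk = [ 'tap:vhd:/path/to/vm1.vhd,xvda,w' ]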
>
> vm1 is up and rocking
>
> box# xl list
> Name                         ID   Mem VCPUs      State   Time(s)
> Domain-0                      0  1024    16     r-----      26.2
> vm1                           2  1024     4     -b----       2.8
>
> But now if I issue a ps -aux in dom0, it displays some processes and
> then ps hangs.
> (That was not the case before I started vm1.)
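If ps wedges like that, odds are something is stuck in D state. One
way to see what, without relying on ps itself, is to dump the blocked
tasks to the kernel log (standard sysrq 'w', assuming sysrq is enabled
in your config):

# echo w > /proc/sysrq-trigger
# dmesg | tail

Any tapdisk or blkback thread with a stack trace in there would be a
good hint.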
>
> And if I try to list the attached block devices with xl:
>
> box# xl block-list 2
> Vdev BE handle state evt-ch ring-ref BE-path
> Segmentation fault
Ouch. What xen tree are you running? Unstable? What does tap-ctl list
say, does that work? You might want to try the 4.1 tree instead. If
that doesn't help, you'll at least want to get yourself a coredump to
see where it's crashing.
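For the tapdisk check, just

# tap-ctl list

should print the running tapdisks, one per line. If that hangs or
errors out too, the problem is below xl, in blktap itself.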
> (dmesg)
> [ 1592.151122] xl[2292]: segfault at 0 ip 00007f7de314e6d2 sp
> 00007fff610e30b0 error 4 in libc-2.11.2.so[7f7de3117000+158000]
# ulimit -c unlimited          # let the shell write a core file
# xl block-list 2
... core dumped.
# gdb $(which xl) core
(gdb) backtrace                # shows where it crashed
> but it works if I try to list it with xm:
>
> box# xm block-list 2
> Vdev   BE  handle  state  evt-ch  ring-ref  BE-path
> 51712   0       0      4      23         8  /local/domain/0/backend/vbd/2/51712
Well, different codebase...
>
> I'll try with something other than vhd to see if the same happens,
> but my goal is to use vhds ...
Just check if the tapdisks are all working. You didn't see those
failing. And as long as your guests look happy (xl console, poke
around), your problem is elsewhere.
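A quick way to do that, assuming the tools are in your path:

# pgrep -l tapdisk
# xl console 2

One tapdisk process per attached vhd, plus a guest that keeps reading
and writing without I/O errors, and you can rule the datapath out.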
>
>
> > Can you check if it works with some normal disk? Check out modules,
> > install lvm2, make sure you have dm-linear loaded, etc... You were
> > running a custom kernel, right? You're probably just missing sth.
>
> The module list:
>
> root@xen-blade15:~# lsmod
> Module Size Used by
> blktap 17941 8
> ocfs2 618206 1
> quota_tree 7539 1 ocfs2
> ocfs2_dlmfs 17331 1
> ocfs2_stack_o2cb 3482 1
> ocfs2_dlm 204671 1 ocfs2_stack_o2cb
> ocfs2_nodemanager 186569 14 ocfs2,ocfs2_dlmfs,ocfs2_stack_o2cb,ocfs2_dlm
> ocfs2_stackglue 7437 3 ocfs2,ocfs2_dlmfs,ocfs2_stack_o2cb
> dm_round_robin 2260 1
> configfs 21658 2 ocfs2_nodemanager
> crc32c 2688 8
> iscsi_tcp 8503 6
> libiscsi_tcp 11604 1 iscsi_tcp
> libiscsi 34844 2 iscsi_tcp,libiscsi_tcp
> scsi_transport_iscsi 28673 3 iscsi_tcp,libiscsi
> openvswitch_mod 71205 3
> xenfs 9815 1
> xfs 501098 1
> ext2 61369 1
> sg 27333 0
> sr_mod 14760 0
> cdrom 35494 1 sr_mod
> xen_evtchn 4739 2
> loop 16002 0
> tpm_tis 7821 0
> tpm 10878 1 tpm_tis
> i7core_edac 15891 0
> tpm_bios 4921 1 tpm
> dcdbas 5416 0
> edac_core 34483 1 i7core_edac
> evdev 9374 4
> usb_storage 43361 0
> thermal_sys 14045 0
> pcspkr 1779 0
> acpi_processor 5423 0 [permanent]
> button 4199 0
> usbhid 34740 0
> hid 78436 1 usbhid
> ext4 255423 1
> mbcache 5434 2 ext2,ext4
> jbd2 48549 2 ocfs2,ext4
> crc16 1319 1 ext4
> dm_multipath 16384 2 dm_round_robin
> scsi_dh 4876 1 dm_multipath
> dm_mod 63657 7 dm_multipath
Do you have dm-linear (I think it's just linear.ko) available? If not,
it might explain yesterday's kpartx issue.
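Easy enough to check without digging through /lib/modules, assuming
you have dmsetup from lvm2 installed:

# dmsetup targets

If 'linear' shows up in that list you're fine (on many configs the
target is built into dm-mod rather than shipped as a separate
linear.ko); if it doesn't, that would explain kpartx having nothing to
map partitions with.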
> sd_mod 34293 6
> crc_t10dif 1292 1 sd_mod
> uhci_hcd 21828 0
> megaraid_sas 70747 3
> ehci_hcd 37665 0
> scsi_mod 144719 9 iscsi_tcp,libiscsi,scsi_transport_iscsi,sg,sr_mod,usb_storage,scsi_dh,sd_mod,megaraid_sas
> usbcore 137744 5 usb_storage,usbhid,uhci_hcd,ehci_hcd
> bnx2 70964 0
>
> My vhd storage is on ocfs2 shared storage attached via multipath
> iscsi. I will try it on local storage too, to eliminate
> that possible cause.
Well, yeah, that's a somewhat thicker stack than normally recommended
for testing patched kernels, but then again, it doesn't really sound
like that's your most immediate problem.
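For the local test, something like this should do (vhd-util ships with
blktap; size is in MB, path is arbitrary):

# vhd-util create -n /var/tmp/test.vhd -s 4096

then point the guest's disk line at /var/tmp/test.vhd. If that
behaves, you can start squinting at the ocfs2/iscsi layers.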
Cheers,
Daniel
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel