xen-ppc-devel

[XenPPC] Status of CoW on xenppc - part1 based on dm-userspace

To: xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
Subject: [XenPPC] Status of CoW on xenppc - part1 based on dm-userspace
From: Christian Ehrhardt <ehrhardt@xxxxxxxxxxxxxxxxxx>
Date: Wed, 16 May 2007 10:34:25 +0200
Setting up a CoW device on xenppc with dm-userspace:
Thanks to Dan Smith for the quick support with everything I struggled with on the way to getting dm-userspace "working" so far! This is written as a howto, but because I was not able to get it fully up and running, it is really a description of how to reach the bug I describe at the end of this mail. I welcome every comment, especially any help with a deeper interpretation of the DSISR/DAR/... registers in the bug information.

For part 2 I am now switching to testing a blktap based CoW device.

Step-by-step dm-userspace based CoW
- get the current dm-userspace from http://static.danplanet.com/hg/. There are two versions now; I used unstable, which sounded more stable than ring ;)
- merge the C files and headers into our linux-ppc tree and patch the Kconfig file (install.sh is currently broken, but it would do the same)
- configure the kernel with xen_maple_defconfig + dm-userspace support
- build the patched and configured kernel
Compile issues seen (code not fully platform independent?), but only warnings:
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-chardev.c: In function ‘do_kill_mapping’:
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-chardev.c:309: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’
CC [M] drivers/md/dm-userspace-cache.o
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-cache.c: In function ‘dmu_remove_mapping’:
/root/xen/linux/linux-ppc-2.6.hg_dmuserspace/drivers/md/dm-userspace-cache.c:199: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 2 has type ‘uint64_t’
CC [M] drivers/md/dm-crypt.o

- build the current xen-unstable tip with the dm-userspace enabled zImage just created
- The README says to compile libdmu, but building libdmu fails because our kernel is missing some patches. Neither current vanilla, nor xen-unstable/sparse/pristine, nor the patch set delivered with dm-userspace contains the missing function. The needed patches were on the xen-devel list in Aug 2006; where have they gone?
Error compiling libdmu:
root@c08b01-0[1]:~/dm-userspace.unstable/tools/libdmu# gcc -DPACKAGE_NAME=\"libdmu\" -DPACKAGE_TARNAME=\"libdmu\" -DPACKAGE_VERSION=\"0.4.0\" "-DPACKAGE_STRING=\"libdmu 0.4.0\"" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE=\"libdmu\" -DVERSION=\"0.4.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 -DSTDC_HEADERS=1 -DHAVE_FCNTL_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_NETINET_IN_H=1 -DHAVE_STDINT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_SYS_IOCTL_H=1 -DHAVE_UNISTD_H=1 -DHAVE_STRUCT_STAT_ST_RDEV=1 -DHAVE_UNISTD_H=1 -DHAVE_FORK=1 -DHAVE_VFORK=1 -DHAVE_WORKING_VFORK=1 -DHAVE_WORKING_FORK=1 -DHAVE_STDLIB_H=1 -DHAVE_MALLOC=1 -DRETSIGTYPE=void -DLSTAT_FOLLOWS_SLASHED_SYMLINK=1 -DHAVE_MEMSET=1 -DHAVE_STRTOL=1 -DHAVE_STRTOULL=1 -I. -I. -g -I/lib/modules/2.6.17_dmuserspace-Xen/build/include -Wall -MT libdmu_la-dmu.lo -MD -MP -MF .deps/libdmu_la-dmu.Tpo -c dmu.c -fPIC -DPIC -o .libs/libdmu_la-dmu.o
dmu.c: In function ‘dmu_ctl_queue_msg’:
dmu.c:209: warning: implicit declaration of function ‘dmu_get_msg_len’
[...]

- after discussing with the maintainer Dan Smith I did the following:
-> switched to dm-userspace.ring instead of unstable
-> ignored libdmu because it is deprecated
- compiling the kernel then fails with an error that Dan Smith had expected to happen
-> switched back to changeset 613:d5d4d8eaa4a4 (last merge) using "hg revert --all -r d5d4d8eaa4a4"
- compile the kernel with the dm-userspace of this changeset (it still has the %llu vs. uint64_t warning, but it has worked so far)
- go to the tools/cowd directory and run make && make install
- create a dscow file with 64k blocks for my base loop image: "dscow_tool -b 64 -c SLES10_G1.dscow SLES10_G1.img"
The real (allocated) size of the still unused dscow file is small, 4113 blocks according to the first column of this listing:
4113 -rw------- 1 root root 4296081408 2007-05-15 14:38 SLES10_G1.dscow
2465165 -rw-r--r-- 1 root root 4296015872 2007-05-04 10:16 SLES10_G1.img
Both are 4.1G sparse files (the image was created with dd as a 4G sparse file, so that its real block usage grows on demand).
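
A minimal sketch of how the apparent size and the really allocated size of such a sparse file can be compared. It assumes Linux, where st_blocks is counted in 512-byte units; the helper names and the demo file name are made up for illustration and are not part of dm-userspace.

```python
import os
import tempfile


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    Assumption: st_blocks is in 512-byte units, as on Linux.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size_bytes):
    """Create a file with the given apparent size without writing data."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.img")  # hypothetical file name
        make_sparse_file(demo, 16 * 1024 * 1024)
        apparent, allocated = size_report(demo)
        print("apparent bytes: ", apparent)
        print("allocated bytes:", allocated)
```

On a filesystem that supports sparse files the allocated value stays far below the apparent one until data is written.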

creating a device-mapper dev node for the file with:
root@c08b01-0[1]:~/images# cowd -n -v -d -p dscow SLES10_G1 SLES10_G1.dscow
Daemon Configuration:
Plugin: dscow
Daemon: no
Init CoW: no
Verbose: yes
Block Size: 0 KB
Init device:yes
Adding plugin arg 0/2: SLES10_G1
Adding plugin arg 1/2: SLES10_G1.dscow
cowd[4945]: Starting
cowd[4945]: Loaded /usr/local/lib/libcowd_dscow.so
ioctl: LOOP_SET_FD: Device or resource busy
Device SLES10_G1: 0 blocks @ 65552 KB
Creating dm device: SLES10_G1 0 7:0 7:1
device-mapper: create ioctl failed: Device or resource busy
Failed to run device-mapper command!
Failed to create DM device
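
The number 65552 in the cowd output can be cross-checked against the file sizes from the listing: it equals the number of 64 KB blocks in the base image. The sketch below only does that arithmetic. That cowd prints it as "0 blocks @ 65552 KB" may mean the values end up in the wrong fields of the message, but that is a guess and not confirmed by anything in this mail. The function name is made up.

```python
KIB = 1024

IMAGE_BYTES = 4296015872  # SLES10_G1.img, from the ls listing
DSCOW_BYTES = 4296081408  # SLES10_G1.dscow, from the ls listing
BLOCK_KIB = 64            # dscow_tool -b 64


def cow_block_count(image_bytes, block_kib):
    """Number of CoW blocks needed to cover the image, rounding up."""
    block_bytes = block_kib * KIB
    return (image_bytes + block_bytes - 1) // block_bytes


if __name__ == "__main__":
    print(cow_block_count(IMAGE_BYTES, BLOCK_KIB))  # 65552
    # The dscow file is exactly one block larger than the image.
    print((DSCOW_BYTES - IMAGE_BYTES) // KIB)       # 64
```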

Trying to bring it up with cowmount instead:
root@c08b01-0[0]:~/images# cowmount SLES10_G1.dscow /mnt/SLES10_G1
ioctl: LOOP_SET_FD: Device or resource busy
device-mapper: create ioctl failed: Device or resource busy
Failed to run device-mapper command!
Failed to create DM device
Failed to start cowd

dmesg:
device-mapper: table: 254:0: userspace: unknown target type
device-mapper: ioctl: error adding target to table

Reason: cowd does not autoload the dm-mod/dm-user modules
-> always load them manually
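
A small sketch of a check that could be run before starting cowd. It assumes the modules show up in /proc/modules as dm_mod and dm_user (the kernel lists module names with underscores; dm_user is derived from the name used above and is an assumption). Drivers built into the kernel do not appear in /proc/modules, so a "missing" result is only a hint.

```python
def missing_modules(proc_modules_text, wanted=("dm_mod", "dm_user")):
    """Return the wanted module names that are not listed as loaded."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in wanted if name not in loaded]


if __name__ == "__main__":
    try:
        with open("/proc/modules") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc not mounted
    for name in missing_modules(text):
        print("not loaded, try: modprobe", name)
```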

With the modules loaded, both mounting with cowmount and loading with
cowd followed by a mount of /dev/mapper/device fail with the trace below.
I also get this with a Xen image using debug=y & nosmp. It is a Linux bug, so
that was expected, but it removes multi-CPU effects from the list of candidates
for the origin of the bug.

cpu 0x0: Vector: 300 (Data Access) at [c0000000188878b0]
pc: d0000000002cda40: .run_pages_job+0xd0/0x150 [dm_mod]
lr: d0000000002cd9bc: .run_pages_job+0x4c/0x150 [dm_mod]
sp: c000000018887b30
msr: 8000000000009032
dar: 0
dsisr: 40000000
current = 0xc000000003dff800
paca = 0xc0000000005e4100
pid = 4730, comm = kcopyd
enter ? for help
0:mon>

As far as I can read this dump, it is an interrupt taken to vector 0x300 on cpu 0.
0x300 is the "Data Storage interrupt", and DSISR/DAR should say something about the
reason.
DSISR is 0x40000000, i.e. DSISR[33]=1, which means:
Set to 1 if MSR[DR]=1 and the translation for an attempted access is not
found in the primary PTEG or in the secondary PTEG; otherwise set to 0.
DAR is zero. I do not claim to have understood everything about this in the PowerISA
document, so anything I assume now may be wrong. I hope someone reading this
can continue the interpretation from here.
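
A small sketch for turning a DSISR value into bit numbers as the PowerISA document counts them (bit 0 is the most significant bit of the 64-bit register, so a 32-bit DSISR value covers bits 32 to 63). The meaning given for bit 33 is the one quoted above; the one for bit 38 is from memory of the Linux powerpc headers and should be checked against the ISA document. All names are made up.

```python
# Bit meanings: 33 is quoted in this mail, 38 is an assumption to verify.
KNOWN_BITS = {
    33: "translation not found in primary or secondary PTEG",
    38: "access was a store",
}


def dsisr_set_bits(dsisr):
    """Return the IBM-numbered bits (32..63) set in a 32-bit DSISR value."""
    return [32 + i for i in range(32) if dsisr & (0x80000000 >> i)]


def describe(dsisr):
    """Pair every set bit with its meaning, where this sketch knows one."""
    return [
        (bit, KNOWN_BITS.get(bit, "not decoded in this sketch"))
        for bit in dsisr_set_bits(dsisr)
    ]


if __name__ == "__main__":
    for bit, meaning in describe(0x40000000):  # value from the trace
        print("DSISR[%d]: %s" % (bit, meaning))
```

If the assumption about bit 38 holds, its absence here means the failing access was a load.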

The base of .run_pages_job in the disassembly is 0x9970, so the positions of pc (+0xd0) and lr (+0x4c) can be calculated from it.
This is the disassembly of the part of .run_pages_job at 9a40 (pointed to by pc):
9a40: e9 6b 00 00 ld r11,0(r11)
9a44: 42 00 ff fc bdnz+ 9a40 <.run_pages_job+0xd0>
The lr target is also in .run_pages_job, at 99bc (pointed to by lr):
99bc: 60 00 00 00 nop
99c0: 81 3f 00 24 lwz r9,36(r31)

C code of this function (part of kcopyd.c):
static int run_pages_job(struct kcopyd_job *job)
{
        int r;

        job->nr_pages = dm_div_up(job->dests[0].count + job->offset,
                                  PAGE_SIZE >> 9);
        r = kcopyd_get_pages(job->kc, job->nr_pages, &job->pages);
        if (!r) {
                /* this job is ready for io */
                push(&_io_jobs, job);
                return 0;
        }

        if (r == -ENOMEM)
                /* can't complete now */
                return 1;

        return r;
}
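
A sketch of the nr_pages calculation in the function above. It assumes that dm_div_up is a plain round-up division and that count and offset are given in 512-byte sectors (which is what PAGE_SIZE >> 9 suggests). The page size of 4096 is also an assumption. The Python names are made up.

```python
SECTOR_SHIFT = 9  # 512-byte sectors


def div_up(n, size):
    """Round-up integer division, assumed to match dm_div_up."""
    return (n + size - 1) // size


def nr_pages(count, offset, page_size=4096):
    """Pages needed for count + offset sectors (page_size is an assumption)."""
    return div_up(count + offset, page_size >> SECTOR_SHIFT)


if __name__ == "__main__":
    print(nr_pages(128, 0))  # one 64k block = 128 sectors -> 16 pages
    print(nr_pages(0, 0))    # an empty job would ask for 0 pages
```

Note that a job with count + offset equal to 0 results in nr_pages equal to 0.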

Full disassembly of this function. Its sheer size compared to the C code suggests
that something got inlined. The code around the faulting instruction is a loop based on ctr.
run_pages_job itself contains no loop (and nothing the compiler would sensibly turn into one), but the called kcopyd_get_pages contains such a loop, which
makes it a candidate. Also, kcopyd_get_pages does not appear as a separate function in the
disassembly, which leads me to assume that it got completely inlined here:
0000000000009970 <.run_pages_job>:
9970: 7c 08 02 a6 mflr r0
9974: fb 81 ff e0 std r28,-32(r1)
9978: fb a1 ff e8 std r29,-24(r1)
997c: fb c1 ff f0 std r30,-16(r1)
9980: fb e1 ff f8 std r31,-8(r1)
9984: eb c2 00 00 ld r30,0(r2)
9988: f8 01 00 10 std r0,16(r1)
998c: f8 21 ff 71 stdu r1,-144(r1)
9990: 7c 7c 1b 78 mr r28,r3
9994: 60 00 00 00 nop
9998: e9 23 00 60 ld r9,96(r3)
999c: e8 03 01 10 ld r0,272(r3)
99a0: eb e3 00 00 ld r31,0(r3)
99a4: 39 29 00 07 addi r9,r9,7
99a8: 38 7f 00 10 addi r3,r31,16
99ac: 7d 29 02 14 add r9,r9,r0
99b0: 79 3d e8 22 rldicl r29,r9,61,32
99b4: 93 bc 01 18 stw r29,280(r28)
99b8: 48 00 00 01 bl 99b8 <.run_pages_job+0x48>
99bc: 60 00 00 00 nop
99c0: 81 3f 00 24 lwz r9,36(r31)
99c4: 7f 9d 48 40 cmplw cr7,r29,r9
99c8: 40 9d 00 48 ble- cr7,9a10 <.run_pages_job+0xa0>
99cc: 7c 20 04 ac lwsync
99d0: 38 00 00 00 li r0,0
99d4: 38 21 00 90 addi r1,r1,144
99d8: 38 60 00 01 li r3,1
99dc: 90 1f 00 10 stw r0,16(r31)
99e0: 60 00 00 00 nop
99e4: 60 00 00 00 nop
99e8: 60 00 00 00 nop
99ec: e8 01 00 10 ld r0,16(r1)
99f0: eb 81 ff e0 ld r28,-32(r1)
99f4: eb a1 ff e8 ld r29,-24(r1)
99f8: eb c1 ff f0 ld r30,-16(r1)
99fc: eb e1 ff f8 ld r31,-8(r1)
9a00: 7c 08 03 a6 mtlr r0
9a04: 4e 80 00 20 blr
9a08: 60 00 00 00 nop
9a0c: 60 00 00 00 nop
9a10: 38 1d ff ff addi r0,r29,-1
9a14: e9 7f 00 18 ld r11,24(r31)
9a18: 7d 3d 48 50 subf r9,r29,r9
9a1c: 78 0a 00 20 clrldi r10,r0,32
9a20: 91 3f 00 24 stw r9,36(r31)
9a24: 2f aa 00 00 cmpdi cr7,r10,0
9a28: f9 7c 01 20 std r11,288(r28)
9a2c: 41 9e 00 1c beq- cr7,9a48 <.run_pages_job+0xd8>
9a30: 39 2a ff ff addi r9,r10,-1
9a34: 79 29 00 20 clrldi r9,r9,32
9a38: 39 29 00 01 addi r9,r9,1
9a3c: 7d 29 03 a6 mtctr r9
9a40: e9 6b 00 00 ld r11,0(r11)
9a44: 42 00 ff fc bdnz+ 9a40 <.run_pages_job+0xd0>
9a48: e9 2b 00 00 ld r9,0(r11)
9a4c: 38 00 00 00 li r0,0
9a50: f9 3f 00 18 std r9,24(r31)
9a54: f8 0b 00 00 std r0,0(r11)
9a58: 7c 20 04 ac lwsync
9a5c: 90 1f 00 10 stw r0,16(r31)
9a60: eb be 80 50 ld r29,-32688(r30)
9a64: 7f a3 eb 78 mr r3,r29
9a68: 48 00 00 01 bl 9a68 <.run_pages_job+0xf8>
9a6c: 60 00 00 00 nop
9a70: e9 3e 80 08 ld r9,-32760(r30)
9a74: 39 7c 00 08 addi r11,r28,8
9a78: 7c 64 1b 78 mr r4,r3
9a7c: 7f a3 eb 78 mr r3,r29
9a80: e9 49 00 08 ld r10,8(r9)
9a84: f9 3c 00 08 std r9,8(r28)
9a88: f9 69 00 08 std r11,8(r9)
9a8c: f9 6a 00 00 std r11,0(r10)
9a90: f9 4b 00 08 std r10,8(r11)
9a94: 48 00 00 01 bl 9a94 <.run_pages_job+0x124>
9a98: 60 00 00 00 nop
9a9c: 38 21 00 90 addi r1,r1,144
9aa0: 38 60 00 00 li r3,0
9aa4: e8 01 00 10 ld r0,16(r1)
9aa8: eb 81 ff e0 ld r28,-32(r1)
9aac: eb a1 ff e8 ld r29,-24(r1)
9ab0: eb c1 ff f0 ld r30,-16(r1)
9ab4: eb e1 ff f8 ld r31,-8(r1)
9ab8: 7c 08 03 a6 mtlr r0
9abc: 4e 80 00 20 blr
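
To make the loop at 9a10..9a54 easier to reason about, here is a small model of it, written only from a reading of the disassembly above and not from the kernel source. It walks a singly linked list nr - 1 times and then cuts the list. It shows two situations in which the load at 9a40 would hit address 0, which would fit DAR = 0: the list is shorter than the requested number of pages, or nr is 0, so that nr - 1 wraps to 0xffffffff. This is a sketch of possible scenarios, not a confirmed diagnosis, and all names are made up.

```python
class PageList:
    """One node of a singly linked list; 'next' is at offset 0 in the asm."""

    def __init__(self, next_node=None):
        self.next = next_node


class DataAccessFault(Exception):
    """Stands in for the Data Storage interrupt with DAR = 0."""


def build_free_list(length):
    head = None
    for _ in range(length):
        head = PageList(head)
    return head


def list_length(node):
    n = 0
    while node is not None:
        n += 1
        node = node.next
    return n


def take_pages(head, nr):
    """Detach nr nodes from the list; return (taken_head, remaining_head)."""
    steps = (nr - 1) & 0xFFFFFFFF      # addi r0,r29,-1 ; clrldi r10,r0,32
    node = head                        # ld r11,24(r31)
    for _ in range(steps):             # mtctr r9 ... bdnz+ 9a40
        if node is None:
            raise DataAccessFault("load from address 0 inside the loop")
        node = node.next               # 9a40: ld r11,0(r11)
    if node is None:
        raise DataAccessFault("load from address 0 after the loop")
    remaining = node.next              # 9a48: ld r9,0(r11)
    node.next = None                   # 9a54: std r0,0(r11)
    return head, remaining


if __name__ == "__main__":
    taken, rest = take_pages(build_free_list(4), 2)
    print(list_length(taken), list_length(rest))
    try:
        take_pages(build_free_list(4), 0)
    except DataAccessFault as fault:
        print("nr = 0:", fault)
```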

--

Grüsse / regards, Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx

IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen Geschäftsführung: Herbert Kircher Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

