xen-ppc-devel


To: Christian Ehrhardt <ehrhardt@xxxxxxxxxxxxxxxxxx>
Subject: Re: [XenPPC] Re: libvirt status of bugt racking regarding bad paddr (patch)
From: Christian Ehrhardt <ehrhardt@xxxxxxxxxxxxxxxxxx>
Date: Mon, 02 Jul 2007 15:10:37 +0200
Cc: xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Mon, 02 Jul 2007 06:08:30 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <4684C525.4000505@xxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-ppc-devel-request@lists.xensource.com?subject=help>
List-id: Xen PPC development <xen-ppc-devel.lists.xensource.com>
List-post: <mailto:xen-ppc-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-ppc-devel>, <mailto:xen-ppc-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-ppc-devel>, <mailto:xen-ppc-devel-request@lists.xensource.com?subject=unsubscribe>
References: <468114CE.1050204@xxxxxxxxxxxxxxxxxx> <1182871297.5819.11.camel@laptop> <4682929C.10803@xxxxxxxxxxxxxxxxxx> <46838F13.3070409@xxxxxxxxxxxxxxxxxx> <4684C525.4000505@xxxxxxxxxxxxxxxxxx>
Sender: xen-ppc-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird 1.5.0.10 (X11/20070301)
The change that brought the 64bit padded structures was this one:
changeset:   13594:30af6cfdb05c
user:        kfraser@xxxxxxxxxxxxxxxxxxxxx
date:        Wed Jan 24 16:33:19 2007 +0000
summary:     Make domctl/sysctl interfaces 32-/64-bit invariant.

It raised the interface version to 5, which matches the comments seen in the libvirt code.

Until someone suggests a nicer way to implement this, I will change my workaround a bit and leave it in place together with some assertions. As far as I could check the background of the structure "xen_sysctl_getdomaininfolist", it is expected to be set up with the 64-bit padding since interface version 5. Our guest sets it up as 32 bit + padding, which breaks our native 64-bit interpretation because the 32-bit words are combined the other way around on 64-bit power.

Since this currently seems to be a power-only issue, I add the following to arch/powerpc/platforms/xen/hcall.c: assert that the incoming pointer in interface version 5 is read as "PaddingWordWithZero WordWith32bitPointer" by checking for two conditions, each of which leads to a WARN:
 -> (buffer.p & 0x00000000FFFFFFFF) != 0x0
 -> (buffer.p & 0xFFFFFFFF00000000) == 0x0
Then I shift the value right by 32 bits.
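
The check-and-shift described above could look roughly like the following sketch. It is plain userspace C so it can be tried outside the kernel; the function name fixup_handle_v5 and the warned out-parameter are made up for illustration and only stand in for the WARN in hcall.c. It is not the code from the attached patch.

```c
#include <stdint.h>

/* Hypothetical helper (name and signature are assumptions).
 * Expects the 64-bit handle to carry the 32-bit pointer in its upper
 * word and zero padding in its lower word. Sets *warned to 1 if either
 * of the two conditions from the mail matches, then returns the value
 * shifted right by 32 bits. */
uint64_t fixup_handle_v5(uint64_t p, int *warned)
{
    *warned = 0;
    if ((p & 0x00000000FFFFFFFFULL) != 0)   /* padding word is not zero */
        *warned = 1;
    if ((p & 0xFFFFFFFF00000000ULL) == 0)   /* no pointer in the upper word */
        *warned = 1;
    return p >> 32;
}
```

With the value from the console output, fixup_handle_v5(0xfff17ec000000000, &warned) returns 0x00000000fff17ec0 and leaves warned at 0.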

The patch is attached and I welcome any more complete approach (compat wrapper, flags, whatever ...). There is still an "arch/powerpc/platforms/xen/hcall.c:488: warning: assignment from incompatible pointer type" and I'm not happy with it, but I played around with the variable and was not able to bring it back from a shiftable type to something that assigns without a type warning.
@Hollis - once we are finished with the patch after some evolution, you might want to fold it into our 2.6.18 patch queue before submission.


Christian Ehrhardt wrote:
I verified my endian/32/64-bit based assumption.
Time runs short today. So that no one overlooks the questions I hope to read some comments on by Monday, I put them here at the head ;)
- Do we need a compat wrapper for the syscall here? Or do we have one already, but it is not used?
- Should we fix this on the kernel side, or is the call from libvirt not fully standard compliant and we should fix that call/structure setup in libvirt?

The userspace code uses this structure to set up the address.
v* is the address we later fail to interpret correctly:

/* As of HV version 2, sysctl version 3 the *buffer pointer is 64-bit aligned * */
struct xen_v2s3_getdomaininfolistop {
   domid_t   first_domain;
   uint32_t  max_domains;
   union {
       struct xen_v2d5_getdomaininfo *v;
       uint64_t pad ALIGN_64;
   } buffer;
   uint32_t  num_domains;
};
typedef struct xen_v2s3_getdomaininfolistop xen_v2s3_getdomaininfolistop;
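
To see what the 64-bit alignment does to the layout, the struct can be rebuilt in a small standalone sketch. Two things are assumptions here and not taken from this mail: that domid_t is a 16-bit type and that ALIGN_64 expands to the GCC/Clang aligned(8) attribute. The three helper functions are hypothetical names that just report the layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch only. */
typedef uint16_t domid_t;
#define ALIGN_64 __attribute__((aligned(8)))

struct xen_v2d5_getdomaininfo;   /* opaque here, only used via pointer */

struct xen_v2s3_getdomaininfolistop {
    domid_t   first_domain;
    uint32_t  max_domains;
    union {
        struct xen_v2d5_getdomaininfo *v;
        uint64_t pad ALIGN_64;
    } buffer;
    uint32_t  num_domains;
};

/* Offset of the padded buffer union inside the struct. */
size_t buffer_offset(void)
{
    return offsetof(struct xen_v2s3_getdomaininfolistop, buffer);
}

/* Width of the buffer union; the pad member forces 8 bytes. */
size_t buffer_size(void)
{
    return sizeof(((struct xen_v2s3_getdomaininfolistop *)0)->buffer);
}

/* Offset of num_domains, which follows the padded buffer. */
size_t num_domains_offset(void)
{
    return offsetof(struct xen_v2s3_getdomaininfolistop, num_domains);
}
```

Under these assumptions the buffer sits at offset 8 and is 8 bytes wide whether the build uses 32-bit or 64-bit pointers, so a 32-bit pointer only fills half of it.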

As far as I understand it, this will give the following 64-bit view of v*.
Set up from a 32-bit userspace:
 0xff??????PADPADPA
Set up from a 64-bit userspace:
 0x00000000ff?????? (pad completely overwritten by the v* assignment)

Now it may be possible that the 64-bit view is twisted because of the little/big
endian interpretation of a 64-bit value (H8byteL8byte vs. L8byteH8byte).
Therefore IA64 might read the correct 0x00000000FF?????? there, but it does not
work that way on 64-bit powerpc.
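
The byte-order effect suspected here can be tried on any machine with the following sketch. It does not use a real union; it writes a 32-bit value into the first four bytes of a zeroed 8-byte slot and reads the slot back as one 64-bit number, with the byte order chosen by a parameter. The function name view_as_64bit is made up for this illustration.

```c
#include <stdint.h>
#include <string.h>

/* Store ptr32 into the first 4 bytes of a zeroed 8-byte slot, the way a
 * 32-bit userspace fills the pointer member of the union, then read all
 * 8 bytes back as one 64-bit number. big_endian selects the byte order
 * used for both the store and the load, so the result does not depend
 * on the host this runs on. */
uint64_t view_as_64bit(uint32_t ptr32, int big_endian)
{
    unsigned char slot[8];
    uint64_t value = 0;
    int i;

    memset(slot, 0, sizeof(slot));
    for (i = 0; i < 4; i++) {
        int shift = big_endian ? (24 - 8 * i) : (8 * i);
        slot[i] = (unsigned char)(ptr32 >> shift);
    }
    for (i = 0; i < 8; i++) {
        int shift = big_endian ? (56 - 8 * i) : (8 * i);
        value |= (uint64_t)slot[i] << shift;
    }
    return value;
}
```

For 0xfff40ec0 the big-endian view is 0xfff40ec000000000 and the little-endian view is 0x00000000fff40ec0, which is the difference described above.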

Verify this by filling the pad and checking the workaround output:
+ // DEBUG initialize union padding var with two identifiable 32bit sides
+ op.u.getdomaininfolists3.buffer.pad = 0xDEADAAAADEADBBBB;
 op.u.getdomaininfolists3.buffer.v = dominfos->v2d5;
The debug output of my workaround in hcall.c shows the expected result (the value is
changed by misinterpreting the padding as the lower 32-bit half):
xenppc_privcmd_sysctl - hack changing kern_op.u.getdomaininfolist.buffer.p 'fff40ec0deadbbbb' to kern_op.u.getdomaininfolist.buffer.p>>32 '00000000fff40ec0'

Christian Ehrhardt wrote:
Anyone not interested in how I came to the fix might want to read just
the last few paragraphs of this response.

First of all, dominfos is an output structure, so it is OK if it is zeroed,
but all the pointers/structures need to be valid.
It is also a union, so having the same addresses in the
v0/v2/v2d5 pointers in there is OK.

In 'virXen_getdomaininfolist' it is of type 'xen_getdomaininfolist', and
this variable is allocated one step earlier in the stack, in 'virXen_getdomaininfo'.
With my current debug patch I allocate the structure in
virXen_getdomaininfo and do
memset(&dominfos, 0, sizeof(dominfos));
This should ensure a flat and empty union structure that just consists
of its (zeroed) pointer to the actual structures (one of them at a time):
    union xen_getdomaininfolist {
        struct xen_v0_getdomaininfo *v0;
        struct xen_v2_getdomaininfo *v2;
        struct xen_v2d5_getdomaininfo *v2d5;
    };
    typedef union xen_getdomaininfolist xen_getdomaininfolist;
One step further up in the stack is the local allocation of
'xen_getdomaininfo info;' in the function 'xenHypervisorInit'.
This contains the real sub-structs, not only pointers, but it is also a
union and therefore holds only one kind of them at a time.
    union xen_getdomaininfo {
        struct xen_v0_getdomaininfo v0;
        struct xen_v2_getdomaininfo v2;
        struct xen_v2d5_getdomaininfo v2d5;
    };
    typedef union xen_getdomaininfo xen_getdomaininfo;
In 'virXen_getdomaininfo' the code then assigns the "needed" union
substructure pointer to dominfos.v0/v2/v2d5. This assignment depends on the global vars
hypervisor_version/sys_interface_version.

It is possible that the small patch I posted is not needed because the
union and C handle this, but relying on that is very unreadable and the patch
does nothing "wrong", so I leave this part in.
        if (sys_interface_version < 3)
                dominfos.v2 = &(dominfo->v2);
        else
                dominfos.v2d5 = &(dominfo->v2d5);
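
That the union already makes all three pointers read the same address can be checked with a small sketch. The struct bodies below are placeholders, since the real layouts are not part of this mail, and union_members_alias is a hypothetical helper name.

```c
/* Placeholder bodies: the real structs differ, only the union shape
 * is copied from the mail. */
struct xen_v0_getdomaininfo   { int placeholder; };
struct xen_v2_getdomaininfo   { int placeholder; };
struct xen_v2d5_getdomaininfo { int placeholder; };

union xen_getdomaininfo {
    struct xen_v0_getdomaininfo v0;
    struct xen_v2_getdomaininfo v2;
    struct xen_v2d5_getdomaininfo v2d5;
};

union xen_getdomaininfolist {
    struct xen_v0_getdomaininfo *v0;
    struct xen_v2_getdomaininfo *v2;
    struct xen_v2d5_getdomaininfo *v2d5;
};

/* Returns 1 if assigning only .v2d5 makes .v0 and .v2 read back the
 * same address, and the members of the inner union share one address. */
int union_members_alias(void)
{
    union xen_getdomaininfo info;
    union xen_getdomaininfolist list;

    list.v2d5 = &info.v2d5;
    return (void *)list.v0 == (void *)&info.v0
        && (void *)list.v2 == (void *)&info.v2
        && (void *)&info.v0 == (void *)&info.v2d5;
}
```

This matches the debug output, where v0, v2 and v2d5 always print the same value after a single assignment.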

->Check if the Buggy 0xff??????00000000 is submitted that way.
More related debug + gdb breaks at
xenHypervisorDoV2Sys
-> check *op
virXen_getdomaininfo
-> check *dominfo dominfos and assignment

Without the gdb output, for better readability:
virXen_getdomaininfo - dominfos after zeroing - dominfos '0xffa4ddcc'
dominfos.v0 '(nil)' dominfos.v2 '(nil)' dominfos.v2d5 '(nil)'
virXen_getdomaininfo - assigning &(dominfo->v2) '0xffa4deb0'
virXen_getdomaininfo - dominfos how it is passed - dominfos '0xffa4ddcc'
dominfos.v0 '0xffa4deb0' dominfos.v2 '0xffa4deb0' dominfos.v2d5 '0xffa4deb0'
virXen_getdomaininfolist - enter for firstdomain '0' maxids '1' on
handle '8'
virXen_getdomaininfolist - sys_interface_version '2' hypervisor_version '2'
virXen_getdomaininfolist - dominfos received - dominfos '0xffa4ddcc'
dominfos->v0 '0xffa4deb0' dominfos->v2 '0xffa4deb0' dominfos->v2d5
'0xffa4deb0'
virXen_getdomaininfolist - sleepwait
virXen_getdomaininfolist - allocated new and clean xen_op_v2_sys
virXen_getdomaininfolist - assiigning getdomaininfolist stuff
virXen_getdomaininfo - dominfos after zeroing - dominfos '0xffa4ddcc'
dominfos.v0 '(nil)' dominfos.v2 '(nil)' dominfos.v2d5 '(nil)'
virXen_getdomaininfo - assigning &(dominfo->v2d5) '0xffa4deb0'
virXen_getdomaininfo - dominfos how it is passed - dominfos '0xffa4ddcc'
dominfos.v0 '0xffa4deb0' dominfos.v2 '0xffa4deb0' dominfos.v2d5 '0xffa4deb0'
virXen_getdomaininfolist - enter for firstdomain '0' maxids '1' on
handle '8'
virXen_getdomaininfolist - sys_interface_version '3' hypervisor_version '2'
virXen_getdomaininfolist - dominfos received - dominfos '0xffa4ddcc'
dominfos->v0 '0xffa4deb0' dominfos->v2 '0xffa4deb0' dominfos->v2d5
'0xffa4deb0'
virXen_getdomaininfolist - sleepwait
virXen_getdomaininfolist - allocated new and clean xen_op_v2_sys
virXen_getdomaininfolist - assiigning getdomaininfolist3 stuff
Using hypervisor call v2, sys ver3 dom ver5

Which union member op.u.??? gets used depends on the global vars,
and it gets assigned the related v2/v2d5 addresses.
The buffer of the v2d5 is now padded to 64 bit, and because of that
buffer.v is valid too.
At that point the submitted op structure, sent via
    ret = ioctl(handle, xen_ioctl_hypercall_cmd, (unsigned long) &hc);
is still OK (0xff.. 32 bit) and not the wrongly interpreted one
(0xff??????00000000).

BTW @Hollis&Jerone ;-)
The ioctl with sys interface version 2, issued before our current bug
scenario, is the famous "python 2!=3"
kernel message I already debugged in another scenario a few weeks ago
(those were xend issues).

So check the kernel backend: apparently libvirt submits a valid
pointer which is then somehow expanded to 64 bit the wrong way.
The access in hcall.c looks the same,
kern_op.u.getdomaininfolist.buffer.p, but this one already reads the
64-bit value 0xff??????00000000.

The kernel does a copy_from_user with the incoming userspace address and
deals with the struct as xen_sysctl_t.
More debug statements showed that the address returned from
'xen_guest_handle' is the one which is too big. Try to debug this function;
maybe it is only a default implementation missing a shift for powerpc or
something similar.

For our platform it is defined as follows, which accesses the 64-bit
padded .v sub-element we have already seen in libvirt:
#define xen_guest_handle(hnd)  ((hnd).p)

Test a dirty hack to verify this assumption by shifting (hnd).p>>32.

Test output:
xenppc_privcmd_sysctl - desc 'bff17ec000000000' for
                        kern_op.u.getdomaininfolist.buffer.p
'fff17ec000000000'

xen_guest_handle(kern_op.u.getdomaininfolist.buffer) 'fff17ec000000000'
                        user_op from '00000000fff17d00'
xenppc_privcmd_sysctl - verify hack changing
                        kern_op.u.getdomaininfolist.buffer.p
'fff17ec000000000' to
                        kern_op.u.getdomaininfolist.buffer.p>>32
'00000000fff17ec0'

Dirty Workaround (patch for xenppc linux attached):
libvirt debug output:
virXen_getdomaininfolist - allocated new and clean xen_op_v2_sys
virXen_getdomaininfolist - assiigning getdomaininfolist3 stuff
xenHypervisorDoV2Sys - submitting ioctl with op->interface_version '3'
with op->u.getdomaininfolist.buffer '0xffcbfec0'
                       op->u.getdomaininfolists3.buffer.v '0xffcbfec0'
op (hc.arg[0]) @ '0xffcbfd00' &hc '0xffcbfb90'
Using hypervisor call v2, sys ver3 dom ver5
post Initialize Libvirtmod / pre py_InitModule

Xen&Linux console:
[  214.408084] xenppc_privcmd_sysctl - hack changing
kern_op.u.getdomaininfolist.buffer.p 'ffcbfec000000000' to kern_op.u.getdomaininfolist.buffer.p>>32 '00000000ffcbfec0'
[  214.408113] xenppc_privcmd_sysctl - verify hack
               kern_op.u.getdomaininfolist.buffer.p '00000000ffcbfec0'
[  214.408130] xenppc_privcmd_sysctl - desc '0000000011835000' for
               kern_op.u.getdomaininfolist.buffer.p '00000000ffcbfec0'
               xen_guest_handle(kern_op.u.getdomaininfolist.buffer)
'00000000ffcbfec0'
               user_op from '00000000ffcbfd00'
(XEN) do_sysctl: cmd '6'
(XEN)      op->u.getdomaininfolist.first_domain '0'
(XEN)      op->u.getdomaininfolist.max_domains '1'
(XEN)      op->u.getdomaininfolist.buffer.p '0000000011835000'
(XEN)      adress of info '0000000000227b78'
(XEN) do_sysctl: iterate domain '0' num_domains '0'
(XEN) do_sysctl: post getdomaininfo, pre copy_to_guest_offset
(XEN) do_sysctl: pre copy_to_guest u_sysctl.p '800000000b0cf810' op
'0000000000227ae0'

So the bug is fixed that way, but it needs some polishing to be a real
fix we can commit.
Maybe someone with more ppc 32/64-bit and big/little endian experience could write this in a nice way and easily see why this happens. Otherwise I will try
to do that tomorrow the hard way.

We also have to check whether this issue only occurs for sysctl getdomaininfo
or whether it is a general xen_guest_handle bug.

@Hollis - when we have the final, not so 'hacky' version of this fix, you
might want to fold it into our patch queue for 2.6.18 before committing.


Christian Ehrhardt wrote:
This is how the addresses are passed through the stack:
(read this monospaced)
userspace-libvirt    linux-kernel                            Xen
alloc/ioctl       -> incoming         -> xencomm_map      -> incoming         -> xencomm_inline_to_guest
0xffba2eb0        -> ffba2eb000000000 -> bfba2eb000000000 -> bfba2eb000000000 -> 0x3fba2eb000000000

The xencomm_map function is from our patch queue for the xen-2.6.18 powerpc merge.
It uses xencomm_create_inline, which applies xencomm_pa and adds the XENCOMM_INLINE_FLAG flag.
With the correct 0x80000000 00000000 addresses this works, but it fails for
the wrong 0xff... address received in the bug scenario.
All effects further down are only subsequent effects.
Since the numeric value of 0xffba2eb0 in userspace differs from the one received by the kernel, 0xffba2eb000000000, and it's just the upper/lower half
that is twisted, it may be some kind of little/big endian bug here
(0x00000000ffba2eb0 would be right).
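
One piece of arithmetic that reproduces the numbers in the table above is sketched below. It rests on two assumptions that are not stated in this thread: that xencomm_pa subtracts a ppc64 kernel base of 0xC000000000000000 and that XENCOMM_INLINE_FLAG is the top bit. The function and macro names are made up, so treat this as a plausibility check and not as the real xencomm code.

```c
#include <stdint.h>

/* Assumed constants, not taken from the thread. */
#define SKETCH_PAGE_OFFSET 0xC000000000000000ULL /* assumed kernel base */
#define SKETCH_INLINE_FLAG 0x8000000000000000ULL /* assumed "inline" marker bit */

/* Kernel side: treat vaddr as a kernel virtual address, convert it to a
 * physical address and tag it as an inline handle. */
uint64_t sketch_create_inline(uint64_t vaddr)
{
    uint64_t paddr = vaddr - SKETCH_PAGE_OFFSET;
    return paddr | SKETCH_INLINE_FLAG;
}

/* Hypervisor side: strip the flag again to get the guest physical address. */
uint64_t sketch_inline_to_guest(uint64_t handle)
{
    return handle & ~SKETCH_INLINE_FLAG;
}
```

Under these assumptions 0xffba2eb000000000 becomes 0xbfba2eb000000000 in the kernel and 0x3fba2eb000000000 in Xen, the same values as in the table.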

In libvirt src/xen_internal.c the assignment of the buffer struct is a bit
confusing because it relies on global variables through the stack etc.
This is how I think it works (or rather doesn't), and somewhere here I get lost:
-> the identified userspace call causing the bug is xenHypervisorDoV2Sys
  with hypervisor_version=2 && sys_interface_version=3
-> the identified call used the dominfos->v2d5 to assign it to buffer.v
-> difference of v2 and v2d5 : v2d5 uses ALIGN_64 for all 64 bit vars which is
  sysctl version 3 style
-> also the .v above is used to pack buffer in a 64bit aligned struct for sysctl
  version 3
-> The dominfo struct used here is assigned in virXen_getdomaininfo and is
  &(dominfo->v2); or &(dominfo->v0);
-> I already debugged this, the critical call comes from xenHypervisorInit.
  Here the local &info struct gets passed to virXen_getdomaininfo, which is
  later accessed with &(dominfo->v2);
=> The dominfos->v2d5 should be empty, maybe that's why it fails?
 1. info of type xen_getdomaininfo is allocated locally in xenHypervisorInit
 2. then it is passed to virXen_getdomaininfo
 3. virXen_getdomaininfo allocates a new local xen_getdomaininfolist
    called dominfos
 4. depending on hypervisor_version, the v0 or the v2 part of the
    submitted info struct gets assigned there (!no v2d5 assignment!)
 5. now we should have a xen_getdomaininfolist that is filled with either
    "dominfos.v0=&(dominfo->v0);" or "dominfos.v2=&(dominfo->v2);", but v2d5 is
    never assigned and therefore undefined
 6. this dominfos struct is passed by reference to virXen_getdomaininfolist
 7. depending on hypervisor_version and sys_interface_version, the code now
    accesses the v2d5 field in our bug scenario (which should be undefined
    in the best case)
 8. this op struct with the assignment from v2d5 as buffer.v now gets passed
    to Xen, which later fails to handle it

Because the stack is not cleaned and the calls to sysctl version 2/3 are very close and similar, it might be possible that we accidentally access the data of the v2 call with the v3 layout, and because of that our buffer address may be
shifted 32 bits.

As a quick test I did the following in virXen_getdomaininfo:
memset(&dominfos, 0, sizeof(dominfos));
-> zeroing a local variable should never hurt unless something like the described buggy stack usage is around, and as expected I get segmentation faults if I zero the
dominfos struct.

The initial approach to fix that is:
   if (hypervisor_version < 2) {
       dominfos.v0 = &(dominfo->v0);
   } else {
       if (sys_interface_version < 3)
               dominfos.v2 = &(dominfo->v2);
       else
               dominfos.v2d5 = &(dominfo->v2d5);
   }

But I realized that &(dominfo->v2) == &(dominfo->v0) == &(dominfo->v2d5). All three should be subparts of a struct, and therefore the addresses should not
be the same.

union xen_getdomaininfolist {
   struct xen_v0_getdomaininfo *v0;
   struct xen_v2_getdomaininfo *v2;
   struct xen_v2d5_getdomaininfo *v2d5;
};
typedef union xen_getdomaininfolist xen_getdomaininfolist;

Here is the debugging output related to this from gdb and my debugging
statements.
 gdb --args /usr/bin/python mytest.py
I may be overlooking something, but I think the addresses should be
different and the "print *op" in xenHypervisorDoV2Sys should not always see
getdomaininfolist and getdomaininfolists3 filled at the same time.
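
The decimal pad values that gdb prints in this mail are easier to compare with the pointers once they are split into two 32-bit words. A small sketch for that, with helper names that are made up for this illustration:

```c
#include <stdint.h>

/* Upper 32 bits of the 64-bit pad (the first word in memory on a
 * big-endian machine). */
uint32_t pad_high_word(uint64_t pad)
{
    return (uint32_t)(pad >> 32);
}

/* Lower 32 bits of the 64-bit pad (the second word in memory on a
 * big-endian machine). */
uint32_t pad_low_word(uint64_t pad)
{
    return (uint32_t)(pad & 0xFFFFFFFFULL);
}
```

For pad = 18424963504277553153 the high word is 0xffb29eb0, the same value gdb shows for buffer, and the low word is 1, the same value getdomaininfolist shows as num_domains in that dump. That is what one would expect if both members are simply two views of the same union bytes, so seeing both "filled" at once is not by itself a sign of a bug.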

Breakpoint 2 at 0xf671534: file xen_internal.c, line 824.
Pending breakpoint "xenHypervisorDoV2Sys" resolved
Initialize Libvirtmod (sleepwait)
virXen_getdomaininfo - assigning &(dominfo->v2) '0xffb29eb0'
virXen_getdomaininfolist - enter for firstdomain '0' maxids '1' on handle '9'
virXen_getdomaininfolist - sys_interface_version '2' hypervisor_version '2'
virXen_getdomaininfolist - dominfos->v2 '0xffb29eb0' dominfos->v2d5
'0xffb29eb0'
virXen_getdomaininfolist - sleepwait
[Switching to Thread 0xf7fc2000 (LWP 4876)]
Breakpoint 2, xenHypervisorDoV2Sys (handle=9, op=0xffb29d00) at
xen_internal.c:824
warning: Source file is more recent than executable.
824         memset(&hc, 0, sizeof(hc));
(gdb) print *op
$1 = {cmd = 6, interface_version = 0, u = {getdomaininfolist = {first_domain = 0, max_domains = 1, buffer = 0xffb29eb0, num_domains = 1}, getdomaininfolists3 = {first_domain = 0, max_domains = 1, buffer = {v = 0xffb29eb0, pad = 18424963504277553153}, num_domains = 0}, getschedulerid = {sched_id = 0}, padding = {0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 255 '', 178 '', 158 '\236', 176 '', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 0 '\0' <repeats 112 times>}}}
(gdb) c
Continuing.
virXen_getdomaininfo - assigning &(dominfo->v2d5) '0xffb29eb0'
virXen_getdomaininfolist - enter for firstdomain '0' maxids '1' on handle '9'
virXen_getdomaininfolist - sys_interface_version '3' hypervisor_version '2'
virXen_getdomaininfolist - dominfos->v2 '0xffb29eb0' dominfos->v2d5
'0xffb29eb0'
virXen_getdomaininfolist - sleepwait
Breakpoint 2, xenHypervisorDoV2Sys (handle=9, op=0xffb29d00) at
xen_internal.c:824
824         memset(&hc, 0, sizeof(hc));
(gdb) print *op
$2 = {cmd = 6, interface_version = 0, u = {getdomaininfolist = {first_domain = 0, max_domains = 1, buffer = 0xffb29eb0, num_domains = 0}, getdomaininfolists3 = {first_domain = 0, max_domains = 1, buffer = {v = 0xffb29eb0, pad = 18424963504277553152}, num_domains = 1}, getschedulerid = {sched_id = 0}, padding = {0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 255 '', 178 '', 158 '\236', 176 '', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 0 '\0' <repeats 108 times>}}}
(gdb) c
Continuing.
-> Bug

I attached the current debug/workaround/testfix diff (not nice but working ;-) ) as well as my minimal debug python script.
It's late today - have fun continuing, Jerone ;)

Jerone Young wrote:
On Tue, 2007-06-26 at 15:29 +0200, Christian Ehrhardt wrote:
Hi,
here is a very crude text that shows how far I got today. It's more or less a text file where I copy&paste everything I find over the day. It is a lot of stuff, but reading it brings us completely onto the same page on this issue. It should be reasonably readable if you read it top->down.
BTW - I got no mail from you yesterday. This might have two reasons:
a) you did not send one - well, that's no problem for me ;)
b) you sent one and I didn't receive it - I had this issue once with Hollis, where mails took up to 5 days. Since then we always used notes & imap mail addresses for the mails to be sure (it was an imap server issue)

First FYI:
In addition to your mail you need the following to be able to compile libvirt, virt-manager and virtinst:
libssldevel
ncurses-devel
libtool
libxml2
gnutls
python-urlgrabber
pygtk2-devel
gtk2-devel
now you have a complete list ;)

Thanks I probably just didn't put these there. But yes all these need to
be added to the grand list :-)

Then, if you have your custom xen with 2.6.18 xenolinux somewhere, you need to make the headers available:
ln -s /root/xen_2.6.18/xen-unstable.hg/linux-2.6.18-xen.hg/include/xen/public/ /usr/include/xen/linux
ln -s /root/xen_2.6.18/xen-unstable.hg/xen/include/public /usr/include/xen
Replace /root/xen_2.6.18/xen-unstable.hg with your xen directory.
Replace xen_2.6.18/xen-unstable.hg/linux-2.6.18-xen.hg with your xenolinux directory.

Well, what you really need is just the xen headers, and they can be found
in the xen source. Just downloading xen-unstable and doing "make
xen-install" (I think that's it) will install all the needed headers
onto the system.


install virtinst
The compilation of virtinst causes the same bug as described for "virsh -r -t -d 5 connect". It occurs while trying to execute autobuild. I isolated it to the "python setup.py test" part of the
autobuild process of virtinst.
(XEN) pfn2mfn: Dom[0] pfn 0x3fefdfc000000 is not a valid page
(XEN) paddr_to_maddr: Dom:0 bad paddr: 0x3fefdfc000000000

I have a feeling this has little to do with virtinst and more to do with
python exposing a problem in our Xen kernel.


In setup.py I tracked it down to the call that also appears in the stack trace:
The buggy thing is the "import tests.xmlconfig" that gets generated.
If I skip the xmlconfig part, test.coverage gets loaded without issues,
which implies that the load of test.xmlconfig should not be a
path/directory issue.

This is very interesting. I figure another python module would set this
off also. Maybe test.xmlconfig is executing something that the others
are not.

-> A simple python script just containing the "import tests.xmlconfig" causes the bug, use this to debug this issue
-> The code printing that "bad paddr" is in "arch/powerpc/usercopy.c"
=> where is the connection


Ok yep, just as I figured. I'm willing to bet this starts in
arch/powerpc/platforms/xen/hcall.c

I'll see if I get some free time this evening to create the 2.6.18 kernel
you guys have, to narrow it down even more.



pbclient4:~/libvirt/virtinst-0.103.0 # cat mytest.py
import pdb
pdb.set_trace();
import tests.xmlconfig

Tracked down with pdb debugger (every -> is the triggering function one step deeper in the stack)
pbclient4:~/libvirt/virtinst-0.103.0 # gdb --args /usr/bin/python mytest.py
import xmlconfig
-> From Guest Guest
-> import libvirt
This is not virtinst, it's the libvirt python binding in
/usr/local/python2.5/site-packages/libvirt.py
-> import libvirtmod
=> this is the C mapper to map python to C functions (partially generated code)
-> in that code the bug is in virInitialize() which is called once initially
+use GDB to debug virInitialize
This is no longer the python bindings of libvirt/python, it is src/libvirt.c:57
-> without bug through some virRegisterDriver (driver=0xf6b7488) at libvirt.c:222 for test & qemu
-> fails at
74      #ifdef WITH_XEN
75          if (xenUnifiedRegister () == -1) return -1;
-> Breakpoint 2, xenUnifiedRegister () at xen_unified.c:923
-> bug at/below xenHypervisorInit()
-> Breakpoint 2, xenHypervisorInit () at xen_internal.c:1703
-> bug at/below 1839 if (virXen_getdomaininfo(fd, 0, &info) == 1) {
-> wrapper to virXen_getdomaininfolist()
=> the working one is later "Using hypervisor call v2, sys ver3 dom ver5" so
the buggy one should just fail and go on without bad paddr
-> hyp v 2 sys v 2 working but not the "wanted" one
-> hyp v 2 sys v 3
-> last libvirt function in buggy call stack is "xenHypervisorDoV2Sys" it issues an ioctl hypercall that then fails at the known bad paddr


Last critical point in Dom0 userspace is the ioctl call with the parameters:
(gdb) print hc.op
$14 = 35
(gdb) print *op
$15 = { cmd = 6,
    interface_version = 0,
    u = {getdomaininfolist = {    first_domain = 0,
                    max_domains = 1,
                    buffer = 0xff875eb0,
                    num_domains = 0},
         getdomaininfolists3 = {    first_domain = 0,
                    max_domains = 1,
buffer = {v = 0xff875eb0, pad = 18412789711534817280},
                    num_domains = 1},
    getschedulerid = {sched_id = 0},
padding = {0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 255 '', 135 '\207', 94 '^', 176 '', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 0 '\0', 1 '\001', 0 '\0' <repeats 108 times>}}}
(gdb) print sys_interface_version
$16 = 3

This ends in the bad paddr report - see how the address has been modified to 0x3f..., removing the first 0xf. I assume this is where things go wrong.

The ioctl is executed and we see the buffer address as the one reported as bad:
(XEN) pfn2mfn: Dom[0] pfn 0x3f875eb000000 is not a valid page
(XEN) paddr_to_maddr: Dom:0 bad paddr: 0x3f875eb000000000
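
The two Xen messages describe the same address: the pfn is the bad paddr with the in-page offset bits dropped. A minimal check of that relation, assuming 4 KiB pages (a page shift of 12, which is an assumption and not stated in this mail); the names are made up for illustration:

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumes 4 KiB pages */

/* Page frame number for a guest physical address. */
uint64_t paddr_to_pfn(uint64_t paddr)
{
    return paddr >> SKETCH_PAGE_SHIFT;
}
```

paddr_to_pfn(0x3f875eb000000000) gives 0x3f875eb000000, the pfn from the first message. The digits f875eb0 of the userspace buffer address 0xff875eb0 are still visible in it, only in the upper half of the 64-bit value.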

Background of the used hypercall:
Hypercall 35 with command 6 is : do_sysctl which is in common/sysctl.c and
maps to arch/powerpc/sysctl.c if needed
command 6 is handled by common code
DEFINITION:
#define XEN_SYSCTL_getdomaininfolist 6
struct xen_sysctl_getdomaininfolist {
    /* IN variables. */
    domid_t               first_domain;
    uint32_t              max_domains;
    XEN_GUEST_HANDLE_64(xen_domctl_getdomaininfo_t) buffer;
    /* OUT variables. */
    uint32_t              num_domains;
};
IMPLEMENTATION (common code):
    case XEN_SYSCTL_getdomaininfolist:
    {
        struct domain *d;
        struct xen_domctl_getdomaininfo info;
        u32 num_domains = 0;

        rcu_read_lock(&domlist_read_lock);

        for_each_domain ( d )
        {
if ( d->domain_id < op->u.getdomaininfolist.first_domain )
                continue;
            if ( num_domains == op->u.getdomaininfolist.max_domains )
                break;

            getdomaininfo(d, &info);

            if ( copy_to_guest_offset(op->u.getdomaininfolist.buffer,
                                      num_domains, &info, 1) )
            {
                ret = -EFAULT;
                break;
            }

            num_domains++;
        }

        rcu_read_unlock(&domlist_read_lock);

        if ( ret != 0 )
            break;

        op->u.getdomaininfolist.num_domains = num_domains;

        if ( copy_to_guest(u_sysctl, op, 1) )
            ret = -EFAULT;
    }
    break;

-> debug this with print statements checking addresses

A valid section from some other call looks like this:
[...]
(XEN) xencomm_copy_to_guest - from '000000000cff38a5' to '0000000017eb7000' about '2' bytes skipping '0'
(XEN) xencomm_copy_to_guest - desc '00000001e1eb7000'
(XEN) cff38a5[2] -> 1e6e456d8
(XEN) xencomm_copy_to_guest - from '000000000cff38a4' to '0000000017eb7000' about '1' bytes skipping '2'
(XEN) xencomm_copy_to_guest - desc '00000001e1eb7000'
(XEN) cff38a4[1] -> 1e6e456da
(XEN) xencomm_copy_to_guest - from '000000000cff38a4' to '0000000017eb7000' about '1' bytes skipping '3'
(XEN) xencomm_copy_to_guest - desc '00000001e1eb7000'
(XEN) cff38a4[1] -> 1e6e456db
(XEN) 1ec68f920[144] -> cff3ad0
(XEN) cff3ad0[144] -> 1ec68f920
(XEN) 1ec68f920[144] -> cff3ad0
[...]

The buggy call to do_sysctl already contains the bad address:
[...]
(XEN) do_sysctl: cmd '6' parameters op->u.getdomaininfolist.first_domain '0' op->u.getdomaininfolist.max_domains '1' op->u.getdomaininfolist.buffer.p 'bfb40a1000000000' adress of info '0000000000227b78'
(XEN) do_sysctl: iterate domain '0' num_domains '0'
(XEN) do_sysctl: post getdomaininfo, pre copy_to_guest_offset
(XEN) pfn2mfn: Dom[0] pfn 0x3fb40a1000000 is not a valid page
(XEN) paddr_to_maddr: Dom:0 bad paddr: 0x3fb40a1000000000
(XEN) 227b78[72] -> 0
(XEN) do_sysctl: pre copy_to_guest u_sysctl.p '800000000cffb810' op
'0000000000227ae0'
(XEN) 227ae0[136] -> 1ecffb810
(XEN) 1ecffb1e0[4] -> 227b6c
(XEN) 1ecffb1e0[4] -> 227b6c
(XEN) 1e2393400[4] -> 23fba0
[...]

So from above we saw the mapping from python to Xen:
buffer = {v = 0xff875eb0, pad = 18412789711534817280},
(XEN) pfn2mfn: Dom[0] pfn 0x3f875eb000000 is not a valid page
I assume it should be something like 0x800000000f875eb or 0x0000000ff875eb, but not 0x3f...

My current plan is:
1. remove some of the debug statements I already have that slow things down
2. add more statements to check valid calls and which addresses they are using
2b. hoping you will send me an easy "should be 0x0000000something because whatever" mail ;)
Will try to get to you before you come in tomorrow.

3. check the flow through the kernel from the sysctl call to the hypercall, whether there is e.g. an ifdef around a __pa or something like that


I'm not sure, but I'll try to come online this evening so we can chat about it.

It's ok .. I've been somewhat distracted (by other issues) but they are
now over. If things test well with my multiboot patch today I'll be on
this fulltime with you. I'll look more into the issues you are having
and get you an email before you come in tomorrow.

Great work!



------------------------------------------------------------------------

diff -r 57c3b9568ea6 python/libvir.c
--- a/python/libvir.c    Thu Jul 19 13:52:36 2007 +0200
+++ b/python/libvir.c    Fri Jul 20 16:33:12 2007 +0200
@@ -15,6 +15,8 @@
 #include "libvirt_wrap.h"
 #include "libvirt-py.h"

+#define DEBUG_ERROR
+
 extern void initlibvirtmod(void);

PyObject *libvirt_virDomainGetUUID(PyObject *self ATTRIBUTE_UNUSED, PyObject *args);
@@ -95,7 +97,7 @@ libvirt_virErrorFuncHandler(ATTRIBUTE_UN
     PyObject *result;

 #ifdef DEBUG_ERROR
-    printf("libvirt_virErrorFuncHandler(%p, %s, ...) called\n", ctx, msg);
+    printf("libvirt_virErrorFuncHandler(%p) called\n", ctx);
 #endif

     if ((err == NULL) || (err->code == VIR_ERR_OK))
@@ -688,10 +690,22 @@ initlibvirtmod(void)
     if (initialized != 0)
         return;

+    // DEBUG
+    printf("Initialize Libvirtmod (sleepwait)\n");
+    sleep(5);
+
     virInitialize();
+
+    // DEBUG
+    printf("post Initialize Libvirtmod / pre py_InitModule (sleepwait)\n");
+    sleep(5);

     /* intialize the python extension module */
     Py_InitModule((char *) "libvirtmod", libvirtMethods);
+
+    // DEBUG
+    printf("post py_InitModule (sleepwait)\n");
+    sleep(5);

     initialized = 1;
 }
diff -r 57c3b9568ea6 python/libvir.py
--- a/python/libvir.py    Thu Jul 19 13:52:36 2007 +0200
+++ b/python/libvir.py    Fri Jul 20 16:31:17 2007 +0200
@@ -4,6 +4,9 @@
 # Check python/generator.py in the source distribution of libvir
 # to find out more about the generation process
 #
+import pdb
+pdb.set_trace()
+
 import libvirtmod
 import types

diff -r 57c3b9568ea6 src/libvirt.c
--- a/src/libvirt.c    Thu Jul 19 13:52:36 2007 +0200
+++ b/src/libvirt.c    Fri Jul 20 17:07:51 2007 +0200
@@ -72,7 +72,7 @@ virInitialize(void)
     if (qemuRegister() == -1) return -1;
 #endif
 #ifdef WITH_XEN
-    if (xenUnifiedRegister () == -1) return -1;
+    if (xenUnifiedRegister() == -1) return -1;
 #endif
 #ifdef WITH_REMOTE
     if (remoteRegister () == -1) return -1;
diff -r 57c3b9568ea6 src/xen_internal.c
--- a/src/xen_internal.c    Thu Jul 19 13:52:36 2007 +0200
+++ b/src/xen_internal.c    Sun Jul 22 05:45:45 2007 +0200
@@ -36,6 +36,8 @@
 #include <xen/sched.h>

 #include "xml.h"
+
+#define DEBUG

 /* #define DEBUG */
 /*
@@ -904,24 +906,36 @@ virXen_getdomaininfolist(int handle, int
 {
     int ret = -1;

+
     if (mlock(XEN_GETDOMAININFOLIST_DATA(dominfos),
               XEN_GETDOMAININFO_SIZE * maxids) < 0) {
         virXenError(VIR_ERR_XEN_CALL, " locking",
                     XEN_GETDOMAININFO_SIZE * maxids);
         return (-1);
     }
+
+    printf("%s - enter for firstdomain '%d' maxids '%d' on handle '%d'\n",__func__,first_domain,maxids,handle);
+    printf("%s - sys_interface_version '%d' hypervisor_version '%d'\n",__func__,sys_interface_version,hypervisor_version);
+    printf("%s - dominfos->v2 '%p' dominfos->v2d5 '%p'\n",__func__,dominfos->v2,dominfos->v2d5);
+    printf("%s - sleepwait\n",__func__);
+    sleep(5);
+
     if (hypervisor_version > 1) {
         xen_op_v2_sys op;

         memset(&op, 0, sizeof(op));
         op.cmd = XEN_V2_OP_GETDOMAININFOLIST;
+
+        printf("%s - allocated new and clean xen_op_v2_sys\n",__func__);

         if (sys_interface_version < 3) {
+            printf("%s - assigning getdomaininfolist stuff \n",__func__);
             op.u.getdomaininfolist.first_domain = (domid_t) first_domain;
             op.u.getdomaininfolist.max_domains = maxids;
             op.u.getdomaininfolist.buffer = dominfos->v2;
             op.u.getdomaininfolist.num_domains = maxids;
         } else {
+            printf("%s - assigning getdomaininfolist3 stuff \n",__func__);
             op.u.getdomaininfolists3.first_domain = (domid_t) first_domain;
             op.u.getdomaininfolists3.max_domains = maxids;
             op.u.getdomaininfolists3.buffer.v = dominfos->v2d5;
@@ -973,11 +987,20 @@ virXen_getdomaininfo(int handle, int fir
 virXen_getdomaininfo(int handle, int first_domain,
                      xen_getdomaininfo *dominfo) {
     xen_getdomaininfolist dominfos;
+    memset(&dominfos, 0, sizeof(dominfos));

     if (hypervisor_version < 2) {
         dominfos.v0 = &(dominfo->v0);
+        printf("%s - assigning &(dominfo->v0) '%p'\n",__func__,&(dominfo->v0));
     } else {
-        dominfos.v2 = &(dominfo->v2);
+    if (sys_interface_version < 3) {
+            dominfos.v2 = &(dominfo->v2);
+            printf("%s - assigning &(dominfo->v2) '%p'\n",__func__,&(dominfo->v2));
+    }
+    else {
+        dominfos.v2d5 = &(dominfo->v2d5);
+        printf("%s - assigning &(dominfo->v2d5) '%p'\n",__func__,&(dominfo->v2d5));
+    }
     }

     return virXen_getdomaininfolist(handle, first_domain, 1, &dominfos);
@@ -1762,7 +1785,7 @@ xenHypervisorInit(void)

     if ((ret != -1) && (ret != 0)) {
 #ifdef DEBUG
-        fprintf(stderr, "Using new hypervisor call: %X\n", ret);
+        printf(stderr, "Using new hypervisor call: %X\n", ret);
 #endif
         hv_version = ret;
         xen_ioctl_hypercall_cmd = cmd;
@@ -1779,7 +1802,7 @@ xenHypervisorInit(void)
     ret = ioctl(fd, cmd, (unsigned long) &v0_hc);
     if ((ret != -1) && (ret != 0)) {
 #ifdef DEBUG
-        fprintf(stderr, "Using old hypervisor call: %X\n", ret);
+        printf(stderr, "Using old hypervisor call: %X\n", ret);
 #endif
         hv_version = ret;
         xen_ioctl_hypercall_cmd = cmd;
@@ -1808,7 +1831,7 @@ xenHypervisorInit(void)
     ipt = malloc(sizeof(virVcpuInfo));
     if (ipt == NULL){
 #ifdef DEBUG
-        fprintf(stderr, "Memory allocation failed at xenHypervisorInit()\n");
+        printf(stderr, "Memory allocation failed at xenHypervisorInit()\n");
 #endif
         return(-1);
     }
------------------------------------------------------------------------

_______________________________________________
Xen-ppc-devel mailing list
Xen-ppc-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ppc-devel


------------------------------------------------------------------------

diff -r 1139ab948449 arch/powerpc/platforms/xen/hcall.c
--- a/arch/powerpc/platforms/xen/hcall.c Sun Jul 15 15:33:56 2007 +0200
+++ b/arch/powerpc/platforms/xen/hcall.c Sun Jul 22 22:41:56 2007 +0200
@@ -481,6 +481,13 @@ static int xenppc_privcmd_sysctl(privcmd
         printk(KERN_ERR "%s: unknown sysctl cmd %d\n", __func__, kern_op.cmd);
         return -ENOSYS;
     case XEN_SYSCTL_getdomaininfolist:
+
+// DEBUG
+printk(KERN_EMERG"%s - hack changing kern_op.u.getdomaininfolist.buffer.p '%p' to kern_op.u.getdomaininfolist.buffer.p>>32 '%p'\n",__func__,kern_op.u.getdomaininfolist.buffer.p, ((struct xen_domctl_getdomaininfo_t *)(((unsigned long)(kern_op.u.getdomaininfolist.buffer.p))>>32)));
+// TODO DEBUG FIXME WHATEVER
+kern_op.u.getdomaininfolist.buffer.p=((struct xen_domctl_getdomaininfo_t *)(((unsigned long)(kern_op.u.getdomaininfolist.buffer.p))>>32));
+printk(KERN_EMERG"%s - verify hack kern_op.u.getdomaininfolist.buffer.p '%p'\n",__func__,(kern_op.u.getdomaininfolist.buffer.p));
+
         desc = xencomm_map(
             xen_guest_handle(kern_op.u.getdomaininfolist.buffer),
             kern_op.u.getdomaininfolist.max_domains *





--

Grüsse / regards, Christian Ehrhardt

IBM Linux Technology Center, Open Virtualization
+49 7031/16-3385
Ehrhardt@xxxxxxxxxxxxxxxxxx
Ehrhardt@xxxxxxxxxx

IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Johann Weihen
Geschäftsführung: Herbert Kircher
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

diff -r 1139ab948449 arch/powerpc/platforms/xen/hcall.c
--- a/arch/powerpc/platforms/xen/hcall.c        Sun Jul 15 15:33:56 2007 +0200
+++ b/arch/powerpc/platforms/xen/hcall.c        Fri Jul 27 02:15:08 2007 +0200
@@ -481,6 +481,12 @@ static int xenppc_privcmd_sysctl(privcmd
                printk(KERN_ERR "%s: unknown sysctl cmd %d\n", __func__, kern_op.cmd);
                return -ENOSYS;
        case XEN_SYSCTL_getdomaininfolist:
+                WARN_ON(((((unsigned long)(kern_op.u.getdomaininfolist.buffer.p)) & 0x00000000FFFFFFFF) == 0x0)
+                     || ((((unsigned long)(kern_op.u.getdomaininfolist.buffer.p)) & 0xFFFFFFFF00000000) != 0x0));
+                kern_op.u.getdomaininfolist.buffer.p=
+                   (struct xen_domctl_getdomaininfo_t *)
+                   (((unsigned long)(kern_op.u.getdomaininfolist.buffer.p))>>32);
+
                desc = xencomm_map(
                        xen_guest_handle(kern_op.u.getdomaininfolist.buffer),
                        kern_op.u.getdomaininfolist.max_domains *