   
 
 
 
   
 

xen-devel

Re: [Xen-devel] Crash with paravirt-ops 2.6.31.6 kernel

To: Bastian Blank <bastian@xxxxxxxxxxxx>, Jeremy Fitzhardinge <jeremy@xxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxxxx>
Subject: Re: [Xen-devel] Crash with paravirt-ops 2.6.31.6 kernel
From: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
Date: Mon, 23 Nov 2009 15:25:35 +0000
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, 544145@xxxxxxxxxxxxxxx
In-reply-to: <20091122095403.GA1496@xxxxxxxxxxxxxxxxxxxxxxx>
Organization: Citrix Systems, Inc.
References: <28846609.721258484676784.JavaMail.root@xxxxxxxxxxxxxxxxxxxxxx> <20091122095403.GA1496@xxxxxxxxxxxxxxxxxxxxxxx>
On Sun, 2009-11-22 at 09:54 +0000, Bastian Blank wrote:
> On Tue, Nov 17, 2009 at 10:04:36PM +0300, William Pitcock wrote:
> > [    1.254927] init[1] general protection ip:f779042f sp:ff9b0340 error:0
> 
> Hmm, this looks like the old Debian bug 544145[1]. For some reason the
> hypervisor jumps back into 64bit mode after a syscall instruction.
> Workaround: vdso32=0 or deinstall libc6-i686,

I just noticed that one of my test boxes has an AMD processor so I took a
bit of a look into this.

The issue seems to be with this bit of code in the hypervisor
(xen/arch/x86/x86_64/entry.S):

        restore_all_guest:
                ASSERT_INTERRUPTS_DISABLED
                RESTORE_ALL
                testw $TRAP_syscall,4(%rsp)
                jz    iret_exit_to_guest
        
                addq  $8,%rsp
                popq  %rcx                    # RIP
                popq  %r11                    # CS
                cmpw  $FLAT_USER_CS32,%r11
                popq  %r11                    # RFLAGS
                popq  %rsp                    # RSP
                je    1f
                sysretq
        1:      sysretl

We are attempting to return to the Linux-defined __USER32_CS (0x23),
which does not match the test for the Xen-defined FLAT_USER_CS32
(0xe023). We therefore hit the sysretq instead of the sysretl, which
causes us to return with CS 0xe033 (FLAT_USER_CS64) instead of CS
0xe023.
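
The selector mismatch can be modelled outside the hypervisor. What
follows is a minimal C sketch, not Xen code: the selector values are the
ones quoted in this mail, while LINUX_USER32_CS, pick_return_path() and
resulting_cs() are hypothetical names introduced only for illustration.
It mirrors the cmpw/je test in restore_all_guest.

```c
#include <assert.h>
#include <stdint.h>

/* Selector values as quoted in this thread. */
#define FLAT_USER_CS32  0xe023u  /* Xen-defined 32-bit user code selector */
#define FLAT_USER_CS64  0xe033u  /* Xen-defined 64-bit user code selector */
#define LINUX_USER32_CS 0x0023u  /* Linux 32-bit user CS; macro name is illustrative */

/* The two 32-bit selectors differ, which is the root of the problem. */
static_assert(LINUX_USER32_CS != FLAT_USER_CS32, "selectors are expected to differ");

enum ret_path { RET_SYSRETQ, RET_SYSRETL };

/* Models "cmpw $FLAT_USER_CS32,%r11; je 1f; sysretq; 1: sysretl". */
static enum ret_path pick_return_path(uint16_t guest_cs)
{
    return guest_cs == FLAT_USER_CS32 ? RET_SYSRETL : RET_SYSRETQ;
}

/* CS the guest ends up with after each return path, per the analysis above. */
static uint16_t resulting_cs(enum ret_path path)
{
    return path == RET_SYSRETL ? FLAT_USER_CS32 : FLAT_USER_CS64;
}
```

Under these assumptions pick_return_path(0x23) selects the sysretq path,
so a compat-mode process comes back with a 64-bit CS, while
pick_return_path(0xe023) selects sysretl as intended.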

This patch to the kernel fixes things but doesn't seem that
satisfactory:

diff --git a/arch/x86/xen/xen-asm_64.S b/arch/x86/xen/xen-asm_64.S
index 02f496a..203586d 100644
--- a/arch/x86/xen/xen-asm_64.S
+++ b/arch/x86/xen/xen-asm_64.S
@@ -93,7 +93,7 @@ ENTRY(xen_sysret32)
        pushq $__USER32_DS
        pushq PER_CPU_VAR(old_rsp)
        pushq %r11
-       pushq $__USER32_CS
+       pushq $FLAT_USER_CS32
        pushq %rcx
 
        pushq $VGCF_in_syscall

Coming from the other angle we could fix this in the hypervisor by
always returning to guest (user or kernel) via iret instead of sysret:

diff -r e7a1eab70fac xen/arch/x86/x86_64/entry.S
--- a/xen/arch/x86/x86_64/entry.S       Mon Nov 09 10:24:54 2009 +0000
+++ b/xen/arch/x86/x86_64/entry.S       Mon Nov 23 15:15:39 2009 +0000
@@ -48,22 +48,6 @@
 restore_all_guest:
         ASSERT_INTERRUPTS_DISABLED
         RESTORE_ALL
-        testw $TRAP_syscall,4(%rsp)
-        jz    iret_exit_to_guest
-
-        addq  $8,%rsp
-        popq  %rcx                    # RIP
-        popq  %r11                    # CS
-        cmpw  $FLAT_USER_CS32,%r11
-        popq  %r11                    # RFLAGS
-        popq  %rsp                    # RSP
-        je    1f
-        sysretq
-1:      sysretl
-
-        ALIGN
-/* No special register assumptions. */
-iret_exit_to_guest:
         addq  $8,%rsp
 .Lft0:  iretq

I think much of the issue stems from Xen defining several segment
descriptors which are essentially equivalent to the ones Linux uses. It
seems a bit ugly to expose these Xen defined descriptors to the guest
when it hasn't explicitly asked for them. On the other hand I'm not sure
what we can realistically do since doing the Right Thing would seem to
involve looking up the descriptor in the GDT to determine if the
selector is 32 or 64 bit and/or context switching IA32_STAR in some
fashion to allow guests to specify their own userspace CS for sysret 32
and 64.
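
To make the GDT-lookup idea concrete, here is a rough C sketch of
deciding whether a selector names a 64-bit code segment by inspecting
the descriptor's L bit. It is only an illustration under stated
assumptions: selector_is_64bit_code() and example_gdt are hypothetical,
the table layout is invented for the example (it places a 32-bit user
code segment at index 4 and a 64-bit one at index 6), and a real
hypervisor would also have to read the guest's descriptor safely and
handle the LDT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions in an 8-byte x86 segment descriptor. */
#define DESC_EXEC (1ULL << 43)  /* executable (code) segment */
#define DESC_S    (1ULL << 44)  /* code/data, not a system descriptor */
#define DESC_P    (1ULL << 47)  /* present */
#define DESC_L    (1ULL << 53)  /* long mode: 64-bit code segment */

/* Selector 0x23 has index 4, TI=0, RPL=3. */
static_assert((0x23 >> 3) == 4, "selector index is bits 15:3");

/* Hypothetical GDT used only for this example. */
static const uint64_t example_gdt[7] = {
    [4] = 0x00cffa000000ffffULL,  /* flat 32-bit user code (D=1, L=0) */
    [6] = 0x00affa000000ffffULL,  /* 64-bit user code (D=0, L=1) */
};

/* True if sel refers to a present 64-bit code descriptor in gdt. */
static bool selector_is_64bit_code(const uint64_t *gdt, size_t nr_entries,
                                   uint16_t sel)
{
    size_t index = sel >> 3;

    if (sel & 0x4)              /* TI set: LDT selector, not handled here */
        return false;
    if (index >= nr_entries)    /* outside the table we were given */
        return false;

    uint64_t desc = gdt[index];
    if (!(desc & DESC_P) || !(desc & DESC_S) || !(desc & DESC_EXEC))
        return false;

    return (desc & DESC_L) != 0;
}
```

With this table, selector 0x23 is reported as not 64-bit and selector
0x33 as 64-bit, which is the information the return path would need to
choose between sysretl and sysretq.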

Perhaps simply not returning to guest userspace with sysret (as above)
makes most sense: a syscall already takes a trap through the hypervisor
on both entry and exit, so I'm not sure the difference between sysret and
iret is going to be noticeable.

Another option might be to define VGCF_compat_mode as a new flag to
HYPERVISOR_iret and select sysretq/sysretl based on that. This would
still expose Xen descriptors to guests which didn't ask for one but at
least it would (probably) be a compatible descriptor.
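
As a rough sketch of how such a flag could drive the choice, consider
the C model below. It is not an existing interface: VGCF_compat_mode
does not exist today and its bit position here is picked arbitrarily,
the value used for VGCF_in_syscall is an assumption about the public
headers, and pick_ret_insn() is a hypothetical helper that only models
the decision.

```c
#include <assert.h>
#include <stdint.h>

#define VGCF_in_syscall  (1u << 8)  /* existing flag; value assumed here */
#define VGCF_compat_mode (1u << 9)  /* HYPOTHETICAL flag proposed above */

static_assert((VGCF_in_syscall & VGCF_compat_mode) == 0,
              "flags must not overlap");

enum ret_insn { RET_IRETQ, RET_SYSRETQ, RET_SYSRETL };

/*
 * Choose the return instruction from the flags the guest passes to
 * HYPERVISOR_iret, instead of comparing the saved CS selector.
 */
static enum ret_insn pick_ret_insn(uint32_t iret_flags)
{
    if (!(iret_flags & VGCF_in_syscall))
        return RET_IRETQ;    /* not returning from a syscall */

    return (iret_flags & VGCF_compat_mode) ? RET_SYSRETL : RET_SYSRETQ;
}
```

In this model a 32-bit process returning from a syscall would pass
VGCF_in_syscall | VGCF_compat_mode and get sysretl regardless of which
CS value the kernel pushed.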

> It does not happen on XenSource 2.6.18 kernel

I assume that this kernel (perhaps coincidentally) manages to use
FLAT_USER_CS32 for compat mode processes.

> , or the Debian 2.6.26 kernel.

This was a forward-ported 2.6.18-style kernel, so I guess it works for
the same reason as 2.6.18.

Ian.


