No 64-bit process works under Solaris dom0 and Xen 3.1. The problem is an ABI
breakage, though since this part of the ABI isn't documented anywhere I can
find, it's something of a grey area.
In 3.0.4, the syscall trampoline did:
    /* pushq $FLAT_KERNEL_CS32 */
    stack[16] = 0x68;
    *(u32 *)&stack[17] = FLAT_KERNEL_CS32;

    /* jmp syscall_enter */
    stack[21] = 0xe9;
    *(u32 *)&stack[22] = (char *)syscall_enter - &stack[26];

ENTRY(syscall_enter)
        sti
        movl  $FLAT_KERNEL_SS,24(%rsp)
        pushq %rcx
        pushq $0
        movl  $TRAP_syscall,4(%rsp)
        SAVE_ALL
Thus %rcx and %r11 (which hold the user %rip and %rflags after a syscall
instruction) were saved in UREGS_rcx/r11, and appeared as such in the guest's
syscall context. We were relying on being able to just pop the stack into
%rcx/%r11 and get syscall-like values. This was broken by:
changeset: 15095:a06a28ebad5d
user: kfraser@xxxxxxxxxxxxxxxxxxxxx
...
+ /* movq $syscall_enter,%r11 */
+ stack[21] = 0x49;
+ stack[22] = 0xbb;
+ *(void **)&stack[23] = (void *)syscall_enter;
+
+ /* jmpq *%r11 */
+ stack[31] = 0x41;
+ stack[32] = 0xff;
+ stack[33] = 0xe3;
...
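In particular, the new trampoline loads the address of syscall_enter into %r11,
clobbering the user %rflags that the syscall instruction left there. The guest
then saw a corrupted %rflags in which X86_EFLAGS_AC was getting set, breaking
the world. (Actually Solaris still booted, which is quite surprising.)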
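
To illustrate how a code address can end up looking like flags with AC set,
here is a minimal sketch. X86_EFLAGS_AC is bit 18 of %rflags, which is
architectural; the address used for syscall_enter below is made up purely for
illustration (I have not checked the real one), and flags_have_ac() is a
hypothetical helper, not anything from Xen or Solaris.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment Check flag: bit 18 of EFLAGS/RFLAGS. */
#define X86_EFLAGS_AC 0x00040000ULL

/*
 * HYPOTHETICAL value standing in for the address of syscall_enter.
 * The real address is not given in this mail; this one is chosen only
 * because it has bit 18 set.
 */
#define HYPOTHETICAL_SYSCALL_ENTER 0xffff830000140000ULL

/* A typical user %rflags: IF (bit 9) plus the always-one reserved bit 1. */
#define TYPICAL_USER_RFLAGS 0x202ULL

/* Returns 1 if a value interpreted as %rflags has AC set, 0 otherwise. */
static int flags_have_ac(uint64_t rflags)
{
    return (rflags & X86_EFLAGS_AC) != 0;
}

/*
 * What the guest effectively did: take whatever was saved in the %r11
 * slot and treat it as the user %rflags.
 */
static uint64_t guest_rflags_from_r11(uint64_t saved_r11)
{
    assert(sizeof(saved_r11) == 8);
    return saved_r11;
}
```

With a sane saved %r11 the check comes back clear; with an address like the
hypothetical one above it comes back set, which is the kind of value the guest
would then restore into the user's flags.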
So, why the register indirect jmp?
An obvious fix is for us to snarf %rip and %rflags out of the syscall stack
instead of using the registers, but this does mean another arbitrary difference
between metal and Xen. Then again, the syscall entry point is already radically
different, and we've probably discovered this early enough for us to just fix
all our guests, presuming nobody else was making this mistake.
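
A rough sketch of what that fix could look like on the guest side follows.
It assumes the frame the guest sees on syscall entry is the saved %rcx and
%r11 slots followed by an iret-style frame (rip, cs, rflags, rsp, ss); that
layout is my assumption here, and struct syscall_frame, struct user_context
and user_context_from_frame() are hypothetical names, not existing Solaris or
Xen interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED layout of the stack on syscall entry, lowest address first. */
struct syscall_frame {
    uint64_t rcx;     /* saved %rcx slot */
    uint64_t r11;     /* saved %r11 slot; not trusted as user %rflags */
    uint64_t rip;     /* user %rip */
    uint64_t cs;
    uint64_t rflags;  /* user %rflags */
    uint64_t rsp;
    uint64_t ss;
};

/* The two values the guest needs to resume the user process. */
struct user_context {
    uint64_t rip;
    uint64_t rflags;
};

/*
 * Take the user %rip and %rflags from the iret-style part of the frame,
 * ignoring the %rcx/%r11 slots entirely.
 */
static struct user_context
user_context_from_frame(const struct syscall_frame *frame)
{
    struct user_context ctx;

    assert(frame != NULL);
    ctx.rip = frame->rip;
    ctx.rflags = frame->rflags;
    return ctx;
}
```

The point of the design is only that nothing depends on the %r11 slot any
more, so whatever the trampoline does to that register stops mattering to the
guest.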
(Yes, it would have been nice if we'd managed to test 3.1.1, but we just
didn't find the resources by the deadline.)
regards
john