Re: [Xen-devel] Using SYSCALL/SYSRET with a minios kernel

On Mon, 2008-02-25 at 14:14 +0100, Goswin von Brederlow wrote:
> Daniel Stodden <stodden@xxxxxxxxxx> writes:
> 
> > On Mon, 2008-02-25 at 11:04 +0100, Goswin von Brederlow wrote:
> >
> >> >> --- kernel.c ---
> >> >>   HYPERVISOR_set_callbacks((unsigned long)hypervisor_callback,
> >> >>                            (unsigned long)failsafe_callback,
> >> >>                            (unsigned long)syscall_callback);
> >> >> 
> >> >>   __asm__ __volatile__("syscall");
> >> >> 
> >> >> If I understood you right that should set the RIP to syscall_callback
> >> >> and execute from there.
> >> >
> >> > Mööp! Only when calling in from virtual user mode. Otherwise, you're
> >> > triggering a hypercall service routine, and one might suspect you're
> >> > presently just generating an error condition with that. :)
> >> 
> >> That sounds very odd. I'm getting no indication of it from xen.
> >
> > Why odd? That's how e.g. syscall processing in Xen's entry.S is structured.
> > Many hypercalls fail with messages. But e.g. an invalid hypercall number
> > would silently return -ENOSYS, so it does not appear too unlikely. 
> > What do you get instead?
> 
> Nothing. The 'syscall' instruction behaves like a 'nop'. If Xen's
> syscall emulation fails then I would expect it to raise some
> exception, e.g. illegal instruction.
> 
> The amd64 tech docs indicate that syscall can be used recursively
> (indicated by SS == 0) and no check on the CPL is performed by
> 'syscall'. So I would expect that I could call 'syscall' even in
> kernel mode in Xen too. But now that I know better that is ok too.

Well, that's how x86 does it. But the PV interface is only meant to
reflect x86 to a degree which i) keeps porting existing OSes sane and
simple and ii) keeps the hypervisor small and lean. Both at the same
time. And to my limited knowledge, syscalls from kernel mode are not
exactly a killer feature in practical x86 OS design.
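
To spell out my reading of the dispatch, here is a toy model in C. It is
only a sketch of the behaviour described in this thread, not Xen's actual
entry.S logic: model_syscall(), the outcome names and MODEL_NR_HYPERCALLS
are all made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table size, chosen only for this illustration. */
#define MODEL_NR_HYPERCALLS 64

enum syscall_outcome {
    BOUNCED_TO_SYSCALL_CALLBACK, /* guest user mode: the kernel's callback runs */
    HYPERCALL_SERVICED,          /* guest kernel mode, known hypercall number */
    HYPERCALL_REJECTED           /* guest kernel mode, unknown hypercall number */
};

struct syscall_result {
    enum syscall_outcome outcome;
    long ret; /* value handed back in rax; only set for HYPERCALL_REJECTED */
};

/* Model of what a 'syscall' instruction leads to in a 64-bit PV guest,
 * as described in this thread. rax is the register content at the time
 * of the instruction. */
static struct syscall_result model_syscall(bool from_guest_user, uint64_t rax)
{
    struct syscall_result r = { BOUNCED_TO_SYSCALL_CALLBACK, 0 };

    if (from_guest_user)
        return r; /* delivered to the registered syscall_callback */

    if (rax >= MODEL_NR_HYPERCALLS) {
        /* No message on the console, just an error code in rax. */
        r.outcome = HYPERCALL_REJECTED;
        r.ret = -ENOSYS;
        return r;
    }

    r.outcome = HYPERCALL_SERVICED;
    return r;
}
```

In this model, a 'syscall' issued from the guest kernel never reaches
syscall_callback, whatever rax holds. That would fit an instruction which
looks like a 'nop' from the inside.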

> >> But ok. How do I test that? Or differently phrased: What is the best
> >> way to go into user space for the very first time? Do I really have
> >> to create a fake stack frame and call HYPERVISOR_iret?
> >
> > iret is the only method I am aware of; I can't think of anything else.
> > I doubt that a stack switch would be strictly required.
> >
> > That does not necessarily mean much, however, since I did not write the
> > freaky thing.
> 
> I added the following code to x86_64.S:
> 
> ENTRY(fail)
>         syscall
>         int $3
>         jmp fail
>         
> ENTRY(go_user)
>         pushq $0xe02b
>         pushq %rsp
>         subq  $64,(%rsp)
>         pushq $0x10212
>         pushq $0xe030
>         pushq $fail
>         orb   $3,1*8(%rsp)
>         orb   $3,4*8(%rsp)
>         pushq $0
>         jmp  hypercall_page + (__HYPERVISOR_iret * 32)
> 
> So I construct a stack frame that looks like an interrupt happened and
> the next instruction to run is at 'fail'. I set the ring to ring3
> (orb) and then do an iret.
> 
> The code switches RSP and RIP and continues executing 'fail' in
> user mode. I know for sure it is user mode because at first it gave an
> error that there is no user page table set (see below). Anyway, in
> user mode the 'syscall' then works.

Good.
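
For the record, the frame go_user builds can be written down in C. This is
a sketch under assumptions: the field order is what I believe
struct iret_context in Xen's public x86_64 header looks like, and
struct iret_frame, make_user_frame(), rpl() and hypercall_stub_offset()
are names invented here. The first three slots are not pushed by go_user;
I assume the hypercall-page stub for iret pushes them itself. The guest
stack trace in your crash dump appears to show the same order (rip, cs,
rflags, rsp, ss).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout at %rsp when HYPERVISOR_iret is issued on x86_64. */
struct iret_frame {
    uint64_t rax, r11, rcx;   /* assumed to be pushed by the hypercall-page stub */
    uint64_t flags;           /* the trailing "pushq $0" in go_user */
    uint64_t rip, cs, rflags, rsp, ss;
};

#define HYPERVISOR_IRET_NR  23u  /* __HYPERVISOR_iret */
#define HYPERCALL_STUB_SIZE 32u  /* bytes per stub in the hypercall page */

/* Requested privilege level: the low two bits of a selector. */
static inline unsigned rpl(uint64_t selector)
{
    return (unsigned)(selector & 3);
}

/* Byte offset of a stub within the hypercall page, as in
 * "hypercall_page + (__HYPERVISOR_iret * 32)". */
static inline size_t hypercall_stub_offset(unsigned nr)
{
    return (size_t)nr * HYPERCALL_STUB_SIZE;
}

/* Fill in the part of the frame that go_user pushes by hand. */
static struct iret_frame make_user_frame(uint64_t entry, uint64_t user_rsp)
{
    struct iret_frame f = {0};

    f.flags  = 0;
    f.rip    = entry;
    f.cs     = 0xe030 | 3;    /* orb $3,1*8(%rsp) */
    f.rflags = 0x10212;
    f.rsp    = user_rsp;
    f.ss     = 0xe02b | 3;    /* orb $3,4*8(%rsp); RPL is already 3 here */

    assert(rpl(f.cs) == 3 && rpl(f.ss) == 3);
    return f;
}
```

With these numbers the code selector ends up as 0xe033, and the iret stub
sits at byte offset 736 (0x2e0) in the hypercall page.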

> >> > BTW: I found building Xen with 'debug=y' generates a helpful comment on
> >> > the console every now and then.
> >> 
> >> I did that and added a patch that makes HYPERVISOR_console_io work for
> >> domU so it shows up in "xm dmesg".
> >
> > Ah, I see. Good idea.
> >
> >> >> But still, the syscall opcode does nothing.
> >> >> In case you wonder. The "int $80" is there to crash the domain and
> >> >> tell me it reached that point.
> >
> > Shouldn't that just get you a GPF? 
> 
> Which calls do_general_protection, if installed, which dumps the
> registers:
> GPF rip: 00000000001031fc, error_code=282
> RIP: e030:[<00000000001031fc>] 
> RSP: e02b:0000000000121fc8  EFLAGS: 00010212
> RAX: 0000000000000000 RBX: 0000000000119000 RCX: 00000000001031fc
> RDX: 0000000000000100 RSI: 00000000deadbeef RDI: 00000000deadbeef
> RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
> R10: 00000000fffffff9 R11: 0000000000000212 R12: 00000000001033ea
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> 
> Without a handler (or in other instances) the domain truly crashes and
> 'xm dmesg' has a nice register and stack dump like:
> 
> (XEN) traps.c:212:d18 Guest switching to user mode with no user page tables
> (XEN) traps.c:241:d18 Fatal error
> (XEN) Domain 18 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-3.0.4-1  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    0
> (XEN) RIP:    e033:[<00000000001022eb>]
> (XEN) RFLAGS: 0000000000000206   CONTEXT: guest
> (XEN) rax: 0000000000000017   rbx: 0000000000119000   rcx: 00000000001022eb
> (XEN) rdx: 0000000000000100   rsi: 00000000deadbeef   rdi: 00000000deadbeef
> (XEN) rbp: 0000000000000000   rsp: 0000000000108d00   r8:  00000000ffffffff
> (XEN) r9:  0000000000000000   r10: 00000000fffffffc   r11: 0000000000000206
> (XEN) r12: 00000000001033ea   r13: 0000000000000000   r14: 0000000000000000
> (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 00000000000006f0
> (XEN) cr3: 000000005b50f000   cr2: 0000000000000000
> (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e02b   cs: e033
> (XEN) Guest stack trace from rsp=0000000000108d00:
> (XEN)    0000000000000000 0000000000000001 000000000011b000 0000000000000000
> (XEN)    00000000001033ea 000000000000e033 0000000000010212 0000000000108d00
> (XEN)    000000000000e02b 0000000000105872 0000000000000000 0000000000000000
> ...
> 
> I find both rather helpful in debugging. :)

Ah, I see.
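
The error code in your GPF line can be taken apart, too. A small sketch,
assuming the handler prints error_code in hexadecimal (so 282 means 0x282)
and that the value follows the usual x86 selector error code format;
struct sel_error and decode_sel_error() are names made up here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of an x86 selector-style exception error code. */
struct sel_error {
    bool ext;        /* bit 0: event external to the program */
    bool idt;        /* bit 1: index refers to an IDT gate */
    bool ti;         /* bit 2: LDT rather than GDT (only if !idt) */
    unsigned index;  /* bits 3..15: selector or vector index */
};

static struct sel_error decode_sel_error(uint32_t code)
{
    struct sel_error e;

    e.ext   = (code & 0x1) != 0;
    e.idt   = (code & 0x2) != 0;
    e.ti    = (code & 0x4) != 0;
    e.index = (code >> 3) & 0x1fff;
    return e;
}
```

decode_sel_error(0x282) gives idt set and index 80. So the fault refers to
IDT vector 80 decimal, which is what "int $80" asks for in AT&T syntax
(not 0x80).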

regards,
daniel

-- 
Daniel Stodden
LRR     -      Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München             D-85748 Garching
http://www.lrr.in.tum.de/~stodden         mailto:stodden@xxxxxxxxxx
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33  3D80 457E 82AE B0D8 735B



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel