On Mon, 2008-02-25 at 14:14 +0100, Goswin von Brederlow wrote:
> Daniel Stodden <stodden@xxxxxxxxxx> writes:
> > On Mon, 2008-02-25 at 11:04 +0100, Goswin von Brederlow wrote:
> >> >> --- kernel.c ---
> >> >> HYPERVISOR_set_callbacks((unsigned long)hypervisor_callback,
> >> >>                          (unsigned long)failsafe_callback,
> >> >>                          (unsigned long)syscall_callback);
> >> >>
> >> >> __asm__ __volatile__("syscall");
> >> >>
> >> >> If I understood you right that should set the RIP to syscall_callback
> >> >> and execute from there.
> >> >
> >> > Mööp! Only when calling in from virtual user mode. Otherwise, you're
> >> > triggering a hypercall service routine, and one might suspect you're
> >> > presently just generating an error condition with that. :)
> >> That sounds very odd. I'm getting no indication of it from Xen.
> > Why odd? That's how e.g. syscall processing in Xen's entry.S is structured.
> > Many hypercalls fail with messages. But e.g. an invalid hypercall number
> > would silently return -ENOSYS, so it does not appear too unlikely.
> > What do you get instead?
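
To illustrate how an unknown call number can fail silently, here is a toy dispatcher. It is not Xen's code: every name in it (toy_dispatch, toy_table, toy_console_write, TOY_NR_CALLS) is made up for this sketch. The only point is that an out-of-range or unpopulated slot hands back -ENOSYS without logging anything.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type; real hypercalls take more arguments. */
typedef long (*toy_handler_t)(long arg);

/* Hypothetical handler that simply echoes its argument. */
static long toy_console_write(long arg)
{
    return arg;
}

#define TOY_NR_CALLS 4

/* Sparse table: only slot 1 is populated, the rest stay NULL. */
static toy_handler_t toy_table[TOY_NR_CALLS] = {
    [1] = toy_console_write,
};

/* Unknown numbers fall through to -ENOSYS with no message at all. */
long toy_dispatch(unsigned long nr, long arg)
{
    if (nr >= TOY_NR_CALLS || toy_table[nr] == NULL)
        return -ENOSYS;
    return toy_table[nr](arg);
}

/* Small self-check of the silent failure path. */
void toy_selfcheck(void)
{
    assert(toy_dispatch(99, 0) == -ENOSYS);
}
```

A caller that ignores the return value would see no difference between such a failure and an instruction that did nothing.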
> Nothing. The 'syscall' instruction behaves like a 'nop'. If Xen's
> syscall emulation fails then I would expect it to raise some
> exception, e.g. illegal instruction.
> The amd64 tech docs indicate that syscall can be used recursively
> (indicated by SS == 0) and no check on the CPL is performed by
> 'syscall'. So I would expect that I could call 'syscall' even in
> kernel mode in Xen too. But now that I know better that is ok too.
Well, that's how x86 does it. But the PV interface is only meant to
reflect x86 to a degree which i) makes porting existing OSs sane and
simple and ii) keeps the hypervisor small and lean. Both at the same
time. And to my limited knowledge, syscalls from kernel mode are not
exactly a killer feature in practical x86 OS design.
> >> But ok. How do I test that? Or differently phrased: What is the best
> >> way to go into user space for the very first time? Do I really have
> >> to create a fake stack frame and call HYPERVISOR_iret?
> > iret is the only method I am aware of, can't think of anything else.
> > Doubt that a stack switch would be forcibly required.
> > Does not necessarily mean much, however, since I did not write the freaky
> > thing.
> I added the following code to x86_64.S:
> int $3
> jmp fail
> pushq $0xe02b
> pushq %rsp
> subq $64,(%rsp)
> pushq $0x10212
> pushq $0xe030
> pushq $fail
> orb $3,1*8(%rsp)
> orb $3,4*8(%rsp)
> pushq $0
> jmp hypercall_page + (__HYPERVISOR_iret * 32)
> So I construct a stack frame that looks like an interrupt happened and
> the next instruction to run is at 'fail'. I set the ring to ring3
> (orb) and then do an iret.
> The code switches RSP and RIP and continues executing at 'fail' in
> user mode. I know for sure it is user mode because first it gave an
> error that there is no user page table set (see below). Anyway, in
> user mode the 'syscall' then works.
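
For readers following the pushes above, the frame can be written down as a C struct. This is only a sketch: the struct and function names are invented here, the layout describes the stack at the moment the two orb instructions run (before the final "pushq $0" adds one more word below it), and the value 23 for __HYPERVISOR_iret is an assumption taken from memory of Xen's public headers (it does agree with rax = 0x17 in the crash dump further down).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the five words pushed before the orb
 * instructions, lowest address (top of stack) first. */
struct pushed_frame {
    uint64_t rip;     /* pushq $fail      -> 0*8(%rsp) */
    uint64_t cs;      /* pushq $0xe030    -> 1*8(%rsp) */
    uint64_t rflags;  /* pushq $0x10212   -> 2*8(%rsp) */
    uint64_t rsp;     /* pushq %rsp, then adjusted by subq $64 */
    uint64_t ss;      /* pushq $0xe02b    -> 4*8(%rsp) */
};

/* Word index that "orb $3, N*8(%rsp)" touches for CS and SS. */
size_t cs_slot(void) { return offsetof(struct pushed_frame, cs) / 8; }
size_t ss_slot(void) { return offsetof(struct pushed_frame, ss) / 8; }

/* Effect of "orb $3" on a selector: force the low two (RPL) bits. */
uint64_t with_rpl3(uint64_t selector) { return selector | 3; }

/* Assumed hypercall number; not stated anywhere in this thread. */
#define SKETCH_HYPERVISOR_IRET 23u

/* Byte offset of a stub inside the hypercall page, 32 bytes each,
 * as in "hypercall_page + (__HYPERVISOR_iret * 32)". */
unsigned long hypercall_stub_offset(unsigned nr) { return nr * 32UL; }

/* Self-check that the slots match the 1*8 and 4*8 operands. */
void check_frame_sketch(void)
{
    assert(cs_slot() == 1);
    assert(ss_slot() == 4);
}
```

With these definitions, with_rpl3(0xe030) gives 0xe033, which is the cs value shown in the crash dump, and 0xe02b is unchanged because its low two bits are already set.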
> >> > BTW: I found building Xen with 'debug=y' generates a helpful comment on
> >> > the console every now and then.
> >> I did that and added a patch that makes HYPERVISOR_console_io work for
> >> domU so it shows up in "xm dmesg".
> > Ah, I see. Good idea.
> >> >> But still, the syscall opcode does nothing.
> >> >> In case you wonder. The "int $80" is there to crash the domain and
> >> >> tell me it reached that point.
> > Shouldn't that just get you a GPF?
> Which calls do_general_protection, if installed, which dumps the registers:
> GPF rip: 00000000001031fc, error_code=282
> RIP: e030:[<00000000001031fc>]
> RSP: e02b:0000000000121fc8 EFLAGS: 00010212
> RAX: 0000000000000000 RBX: 0000000000119000 RCX: 00000000001031fc
> RDX: 0000000000000100 RSI: 00000000deadbeef RDI: 00000000deadbeef
> RBP: 0000000000000000 R08: 0000000000000004 R09: 0000000000000000
> R10: 00000000fffffff9 R11: 0000000000000212 R12: 00000000001033ea
> R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> Without a handler (or in other instances) the domain truly crashes and
> 'xm dmesg' has a nice register and stack dump like:
> (XEN) traps.c:212:d18 Guest switching to user mode with no user page tables
> (XEN) traps.c:241:d18 Fatal error
> (XEN) Domain 18 (vcpu#0) crashed on cpu#0:
> (XEN) ----[ Xen-3.0.4-1 x86_64 debug=y Not tainted ]----
> (XEN) CPU: 0
> (XEN) RIP: e033:[<00000000001022eb>]
> (XEN) RFLAGS: 0000000000000206 CONTEXT: guest
> (XEN) rax: 0000000000000017 rbx: 0000000000119000 rcx: 00000000001022eb
> (XEN) rdx: 0000000000000100 rsi: 00000000deadbeef rdi: 00000000deadbeef
> (XEN) rbp: 0000000000000000 rsp: 0000000000108d00 r8: 00000000ffffffff
> (XEN) r9: 0000000000000000 r10: 00000000fffffffc r11: 0000000000000206
> (XEN) r12: 00000000001033ea r13: 0000000000000000 r14: 0000000000000000
> (XEN) r15: 0000000000000000 cr0: 000000008005003b cr4: 00000000000006f0
> (XEN) cr3: 000000005b50f000 cr2: 0000000000000000
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033
> (XEN) Guest stack trace from rsp=0000000000108d00:
> (XEN) 0000000000000000 0000000000000001 000000000011b000 0000000000000000
> (XEN) 00000000001033ea 000000000000e033 0000000000010212 0000000000108d00
> (XEN) 000000000000e02b 0000000000105872 0000000000000000 0000000000000000
> Both I find rather helpful in debugging. :)
Ah, I see.
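
As a reading aid for the two dumps, the sketch below decodes a selector-style error code and the privilege bits of a segment selector. The bit layout is the architectural one for x86 selector error codes; the function and struct names are made up here, and treating the printed "error_code=282" as hexadecimal is an assumption. Under that assumption the code has the IDT bit set and index 80, which fits the "int $80" (decimal 80) mentioned earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 selector error code. */
struct selector_error {
    unsigned ext;    /* bit 0: event external to the program */
    unsigned idt;    /* bit 1: index refers to the IDT */
    unsigned ti;     /* bit 2: LDT (1) or GDT (0), if idt == 0 */
    unsigned index;  /* bits 3..15: selector or vector index */
};

struct selector_error decode_selector_error(uint32_t code)
{
    struct selector_error e;
    e.ext = code & 1u;
    e.idt = (code >> 1) & 1u;
    e.ti = (code >> 2) & 1u;
    e.index = (code >> 3) & 0x1fffu;
    return e;
}

/* Requested privilege level: the low two bits of a selector. */
unsigned selector_rpl(uint16_t selector)
{
    return selector & 3u;
}

/* Self-check against the values seen in the dumps (282 read as hex). */
void check_dump_values(void)
{
    struct selector_error e = decode_selector_error(0x282);
    assert(e.idt == 1 && e.index == 80);
    assert(selector_rpl(0xe033) == 3);
}
```

By the same helper, the e030 in the GPF dump carries RPL 0 and the e033 in the crash dump carries RPL 3.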
LRR - Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München D-85748 Garching
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33 3D80 457E 82AE B0D8 735B
Xen-devel mailing list