This is the revised version of the patch in my message:
Subject: [PATCH] fix gdb debugging of hypervisor
Date: Tue, 11 Dec 2007 17:41:52 +0000
Message-ID: <18270.52192.288867.395215@xxxxxxxxxxxxxxxxxxxxxxxx>
This patch:
* enables the gdbstubs to properly access hypervisor memory;
* prevents an assertion failure in __spurious_page_fault's call
to map_domain_page if such accesses fail, by testing in_irq();
* prints some additional helpful messages;
* fixes the endianness of register transfers from the gdbstubs
so that gdb is much less confused;
* fixes the documentation in docs/misc/crashdb.txt.
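To illustrate the endianness point: the gdb remote protocol expects register values as hex digits in target byte order, so on a little-endian target the least significant byte goes out first. The following is a minimal standalone sketch of that encoding, not the code from the patch itself. The names nibble_to_hex and encode_le_hex are hypothetical stand-ins invented for this illustration, and the sketch assumes the caller supplies a large enough buffer.

```c
#include <stdio.h>

/* Hypothetical stand-in for the stub's hex digit helper:
 * maps the low nibble of x to a lowercase hex character. */
static char nibble_to_hex(unsigned long x)
{
    return "0123456789abcdef"[x & 15];
}

/* Encode the low int_size bytes of x as hex, least significant byte
 * first (little-endian wire order). Assumes buf has room for at least
 * int_size * 2 + 1 characters. */
static void encode_le_hex(unsigned long x, int int_size, char *buf)
{
    int i = 0;
    int width = int_size * 2;

    while (i < width) {
        buf[i++] = nibble_to_hex(x >> 4); /* high nibble of current byte */
        buf[i++] = nibble_to_hex(x);      /* low nibble of current byte */
        x >>= 8;                          /* move on to the next byte */
    }
    buf[i] = '\0';
}
```

With this sketch, encode_le_hex(0x12345678UL, 4, buf) leaves "78563412" in buf, whereas emitting the most significant digit first would give "12345678", which a gdb expecting a little-endian target misreads.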
Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
Ian.
diff -r 38a45b7c6cb5 docs/misc/crashdb.txt
--- a/docs/misc/crashdb.txt Mon Dec 10 11:37:13 2007 +0000
+++ b/docs/misc/crashdb.txt Wed Dec 12 11:08:32 2007 +0000
@@ -5,31 +5,46 @@ you've crashed it, you get to poke aroun
you've crashed it, you get to poke around and find out why. There's
also a special key handler for making it crash, which is handy.
-You need to have crash_debug=y set when compiling to enable the crash
-debugger (so go ``export crash_debug=y; make'', or ``crash_debug=y
-make'' or ``make crash_debug=y''), and you also need to enable it on
-the Xen command line, by going e.g. cdb=com1. If you need to have a
-serial port shared between cdb and the console, try cdb=com1H. CDB
-will then set the high bit on every byte it sends, and only respond to
-bytes with the high bit set. Similarly for com2.
+You need to have crash_debug=y set when compiling, and you also need
+to enable it on the Xen command line, e.g. by gdb=com1.
-The next step depends on your individual setup. This is how to do
-it for a normal test box in the SRG:
+If you need to have a serial port shared between gdb and the console,
+you can use gdb=com1H. CDB will then set the high bit on every byte
+it sends, and only respond to bytes with the high bit set. Similarly
+for com2. If you do this you will need a demultiplexing program on
+the debugging workstation, such as perhaps tools/misc/nsplitd.
--- Make your test machine crash. Either a normal panic or hitting
- 'C-A C-A C-A %' on the serial console will do.
--- Start gdb as ``gdb ./xen-syms''
--- Go ``target remote serial.srg:12331'', where 12331 is the second port
- reported for that machine by xenuse. (In this case, the machine is
- bombjack)
--- Go ``add-symbol-file vmlinux''
--- Debug as if you had a core file
--- When you're finished, go and reboot your test box. Hitting 'R' on the
- serial console won't work.
+The next step depends on your individual setup. This is how to do it
+if you have a simple null modem connection between the test box and
+the workstation, and aren't using an H/L split console:
-At one stage, it was sometimes possible to resume after entering the
-debugger from the serial console. This seems to have rotted, however,
-and I'm not terribly interested in putting it back.
+ * Set debug=y in Config.mk
+ * Set crash_debug=y in xen/Rules.mk
+ * Make the changes in the attached patch, and build.
+ * Arrange to pass gdb=com1 as a hypervisor command line argument
+ (I already have com1=38400,8n1 console=com1,vga sync_console)
+
+ * Boot the system with minicom (or your favourite terminal program)
+ connected from your workstation via a null modem cable in the
+ usual way.
+ * In minicom, give the escape character (^A by default) three times
+ to talk to Xen (Xen prints `(XEN) *** Serial input -> Xen...').
+ * Press % and observe the messages
+ (XEN) '%' pressed -> trapping into debugger
+ (XEN) GDB connection activated.
+ (XEN) Waiting for GDB to attach...
+ * Disconnect from minicom without allowing minicom to send any
+ modem control sequences.
+ * Start gdb with gdb /path/to/build/tree/xen/xen-syms and then
+ (gdb) set remotebaud 38400
+ Remote debugging using /dev/ttyS0
+ 0xff124d61 in idle_loop () at domain.c:78
+ 78 safe_halt();
+ (gdb)
+
+There is code which was once intended to make it possible to resume
+after entering the debugger. However this does not presently work; it
+has been nonfunctional for quite some time.
As soon as you reach the debugger, we disable interrupts, the
watchdog, and every other CPU, so the state of the world shouldn't
@@ -44,7 +59,5 @@ Reasons why we might fail to reach the d
you're screwed.
-- If the page tables are wrong, you're screwed
-- If the serial port setup is wrong, badness happens
--- We acquire the console lock at one stage XXX this is unnecessary and
- stupid
-- Obviously, the low level processor state can be screwed in any
number of wonderful ways
diff -r 38a45b7c6cb5 xen/common/gdbstub.c
--- a/xen/common/gdbstub.c Mon Dec 10 11:37:13 2007 +0000
+++ b/xen/common/gdbstub.c Wed Dec 12 11:14:20 2007 +0000
@@ -43,6 +43,7 @@
#include <xen/smp.h>
#include <xen/console.h>
#include <xen/errno.h>
+#include <asm/byteorder.h>
/* Printk isn't particularly safe just after we've trapped to the
debugger. so avoid it. */
@@ -215,7 +216,7 @@ gdb_write_to_packet_hex(unsigned long x,
gdb_write_to_packet_hex(unsigned long x, int int_size, struct gdb_context *ctx)
{
char buf[sizeof(unsigned long) * 2 + 1];
- int i = sizeof(unsigned long) * 2;
+ int i;
int width = int_size * 2;
buf[sizeof(unsigned long) * 2] = 0;
@@ -233,6 +234,8 @@ gdb_write_to_packet_hex(unsigned long x,
break;
}
+#ifdef __BIG_ENDIAN
+ i = sizeof(unsigned long) * 2;
do {
buf[--i] = hex2char(x & 15);
x >>= 4;
@@ -242,6 +245,17 @@ gdb_write_to_packet_hex(unsigned long x,
buf[--i] = '0';
gdb_write_to_packet(&buf[i], width, ctx);
+#elif defined(__LITTLE_ENDIAN)
+ i = 0;
+ while (i < width) {
+ buf[i++] = hex2char(x>>4);
+ buf[i++] = hex2char(x);
+ x >>= 8;
+ }
+ gdb_write_to_packet(buf, width, ctx);
+#else
+# error unknown endian
+#endif
}
static int
@@ -512,7 +526,7 @@ __trap_to_gdb(struct cpu_user_regs *regs
if ( gdb_ctx->serhnd < 0 )
{
- dbg_printk("Debugger not ready yet.\n");
+ printk("Debugging connection not set up.\n");
return -EBUSY;
}
diff -r 38a45b7c6cb5 xen/common/keyhandler.c
--- a/xen/common/keyhandler.c Mon Dec 10 11:37:13 2007 +0000
+++ b/xen/common/keyhandler.c Wed Dec 12 11:12:36 2007 +0000
@@ -275,6 +275,7 @@ extern void perfc_reset(unsigned char ke
static void do_debug_key(unsigned char key, struct cpu_user_regs *regs)
{
+ printk("'%c' pressed -> trapping into debugger\n", key);
(void)debugger_trap_fatal(0xf001, regs);
nop(); /* Prevent the compiler doing tail call
optimisation, as that confuses xendbg a
diff -r 38a45b7c6cb5 xen/include/xen/gdbstub.h
--- a/xen/include/xen/gdbstub.h Mon Dec 10 11:37:13 2007 +0000
+++ b/xen/include/xen/gdbstub.h Wed Dec 12 11:12:36 2007 +0000
@@ -53,6 +53,7 @@ void gdb_write_to_packet(
const char *buf, int count, struct gdb_context *ctx);
void gdb_write_to_packet_hex(
unsigned long x, int int_size, struct gdb_context *ctx);
+ /* ... writes in target native byte order as required by gdb spec. */
void gdb_send_packet(struct gdb_context *ctx);
void gdb_send_reply(const char *buf, struct gdb_context *ctx);
diff -r 38a45b7c6cb5 xen/arch/x86/gdbstub.c
--- a/xen/arch/x86/gdbstub.c Mon Dec 10 11:37:13 2007 +0000
+++ b/xen/arch/x86/gdbstub.c Wed Dec 12 11:12:36 2007 +0000
@@ -72,17 +72,21 @@ gdb_arch_read_reg(unsigned long regnum,
}
/* Like copy_from_user, but safe to call with interrupts disabled.
- Trust me, and don't look behind the curtain. */
+ Trust me, and don't look behind the curtain.
+ We use the __ versions to skip the access_ok check, which
+ would otherwise prevent us from accessing hypervisor memory
+ (which is the main point, obviously).
+*/
unsigned int
gdb_arch_copy_from_user(void *dest, const void *src, unsigned len)
{
- return copy_from_user(dest, src, len);
+ return __copy_from_user(dest, src, len);
}
unsigned int
gdb_arch_copy_to_user(void *dest, const void *src, unsigned len)
{
- return copy_to_user(dest, src, len);
+ return __copy_to_user(dest, src, len);
}
void
diff -r 38a45b7c6cb5 xen/arch/x86/traps.c
--- a/xen/arch/x86/traps.c Mon Dec 10 11:37:13 2007 +0000
+++ b/xen/arch/x86/traps.c Wed Dec 12 11:12:36 2007 +0000
@@ -784,8 +784,8 @@ asmlinkage int do_invalid_op(struct cpu_
predicate = is_kernel(bug_str.str) ? (char *)bug_str.str : "<unknown>";
printk("Assertion '%s' failed at %.50s:%d\n",
predicate, filename, lineno);
+ show_execution_state(regs);
DEBUGGER_trap_fatal(TRAP_invalid_op, regs);
- show_execution_state(regs);
panic("Assertion '%s' failed at %.50s:%d\n",
predicate, filename, lineno);
@@ -913,6 +913,11 @@ static int __spurious_page_fault(
l2_pgentry_t l2e, *l2t;
l1_pgentry_t l1e, *l1t;
unsigned int required_flags, disallowed_flags;
+
+ /* We are not supposed to take any spurious page faults in IRQ
+ * handlers, and map_domain_page asserts !in_irq(), so just give up. */
+ if (in_irq())
+ return 0;
/* Reserved bit violations are never spurious faults. */
if ( regs->error_code & PFEC_reserved_bit )