Hi again Ian,
On 12/5/06, Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx> wrote:
Hey Magnus,
On Mon, 2006-12-04 at 13:35 +0900, Magnus Damm wrote:
> [PATCH 02/02] Kexec / Kdump: Don't declare _end
>
> _end is already declared in xen/include/asm/config.h, so don't declare
> it twice. This solves a powerpc/ia64 build problem where _end is declared
> as char _end[] compared to unsigned long _end on x86.
This change has broken x86 kdump :-( I think because you fixed a bug
with your change and thereby uncovered another latent bug.
Yes, you are right. Thanks for noticing and cooking up a fix.
Before the range->size returned from kexec_get_xen() was 1/4 of the
correct value because you were subtracting unsigned long * pointers so
size was the number of words not the number of bytes as expected. After
this change we are now subtracting unsigned longs so the correct value
is returned.
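To make the words-versus-bytes point concrete, here is a minimal standalone sketch. It is not the Xen code: the array `image` and both helper names are made up for illustration. Subtracting two `unsigned long *` values yields an element count, while subtracting the same addresses as integers yields bytes, so the two differ by a factor of sizeof(unsigned long) (4 on 32-bit x86, which matches the 1/4 above).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the region between the start and end symbols. */
static unsigned long image[64];

/* Old behaviour: both operands are unsigned long *, so the
 * subtraction yields a count of unsigned long elements (words). */
size_t size_from_pointers(const unsigned long *start, const unsigned long *end)
{
    return (size_t)(end - start);
}

/* New behaviour: both operands are plain integers, so the
 * subtraction yields a count of bytes. */
size_t size_from_integers(uintptr_t start, uintptr_t end)
{
    return (size_t)(end - start);
}
```

Called on the same pair of addresses, the second helper returns sizeof(unsigned long) times what the first one returns.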
This seems to have caused the crash notes to disappear from /proc/iomem:
Before:
00100000-def7efff : System RAM
00100000-001397bf : Hypervisor code and data
00193000-001930f7 : Crash note
00194000-001940f7 : Crash note
02000000-05ffffff : Crash kernel
After:
00100000-def7efff : System RAM
00100000-001e5eff : Hypervisor code and data
02000000-05ffffff : Crash kernel
I presume they went missing because "Hypervisor code and data" now
overlaps the notes.
Your reasoning makes sense; this does look like an overlap problem.
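The overlap theory can be illustrated with a toy resource tree. This is only a sketch under assumptions: `struct res` and `res_request()` are hypothetical and much simpler than the kernel's real resource code, and I am assuming the crash notes used to be requested as siblings of the hypervisor resource. The idea is that a request fails when the new range overlaps an existing child of the chosen parent, but succeeds when it is nested under the resource that contains it.

```c
#include <stddef.h>

/* Toy model of a resource tree; NOT the kernel's struct resource. */
struct res {
    unsigned long start, end;   /* inclusive address range */
    const char *name;
    struct res *parent, *child, *sibling;
};

/* Insert new_res under parent. Returns 0 on success, or -1 if the range
 * is invalid, lies outside parent, or overlaps an existing child. */
int res_request(struct res *parent, struct res *new_res)
{
    struct res **p;

    if (new_res->start > new_res->end ||
        new_res->start < parent->start || new_res->end > parent->end)
        return -1;

    /* Children are kept sorted by start address. */
    for (p = &parent->child; *p; p = &(*p)->sibling) {
        if ((*p)->start > new_res->end)
            break;          /* new range fits entirely before *p */
        if ((*p)->end >= new_res->start)
            return -1;      /* overlaps an existing child */
    }

    new_res->sibling = *p;
    new_res->parent = parent;
    *p = new_res;
    return 0;
}
```

With the "After" ranges, a crash note at 00193000-001930f7 collides with the hypervisor resource 00100000-001e5eff when both are requested under System RAM, but the same note can be requested with the hypervisor resource as its parent.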
For some reason this has broken kdump for me (on x86_32p). The kdump
kernel gives this stack trace and then hangs a little later on:
general protection fault: 0000 [#1]
Modules linked in:
CPU: 0
EIP: 0060:[<c204954d>] Not tainted VLI
EFLAGS: 00010002 (2.6.16.33-x86_32p-kdump #17)
EIP is at free_block+0x6d/0x100
eax: 00000000 ebx: ffffffff ecx: ffffffff edx: c2455000
esi: 00000001 edi: c5f22540 ebp: c253bef0 esp: c253bed8
ds: 007b es: 007b ss: 0068
Process events/0 (pid: 4, threadinfo=c253a000 task=c5f0aa70)
Stack: <0>00000001 c5e71210 00000000 c5e71210 00000001 c5e71200 c253bf14 c2049625
00000000 c234b100 c5f0aa70 c5f22540 c5f22588 c5f22540 c257a4c0 c253bf34
c204a796 00000000 00000086 00000000 c242d364 c5f51680 00000296 c253bf64
Call Trace:
[<c2003685>] show_stack_log_lvl+0xc5/0xf0
[<c2003847>] show_registers+0x197/0x220
[<c20039ae>] die+0xde/0x210
[<c20048fe>] do_general_protection+0xee/0x1a0
[<c200310f>] error_code+0x4f/0x54
[<c2049625>] drain_array_locked+0x45/0xa0
[<c204a796>] cache_reap+0x66/0x130
[<c2021456>] run_workqueue+0x66/0xd0
[<c2021a08>] worker_thread+0x138/0x160
[<c202461f>] kthread+0xaf/0xe0
[<c2001005>] kernel_thread_helper+0x5/0x10
I have not investigated your stack trace, but without your patch there
are no crash notes in /proc/iomem, which leads to a vmcore without
PT_NOTE. This may trigger all sorts of errors in the secondary kernel,
but I'm not sure exactly which.
I changed xen_machine_kexec_register_resources() on the Linux side to
correctly nest the crash note resources under the xen resource which has
fixed things for me. Does the change below make sense to you? If so I'll
commit.
The patch looks good, please commit. I've tested it and the crash
notes now show up in /proc/iomem as expected. Thank you.
As a secondary point, perhaps the hypervisor resource should go all the
way to the end of the Xen heap (xenheap_phys_end I think) rather than
just the end of .data/.bss?
Sounds like a good idea. I'm currently thinking about how to pass
virtual addresses from the hypervisor down to userspace so I can
modify kexec-tools to use proper virtual addresses for the PT_LOAD
program headers. I need the virtual address to make use of the
hypervisor resource, so they are sort of connected together. Anyway,
using the end of the Xen heap sounds like a step in the right
direction.
Many thanks,
/ magnus
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel