On Thu, Feb 25, 2010 at 11:10:45AM -0800, Jeremy Fitzhardinge wrote:
> On 02/25/2010 11:04 AM, Pasi Kärkkäinen wrote:
>> Hello,
>>
>> I just built and tried to boot upstream kernel.org Linux 2.6.33 kernel
>> as Xen PV domU, but that doesn't get very far:
>>
>> http://pasik.reaktio.net/xen/debug/bootlog-linux-2.6.33-xen-pv-domu-x86_64-crash.txt
>>
>
> Try the attached patch.
>
Yep, this patch fixes the problem, boots OK now.
Thanks! Now some save/restore testing..
-- Pasi
> J
>
>>
>> Freeing unused kernel memory: 1544k freed
>> Write protecting the kernel read-only data: 10240k
>> Freeing unused kernel memory: 1764k freed
>> BUG: unable to handle kernel paging request at ffff880001447000
>> IP: [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
>> PGD 1a3c067 PUD 1a40067 PMD 138d5067 PTE 10000001447025
>> Oops: 0003 [#1] SMP
>> last sysfs file:
>> CPU 3
>> Pid: 1, comm: swapper Not tainted 2.6.33 #1 /
>> RIP: e030:[<ffffffff8102e9f2>] [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
>> RSP: e02b:ffff88007dfdbe60 EFLAGS: 00010286
>> RAX: 00000000cccccccc RBX: ffff880001600000 RCX: 0000000000000400
>> RDX: ffff880001447000 RSI: 0000000000000000 RDI: ffff880001447000
>> RBP: ffff88007dfdbe90 R08: 0000000000000000 R09: ffff88007fc04000
>> R10: ffff88007fc04000 R11: 0000000000100000 R12: ffff880001447000
>> R13: 0000000000000400 R14: ffffea0000000000 R15: 00000000cccccccc
>> FS: 0000000000000000(0000) GS:ffff8800139d6000(0000) knlGS:0000000000000000
>> CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: ffff880001447000 CR3: 0000000001a3b000 CR4: 0000000000002620
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
>> Process swapper (pid: 1, threadinfo ffff88007dfda000, task ffff88007dfe0000)
>> Stack:
>> 0000000000000000 ffff880000000000 6db6db6db6db6db7 ffffffff81a00000
>> <0> 0000000000a00000 0000000000000000 ffff88007dfdbec0 ffffffff8102ed73
>> <0> ffffffff81c6aa38 ffffffff81aefdf0 0000000000000100 0000000000000100
>> Call Trace:
>> [<ffffffff8102ed73>] mark_rodata_ro+0xea/0x151
>> [<ffffffff810021b9>] init_post+0x30/0x113
>> [<ffffffff81b0f715>] kernel_init+0x1c3/0x1d2
>> [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
>> [<ffffffff81009e91>] ? int_ret_from_sys_call+0x7/0x1b
>> [<ffffffff8143ae1d>] ? retint_restore_args+0x5/0x6
>> [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10
>> Code: cd 47 00 00 48 c1 e8 0c 4c 89 e2 4c 89 e9 48 6b c0 38 48 81 e2 00 f0
>> ff ff 31 f6 48 89 d7 4c 01 f0 c7 40 08 01 00 00 00 44 89 f8 <f3> ab 4c 89 e7
>> 49 81 c4 00 10 00 00 e8 bc ca 09 00 48 ff 05 16
>> RIP [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
>> RSP <ffff88007dfdbe60>
>> CR2: ffff880001447000
>> ---[ end trace 6e676731d52211fa ]---
>> Kernel panic - not syncing: Attempted to kill init!
>> Pid: 1, comm: swapper Tainted: G D 2.6.33 #1
>> Call Trace:
>> [<ffffffff81438663>] panic+0x7a/0x13d
>> [<ffffffff81057609>] ? exit_ptrace+0xa1/0x121
>> [<ffffffff8105074d>] do_exit+0x7a/0x6f3
>> [<ffffffff8104d15d>] ? spin_unlock_irqrestore+0xe/0x10
>> [<ffffffff8104dd76>] ? kmsg_dump+0x12b/0x145
>> [<ffffffff8143bc31>] oops_end+0xbf/0xc7
>> [<ffffffff8102f901>] no_context+0x1fc/0x20b
>> [<ffffffff8102fa94>] __bad_area_nosemaphore+0x184/0x1a7
>> [<ffffffff81004399>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
>> [<ffffffff8102faca>] bad_area_nosemaphore+0x13/0x15
>> [<ffffffff8143d663>] do_page_fault+0x14f/0x2a0
>> [<ffffffff8143b0b5>] page_fault+0x25/0x30
>> [<ffffffff8102e9f2>] ? free_init_pages+0xb2/0xdb
>> [<ffffffff8102ed73>] mark_rodata_ro+0xea/0x151
>> [<ffffffff810021b9>] init_post+0x30/0x113
>> [<ffffffff81b0f715>] kernel_init+0x1c3/0x1d2
>> [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
>> [<ffffffff81009e91>] ? int_ret_from_sys_call+0x7/0x1b
>> [<ffffffff8143ae1d>] ? retint_restore_args+0x5/0x6
>> [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10
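For anyone reading the trace above: the error code and PTE value look consistent with a kernel-mode write to a page that is present but mapped read-only. Below is a minimal sketch of that decoding, assuming the standard x86 page-fault error code and PTE bit layout. The macro and function names are made up for illustration and are not the kernel's own; only the two constants in check_reported_oops() come from the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural layout). */
#define PF_PROT  (1u << 0)  /* 1: protection violation, 0: page not present */
#define PF_WRITE (1u << 1)  /* 1: the faulting access was a write */
#define PF_USER  (1u << 2)  /* 1: the fault happened in user mode */

/* Low x86 page-table entry flag bits (architectural layout). */
#define PTE_PRESENT (1ull << 0)
#define PTE_RW      (1ull << 1)

/* True when the fault is a kernel-mode write to a present page. */
static bool is_kernel_write_protection_fault(unsigned int error_code)
{
	return (error_code & PF_PROT) && (error_code & PF_WRITE) &&
	       !(error_code & PF_USER);
}

/* True when the entry maps a page but does not allow writes. */
static bool pte_is_present_readonly(uint64_t pte)
{
	return (pte & PTE_PRESENT) && !(pte & PTE_RW);
}

/* Values copied from "Oops: 0003" and "PTE 10000001447025" in the log. */
static void check_reported_oops(void)
{
	assert(is_kernel_write_protection_fault(0x0003));
	assert(pte_is_present_readonly(0x10000001447025ull));
}
```

Calling check_reported_oops() passes both assertions, which matches the faulting "rep stos" in free_init_pages() trying to write through a mapping that has the RW bit clear.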
>>
>> -- Pasi
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel
>>
>>
>
> From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>,
> "hpa@xxxxxxxxx" <hpa@xxxxxxxxx>,
> "rostedt@xxxxxxxxxxx" <rostedt@xxxxxxxxxxx>,
> "jeremy@xxxxxxxx" <jeremy@xxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>,
> Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Date: Thu, 18 Feb 2010 11:51:40 -0800
> Subject: Re: [LKML] Re: [PATCH] x86_64: allow sections that are recycled to
> set _PAGE_RW
>
> On Tue, 2010-02-16 at 14:13 -0800, Konrad Rzeszutek Wilk wrote:
> > On Sat, Feb 13, 2010 at 12:08:17PM -0800, Suresh Siddha wrote:
> > > The checks in static_protections() for kernel text mapping ensure that
> > > we don't break the 2MB kernel text pages unnecessarily on 64bit kernels
> > > (as it has performance implications). We should be fine as long as the
> > > kernel identity mappings reflect the correct RW permissions.
> > >
> > > But somehow this is working fine on native kernels but not on Xen pv
> > > guest. Your patch will cause the performance issues that we are
> >
> > That would not be good.
> >
> > > addressing using the static protections checks. I will look at this in
> > > more detail on Tuesday.
> >
> > Great. Thank you for doing that. If you find yourself in a bind, here are
> > some steps on how to build the Xen pv-ops kernel and such:
> > http://wiki.xensource.com/xenwiki/XenParavirtOps
> >
> > It goes without saying that I would be happy to test your patch when
> > you have one ready.
>
> x86 folks, can you please queue the appended patch? If you think it is
> too late for 2.6.33, I added a "cc: stable", so that they can pick this
> up for both .32 and .33. Thanks.
> ---
>
> From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> Subject: x86_64, cpa: don't work hard in preserving kernel text 2M mapping
> when using 4K already
>
> We currently enforce the !RW mapping for the kernel mapping that maps
> holes between different text, rodata and data sections. However, kernel
> identity mappings will have different RWX permissions for the pages mapping
> the text and for the pages padding the text and rodata sections (which are
> freed). Hence kernel identity mappings will be broken to smaller pages. For
> 64-bit, kernel text and kernel identity mappings are different, so we can
> enable protection checks that come with CONFIG_DEBUG_RODATA, as well as retain
> 2MB large page mappings for kernel text.
>
> Konrad reported a boot failure with the Linux Xen paravirt guest because of
> this. In this paravirt guest case, the kernel text mapping and the kernel
> identity mapping share the same page-table pages. Thus forcing the !RW mapping
> for some of the kernel mappings also causes the kernel identity mappings to be
> read-only, resulting in the boot failure. The Linux Xen paravirt guest also
> uses 4k mappings and doesn't use 2M mappings.
>
> Fix this issue and retain large page performance advantage for native kernels
> by not working hard and not enforcing !RW for the kernel text mapping,
> if the current mapping is already using small page mapping.
>
> Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxx [2.6.32, 2.6.33]
> ---
>
> index 1d4eb93..cf07c26 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -291,8 +291,29 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
> */
> if (kernel_set_to_readonly &&
> within(address, (unsigned long)_text,
> - (unsigned long)__end_rodata_hpage_align))
> - pgprot_val(forbidden) |= _PAGE_RW;
> + (unsigned long)__end_rodata_hpage_align)) {
> + unsigned int level;
> +
> + /*
> + * Don't enforce the !RW mapping for the kernel text mapping,
> + * if the current mapping is already using small page mapping.
> + * No need to work hard to preserve large page mappings in this
> + * case.
> + *
> + * This also fixes the Linux Xen paravirt guest boot failure
> + * (because of unexpected read-only mappings for kernel identity
> + * mappings). In this paravirt guest case, the kernel text
> + * mapping and the kernel identity mapping share the same
> + * page-table pages. Thus we can't really use different
> + * protections for the kernel text and identity mappings. Also,
> + * these shared mappings are made of small page mappings.
> + * Thus this "don't enforce !RW mapping for small page kernel
> + * text mapping" logic will help the Linux Xen paravirt guest boot
> + * as well.
> + */
> + if (lookup_address(address, &level) && (level != PG_LEVEL_4K))
> + pgprot_val(forbidden) |= _PAGE_RW;
> + }
> #endif
>
> prot = __pgprot(pgprot_val(prot) & ~pgprot_val(forbidden));
>
>
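To make the effect of the patch above easier to see outside the kernel tree, here is a small user-space model of the changed check. It is only a sketch under stated assumptions: forbidden_bits(), struct text_range and the `mapped` flag are hypothetical stand-ins (the real code calls lookup_address() and compares against _text and __end_rodata_hpage_align), and the address range in the demo is an arbitrary example, not taken from a real System.map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_RW (1ull << 1)  /* position of the x86 _PAGE_RW bit */

/* Same ordering as the kernel's enum pg_level for the levels used here. */
enum pg_level { PG_LEVEL_NONE, PG_LEVEL_4K, PG_LEVEL_2M, PG_LEVEL_1G };

/* Hypothetical stand-in for the [_text, __end_rodata_hpage_align) range. */
struct text_range { uint64_t start, end; };

static bool within(uint64_t addr, uint64_t start, uint64_t end)
{
	return addr >= start && addr < end;
}

/*
 * Returns the protection bits that must be cleared for addr.
 * `mapped` models lookup_address() returning a non-NULL entry.
 */
static uint64_t forbidden_bits(bool kernel_set_to_readonly, uint64_t addr,
			       struct text_range r, bool mapped,
			       enum pg_level level)
{
	uint64_t forbidden = 0;

	if (kernel_set_to_readonly && within(addr, r.start, r.end)) {
		/* Only mappings larger than 4K keep the forced !RW. */
		if (mapped && level != PG_LEVEL_4K)
			forbidden |= PAGE_RW;
	}
	return forbidden;
}

/* Example: same address, large-page mapping versus 4K mapping. */
static void demo_patched_check(void)
{
	struct text_range r = { 0xffffffff81000000ull, 0xffffffff81a00000ull };
	uint64_t addr = 0xffffffff81447000ull;

	assert(forbidden_bits(true, addr, r, true, PG_LEVEL_2M) == PAGE_RW);
	assert(forbidden_bits(true, addr, r, true, PG_LEVEL_4K) == 0);
}
```

In this model a 2M mapping inside the range still has RW stripped, while a 4K mapping (the Xen PV case described in the commit message) is left alone, so the shared page-table entries are not forced read-only.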