Re: [Xen-devel] Linux 2.6.33 crashes on boot as Xen PV domU

To: Pasi Kärkkäinen <pasik@xxxxxx>
Subject: Re: [Xen-devel] Linux 2.6.33 crashes on boot as Xen PV domU
From: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Date: Thu, 25 Feb 2010 11:10:45 -0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Ian Campbell <Ian.Campbell@xxxxxxxxxx>
Delivery-date: Thu, 25 Feb 2010 11:11:17 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20100225190457.GC2761@xxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <20100225190457.GC2761@xxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7) Gecko/20100120 Fedora/3.0.1-1.fc12 Lightning/1.0b2pre Thunderbird/3.0.1
On 02/25/2010 11:04 AM, Pasi Kärkkäinen wrote:
Hello,

I just built and tried to boot upstream kernel.org Linux 2.6.33 kernel
as Xen PV domU, but that doesn't get very far:

http://pasik.reaktio.net/xen/debug/bootlog-linux-2.6.33-xen-pv-domu-x86_64-crash.txt

Try the attached patch.

    J


Freeing unused kernel memory: 1544k freed
Write protecting the kernel read-only data: 10240k
Freeing unused kernel memory: 1764k freed
BUG: unable to handle kernel paging request at ffff880001447000
IP: [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
PGD 1a3c067 PUD 1a40067 PMD 138d5067 PTE 10000001447025
Oops: 0003 [#1] SMP
last sysfs file:
CPU 3
Pid: 1, comm: swapper Not tainted 2.6.33 #1 /
RIP: e030:[<ffffffff8102e9f2>]  [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
RSP: e02b:ffff88007dfdbe60  EFLAGS: 00010286
RAX: 00000000cccccccc RBX: ffff880001600000 RCX: 0000000000000400
RDX: ffff880001447000 RSI: 0000000000000000 RDI: ffff880001447000
RBP: ffff88007dfdbe90 R08: 0000000000000000 R09: ffff88007fc04000
R10: ffff88007fc04000 R11: 0000000000100000 R12: ffff880001447000
R13: 0000000000000400 R14: ffffea0000000000 R15: 00000000cccccccc
FS:  0000000000000000(0000) GS:ffff8800139d6000(0000) knlGS:0000000000000000
CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff880001447000 CR3: 0000000001a3b000 CR4: 0000000000002620
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
Process swapper (pid: 1, threadinfo ffff88007dfda000, task ffff88007dfe0000)
Stack:
  0000000000000000 ffff880000000000 6db6db6db6db6db7 ffffffff81a00000
<0>  0000000000a00000 0000000000000000 ffff88007dfdbec0 ffffffff8102ed73
<0>  ffffffff81c6aa38 ffffffff81aefdf0 0000000000000100 0000000000000100
Call Trace:
  [<ffffffff8102ed73>] mark_rodata_ro+0xea/0x151
  [<ffffffff810021b9>] init_post+0x30/0x113
  [<ffffffff81b0f715>] kernel_init+0x1c3/0x1d2
  [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
  [<ffffffff81009e91>] ? int_ret_from_sys_call+0x7/0x1b
  [<ffffffff8143ae1d>] ? retint_restore_args+0x5/0x6
  [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10
Code: cd 47 00 00 48 c1 e8 0c 4c 89 e2 4c 89 e9 48 6b c0 38 48 81 e2 00 f0 ff ff 31 f6 48 89 d7 4c 01 f0 c7 40 08 01 00 00 00 44 89 f8 <f3> ab 4c 89 e7 49 81 c4 00 10 00 00 e8 bc ca 09 00 48 ff 05 16
RIP  [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
 RSP <ffff88007dfdbe60>
CR2: ffff880001447000
---[ end trace 6e676731d52211fa ]---
Kernel panic - not syncing: Attempted to kill init!
Pid: 1, comm: swapper Tainted: G      D    2.6.33 #1
Call Trace:
  [<ffffffff81438663>] panic+0x7a/0x13d
  [<ffffffff81057609>] ? exit_ptrace+0xa1/0x121
  [<ffffffff8105074d>] do_exit+0x7a/0x6f3
  [<ffffffff8104d15d>] ? spin_unlock_irqrestore+0xe/0x10
  [<ffffffff8104dd76>] ? kmsg_dump+0x12b/0x145
  [<ffffffff8143bc31>] oops_end+0xbf/0xc7
  [<ffffffff8102f901>] no_context+0x1fc/0x20b
  [<ffffffff8102fa94>] __bad_area_nosemaphore+0x184/0x1a7
  [<ffffffff81004399>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
  [<ffffffff8102faca>] bad_area_nosemaphore+0x13/0x15
  [<ffffffff8143d663>] do_page_fault+0x14f/0x2a0
  [<ffffffff8143b0b5>] page_fault+0x25/0x30
  [<ffffffff8102e9f2>] ? free_init_pages+0xb2/0xdb
  [<ffffffff8102ed73>] mark_rodata_ro+0xea/0x151
  [<ffffffff810021b9>] init_post+0x30/0x113
  [<ffffffff81b0f715>] kernel_init+0x1c3/0x1d2
  [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
  [<ffffffff81009e91>] ? int_ret_from_sys_call+0x7/0x1b
  [<ffffffff8143ae1d>] ? retint_restore_args+0x5/0x6
  [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10

-- Pasi
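
For anyone reading the oops above, here is a minimal sketch in user-space C
of how the "Oops: 0003" error code and the low flag bits of the reported PTE
value can be decoded. The bit positions are the architectural x86 ones. The
macro and function names are made up for this illustration and are not
kernel identifiers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* x86 page-fault error code bits (architectural meaning). */
#define OOPS_PF_PROT  (1u << 0) /* 1 = protection violation, 0 = not present */
#define OOPS_PF_WRITE (1u << 1) /* 1 = the faulting access was a write */
#define OOPS_PF_USER  (1u << 2) /* 1 = the access came from user mode */

/* Low flag bits of an x86 page-table entry (architectural meaning). */
#define OOPS_PTE_PRESENT (1ull << 0)
#define OOPS_PTE_RW      (1ull << 1)

/*
 * Hypothetical helper: true when the error code and the PTE together
 * describe a write to a page that is mapped but not writable.
 */
static bool oops_is_write_to_readonly(unsigned int err, uint64_t pte)
{
	return (err & OOPS_PF_PROT) && (err & OOPS_PF_WRITE) &&
	       (pte & OOPS_PTE_PRESENT) && !(pte & OOPS_PTE_RW);
}

#ifdef OOPS_DECODE_DEMO
int main(void)
{
	unsigned int err = 0x0003;          /* from "Oops: 0003" */
	uint64_t pte = 0x10000001447025ull; /* from "PTE 10000001447025" */

	printf("protection violation: %s\n", (err & OOPS_PF_PROT) ? "yes" : "no");
	printf("write access:         %s\n", (err & OOPS_PF_WRITE) ? "yes" : "no");
	printf("user mode:            %s\n", (err & OOPS_PF_USER) ? "yes" : "no");
	printf("PTE present:          %s\n", (pte & OOPS_PTE_PRESENT) ? "yes" : "no");
	printf("PTE writable:         %s\n", (pte & OOPS_PTE_RW) ? "yes" : "no");
	printf("write to read-only:   %s\n",
	       oops_is_write_to_readonly(err, pte) ? "yes" : "no");
	return 0;
}
#endif
```

Building with -DOOPS_DECODE_DEMO adds a small main() that runs the decoder on
the values from the log. Read this way, the fault is a kernel-mode write to a
page that is present but mapped without the RW bit.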


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


--- Begin Message ---
To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Subject: Re: [LKML] Re: [PATCH] x86_64: allow sections that are recycled to set _PAGE_RW
From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Date: Thu, 18 Feb 2010 11:51:40 -0800
Cc: "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>, "hpa@xxxxxxxxx" <hpa@xxxxxxxxx>, "rostedt@xxxxxxxxxxx" <rostedt@xxxxxxxxxxx>, "jeremy@xxxxxxxx" <jeremy@xxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>, Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Delivered-to: jeremy@xxxxxxxx
In-reply-to: <20100216221313.GA22869@xxxxxxxxxxxxxxxxxxx>
Organization: Intel Corp
References: <1266030928-2126-1-git-send-email-konrad.wilk@xxxxxxxxxx> <1266030928-2126-2-git-send-email-konrad.wilk@xxxxxxxxxx> <1266091697.2677.64.camel@sbs-t61> <20100216221313.GA22869@xxxxxxxxxxxxxxxxxxx>
Reply-to: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
On Tue, 2010-02-16 at 14:13 -0800, Konrad Rzeszutek Wilk wrote:
> On Sat, Feb 13, 2010 at 12:08:17PM -0800, Suresh Siddha wrote:
> > The checks in static_protections() for kernel text mapping ensure that
> > we don't break the 2MB kernel text pages unnecessarily on 64bit kernels
> > (as it has performance implications). We should be fine as long as the
> > kernel identity mappings reflect the correct RW permissions.
> > 
> > But somehow this is working fine on native kernels but not on Xen pv
> > guest. Your patch will cause the performance issues that we are
> 
> That would not be good.
> 
> > addressing using the static protections checks. I will look at this in more
> > detail on Tuesday.
> 
> Great. Thank you for doing that. If you find yourself in a bind, here are
> some steps on how to build the Xen pv-ops kernel and such:
> http://wiki.xensource.com/xenwiki/XenParavirtOps
> 
> It goes without saying that I would be happy to test your patch when
> you have one ready.

x86 folks, can you please queue the appended patch? If you think it is
too late for 2.6.33, I added a "cc: stable", so that they can pick this
up for both .32 and .33. Thanks.
---

From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Subject: x86_64, cpa: don't work hard in preserving kernel text 2M mapping when using 4K already

We currently enforce the !RW mapping for the kernel mapping that maps
holes between the text, rodata and data sections. However, the kernel
identity mappings will have different RWX permissions for the pages that map
the text and for the padding pages (which are freed) around the text and
rodata sections. Hence the kernel identity mappings will be broken into
smaller pages. For 64-bit, kernel text and kernel identity mappings are
different, so we can enable protection checks that come with
CONFIG_DEBUG_RODATA, as well as retain 2MB large page mappings for kernel
text.

Konrad reported a boot failure with the Linux Xen paravirt guest because of
this. In this paravirt guest case, the kernel text mapping and the kernel
identity mapping share the same page-table pages. Thus forcing the !RW mapping
for some of the kernel mappings also causes the kernel identity mappings to be
read-only, resulting in the boot failure. The Linux Xen paravirt guest also
uses 4K mappings and doesn't use 2M mappings.

Fix this issue and retain large page performance advantage for native kernels
by not working hard and not enforcing !RW for the kernel text mapping,
if the current mapping is already using small page mapping.

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Cc: stable@xxxxxxxxxx   [2.6.32, 2.6.33]
---

index 1d4eb93..cf07c26 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -291,8 +291,29 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
         */
        if (kernel_set_to_readonly &&
            within(address, (unsigned long)_text,
-                  (unsigned long)__end_rodata_hpage_align))
-               pgprot_val(forbidden) |= _PAGE_RW;
+                  (unsigned long)__end_rodata_hpage_align)) {
+               unsigned int level;
+
+               /*
+                * Don't enforce the !RW mapping for the kernel text mapping,
+                * if the current mapping is already using small page mapping.
+                * No need to work hard to preserve large page mappings in this
+                * case.
+                *
+                * This also fixes the Linux Xen paravirt guest boot failure
+                * (because of unexpected read-only mappings for kernel identity
+                * mappings). In this paravirt guest case, the kernel text
+                * mapping and the kernel identity mapping share the same
+                * page-table pages. Thus we can't really use different
+                * protections for the kernel text and identity mappings. Also,
+                * these shared mappings are made of small page mappings.
+                * Thus the logic of not enforcing the !RW mapping for small page
+                * kernel text mappings will help the Linux Xen paravirt guest
+                * boot as well.
+                */
+               if (lookup_address(address, &level) && (level != PG_LEVEL_4K))
+                       pgprot_val(forbidden) |= _PAGE_RW;
+       }
 #endif
 
        prot = __pgprot(pgprot_val(prot) & ~pgprot_val(forbidden));



--- End Message ---
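
The decision the patch adds to static_protections() can be modelled outside
the kernel. The following is a minimal user-space sketch, not kernel code:
the struct, enum and function names are hypothetical, and the lookup result
is passed in as plain data instead of coming from lookup_address(). Only the
RW bit position (0x002) matches the real x86 flag.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MODEL_PAGE_RW 0x002ull /* same bit position as x86 _PAGE_RW */

/* Hypothetical stand-in for the page-table level reported by a lookup. */
enum model_level { MODEL_LEVEL_NONE, MODEL_LEVEL_4K, MODEL_LEVEL_2M };

/* Hypothetical stand-in for "lookup succeeded, and at which level". */
struct model_mapping {
	bool present;
	enum model_level level;
};

/*
 * Mirrors the patched condition: forbid RW for the kernel text range only
 * when the address is currently mapped with something other than a 4K page.
 */
static uint64_t model_forbidden_bits(bool kernel_set_to_readonly,
				     bool addr_in_text_range,
				     struct model_mapping m)
{
	uint64_t forbidden = 0;

	if (kernel_set_to_readonly && addr_in_text_range) {
		if (m.present && m.level != MODEL_LEVEL_4K)
			forbidden |= MODEL_PAGE_RW;
	}
	return forbidden;
}

/* Same masking step as the unchanged last line of the hunk. */
static uint64_t model_apply(uint64_t prot, uint64_t forbidden)
{
	return prot & ~forbidden;
}

#ifdef CPA_MODEL_DEMO
int main(void)
{
	struct model_mapping large = { true, MODEL_LEVEL_2M };
	struct model_mapping small = { true, MODEL_LEVEL_4K };
	uint64_t prot = 0x003; /* present + RW requested */

	printf("2M mapping: prot 0x%03llx\n", (unsigned long long)
	       model_apply(prot, model_forbidden_bits(true, true, large)));
	printf("4K mapping: prot 0x%03llx\n", (unsigned long long)
	       model_apply(prot, model_forbidden_bits(true, true, small)));
	return 0;
}
#endif
```

Building with -DCPA_MODEL_DEMO adds a small main() that compares the two
cases. In this model a 2M mapping in the text range still loses RW, as on
native kernels, while a range already mapped with 4K pages keeps whatever
protection was requested, which is the Xen PV guest case described in the
changelog.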