To: Peter Sandin <psandin@xxxxxxxxxx>
Subject: Re: [Xen-devel] 2.6.38 x86_64 domU null pointer in xennet_alloc_rx_buffers
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Tue, 12 Apr 2011 17:06:26 -0400
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Tue, 12 Apr 2011 14:08:16 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <875AC862-8CFC-4583-8BDC-45ECE189DE53@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <875AC862-8CFC-4583-8BDC-45ECE189DE53@xxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Apr 12, 2011 at 11:58:35AM -0400, Peter Sandin wrote:
> 
> We've got some 64 bit guests that have been trying to dereference a null 
> pointer in xennet_alloc_rx_buffers. We have only been receiving reports of 
> this issue since introducing 2.6.38 guest kernels. The only reports that we 
> have received of this are on guests that are running 64 bit kernels. These 
> reports have come from multiple separate physical machines. One of the 
> instances that ran into this issue was repeatedly restarting the nginx web 
> server, which kept failing because port 80 was already in use; however, we were 
> unable to replicate the issue using this method in a controlled environment. 
> Any suggestions on replicating or resolving this issue would be 
> appreciated.

> 
> More traces, the .config and kernel binary can be found at:
> 
> http://thesandins.net/xen/2.6.38-x86_64/

Nothing in the Xen hypervisor console?

> 
> --
> 
> BUG: Bad page state in process swapper  pfn:5bb31
> page:ffffea000140f2b8 count:-1 mapcount:0 mapping:          (null) index:0xffff88005b8bdf80
> page flags: 0x100000000000000()
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9

So it looks as if it just does an alloc_page(), and alloc_page() does a
check_new_page(), which checks the values mentioned above. The odd one is
page->_count (it should be zero, but it is -1).

.. which sadly does not get us any closer to reproducing this. But it
does look familiar..
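
For reference, the check that trips here, check_new_page() in mm/page_alloc.c,
looks roughly like this (a sketch paraphrased from memory, not a verbatim copy
of the 2.6.38 source):

/*
 * Sketch only: when alloc_page() pulls a page off the buddy free lists it
 * sanity-checks that nobody still holds it -- no mapcount, no address_space
 * mapping, a reference count of zero, and no leftover state flags.  The trace
 * above shows _count == -1 on such a page, so this check fires and the kernel
 * prints "BUG: Bad page state".
 */
static int check_new_page(struct page *page)
{
	if (unlikely(page_mapcount(page) ||
		     page->mapping != NULL ||
		     atomic_read(&page->_count) != 0 ||
		     (page->flags & PAGE_FLAGS_CHECK_AT_PREP))) {
		bad_page(page);		/* dumps the page state, taints the kernel */
		return 1;		/* caller discards this page */
	}
	return 0;
}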

> PGD 7bacb067 PUD 7b930067 PMD 0 
> Oops: 0002 [#1] SMP 
> last sysfs file: /sys/kernel/uevent_seqnum
> CPU 0 
> Modules linked in:
> 
> Pid: 0, comm: swapper Not tainted 2.6.38-x86_64-linode17 #1  
> RIP: e030:[<ffffffff81370b27>]  [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9
> RSP: e02b:ffff88007ff7fcf0  EFLAGS: 00010202
> RAX: 0000000000000000 RBX: ffff88007bfa85c0 RCX: 0000000000000000
> RDX: ffff88007d36bf00 RSI: ffff88007b309400 RDI: ffff88007b309400
> RBP: ffff88007ff7fd50 R08: 0000000000000000 R09: 000000000007195a
> R10: 0000000000000001 R11: 00000000000006fa R12: ffff88007bfa92b0
> R13: ffff88007bfa8000 R14: 0000000000000001 R15: 00000000000002cd
> FS:  00007f4de5d42760(0000) GS:ffff88007ff7c000(0000) knlGS:0000000000000000
> CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 000000007bb74000 CR4: 0000000000002660
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a9b020)
> Stack:
>  ffff88007d36bf00 ffff88007bfa8000 ffff88007d36bf00 ffff88007bfa85c0
>  ffff88007ff7fd50 00000017813f46c5 ffff88007d36bf00 ffff88007bfa85c0
>  ffff88007ff7fe10 ffff88007bfa8000 0000000000000001 ffff88007bfa85c0
> Call Trace:
>  <IRQ> 
>  [<ffffffff81372822>] xennet_poll+0xbef/0xc85
>  [<ffffffff815272aa>] ? _raw_spin_unlock_irqrestore+0x19/0x1c
>  [<ffffffff813f4d51>] net_rx_action+0xb6/0x1dc
>  [<ffffffff812ef6e7>] ? unmask_evtchn+0x1f/0xa3
>  [<ffffffff810431a4>] __do_softirq+0xc7/0x1a3
>  [<ffffffff81085ca9>] ? handle_fasteoi_irq+0xd2/0xe1
>  [<ffffffff810069b2>] ? check_events+0x12/0x20
>  [<ffffffff8100a85c>] call_softirq+0x1c/0x30
>  [<ffffffff8100bebd>] do_softirq+0x41/0x7e
>  [<ffffffff8104303b>] irq_exit+0x36/0x78
>  [<ffffffff812f022c>] xen_evtchn_do_upcall+0x2f/0x3c
>  [<ffffffff8100a8ae>] xen_do_hypervisor_callback+0x1e/0x30
>  <EOI> 
>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
>  [<ffffffff810063a3>] ? xen_safe_halt+0x10/0x1a
>  [<ffffffff81010998>] ? default_idle+0x4b/0x85
>  [<ffffffff81008d53>] ? cpu_idle+0x60/0x97
>  [<ffffffff8151b349>] ? rest_init+0x6d/0x6f
>  [<ffffffff81b2ad34>] ? start_kernel+0x37f/0x38a
>  [<ffffffff81b2a2cd>] ? x86_64_start_reservations+0xb8/0xbc
>  [<ffffffff81b2de71>] ? xen_start_kernel+0x528/0x52f
> Code: c8 00 00 00 41 ff c6 48 89 44 37 38 8b 82 c4 00 00 00 48 8b b2 c8 00 00 00 66 c7 04 06 01 00 49 8b 44 24 08 4c 89 22 48 89 42 08 <48> 89 10 49 89 54 24 08 ff 83 00 0d 00 00 44 3b 75 cc 0f 8c 5a 
> RIP  [<ffffffff81370b27>] xennet_alloc_rx_buffers+0xe1/0x2d9
>  RSP <ffff88007ff7fcf0>
> CR2: 0000000000000000
> ---[ end trace e0e245c8a8426fde ]---
> Kernel panic - not syncing: Fatal exception in interrupt
> Pid: 0, comm: swapper Tainted: G      D     2.6.38-x86_64-linode17 #1
> Call Trace:
>  <IRQ>  [<ffffffff8152550d>] ? panic+0x8c/0x195
>  [<ffffffff8152856b>] ? oops_end+0xb7/0xc7
>  [<ffffffff8102709f>] ? no_context+0x1f7/0x206
>  [<ffffffff810ad088>] ? get_page_from_freelist+0x445/0x715
>  [<ffffffff81027236>] ? __bad_area_nosemaphore+0x188/0x1ab
>  [<ffffffff8144f390>] ? tcp_v4_rcv+0x521/0x681
>  [<ffffffff81027267>] ? bad_area_nosemaphore+0xe/0x10
>  [<ffffffff8152a4e7>] ? do_page_fault+0x1ef/0x3ee
>  [<ffffffff8144f390>] ? tcp_v4_rcv+0x521/0x681
>  [<ffffffff810ad55c>] ? __alloc_pages_nodemask+0x14d/0x6ab
>  [<ffffffff813eb0bb>] ? __netdev_alloc_skb+0x1d/0x3a
>  [<ffffffff81527a55>] ? page_fault+0x25/0x30
>  [<ffffffff81370b27>] ? xennet_alloc_rx_buffers+0xe1/0x2d9
>  [<ffffffff81372822>] ? xennet_poll+0xbef/0xc85
>  [<ffffffff815272aa>] ? _raw_spin_unlock_irqrestore+0x19/0x1c
>  [<ffffffff813f4d51>] ? net_rx_action+0xb6/0x1dc
>  [<ffffffff812ef6e7>] ? unmask_evtchn+0x1f/0xa3
>  [<ffffffff810431a4>] ? __do_softirq+0xc7/0x1a3
>  [<ffffffff81085ca9>] ? handle_fasteoi_irq+0xd2/0xe1
>  [<ffffffff810069b2>] ? check_events+0x12/0x20
>  [<ffffffff8100a85c>] ? call_softirq+0x1c/0x30
>  [<ffffffff8100bebd>] ? do_softirq+0x41/0x7e
>  [<ffffffff8104303b>] ? irq_exit+0x36/0x78
>  [<ffffffff812f022c>] ? xen_evtchn_do_upcall+0x2f/0x3c
>  [<ffffffff8100a8ae>] ? xen_do_hypervisor_callback+0x1e/0x30
>  <EOI>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
>  [<ffffffff810013aa>] ? hypercall_page+0x3aa/0x1006
>  [<ffffffff810063a3>] ? xen_safe_halt+0x10/0x1a
>  [<ffffffff81010998>] ? default_idle+0x4b/0x85
>  [<ffffffff81008d53>] ? cpu_idle+0x60/0x97
>  [<ffffffff8151b349>] ? rest_init+0x6d/0x6f
>  [<ffffffff81b2ad34>] ? start_kernel+0x37f/0x38a
>  [<ffffffff81b2a2cd>] ? x86_64_start_reservations+0xb8/0xbc
>  [<ffffffff81b2de71>] ? xen_start_kernel+0x528/0x52f
> 
> --Peter
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
