Pardon me for replying to my own post; it was only after hitting "send" on
the first one that I took a closer look and found a much more interesting
(and worrisome) aspect of the issue I've been seeing: another DomU
crashed with the same error at the same time. This one was testing an
experimental Linux kernel patch (affecting procfs handling of symlinks)
and was particularly unstable for that reason. It's interesting, though,
that the other DomU (without any such patch applied) appears to have
been affected by the same issue as well.
This appears to be reproducible.
From the DomU initiating the issue [running an experimental kernel
patch and exercising a bug in that patch]:
Bad rx buffer (memory squeeze?).
Bad rx buffer (memory squeeze?).
Unable to handle kernel paging request at ffff880000b3c700 RIP:
<ffffffff8024026a>{netif_poll+1354}
PGD c55067 PUD c56067 PMD c5c067 PTE 0
Oops: 0002 [1]
CPU 0
Modules linked in: ext3 jbd unionfs
Pid: 0, comm: swapper Tainted: GF 2.6.12.6-xenU
RIP: e030:[<ffffffff8024026a>] <ffffffff8024026a>{netif_poll+1354}
RSP: e02b:ffffffff803bbd98 EFLAGS: 00010212
RAX: ffff880000b3c700 RBX: ffff880000b97900 RCX: ffff880000b3c064
RDX: ffff880000b3c700 RSI: 0000000000000002 RDI: ffff880000b97900
RBP: ffff880000b97900 R08: 0000000000000000 R09: 0000000000000022
R10: 000000000003f998 R11: 0000000000000212 R12: ffff88003faba360
R13: ffff88003f41a138 R14: 0000000000000080 R15: 0000000000000000
FS: 00002aaaab2890a0(0000) GS:ffffffff803a7900(0000) knlGS:ffffffff80440600
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff803ba000, task ffffffff80307380)
Stack: 0000000100000000 0000000100000040 0000000000000001 0000002800000028
ffffffff803bbe2c ffff88003faba000 ffffffff803bbdc8 ffffffff803bbdc8
0000000000000000 ffff88003f879c60
Call Trace:<ffffffff80255859>{net_rx_action+169}
<ffffffff8013380b>{__do_softirq+107}
<ffffffff801338ad>{do_softirq+61} <ffffffff80114e69>{do_IRQ+57}
<ffffffff8010d948>{evtchn_do_upcall+136}
<ffffffff80111fb9>{do_hypervisor_callback+17}
<ffffffff8010f9f3>{xen_idle+83} <ffffffff8010f9f3>{xen_idle+83}
<ffffffff8010fa2f>{cpu_idle+31} <ffffffff803bc6ea>{start_kernel+490}
<ffffffff803bc169>{_sinittext+361}
Code: c7 00 01 00 00 00 48 8b 83 10 01 00 00 c7 40 04 00 00 00 00
RIP <ffffffff8024026a>{netif_poll+1354} RSP <ffffffff803bbd98>
CR2: ffff880000b3c700
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
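As a reading aid for the "Oops: 0002" line in the trace above, here is a minimal sketch that decodes the low three bits of an x86 page-fault error code. The function name and table are illustrative only and come from neither the kernel nor Xen; the higher bits are deliberately left out of the sketch.

```python
# Decode the low three bits of the x86 page-fault error code that the
# kernel prints after "Oops:". Helper names here are illustrative only.
PF_BITS = [
    # (mask, meaning when set, meaning when clear)
    (0x1, "protection violation", "page not present"),
    (0x2, "write access", "read access"),
    (0x4, "user mode", "kernel mode"),
]

def decode_oops_code(code_hex):
    """Return a human-readable description of each low bit of the code."""
    code = int(code_hex, 16)
    return [if_set if code & mask else if_clear
            for mask, if_set, if_clear in PF_BITS]

print(decode_oops_code("0002"))
# prints ['page not present', 'write access', 'kernel mode']
```

Read this way, the code describes a kernel-mode write to an unmapped page, which is consistent with the "PTE 0" entry in the page-table dump.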
From the DomU being impacted by the issue [running no unusual patches
or modules, and being stable *except* when the initiating DomU is running]:
Unable to handle kernel paging request at ffff88003df8d700 RIP:
<ffffffff8024026a>{netif_poll+1354}
PGD 5e3067 PUD 5e4067 PMD 7d4067 PTE 0
Oops: 0002 [1]
CPU 0
Modules linked in: ipv6
Pid: 0, comm: swapper Tainted: GF 2.6.12.6-xenU
RIP: e030:[<ffffffff8024026a>] <ffffffff8024026a>{netif_poll+1354}
RSP: e02b:ffffffff803bbd98 EFLAGS: 00010212
RAX: ffff88003df8d700 RBX: ffff88003d99cbc0 RCX: ffff88003df8d05e
RDX: ffff88003df8d700 RSI: 0000000000000002 RDI: ffff88003d99cbc0
RBP: ffff88003d99cbc0 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff80381c20 R11: 0000000000000212 R12: ffff88003fc2e360
R13: ffff8800004dd248 R14: 0000000000000080 R15: 0000000000000000
FS: 00002aaaaade3b00(0000) GS:ffffffff803a7900(0000) knlGS:ffffffff803a7900
CS: e033 DS: 0000 ES: 0000
Process swapper (pid: 0, threadinfo ffffffff803ba000, task ffffffff80307380)
Stack: 0000000100000000 0000000100000040 0000000000000001 0000174a0000174a
ffffffff803bbe2c ffff88003fc2e000 ffffffff803bbdc8 ffffffff803bbdc8
0000000000000000 ffff8800000cae60
Call Trace:<ffffffff80255859>{net_rx_action+169}
<ffffffff8013380b>{__do_softirq+107}
<ffffffff801338ad>{do_softirq+61} <ffffffff80114e69>{do_IRQ+57}
<ffffffff8010d948>{evtchn_do_upcall+136}
<ffffffff80111fb9>{do_hypervisor_callback+17}
<ffffffff8010f9f3>{xen_idle+83} <ffffffff8010f9f3>{xen_idle+83}
<ffffffff8010fa2f>{cpu_idle+31} <ffffffff803bc6ea>{start_kernel+490}
<ffffffff803bc169>{_sinittext+361}
Code: c7 00 01 00 00 00 48 8b 83 10 01 00 00 c7 40 04 00 00 00 00
RIP <ffffffff8024026a>{netif_poll+1354} RSP <ffffffff803bbd98>
CR2: ffff88003df8d700
<0>Kernel panic - not syncing: Aiee, killing interrupt handler!
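To make the comparison between the two traces concrete, here is a small sketch that pulls the faulting instruction and the CR2 address out of each oops. The excerpts are copied from the logs above; the helper name and regular expressions are assumptions for illustration and are not a general oops parser.

```python
import re

# Excerpts copied from the two traces above (initiating and impacted DomU).
INITIATING = """Unable to handle kernel paging request at ffff880000b3c700 RIP:
<ffffffff8024026a>{netif_poll+1354}
CR2: ffff880000b3c700"""

IMPACTED = """Unable to handle kernel paging request at ffff88003df8d700 RIP:
<ffffffff8024026a>{netif_poll+1354}
CR2: ffff88003df8d700"""

def fault_summary(oops_text):
    """Extract the first <address>{symbol+offset} pair and the CR2 value."""
    rip = re.search(r"<([0-9a-f]{16})>\{(\w+)\+(\d+)\}", oops_text)
    cr2 = re.search(r"CR2: ([0-9a-f]{16})", oops_text)
    return {
        "rip": rip.group(1),
        "symbol": rip.group(2),
        "offset": rip.group(3),  # kept as printed, not reinterpreted
        "cr2": cr2.group(1),
    }

a = fault_summary(INITIATING)
b = fault_summary(IMPACTED)
print("same faulting instruction:", a["rip"] == b["rip"], a["symbol"], a["offset"])
print("same faulting address:", a["cr2"] == b["cr2"])
```

Both DomUs fault at the same instruction in netif_poll but on different addresses, which is what the two logs show when read side by side.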
The "experimental kernel patch" in question is a unionfs patch found at
http://permalink.gmane.org/gmane.comp.file-systems.unionfs.general/638,
applied to UnionFS 1.1.1 (a different release from the one it was
initially developed against, though the patch applies cleanly). I can
reproduce the bug repeatedly by playing with ifup on a system
running said patch with a root filesystem on a unionfs mount. If anyone
is interested in reproducing it and is unable to do so from the
information I've provided so far, let me know and I'd be glad to try to
offer additional details.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel