xen-devel
RE: [Xen-devel] dom0 crashing on extreme I/O
> I have 3 VMs, two running webservers and the 3rd running
> netperf/iperf. This is a multi-cpu setup, with dom0 on CPU-0
> and all the remaining VMs on a separate CPU.
>
> Currently my dom0 has 528M of memory, while each VM has around 160M.
> Under high loads, the system crashes. I'm pasting a
> representative crash here:
>
> file=grant_table.c, line=729) gnttab_transfer: out-of-range
> or xen frame 2f016001
> (XEN) (file=grant_table.c, line=729) gnttab_transfer:
> out-of-range or xen frame 2f017001
Interesting. We've seen this very occasionally before, but this is the
first time on a 32-bit kernel.
The clue is that the errant frame numbers always end in 001, and are
actually valid frame numbers if you shift them right by 12 bits (>>12).
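
To make that observation concrete, here is a minimal Python sketch (not Xen
code) that splits each value from the log into its upper bits and its low
12 bits. PAGE_SHIFT = 12 assumes 4 KiB pages, and split_frame is a
hypothetical helper name used only for illustration. It only shows the bit
pattern; it does not establish where the stray low bit comes from.

```python
# Assumption: 4 KiB pages, so the low 12 bits are not part of a frame number.
PAGE_SHIFT = 12
LOW_MASK = (1 << PAGE_SHIFT) - 1

# Values copied from the "out-of-range or xen frame" messages in the report.
REPORTED = [0x2f016001, 0x2f017001, 0x18fca001,
            0x18fcb001, 0x2270c001, 0x2270d001]


def split_frame(value):
    """Return (value >> PAGE_SHIFT, low PAGE_SHIFT bits) for a logged value."""
    return value >> PAGE_SHIFT, value & LOW_MASK


for value in REPORTED:
    shifted, low = split_frame(value)
    print(f"{value:#010x} -> shifted {shifted:#x}, low bits {low:#05x}")
```

Every reported value has the same low bits, and the shifted values come in
consecutive pairs, which is what you would expect if something wider than a
frame number were being passed where a frame number is expected.
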
It would be very helpful if you could work on a minimal repro case,
ideally with only one domU.
Chris: any extra debugging that might be helpful?
Thanks,
Ian
> (XEN) (file=grant_table.c, line=729) gnttab_transfer:
> out-of-range or xen frame 18fca001
> (XEN) (file=grant_table.c, line=729) gnttab_transfer:
> out-of-range or xen frame 18fcb001
> (XEN) (file=grant_table.c, line=729) gnttab_transfer:
> out-of-range or xen frame 2270c001
> (XEN) (file=grant_table.c, line=729) gnttab_transfer:
> out-of-range or xen frame 2270d001
> ------------[ cut here ]------------
> kernel BUG at drivers/xen/netback/netback.c:335!
> invalid operand: 0000 [#1]
> Modules linked in: ipt_physdev iptable_filter ip_tables video
> thermal processor fan button battery ac md sworks_agp agpgart
> dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod mptscsih
> mptbase sd_mod scsi_mod
> CPU: 0
> EIP: 0061:[<c02c6782>] Not tainted VLI
> EFLAGS: 00010246 (2.6.12.6-xen0)
> EIP is at net_rx_action+0x4c2/0x4f0
> eax: 0000fff7 ebx: df26b620 ecx: 00000042 edx: c04b8920
> esi: dc073480 edi: 00000000 ebp: c04b3900 esp: c0a23d28
> ds: 007b es: 007b ss: 0069
> Process ksoftirqd/0 (pid: 2, threadinfo=c0a22000 task=c0a16510)
> Stack: c04b38e0 c0362d90 80000000 c0363b36 dbad7d80 db6fee80 df26b400 5cb34f36
>        00000088 00000000 0003d700 db6ff012 c04b8920 00000106 00a23e2c c05e5000
>        00000000 c0363a90 00000001 00000000 00000000 00000001 c0a16510 00000000
> Call Trace:
> [<c0362d90>] br_forward_finish+0x0/0x80
> [<c0363b36>] br_handle_frame_finish+0xa6/0x160
> [<c0363a90>] br_handle_frame_finish+0x0/0x160
> [<c01423a5>] kmem_getpages+0x65/0x90
> [<c013ece2>] __rmqueue+0xb2/0xf0
> [<c032302d>] nf_iterate+0x5d/0x90
> [<c0367aa0>] br_nf_pre_routing_finish+0x0/0x420
> [<c0367aa0>] br_nf_pre_routing_finish+0x0/0x420
> [<c032336e>] nf_hook_slow+0x6e/0x120
> [<c0367aa0>] br_nf_pre_routing_finish+0x0/0x420
> [<c0363a90>] br_handle_frame_finish+0x0/0x160
> [<c0368549>] br_nf_pre_routing+0x319/0x4a0
> [<c0367aa0>] br_nf_pre_routing_finish+0x0/0x420
> [<c032302d>] nf_iterate+0x5d/0x90
> [<c0363a90>] br_handle_frame_finish+0x0/0x160
> [<c0363a90>] br_handle_frame_finish+0x0/0x160
> [<c032336e>] nf_hook_slow+0x6e/0x120
> [<c0363a90>] br_handle_frame_finish+0x0/0x160
> [<c0363db3>] br_handle_frame+0x1c3/0x260
> [<c0363a90>] br_handle_frame_finish+0x0/0x160
> [<c03188d3>] netif_receive_skb+0x113/0x230
> [<c02820bf>] tg3_rx+0x2cf/0x490
> [<c027e246>] tg3_restart_ints+0x26/0xa0
> [<c02823a6>] tg3_poll+0x126/0x1a0
> [<c0121660>] ksoftirqd+0x0/0xa0
> [<c0121660>] ksoftirqd+0x0/0xa0
> [<c01214ff>] tasklet_action+0x5f/0xa0
> [<c0121152>] __do_softirq+0x52/0xc0
> [<c0121207>] do_softirq+0x47/0x60
> [<c01216b9>] ksoftirqd+0x59/0xa0
> [<c013079d>] kthread+0xad/0xf0
> [<c01306f0>] kthread+0x0/0xf0
> [<c0106855>] kernel_thread_helper+0x5/0x10
> Code: 0f 0b 44 01 38 19 3a c0 90 e9 5a fe ff ff b8 74 64 40
> c0 e8 31 ac e5 ff eb 8f c7 04 24 9c 10 3b c0 e8 b3 60 e5 ff
> 8d 76 00 eb 92 <0f> 0b 4f 01 38 19 3a c0 e9 4c fe ff ff 0f 0b
> 2a 01 38 19 3a c0
> <0>Kernel panic - not syncing: Fatal exception in interrupt
> (XEN) Domain 0 shutdown: rebooting machine.
>
> NOTE: the line number in netback.c (335) might not be very
> useful for reference. I have some additional instrumentation
> in netback, so the line number might not match the files in
> xen-unstable.hg
>
> Will increasing dom0 memory further help? Or increasing the
> size of the rings?
> --
> Web/Blog/Gallery: http://floatingsun.net
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel