WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 

RE: [Xen-devel] network hang again

To: "Brian Wolfe" <ahzz@xxxxxxxxxxx>
Subject: RE: [Xen-devel] network hang again
From: "James Harper" <JamesH@xxxxxxxxxxxxxxxx>
Date: Wed, 15 Sep 2004 12:50:26 +1000
Cc: <xen-devel@xxxxxxxxxxxxxxxxxxxxx>
Delivery-date: Wed, 15 Sep 2004 03:53:05 +0100
Envelope-to: steven.hand@xxxxxxxxxxxx
List-archive: <http://sourceforge.net/mailarchive/forum.php?forum=xen-devel>
List-help: <mailto:xen-devel-request@lists.sourceforge.net?subject=help>
List-id: List for Xen developers <xen-devel.lists.sourceforge.net>
List-post: <mailto:xen-devel@lists.sourceforge.net>
List-subscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=subscribe>
List-unsubscribe: <https://lists.sourceforge.net/lists/listinfo/xen-devel>, <mailto:xen-devel-request@lists.sourceforge.net?subject=unsubscribe>
Sender: xen-devel-admin@xxxxxxxxxxxxxxxxxxxxx
Thread-index: AcSadvf6Ymk9MWmeRgS9m+VK+dUpUgAV5MWg
Thread-topic: [Xen-devel] network hang again

When I explained the patch on the IET list, I was asked whether I was
getting frequent disconnections :)

It sounds like the network issues I'm seeing in Xen are probably
triggering the crash in iSCSI.

I'm running IET 0.3.3 + 2.6 patch + my additional 2.6 patch on dom0, and
linux-iscsi 4.0.1.8 on dom1.

James

> -----Original Message-----
> From: Brian Wolfe [mailto:ahzz@xxxxxxxxxxx]
> Sent: Wednesday, 15 September 2004 02:22
> To: James Harper
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] network hang again
> 
> I have been running IET 0.3.3 on 2.4.27 on one machine, and Cisco's
> linux-iscsi on 2.6.8.1 on a second physical machine for a couple of days
> now. So far the only thing that I have run into is a dump message
> concerning OOM on the linux-iscsi machine.
> 
> 
> Sep 13 00:20:11 vhost1 kernel: iSCSI: 4.0.1 ( 9-Feb-2004) built for Linux 2.6.8-tbc-vhost-Xen0
> Sep 13 00:20:11 vhost1 kernel: iSCSI: will translate deferred sense to current sense on disk command responses
> Sep 13 00:20:11 vhost1 kernel: iSCSI: control device major number 254
> Sep 13 00:20:11 vhost1 kernel: scsi_proc_hostdir_add: proc_mkdir failed for <NULL>
> Sep 13 00:20:11 vhost1 kernel: scsi17 : Cisco iSCSI driver
> Sep 13 00:20:11 vhost1 kernel: iSCSI:detected HBA host #17
> Sep 13 00:20:11 vhost1 kernel: iSCSI: bus 0 target 0 = iqn.2001-04.dmz.iscsi1:wnhttp
> Sep 13 00:20:11 vhost1 kernel: iSCSI: bus 0 target 0 portal 0 = address 10.11.7.1 port 3260 group 1
> Sep 13 00:20:11 vhost1 kernel: iSCSI: starting timer thread at 21835751
> Sep 13 00:20:11 vhost1 kernel: iSCSI: bus 0 target 0 trying to establish session to portal 0, address 10.11.7.1 port 3260 group 1
> Sep 13 00:20:12 vhost1 kernel: iSCSI: session c1478000 authenticated by target iqn.2001-04.dmz.iscsi1:wnhttp
> Sep 13 00:20:12 vhost1 kernel: iSCSI: bus 0 target 0 established session #1, portal 0, address 10.11.7.1 port 3260 group 1
> Sep 13 00:20:12 vhost1 kernel:   Vendor: LINUX     Model: ISCSI             Rev: 0
> Sep 13 00:20:12 vhost1 kernel:   Type: Direct-Access                      ANSI SCSI revision: 03
> Sep 13 00:20:12 vhost1 kernel: SCSI device sda: 16777212 512-byte hdwr sectors (8590 MB)
> Sep 13 00:20:12 vhost1 kernel: SCSI device sda: drive cache: write back
> Sep 13 00:20:12 vhost1 kernel:  sda: unknown partition table
> Sep 13 00:20:12 vhost1 kernel: Attached scsi disk sda at scsi17, channel 0, id 0, lun 0
> Sep 13 00:20:12 vhost1 kernel:   Vendor: LINUX     Model: ISCSI             Rev: 0
> Sep 13 00:20:12 vhost1 kernel:   Type: Direct-Access                      ANSI SCSI revision: 03
> Sep 13 00:20:12 vhost1 kernel: SCSI device sdb: 65536 512-byte hdwr sectors (34 MB)
> Sep 13 00:20:12 vhost1 kernel: SCSI device sdb: drive cache: write back
> Sep 13 00:20:12 vhost1 kernel:  sdb: unknown partition table
> Sep 13 00:20:12 vhost1 kernel: Attached scsi disk sdb at scsi17, channel 0, id 0, lun 1
> Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: found reiserfs format "3.6" with standard journal
> Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: using ordered data mode
> Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: journal params: device sda, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
> Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: checking transaction log (sda)
> Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: replayed 1 transactions in 0 seconds
> Sep 13 00:21:55 vhost1 kernel: ReiserFS: sda: Using r5 hash to sort names
> Sep 13 00:28:51 vhost1 kernel: iscsi-tx: page allocation failure. order:1, mode:0x20
> Sep 13 00:28:51 vhost1 kernel:  [__alloc_pages+728/848] __alloc_pages+0x2d8/0x350
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [__get_free_pages+31/64] __get_free_pages+0x1f/0x40
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [kmem_getpages+30/224] kmem_getpages+0x1e/0xe0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [cache_grow+159/336] cache_grow+0x9f/0x150
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [cache_alloc_refill+318/512] cache_alloc_refill+0x13e/0x200
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [__kmalloc+139/160] __kmalloc+0x8b/0xa0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [alloc_skb+71/224] alloc_skb+0x47/0xe0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+38296326/1002676224] rhine_rx+0x156/0x460 [via_rhine]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+38295340/1002676224] rhine_interrupt+0x1ac/0x1d0 [via_rhine]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [handle_IRQ_event+73/144] handle_IRQ_event+0x49/0x90
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [do_IRQ+109/240] do_IRQ+0x6d/0xf0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [evtchn_do_upcall+156/256] evtchn_do_upcall+0x9c/0x100
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [hypervisor_callback+51/73] hypervisor_callback+0x33/0x49
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [csum_partial_copy_generic+63/248] csum_partial_copy_generic+0x3f/0xf8
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [tcp_sendmsg+578/4176] tcp_sendmsg+0x242/0x1050
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [inet_sendmsg+77/96] inet_sendmsg+0x4d/0x60
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [sock_sendmsg+165/192] sock_sendmsg+0xa5/0xc0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [__do_softirq+149/160] __do_softirq+0x95/0xa0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [do_softirq+69/80] do_softirq+0x45/0x50
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [do_IRQ+194/240] do_IRQ+0xc2/0xf0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+39270168/1002676224] iscsi_xmit_queued_cmnds+0x188/0x3c0 [iscsi]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+39254271/1002676224] iscsi_sendmsg+0x4f/0x70 [iscsi]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+39271874/1002676224] iscsi_xmit_data+0x472/0x8d0 [iscsi]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [__do_softirq+149/160] __do_softirq+0x95/0xa0
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+39273273/1002676224] iscsi_xmit_r2t_data+0x119/0x1f0 [iscsi]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+39165617/1002676224] iscsi_tx_thread+0x711/0x8d0 [iscsi]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [autoremove_wake_function+0/96] autoremove_wake_function+0x0/0x60
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [default_wake_function+0/32] default_wake_function+0x0/0x20
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [pg0+39163808/1002676224] iscsi_tx_thread+0x0/0x8d0 [iscsi]
> Sep 13 00:28:51 vhost1 kernel:
> Sep 13 00:28:51 vhost1 kernel:  [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
> Sep 13 00:28:51 vhost1 kernel:
> 
> The only reason I'm posting the "trace" from linux-iscsi is that it
> contains the hypervisor_callback function and it's in the rx phase
> of the via_rhine driver.
> 
> What iSCSI are you running on each machine? (Sorry if I missed it; I've
> been offline for a few days now. 8-( ) I'd be interested to know if this
> is in any way similar to your issue.
> 
> Brian
> 
> 
> On Tue, 2004-09-14 at 07:38, James Harper wrote:
> > I'm now seeing this network hang a lot, to the point where it makes my
> > iSCSI testing unusable. I believe this has more to do with the sort of
> > testing I'm doing now than with a bug that has suddenly appeared.
> >
> > My setup is this:
> > Dom0:
> > 2.6.8.1
> > Iscsitarget 0.3.3 + 2.6 patches + my own 2.6 patches.
> > No conntrack or other netfilter related modules
> > Bridged eth0 to Dom1
> > /usr/src exported via nfs
> >
> > Dom1:
> > 2.6.8.1
> > Linux-iscsi 4.0.1.8
> > No conntrack or other netfilter related modules
> > /usr/src mounted from Dom0
> >
> > iSCSI works for a while, normally crashing in Dom0 due to another
> > non-Xen-related bug before it hits this bug, but if I try to do a
> > compile on Dom1 in the NFS-mounted /usr/src, the network locks up almost
> > instantly, then clears up shortly afterwards if I kill the compile.
> >
> > The logs show absolutely nothing of any use.
> >
> > I've just tried a few netperf tests. A quick hammering goes off without
> > a hitch, but afterwards I see random dropped packets. I'll keep testing.
> >
> > James
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.sourceforge.net/lists/listinfo/xen-devel
> 
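
[Archive note] For readers trying to compare their own logs with the trace
quoted above, here is a minimal sketch of how the "page allocation failure"
line and the stack frames could be pulled apart. It is not something either
poster used. The names PAGE_SIZE, parse_failure and frames are hypothetical,
the 4 KiB page size is an assumption (32-bit x86), and the sample is a
shortened copy of the quoted trace. "order:1" means the allocator was asked
for 2^1 contiguous pages; mode 0x20 appears to correspond to an atomic
(non-sleeping) allocation in 2.6-era kernels, but check the gfp.h of the
kernel in question before relying on that.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on 32-bit x86

# "iscsi-tx: page allocation failure. order:1, mode:0x20"
FAILURE_RE = re.compile(
    r"(?P<task>\S+): page allocation failure\.\s*"
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)

# "rhine_rx+0x156/0x460 [via_rhine]"; the trailing module name is optional
FRAME_RE = re.compile(
    r"(?P<sym>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:[ \t]+\[(?P<module>\w+)\])?"
)


def parse_failure(text):
    """Return task, order, mode and request size, or None if not found."""
    m = FAILURE_RE.search(text)
    if m is None:
        return None
    order = int(m.group("order"))
    return {
        "task": m.group("task"),
        "order": order,
        "mode": int(m.group("mode"), 16),
        "pages": 1 << order,             # 2**order contiguous pages
        "bytes": PAGE_SIZE << order,     # size of the failed request
    }


def frames(text):
    """Return (symbol, module) pairs in the order they appear in the trace."""
    return [(m.group("sym"), m.group("module")) for m in FRAME_RE.finditer(text)]


# Shortened copy of the trace quoted in this message.
SAMPLE = """\
iscsi-tx: page allocation failure. order:1, mode:0x20
 [__alloc_pages+728/848] __alloc_pages+0x2d8/0x350
 [alloc_skb+71/224] alloc_skb+0x47/0xe0
 [pg0+38296326/1002676224] rhine_rx+0x156/0x460 [via_rhine]
 [hypervisor_callback+51/73] hypervisor_callback+0x33/0x49
 [tcp_sendmsg+578/4176] tcp_sendmsg+0x242/0x1050
 [pg0+39165617/1002676224] iscsi_tx_thread+0x711/0x8d0 [iscsi]
"""

if __name__ == "__main__":
    print(parse_failure(SAMPLE))
    for sym, module in frames(SAMPLE):
        print(sym, "(%s)" % module if module else "")
```

Read this way, the quoted trace shows the via_rhine receive path (rhine_rx
calling alloc_skb) running from an interrupt delivered through
hypervisor_callback while the iscsi-tx thread was inside tcp_sendmsg, which
is the overlap Brian points out.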


