xen-users

Re: [Xen-users] xen randomly crashes all VMs hosted on iSCSI NAS array

Subject: Re: [Xen-users] xen randomly crashes all VMs hosted on iSCSI NAS array
From: VPS Lime <vpslime@xxxxxxxxx>
Date: Mon, 18 Oct 2010 11:15:53 -0400
Cc: "xen-users@xxxxxxxxxxxxxxxxxxx" <xen-users@xxxxxxxxxxxxxxxxxxx>

Good suggestion on dmesg.  The "Memory squeeze in netback driver" message looks like a likely culprit.  There is a bug report (http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=762) on this issue dating back several years; it contains some suggestions, along with responses saying they did not work.  Has anyone found a reliable fix for this on CentOS 5.5?

xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
device xen3.128 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen3.128: link is not ready
printk: 60 messages suppressed.
xen_net: Memory squeeze in netback driver.
blkback: ring-ref 8, event-channel 15, protocol 1 (x86_64-abi)
printk: 11 messages suppressed.
xen_net: Memory squeeze in netback driver.
ADDRCONF(NETDEV_CHANGE): xen3.120: link becomes ready
xenbr1: topology change detected, propagating
xenbr1: port 36(xen3.120) entering forwarding state
ADDRCONF(NETDEV_CHANGE): xen3.123: link becomes ready
xenbr1: topology change detected, propagating
xenbr1: port 41(xen3.123) entering forwarding state
device tap2 entered promiscuous mode
xenbr1: topology change detected, propagating
xenbr1: port 43(tap2) entering forwarding state
device xen1-112 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen1-112: link is not ready
tap2: no IPv6 routers present
device tap5 entered promiscuous mode
xenbr1: topology change detected, propagating
xenbr1: port 45(tap5) entering forwarding state
device xen3.109 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen3.109: link is not ready
tap5: no IPv6 routers present
printk: 8 messages suppressed.
xen_net: Memory squeeze in netback driver.
xen_net: Memory squeeze in netback driver.
xenbr1: port 46(xen3.109) entering disabled state
device xen3.109 left promiscuous mode
xenbr1: port 46(xen3.109) entering disabled state
xenbr1: port 45(tap5) entering disabled state
device tap5 left promiscuous mode
xenbr1: port 45(tap5) entering disabled state
device xen3.129 entered promiscuous mode
ADDRCONF(NETDEV_UP): xen3.129: link is not ready
blkback: ring-ref 8, event-channel 15, protocol 1 (x86_64-abi)
ADDRCONF(NETDEV_CHANGE): xen3.129: link becomes ready
xenbr1: topology change detected, propagating
xenbr1: port 45(xen3.129) entering forwarding state
nfs: server 10.1.1.45 not responding, still trying
nfs: server 10.1.1.45 not responding, still trying
nfs: server 10.1.1.45 OK
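
For anyone wanting to quantify how often these messages appear, here is a minimal Python sketch that tallies them from saved dmesg output. The three patterns are copied from the log lines above; the function name and category labels are made up for this illustration, and this is only a counting aid, not a diagnosis.

```python
import re
from collections import Counter

# Patterns taken from the messages in the log above. The category names
# are labels invented for this sketch.
PATTERNS = {
    "netback_squeeze": re.compile(r"xen_net: Memory squeeze in netback driver"),
    "printk_suppressed": re.compile(r"printk: (\d+) messages suppressed"),
    "nfs_not_responding": re.compile(r"nfs: server \S+ not responding"),
}


def summarize_dmesg(text):
    """Return (lines matched per category, total messages printk suppressed)."""
    counts = Counter()
    suppressed_total = 0
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match is None:
                continue
            counts[name] += 1
            if name == "printk_suppressed":
                # "printk: N messages suppressed" hides N further messages.
                suppressed_total += int(match.group(1))
    return dict(counts), suppressed_total
```

One way to use it is to save the output of dmesg to a file, read that file into a string, and pass the string to summarize_dmesg. The suppressed total matters because the visible "Memory squeeze" lines undercount the real number of events whenever printk rate limiting kicks in.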






On Mon, Oct 18, 2010 at 8:44 AM, Eric van Blokland <Eric@xxxxxxxxxxxx> wrote:

I’ve seen this happen in the past, when iSCSI disks became inaccessible. It hasn’t occurred for quite a while, though, even though I know I made these disks inaccessible quite a few times. However, your system appears to be up to date.

 

If it is caused by disks becoming inaccessible, you should see something about it in dmesg, such as “connection …. timeout”.
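
A rough way to run that check is sketched below in Python. The regular expression only mirrors the "connection ... timeout" wording above and is an assumption about what the initiator logs; the exact message text depends on the iSCSI initiator version, so adjust the pattern to whatever your dmesg actually shows. The function name is hypothetical.

```python
import re

# Assumption: a line mentioning "connection" followed later by "timeout"
# indicates an iSCSI session problem. This mirrors the wording in the
# message above, not a documented log format.
CONNECTION_TIMEOUT = re.compile(r"connection.*timeout", re.IGNORECASE)


def find_connection_timeouts(dmesg_text):
    """Return the dmesg lines that look like connection timeout messages."""
    return [
        line
        for line in dmesg_text.splitlines()
        if CONNECTION_TIMEOUT.search(line)
    ]
```

An empty result does not prove the disks stayed reachable, since the kernel ring buffer may have wrapped by the time you look.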

 

From: xen-users-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-users-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of VPS Lime
Sent: Monday, 18 October 2010 16:32
To: xen-users@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-users] xen randomly crashes all VMs hosted on iSCSI NAS array

 

I inherited a Xen server that is set up with all the VM images hosted on an iSCSI-mounted NAS array.  We have been experiencing a random issue (about every 2-3 days) where Xen crashes all the VMs, leaving nothing but Domain0 running.  What appears to be happening is that something causes the iSCSI mount to hiccup.  Running "vgchange -a y" and restarting all the VMs brings everything back up.  Nothing appears to be wrong with the NAS array: there are a dozen other servers attached to it that never have a problem.  The xend log does not have anything useful in it, and I'm at a loss to figure out what is causing this.  The only suggestion I've heard is that memory usage may be too high, making the box unstable.  If anyone has any suggestions, or can point me to additional logs I should be looking at, I'd really appreciate it.

 

Host OS: CentOS 5.5
Xen kernel: xen.gz-2.6.18-194.11.4.el5
iSCSI libraries: iscsi-initiator-utils-6.2.0.871-0.16.el5
Memory on server: 32G
Total memory allocated for VMs running paravirt: 19,384 M
Total memory allocated for VMs running HVM: 2,688 M
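
To put the memory figures side by side, here is a small Python sketch of the arithmetic. It assumes "32G" means 32 GiB and "M" means MiB, and it ignores whatever Domain0 and the hypervisor themselves use, so the remainder is an upper bound rather than truly free memory. The function name is made up for this example.

```python
MIB_PER_GIB = 1024


def memory_budget(server_gib, paravirt_mib, hvm_mib):
    """Return (MiB allocated to guests, MiB left over on the server)."""
    guest_total = paravirt_mib + hvm_mib
    remaining = server_gib * MIB_PER_GIB - guest_total
    return guest_total, remaining


# Figures from the list above.
guest_total, remaining = memory_budget(32, 19384, 2688)
print(guest_total, remaining)  # 22072 10696
```

Under those assumptions the guests account for 22,072 MiB of 32,768 MiB, leaving at most 10,696 MiB for Domain0 and the hypervisor.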

 

Results of xm top:

xentop - 10:11:06   Xen 3.1.2-194.11.4.el5
39 domains: 1 running, 38 blocked, 0 paused, 0 crashed, 0 dying, 0 shutdown
Mem: 25165116k total, 25150528k used, 14588k free    CPUs: 8 @ 1995MHz
      NAME  STATE   CPU(sec) CPU(%)     MEM(k) MEM(%)  MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS   VBD_OO   VBD_RD   VBD_WR SSID
  Domain-0 -----r         1583   17.1    3220540   12.8   no limit       n/a     8   32     1932    32747    0        0        0        0    0
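
The "Mem:" line above is the quickest indicator of how little headroom the hypervisor has. A minimal Python sketch for pulling the numbers out of that line follows; the regular expression is written against the exact format shown in this output and may need adjusting for other xentop versions, and the function name is hypothetical.

```python
import re

# Matches the summary line as it appears in the xentop output above.
MEM_LINE = re.compile(r"Mem:\s*(\d+)k total,\s*(\d+)k used,\s*(\d+)k free")


def parse_mem_line(line):
    """Parse the xentop 'Mem:' line into kilobyte counts, or None if no match."""
    match = MEM_LINE.search(line)
    if match is None:
        return None
    total, used, free = (int(group) for group in match.groups())
    return {
        "total_k": total,
        "used_k": used,
        "free_k": free,
        "free_fraction": free / total,
    }
```

For the line above this reports 14588k free out of 25165116k, well under one percent of the memory Xen sees.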

 

 

