[Xen-users] Xen intermittently fails to release HVM domU VBDs, preventing Heartbeat node fail-over

To: <xen-users@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-users] Xen intermittently fails to release HVM domU VBDs, preventing Heartbeat node fail-over
From: "Steigerwald Erich" <steigerwald@xxxxxxxxxxxxxxxxxx>
Date: Mon, 30 Jun 2008 15:05:12 +0200
Hi,

Intermittently, upon domU shutdown, Xen appears to fail to release the
domU's VBD handles. Consequently, the LVs in the VG remain open and the
VG cannot be deactivated. This effectively prevents manual failover.

We suspect a bug in Xen or qemu-dm (the device model).
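
For diagnosis, here is a quick check one can run from dom0 to see which
LVs are still held open after the domU has shut down. This is a minimal
sketch (Python, like xend itself); the VG name is only a guess based on
our Heartbeat resource name, and the test relies on LVM2's lv_attr
field, whose sixth character is the "device open" flag:

#!/usr/bin/env python
# Diagnostic sketch: list LVs in a VG that dom0 still holds open.
# Assumptions: the VG name is a guess based on our Heartbeat resource
# name; the sixth character of LVM2's lv_attr is the "device open" flag.
import subprocess

VG = "xendomains01"  # placeholder VG name

def open_lvs(vg):
    out = subprocess.Popen(
        ["lvs", "--noheadings", "-o", "lv_name,lv_attr", vg],
        stdout=subprocess.PIPE).communicate()[0]
    stuck = []
    for line in out.splitlines():
        fields = line.split()
        if len(fields) == 2 and len(fields[1]) >= 6 and fields[1][5] == "o":
            stuck.append(fields[0])
    return stuck

if __name__ == "__main__":
    for lv in open_lvs(VG):
        print "still open: %s/%s" % (VG, lv)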

Regards,
Erich

System environment information:
- SLES 10 SP2 (x86_64)
- Kernel 2.6.16.60-0.21-xen
- Xen 3.2.0_16718_14-0.4
- LVM 2.02.17-7.19
- Heartbeat 2.1.3-0.9

Configuration details:
- Xen HVM domU
- Xen VBD backed by LVM LV in dom0
- Xen resources and LVM VG managed by Heartbeat
- On node failover, Heartbeat stops the domU, deactivates the LVM VG,
activates the VG on the peer, and starts the domU on the peer (the
stop side is sketched below)
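
To make the stop side concrete, here is a rough sketch of what it
amounts to, with a poll added before VG deactivation as a possible
workaround. Note the assumptions: the domain and VG names are
placeholders inferred from our logs, the retry bounds are arbitrary,
and in reality Heartbeat drives these steps through its Xen and LVM
resource agents rather than a script like this:

#!/usr/bin/env python
# Sketch of the failover stop side, with an added poll before VG
# deactivation as a possible workaround for the stuck-VBD case.
# Assumptions: domain and VG names are placeholders; retry bounds
# are arbitrary; Heartbeat normally drives this via resource agents.
import subprocess
import time

DOMU = "xen-ad"        # placeholder: our HVM domU
VG = "xendomains01"    # placeholder: VG backing its VBDs

def run(*argv):
    return subprocess.call(list(argv))

# Shut the domU down and wait for it to finish (-w blocks until done).
run("xm", "shutdown", "-w", DOMU)

# Xen sometimes keeps the backing LVs open for a moment after shutdown
# (or, in the bad case we are reporting, indefinitely), so retry the
# deactivation instead of failing on the first attempt.
for attempt in range(30):
    if run("vgchange", "-a", "n", VG) == 0:
        break
    time.sleep(2)
else:
    raise SystemExit("VG %s still has open LVs; manual cleanup needed" % VG)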

Relevant error messages concurrent with the issue:

Xend log (note the error during xc.domain_destroy()):
[2008-06-20 13:55:32 16615] DEBUG (XendDomainInfo:1965) XendDomainInfo.destroyDomain(21)
[2008-06-20 13:55:32 16615] DEBUG (XendDomainInfo:1965) XendDomainInfo.destroyDomain(24)
[2008-06-20 13:55:32 16615] DEBUG (XendDomainInfo:1588) Removing vif/0
[2008-06-20 13:55:32 16615] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vif, device = vif/0
[2008-06-20 13:55:32 16615] DEBUG (XendDomainInfo:1588) Removing vbd/51712
[2008-06-20 13:55:32 16615] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51712
[2008-06-20 13:55:33 16615] ERROR (XendDomainInfo:1977) XendDomainInfo.destroy: xc.domain_destroy failed.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1972, in destroyDomain
    xc.domain_destroy(self.domid)
Error: (3, 'No such process')
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:1588) Removing vbd/51728
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vbd, device = vbd/51728
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:1575) Destroying device model
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:1588) Removing vkbd/0
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vkbd, device = vkbd/0
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:1588) Removing vfb/0
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:590) XendDomainInfo.destroyDevice: deviceClass = vfb, device = vfb/0
[2008-06-20 13:55:33 16615] INFO (XendDomainInfo:1295) Domain has shutdown: name=hostemplate id=23 reason=poweroff.
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:1582) Releasing devices
[2008-06-20 13:55:33 16615] DEBUG (XendDomainInfo:1588) Removing console/0

Xend debug log:
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/Hald.py", line 55, in shutdown
    os.kill(self.pid, signal.SIGINT)
OSError: [Errno 3] No such process
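
The traceback suggests that Hald.py's shutdown() lets the "No such
process" error propagate when the helper process has already exited.
We have not inspected the code beyond what the traceback shows, but the
defensive pattern one would expect at that spot looks roughly like this
(an illustration, not the actual xend fix):

# Illustration only, not the actual xend code: the pattern one would
# expect in Hald.py's shutdown() is to tolerate a helper process that
# has already exited instead of letting ESRCH escape.
import errno
import os
import signal

def kill_if_running(pid):
    try:
        os.kill(pid, signal.SIGINT)
    except OSError, e:
        # "No such process" just means it is already gone; anything
        # else is a real error and should still propagate.
        if e.errno != errno.ESRCH:
            raise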

Heartbeat debug log (note that the domain terminated successfully from
Heartbeat's point of view, but the subsequent VG deactivation failed):
Jun 20 13:55:34 xenha01 lrmd: [13229]: info: RA output: (res_xen_xen-ad:stop:stdout) Domain xen-ad terminated All domains terminated
- and subsequently
Jun 20 13:55:38 xenha01 tengine: [10035]: info: send_rsc_command: Initiating action 17: res_lvm_xendomains01_stop_0 on xenha01
Jun 20 13:55:38 xenha01 crmd: [13232]: info: do_lrm_rsc_op: Performing op=res_lvm_xendomains01_stop_0 key=17:13:78f873ed-af08-4add-bb36-3798cd1a4a22)
Jun 20 13:55:38 xenha01 lrmd: [13229]: info: rsc:res_lvm_xendomains01: stop
Jun 20 13:55:38 xenha01 crmd: [13232]: info: process_lrm_event: LRM operation res_lvm_xendomains01_stop_0 (call=190, rc=1) complete
Jun 20 13:55:38 xenha01 tengine: [10035]: WARN: update_failcount: Updating failcount for res_lvm_xendomains01 on 519aa0b0-a947-47e9-ace9-d52030ef98a9 after failed stop: rc=1
Jun 20 13:55:38 xenha01 tengine: [10035]: info: match_graph_event: Action res_lvm_xendomains01_stop_0 (17) confirmed on xenha01 (rc=4)
Jun 20 13:55:38 xenha01 pengine: [10036]: ERROR: unpack_rsc_op: Remapping res_lvm_xendomains01_stop_0 (rc=1) on xenha01 to an ERROR
Jun 20 13:55:38 xenha01 pengine: [10036]: WARN: unpack_rsc_op: Processing failed op res_lvm_xendomains01_stop_0 on xenha01: Error
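
As a diagnostic starting point after a failed stop, one can look for
VBD backend entries that remain in xenstore for domains that no longer
exist. The sketch below only reports such leftovers; we have not
verified under which conditions removing them (or the matching
device-mapper entries) would be safe:

#!/usr/bin/env python
# Diagnostic sketch: report vbd backend entries left in xenstore for
# domains that no longer exist. Report-only on purpose; removing the
# entries may or may not be safe depending on blkback's state.
import subprocess

def xenstore_list(path):
    p = subprocess.Popen(["xenstore-list", path], stdout=subprocess.PIPE)
    out = p.communicate()[0]
    if p.returncode != 0:
        return []
    return out.split()

def running_domids():
    # `xm list` prints a header line; the second column is the domid.
    out = subprocess.Popen(["xm", "list"],
                           stdout=subprocess.PIPE).communicate()[0]
    ids = set()
    for line in out.splitlines()[1:]:
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            ids.add(fields[1])
    return ids

alive = running_domids()
for domid in xenstore_list("/local/domain/0/backend/vbd"):
    if domid not in alive:
        print "stale vbd backend(s) for dead domain", domid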

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-users
