


To: Daniel Stodden <daniel.stodden@xxxxxxxxxx>
Subject: Re: [Xen-devel] Making snapshot of logical volumes handling HVM domU causes OOPS and instability
From: Scott Garron <xen-devel@xxxxxxxxxxxxxxxxxx>
Date: Tue, 31 Aug 2010 14:06:40 -0400
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Xu, Dongxiao" <dongxiao.xu@xxxxxxxxx>
Delivery-date: Tue, 31 Aug 2010 11:08:22 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1283246428.3092.3476.camel@xxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4C7864BB.1010808@xxxxxxxxxxxxxxxxxx> <4C7BE1C6.5030602@xxxxxxxx> <1283195639.26797.451.camel@xxxxxxxxxxxxxxxxxxxxxxx> <4C7C14F7.9090308@xxxxxxxxxxxxxxxxxx> <1283246428.3092.3476.camel@xxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/20100515 Icedove/3.0.4
On 08/31/2010 05:20 AM, Daniel Stodden wrote:
If it were just one or more tasks hanging initially, caught in some
wait state, then identifying the point where things broke can
sometimes be quite straightforward. Doesn't seem to be the case

     True.  It's at least narrowed down to something with the way LVM/DM
and udev interact during creation and removal of snapshots since the
machine can run for days without incident until I start adding and
removing snapshots (of running HVM volumes).
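
     A minimal sketch of the create/remove cycle described above, for
anyone trying to reproduce it.  The volume group vg0, the origin volume
hvm-disk, the snapshot size and the cycle count are placeholder
assumptions, not the names from the affected machine, and the loop
defaults to a dry run that only prints the commands:

```shell
#!/bin/sh
# Hypothetical names: vg0 (volume group), hvm-disk (origin LV backing a
# running HVM domU). Override them through the environment as needed.
VG="${VG:-vg0}"
LV="${LV:-hvm-disk}"
SNAP="${SNAP:-${LV}-snap}"
SIZE="${SIZE:-1G}"
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

snapshot_cycle() {
    # Create a snapshot of the origin volume, then remove it again.
    run lvcreate --snapshot --size "$SIZE" --name "$SNAP" "/dev/$VG/$LV" || return 1
    run lvremove --force "/dev/$VG/$SNAP" || return 1
}

repeat_cycles() {
    # $1 = number of create/remove cycles to attempt
    i=1
    while [ "$i" -le "$1" ]; do
        echo "cycle $i"
        snapshot_cycle || return 1
        i=$((i + 1))
    done
}

repeat_cycles "${CYCLES:-3}"
```

     Setting DRY_RUN=0 (as root, against a volume that backs a running
HVM domU) would actually execute the commands; CYCLES sets the number
of iterations.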

Okay. I guess that won't be simple to repro. I wonder what you are
running in dom0. Distro and version, what you upgraded and what not,
any customized software builds etc.

     I'm running Debian Squeeze (testing) and have included a full list
of installed packages (dpkg -l) in the text file referenced in some of
my previous e-mails, here:

     I've also included the output of "ps -eH -owchan,nwchan,cmd" during
normal operations (not yet in the "crashed" state).
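
     To compare the wait channels between the normal and the "crashed"
state, the ps output could be condensed into a per-channel count.  A
rough sketch: summarize_wchan is a hypothetical helper name, and it
assumes the first column is the wchan field, as in the command above:

```shell
#!/bin/sh
# Usage (file name is only an example):
#   ps -eH -owchan,nwchan,cmd | summarize_wchan > wchan-normal.txt
summarize_wchan() {
    # Reads the ps output on stdin, skips the header line, and prints
    # "<count> <wchan>" for each wait channel, most common first.
    awk 'NR > 1 { count[$1]++ }
         END { for (k in count) printf "%d %s\n", count[k], k }' |
        sort -k1,1nr -k2,2
}
```

     Running it once during normal operation and once after the hang,
then diffing the two files, should show which wait channels pile up.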

     I don't recall running any customized software builds on dom0.
It's a fairly bog-standard Debian installation.  If I'm going to do
anything customized, I usually do it on a domU.

Given the rate at which you reproduce this and because only the
snapshots seem to trigger the problem, to me this looks more like an
LVM/DM issue than pvops specific.

     That has crossed my mind.  The only reason that I suspected
anything to do with Xen or pvops was that it only seems to happen when
creating/removing a snapshot of an active, running HVM.  I can create
and remove snapshots of other volumes all day and not trigger the bug
(tested yesterday).  It would probably be impossible to trigger the bug
on a bare-metal machine that's not running a hypervisor.

Also, it might be worth trying to turn off udev and see whether that
changes something.

     I'm going to try to reproduce it on another, less critical machine
today, so I can poke at it a little more.  I'll let you know what I find.

Scott Garron
