xen-devel
Re: [Xen-devel] A snapshot is not (really) a cow
Although I did get quite a lot wrong in my analysis, I come back to the
point that as lvm2 creates additional snapshots of the same original,
it adds them to a list of snapshots of that original. If something goes
wrong while a new snapshot is being added, the effect in my experience
is pretty bad: all of the other snapshots of the same original become
unusable and you lose control of xenU domains that have their root
filesystems in those snapshots.
There may be a bug in lvm2 in the handling of error conditions when
adding a new snapshot. But if there are linkages between snapshots of
the same original it is going to be pretty hard to guarantee that the
snapshots are properly independent of each other and proof against this
kind of snarlup.
You can see what I mean if you look at the code:
drivers/md/snap.c: line 76
/*
 * One of these per registered origin, held in the snapshot_origins hash
 */
struct origin {
	/* The origin device */
	struct block_device *bdev;

	struct list_head hash_list;

	/* List of snapshots for this origin */
	struct list_head snapshots;
};
drivers/md/snap.c: line 922 +
static int __origin_write(struct list_head *snapshots, struct bio *bio)
{
	/*
	 * ... code which propagates a write to the original to all the
	 * snapshots on that original ...
	 */
}
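To make the linkage concrete, here is a minimal userspace sketch of the propagation described above. It is not the kernel code: the names (toy_origin, toy_snapshot, toy_origin_write and so on), the fixed-size arrays and the block sizes are all assumptions made up for illustration. It only models the idea that an origin keeps a list of its snapshots and that a write to the origin must first preserve the old block in every snapshot that has not already copied it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS    4   /* blocks per volume (toy size, an assumption) */
#define BLOCK_SIZE 8   /* bytes per block (toy size, an assumption) */
#define MAX_SNAPS  4   /* snapshots one origin can track in this model */

/* Hypothetical model of one snapshot: a private cow store plus a map
 * recording which blocks have already been copied into it. */
struct toy_snapshot {
    char cow[NBLOCKS][BLOCK_SIZE];
    bool copied[NBLOCKS];
};

/* Hypothetical model of an origin and the snapshots registered on it. */
struct toy_origin {
    char data[NBLOCKS][BLOCK_SIZE];
    struct toy_snapshot *snaps[MAX_SNAPS];
    int nsnaps;
};

/* Register an empty snapshot on the origin; returns -1 if the table is full. */
static int toy_add_snapshot(struct toy_origin *o, struct toy_snapshot *s)
{
    if (o->nsnaps == MAX_SNAPS)
        return -1;
    memset(s, 0, sizeof(*s));
    o->snaps[o->nsnaps++] = s;
    return 0;
}

/* Write one whole block (BLOCK_SIZE bytes from buf) to the origin.
 * Before overwriting, copy the old contents into every snapshot that
 * does not yet hold its own copy. Returns the number of copies made. */
static int toy_origin_write(struct toy_origin *o, int blk, const char *buf)
{
    int copies = 0;

    assert(blk >= 0 && blk < NBLOCKS);
    for (int i = 0; i < o->nsnaps; i++) {
        struct toy_snapshot *s = o->snaps[i];

        if (!s->copied[blk]) {
            memcpy(s->cow[blk], o->data[blk], BLOCK_SIZE);
            s->copied[blk] = true;
            copies++;
        }
    }
    memcpy(o->data[blk], buf, BLOCK_SIZE);
    return copies;
}

/* Read a block as seen through a snapshot: the preserved copy if one
 * exists, otherwise the origin's current contents. */
static const char *toy_snapshot_read(const struct toy_origin *o,
                                     const struct toy_snapshot *s, int blk)
{
    assert(blk >= 0 && blk < NBLOCKS);
    return s->copied[blk] ? s->cow[blk] : o->data[blk];
}
```

In this model the cost of the first write to a given origin block grows with the number of registered snapshots, and the loop in toy_origin_write is the one place where the snapshots of a single origin are handled together.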
Of course I would be delighted to be wrong, or to find that a little
bugfix or two in the lvm code will do the trick. But it feels wrong to
have any connection between the independent cow images, and it probably
explains why we can't have cow images of cow images, which don't really
present a problem if all the linkages refer backwards from the cow image
to the original. Not that I see a violent need for cow pyramid structures.
-- Peri
Christian Limpach wrote:
On Sun, Sep 26, 2004 at 12:38:06PM +0100, Peri Hankey wrote:
I always found the lvm2 'snapshot' terminology confusing - the thing
created as a 'snapshot' is what accepts changes while a backup is made
of the original volume.
I don't think that's the terminology the LVM2 people use. The regular
use is to create a snapshot and backup this snapshot while you keep
using the original.
# drat - I needed another domain
lvcreate -L512M -s -n u4 /dev/vmgroup/root_file_system
... nasty messages .... all xenU domains dead ....
... lvm2 system in inconsistent state ...
... /dev/vmgroup/u4 doesn't exist ...
... /dev/mapper/root_file_system-u4 does exist ...
This should work, if it doesn't then it would seem to be a bug in
LVM2. Since you mention out of memory error messages, are you sure
that you're not running out of memory in dom0?
The problem is that the 'snapshot' cows hold onto each other's tails -
they seem to be held in a list linked (I think) from the original
logical volume (here /dev/vmgroup/root_file_system). For their intended
use as enabling backup, this seems to be meant to allow writes to the
original volume to be propagated to all 'snapshots' created against that
volume - there are comments about getting rid of the 'snapshots' after
the backup has been done because this propagation of writes hits
performance.
For my requirements, and I imagine for most others reading this list,
all of this is superfluous. I don't need
original -> snap1 -> snap2 -> snap3 ...
This is not the layout LVM2 uses. If you look at the output of
``dmsetup table'', you'll see that each snapshot is independent
and only refers to the device it is a snapshot of and to its cow
device which will hold modifications.
so that I can't create a new snap4 while any of the others are in use.
I just need
original <- cow1
original <- cow2
original <- cow3
original <- cow4
...
where A '<-' B means B is a cow image of A, and where each of the cows
is independent of the others so that a new cow can be created at any
time, regardless how many others are active.
This is the layout LVM2 uses. And it is indeed simple (and should be
quite robust) as long as you don't want to write to the original.
If you write to the original, you will have to copy the changed
blocks to every snapshot's cow device. I think I've seen this
fail when having multiple snapshots and writing to the original.
But since you didn't write to the original (and one generally doesn't
need/want to write to the original in our case), that problem
is unlikely to be relevant to the failure you've seen.
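The read-only-original case discussed above can be sketched in a few lines of C. This is a toy model under stated assumptions, not LVM2 or device-mapper code: toy_cow and its functions are hypothetical names, blocks are written whole, and the original is a flat in-memory buffer. Each cow image holds only a backward pointer to the original, so a write to one cow touches nothing that another cow can see.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NBLOCKS    4   /* blocks per volume (toy size, an assumption) */
#define BLOCK_SIZE 8   /* bytes per block (toy size, an assumption) */

/* Hypothetical cow image: a backward reference to a shared original
 * that is never written, plus private storage for modified blocks. */
struct toy_cow {
    const char *base;                 /* NBLOCKS * BLOCK_SIZE bytes */
    char delta[NBLOCKS][BLOCK_SIZE];  /* this image's own changes */
    bool dirty[NBLOCKS];              /* which blocks live in delta */
};

/* Attach a fresh, empty cow image to an original. */
static void toy_cow_init(struct toy_cow *c, const char *base)
{
    memset(c, 0, sizeof(*c));
    c->base = base;
}

/* Write one whole block (BLOCK_SIZE bytes from buf) to this cow image
 * only. The original and any sibling cow images are left untouched. */
static void toy_cow_write(struct toy_cow *c, int blk, const char *buf)
{
    assert(blk >= 0 && blk < NBLOCKS);
    memcpy(c->delta[blk], buf, BLOCK_SIZE);
    c->dirty[blk] = true;
}

/* Read a block: the private copy if this image has written the block,
 * otherwise the block from the shared original. */
static const char *toy_cow_read(const struct toy_cow *c, int blk)
{
    assert(blk >= 0 && blk < NBLOCKS);
    return c->dirty[blk] ? c->delta[blk] : c->base + blk * BLOCK_SIZE;
}
```

Because nothing in this model links one toy_cow to another, a new image can be initialised at any time without consulting the existing ones. The cross-snapshot work only appears once writes to the original have to be supported.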
christian