JC> Your new CoW driver, however doesn't handle allocation, it just
JC> assumes a CoW volume that's as big as the original disk
Correct, it does not do sparse allocation. Sparse allocation does not
make any sense on a fixed-size block device. Note that it has nothing
to do with how the rest of the dm-userspace/cowd system works. When
using the prototype qcow plugin to cowd, sparse allocation is
performed as you would expect.
JC> and uses a bitmap to optimize lookups.
It does not use a bitmap to optimize lookups; the bitmap is the
metadata that records which blocks have been mapped. The bit number
also tells us where the block is located in the CoW volume, since the
location can be computed from the block size and the bit number.
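
To make the bitmap-as-metadata idea concrete, here is a minimal
sketch, not the actual dscow code. The block size, the bit ordering
within a byte, the class name BitmapCow, and the assumption that block
data starts at offset 0 of the CoW volume are all hypothetical choices
made for illustration.

```python
BLOCK_SIZE = 4096  # assumed block size in bytes (not taken from the source)


class BitmapCow:
    """Toy model of a fully allocated CoW volume described by one bitmap."""

    def __init__(self, num_blocks, block_size=BLOCK_SIZE):
        self.num_blocks = num_blocks
        self.block_size = block_size
        # One bit per block: 1 means the block has been copied to the CoW volume.
        self.bitmap = bytearray((num_blocks + 7) // 8)

    def _check(self, block):
        if not 0 <= block < self.num_blocks:
            raise IndexError("block %d out of range" % block)

    def is_mapped(self, block):
        self._check(block)
        return bool(self.bitmap[block // 8] & (1 << (block % 8)))

    def mark_mapped(self, block):
        self._check(block)
        self.bitmap[block // 8] |= 1 << (block % 8)

    def resolve(self, block):
        """Return (device, byte_offset) telling where a read of `block` goes."""
        # The offset is the same on both devices because the CoW volume is
        # assumed to be as large as the base disk; only the device differs.
        offset = block * self.block_size
        return ("cow" if self.is_mapped(block) else "base", offset)
```

Because the location follows directly from the bit number, no separate
lookup table is needed: the single bit is the whole mapping record.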
JC> Given that you seem to be assuming that the block device is
JC> providing sparse allocation and dynamic disk resizing for you,
No, I'm not assuming this. With the current format plugin, we are
just assuming a fully-allocated block device (such as an LVM volume
or a simple iSCSI device).
JC> isn't it likely that such devices would already provide low-level
JC> support for CoW and disk snapshotting?
Some do, but plenty do not. However, being able to do snapshots, CoW,
rollback, etc. while easily synchronizing with other pieces of Xen is
something that will be simpler with dm-userspace than with a hardware
device. It also allows us to provide the same advanced features, no
matter what device we are working on top of, and independent of
whether or not it supports them.
JC> Qcow provides both sparse support and CoW functionality.
Sparse allocation and control of block devices (e.g. growing an LVM
volume as we need it for a growing CoW store) is definitely something
that is on our radar. We are just not there yet.
I should mention that I am not fighting for a format here. The
plugin that is the default in cowd right now (dscow) implements a very
simple format geared at speed and simplicity. It has limitations and
we understand that. It is not intended to be the CoW format of the
future :)
JC> What's the policy on metadata writes - are metadata writes
JC> synchronised with the acknowledgement of block IO requests?
Yes. Ian asked for this the last time we posted our code. We have
worked hard to implement this ability in dm-userspace/cowd between the
time we posted our original version and our recent post.
While the normal persistent domain case definitely needs this to be
"correct", there are other usage models for virtual machines that do
not necessarily need to have a persistent disk store. We are able to
disable the metadata syncing (and the metadata writing altogether if
desired) and regain a lot of speed.
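
The ordering described above can be sketched as follows. This is only
an illustration under stated assumptions, not the cowd implementation:
the function name cow_write, the sync and persist_metadata flags, the
4096-byte block size, and keeping the bitmap in a separate metadata
file are all hypothetical.

```python
import os


def cow_write(data_fd, meta_fd, block, payload, bitmap,
              block_size=4096, sync=True, persist_metadata=True):
    """Write one block to the CoW store, then record it in the bitmap.

    Returns True once the request may be acknowledged to the guest.
    """
    if len(payload) != block_size:
        raise ValueError("payload must be exactly one block")

    # 1. Write the data block at the offset implied by its block number.
    os.pwrite(data_fd, payload, block * block_size)
    if sync:
        os.fsync(data_fd)

    # 2. Set the bit in memory and, if requested, on disk.
    byte_index = block // 8
    bitmap[byte_index] |= 1 << (block % 8)
    if persist_metadata:
        os.pwrite(meta_fd, bytes(bitmap[byte_index:byte_index + 1]), byte_index)
        if sync:
            # 3. Metadata must be stable before the write is acknowledged.
            os.fsync(meta_fd)

    return True
```

With sync=False the fsync calls are skipped, and with
persist_metadata=False the bitmap lives only in memory, which trades
persistence across restarts for speed.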
--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@xxxxxxxxxx
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel