This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


[Xen-devel] About Copy-on-Write in Xen

On 16 Jan 2004, at 05:52, stevegt@xxxxxxxxxxxxx wrote:

On Tue, Nov 11, 2003 at 11:55:50PM +0000, Ian Pratt wrote:
We're developing a copy-on-write file system [for virtual disks...]

How is this going? If I can get this, I can use Xen to replace User Mode Linux.

It's actually not a file system. It's a multiple-device (md) Linux kernel driver
derived from RAID1. I finished writing the CoW device driver for 2.4.24
and modified raidtools-1.00.3 to support CoW.

Currently, the CoW device can be automatically recognized at boot time
and started and stopped by 'raidstart' and 'raidstop' respectively. However,
it doesn't yet work as expected; I've been debugging the kernel, which is
a bit involved.

Some related things that would help with that aim:

(1) Implement lazy allocation (the way a sparse file works) for vbd
    blocks. (Or is this already implemented and I haven't found it in
    the list traffic?) This would allow, say, 100 vbd's of 4Gb each to
    be allocated on a 100Gb partition, as long as the average
    filesystem utilization in those vbd's stayed below 25%. The extra
    redirection would incur a slight performance penalty, but there is
    a balancing economic driver -- see below.
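The overcommit arithmetic in (1) can be checked with a quick sketch (the
numbers come from the example above; the variable names are illustrative):

```python
# 100 sparse VBDs of 4 GB each, backed by a single 100 GB partition.
NUM_VBDS = 100
VBD_SIZE_GB = 4
PARTITION_GB = 100

virtual_total_gb = NUM_VBDS * VBD_SIZE_GB        # total virtual capacity
max_avg_utilization = PARTITION_GB / virtual_total_gb

print(virtual_total_gb)        # 400
print(max_avg_utilization)     # 0.25 -> the 25% ceiling mentioned above
```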

In VMware and UML, virtual disks are implemented as files on real disks.
They can grow by taking more space from the underlying real disks.

In Xen, VBDs are implemented as 'extents'. An 'extent' is a group of
contiguous sectors on a real disk. VBDs can grow by adding extra
extents from the underlying real disks. Obviously, VBDs work at a
much lower level.

So you can say you have created a VBD of 4G that actually starts with
only 500M. Whenever its free space drops below a threshold, extra
extents are automatically added. This feature is already present in
1.2 and 1.3.

BTW, this idea is quite similar to LVM.
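The grow-on-threshold behaviour described above can be modeled with a
small sketch (all names and sizes are hypothetical; the real logic lives
in Xen's VBD layer, not here):

```python
class GrowableVBD:
    """Toy model of a VBD that starts with one extent and adds another
    whenever free space falls below a threshold (illustrative only)."""

    def __init__(self, extent_mb=500, threshold_mb=100):
        self.extent_mb = extent_mb
        self.threshold_mb = threshold_mb
        self.allocated_mb = extent_mb   # start with a single 500 MB extent
        self.used_mb = 0

    def write(self, size_mb):
        self.used_mb += size_mb
        # Add extents until free space is back above the threshold.
        while self.allocated_mb - self.used_mb < self.threshold_mb:
            self.allocated_mb += self.extent_mb

vbd = GrowableVBD()
vbd.write(450)              # free space drops to 50 MB, below the threshold
print(vbd.allocated_mb)     # 1000: one extra 500 MB extent was added
```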

(2) Also implement lazy allocation for COW backing store, unless that
    would already be taken care of by (1).

In CoW, we typically have two devices. One is the original, say hda,
and the other is the 'dirty' one, say hdb.

When a read is attempted for a sector, hdb is searched first and, if the
sector is present there, its content is returned; otherwise the content
is read from hda. All writes, on the other hand, go to hdb.

Theoretically, hdb must be no smaller than hda in order to hold all
possible writes. In practice, though, only a small working set is ever
written; that's why hdb can be much smaller than hda and still be
sufficient.

So 'lazy allocation' for CoW means: hdb starts small, but grows by
adding extra extents.
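As a rough sketch (not the actual md driver code), the read/write rule
above can be modeled in a few lines, with a Python dict standing in for
the sparse, lazily grown 'dirty' device:

```python
class CowDevice:
    """Sketch of the copy-on-write scheme described above: reads fall
    through to the original device unless the sector has been written;
    all writes land in the 'dirty' store, which grows lazily."""

    def __init__(self, original):
        self.original = original   # sector -> data (stands in for hda)
        self.dirty = {}            # sparse 'dirty' store (stands in for hdb)

    def read(self, sector):
        # Search the dirty device first, then fall back to the original.
        if sector in self.dirty:
            return self.dirty[sector]
        return self.original[sector]

    def write(self, sector, data):
        # All writes go to the dirty device; the original is never touched.
        self.dirty[sector] = data

hda = {0: b"boot", 1: b"etc"}
cow = CowDevice(hda)
cow.write(1, b"etc-modified")
print(cow.read(0))   # b'boot'          (falls through to the original)
print(cow.read(1))   # b'etc-modified'  (served from the dirty store)
print(hda[1])        # b'etc'           (original unchanged)
```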

So, why is CoW useful?

1) To remove redundancy in content. When different clients use almost the
same files, CoW can significantly reduce the total storage requirement.

2) CoW can work as a huge cache. For example, you could make a CoW device
out of a ramdisk and a hard disk. There could even be multiple layers of
cache: memory, magnetic disks, optical disks ...

Anyway, I'm still debugging the CoW driver. Hopefully it will be ready ASAP.

-- Bin

Xen-devel mailing list