This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-API] How snapshot work on LVMoISCS SR

To: Anthony Xu <anthony@xxxxxxxxx>
Subject: Re: [Xen-API] How snapshot work on LVMoISCS SR
From: Julian Chesterfield <julian.chesterfield@xxxxxxxxxxxxx>
Date: Tue, 26 Jan 2010 10:34:09 +0000
Cc: "xen-api@xxxxxxxxxxxxxxxxxxx" <xen-api@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 26 Jan 2010 02:34:06 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1264477399.2927.58.camel@mobl-ant>
References: <1264477399.2927.58.camel@mobl-ant>
Sender: xen-api-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird (X11/20071219)
Hi Anthony,

Anthony Xu wrote:
> Hi all,
>
> Basically, snapshots on an LVMoISCSI SR work well. They provide thin
> provisioning, so they are fast and disk-space efficient.
>
> But I still have the concerns below.
>
> Each snapshot adds one more VHD to the chain. If I create 16
> snapshots, there are 16 VHDs in the chain, which means that when a VM
> accesses a disk block, it may need to check 16 VHD LVM volumes one by one
> to find the right block, which makes VM disk access slow. However, that is
> understandable; it is part of snapshotting, IMO.
The depth and speed of access depend on the write pattern to the disk. In XCP we add an optimisation called a BATmap, which stores one bit per BAT entry. This is a fast lookup table that is cached in memory while the VHD is open, and it tells the block device handler whether a block has been fully allocated. Once a block is fully allocated (all of its logical 2MB written), the block handler knows that it doesn't need to read or write the bitmap that corresponds to the data block; it can go directly to the disk offset.

Scanning through the VHD chain can therefore be very quick: the block handler reads down the chain, checking the BAT of each node until it finds a node where the block is allocated, hopefully with the BATmap bit set. The worst case is a random disk write workload, which leaves the disk fragmented and partially allocated. Every read or write will then potentially incur a bitmap check at every level of the chain.
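To make the lookup cost concrete, here is a minimal Python sketch of a read walking down a VHD chain. It is a toy model, not the XCP block handler: `VhdNode`, its fields, and `locate` are hypothetical names, the per-block bitmap is modelled as a set of written sector offsets, and 512-byte sectors are assumed. It only counts how many per-block bitmap reads a lookup incurs, depending on whether the BATmap bit is set.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# Assumption: 2 MB blocks made of 512-byte sectors.
SECTORS_PER_BLOCK = (2 * 1024 * 1024) // 512


@dataclass
class VhdNode:
    """Hypothetical in-memory stand-in for one VHD in a chain."""
    name: str
    parent: Optional["VhdNode"] = None
    # block index -> sector offsets written in that block (the "bitmap")
    blocks: Dict[int, Set[int]] = field(default_factory=dict)

    def is_allocated(self, block: int) -> bool:
        # Models a BAT lookup: is there an entry for this block?
        return block in self.blocks

    def batmap_bit(self, block: int) -> bool:
        # Models the BATmap: set only when every sector has been written.
        return len(self.blocks.get(block, ())) == SECTORS_PER_BLOCK


def locate(leaf: VhdNode, block: int, sector: int) -> Tuple[Optional[str], int]:
    """Walk from the leaf towards the base.

    Returns (name of the node holding the sector, or None if no node
    holds it; number of bitmap reads incurred on the way).
    """
    bitmap_reads = 0
    node = leaf
    while node is not None:
        if node.is_allocated(block):
            if node.batmap_bit(block):
                # Fully allocated: no bitmap read is needed.
                return node.name, bitmap_reads
            bitmap_reads += 1  # partially allocated: consult the bitmap
            if sector in node.blocks[block]:
                return node.name, bitmap_reads
        node = node.parent
    return None, bitmap_reads


# A three-node chain: block 0 is fully written in the base and only
# partially written in the two nodes above it.
base = VhdNode("base", blocks={0: set(range(SECTORS_PER_BLOCK))})
snap = VhdNode("snap", parent=base, blocks={0: {5}})
leaf = VhdNode("leaf", parent=snap, blocks={0: {7}})

hit_in_snap = locate(leaf, 0, 5)
hit_in_base = locate(leaf, 0, 100)
unallocated = locate(leaf, 1, 0)
```

In this model a sector that only the base holds costs one bitmap read per partially allocated node above it, while the fully allocated base itself costs none. That is the fragmentation effect described above.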
> But after I delete all 16 snapshots, there are still 16 VHDs in the chain
> and disk access is still slow, which is neither understandable nor
> reasonable, even though there may be only a few KB of difference between
> each snapshot.
There is a mechanism in XCP called the GC coalesce thread, which gets kicked off asynchronously following a VDI deletion event. It queries the VHD tree and determines whether there is any coalescable work to do. Coalescable work is defined as:

'a hidden child node that has no siblings'

Hidden nodes are non-leaf nodes that reside within a chain. When a snapshot leaf node is deleted, it therefore leaves redundant links in the chain that can be safely coalesced. You can kick off a coalesce by issuing an SR scan, although it should kick off automatically, handled by XAPI, within 30 seconds of deleting the snapshot node. If you look in the /var/log/SMlog file you'll see a lot of debug information, including tree dependencies, which will tell you a) whether the GC thread is running, and b) whether there is coalescable work to do. Note that deleting snapshot nodes does not always mean that there is coalescable work to do, since there may be other siblings, e.g. VDI clones.
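The rule 'a hidden child node that has no siblings' can be illustrated with a small Python sketch. This is not the GC thread's code: the `Node` class, its `hidden` flag, and the `coalescable` function are hypothetical names used only to show how the rule applies to a VHD tree, including the case where a clone blocks coalescing.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for one VHD in the tree."""
    name: str
    hidden: bool
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def coalescable(root: Node) -> List[str]:
    """Return the names of hidden child nodes that have no siblings."""
    found = []
    stack = [root]
    while stack:
        node = stack.pop()
        is_only_child = node.parent is not None and len(node.parent.children) == 1
        if node.hidden and is_only_child:
            found.append(node.name)
        stack.extend(node.children)
    return found


# Case 1: all snapshot leaves deleted, leaving a straight chain
# base -> h1 -> h2 -> active leaf.
base1 = Node("base", hidden=True)
h1 = base1.add_child(Node("h1", hidden=True))
h2 = h1.add_child(Node("h2", hidden=True))
h2.add_child(Node("active", hidden=False))
straight_chain = coalescable(base1)

# Case 2: a clone still hangs off the base, so h1 has a sibling.
base2 = Node("base", hidden=True)
g1 = base2.add_child(Node("h1", hidden=True))
base2.add_child(Node("clone", hidden=False))
g1.add_child(Node("active", hidden=False))
with_clone = coalescable(base2)
```

In the first tree both intermediate hidden nodes qualify; in the second nothing qualifies, because the clone is a sibling of the only hidden child.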
> Is there any way to reduce the depth of the VHD chain after deleting
> snapshots, to get the VM back to normal disk performance?
The coalesce thread handles this; see above.
> And I notice that unused VHD volumes still exist after deleting
> snapshots. Can we delete them automatically?
No. I do not recommend deleting VHDs manually since they are almost certainly referenced by something else in the chain. If you delete them manually you will break the chain, it will become unreadable, and you potentially lose critical data. VHD chains must be correctly coalesced in order to maintain data integrity.


> - Anthony

xen-api mailing list
