This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Xen Memory De-duplication

To: Aditya Gadre <adivb2003@xxxxxxxxx>
Subject: Re: [Xen-devel] Xen Memory De-duplication
From: Pasi Kärkkäinen <pasik@xxxxxx>
Date: Sat, 9 Oct 2010 22:09:20 +0300
Cc: Xen-devel@xxxxxxxxxxxxxxxxxxx
Delivery-date: Sat, 09 Oct 2010 12:10:16 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <AANLkTimbu2g_s9gdOD4N76TuY--x3nbhwAZDMNdkLCKh@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTimbu2g_s9gdOD4N76TuY--x3nbhwAZDMNdkLCKh@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.18 (2008-05-17)
On Sat, Oct 09, 2010 at 11:26:23PM +0530, Aditya Gadre wrote:
>    The aim is to implement Xen memory de-duplication with minimum overhead.
>    Our approach to de-duplication is as follows.
>    In most cases, Domain-U uses a small set of well-known operating systems
>    such as Linux, FreeBSD and Microsoft Windows. In such an environment many
>    domains share read-only filesystems that contain operating system and
>    frequently used program files and libraries. Each domain has its own
>    writable filesystems for storing data and temporary files. In this
>    configuration, pages scattered across different domains often contain
>    the same disk block. So, in our approach to de-duplication, we intend to
>    add a data structure in Dom0 which stores the disk block number and the
>    machine frame number (MFN) when a read request for read-only code (and
>    data) is made. When another DomU then requests the same block and Dom0
>    receives the I/O (DMA) request, it will first check the data structure
>    for an entry for that block. If it finds one, it will return the MFN of
>    the already-read page and map it to the requesting domain's PFN, so no
>    disk I/O is needed for blocks which have already been read. This in turn
>    de-duplicates the read-only pages accessed by multiple domains without
>    the overhead of hashing each page.
>    Test case scenario:
>    Consider a Dom0 Linux kernel using a filesystem with deduplication
>    enabled. Then we install a DomU kernel with the virtual disk as an image
>    file on the disk (.img). Then we make multiple copies of the image to
>    deploy multiple DomUs running the same kernel. Now, as deduplication is
>    enabled in the filesystem, initially all the blocks of the domains will
>    point to the same disk blocks. Now when the kernels are booted, they
>    will all consume memory only once for the programs (code segment) loaded
>    into memory. Now as these OSs start to write to their own virtual
>    filesystems, the blocks of the image will be COW'ed by the filesystem,
>    resulting in different block numbers.
>    Is such an approach implemented?  We intend to implement this as a project.
>    What are the expected challenges?

Yeah, I think the image COW is possible using the Xen blktap2 vhd support,
and also maybe Xen qcow* stuff.
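
Just to illustrate the lookup idea from the quoted mail, here is a rough
sketch of a "disk block -> MFN" table. It is a toy user-space model in Python,
not Xen code: the class name, the (device id, block number) key, the counter
used as a frame allocator and the on_write() hook are all made up for the
example, and nothing here touches real grant tables or p2m mappings.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

BlockKey = Tuple[int, int]  # (hypothetical backing-device id, disk block number)


@dataclass
class BlockToMfnTable:
    """Toy model of the proposed Dom0 table; all names here are hypothetical."""

    block_to_mfn: Dict[BlockKey, int] = field(default_factory=dict)
    sharers: Dict[int, int] = field(default_factory=dict)  # MFN -> mapping count
    disk_reads: int = 0
    _next_mfn: int = 0x1000  # stand-in for a real machine-frame allocator

    def read_block(self, dev: int, block: int) -> int:
        """Return the MFN holding (dev, block); count a disk read only on a miss."""
        key = (dev, block)
        mfn = self.block_to_mfn.get(key)
        if mfn is None:
            # Miss: pretend to read the block from disk into a fresh frame.
            mfn = self._next_mfn
            self._next_mfn += 1
            self.disk_reads += 1
            self.block_to_mfn[key] = mfn
            self.sharers[mfn] = 0
        # Hit or miss, one more guest PFN now maps this frame.
        self.sharers[mfn] += 1
        return mfn

    def on_write(self, dev: int, block: int) -> None:
        """Forget a block once it is written, so later reads do not share stale data."""
        self.block_to_mfn.pop((dev, block), None)


table = BlockToMfnTable()
first = table.read_block(dev=0, block=42)   # first DomU: real read
second = table.read_block(dev=0, block=42)  # second DomU: served from the table
print(first == second, table.disk_reads)
```

The sketch only covers the lookup. A shared frame would also have to be mapped
read-only into each guest and copied on a guest write, which is not modelled
here.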

Also check the Xen 4.0 wiki page for more info about memory sharing etc:

-- Pasi
