

Re: [Xen-devel] Shadow Page Tables in Xen

To: priya sehgal <priyagps@xxxxxxxxxxx>
Subject: Re: [Xen-devel] Shadow Page Tables in Xen
From: Gianluca Guida <glguida@xxxxxxxxx>
Date: Wed, 22 Apr 2009 12:05:03 +0200
Cc: gianluca.guida@xxxxxxxxxxxxx, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Tim Deegan <Tim.Deegan@xxxxxxxxxx>
On Wed, Apr 22, 2009 at 12:37 AM, priya sehgal <priyagps@xxxxxxxxxxx> wrote:
>> > We have a course project, in which we have to improve the
>> > performance of live migration for HVM guests. It seems that to
>> > support live migration, all the page table entries in the shadow
>> > page table are marked as write protected, so as to know which
>> > pages are dirtied and to be sent to the other machine. Since,
>> > there will be many page faults leading to performance
>> > degradation, we want to reduce these page faults. In our course
>> > project, we are supposed to form groups of pages and if any page
>> > in the group hits the page fault (due to write-protection), we
>> > mark all the pages in the group as RW. This way we can reduce the
>> > page faults.
>> >
>> Have you actually measured this? I think that the major cause of
>> page faults and VM slowdown is -- rather than page faults on write
>> access -- the fact that we blow the shadow pagetables away
>> everytime we clean the dirty bitmap, and this requires a long
>> operation to remove from top to bottom all reference counts and
>> reconstructing later the shadow pagetables on the next memory
>> accesses.
> We have not measured this, but we will benchmark it after making the
> changes. Since the number of page faults will reduce by a factor of
> "n", where "n" is the size of the page group, it should help speed
> the VM. If n is large enough, say 1000 contiguous pages and the
> workload is such that it dirties consecutive pages, it should help in
> improving performance.  For very small values of "n" it might not
> help that much.

There are various problems I can see with this approach:

- A fixup fault (a fault that adds the writable mapping to an L1 after
a pagefault) is not so expensive in this context. The big slowdown is
that we run most of the time on empty shadow pagetables (because the
shadow pagetables are blown away so often). So, even if you speed up
this minor case, you won't get too far.

- As Tim suggested, this will make the bandwidth required to do live
migration much bigger (you're talking about increasing the granularity
of memory to be sent from 1 to 1000 pages). So you should take into
account that yes, making bigger logdirty chunks will decrease the
pagefaults, but will increase the required network bandwidth, which is
a very important parameter for live migration.

- Also, to a lesser extent, the fact that pages close to a page just
dirtied are likely to be dirtied soon does not imply that, when libxc
sends the big chunk of pages over the network, the neighboring pages
have already been dirtied by the guest. This might unpredictably cause
the same big chunk of memory to be sent over the network multiple
times during the live migration, and will further increase the
bandwidth in an uncontrollable way.
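
To make the trade-off in the points above concrete, here is a toy
model in Python. It is only a sketch under simplifying assumptions,
not how Xen's log-dirty code is written: the function name
logdirty_cost and the idea of counting a single log-dirty round are
made up for illustration, and it ignores the cost of rebuilding the
shadow pagetables entirely.

```python
def logdirty_cost(dirtied_pages, group_size):
    """Return (faults, pages_sent) for one log-dirty round in a toy model.

    dirtied_pages: iterable of page frame numbers the guest writes to.
    group_size:    hypothetical number of contiguous pages per group (n).
    """
    # The first write into a group takes one fault and makes the whole
    # group writable, so later writes to the same group do not fault.
    dirty_groups = {page // group_size for page in dirtied_pages}
    faults = len(dirty_groups)
    # The whole group is considered dirty, so all of its pages are sent.
    pages_sent = faults * group_size
    return faults, pages_sent


# Guest dirties 1000 consecutive pages: grouping saves faults for free.
sequential = range(1000)
print(logdirty_cost(sequential, 1))      # (1000, 1000)
print(logdirty_cost(sequential, 1000))   # (1, 1000)

# Guest dirties 10 pages spread 1000 apart: no faults saved, but
# 1000 times more pages go over the network.
sparse = range(0, 10000, 1000)
print(logdirty_cost(sparse, 1))          # (10, 10)
print(logdirty_cost(sparse, 1000))       # (10, 10000)
```

With a perfectly sequential workload the grouping costs nothing in
bandwidth, but as soon as the writes are scattered the fault count
stays the same while the data to send grows by a factor of n.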

So, unless you're interested in this particular feature, and you just
want to check whether it is worth it or not (i.e. you are OK if this
method doesn't work), I'd suggest you trace both log dirty fixup
faults (when we mark a page dirty during a page fault) and the moments
when a particular page is sent over the network, and analyze the flow
to see if this makes sense.
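
As a rough idea of what such an analysis could look like, here is a
small Python sketch. The trace format is an assumption of mine: a
time-ordered list of (kind, pfn) pairs, where kind is "fault" for a
log dirty fixup fault and "send" for a page going over the network.
Neither Xen nor libxc produces this format as such; you would have to
generate it from your own tracing.

```python
from collections import Counter


def analyse_trace(events):
    """Count sends per page, and sends with no fault since the last send.

    events: time-ordered iterable of (kind, pfn) pairs in a hypothetical
            trace format, with kind being "fault" or "send".
    Returns (sends, clean_sends), both Counters keyed by pfn.
    """
    sends = Counter()
    clean_sends = Counter()
    dirty = set()  # pages that faulted since they were last sent
    for kind, pfn in events:
        if kind == "fault":
            dirty.add(pfn)
        elif kind == "send":
            sends[pfn] += 1
            if pfn in dirty:
                dirty.discard(pfn)
            else:
                # Sent although the guest did not fault on it since
                # the previous send of this page.
                clean_sends[pfn] += 1
    return sends, clean_sends


trace = [("fault", 5), ("send", 5), ("send", 6), ("fault", 5), ("send", 5)]
sends, clean_sends = analyse_trace(trace)
print(sends[5], sends[6])              # 2 1
print(clean_sends[5], clean_sends[6])  # 0 1
```

Note that the initial full-memory pass of a live migration would show
up entirely as "clean" sends here, so it should be filtered out of the
trace before drawing conclusions. With page groups, a high clean send
count would be the wasted bandwidth I was describing above.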

Also, it seems that the feature you're thinking about is orthogonal to
the paging technique used, i.e. HAP or Shadow, so if you have an EPT
or NPT box available you might want to try HAP first, which does all
the log dirty tracking at the P2M level, since that will make your
life much easier.

Hope this is useful,
It was a type of people I did not know, I found them very strange and
they did not inspire confidence at all. Later I learned that I had been
introduced to electronic engineers.
                                                  E. W. Dijkstra
