Re: [Xen-devel] Live migration with MMIO pages

To: Keir Fraser <Keir.Fraser@xxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: Re: [Xen-devel] Live migration with MMIO pages
From: Kieran Mansley <kmansley@xxxxxxxxxxxxxx>
Date: Wed, 31 Oct 2007 16:34:39 +0000
In-reply-to: <C34E3CFB.17B56%Keir.Fraser@xxxxxxxxxxxx>
References: <C34E3CFB.17B56%Keir.Fraser@xxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
On Wed, 2007-10-31 at 14:08 +0000, Keir Fraser wrote:
> On 31/10/07 14:03, "Kieran Mansley" <kmansley@xxxxxxxxxxxxxx> wrote:
> 
> > The symptom of failure is that once live migration has started, trying
> > to write to an IO page results in it getting stuck in (or a perpetual
> > loop into) page_fault().  This only happens very occasionally.  I've
> > interpreted it getting stuck in page_fault() as a result of the shadow
> > paging which (as I understand it) marks the normal page table entries as
> > read only so that writes to pages trap into the hypervisor and it can
> > update its dirty set.
> 
> Yes, but then it should mark the page writable again, so that the access can
> be re-executed without faulting! So this rather points at some problem with
> the live-migration shadow mode w.r.t. mmio pages.

Yes.  The reason it's failing is that sh_page_fault() in
xen/arch/x86/mm/shadow/multi.c thinks it's a bad gfn:

    if ( !p2m_is_valid(p2mt) ||
         (!(p2m_is_mmio(p2mt) || mfn_valid(gmfn))) )
    {
        perfc_incr(shadow_fault_bail_bad_gfn);
        SHADOW_PRINTK("BAD gfn=%"SH_PRI_gfn" gmfn=%"PRI_mfn"\n",
                      gfn_x(gfn), mfn_x(gmfn));
        goto not_a_shadow_fault;
    }
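
To make the condition easier to reason about, here is a minimal standalone sketch of the same predicate. It is not Xen code: the enum, the model_* helpers and the mfn_is_valid_ram flag are hypothetical stand-ins for p2m_type_t, p2m_is_valid(), p2m_is_mmio() and mfn_valid(). It also assumes that mfn_valid() returns false for a frame that lies in an IO region rather than in RAM.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for Xen's p2m types; not the real definitions. */
typedef enum {
    MODEL_P2M_INVALID,
    MODEL_P2M_RAM_RW,
    MODEL_P2M_MMIO_DIRECT
} model_p2m_type_t;

static bool model_p2m_is_valid(model_p2m_type_t t)
{
    return t != MODEL_P2M_INVALID;
}

static bool model_p2m_is_mmio(model_p2m_type_t t)
{
    return t == MODEL_P2M_MMIO_DIRECT;
}

/*
 * Same shape as the quoted check: bail out if the p2m type is invalid,
 * or if the entry is neither MMIO nor backed by a valid RAM frame.
 * mfn_is_valid_ram models the result of mfn_valid(gmfn) (assumption).
 */
static bool model_bails_as_bad_gfn(model_p2m_type_t t, bool mfn_is_valid_ram)
{
    return !model_p2m_is_valid(t) ||
           !(model_p2m_is_mmio(t) || mfn_is_valid_ram);
}
```

Under those assumptions, an IO page passes the check only when its p2m entry carries the MMIO type. If the type was never set, the fault is treated as a bad gfn regardless of what the mfn is.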

I think the problem is that set_mmio_p2m_entry() isn't getting called
when the IO mapping is established.  There are three places where
iomem_permit_access() is called:
 - XEN_DOMCTL_memory_mapping (in xen/arch/x86/domctl.c)
 - XEN_DOMCTL_iomem_permission (in xen/common/domctl.c)
 - __gnttab_map_grant_ref() (in xen/common/grant_table.c)

The last one was written by me based on the second one, and neither of
them calls set_mmio_p2m_entry().  I find this a bit suspicious, because
the first one does, and it looks necessary to me in all three cases.
There are, however, very few users of either of those domctl operations,
so it's hard to tell what the difference between them is supposed to be,
and hence why XEN_DOMCTL_iomem_permission doesn't call
set_mmio_p2m_entry().

Adding calls to set_mmio_p2m_entry() in either of the cases that don't
have it might be a bit tricky too as I'm not sure a gfn exists for that
mfn at the point that they are called.
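
As a rough illustration of the suspected gap (a toy model, not Xen code), the sketch below separates "permission granted" from "p2m entry typed as MMIO". Every name in it (struct model_domain, the model_* functions, MODEL_NR_FRAMES) is hypothetical and only mirrors the roles of iomem_permit_access() and set_mmio_p2m_entry() as described above.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_FRAMES 8   /* hypothetical tiny address space */

typedef enum { MODEL_P2M_INVALID = 0, MODEL_P2M_MMIO } model_p2mt_t;

struct model_domain {
    bool         iomem_permitted[MODEL_NR_FRAMES]; /* indexed by mfn */
    model_p2mt_t p2m_type[MODEL_NR_FRAMES];        /* indexed by gfn */
};

/*
 * Models a path that only grants access to the frame, as the two call
 * sites without set_mmio_p2m_entry() are described.  Note that no gfn
 * is available here.  Caller must pass mfn < MODEL_NR_FRAMES.
 */
static void model_permit_only(struct model_domain *d, size_t mfn)
{
    d->iomem_permitted[mfn] = true;
}

/*
 * Models a path that grants access and also records the MMIO type for
 * the guest frame.  Caller must pass gfn, mfn < MODEL_NR_FRAMES.
 */
static void model_permit_and_map(struct model_domain *d,
                                 size_t gfn, size_t mfn)
{
    d->iomem_permitted[mfn] = true;
    d->p2m_type[gfn] = MODEL_P2M_MMIO;
}

/* What a fault handler consulting only the p2m type would conclude. */
static bool model_fault_sees_mmio(const struct model_domain *d, size_t gfn)
{
    return d->p2m_type[gfn] == MODEL_P2M_MMIO;
}
```

In this model a frame set up through model_permit_only() is accessible but still looks untyped to the fault path, which matches the symptom. The missing gfn argument in that function is the same difficulty noted above for adding the call in the real code.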

Kieran



