xen-devel

To: xen-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-devel] spurious warnings from get_page() via gnttab_copy() during frontend shutdown
From: David Edmondson <dme@xxxxxxx>
Date: Tue, 27 Nov 2007 09:26:53 +0000

In testing our implementation of the hypervisor-copy-based backend->frontend networking changes, we see what I believe are spurious warning messages during the shutdown of the frontend domain:

(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h: 189:d0 Error pfn 30e290: rd=ffff830000fcf100, od=ffff830000fcf100, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)
(XEN) Guest stack trace from rbp=ffff830000ff3cf8:
(XEN) ???????????????? <G><2>grant_table.c:990:d0 do_grant_table_op: domain 0, cmd 5, count 1
(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h: 189:d0 Error pfn 30fc2a: rd=ffff830000fcf100, od=0000000000000000, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)

What we think is happening is that the frontend dies and most of its pages are freed (those that are not referenced by another domain). The backend doesn't yet know that the frontend has died, so it is still trying to pass packets to it. It has the rx ring mapped (so the ring itself can't be freed) and reads previously advertised grant references from it. Those grants now refer to pages that are no longer valid, because only the frontend held references to them and they have been freed, so get_page() complains.

__gnttab_copy() itself seems prepared for this situation, as failures to grab the target page due to a dying domain are correctly handled:

    if ( !get_page_and_type(mfn_to_page(d_frame), dd, PGT_writable_page) )
    {
        if ( !test_bit(_DOMF_dying, &dd->domain_flags) )
            gdprintk(XENLOG_WARNING, "Could not get dst frame %lx \n", d_frame);
        rc = GNTST_general_error;
        goto error_out;
    }

In our testing we believe that we're following this path (_DOMF_dying is set and rc == GNTST_general_error) and that we handle the failure correctly.
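
To make the asymmetry concrete, here is a minimal stand-alone model of the two layers. It is not Xen code: every name beginning with model_ or MODEL_ is hypothetical, the structures are drastically simplified, and the two counters exist only so the behaviour can be checked. It assumes that the low-level helper logs on every failure, while the caller suppresses its own warning when the destination domain is dying.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins; not Xen's real definitions. */
struct model_domain {
    bool dying;                  /* models the _DOMF_dying flag */
};

struct model_page {
    struct model_domain *owner;  /* NULL once the page has been freed */
    unsigned int count;          /* reference count, 0 once freed */
};

enum { MODEL_OKAY = 0, MODEL_GENERAL_ERROR = -1 };

static int low_level_messages;   /* messages printed by model_get_page() */
static int caller_warnings;      /* warnings printed by model_copy_to() */

/*
 * Models get_page(): take a reference if the page belongs to the
 * expected domain and is still in use. On failure it logs
 * unconditionally, with no knowledge of whether the domain is dying.
 */
static bool model_get_page(struct model_page *page, struct model_domain *d)
{
    assert(page != NULL && d != NULL);

    if (page->owner != d || page->count == 0) {
        low_level_messages++;
        fprintf(stderr, "model: get_page failed: expected=%p owner=%p count=%u\n",
                (void *)d, (void *)page->owner, page->count);
        return false;
    }
    page->count++;
    return true;
}

/*
 * Models the destination side of __gnttab_copy(): the caller's own
 * warning is suppressed for a dying domain, but the message from the
 * helper above has already been printed by then.
 */
static int model_copy_to(struct model_page *dst, struct model_domain *dd)
{
    if (!model_get_page(dst, dd)) {
        if (!dd->dying) {
            caller_warnings++;
            fprintf(stderr, "model: could not get dst frame\n");
        }
        return MODEL_GENERAL_ERROR;
    }
    /* The actual copy would happen here. */
    dst->count--;                /* drop the reference taken above */
    return MODEL_OKAY;
}
```

Calling model_copy_to() on a freed page of a dying domain returns MODEL_GENERAL_ERROR with no caller warning, yet one low-level message is still printed, which is the pattern in the traces above.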

The corresponding failure mode in the page-flip code path doesn't produce any of these INFO-level messages. Should they be emitted in the copy case?

dme.
--
David Edmondson, Solaris Engineering, http://dme.org
