In testing our implementation of the hypervisor copy based
backend->frontend networking changes, we see what I believe are
spurious warning messages during the shutdown of the frontend domain:
(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30e290: rd=ffff830000fcf100, od=ffff830000fcf100, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)
(XEN) Guest stack trace from rbp=ffff830000ff3cf8:
(XEN)  ???????????????? <G><2>grant_table.c:990:d0 do_grant_table_op: domain 0, cmd 5, count 1
(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30fc2a: rd=ffff830000fcf100, od=0000000000000000, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN)    [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN)    [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN)    [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN)    [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN)    [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)
What we think is happening is that the frontend dies and most of its
pages are freed (those that are not referenced by another domain).
The backend doesn't yet know that the frontend has died, so it is
still trying to pass packets along to it. It has the rx ring mapped
(meaning that the ring can't be freed) and reads previously advertised
grant references from it. Those grants now refer to pages that are no
longer valid, because only the frontend held references to them and
they have been freed, so get_page() complains.
 __gnttab_copy() itself seems prepared for this situation, as failures  
to grab the target page due to a dying domain are correctly handled:
    if ( !get_page_and_type(mfn_to_page(d_frame), dd, PGT_writable_page) )
    {
        if ( !test_bit(_DOMF_dying, &dd->domain_flags) )
            gdprintk(XENLOG_WARNING, "Could not get dst frame %lx\n", d_frame);
        rc = GNTST_general_error;
        goto error_out;
    }
In our testing we believe that we're following this path (_DOMF_dying  
is set and rc == GNTST_general_error) and that we handle the failure  
correctly.
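
To make the path we think we are following concrete, here is a small
standalone model of it. This is only a sketch, not Xen code: every
toy_* name is made up for illustration, "dying" stands in for
_DOMF_dying, and a NULL owner stands in for a page that has already
been freed. It shows the caller-level behaviour described above: the
copy fails with a general error either way, but the warning is only
emitted when the target domain is not dying.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Toy stand-ins for illustration only; these are not the Xen definitions. */
struct toy_domain {
    int  id;
    bool dying;                 /* models the _DOMF_dying flag */
};

struct toy_page {
    struct toy_domain *owner;   /* NULL once the page has been freed */
    unsigned int       refcount;
};

enum { TOY_OKAY = 0, TOY_GENERAL_ERROR = -1 };

static int toy_warnings;        /* number of warnings emitted so far */

/* Take a reference only if the page is still owned by the expected domain. */
static bool toy_get_page(struct toy_page *pg, struct toy_domain *d)
{
    if (pg->owner != d || pg->refcount == 0)
        return false;
    pg->refcount++;
    return true;
}

/* Model of the destination check: always fail, but stay quiet if dying. */
static int toy_copy_to(struct toy_page *dst, struct toy_domain *dd)
{
    if (!toy_get_page(dst, dd)) {
        if (!dd->dying) {
            toy_warnings++;
            fprintf(stderr, "Could not get dst frame\n");
        }
        return TOY_GENERAL_ERROR;
    }
    /* ... the copy itself would happen here ... */
    dst->refcount--;            /* drop the reference taken above */
    return TOY_OKAY;
}
```

Calling toy_copy_to() on a freed page of a dying domain returns
TOY_GENERAL_ERROR and leaves toy_warnings untouched. The messages in
the log above come from one level further down (get_page() itself),
which has no equivalent check in this model.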
 The corresponding failure mode in the page flip code path doesn't  
result in any INFO warnings. Should they exist in this case?
dme.
--
David Edmondson, Solaris Engineering, http://dme.org