In testing our implementation of the hypervisor copy based
backend->frontend networking changes, we see what I believe are spurious
warning messages during the shutdown of the frontend domain:
(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30e290: rd=ffff830000fcf100, od=ffff830000fcf100, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN) [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN) [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN) [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN) [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN) [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)
(XEN) Guest stack trace from rbp=ffff830000ff3cf8:
(XEN) ???????????????? <G><2>grant_table.c:990:d0 do_grant_table_op:
domain 0, cmd 5, count 1
(XEN) /export/build/schuster/xvm-gate/xen.hg/xen/include/asm/mm.h:189:d0 Error pfn 30fc2a: rd=ffff830000fcf100, od=0000000000000000, caf=00000000, taf=0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff83000010f240>] get_page+0x107/0x1b4
(XEN) [<ffff83000010f10a>] get_page_and_type+0x21/0x50
(XEN) [<ffff8300001116c4>] __gnttab_copy+0x3f5/0x5b4
(XEN) [<ffff830000111971>] gnttab_copy+0xee/0x1c4
(XEN) [<ffff830000111dbd>] do_grant_table_op+0x376/0x3bc
(XEN) [<ffff8300001b83e2>] syscall_enter+0xa2/0xfc
(XEN)
What we think is happening is that the frontend dies and most of its
pages are freed (all those not referenced by another domain).
The backend doesn't yet know that the frontend has died, so it is still
trying to pass packets along to it. It has the rx ring mapped
(meaning that the ring can't be freed) and reads previously advertised
grant references from it. Those grants now refer to pages that have
been freed, because only the frontend held references to them, so
get_page() complains.
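
The sequence described above can be modelled in a few lines of C. This is a toy sketch, not Xen code: toy_domain, toy_page, toy_get_page(), toy_put_page() and toy_domain_kill() are all hypothetical names, and the ownership and refcount check is only a simplified stand-in for what get_page() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for a domain and a page descriptor. */
struct toy_domain {
    int  id;
    bool dying;
};

struct toy_page {
    struct toy_domain *owner;   /* NULL once the page has been freed */
    unsigned int       refs;    /* 0 once the page has been freed */
};

/* Take a reference only if the page is live and owned by 'd'.
 * On failure, print a message loosely shaped like the one in the trace. */
static bool toy_get_page(struct toy_page *pg, struct toy_domain *d)
{
    if (pg->refs == 0 || pg->owner != d) {
        fprintf(stderr, "Error: rd=%p, od=%p, refs=%u\n",
                (void *)d, (void *)pg->owner, pg->refs);
        return false;
    }
    pg->refs++;
    return true;
}

/* Drop a reference; the page loses its owner when the last one goes. */
static void toy_put_page(struct toy_page *pg)
{
    assert(pg->refs > 0);
    if (--pg->refs == 0)
        pg->owner = NULL;
}

/* Mark the domain as dying and drop its own reference on each page.
 * Pages that another domain still holds a reference to survive. */
static void toy_domain_kill(struct toy_domain *d,
                            struct toy_page *pages, size_t n)
{
    d->dying = true;
    for (size_t i = 0; i < n; i++)
        if (pages[i].owner == d && pages[i].refs > 0)
            toy_put_page(&pages[i]);
}
```

In this model a page with a single reference (held by the frontend) is gone after toy_domain_kill(), so a later toy_get_page() on a stale grant fails and logs. A page that the backend also references, like the rx ring, keeps a non-zero count.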
__gnttab_copy() itself seems prepared for this situation, as failures
to grab the target page due to a dying domain are correctly handled:
    if ( !get_page_and_type(mfn_to_page(d_frame), dd, PGT_writable_page) )
    {
        if ( !test_bit(_DOMF_dying, &dd->domain_flags) )
            gdprintk(XENLOG_WARNING, "Could not get dst frame %lx\n", d_frame);
        rc = GNTST_general_error;
        goto error_out;
    }
In our testing we believe that we're following this path (_DOMF_dying
is set and rc == GNTST_general_error) and that we handle the failure
correctly.
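
As a rough illustration of the path we believe we are taking, here is a small stand-alone C model of the destination-frame check. Everything in it (toy_domain, toy_copy_to(), toy_warnings) is hypothetical, and the TOY_GNTST_* values are placeholders for this sketch rather than values taken from the Xen headers. It shows only that a dying destination suppresses the copy-level warning while the error is still returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_GNTST_okay            0
#define TOY_GNTST_general_error (-1)   /* placeholder value for this sketch */

/* Hypothetical stand-in for the destination domain. */
struct toy_domain {
    bool dying;
};

static int toy_warnings;   /* number of copy-level warnings emitted */

/* 'frame_ok' stands in for the result of get_page_and_type(). */
static int toy_copy_to(const struct toy_domain *dd, bool frame_ok,
                       unsigned long d_frame)
{
    assert(dd != NULL);
    if (!frame_ok) {
        /* Only warn when the failure is unexpected, i.e. the
         * destination domain is not already being torn down. */
        if (!dd->dying) {
            toy_warnings++;
            fprintf(stderr, "Could not get dst frame %lx\n", d_frame);
        }
        return TOY_GNTST_general_error;
    }
    return TOY_GNTST_okay;
}
```

Note that the messages in the traces above are printed one level lower, from within get_page(), which in this model corresponds to whatever computes frame_ok. The dying check here does nothing to silence them.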
The corresponding failure mode in the page flip code path doesn't
result in any INFO warnings. Should they exist in this case?
dme.
--
David Edmondson, Solaris Engineering, http://dme.org
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel