WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
 
   
 

xen-devel


To: Daniel De Graaf <dgdegra@xxxxxxxxxxxxx>, keir@xxxxxxx
Subject: [Xen-devel] c/s 22402 ("x86 hvm: Refuse to perform __hvm_copy() work in atomic context.") breaks HVM, race possible in other code - any ideas?
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Tue, 11 Jan 2011 13:00:32 -0500
Cc: jeremy@xxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, Ian.Campbell@xxxxxxxxxx
Delivery-date: Tue, 11 Jan 2011 10:07:03 -0800
In-reply-to: <4D2C6EA3.8060900@xxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <1292545063-32107-1-git-send-email-dgdegra@xxxxxxxxxxxxx> <1292545063-32107-7-git-send-email-dgdegra@xxxxxxxxxxxxx> <20110110224154.GH15016@xxxxxxxxxxxx> <4D2C57DC.3090803@xxxxxxxxxxxxx> <4D2C6EA3.8060900@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Mutt/1.5.20 (2009-06-14)
On Tue, Jan 11, 2011 at 09:52:19AM -0500, Daniel De Graaf wrote:
> On 01/11/2011 08:15 AM, Daniel De Graaf wrote:
> > On 01/10/2011 05:41 PM, Konrad Rzeszutek Wilk wrote:
> >>> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
> >>>           goto out;
> >>>  
> >>>   for (i = 0; i < pages; i++) {
> >>> +         uint32_t check, *tmp;
> >>>           WARN_ON(unmap_ops[i].status);
> >>> -         __free_page(map->pages[offset+i]);
> >>> +         if (!map->pages[i])
> >>> +                 continue;
> >>> +         /* XXX When unmapping, Xen will sometimes end up mapping the GFN
> >>> +          * to an invalid MFN. In this case, writes will be discarded and
> >>> +          * reads will return all 0xFF bytes. Leak these unusable GFNs
> >>> +          * until a way to restore them is found.
> >>> +          */
> >>> +         tmp = kmap(map->pages[i]);
> >>> +         tmp[0] = 0xdeaddead;
> >>> +         mb();
> >>> +         check = tmp[0];
> >>> +         kunmap(map->pages[i]);
> >>> +         if (check == 0xdeaddead)
> >>> +                 __free_page(map->pages[i]);
> >>> +         else if (debug)
> >>> +                 printk("%s: Discard page %d=%ld\n", __func__,
> >>> +                         i, page_to_pfn(map->pages[i]));
> >>
> >> Whoa. Any leads to when the "sometimes" happens? Does the status report an
> >> error or is it silent?
> > 
> > Status is silent in this case. I can produce it quite reliably on my
> > test system where I am mapping a framebuffer (1280 pages) between two
> > HVM guests - in this case, about 2/3 of the released pages will end up
> > being invalid. It doesn't seem to be size-related as I have also seen
> > it on the small 3-page page index mapping. There is a message on xm
> > dmesg that may be related:
> > 
> > (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 7cbc6: c=8000000000000004 t=7400000000000002
> > 
> > This appears about once per page, with different MFNs but the same c/t.
> > One of the two HVM guests (the one doing the mapping) has the PCI
> > graphics card forwarded to it.
> > 
> 
> Just tested on the latest xen 4.1 (with 22402:7d2fdc083c9c reverted as
> that breaks HVM grants), which produces different output:

Keir, c/s 22402 has your name on it.

Any ideas on the problem that Daniel is hitting with unmapping grants?
> 
> ...
> (XEN) mm.c:889:d1 Error getting mfn b803e (pfn 25a3e) from L1 entry 00000000b803e021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn b8038 (pfn 25a38) from L1 entry 00000000b8038021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn b803d (pfn 25a3d) from L1 entry 00000000b803d021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 10829 (pfn 25a29) from L1 entry 0000000010829021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 1081c (pfn 25a1c) from L1 entry 000000001081c021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 10816 (pfn 25a16) from L1 entry 0000000010816021 for l1e_owner=1, pg_owner=1
> (XEN) mm.c:889:d1 Error getting mfn 1081a (pfn 25a1a) from L1 entry 000000001081a021 for l1e_owner=1, pg_owner=1
> ...
> 
> This appears on the map; nothing is printed on the unmap. If the
> unmap happens while the domain is up, it seems to be invalid more often;
> most (perhaps all) of the destination-valid unmaps happen when the domain
> is being destroyed. Exactly which pages are valid or invalid seems to be
> mostly random, although nearby GFNs tend to have the same validity.
> 
> If you have any thoughts as to the cause, I can test patches or provide
> output as needed; it would be better if this workaround weren't needed.
> 
> -- 
> Daniel De Graaf
> National Security Agency

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
