WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] [PATCH][Xen 4.0-testing.hg] fix small bugs of memory sha

To: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH][Xen 4.0-testing.hg] fix small bugs of memory sharing
From: Jui-Hao Chiang <juihaochiang@xxxxxxxxx>
Date: Wed, 15 Dec 2010 17:55:15 +0800
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Grzegorz Milos <grzegorz.milos@xxxxxxxxx>
Delivery-date: Wed, 15 Dec 2010 01:56:09 -0800
Dkim-signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=qA/FbYijmDpV10OUgf33VFf/Ac+ZmmFfDZfgKyxzzq0=; b=Kqec8NlAugIb2yvYyDNIxdWIIqLjsxpijfasVSbRxugZYJmVuW+v+HXOFvIavEhBl5 0caMjsNSlj6S+j7uQg9NObOAzZNXVn+CsCjnq4kUSvOer3SokHQ2R3UiBJhwr4Y1PX7K yWld5hm8lWY5KVgXNPXKq+nvnwGm8iOlg7wu8=
Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=OC0INNrMOcdGiUqf41JM3n9FagQ/kdoFMrotFR5aM209BvH8LG3jQukBnyxogXdReJ aaXblY0HYSWpY4XeB4K4hs8EE1i5pY8PSus+sqsiRpLXaYL8CI+YH0zQHtnHosflvWqp elLITRvViUsTxaKMO6bnpU/qTLXdLmMh41eFU=
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <20101210095716.GI9912@xxxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <AANLkTik-qNUCA79WTFzh-ZFQiaXMbAmStVmsyYW0hQNh@xxxxxxxxxxxxxx> <20101208105123.GC9912@xxxxxxxxxxxxxxxxxxxxxxx> <AANLkTi=7UwUWbR2XCeOOXnxT2o6B2FfMm6+YH__aa1Pu@xxxxxxxxxxxxxx> <20101209100148.GG9912@xxxxxxxxxxxxxxxxxxxxxxx> <AANLkTin0nX=_S5w+FJ1gh4FOdpuH0S+ygH_w2Yxuxiuh@xxxxxxxxxxxxxx> <20101210095716.GI9912@xxxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Hi, Tim:

Sorry for late reply, since I am thinking about some design issues.

On Fri, Dec 10, 2010 at 5:57 PM, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
Hi,

Can you possibly configure your mailer to indent quoted text?  Your
emails are a bit tricky to read without it.

I am using the gmail, but don't see any config for this. It should be indented and prefixed by the symbol '|', isn't it?
 

At 08:30 +0000 on 10 Dec (1291969810), Jui-Hao Chiang wrote:
> I just checked the value of PGT_none, and it is actually equal to 0,
> so there seems nothing wrong with the original checking.

What I meant was: the current check looks at the type and the count.
AFAICS it ought to be only checking the count and ignoring the type.
That is:

 if((x & (PGT_type_mask | PGT_count_mask)) != PGT_none)

should be:

 if ( (x & PGT_count_mask) != 0 )


> The reason that I swap step3 and step2 is because step3 doesn't modify any value while step2 does.
> If checking count_info fails, then just abort. Else if count_info is correct, then go checking and modifying type_info.

I think the check needs to be after the type change.  Otherwise you
could race with another refcount


I see. So you are saying that type_count is providing some kind of protection on refcount?
If that's the case, I will just add a small piece of code to recover type_info when the check of count_info fails.
 
> I found one patch of xenpaging as the following, but not sure it's exactly the same as my question.
> Cuts from their article http://thread.gmane.org/gmane.comp.emulators.xen.devel/91768/focus=91770
> "qemu will just keep mapping pages and not release them, which causes problems for the memory pager (since the page is mapped, it won't get paged out)"
>
> AFAIK, the qemu can not always map the entire address space of HVM guest, so Map Cache feature tends to map HVM's memory on-demand/partially. The flush-cache command will trigger qemu_invalidate_map_cache(), which in turn calls munmap() on all mapped virtual addresses. Am I on the right track?

Possibly.  Where are you putting the code that scans for duplicate pages?


First, the patch from the above URL seems to work on the unstable version, which calls qemu_invalidate_map_cache() to drop the count down to 1 for qemu mapped pages. But the patch is not currently in the unstable version.

As for where to put the code for duplicate pages, we wonder to do that in workqueue or some other thread-like things. Since this is a periodic running procedure, and definitely takes a long time. Originally we want to put it in dom0 user-level as a daemon, but there are not enough page information there. So it seems better to put inside hypersior. Do you have any other suggestion?

One more thing is about the design part.
(1) The current memory sharing code uses a unique 64-bit handle number to identify a page/mfn, and use this handle to index into the hash list.
(2) The page nomination makes the page type as p2m_ram_shared (read-only), and return the handle to user, e.g. blktap2.
(3) Later on, the user calls mem_sharing_share_pages(handle1, handle2) by giving two handles. Note that, the code doesn't do any comparison of page content but leave this task to users whereas the only user we know is blktap2 with qcow disk format configured.
But why not let the memory sharing code to do the content comparison? Is there any user who wants to share two pages with different content?

In order to perform the page comparison, we need to compute checksum value for the page content (or even md5, sha1..), and use this checksum value to index into a hash list. It seem feasible to replace the handle number with the checksum value of page, and combine the nominate() and share() operations together in one function as the following steps:
(1) Mark the page as p2m_ram_shared
(2) Compute the checksum of a page
(3) Search in the hash list, if some page with the same checksum value is found, a real byte-by-byte memory compasion is performed.
(4) If content is totally matched, perform the share() operation by removing one duplicated page/mfn.
(5) Record this checksum value in page_info->shr_handle (used to store the handle)

Afterwards, when COW happens, the unshare() operation uses domain_id and gfn to find the page_info, then uses the inner checksum value to search hash list. Everything seems fine, but the tradeoff is to change the blktap2 code for this new interface.

 
> > Assume my previous guess for stub domain is right.  Then if a page
> > from the previous scenario is made sharable and its mapped mfn is
> > freed (when sharing two pages, the later one's mfn will be discarded),
> > will the stub domain refer to the old discarded mfn if no unmapping is
> > performed?
>
> Yes, and that's definitely not good.  AFAICS, qemu has to drop all its
> mappings for the sharing to be safe, and the page needs to be unshared
> again if qemu tries to map it again.  Otherwise I/O from one VM could
> pollute another VM (or in your case, I/O to one page could overwrite
> another page).
>
>
> Thanks for your remind "the page needs to be unshared again if qemu tries to map it again".
> There exists several hypercalls to perform the memory mapping, e.g. mmu_update, update_va_mapping, and I will check on them.
> Of course, only the RW mapping should be taken care, right?

No, any foreign mapping needs to be handled, including grant tables,
because there's no guaranteed way to fix them up when the sharing is
undone later.

Thanks for the remind! I will make sure they are unmapped properly.
 
Bests,
Jui-Hao

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel