To: Keir Fraser <keir.fraser@xxxxxxxxxxxxx>, "Xen-Devel (E-mail)" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [RFC] design/API for plugging tmem into existing xen physical memory management code
From: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
Date: Sat, 14 Feb 2009 07:58:09 -0800 (PST)
In-reply-to: <C5BC263E.21F3%keir.fraser@xxxxxxxxxxxxx>

Thanks much for the reply!

> > 4) Does anybody have a list of alloc requests of
> >      order > 0
> 
> Domain and vcpu structs are order 1. Shadow pages are 
> allocated in order-2 blocks.

Are all of these allocated at domain startup only?  Or
are any (shadow pages perhaps?) allocated at relatively
random times?  If random, what are the consequences
when such an allocation fails?  Isn't it quite possible
for a random order>0 allocation to fail today due
to "natural causes", e.g. because the currently running
domains have, by coincidence (or by ballooning), used
up all available memory?  Have we just been "lucky"
to date, because fragmentation is so rare and ballooning
is so rarely used, that we haven't seen failures
of order>0 allocations?  (Or maybe we have seen them but
didn't know it, because the observable symptom is
a failed domain creation or a failed migration?)

In other words, I'm wondering whether tmem doesn't create
this problem, but merely increases the probability that it
will happen.
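To make the failure mode concrete, here's a minimal sketch
(the caller and its error handling are hypothetical, not code
from the tree): an order-2 request needs four contiguous,
naturally aligned pages, and the allocator returns NULL
whenever no such run exists, even if plenty of free memory
remains as scattered order-0 pages.

    #include <xen/mm.h>
    #include <xen/sched.h>
    #include <xen/errno.h>

    /* Hypothetical caller, for illustration only. */
    static int grab_order2_block(struct domain *d)
    {
        /* Order 2 == 4 contiguous, naturally aligned pages. */
        struct page_info *pg = alloc_domheap_pages(d, 2, 0);

        if ( pg == NULL )
            return -ENOMEM;  /* the "natural causes" failure above */

        /* ... use the 4-page block ... */
        free_domheap_pages(pg, 2);
        return 0;
    }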

Perhaps Jan's suggestion of using xenheap as an "emergency
fund" of free pages is really a good idea?

> > ** tmem has been working for months but the code has
> > until now allocated (and freed) to (and from)
> > xenheap and domheap.  This has been a security hole
> > as the pages were released unscrubbed and so data
> > could easily leak between domains.  Obviously this
> > needed to be fixed :-)  And scrubbing data at every
> > transfer from tmem to domheap/xenheap would be a huge
> > waste of CPU cycles, especially since the most likely
> > next consumer of that same page is tmem again.
> 
> Then why not mark pages as coming from tmem when you free them,
> and scrub them on next use if they aren't going back to tmem?

That's a reasonable idea... maybe with a "scrub_me" flag that
tmem sets in the struct page_info and that the existing
alloc_heap_pages() checks (and ignores if the caller passes
an "ignore_scrub_me" memflags bit to alloc_xxxheap_pages())?
There'd also need to be a free_and_scrub_domheap_pages().
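Concretely, I'm picturing something like this (the bit
position, the memflag value, and the helper names are all made
up here, and the real thing would manipulate count_info under
the heap lock):

    /* All names and bits below are hypothetical. */
    #define _PGC_scrub_me        28
    #define PGC_scrub_me         (1UL << _PGC_scrub_me)
    #define MEMF_ignore_scrub_me (1U << 7)

    /* tmem frees dirty pages without scrubbing, but marks them
     * so the scrub can happen lazily at the next allocation. */
    void free_and_scrub_domheap_pages(struct page_info *pg,
                                      unsigned int order)
    {
        unsigned int i;

        for ( i = 0; i < (1U << order); i++ )
            pg[i].count_info |= PGC_scrub_me;
        free_domheap_pages(pg, order);
    }

    /* Called from alloc_heap_pages() on each page it hands out:
     * scrub only when needed, and only for consumers that care. */
    static void check_scrub_me(struct page_info *pg,
                               unsigned int memflags)
    {
        if ( (pg->count_info & PGC_scrub_me) &&
             !(memflags & MEMF_ignore_scrub_me) )
        {
            scrub_one_page(pg);  /* or whatever scrub helper we settle on */
            pg->count_info &= ~PGC_scrub_me;
        }
    }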

If you prefer that approach, I'll give it a go.  But some
(most?) of the time there will still be no free pages at all,
so alloc_heap_pages will still need a hook into tmem for
that case.
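Roughly, on the allocator's failure path -- where
tmem_relinquish_pages() is the kind of entry point I have in
mind, not something that exists yet, and the signatures are
simplified:

    /* Sketch: retry an order>0 allocation after asking tmem to
     * give back enough of its freeable (ephemeral) pages. */
    struct page_info *alloc_heap_pages_or_tmem(unsigned int order,
                                               unsigned int memflags)
    {
        struct page_info *pg = alloc_heap_pages(order, memflags);

        if ( pg == NULL && tmem_relinquish_pages(order) == 0 )
            pg = alloc_heap_pages(order, memflags);  /* one retry */

        return pg;
    }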

> I wasn't clear on who would call your C and D functions, and why
> they can't be merged. I might veto those depending on how ugly
> and exposed the changes are outside tmem.

I *think* these calls are made only from Python code (domain
creation and ballooning) and, if so, they will simply go
through the existing tmem hypercall.

Thanks,
Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel