Thanks for the thoughtful reply, Ian!
> > I was planning on providing both Model C and Model D (see below),
> > but let me know if you will only accept Model C (or even Model B)
> > and I will adjust accordingly.
>
> I think all these models are wrong :-)
Yes, well, I think allowing guests to unproductively hoard
idle physical memory is also wrong. :-)
> 'free' guest memory is often serving useful purposes such as
> acting as a buffer cache etc, so ballooning it out unnecessarily
> is probably not a good thing.
Depends on the domain and workload. If the working set of
the domain is much smaller than physical memory, then letting
the domain fill its buffer cache on the odd chance that it
might use one or more of those pages again -- especially when
there are other 'memory-starved' domains -- is probably not
a good thing either.
> Model D might work better if we had a way of giving up
> memory in a way that wasn't 'final' i.e. we could surrender pages back
> to xen, but would get a ticket with which we could ask Xen if it still
> had the page, and if xen hadn't zeroed them and handed them to someone
> else we could get the original page back. Hence, we could treat pages
> handed back to xen as a kind of 'unreliable swap device'.
Cool! Yes, this would be a nice addition and would make a great
research project. I think you are positing that a large percentage
of the pages would be handed back, and thus that taking them away
'permanently' is not a good idea. I wouldn't dispute that there are
many domains and workloads where this is true. But I *would* argue
that there
are also many domains and workloads where the percentage would
be very small, and that taking them away permanently wouldn't
be noticeable.
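Just to make sure we mean the same mechanism, here is a toy
userspace model of the ticket scheme as I understand it. Everything
below (the names, the fixed-size pool, the single-use tickets) is
invented purely for illustration; it is not an existing Xen
interface, just the semantics sketched out in C:

/* Toy model of the "unreliable swap device" idea: a guest surrenders
 * a page and receives a ticket; later it may ask for the page back,
 * but the hypervisor is free to have recycled the frame in the
 * meantime, in which case the reclaim simply fails.  All names here
 * are invented for this sketch; this is not an existing Xen
 * interface.
 */
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096
#define POOL_SIZE 64

struct surrendered_page {
    unsigned long ticket;   /* handle returned to the guest        */
    int still_intact;       /* 0 once "Xen" has recycled the frame */
    char data[PAGE_SIZE];   /* contents kept only while intact     */
};

static struct surrendered_page pool[POOL_SIZE];
static unsigned long next_ticket = 1;

/* Guest -> Xen: give up a page, get a ticket back (0 on failure). */
static unsigned long surrender_page(const char *contents)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].ticket == 0) {
            pool[i].ticket = next_ticket++;
            pool[i].still_intact = 1;
            memcpy(pool[i].data, contents, PAGE_SIZE);
            return pool[i].ticket;
        }
    }
    return 0;
}

/* Xen, under memory pressure, zeroes and reuses a surrendered frame. */
static void recycle_some_page(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].ticket != 0 && pool[i].still_intact) {
            pool[i].still_intact = 0;
            memset(pool[i].data, 0, PAGE_SIZE);
            return;
        }
    }
}

/* Guest -> Xen: ask for the page back; returns 1 and copies out the
 * original contents if the frame survived, 0 if it is gone for good. */
static int reclaim_page(unsigned long ticket, char *out)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (pool[i].ticket == ticket) {
            int ok = pool[i].still_intact;
            if (ok)
                memcpy(out, pool[i].data, PAGE_SIZE);
            pool[i].ticket = 0;   /* tickets are single-use either way */
            return ok;
        }
    }
    return 0;
}

int main(void)
{
    char page[PAGE_SIZE], back[PAGE_SIZE];
    memset(page, 'A', PAGE_SIZE);

    unsigned long t = surrender_page(page);
    recycle_some_page();    /* simulate pressure from another domain */
    printf("reclaim: %s\n", reclaim_page(t, back) ? "hit" : "miss");
    return 0;
}

The guest treats a "miss" exactly like a read error on an unreliable
swap device: the data is gone, and it has to regenerate or refetch it.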
So certainly Model D shouldn't be mandated for all domains; but
providing it as an option seems reasonable to me. Also, with
adequate hysteresis built in, we give the domain plenty of time
to change its mind before pressuring it to give away its most
precious buffer cache pages, while enforcing that it give away
its least-likely-to-be-reused pages first. So in a sense, a high
downhysteresis value provides much the same 'unreliable swap
device' -- except that each domain is far better able to implement
a reasonable 'victim' algorithm than domain0 is.
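To put a number on what I mean by downhysteresis, here is a rough
sketch of the policy (the names are mine, not necessarily what the
driver will end up using): when shrinking, the balloon target only
closes a fraction of the gap between what the domain has and what it
thinks it needs each interval, so a large downhysteresis value
surrenders 'free' memory only gradually.

/* Sketch of a hysteresis-damped balloon target calculation
 * (illustrative only; the names are not from any existing driver).
 *
 * cur_pages  = what the guest currently has
 * goal_pages = what the guest thinks it currently needs
 *
 * The target closes only 1/downhysteresis of the gap per interval
 * when shrinking, and 1/uphysteresis of the gap when growing, so a
 * large downhysteresis value gives cached pages time to prove they
 * are still useful before they are ballooned out.
 */
#include <stdio.h>

static unsigned long compute_target(unsigned long cur_pages,
                                    unsigned long goal_pages,
                                    unsigned int downhysteresis,
                                    unsigned int uphysteresis)
{
    if (cur_pages > goal_pages)
        return cur_pages - (cur_pages - goal_pages) / downhysteresis;
    else
        return cur_pages + (goal_pages - cur_pages) / uphysteresis;
}

int main(void)
{
    unsigned long cur  = 131072;  /* 512MB worth of 4K pages     */
    unsigned long goal = 65536;   /* working-set estimate: 256MB */

    /* With downhysteresis = 8, memory is surrendered gradually. */
    for (int i = 0; i < 5; i++) {
        cur = compute_target(cur, goal, 8, 1);
        printf("interval %d: target = %lu pages\n", i, cur);
    }
    return 0;
}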
> Even if we had such extensions, I'm not sure that having every domain
> eagerly surrender memory to xen is necessarily the best approach. It
> may be better to have domains just indicate to domain0 whether they
> are in a position to release memory, or whether they could actively
> benefit from more, and then have domain0 act as arbiter.
The proposed implementation defaults to exactly that: Each domain
now provides "I'm in a position to release memory" or "I released
too much and I need some back" to domain0. Though one can argue
about the quality/accuracy of the data provided in the proposed
implementation, some believe it's a reasonable first approximation
(and, indeed, that it OVER-estimates the true working set).
The selfballooning part simply serves as a quick-and-dirty
first-come-first-served policy that could just as easily be
implemented in domain0 (with added latency), but it also serves as
a nice standalone overcommit demo that may be "good enough" for
some real-world environments where supporting more simultaneous
virtual machines matters more than a small loss in
responsiveness.
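For the curious, the guest side of that signal boils down to
something like the userspace sketch below. I'm assuming the
working-set estimate comes from something like Linux's Committed_AS
in /proc/meminfo; send_to_domain0() is just a stub standing in for
whatever channel (xenstore, say) a real driver would use.

/* Sketch of the guest-side "signal domain0" logic: estimate how much
 * memory the guest actually needs (crudely, via Committed_AS from
 * /proc/meminfo; the exact metric in the proposed implementation may
 * differ) and report whether we could release memory or need some
 * back.  send_to_domain0() is a stub for illustration only.
 */
#include <stdio.h>
#include <string.h>

static long meminfo_kb(const char *key)
{
    char line[256];
    long val = -1;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, key, strlen(key)) == 0) {
            sscanf(line + strlen(key), " %ld", &val);
            break;
        }
    }
    fclose(f);
    return val;
}

static void send_to_domain0(const char *msg, long kb)
{
    /* Stub: a real driver would report this via xenstore. */
    printf("-> domain0: %s (%ld kB)\n", msg, kb);
}

int main(void)
{
    long total = meminfo_kb("MemTotal:");
    long need  = meminfo_kb("Committed_AS:");

    if (total < 0 || need < 0)
        return 1;

    if (need < total)
        send_to_domain0("in a position to release memory", total - need);
    else
        send_to_domain0("released too much, need some back", need - total);
    return 0;
}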
Honestly, I think this is a rather elegant way to get around the
"semantic gap" and double-paging. Indeed, I'm thinking that OS's
that are becoming increasingly virtualization-conscious should
all understand that memory is a valuable shareable resource,
and should provide "idle memory" metrics and APIs to allow
a virtualization system to manage it appropriately. Another good
research topic?
Thanks!
Dan