Hi, all
Page offlining can serve many purposes, such as memory offlining, memory
power management, or proactive action when multiple correctable errors (CE)
occur on one page. In a virtualized environment without guest offline support,
we think offlining a page usually means replacing the old page with a new one,
transparently to the guest.
We are currently trying to add page offline support to Xen. We would like to
share our idea with the mailing list before we begin implementing it, and
hope to get feedback from the community.
Our idea:
Page offlining will be done in two steps. First, a page is marked
offline-pending (if the page is already free, it is marked offline
directly). Second, when that page is freed, it is marked offline
automatically and will not be allocated again. Note that we will not support
all types of pages, because of their different usage models. Basically, Xen
heap pages and pages owned by dom0 cannot be offlined. Also, it is complex to
offline a page used by a guest with an assigned device.
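The two steps above can be sketched as a small state machine. This is only an illustration of the idea: the state names, the owner enum and the helper functions are hypothetical and do not correspond to existing Xen code.

```c
#include <stdbool.h>

/* Hypothetical page states; the names are illustrative, not Xen's. */
enum page_state {
    PAGE_IN_USE,          /* allocated to some owner */
    PAGE_FREE,            /* free, may be handed out by the allocator */
    PAGE_OFFLINE_PENDING, /* still in use, goes offline when freed */
    PAGE_OFFLINED         /* never handed out again */
};

/* Hypothetical owner classes, enough to express the restrictions above. */
enum page_owner { OWNER_NONE, OWNER_XEN_HEAP, OWNER_DOM0, OWNER_GUEST };

struct page {
    enum page_state state;
    enum page_owner owner;
};

/* Step 1: request offlining. Returns false if the page cannot be offlined. */
bool page_request_offline(struct page *pg)
{
    switch (pg->state) {
    case PAGE_FREE:
        pg->state = PAGE_OFFLINED;      /* already free: offline directly */
        return true;
    case PAGE_OFFLINE_PENDING:
    case PAGE_OFFLINED:
        return true;                    /* nothing more to do */
    case PAGE_IN_USE:
        if (pg->owner == OWNER_XEN_HEAP || pg->owner == OWNER_DOM0)
            return false;               /* unsupported page type */
        pg->state = PAGE_OFFLINE_PENDING;
        return true;
    }
    return false;
}

/* Step 2: the free path turns a pending page into an offlined one. */
void page_free(struct page *pg)
{
    pg->owner = OWNER_NONE;
    if (pg->state == PAGE_OFFLINE_PENDING)
        pg->state = PAGE_OFFLINED;
    else if (pg->state == PAGE_IN_USE)
        pg->state = PAGE_FREE;
}

/* The allocator may only hand out pages that are free and not offlined. */
bool page_can_allocate(const struct page *pg)
{
    return pg->state == PAGE_FREE;
}
```

In this sketch a guest page goes IN_USE -> OFFLINE_PENDING on the request and OFFLINE_PENDING -> OFFLINED on free, while a dom0 or Xen heap page is rejected without changing state.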
We are considering using the live migration mechanism to achieve the
two steps. First, the user space tools mark the pages offline_pending through
a hypercall; this hypercall also returns the owners of the pages. Second, if
all pages can be offlined, the user space tools live migrate the domains
owning those pages. Third, the user space tools check that all pages have
been offlined.
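The user space flow could look roughly like the sketch below. The callback table, the result codes and the small fake backend are all hypothetical; they only stand in for the real hypercall and the toolstack's live migration so that the three steps can be shown in order.

```c
#include <stddef.h>

#define MAX_OWNERS 16

/* Hypothetical callbacks standing in for the hypercall and the toolstack. */
struct offline_ops {
    /* Step 1: mark the range offline_pending and report the owning domains.
     * Writes at most max_owners entries. Returns 0 on success. */
    int (*mark_pending)(void *ctx, unsigned long start_pfn,
                        unsigned long end_pfn, int *owners,
                        size_t max_owners, size_t *nr_owners);
    /* Step 2: live migrate one domain so its pages get freed. 0 on success. */
    int (*migrate)(void *ctx, int domid);
    /* Step 3: non-zero if every page in the range is now offlined. */
    int (*all_offlined)(void *ctx, unsigned long start_pfn,
                        unsigned long end_pfn);
};

enum offline_result {
    OFFLINE_OK = 0,
    OFFLINE_MARK_FAILED,     /* some page cannot be offlined */
    OFFLINE_MIGRATE_FAILED,  /* e.g. a domain with an assigned device */
    OFFLINE_INCOMPLETE       /* pages still pending after migration */
};

enum offline_result offline_range(const struct offline_ops *ops, void *ctx,
                                  unsigned long start_pfn,
                                  unsigned long end_pfn)
{
    int owners[MAX_OWNERS];
    size_t nr_owners = 0, i;

    if (ops->mark_pending(ctx, start_pfn, end_pfn,
                          owners, MAX_OWNERS, &nr_owners) != 0)
        return OFFLINE_MARK_FAILED;

    for (i = 0; i < nr_owners; i++)
        if (ops->migrate(ctx, owners[i]) != 0)
            return OFFLINE_MIGRATE_FAILED;

    return ops->all_offlined(ctx, start_pfn, end_pfn)
               ? OFFLINE_OK : OFFLINE_INCOMPLETE;
}

/* A trivial fake backend, used only to exercise the flow above. */
struct fake_backend {
    int unmovable;        /* non-zero: the range holds a Xen/dom0 page */
    int pending_domains;  /* domains still holding pages in the range */
};

static int fake_mark(void *ctx, unsigned long s, unsigned long e,
                     int *owners, size_t max_owners, size_t *nr_owners)
{
    struct fake_backend *b = ctx;
    (void)s; (void)e;
    if (b->unmovable || max_owners < 2)
        return -1;
    owners[0] = 1;        /* pretend domains 1 and 2 own the pages */
    owners[1] = 2;
    *nr_owners = 2;
    b->pending_domains = 2;
    return 0;
}

static int fake_migrate(void *ctx, int domid)
{
    struct fake_backend *b = ctx;
    (void)domid;
    b->pending_domains--; /* migration frees that domain's old pages */
    return 0;
}

static int fake_all_offlined(void *ctx, unsigned long s, unsigned long e)
{
    struct fake_backend *b = ctx;
    (void)s; (void)e;
    return b->pending_domains == 0;
}

const struct offline_ops fake_ops = {
    fake_mark, fake_migrate, fake_all_offlined
};
```

Calling offline_range(&fake_ops, &backend, start, end) with a zero-initialized backend walks all three steps, while setting unmovable makes it stop at the first step.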
The following hypercall will be added:
int xen_page_offline_pending(int start_pfn, int end_pfn, void *result,
void *owners)
IN: start_pfn/end_pfn:
    The range of pages to be offlined.
OUT: result:
    A buffer containing the status of each page, which can be:
    offlined: the page is already offlined (e.g. the page was
        already free when the hypercall happened)
    offline_pending: the page will be offlined when it is freed
    offline_fail: the page cannot be offlined, perhaps because it
        is used by Xen/dom0. Note that if any page is marked
        offline_fail, this hypercall will not change the status of any
        page (i.e. no page will be marked offline_pending or offlined),
        so that the operation is atomic.
    other status: other statuses to be defined in the future.
OUT: owners:
    A buffer containing the domains that own the pages. For security
    reasons, it will not state which domain owns which page.
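One way to get the all-or-nothing behaviour described for offline_fail is a two-pass approach: classify every page first, and commit only if no page failed. The sketch below models this on a plain array. The struct, the enum values and the function name are hypothetical, the range is assumed to be inclusive, and leaving the owner list empty on failure is also an assumption, since the proposal does not specify either.

```c
#include <stddef.h>

/* PG_NORMAL is internal; the other three are the result values above. */
enum page_status {
    PG_NORMAL, PG_OFFLINED, PG_OFFLINE_PENDING, PG_OFFLINE_FAIL
};

/* Hypothetical page record, not Xen's struct page_info. */
struct mock_page {
    int in_use;               /* 0 = free */
    int pinned;               /* 1 = Xen heap or dom0 page */
    int domid;                /* owner, meaningful only when in_use */
    enum page_status status;
};

/*
 * result and owners must each have room for (end_pfn - start_pfn + 1)
 * entries. Returns 0 if all pages were marked, -1 if any page failed,
 * in which case no page status is changed.
 */
int mock_page_offline_pending(struct mock_page *pages,
                              size_t start_pfn, size_t end_pfn,
                              enum page_status *result,
                              int *owners, size_t *nr_owners)
{
    size_t pfn, i, j;
    int failed = 0;

    *nr_owners = 0;

    /* Pass 1: classify every page without modifying anything. */
    for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
        const struct mock_page *pg = &pages[pfn];
        i = pfn - start_pfn;
        if (!pg->in_use) {
            result[i] = PG_OFFLINED;
        } else if (pg->pinned) {
            result[i] = PG_OFFLINE_FAIL;
            failed = 1;
        } else {
            result[i] = PG_OFFLINE_PENDING;
        }
    }
    if (failed)
        return -1;            /* atomic: nothing was changed */

    /* Pass 2: commit the new status and collect the distinct owners. */
    for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
        struct mock_page *pg = &pages[pfn];
        pg->status = result[pfn - start_pfn];
        if (pg->status != PG_OFFLINE_PENDING)
            continue;
        for (j = 0; j < *nr_owners; j++)
            if (owners[j] == pg->domid)
                break;
        if (j == *nr_owners)  /* not seen yet: record the domain once */
            owners[(*nr_owners)++] = pg->domid;
    }
    return 0;
}
```

Because the owner list is deduplicated and carries no page index, the caller learns which domains to migrate but not which domain owns which page, matching the security note above.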
Note that the live migration mechanism has some issues:
a) the domain ID changes after live migration
b) live migration fails for a domain with an assigned device, so the user
space tools have to hot remove the device or fail the page offline request.
Other options:
Of course, other mechanisms exist to achieve this purpose:
1) Handle the page offline request inside Xen. It is simple to offline a
page of an HVM domain (without any assigned device) by using the p2m table
and re-creating the shadow/EPT table from scratch. But it is not so easy
for a PV domain (maybe we can switch the domain to shadow mode during this
procedure) or for a domain with an assigned device. Also, it is complex to
support the following page types:
a) pages shared between multiple domains, such as pages granted to dom0,
b) pages used for domain control, for example the page used for an HVM
domain's vlapic.
Any feedback is welcome.
Thanks
Yunhong Jiang