xen-devel
Re: [Xen-devel] [PATCH 00 of 10] Teach xm save to checkpoint a running d
>I'm not too sure about the last couple of patches in this
>series. Because the checkpointing domain doesn't disconnect before
>calling suspend, it retains a few references to pages it doesn't
>own. These trigger a PT race detector in xc_linux_save, which causes
>it to abort. So the last couple of patches explicitly identify the
>references I've found so far (shared_info and some grant table shared
>pages) and simply zero those PTEs during save, since they'll be
>recreated on restore. Finding the grant table pages is a bit fragile -
>I walk the page table loaded in CR3 at the time of suspend looking for
>the virtual address I've stowed in the suspend record. I've only got
>code for two-level page tables at the moment, since I'm not convinced
>this is the right approach. Under what circumstances would a non-live
>save have an unsafe PTE race?
Pretty much any PT race in a non-live save/migrate is a bug; the
domain is (in theory) suspended at this point, and all of the
devices are disconnected. Since you've chosen not to 'disconnect'
the devices, you'll get random updates occurring to any shared
pages (shared via grants or directly shared with Xen).
> Maybe it's fine to simply zero these ptes without checking them.
I'd think not.
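
To make "checking them" concrete, here is a minimal sketch of a two-level
(non-PAE x86) walk that only clears a PTE when it really maps the frame
you expect. It is an illustration under stated assumptions, not code from
the patch series: fake_frames, map_frame() and zero_pte_if_expected() are
hypothetical names, the "machine memory" is a small array rather than
real foreign mappings, and 4MB superpages are ignored.

```c
#include <stddef.h>
#include <stdint.h>

#define PTES_PER_TABLE 1024u   /* entries per table with 2-level paging */
#define PAGE_SHIFT     12
#define PTE_PRESENT    0x1u
#define NR_FAKE_FRAMES 4u

/* Hypothetical stand-in for guest memory: each "frame" is one table page. */
static uint32_t fake_frames[NR_FAKE_FRAMES][PTES_PER_TABLE];

/* Hypothetical stand-in for mapping a frame number into our address space. */
static uint32_t *map_frame(uint32_t mfn)
{
    return mfn < NR_FAKE_FRAMES ? fake_frames[mfn] : NULL;
}

/*
 * Walk the two-level table rooted at l2 for virtual address va.
 * Clear the PTE only if it is present and maps expected_mfn.
 * Returns 1 if the PTE was cleared, 0 if it was left untouched.
 */
static int zero_pte_if_expected(uint32_t *l2, uint32_t va, uint32_t expected_mfn)
{
    uint32_t l2e = l2[va >> 22];                 /* top 10 bits: directory index */
    if (!(l2e & PTE_PRESENT))
        return 0;

    uint32_t *l1 = map_frame(l2e >> PAGE_SHIFT); /* frame holding the L1 table */
    if (l1 == NULL)
        return 0;

    uint32_t *pte = &l1[(va >> PAGE_SHIFT) & (PTES_PER_TABLE - 1)];
    if (!(*pte & PTE_PRESENT) || (*pte >> PAGE_SHIFT) != expected_mfn)
        return 0;                                /* not the mapping we expected */

    *pte = 0;
    return 1;
}
```

The point of the expected_mfn comparison is that a PTE which happens to
sit at the stowed virtual address but maps some other frame is left
alone, so the race detector still gets to see it.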
>Or maybe it'd be less fragile to get the owners of the pages from Xen
>and see if the guest has legitimate mappings to them? Comments?
I think the ideal thing to do here is to mirror the live migrate case,
i.e. do a full 'disconnect' of devices, xenbus, console, event channels
etc, and then bring them back up. It'll probably be possible to do this
in a slightly more efficient / less intrusive fashion by just cauterising
things in Xen (i.e. closing the event channel -> guest path but not
unbinding the interdomain side). For grants, you basically have to
follow the live migrate case and be prepared to re-issue, since otherwise
on resume (which is presumably desired at some point?) you'll have garbage
in flight and/or lost requests.
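
As a rough sketch of what "be prepared to re-issue" means for a frontend:
keep a private shadow copy of every request that has been submitted but
not yet completed, and on resume hand out fresh grant references and
republish those requests on the new ring. Everything below is
hypothetical and simplified for illustration (struct frontend, submit(),
complete(), reissue_after_resume(), and the plain array standing in for
the shared ring); it is not the actual frontend driver code.

```c
#include <stdint.h>
#include <string.h>

#define RING_SIZE 8

struct request {
    uint64_t id;        /* request identifier chosen by the frontend */
    uint32_t gref;      /* grant reference covering the data page */
    int      in_flight; /* 1 while submitted and not yet completed */
};

struct frontend {
    struct request shadow[RING_SIZE]; /* private copy, survives suspend */
    struct request ring[RING_SIZE];   /* stand-in for the shared ring page */
    uint32_t next_gref;               /* stand-in for grant allocation */
};

/* Submit a request: record it in the shadow, then publish it on the ring. */
static int submit(struct frontend *fe, uint64_t id)
{
    for (int i = 0; i < RING_SIZE; i++) {
        if (!fe->shadow[i].in_flight) {
            fe->shadow[i].id = id;
            fe->shadow[i].gref = fe->next_gref++;
            fe->shadow[i].in_flight = 1;
            fe->ring[i] = fe->shadow[i];
            return 0;
        }
    }
    return -1; /* no free slot */
}

/* Backend finished a request: drop it from the shadow. */
static void complete(struct frontend *fe, uint64_t id)
{
    for (int i = 0; i < RING_SIZE; i++) {
        if (fe->shadow[i].in_flight && fe->shadow[i].id == id)
            fe->shadow[i].in_flight = 0;
    }
}

/*
 * After resume the old ring contents and grants cannot be trusted:
 * start from an empty ring, re-grant, and republish whatever was
 * still outstanding. Returns the number of re-issued requests.
 */
static unsigned reissue_after_resume(struct frontend *fe)
{
    unsigned reissued = 0;

    memset(fe->ring, 0, sizeof(fe->ring));
    for (int i = 0; i < RING_SIZE; i++) {
        if (fe->shadow[i].in_flight) {
            fe->shadow[i].gref = fe->next_gref++; /* old grant is stale */
            fe->ring[i] = fe->shadow[i];
            reissued++;
        }
    }
    return reissued;
}
```

Requests that completed before the suspend are simply not in the shadow
any more, so only the genuinely outstanding ones get replayed.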
Anyway, looks like an interesting start, and would be a nice feature
to get into -unstable sometime post 3.0.4.
cheers,
S.
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
- [Xen-devel] [PATCH 01 of 10] Add resumedomain domctl to resume a domain after checkpoint, Brendan Cully
- [Xen-devel] [PATCH 06 of 10] Make suspend hypercall return 1 when the domain has been resumed, Brendan Cully
- [Xen-devel] [PATCH 07 of 10] Add new shutdown mode for checkpoint, Brendan Cully
- [Xen-devel] [PATCH 04 of 10] Add XS_RESUME command, Brendan Cully
- [Xen-devel] [PATCH 03 of 10] Export xc_domain_resume to xend, Brendan Cully
- [Xen-devel] [PATCH 10 of 10] Ignore safe foreign maps in xc_linux_save, Brendan Cully
- [Xen-devel] [PATCH 09 of 10] Advertise address of grant table shared pages in suspend record, Brendan Cully
- [Xen-devel] [PATCH 05 of 10] Export XS_RESUME to xend, Brendan Cully
- [Xen-devel] [PATCH 08 of 10] Add xm save -c/--checkpoint option, Brendan Cully
- Re: [Xen-devel] [PATCH 00 of 10] Teach xm save to checkpoint a running domain,
Steven Hand <=