xen-devel
Re: [Xen-devel] live saving of domU
 
Andres Lagar Cavilla wrote:
 My understanding is that the guest only canonicalizes the store and 
console mfn's and places them on the shared info frame which is 
passed to the suspend hypercall. The rest of the canonicalizations 
are done by dom0 user-space code (xc_linux_save).
 
 Sort of.  When you pause a domain, it could be doing something like a 
PTE update in which case it has a PFN in a register (or on the stack 
somewhere).  Part of the reason for having a suspend entry point in 
the kernel is to ensure that we're in a consistent state.
 
 Does the guest kernel do anything beyond what's in __do_suspend in 
reboot.c?
 
Nothing that isn't reachable from that function.
 The guest never really shuts down: it issues the suspend hypercall 
and waits for it to return. This could happen months later when the 
domain is resumed :) The suspend hypercall executing in xen is the 
one that pauses all vcpus and kills the domain.
 
Actually, take a look at what HYPERVISOR_suspend is:
 It's just a shutdown op.  
 
 But it doesn't have to be. The hypercall could only pause the domain, 
and let the user-space tools unpause (no 's' bit -> no domain/devices 
teardown) when checkpointing is over. The guest kernel can't tell the 
difference: it returns from the hypercall and life goes on, as long as 
the devices are still there. That's what I was referring to with:
 
 It could, but you have a number of other problems you have to solve.  
How do you signal to userspace that the domain is suspended?  You could 
introduce another VIRQ perhaps or extend the state.  The __do_suspend 
path supposes that the devices are being cycled too.  You would need 
Xend to participate in this process.  How devices interact would need 
some careful thinking.
 Is it feasible to use a different hypercall that pauses the domain 
but doesn't kill it, and once xc_linux_save is done checkpointing 
have it issue a dom0_op that unpauses the domain?
 
 A domain is "killed" with a dom0_op of domain_destroy which is 
invoked by Xend.  The problem with checkpointing is that once the 's' 
bit has been set on a domain, there's no way to unset that bit.
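
 To make the distinction concrete, here is a toy model of the two paths 
discussed in this thread: suspend-as-shutdown, which sets the 's' bit for 
good, versus a plain pause that the tools can undo once the checkpoint is 
taken. Everything in it (ToyDomain, ShutdownReason, lightweight_checkpoint, 
the save callback) is a hypothetical name made up for illustration. It is 
not Xen or Xend code, only a sketch of the state transitions under the 
assumption that the 's' bit is never cleared once set.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class ShutdownReason(Enum):
    """Toy stand-ins for shutdown reasons (illustrative names only)."""
    POWEROFF = 0
    REBOOT = 1
    SUSPEND = 2


@dataclass
class ToyDomain:
    """Hypothetical model of a domain's run state; not a real Xen structure."""
    paused: bool = False
    shutdown_reason: Optional[ShutdownReason] = None

    @property
    def s_bit(self) -> bool:
        # In this model the 's' bit is set iff a shutdown reason was recorded.
        return self.shutdown_reason is not None

    def shutdown(self, reason: ShutdownReason) -> None:
        # Suspend-as-shutdown: vcpus stop and the 's' bit is set for good.
        self.paused = True
        self.shutdown_reason = reason

    def pause(self) -> None:
        # Pause only: vcpus stop, but no shutdown reason is recorded.
        self.paused = True

    def unpause(self) -> None:
        # Assumption from the thread: once 's' is set there is no way back.
        if self.s_bit:
            raise RuntimeError("'s' bit is set; domain cannot be resumed in place")
        self.paused = False


def lightweight_checkpoint(dom: ToyDomain,
                           save: Callable[[ToyDomain], None]) -> None:
    """Pause, let the (hypothetical) save callback run, then unpause."""
    dom.pause()
    try:
        save(dom)
    finally:
        dom.unpause()
```

 In this sketch a domain that went through shutdown(ShutdownReason.SUSPEND) 
refuses unpause(), while one that was only paused comes back after the save 
callback returns. The model says nothing about devices or about how 
userspace learns that the domain has stopped, which are the open problems 
raised above.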
 
 As I said a few lines up, let's not set the 's' bit for lightweight 
checkpoints. This is likely to cause a lot of special casing for 
xend/xenstore, right?
 
 Yeah, there's a lot of bits of userspace code that would be affected.  I 
hope this isn't discouraging; I certainly think it's worth the effort.
Regards,
Anthony Liguori
 
Andres
 
 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
 