Hi,
   
  I have been reading some papers from Xen and other sources, and there are 
  a couple of points that I found hard to understand.
   
  Why is the Xen hypervisor better than a traditional hypervisor? With a 
  traditional hypervisor, during a context switch, the hypervisor saves the 
  state of one guest OS and then moves on to the next OS; on coming back to the 
  first OS, it restores the hardware state and hands control back to it. Does 
  Xen do pretty much the same thing, except that it provides an API to the OS, 
  and is the reason/benefit of having such an API to reduce the time for a TLB 
  flush?
   
   
I may be wrong here, but I think "reducing the time for a TLB flush" and 
context switching are not strictly related. 
 
There are generally two ways to implement a hypervisor (aka Virtual 
Machine Monitor/VMM):
- Para-virtualization, like Xen in its traditional shape, where the OS 
source code is modified to interact directly with the 
hypervisor.
- Full virtualization: no changes to the OS source code. Xen can do this 
too, with HVM domains. 
 
One of the advantages of para-virtualization is that the para-virtual 
domain can give direct and "full" information to the hypervisor. For example: if 
a user-mode app calls "char *p = malloc(4 * 1024 * 1024);", then the 
OS will have to write 1024 page-table entries (assuming 4 KB pages; possibly 
plus a couple more for creating new entries at the higher level(s) of the page 
table). Since only the hypervisor knows the ACTUAL page-table layout (it's the 
only instance that knows the ACTUAL memory layout), the page tables in the guest 
are write-protected, and when a write happens, it's trapped. But for big blocks 
like this, a guest whose code handles the whole allocation in one place can 
just make a single call to the hypervisor to say "map me these 1024 
pages". 
 
In the full virtualization case, the hypervisor can't KNOW what the guest is 
trying to do, so each single page-table write will cause an intercept to the 
hypervisor, which then emulates the write. Of course, each page-table write 
risks incurring a TLB flush too... But in the para-virtual case above, we only 
need one TLB flush for each call to the "map many pages" function. 
 
Obviously, when switching between OSes, it's necessary to save the 
current context, set up the next context, and eventually switch back to 
the original context. Whether you trigger this via a call inside the OS source 
code (as in para-virtual Xen domains) or by detecting that the guest is blocked 
through external means (such as intercepting a HALT instruction, which indicates 
that the guest is waiting for an event of some sort) isn't going to make a big 
difference. The difference lies in the fact that the OS knows what it's trying 
to achieve (map one page or many pages), and can help the hypervisor by giving 
it additional information that a "full" virtualization hypervisor can't know. 
 
I'm sure that if I've got this all wrong, someone will correct me... 
 
--
Mats
   
  Can someone please explain this to me in detail?
   
  Cheers
   
  Chris