I have been reading some papers from Xen and other sources, and there are
just a couple of points that I found hard to understand.
Why is the Xen hypervisor better than a traditional hypervisor? With a
traditional hypervisor, during a context switch, the hypervisor saves the
state of one guest OS and then moves on to the next OS; upon coming back to the
first OS, it restores the hardware state and hands control back to that OS. Does
Xen do pretty much the same thing, except that it provides an API to the OS, and
is the reason/benefit of having such an API to reduce the time for a TLB
flush?
I may be wrong here, but I think "reducing the time for a TLB flush" and
context switching are not strictly related.
There are generally two ways to implement a hypervisor (aka Virtual
Machine Monitor):
- Para-virtualization, like Xen in its traditional shape, where the OS
source code is modified to interact directly with the hypervisor.
- Full virtualization: no changes to the OS source code. Xen can do this
too, with HVM domains.
One of the advantages of para-virtualization is that the para-virtual
domain can give direct and "full" information to the hypervisor. For example: if
a user-mode app calls "char *p = malloc(4 * 1024 * 1024);", then the OS will
have to write 1024 page-table entries to map that memory with 4 KB pages
(possibly plus a couple more for creating new entries at the higher level(s)).
Since only the hypervisor knows the ACTUAL page-table layout (it's the only
instance that knows the ACTUAL memory layout), the page tables in the guest are
write-protected, and when a write happens, it's trapped. But for big blocks like
this, a para-virtual OS, assuming its code handles big memory allocations in one
place, can just make a single call to the hypervisor to say "map me these 1024
pages".
In the full virtualization case, the hypervisor can't KNOW what the guest
is trying to do, so each single page-table write causes an intercept to the
hypervisor, which then emulates the write. Of course, each page-table write
risks incurring a TLB flush too... But in the para-virtual case above, we only
need one TLB flush for each call to the "map many pages" function.
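
To put rough numbers on the comparison above, here is a toy model in C. It is
only a sketch: the "hypervisor" is a plain struct with counters, every name in
it (toy_hv, hv_trap_pte_write, hv_map_many, and so on) is made up for
illustration, and it assumes the worst case of one TLB flush per trapped write.
Xen's real batched interface is the mmu_update hypercall, which this does not
attempt to reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* 4 KB pages, as in the malloc example */
#define TOY_MAX_PTES  4096u   /* size of the toy page table */

/* Counters kept by the toy "hypervisor" (hypothetical, not Xen code). */
struct hv_stats {
    unsigned long traps;        /* number of entries into the hypervisor */
    unsigned long tlb_flushes;  /* number of TLB flushes performed */
};

struct toy_hv {
    uint64_t pte[TOY_MAX_PTES];
    struct hv_stats stats;
};

struct pte_update {
    size_t idx;    /* which page-table entry to write */
    uint64_t val;  /* new value for that entry */
};

/* Number of page-table entries needed to map `bytes` with 4 KB pages. */
static size_t ptes_needed(size_t bytes)
{
    return (bytes + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/* A made-up PTE value: frame address plus a "present" bit. */
static uint64_t fake_pte(size_t i)
{
    return ((uint64_t)i * TOY_PAGE_SIZE) | 1u;
}

/* Full virtualization: every guest write to a write-protected page table
 * traps and is emulated on its own. Worst case assumed: one flush each. */
static void hv_trap_pte_write(struct toy_hv *hv, size_t idx, uint64_t val)
{
    assert(idx < TOY_MAX_PTES);
    hv->stats.traps++;
    hv->pte[idx] = val;
    hv->stats.tlb_flushes++;
}

/* Para-virtualization: the guest hands over the whole batch in one call,
 * so there is one entry into the hypervisor and one flush at the end. */
static void hv_map_many(struct toy_hv *hv, const struct pte_update *req,
                        size_t count)
{
    hv->stats.traps++;
    for (size_t i = 0; i < count; i++) {
        assert(req[i].idx < TOY_MAX_PTES);
        hv->pte[req[i].idx] = req[i].val;
    }
    hv->stats.tlb_flushes++;
}

/* Map a region the "full virtualization" way: one trapped write per PTE. */
static struct hv_stats map_region_trapped(size_t bytes)
{
    struct toy_hv hv = {0};
    size_t n = ptes_needed(bytes);
    assert(n <= TOY_MAX_PTES);
    for (size_t i = 0; i < n; i++)
        hv_trap_pte_write(&hv, i, fake_pte(i));
    return hv.stats;
}

/* Map the same region the "para-virtual" way: one batched call. */
static struct hv_stats map_region_batched(size_t bytes)
{
    struct toy_hv hv = {0};
    struct pte_update req[TOY_MAX_PTES];
    size_t n = ptes_needed(bytes);
    assert(n <= TOY_MAX_PTES);
    for (size_t i = 0; i < n; i++) {
        req[i].idx = i;
        req[i].val = fake_pte(i);
    }
    hv_map_many(&hv, req, n);
    return hv.stats;
}
```

For the 4 MB allocation, map_region_trapped() counts 1024 traps while
map_region_batched() counts a single one. A real hypervisor can often avoid
flushing on every trapped write, so the flush counts here are an upper bound,
but the trap counts show where the saving comes from.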
Obviously, when switching between OSes, it's necessary to save the
current context, set up the next context, and eventually switch back to
the original context. Whether you do this via a call inside the OS source code
(as in para-virtual Xen domains) or by identifying a "block" through
external means (such as intercepting a HALT instruction to indicate that the
guest is blocked waiting for an event of some sort) isn't going to make a big
difference. The difference lies in the fact that the OS knows what it's trying
to achieve (map one page or many pages), and can help the hypervisor by giving
it additional information that a "full" virtualization hypervisor can't know.
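
The context-switch point can be sketched the same way. In this toy C model
(again, all names are hypothetical and the "CPU state" is just three fields),
the switch itself is identical whether the guest announced that it was blocking
through an explicit call or the hypervisor inferred it from an intercepted HALT.
The reason is only passed along for bookkeeping and is deliberately ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the hardware state that must be saved and restored. */
struct cpu_state {
    uint64_t ip;   /* instruction pointer */
    uint64_t sp;   /* stack pointer */
    uint64_t cr3;  /* page-table base */
};

struct vcpu {
    struct cpu_state saved;  /* state while this guest is not running */
    int blocked;             /* nonzero while waiting for an event */
};

struct toy_cpu {
    struct cpu_state hw;     /* what is currently loaded on the "CPU" */
    int current;             /* index of the running guest */
    unsigned long switches;  /* number of context switches done */
};

/* How the hypervisor learned that the guest is blocked. */
enum block_reason {
    BLOCK_HYPERCALL,      /* para-virtual guest asked to block */
    BLOCK_HLT_INTERCEPT   /* unmodified guest executed HALT */
};

/* Save the running guest's state, load the next guest's state. */
static void switch_to(struct toy_cpu *cpu, struct vcpu *vcpus, int next)
{
    vcpus[cpu->current].saved = cpu->hw;
    cpu->hw = vcpus[next].saved;
    cpu->current = next;
    cpu->switches++;
}

/* Mark the running guest blocked and pick the next runnable one
 * (round-robin). If every guest is blocked, nothing is switched. */
static void guest_blocked(struct toy_cpu *cpu, struct vcpu *vcpus,
                          int nvcpus, enum block_reason why)
{
    (void)why;  /* the save/restore work does not depend on the reason */
    vcpus[cpu->current].blocked = 1;
    for (int i = 1; i < nvcpus; i++) {
        int next = (cpu->current + i) % nvcpus;
        if (!vcpus[next].blocked) {
            switch_to(cpu, vcpus, next);
            return;
        }
    }
}
```

Calling guest_blocked() with either reason goes through the same switch_to(),
which is the sense in which the API does not make the switch itself cheaper.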
I'm sure that if I've got this all wrong, someone will correct me...
Can someone please explain this to me in detail?