

[Xen-devel] Automated para-virtualization

To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] Automated para-virtualization
From: "Joshua LeVasseur" <jtl@xxxxxxxxxx>
Date: Wed, 6 Apr 2005 01:59:27 +0200
Delivery-date: Tue, 05 Apr 2005 23:59:38 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AcU6O3/LoUFVb3qdSZKYU3JZBBS8Dw==

We, the University of Karlsruhe and UNSW/NICTA, have been working on a
technique to automate para-virtualization, in the hope of simplifying the
maintenance of the various guest OSs, and would like to share our results to
date.

The basis of our solution is instruction substitution at the assembler level
to replace the virtualization-sensitive operations of the guest OS.  These
sensitive operations include both instructions and memory accesses (such as
to page tables or device registers).
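As a toy illustration of the idea (not our actual tool, which operates on the
assembler's output rather than on text), such a pass maps each sensitive
instruction to a call into an emulation stub; the stub names below are
invented:

```python
# Toy sketch: map virtualization-sensitive x86 instructions to calls
# into (hypothetical) emulation stubs.  The real substitution happens
# at the assembler level on machine instructions, not on text.
SENSITIVE = {
    "cli": "call vmi_cli",   # disable interrupts
    "sti": "call vmi_sti",   # enable interrupts
    "hlt": "call vmi_hlt",   # halt until next interrupt
}

def substitute(asm_lines):
    """Replace each sensitive instruction with its emulation call."""
    return [SENSITIVE.get(line.strip(), line) for line in asm_lines]

listing = ["push %eax", "cli", "mov %eax, %ebx", "sti", "pop %eax"]
print(substitute(listing))
# -> ['push %eax', 'call vmi_cli', 'mov %eax, %ebx', 'call vmi_sti', 'pop %eax']
```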

In summary:

The patch to Linux 2.6 for IA32 is roughly 80 lines, primarily for manual
annotations of page table accesses (similar to Linux's user-access
annotations).  There are a few additional changes for the build process.

The automation relies on a runtime support module, which provides the CPU
model, and runs within the address space of the guest Linux.  By running
within the address space, and by batching virtualization state changes, we
achieve performance comparable to para-virtualization.  The runtime support
module is mostly guest OS independent since the virtualization is at the ISA
level; it can support other guest OSs.
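The batching can be pictured with a minimal sketch: pending guest state
changes (here, page-table updates) are queued in the guest's address space
and submitted to the hypervisor in one transition.  The class and the
"hypercall" counter below are invented for illustration:

```python
# Toy sketch of batching guest state changes before a single hypervisor
# transition, in the spirit of the runtime support module described
# above.  The hypercall here is a stand-in that only counts invocations.
class BatchingRuntime:
    def __init__(self, limit=8):
        self.pending = []      # queued (address, value) page-table updates
        self.limit = limit
        self.hypercalls = 0    # how many hypervisor transitions we paid for

    def set_pte(self, addr, val):
        self.pending.append((addr, val))
        if len(self.pending) >= self.limit:
            self.flush()

    def flush(self):
        if self.pending:
            self.hypercalls += 1   # one transition covers the whole batch
            self.pending.clear()

rt = BatchingRuntime(limit=4)
for i in range(8):
    rt.set_pte(0x1000 + i * 4, i)
print(rt.hypercalls)  # -> 2 (two batched transitions instead of eight)
```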

The instruction substitution takes place at OS boot, which permits us to use
a single OS binary for bare metal (including VT and VMware), and any
supported hypervisor, such as Xen and our L4 microkernel.  To provide
rewrite space for the instructions, the OS binary is prepared with NOP
scratch space.
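The rewrite-space mechanism can be sketched as follows.  The build step pads
each sensitive instruction with NOPs; at boot, if a hypervisor is detected,
the instruction plus its padding is overwritten with a longer call sequence.
The opcodes are x86 (0xFA = cli, 0x90 = nop, 0xE8 = call rel32), but the
layout and offsets are invented:

```python
# Toy model of boot-time rewriting into NOP scratch space.  On bare
# metal the padded image runs unmodified, since the NOPs are harmless.
def pad_sensitive(code, at, pad=4):
    # build step: insert NOP scratch space after the instruction at 'at'
    return code[:at + 1] + b"\x90" * pad + code[at + 1:]

def rewrite_for_hypervisor(code, at, stub_offset):
    # boot step: overwrite the instruction and padding with "call stub",
    # keeping the overall code length unchanged
    call = b"\xE8" + stub_offset.to_bytes(4, "little", signed=True)
    return code[:at] + call + code[at + 5:]

image = pad_sensitive(b"\x50\xFA\x58", at=1)   # push %eax; cli; pop %eax
print(image.hex())    # -> 50fa9090909058 (cli followed by four nops)
booted = rewrite_for_hypervisor(image, at=1, stub_offset=0x100)
print(booted.hex())   # -> 50e80001000058 (cli+nops replaced by a call)
```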

Our current research aims to enable run-time migration between incompatible
hypervisors, or between different versions of the same hypervisor, by
rewriting the instruction substitutions at migration time.  Additionally, we
envisage installing a hypervisor underneath an OS that is already running on
bare metal.

We have a high-speed network device emulation of the DP83820 NIC, built on
the sensitive-memory instruction substitution (an additional patch of
several lines to enable manual annotations).  If the guest OS uses its stock
DP83820 driver, it gets high-speed access to devices running in Dom0, with
speed comparable to a customized device driver.  Because the device state is
encapsulated in the model rather than in a driver, a guest OS using the
DP83820 device can migrate between different hypervisors.
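The device model can be sketched as follows: annotated memory accesses to
the register window are redirected into a software model, so all device
state lives in ordinary model data and can migrate with the guest.  The
register offsets and bit meanings below are invented for illustration (they
do not reproduce the real DP83820 register map):

```python
# Toy sketch of a device-register model in the DP83820 spirit: trapped
# MMIO writes update model state instead of real hardware.
TXDP = 0x20   # transmit descriptor pointer (illustrative offset)
CR   = 0x00   # command register (illustrative offset)

class NicModel:
    def __init__(self):
        self.regs = {}
        self.tx_kicks = 0

    def mmio_write(self, offset, value):
        # called by the substituted code for each annotated device access
        self.regs[offset] = value
        if offset == CR and value & 0x1:   # TX-enable bit, invented
            self.tx_kicks += 1             # would hand packets to Dom0

    def snapshot(self):
        # everything needed to migrate is plain model state
        return dict(self.regs)

nic = NicModel()
nic.mmio_write(TXDP, 0x8000)
nic.mmio_write(CR, 0x1)
print(nic.tx_kicks, hex(nic.snapshot()[TXDP]))  # -> 1 0x8000
```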

Our performance data is so far only from the Netperf benchmark, which uses
many of the virtualization-sensitive instructions.  In our results, which
focus on Xen and L4 with Linux 2.6.9, we see negligible performance
differences.  Additionally, when running the same OS binary on raw hardware,
we see an increase in performance (due to different trace cache behavior).

The solution currently works for IA32 and Itanium, but the approach is
applicable to other architectures (we are working on Power and ARM).

We are also working toward complete automation of the process, to avoid any
patches to Linux, and have made good progress here.

We will shortly release the code under a BSD license.


Our initial performance data:
Test system: 2.8 GHz P4 Prescott with 256MB RAM for each VM
Guest OS: Linux 2.6.9 configured for XT-PIC
Client system: 1.4 GHz P4
Connection: Intel gigabit, with the e1000 driver, via a gigabit switch
Netperf with 256K socket buffer, 1GB data transferred.

Description: send/receive (Mbit/s)

Annotated Linux on raw hardware: 834/712
Native Linux on raw hardware: 827/713

Annotated Linux on Xen: 834/711
XenoLinux: 830/711

Annotated Linux on L4: 830/709
L4Ka::Linux: 775/712

Annotated Linux with DP83820 model on L4: 771/707
L4Ka::Linux with custom network driver on L4: 772/708
