WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
xen-devel

RE: [Xen-devel] Full virtualization and I/O

To: "Liang Yang" <multisyncfe991@xxxxxxxxxxx>
Subject: RE: [Xen-devel] Full virtualization and I/O
From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
Date: Wed, 22 Nov 2006 18:22:52 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, Thomas Heinz <thomasheinz@xxxxxxx>
Delivery-date: Wed, 22 Nov 2006 09:26:26 -0800
Envelope-to: www-data@xxxxxxxxxxxxxxxxxx
In-reply-to: <BAY125-DAV845335C01C7AC3A2954BE93E30@xxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index: AccOWh8vXRz5yFDqSuagVnV0XShOaAAAIb/w
Thread-topic: [Xen-devel] Full virtualization and I/O
 

> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Liang Yang
> Sent: 22 November 2006 17:17
> To: Petersson, Mats
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Thomas Heinz
> Subject: Re: [Xen-devel] Full virtualization and I/O
> 
> Hi Mats,
> 
> This para-virtualized driver in an HVM domain is just like the dummy device
> driver in a para-virtualized domain. And after installing this
> para-virtualized driver, the HVM domain also uses the front-end/back-end
> model to handle I/O, instead of the "device model" which a typical HVM
> domain would use.
> 
> Am I correct?

Yes, exactly. 

Of course, the HVM domain may well use a mixture, say for example using
the normal (device-model) IDE device driver to access the disk, and a
para-virtual network driver to access the network. 

--
Mats
> 
> Liang
> 
> ----- Original Message ----- 
> From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
> To: "Liang Yang" <multisyncfe991@xxxxxxxxxxx>
> Cc: "Thomas Heinz" <thomasheinz@xxxxxxx>; 
> <xen-devel@xxxxxxxxxxxxxxxxxxx>
> Sent: Wednesday, November 22, 2006 9:57 AM
> Subject: RE: [Xen-devel] Full virtualization and I/O
> 
> 
> 
> 
> > -----Original Message-----
> > From: Liang Yang [mailto:multisyncfe991@xxxxxxxxxxx]
> > Sent: 22 November 2006 16:51
> > To: Petersson, Mats
> > Cc: Thomas Heinz; xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: Re: [Xen-devel] Full virtualization and I/O
> >
> > Hi Mats,
> >
> > Thanks for such a detailed explanation.
> >
> > As you mentioned in your post, could you elaborate on using the unmodified
> > driver in an HVM domain (i.e. using the front-end driver in a
> > full-virtualized domain)? Do you think a para-virtualized domain will have
> > exactly the same behavior as a full-virtualized domain when both of them
> > are using this unmodified driver to access virtual block devices?
> 
> Not sure exactly what you're asking, but if you're asking if the
> performance of driver-related work will be approximately the same, yes.
> 
> By the way, I wouldn't call that an "unmodified" driver - it is
> definitely a MODIFIED driver (a para-virtual driver).
> 
> --
> Mats
> >
> > Best regards,
> >
> > Liang
> >
> > ----- Original Message ----- 
> > From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
> > To: "Thomas Heinz" <thomasheinz@xxxxxxx>;
> > <xen-devel@xxxxxxxxxxxxxxxxxxx>
> > Sent: Wednesday, November 22, 2006 9:24 AM
> > Subject: RE: [Xen-devel] Full virtualization and I/O
> >
> >
> > > -----Original Message-----
> > > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> > > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> > > Thomas Heinz
> > > Sent: 20 November 2006 23:39
> > > To: xen-devel@xxxxxxxxxxxxxxxxxxx
> > > Subject: [Xen-devel] Full virtualization and I/O
> > >
> > > Hi
> > >
> > > Full virtualization is about providing multiple virtual ISA level
> > > environments and mapping them to a single physical one. One particular
> > > aspect of this mapping is I/O instructions (explicit or mmapped I/O). In
> > > general, there are two strategies to partition the devices, either in
> > > time or in space. Partitioning a device in space means that the device
> > > (or a part of it) is exclusively available to a single VM. Partitioning
> > > a device in time (or time multiplexing) means that it can be used by
> > > multiple VMs but only one VM may use it at any point in time.
> >
> > The Xen approach is to not allow any sharing of devices: a device is
> > owned by one domain, and no other domain can directly access the device.
> > There is a protocol of so-called frontend/backend drivers. The frontend
> > is basically a dummy device that forwards a request to another domain
> > (normally domain 0). The backend, the other half of the driver pair,
> > picks up this data and forwards it to some processing task, which then
> > sends the packet on to the real hardware.
> >
> > For fully virtualized mode (hardware supported virtual machine, such as
> > AMD-V or Intel VT, aka HVM), there is a different model, where a "device
> > model" is involved to perform the hardware modelling. In Xen, this is
> > using a modified version of qemu (called qemu-dm), which has a fairly
> > complete set of "hardware" in its model. It's got, for example, an IDE
> > controller, several types of network devices, graphics and
> > mouse/keyboard models. The things you'd usually find in a PC, that is.
> > The way it works is that the hypervisor intercepts IOIO and memory
> > mapped IO regions that match the devices involved (such as the
> > A0000-BFFFF region for VGA frame buffer memory or the 0x1F0-0x1F7 IO
> > ports for the IDE controller), and forwards a request from the
> > hypervisor to qemu-dm, where the operation changes the current state.
> > When it's necessary, the state change will result in, for example, a
> > read request to the "hard disk" (which may be a real disk, a file on a
> > local disk, or a file on a network storage device, to give some
> > examples).
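
To make the device-model forwarding described above a little more concrete, here is a minimal sketch of routing an intercepted port access to an emulated device by port range. This is an illustration only and not Xen or qemu-dm code: the class names IdeModel and DeviceModelDispatcher, and the choice to read unclaimed ports as 0xFF, are assumptions made for the example. Only the 0x1F0-0x1F7 range comes from the text.

```python
# Illustrative sketch only: NOT Xen or qemu-dm code. All names are hypothetical.

class IdeModel:
    """Toy device model: emulated register state for ports 0x1F0-0x1F7."""

    def __init__(self):
        self.regs = [0] * 8

    def write(self, offset, value):
        # The guest's write only changes the model's in-memory state.
        self.regs[offset] = value & 0xFF

    def read(self, offset):
        return self.regs[offset]


class DeviceModelDispatcher:
    """Routes an intercepted port access to whichever model claims the port."""

    def __init__(self):
        self.ranges = []  # list of (first_port, last_port, model)

    def register(self, first_port, last_port, model):
        self.ranges.append((first_port, last_port, model))

    def _find(self, port):
        for first, last, model in self.ranges:
            if first <= port <= last:
                return model, port - first
        return None, None

    def port_write(self, port, value):
        model, offset = self._find(port)
        if model is None:
            return False  # no device model claims this port
        model.write(offset, value)
        return True

    def port_read(self, port):
        model, offset = self._find(port)
        if model is None:
            return 0xFF  # assumption: unclaimed ports read as all ones
        return model.read(offset)


dispatcher = DeviceModelDispatcher()
ide = IdeModel()
dispatcher.register(0x1F0, 0x1F7, ide)
dispatcher.port_write(0x1F2, 4)  # guest writes a register; only model state changes
```

A real device model would go further and turn certain register writes into actual disk reads or writes, as described above. The sketch stops at the state change.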
> >
> > There is also the option of using the frontend drivers as described
> > above in the fully virtualized model.
> >
> > Finally, while I'm on the subject of fully virtualized mode: It is
> > currently not possible to give a DMA-based device to a fully-virtualized
> > domain. The reason for this is that the guest OS will have been told
> > that memory is from 0..256MB (say), and its actual machine physical
> > address is at 256MB..512MB. The OS is completely unaware of this
> > "mismatch". So the OS will perform some operation to take a virtual
> > address of some buffer (say a network packet) and make it into a
> > "physical address", which will be an address in the range of 0..256MB.
> > The device uses that address as it is, which will of course (at least)
> > lead to the wrong data being transmitted, as the address of the actual
> > data is somewhere in the range 256MB..512MB. The only solution to this
> > is to have an IOMMU, which can translate the guest's understanding of a
> > physical address (0..256MB) to a machine physical address
> > (256MB..512MB).
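
The address mismatch above can be worked through with the numbers from the example. This is a sketch under a strong simplifying assumption: the guest's memory is one contiguous block at a fixed offset, so translation is a single addition. A real IOMMU translates per page through tables. The function name guest_to_machine is made up for the illustration.

```python
# Illustrative arithmetic only. Assumes one contiguous block at a fixed
# offset; real IOMMUs translate page by page. Names are hypothetical.

MB = 1024 * 1024

GUEST_BASE = 0            # the guest believes its memory starts at 0
GUEST_SIZE = 256 * MB     # ...and spans 0..256MB
MACHINE_BASE = 256 * MB   # the memory really lives at 256MB..512MB


def guest_to_machine(guest_phys):
    """Translate a guest 'physical' address to a machine physical address."""
    if not (GUEST_BASE <= guest_phys < GUEST_BASE + GUEST_SIZE):
        raise ValueError("address outside the guest's memory")
    return guest_phys - GUEST_BASE + MACHINE_BASE


# The guest driver programs the device with what it thinks is a physical address.
buffer_guest_phys = 16 * MB

# Without an IOMMU the device uses that value directly: wrong machine memory.
dma_without_iommu = buffer_guest_phys

# With an IOMMU the address is translated before the access reaches memory.
dma_with_iommu = guest_to_machine(buffer_guest_phys)
```

Here the device would read from 16MB without translation, while the buffer actually sits at 272MB.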
> >
> > >
> > > I am trying to understand how I/O virtualization on the ISA level
> > > works if a device is shared between multiple VM instances. On a very
> > > high level, it should be as follows. First of all, the VMM has to
> > > intercept the VM's I/O commands (I/O instructions or load/store to
> > > dedicated memory addresses - let's ignore interrupts for the moment).
> > > This could be done by traps or by replacing the resp. instructions by
> > > VMM calls to I/O primitives. The VMM keeps multiple device model
> > > instances (one for each VM using the device) in memory. The models
> > > somehow reflect the low level I/O API of the device. Depending on
> > > which I/O command is issued by the VM, either the memory model is
> > > changed or a number of I/O instructions are executed to make the
> > > physical device state reflect the one represented in the memory model.
> >
> > By ISA, do you mean "Instruction Set Architecture" or something else (I
> > presume you do NOT mean the ISA bus...)?
> >
> > Intercepting IOIO instructions or MMIO instructions is not that hard -
> > in HVM the two processor architectures have specific intercepts and
> > bitmaps to indicate which IO instructions should be intercepted. MMIO
> > will require the page-tables to be set up such that the memory mapped
> > region is mapped "not present" so that any operation to this region
> > gives a page-fault, and then the page-fault is analyzed to see if it's
> > for a MMIO address or for a "real page fault".
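
The two interception mechanisms above can be sketched roughly as follows. This is a toy model, not the actual AMD-V or Intel VT data structures: it only assumes one bit per I/O port in a bitmap, and a list of MMIO ranges consulted when a fault is analyzed. The names IoInterceptBitmap, MMIO_RANGES and classify_fault are invented for the illustration. The port and address ranges are the ones mentioned earlier in this thread.

```python
# Illustrative sketch only: not the real AMD-V / Intel VT structures.
# All names are hypothetical.

PORT_COUNT = 0x10000  # x86 has a 16-bit I/O port space


class IoInterceptBitmap:
    """One bit per I/O port: a set bit means 'trap this access'."""

    def __init__(self):
        self.bits = bytearray(PORT_COUNT // 8)

    def intercept(self, first_port, last_port):
        for port in range(first_port, last_port + 1):
            self.bits[port >> 3] |= 1 << (port & 7)

    def is_intercepted(self, port):
        return bool(self.bits[port >> 3] & (1 << (port & 7)))


# MMIO regions are mapped "not present", so every access to them faults.
MMIO_RANGES = [(0xA0000, 0xBFFFF)]  # VGA frame buffer region from the text


def classify_fault(address):
    """Decide whether a page fault hit an emulated MMIO region."""
    for first, last in MMIO_RANGES:
        if first <= address <= last:
            return "mmio"
    return "real page fault"


bitmap = IoInterceptBitmap()
bitmap.intercept(0x1F0, 0x1F7)  # IDE controller ports
```

Ports outside the marked range go straight through, and a fault outside the MMIO ranges is treated as an ordinary page fault.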
> >
> > For para-virtualization, the model is similar, but the exact model of
> > how to intercept the IOIO or MMIO instruction is slightly different -
> > but in essence it's the same principle. Let me know if you really need
> > to know how Xen goes about doing this, as it's quite complicated (more
> > so than the HVM version, for sure).
> >
> >
> > >
> > > This approach brings up a number of questions. It would be great if
> > > some of the virtualization experts here could shed some light on them
> > > (even though they are not immediately related to Xen, I know):
> > >
> > > - What do these device memory models look like? Is there a common
> > >   (automata) theory behind them or are they done ad hoc?
> >
> > Not sure what you're asking for here. The devices are either modeled
> > after a REAL device (qemu-dm), and as such will resemble as closely as
> > possible the REAL hardware device being emulated, or, in the
> > frontend/backend driver, there is an "idealized model", such that the
> > request contains just the basic data that the OS normally provides to
> > the driver, and it's placed in a queue with a message-signaling system
> > to tell the other side that it's got something in the queue.
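
The "idealized model" above, a queue of bare requests plus a signal to the other side, might look roughly like this in miniature. It is a sketch and not the Xen ring protocol: there is no shared memory or event channel here, just a deque and a flag, and the names RequestQueue, frontend_read, frontend_write and backend_service are all made up for the example. The "disk" is a plain dictionary.

```python
# Illustrative sketch only: not the Xen frontend/backend protocol.
# All names are hypothetical; the "disk" is a dict of sector -> bytes.
from collections import deque


class RequestQueue:
    """Toy stand-in for a queue shared by a frontend and a backend driver."""

    def __init__(self):
        self.requests = deque()
        self.responses = deque()
        self.backend_signalled = False

    def submit(self, request):
        # Frontend side: queue the request, then signal the other side.
        self.requests.append(request)
        self.backend_signalled = True


def frontend_read(queue, sector):
    # The request carries only what the OS would hand a driver anyway.
    queue.submit({"op": "read", "sector": sector})


def frontend_write(queue, sector, data):
    queue.submit({"op": "write", "sector": sector, "data": data})


def backend_service(queue, disk):
    """Backend side: when signalled, drain the queue against the real device."""
    if not queue.backend_signalled:
        return 0
    queue.backend_signalled = False
    handled = 0
    while queue.requests:
        req = queue.requests.popleft()
        response = {"sector": req["sector"], "status": "ok"}
        if req["op"] == "write":
            disk[req["sector"]] = req["data"]
        else:
            response["data"] = disk.get(req["sector"], b"")
        queue.responses.append(response)
        handled += 1
    return handled


disk = {}
queue = RequestQueue()
frontend_write(queue, 7, b"hello")
frontend_read(queue, 7)
handled = backend_service(queue, disk)
```

Note that the frontend never touches device registers. It only describes what it wants, which is why no hardware emulation is needed on this path.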
> >
> > > - What kind of strategies/algorithms are used in the merge phase, i.e.
> > >   the phase where the virtual memory model and the physical one are
> > >   synchronized? What kind of problems can occur in this phase?
> >
> > The Xen approach is to avoid this by giving each device to only one
> > domain.
> >
> > > - Are specific usage patterns used in real world implementations (e.g.
> > >   VMWare) to simplify the virtualization (model or merge phase)?
> >
> > This is probably the wrong list to ask detailed questions about how
> > VMWare works... ;-)
> >
> > > - Do you have any interesting pointers to literature dealing with full
> > >   I/O virtualization? In particular, how does VMWare's full
> > >   virtualization work with respect to I/O?
> >
> > Again, wrong list for VMWare questions.
> >
> > > - Is every device time partitionable? If not, which requirements does
> > >   it have to meet to be time partitionable?
> >
> > Certainly not - I would say that almost all devices are NOT time
> > partitionable, as the state in the device is dependent on the current
> > usage. The more complex the device is, the more likely it is to have
> > difficulties, but even such a simple device as a serial port would
> > struggle to work in a time-shared fashion. Not to mention that serial
> > ports are generally used for multiple transactions that make up a whole
> > "bigger picture transaction". For example, a web-server connected via
> > a serial modem would send a packet of several hundred bytes to the
> > serial port driver, which is then portioned out as and when the serial
> > port is ready to send another few bytes. If you switch from one guest to
> > another during this process, and the second guest also has something to
> > send on the serial port, you'd end up with a very scrambled message from
> > the first guest and quite likely the second guest's message completely
> > lost!
> >
> > There are some devices that are specifically built to manage multiple
> > hosts, but other than that, any sharing of a device requires some
> > software to gather up "a full transaction" and then send that to the
> > actual hardware, often also waiting for the transaction to complete (for
> > example the interrupt signal to say that the hard disk write is
> > complete).
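
The serial port example above can be shown with a few lines of code. This sketch only models the interleaving: two guests each have a message that goes out a few bytes at a time, and the port is handed to the other guest between chunks. It does not model the lost-data case, and the function names time_sliced and whole_transactions are made up for the illustration.

```python
# Illustrative sketch only: models interleaving on a shared byte stream,
# not a real serial port. All names are hypothetical.

def time_sliced(messages, chunk):
    """Switch guests after every chunk, as naive time-sharing would."""
    wire = []
    offsets = [0] * len(messages)
    while any(off < len(msg) for off, msg in zip(offsets, messages)):
        for i, msg in enumerate(messages):
            if offsets[i] < len(msg):
                wire.append(msg[offsets[i]:offsets[i] + chunk])
                offsets[i] += chunk
    return b"".join(wire)


def whole_transactions(messages):
    """Gather each guest's full transaction before sending it to the device."""
    return b"".join(messages)


guest_messages = [b"AAAAAAAA", b"BBBBBBBB"]
scrambled = time_sliced(guest_messages, 4)
intact = whole_transactions(guest_messages)
```

With time slicing, neither guest's message appears on the wire in one piece. Gathering a full transaction first keeps each message contiguous.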
> >
> >
> > >   -> I don't think every device is. What about a device which supports
> > >      different modes of operation? If two VMs drive the virtual device
> > >      in different modes, it may not be possible to constantly switch
> > >      between them. Ok, this is pretty artificial.
> >
> > A particular problem is devices where you can't necessarily read back
> > the last mode-setting, which may well be the case in many different
> > devices. You can't, for example, read back all the registers on an IDE
> > device, because the read of a particular address may give the status
> > rather than the current command sent, or some such.
> >
> > --
> > Mats
> > >
> > > Thanks a lot for your help!
> > >
> > >
> > > Best wishes
> > >
> > > Thomas
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@xxxxxxxxxxxxxxxxxxx
> > > http://lists.xensource.com/xen-devel