Re: [Xen-devel] Full virtualization and I/O
Hi Mats,
The problem with using the QEMU device model is that it can support at
most four disks, as IDE only supports two master/slave pairs. Do you
know if this limitation will be lifted in QEMU in the future, e.g. by
emulating SCSI disks instead of IDE devices, so that it can support many
more than four block devices?
Thanks,
Liang
----- Original Message -----
From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
To: "Liang Yang" <multisyncfe991@xxxxxxxxxxx>
Cc: <xen-devel@xxxxxxxxxxxxxxxxxxx>; "Thomas Heinz" <thomasheinz@xxxxxxx>
Sent: Wednesday, November 22, 2006 10:22 AM
Subject: RE: [Xen-devel] Full virtualization and I/O
-----Original Message-----
From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
[mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Liang Yang
Sent: 22 November 2006 17:17
To: Petersson, Mats
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Thomas Heinz
Subject: Re: [Xen-devel] Full virtualization and I/O
Hi Mats,
This para-virtualized driver in the HVM domain is just like the dummy
device driver in a para-virtualized domain. And after installing this
para-virtualized driver, the HVM domain also uses this kind of
front-end/back-end model to handle I/O instead of the "device model"
that a typical HVM domain would use.
Am I correct?
Yes, exactly.
Of course, the HVM domain may well use a mixture, say for example using
the normal (device-model) IDE device driver to access the disk, and a
para-virtual network driver to access the network.
--
Mats
Liang
----- Original Message -----
From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
To: "Liang Yang" <multisyncfe991@xxxxxxxxxxx>
Cc: "Thomas Heinz" <thomasheinz@xxxxxxx>;
<xen-devel@xxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, November 22, 2006 9:57 AM
Subject: RE: [Xen-devel] Full virtualization and I/O
> -----Original Message-----
> From: Liang Yang [mailto:multisyncfe991@xxxxxxxxxxx]
> Sent: 22 November 2006 16:51
> To: Petersson, Mats
> Cc: Thomas Heinz; xen-devel@xxxxxxxxxxxxxxxxxxx
> Subject: Re: [Xen-devel] Full virtualization and I/O
>
> Hi Mats,
>
> Thanks for your explanation in such details.
>
> As you mentioned in your post, could you elaborate on using the
> unmodified driver in an HVM domain (i.e. using the front-end driver in
> a fully virtualized domain)? Do you think a para-virtualized domain
> will have exactly the same behavior as a fully virtualized domain when
> both of them use this unmodified driver to access virtual block
> devices?
Not sure exactly what you're asking, but if you're asking if the
performance of driver-related work will be approximately the
same, yes.
By the way, I wouldn't call that an "unmodified" driver - it is
definitely a MODIFIED driver (a para-virtual driver).
--
Mats
>
> Best regards,
>
> Liang
>
> ----- Original Message -----
> From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
> To: "Thomas Heinz" <thomasheinz@xxxxxxx>;
> <xen-devel@xxxxxxxxxxxxxxxxxxx>
> Sent: Wednesday, November 22, 2006 9:24 AM
> Subject: RE: [Xen-devel] Full virtualization and I/O
>
>
> > -----Original Message-----
> > From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
> > [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of
> > Thomas Heinz
> > Sent: 20 November 2006 23:39
> > To: xen-devel@xxxxxxxxxxxxxxxxxxx
> > Subject: [Xen-devel] Full virtualization and I/O
> >
> > Hi
> >
> > Full virtualization is about providing multiple virtual ISA-level
> > environments and mapping them to a single physical one. One
> > particular aspect of this mapping is I/O instructions (explicit or
> > memory-mapped I/O). In general, there are two strategies to partition
> > the devices, either in time or in space. Partitioning a device in
> > space means that the device (or a part of it) is exclusively
> > available to a single VM. Partitioning a device in time (or time
> > multiplexing) means that it can be used by multiple VMs, but only one
> > VM may use it at any point in time.
>
> The Xen approach is not to allow any sharing of devices: a device is
> owned by one domain, and no other domain can directly access it. There
> is a protocol of so-called frontend/backend drivers, where the
> frontend is basically a dummy device that forwards a request to
> another domain (normally domain 0); the other half of the driver pair
> picks up this data and forwards it to some processing task, which then
> sends the packet on to the real hardware.
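
A minimal C sketch of the frontend/backend flow just described; the
structure and function names here are illustrative only, not the actual
Xen blkif ABI:

    /* Idealized block request as a front-end "dummy" driver might queue
     * it for the back-end in domain 0.  All names are made up. */
    #include <stdint.h>

    struct blk_request {
        uint64_t sector;      /* starting sector on the virtual disk    */
        uint32_t nr_sectors;  /* number of sectors to transfer          */
        uint32_t page_ref;    /* reference to the guest page with data  */
        uint8_t  write;       /* 1 = write, 0 = read                    */
    };

    struct shared_ring {
        struct blk_request req[32];
        volatile uint32_t  req_prod;  /* written by the front-end       */
        volatile uint32_t  req_cons;  /* written by the back-end        */
    };

    /* Front-end side: enqueue one request and signal the back-end,
     * e.g. via an event channel. */
    static void frontend_submit(struct shared_ring *ring,
                                const struct blk_request *r,
                                void (*notify_backend)(void))
    {
        ring->req[ring->req_prod % 32] = *r;  /* copy into the ring     */
        __sync_synchronize();                 /* publish data first...  */
        ring->req_prod++;                     /* ...then the new index  */
        notify_backend();                     /* wake the back-end      */
    }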
>
> For fully virtualized mode (hardware-supported virtual machine, such
> as AMD-V or Intel VT, aka HVM), there is a different model, where a
> "device model" is involved to perform the hardware modelling. In Xen,
> this is a modified version of qemu (called qemu-dm), which has a
> fairly complete set of "hardware" in its model. It has, for example,
> an IDE controller, several types of network devices, and graphics and
> mouse/keyboard models - the things you'd usually find in a PC, that
> is. The way it works is that the hypervisor intercepts IOIO and
> memory-mapped IO regions that match the devices involved (such as the
> A0000-BFFFF region for VGA frame-buffer memory or the 0x1F0-0x1F7 IO
> ports for the IDE controller), and forwards a request from the
> hypervisor to qemu-dm, where the operation changes the current state,
> and, when necessary, the state change results in, for example, a read
> request to the "hard disk" (which may be a real disk, a file on a
> local disk, or a file on a network storage device, to give some
> examples).
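
To make that dispatch path concrete, here is a small C sketch of how an
intercepted access might be packaged and handed to the device model; the
struct and function names are illustrative, not the real Xen ioreq
interface:

    #include <stdint.h>

    struct io_request {
        uint64_t addr;      /* IO port, or guest-physical MMIO address  */
        uint64_t data;      /* value written, or filled in on a read    */
        uint32_t size;      /* access width in bytes                    */
        uint8_t  is_mmio;   /* 1 = memory-mapped IO, 0 = port IO        */
        uint8_t  is_write;  /* direction of the access                  */
    };

    /* Placeholder: in reality the request goes into a shared page and
     * qemu-dm is woken up via an event channel. */
    static void forward_to_qemu_dm(struct io_request *req) { (void)req; }

    static void handle_intercept(struct io_request *req)
    {
        int ide_port = !req->is_mmio &&
                       req->addr >= 0x1F0 && req->addr <= 0x1F7;
        int vga_mmio = req->is_mmio &&
                       req->addr >= 0xA0000 && req->addr <= 0xBFFFF;

        if (ide_port || vga_mmio)
            forward_to_qemu_dm(req);  /* device model updates its state
                                         and, if needed, touches the
                                         real backing store             */
    }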
>
> There is also the option of using the frontend drivers as described
> above in the fully virtualized model.
>
> Finally, while I'm on the subject of fully virtualized mode: it is
> currently not possible to give a DMA-based device to a fully
> virtualized domain. The reason for this is that the guest OS will have
> been told that its memory runs from 0..256MB (say), while its actual
> machine-physical address range is 256MB..512MB. The OS is completely
> unaware of this "mismatch". So the OS will perform some operation to
> take a virtual address of some buffer (say a network packet) and turn
> it into a "physical address", which will be an address in the range
> 0..256MB. This will of course (at least) lead to the wrong data being
> transmitted, as the actual data is somewhere in the range
> 256MB..512MB. The only solution to this is to have an IOMMU, which can
> translate the guest's understanding of a physical address (0..256MB)
> into a machine-physical address (256MB..512MB).
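
A short sketch of the address mismatch described above; the 256MB offset
and all names are purely illustrative:

    #include <stdint.h>

    #define GUEST_SIZE   0x10000000ULL  /* guest told: memory is 0..256MB */
    #define MACHINE_BASE 0x10000000ULL  /* really lives at 256MB..512MB   */

    /* What an IOMMU (or the OS, if it knew about the offset) would have
     * to do before a device could DMA to the right place. */
    static uint64_t machine_phys(uint64_t guest_phys)
    {
        return MACHINE_BASE + guest_phys;
    }

    /* Without an IOMMU, the guest programs the device with guest_phys
     * directly, so the device reads or writes the wrong machine memory:
     * guest_phys 0x100000 really means machine address 0x10100000. */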
>
> >
> > I am trying to understand how I/O virtualization on the ISA level
> > works if a device is shared between multiple VM instances. On a very
> > high level, it should be as follows. First of all, the VMM has to
> > intercept the VM's I/O commands (I/O instructions or load/store to
> > dedicated memory addresses - let's ignore interrupts for the moment).
> > This could be done by traps or by replacing the resp. instructions by
> > VMM calls to I/O primitives. The VMM keeps multiple device model
> > instances (one for each VM using the device) in memory. The models
> > somehow reflect the low-level I/O API of the device. Depending on
> > which I/O command is issued by the VM, either the memory model is
> > changed or a number of I/O instructions are executed to make the
> > physical device state reflect the one represented in the memory
> > model.
>
> Do you by ISA mean "Instruction Set Architecture" or something else (I
> presume it's NOT meant as the ISA bus...)?
>
> Intercepting IOIO instructions or MMIO instructions is not that hard -
> in HVM, the two processor architectures have specific intercepts and
> bitmaps to indicate which IO instructions should be intercepted. MMIO
> requires the page tables to be set up such that the memory-mapped
> region is mapped "not present", so that any operation on this region
> gives a page fault; the page fault is then analyzed to see whether it
> is for an MMIO address or is a "real page fault".
>
> For para-virtualization, the model is similar, but the exact way of
> intercepting the IOIO or MMIO instruction is slightly different - in
> essence it's the same principle. Let me know if you really need to
> know how Xen goes about doing this, as it's quite complicated (more so
> than the HVM version, for sure).
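
The two HVM intercept mechanisms above can be sketched in a few lines of
C; the bitmap layout and names are illustrative, not the actual VMCB/VMCS
encoding:

    #include <stdint.h>

    /* One bit per IO port: set = trap to the hypervisor. */
    static uint8_t io_bitmap[65536 / 8];

    static int io_port_intercepted(uint16_t port)
    {
        return (io_bitmap[port / 8] >> (port % 8)) & 1;
    }

    /* A page fault on a region deliberately mapped "not present" is
     * treated as MMIO emulation; anything else is a real page fault. */
    static int fault_is_mmio(uint64_t gpa, uint64_t mmio_start,
                             uint64_t mmio_end)
    {
        return gpa >= mmio_start && gpa < mmio_end;
    }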
>
>
> >
> > This approach brings up a number of questions. It would be
> > great if some of
> > the virtualization experts here could shed some light on them
> > (even though
> > they are not immediately related to Xen, I know):
> >
> > - How do these device memory models look like? Is there a common
> > (automata) theory behind or are they done ad hoc?
>
> Not sure what you're asking for here. The devices are either modeled
> after a REAL device (qemu-dm), and as such will resemble the REAL
> hardware device being emulated as closely as possible, or they use the
> frontend/backend driver, where there is an "idealized model": the
> request contains just the basic data that the OS normally provides to
> the driver, and it's placed in a queue with a message-signalling
> system to tell the other side that there is something in the queue.
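
A brief C sketch of the contrast between the two models, with made-up
names; the register sequence is ordinary ATA LBA28 programming, while
the idealized request carries only what the OS already knows:

    #include <stdint.h>

    /* Emulated path (qemu-dm style): the guest programs the "hardware"
     * one register at a time, just as on a real IDE controller. */
    static void ide_read_sector_emulated(void (*outb)(uint16_t, uint8_t),
                                         uint32_t lba)
    {
        outb(0x1F2, 1);                           /* sector count        */
        outb(0x1F3, lba & 0xFF);                  /* LBA low             */
        outb(0x1F4, (lba >> 8) & 0xFF);           /* LBA mid             */
        outb(0x1F5, (lba >> 16) & 0xFF);          /* LBA high            */
        outb(0x1F6, 0xE0 | ((lba >> 24) & 0x0F)); /* drive select + LBA  */
        outb(0x1F7, 0x20);                        /* READ SECTORS        */
    }

    /* Idealized path (front-end/back-end style): one message in a queue
     * plus a notification, nothing register-shaped about it. */
    struct idealized_request {
        uint64_t sector;
        uint32_t count;
        int      write;
    };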
>
> > - What kind of strategies/algorithms are used in the merge
> > phase, i.e. the
> > phase where the virtual memory model and the physical one are
> > synchronized? What kind of problems can occur in this phase?
>
> The Xen approach is to avoid this by only giving one device to each
> machine.
>
> > - Are specific usage patterns used in real-world implementations
> > (e.g. VMWare) to simplify the virtualization (model or merge phase)?
>
> This is probably the wrong list to ask detailed questions about how
> VMWare works... ;-)
>
> > - Do you have any interesting pointers to literature dealing with
> > full I/O virtualization? In particular, how does VMWare's full
> > virtualization work with respect to I/O?
>
> Again, wrong list for VMWare questions.
>
> > - Is every device time partitionable? If not, which
> > requirements does it
> > have to meet to be time partitionable?
>
> Certainly not - I would say that almost all devices are NOT time
> partitionable, as the state in the device depends on the current
> usage. The more complex the device, the more likely it is to have
> difficulties, but even a device as simple as a serial port would
> struggle to work in a time-shared fashion (not to mention that serial
> ports are generally used for multiple transactions that make up a
> bigger overall transaction; a web server connected via a serial modem,
> for example, would send a packet of several hundred bytes to the
> serial-port driver, which is then portioned out as and when the serial
> port is ready to send another few bytes. If you switch from one guest
> to another during this process, and the second guest also has
> something to send on the serial port, you'd end up with a very
> scrambled message from the first guest, and quite likely the second
> guest's message completely lost!).
>
> There are some devices that are specifically built to manage multiple
> hosts, but other than that, any sharing of a device requires some
> software to gather up "a full transaction" and then send it to the
> actual hardware, often also waiting for the transaction to complete
> (for example, the interrupt signal that says a hard-disk write is
> complete).
>
>
> > -> I don't think every device is. What about a device which supports
> > different modes of operation? If two VMs drive the virtual device in
> > different modes, it may not be possible to constantly switch between
> > them. OK, this is pretty artificial.
>
> A particular problem is devices where you can't necessarily read back
> the last mode setting, which may well be the case for many different
> devices. You can't, for example, read back all the registers on an IDE
> device, because a read of a particular address may give the status
> rather than the command most recently sent, or some such.
>
> --
> Mats
> >
> > Thanks a lot for your help!
> >
> >
> > Best wishes
> >
> > Thomas
> >
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel