xen-devel
Re: [Xen-devel] Re: Improving hvm IO performance by using self IO emulat
Mark Williamson wrote:
The big problem with disk emulation isn't IO latency, but the fact that
the IDE emulation can only have one outstanding request at a time. The
SCSI emulation helps this a lot.
IIRC, a real IDE can only have one outstanding request too (this may have
changed with AHCI). This is really IIRC :-(
Can SATA drives queue multiple outstanding requests? Thought some newer rev
could, but I may well be misremembering - in any case we'd want something
that was well supported.
SATA can, yes. However, as you mention, SATA is very poorly supported.
The LSI scsi adapter seems to work quite nicely with Windows and Linux.
And it supports TCQ. And it's already implemented :-) Can't really
beat that :-)
I don't know what the bottleneck is in network emulation, but I suspect
the number of copies we have in the path has a great deal to do with it.
This reason seems obvious.
Latency may matter more to the network performance than it did to block,
actually (especially given our current setup is fairly pessimal wrt
latency!). It would be interesting to see how much difference this makes.
In any case, copies are bad too :-) Presumably, hooking directly into the
paravirt network channel would improve this situation too.
Perhaps the network device ought to be the first to move?
Can't say. I haven't done much research on network performance.
There's a lot to like about this sort of approach. It's not a silver
bullet wrt performance but I think the model is elegant in many ways.
An interesting place to start would be lapic/pit emulation. Removing
this code from the hypervisor would be pretty useful and there is no
need to address PV-on-HVM issues.
Indeed this is the simpler code to move. But why would it be useful?
It might be a good proof of concept, and it simplifies the hypervisor (and the
migration / suspend process) at the same time.
Does the firmware get loaded as an option ROM or is it a special portion
of guest memory that isn't normally reachable?
IMHO it should come with hvmload. No need to make it unreachable.
Mmmm. It's not like the guest can break security if it tampers with the
device models in its own memory space.
Question: how does this compare with using a "stub domain" to run the device
models? The previous proposed approach was to automatically switch to the
stub domain on trapping an IO by the HVM guest, and have that stub domain run
the device models, etc.
Reflecting is a bit more expensive than doing a stub domain. There is
no way to wire up the VMEXITs to go directly into the guest so you're
always going to have to pay the cost of going from guest => host =>
guest => host => guest for every PIO. The guest is incapable of
reenabling PG on its own hence the extra host => guest transition.
Compare to stub domain where, if done correctly, you can go from guest
=> host/0 => host/3 => host/0 => guest. The question would be, is
host/0 => host/3 => host/0 fundamentally faster than host => guest => host.
I know that guest => host => guest typically costs *at least* 1000 nsecs
on SVM. A null sysenter syscall (that's host/3 => host/0 => host/3) is
roughly 75 nsecs.
So my expectation is that stub domain can actually be made to be faster
than reflecting.
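
As a rough back-of-the-envelope sketch of that comparison, the snippet below
just adds up the transition costs using the two figures quoted above. It
assumes that host/0 => host/3 => host/0 costs about the same as a null
sysenter syscall (which is measured in the opposite direction), and it
ignores everything else: the device model's own run time, address space
switches, cache effects. The function and constant names are made up for
illustration.

```python
# Per-PIO transition cost estimate, using the numbers quoted in this mail.
GUEST_ROUND_TRIP_NS = 1000  # guest => host => guest on SVM ("at least")
RING_ROUND_TRIP_NS = 75     # null sysenter syscall, host/3 => host/0 => host/3


def reflecting_cost_ns(guest_rt=GUEST_ROUND_TRIP_NS):
    """guest => host => guest => host => guest: two full guest round trips."""
    return 2 * guest_rt


def stub_domain_cost_ns(guest_rt=GUEST_ROUND_TRIP_NS, ring_rt=RING_ROUND_TRIP_NS):
    """guest => host/0 => host/3 => host/0 => guest.

    One guest round trip plus one ring 0 <=> ring 3 round trip.
    ASSUMPTION: the ring round trip costs roughly what a null syscall does.
    """
    return guest_rt + ring_rt


if __name__ == "__main__":
    print("reflecting :", reflecting_cost_ns(), "ns (lower bound)")
    print("stub domain:", stub_domain_cost_ns(), "ns (rough estimate)")
```

With these inputs the reflecting path comes out at 2000 nsecs or more, and
the stub domain path at roughly 1075 nsecs, which is where the expectation
above comes from. The real numbers would have to be measured.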
Regards,
Anthony Liguori
You seem to be actually proposing running the code within the HVM guest
itself. The two approaches aren't actually that different, IMO, since the
guest still effectively has two different execution contexts. It does seem
to me that running within the HVM guest itself might be more flexible.
A cool little trick that this strategy could enable is to run a full Qemu
instruction emulator within the device model - I'd imagine this could be
useful on IA64, for instance, in order to provide support for running legacy
OSes (e.g. for x86, or *cough* PPC ;-))
Cheers,
Mark
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel