On Mon, May 09, 2011 at 10:00:30AM +0100, Ian Campbell wrote:
> On Wed, 2011-05-04 at 15:17 +0100, Konrad Rzeszutek Wilk wrote:
> > Hello,
> >
> > This set of v3 patches allows a PV domain to see the machine's
> > E820, figure out where the "PCI I/O" gap is, and match its own
> > memory layout to reality.
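> >
> > A rough sketch of the libxc flow (a hypothetical helper, not the
> > actual libxl__e820_alloc; error handling and the RAM clamping that
> > libxl__e820_sanitize performs are omitted):
> >
> >   #include <xenctrl.h>
> >   #include "xc_e820.h"   /* struct e820entry, E820MAX */
> >
> >   /* Mirror the host's E820 into a PV guest so the guest's idea of
> >    * the PCI I/O gap matches the machine's. */
> >   static int mirror_host_e820(xc_interface *xch, uint32_t domid)
> >   {
> >       struct e820entry map[E820MAX];
> >       int nr = xc_get_machine_memory_map(xch, map, E820MAX);
> >       if (nr < 0)
> >           return nr;
> >       /* The real code must first clamp the E820_RAM entries to the
> >        * guest's actual memory allocation. */
> >       return xc_domain_set_memory_map(xch, domid, map, nr);
> >   }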
> >
> > Changelog since v2 posting:
> > - Moved 'libxl__e820_alloc' to be called from do_domain_create, and
> >   only if machine_e820 == true.
> > - Made no_machine_e820 be set to true if the guest has no PCI
> >   devices (and is PV).
> > - Used Keir's re-worked code for E820 creation.
> > Changelog since v1 posting:
> > - Squashed "x86: make the pv-only e820 array be dynamic" and
> >   "x86: adjust the size of the e820 for pv guest to be dynamic"
> >   together.
> > - Made xc_domain_set_memmap_limit use 'xc_domain_set_memory_map'.
> > - Moved 'libxl_e820_alloc' and 'libxl_e820_sanitize' to be internal
> >   operations, called from 'libxl_device_pci_parse_bdf'.
> > - Expanded the 'libxl_device_pci_parse_bdf' API call to take an
> >   extra (optional) argument.
> >
> > The upshot is that with these patches a PV domain can:
> >
> > - Use the correct PCI I/O gap. Before these patches, a Linux guest
> >   would boot up and report:
> >     [ 0.000000] Allocating PCI resources starting at 40000000 (gap: 40000000:c0000000)
> >   while in actuality the PCI I/O gap should have been:
> >     [ 0.000000] Allocating PCI resources starting at b0000000 (gap: b0000000:4c000000)
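> >
> >   For reference, Linux picks this gap by scanning the E820 for a
> >   hole below 4GB. A simplified sketch, not the kernel's exact code
> >   (the real e820_setup_gap() hunts for the largest hole; the
> >   struct e820entry/E820_RAM types are as in libxc's xc_e820.h):
> >
> >     #include <stdint.h>
> >
> >     /* Simplified: the PCI gap starts where usable RAM below 4GB
> >      * ends, and runs from there up to the 4GB boundary. */
> >     static uint64_t pci_gap_start(const struct e820entry *map, int nr)
> >     {
> >         uint64_t end = 0;
> >         for (int i = 0; i < nr; i++) {
> >             uint64_t e = map[i].addr + map[i].size;
> >             if (map[i].type == E820_RAM &&
> >                 map[i].addr < (1ULL << 32) && e > end)
> >                 end = e;
> >         }
> >         return end;
> >     }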
>
> The reason it needs to be a particular gap is that we can't (easily? at
> all?) rewrite the device BARs to match the guest's idea of the hole, is
> that right? So it needs to be consistent with the underlying host hole.
Correct.
>
> I wonder if it is time to enable IOMMU for PV guests by default.
Would be nice. I thought if the IOMMU was present it would automatically do that?
> Presumably in that case we can manufacture any hole we like in the e820,
> which is useful e.g. when migrating to not-completely-homogeneous hosts.
Hmm. I want to say yes, but I am not entirely sure which pieces this
would entail.
>
> > - The PV domain with PCI devices was limited to 3GB. It can now be
> >   booted with 4GB, 8GB, or whatever amount you want. The PCI devices
> >   will now _not_ conflict with System RAM, meaning the drivers can
> >   load.
> >
> > - With 2.6.39 kernels (which have the 1-1 mapping code), the VM_IO
> >   flag will now automatically be applied to regions that are
> >   considered PCI I/O regions. You can find out which those are by
> >   looking for '1-1' in the kernel bootup messages.
> >
> > To use this patchset, the guest config file has to have the
> > 'pci=['<BDF>',...]' parameter set.
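> >
> > For example, a hypothetical minimal PV guest config (the BDF and
> > paths shown are just illustrations):
> >
> >   kernel = "/boot/vmlinuz-2.6.39"
> >   memory = 4096
> >   disk = [ 'phy:/dev/vg/guest,xvda,w' ]
> >   pci = [ '0000:01:00.0' ]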
> >
> > This has been tested with 2.6.18 (RHEL5), 2.6.27 (SLES11), 2.6.36,
> > 2.6.37, 2.6.38, and 2.6.39 kernels. Also tested with PV NetBSD 5.1.
> >
> > Tested with PCI devices (NIC, MSI), and with 2GB, 4GB, and 6GB
> > guests, all successfully.
> >
> >  libxc/xc_domain.c      |   77 +++++++++++-----
> >  libxc/xc_e820.h        |    3
> >  libxc/xenctrl.h        |   11 ++
> >  libxl/libxl.idl        |    1
> >  libxl/libxl_create.c   |    8 +
> >  libxl/libxl_internal.h |    1
> >  libxl/libxl_pci.c      |  230 +++++++++++++++++++++++++++++++++++++++++++++++++
> >  libxl/xl_cmdimpl.c     |    3
> >  8 files changed, 309 insertions(+), 25 deletions(-)
> >
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel