On Fri, Apr 08, 2011 at 09:36:45AM +0100, Ian Campbell wrote:
> On Thu, 2011-04-07 at 21:25 +0100, Konrad Rzeszutek Wilk wrote:
> > # HG changeset patch
> > # User Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> > # Date 1302206363 14400
> > # Node ID 2e464234c94cfd29a98a9011a46d76846b87f7f8
> > # Parent 546d8a03d5cbe0ceddadf701174f2417a0b72891
> > libxl: Add support for passing in the machine's E820 for PCI passthrough.
> >
> > The code (libxl_e820_alloc) calls the xc_get_machine_memory_map to
> > retrieve the systems E820. Then the E820 is sanitized to weed out E820
> > entries below 16MB, and as well remove any E820_RAM or E820_UNUSED
> > regions as the guest does not need to know abou them. The guest
> > only needs the E820_ACPI, E820_NVS, E820_RESERVED to get an idea of
> > where the PCI I/O space is. Mostly.. The Linux kernel assumes that any
> > gap in the E820 is considered PCI I/O space which means that if we pass
> > in the guest 2GB, and the E820_ACPI, and its friend start at 3GB, the
> > gap between 2GB and 3GB will be considered as PCI I/O space. To guard
> > against
> > that we also create an E820_UNUSABLE between the region of 'target_kb'
> > (called ram_end in the code) up to the first E820_[ACPI,NVS,RESERVED]
> > region.
> >
> > Memory that is slack or for balloon (so 'maxmem' in guest configuration)
> > is put behind the machine E820. Which in most cases is after the 4GB.
> >
> > The reason for doing the fetching of the E820 using the hypercall in
> > the toolstack (instead of the guest doing it) is that when a Linux
> > guest would do a hypercall to 'XENMEM_machine_memory_map' it would
> > retrieve an E820 with I/O range caps added in. Meaning that the
> > region after 4GB up to end of possible memory would be marked as unusuable
> > and the Linux kernel would not have any space to allocate a balloon
> > region.
> >
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> >
> > diff -r 546d8a03d5cb -r 2e464234c94c tools/libxl/libxl.h
> > --- a/tools/libxl/libxl.h Thu Apr 07 15:18:46 2011 -0400
> > +++ b/tools/libxl/libxl.h Thu Apr 07 15:59:23 2011 -0400
> > @@ -204,6 +204,14 @@
> > } libxl_file_reference;
> > void libxl_file_reference_destroy(libxl_file_reference *p);
> >
> > +#include <xc_e820.h>
>
> We are trying to avoid exposing libxc headers to libxl users. struct
> e820entry is well defined and unchanging so I don't think there is much
> harm in duplicating it.
OK.
>
> > +typedef struct {
> > + uint32_t nr_entries;
> > + struct e820entry *entry;
> > +} libxl_e820;
>
> BTW does the e820entry array need to be packed at some point before
> getting returned to the guest? (I'm not sure what the expected alignment
The struct e820entry has the __packed__ attribute on it.
> is, nor how a 20 byte struct containing two u64s and a u32 naturally
> aligns on x86_64).
It seems to work OK.. (I booted a 32-bit guest under a 64-bit hypervisor
with a 64-bit dom0).
>
> > diff -r 546d8a03d5cb -r 2e464234c94c tools/libxl/libxl.idl
> > --- a/tools/libxl/libxl.idl Thu Apr 07 15:18:46 2011 -0400
> > +++ b/tools/libxl/libxl.idl Thu Apr 07 15:59:23 2011 -0400
> > @@ -22,6 +22,7 @@
> >
> > libxl_hwcap = Builtin("hwcap")
> >
> > +libxl_e820 = Builtin("e820", destructor_fn="libxl_e820_destroy",
> > passby=PASS_BY_REFERENCE)
>
> Hmmm, We are starting to collect a lot of builtins of the form {len,
> pointer}. Time to start thinking of a new type class abstraction, I
> think... (not your problem in this series though)
>
> > diff -r 546d8a03d5cb -r 2e464234c94c tools/libxl/libxl_pci.c
> > --- a/tools/libxl/libxl_pci.c Thu Apr 07 15:18:46 2011 -0400
> > +++ b/tools/libxl/libxl_pci.c Thu Apr 07 15:59:23 2011 -0400
> > @@ -37,6 +37,8 @@
> > #include "libxl_internal.h"
> > #include "flexarray.h"
> >
> > +#include <xc_e820.h>
> > +
> > #define PCI_BDF "%04x:%02x:%02x.%01x"
> > #define PCI_BDF_SHORT "%02x:%02x.%01x"
> > #define PCI_BDF_VDEVFN "%04x:%02x:%02x.%01x@%02x"
> > @@ -1047,3 +1049,154 @@
> > free(pcidevs);
> > return 0;
> > }
> > +
> > +#define E820MAX (128)
>
> Should be somewhere public for the benefit of libxl users who want to
> setup an e820? Even if it should be private it should be in a .h.
Aye aye.
>
> Are we expecting that libxl users might want to modify the e820? If not
> then why expose libxl_e820_alloc and libxl_e820_sanitize at all, just
> add a flag to the libxl interface which says whether or not to provide a
> host-derived e820.
OK. Where should I put the definitions of these two functions? (They are
called from libxl_dom.c and xl_cmdimpl.c) Is there a internal header file?
>
> If users are expected to modify it then I'm in two minds about doing the
> sanitize step at domain build time instead of in libxl_e820_alloc. If we
The issue I had was that to construct the e820, I needed the
'target_kb' and 'max_memkb' and 'slack_memkb'. If I can get those two values
when the guest config file is parsed (so xl_cmdimpl.c) I think moving
the sanitization in there (and not doing it during domain build time)
is the right thing to do. Let me see if I can do that.
> provide a default sanitized e820 to the caller and they want to modify
> it to be non-santized should we let them? I guess the question here is
> whether we are sanitizing actually invalid e820 maps or if we are also
> enforcing some higher level policy based on a specific class Linux
> guest's requirements.
Which might be quite different if the guest is a BSD one. I guess
it is time to start launching those NetBSD guests :-)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|