Xen project Mailing List

Re: [PATCH 11/17] hvmloader: allocate MMCONFIG area in the MMIO hole

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

From: Thierry Escande <thierry.escande@xxxxxxxxxx>

Date: Fri, 12 Jun 2026 12:01:54 +0200

Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=vates.tech header.i="@vates.tech" header.h="From:Subject:Date:Message-ID:To:Cc:MIME-Version:Content-Type:In-Reply-To:References:Feedback-ID"

Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Alexey Gerasimenko <x1917x@xxxxxxxxx>

Delivery-date: Fri, 12 Jun 2026 10:01:59 +0000

Feedback-id: default:8631fc262581453bbf619ec5b2062170:Sweego

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 4/29/26 11:29, Roger Pau Monné wrote: > On Fri, Mar 13, 2026 at 04:35:04PM +0000, Thierry Escande wrote: >> The actual MMCONFIG size depends on the number of PCI buses available >> which should be covered by ECAM. Possible options are 64MB, 128MB and >> 256MB. > > Are such values inherited from the real q35 impleemntation? > > AFAICT the ACPI MCFG spec notes: > > "The size of the memory mapped configuration region is indicated by > the start and end bus number fields in the Memory mapped Enhanced > configuration space base address allocation structure as shown in > Table 4-3. 0-255 is the range of allowed bus numbers supported for a > given PCI Segment Group." > > So it's in principle possible to specify a MCFG that covers a single > bus, and then it would have a size of 256 * 4K = 1M. Which avoids > wasting 63M of MMIO space in the low MMIO hole that's already fairly > tight on space. > > Is this limitation possibly inherited from the way the ECAM region > position and size must be notified to the chipset? > > And further seeing the code below - I found the answer myself, it's > because the chipset only supports negotiation those ECAM sizes. We > could possibly expose a smaller region in MCFG, but doesn't seem like > a good move. Yes indeed, it's a waste of space in MMIO hole. Maybe it's ok to only allocate the 1MB needed for 1 bus since there would be no reason for a guest to access passed the 1st MB. I can give it a try but that's possibly risky... > >> As Xen is limited to the bus 0 currently, the lowest possible >> setting is used (64MB), defined via PCI_MAX_MCFG_BUSES in >> hvmloader/config.h. When multiple PCI buses support for Xen will be >> implemented, PCI_MAX_MCFG_BUSES may be replaced by a calculation of the >> number of buses according to PCI devices enumeration. >> >> The MMCONFIG entry is inserted into bars array in the same manner like >> for any other BARs. In this case, the devfn field will point to MCH PCI >> device and bar_reg will contain PCIEXBAR register offset. It will be >> assigned a slot in the MMIO hole later in a very same way like for plain >> PCI BARs, with respect to its size and alignment. At this point, the >> actual base address and size of the ECAM space are passed to Xen using >> the HVMOP_set_ecam_space hypercall. >> >> Signed-off-by: Alexey Gerasimenko <x1917x@xxxxxxxxx> >> Signed-off-by: Thierry Escande <thierry.escande@xxxxxxxxxx> >> --- >> tools/firmware/hvmloader/config.h | 4 +++ >> tools/firmware/hvmloader/pci.c | 55 +++++++++++++++++++++++++++++ >> tools/firmware/hvmloader/pci_regs.h | 7 ++++ >> 3 files changed, 66 insertions(+) >> >> diff --git a/tools/firmware/hvmloader/config.h >> b/tools/firmware/hvmloader/config.h >> index baaed91c7f..aa3158bca5 100644 >> --- a/tools/firmware/hvmloader/config.h >> +++ b/tools/firmware/hvmloader/config.h >> @@ -55,6 +55,10 @@ extern uint32_t *cpu_to_apicid; >> #define PCI_ISA_DEVFN 0x08 /* dev 1, fn 0 */ >> #define PCI_ISA_IRQ_MASK 0x0c20U /* ISA IRQs 5,10,11 are PCI connected */ >> #define PCI_ICH9_LPC_DEVFN 0xf8 /* dev 31, fn 0 */ >> +#define PCI_MCH_DEVFN 0 /* bus 0, dev 0, func 0 */ >> + >> +/* possible values are: 64, 128, 256 */ >> +#define PCI_MAX_MCFG_BUSES 64 >> >> #define ACPI_TIS_HDR_ADDRESS 0xFED40F00UL >> >> diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c >> index 6e6720adae..54c23ffdd8 100644 >> --- a/tools/firmware/hvmloader/pci.c >> +++ b/tools/firmware/hvmloader/pci.c >> @@ -413,6 +413,58 @@ void pci_setup(void) >> pci_devfn_decode_type[devfn] |= PCI_COMMAND_MASTER; >> } >> >> + /* >> + * Calculate MMCONFIG area size and squeeze it into the bars array >> + * for assigning a slot in the MMIO hole >> + */ >> + if ( is_running_on_q35 ) >> + { >> + /* disable PCIEXBAR decoding for now */ >> + pci_writel(PCI_MCH_DEVFN, PCI_MCH_PCIEXBAR, 0); >> + pci_writel(PCI_MCH_DEVFN, PCI_MCH_PCIEXBAR + 4, 0); >> + >> + switch ( PCI_MAX_MCFG_BUSES ) >> + { >> + case 64: >> + bar_data = PCIEXBAR_64_BUSES | PCIEXBAR_ENABLE; >> + bar_sz = MB(64); >> + break; >> + >> + case 128: >> + bar_data = PCIEXBAR_128_BUSES | PCIEXBAR_ENABLE; >> + bar_sz = MB(128); >> + break; >> + >> + case 256: >> + bar_data = PCIEXBAR_256_BUSES | PCIEXBAR_ENABLE; >> + bar_sz = MB(256); >> + break; >> + >> + default: >> + /* unsupported number of buses specified */ >> + BUG(); >> + } >> + >> + addr_mask = ~(bar_sz - 1); >> + >> + for ( i = 0; i < nr_bars; i++ ) >> + if ( bars[i].bar_sz < bar_sz ) >> + break; >> + >> + if ( i != nr_bars ) >> + memmove(&bars[i+1], &bars[i], (nr_bars-i) * sizeof(*bars)); >> + >> + bars[i].is_mem = 1; >> + bars[i].devfn = PCI_MCH_DEVFN; >> + bars[i].bar_reg = PCI_MCH_PCIEXBAR; >> + bars[i].bar_sz = bar_sz; >> + bars[i].addr_mask = addr_mask; >> + bars[i].bar_data = bar_data; >> + >> + mmio_total += bar_sz; >> + nr_bars++; >> + } > > I think it might be best if the ECAM fake BAR is the first element in > the bars array, so we ensure it's the first item to consume memory > from the low MMIO hole. Not sure how that will work with the current > sorting of the resources based on their size, but it's imperative for > hvmloader to attempt to position ECAM ahead of the other device > resources IMO. With a size of 64MB it's always placed first from what I can tell. I don't get why it is imperative. Would it be to make sure that it is actually allocated in the first 4GB? > >> + >> if ( mmio_hole_size ) >> { >> uint64_t max_ram_below_4g = GB(4) - mmio_hole_size; >> @@ -592,6 +644,9 @@ void pci_setup(void) >> } >> } >> >> + if ( bar_reg == PCI_MCH_PCIEXBAR ) >> + hvm_set_ecam_space(base, bar_sz); > > As noted in a previous patch, it would be better if it's QEMU (as part > of handling the PCI_MCH_PCIEXBAR writes) that notifies Xen of the ECAM > window placement. Yes, as mentioned earlier I can do it that way. Regards, -- Thierry Escande | Vates XCP-ng Developer XCP-ng & Xen Orchestra - Vates solutions web: https://vates.tech

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.