[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v3 08/23] xen/arm: vsmmuv3: Add support for registers emulation
- To: Julien Grall <julien@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
- From: Milan Djokic <milan_djokic@xxxxxxxx>
- Date: Mon, 18 May 2026 02:17:20 +0200
- Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=epam.com; dmarc=pass action=none header.from=epam.com; dkim=pass header.d=epam.com; arc=none
- Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=CrqlGz1pKadxC9XvBOPz1xCG+Yo0D73Ex0VCjASJ+8c=; b=KqoZDj0Rbuh/4xquKh5QqSco2rXqrRJUIsD0+qOyzVuY2oLkfzjYNjWe9t+UX5q2etamhV7f+DZYPmwy/SZOtN+VI46//iT79wkuEXPVkb550H3ITD8m87M1uKmFQ9GV3CtTbFHWHzvRiU1nPPjbl1myZPZ999P7IxlNJVkshAIOk+Hyo0oOB54e8vs6638dJ04d0Pn91xgD01d7EyjddTXmTa/t8I3uMFAb6Y/2bnDQzoznZmKFhsgE3Y0umjuORZ6AEJxV3SXMQmxyzqqoeGuVQH6x248xzXhJxTbA/YWk88X3TZ+N/FEaU++KucTwn9YBRHcrMmsLZbiCmJBx+Q==
- Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=GsUpVRoMsXDDYGRBezMihzTkiGy1R8eciqUuwLc9S2xTFCLSyOOtCjfl/gFBiXlk90Vz9Q2r1Q5ODxVdd+ldIXIf1sgUvwMWCgwu9FptP0X3ZO3xOINvvKap9tF0Al1u9uMZuLxogzFupE81CBNxEUkoxyQFRgXtpr6i2JE3EUI0zJdY4+DWHVJmZsoWzzf56LGe9PV1faFsSGhBgCYlycZahxmpaK4uJUq9aJ+yIo0i/iO1V74z00yonkrP5ldDP+zu41pght0KpkEp6H6ECRjkku5TRZQcHOOr3KAbttPiw9dIMa6E8iN1p7OHW8+3SlwqkayxFO/fpAJHIrWBTQ==
- Authentication-results: eu.smtp.expurgate.cloud; dkim=pass header.s=selector1 header.d=epam.com header.i="@epam.com" header.h="From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck"
- Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=epam.com;
- Cc: Rahul Singh <rahul.singh@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Bertrand Marquis <bertrand.marquis@xxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
- Delivery-date: Mon, 18 May 2026 00:17:36 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
Hi Julien,
On 4/14/26 10:10, Julien Grall wrote:
Hi Milan,
On 31/03/2026 10:52, Milan Djokic wrote:
From: Rahul Singh <rahul.singh@xxxxxxx>
Add initial support for various emulated registers for virtual SMMUv3
for guests and also add support for virtual cmdq and eventq.
Signed-off-by: Rahul Singh <rahul.singh@xxxxxxx>
Signed-off-by: Milan Djokic <milan_djokic@xxxxxxxx>
---
xen/drivers/passthrough/arm/smmu-v3.h | 6 +
xen/drivers/passthrough/arm/vsmmu-v3.c | 286 +++++++++++++++++++++++++
2 files changed, 292 insertions(+)
diff --git a/xen/drivers/passthrough/arm/smmu-v3.h
b/xen/drivers/passthrough/arm/smmu-v3.h
index 3fb13b7e21..fab4fd5a26 100644
--- a/xen/drivers/passthrough/arm/smmu-v3.h
+++ b/xen/drivers/passthrough/arm/smmu-v3.h
@@ -60,6 +60,12 @@
#define IDR5_VAX GENMASK(11, 10)
#define IDR5_VAX_52_BIT 1
+#define ARM_SMMU_IIDR 0x18
+#define IIDR_PRODUCTID GENMASK(31, 20)
+#define IIDR_VARIANT GENMASK(19, 16)
+#define IIDR_REVISION GENMASK(15, 12)
+#define IIDR_IMPLEMENTER GENMASK(11, 0)
+
#define ARM_SMMU_CR0 0x20
#define CR0_ATSCHK (1 << 4)
#define CR0_CMDQEN (1 << 3)
diff --git a/xen/drivers/passthrough/arm/vsmmu-v3.c
b/xen/drivers/passthrough/arm/vsmmu-v3.c
index e36f200ba5..3ae1e62a50 100644
--- a/xen/drivers/passthrough/arm/vsmmu-v3.c
+++ b/xen/drivers/passthrough/arm/vsmmu-v3.c
@@ -3,25 +3,307 @@
#include <xen/param.h>
#include <xen/sched.h>
#include <asm/mmio.h>
+#include <asm/vgic-emul.h>
vgic-emul.h is intended to only be used in the vGIC code. I am fine if
you want to use it in vsmmu-v3.c but it needs to be renamed. Maybe to
vdev-emul.h.
Sure, I'll rename it
#include <asm/viommu.h>
+#include <asm/vreg.h>
+
+#include "smmu-v3.h"
+
+/* Register Definition */
+#define ARM_SMMU_IDR2 0x8
+#define ARM_SMMU_IDR3 0xc
+#define ARM_SMMU_IDR4 0x10
+#define IDR0_TERM_MODEL (1 << 26)
+#define IDR3_RIL (1 << 10)
+#define CR0_RESERVED 0xFFFFFC20
AFAIU, this is covering all the bits defined by the SMMU spec. But some
of them are optional. Does this mean we will expose those optional features?
Right now only mandatory features are supported (SMMU_EN, CMDQ, EVTQ).
Most of the optional features are not advertised in the IDR registers,
so guests are not expected to enable or use them via CR0.
+#define SMMU_IDR1_SIDSIZE 16
+#define SMMU_CMDQS 19
Can you add some details how you decided the size of the command and ...
+#define SMMU_EVTQS 19
... even queues?
The CMDQ/EVTQ sizes are currently set to the architectural maximum.
Since there is no direct dependency on the underlying hardware queue
sizes, using the maximum supported value seemed like the simplest option.
+#define DWORDS_BYTES 8
+#define ARM_SMMU_IIDR_VAL 0x12
I am not sure which implementer this is referring to. But how do you
plan to handle errata? Are we sure they can always be handled by Xen?
This is currently a dummy value used to avoid triggering guest driver
errata/quirk paths. I will replace it with a more meaningful value.
Using the Arm implementer ID with the remaining fields cleared should be
sufficient.
My expectation is that errata handling should remain in Xen rather than
the guest. Exposing real implementer/product IDs could cause the guest
driver to enable errata workarounds that may not apply correctly due to
the vSMMU emulation layer. For the current implementation this seems
sufficient, although there might be some updates in the future in this
regard.
/* Struct to hold the vIOMMU ops and vIOMMU type */
extern const struct viommu_desc __read_mostly *cur_viommu;
+/* virtual smmu queue */
+struct arm_vsmmu_queue {
+ uint64_t q_base; /* base register */
+ uint32_t prod;
+ uint32_t cons;
+ uint8_t ent_size;
+ uint8_t max_n_shift;
+};
+
struct virt_smmu {
struct domain *d;
struct list_head viommu_list;
+ uint8_t sid_split;
+ uint32_t features;
+ uint32_t cr[3];
+ uint32_t cr0ack;
+ uint32_t gerror;
+ uint32_t gerrorn;
+ uint32_t strtab_base_cfg;
+ uint64_t strtab_base;
+ uint32_t irq_ctrl;
+ uint64_t gerror_irq_cfg0;
+ uint64_t evtq_irq_cfg0;
+ struct arm_vsmmu_queue evtq, cmdq;
};
static int vsmmuv3_mmio_write(struct vcpu *v, mmio_info_t *info,
register_t r, void *priv)
{
+ struct virt_smmu *smmu = priv;
+ uint64_t reg;
+ uint32_t reg32;
Looking at this helper and the read one, I am bit surprised there is no
lock taken nor we check the access size. Can you explain why?
For instance, we should not allow 64-bit access on 32-bit register. The
rest of the size (8-bit and 16-bit) is IMP DEFINED so it may be easier
just not allow them.
Most of the configuration registers are expected to be accessed in a
serialized manner by the guest driver, during driver initialization. So
I think that only queue and error reporting registers should be
protected. Only command queue handling is protected currently. I will do
the same for the event queue and gerror, this is missing.
As for the access size checks, yes, this needs to be implemented. I will
add it.
+
+ switch ( info->gpa & 0xffff )
+ {
+ case VREG32(ARM_SMMU_CR0):
> + reg32 = smmu->cr[0];> + vreg_reg32_update(®32, r,
info);
+ smmu->cr[0] = reg32;
+ smmu->cr0ack = reg32 & ~CR0_RESERVED;
+ break;
+
+ case VREG32(ARM_SMMU_CR1):
+ reg32 = smmu->cr[1];
+ vreg_reg32_update(®32, r, info);
+ smmu->cr[1] = reg32;
+ break;
+
+ case VREG32(ARM_SMMU_CR2):
+ reg32 = smmu->cr[2];
+ vreg_reg32_update(®32, r, info);
+ smmu->cr[2] = reg32;
+ break;
+
+ case VREG64(ARM_SMMU_STRTAB_BASE):
Looking at the SMMU spec (6.3.24 in ARM IHI 0070 H.a), the behavior of
writing to the register is constrained unpredictable before SMMUv3.2,
but after it should be ignored if SMMU_CR0.SMMUEN == 1.
So this implementation would not be valid for SMMUv3.2 and later. For
convenience it would be best to just ignore the write (which is also
valid for SMMUv3.1 and ealier).
+ reg = smmu->strtab_base;
+ vreg_reg64_update(®, r, info);
+ smmu->strtab_base = reg;
+ break;
+
+ case VREG32(ARM_SMMU_STRTAB_BASE_CFG):
Similar to above, there are some conditions when this field can be
written (see 6.3.25).
Yes, both writes should be guarded by SMMU_CR0.SMMUEN. I will add this.
+ reg32 = smmu->strtab_base_cfg;
+ vreg_reg32_update(®32, r, info);
+ smmu->strtab_base_cfg = reg32;
+
+ smmu->sid_split = FIELD_GET(STRTAB_BASE_CFG_SPLIT, reg32);
The information for sid_split is already stored in
``smmu->strtab_base_cfg``. So why do we need to store it differently?
No specific need for a separate structure field, I will remove it and
extract the SID split from ``smmu->strtab_base_cfg``
+ smmu->features |= STRTAB_BASE_CFG_FMT_2LVL;
I haven't checked the rest of the code yet. But from the name, I would
assume it indicates whether 2-level stream table is supported. From my
understanding of the specification, this is selectable by the guest OS.
So why is this unconditionally set?
I think this is a leftover from the initial implementation where the
value was hardcoded. I will fix this and derive the field from the
guest-provided configuration instead.
+ break;
+
+ case VREG32(ARM_SMMU_CMDQ_BASE):
Similar to above, there are some condition when this field is RO.
+ reg = smmu->cmdq.q_base;
+ vreg_reg64_update(®, r, info);
+ smmu->cmdq.q_base = reg;
+ smmu->cmdq.max_n_shift = FIELD_GET(Q_BASE_LOG2SIZE, smmu->cmdq.q_base);
+ if ( smmu->cmdq.max_n_shift > SMMU_CMDQS )
+ smmu->cmdq.max_n_shift = SMMU_CMDQS;
+ break;
+
+ case VREG32(ARM_SMMU_CMDQ_PROD):
+ reg32 = smmu->cmdq.prod;
+ vreg_reg32_update(®32, r, info);
+ smmu->cmdq.prod = reg32;
AFAIU, this implementation is not yet complete. If so, it would be good
to mark it as such with a comment of BUG_ON("Not yet implemented"). Same
for everywhere in this file and the rest of the series.
+ break;
+
+ case VREG32(ARM_SMMU_CMDQ_CONS):
+ reg32 = smmu->cmdq.cons;
+ vreg_reg32_update(®32, r, info);
+ smmu->cmdq.cons = reg32;
+ break;
+
+ case VREG32(ARM_SMMU_EVTQ_BASE):
+ reg = smmu->evtq.q_base;
+ vreg_reg64_update(®, r, info);
+ smmu->evtq.q_base = reg;
+ smmu->evtq.max_n_shift = FIELD_GET(Q_BASE_LOG2SIZE, smmu->evtq.q_base);
+ if ( smmu->cmdq.max_n_shift > SMMU_EVTQS )
+ smmu->cmdq.max_n_shift = SMMU_EVTQS;
+ break;
+
+ case VREG32(ARM_SMMU_EVTQ_PROD):
+ reg32 = smmu->evtq.prod;
+ vreg_reg32_update(®32, r, info);
+ smmu->evtq.prod = reg32;
+ break;
+
+ case VREG32(ARM_SMMU_EVTQ_CONS):
+ reg32 = smmu->evtq.cons;
+ vreg_reg32_update(®32, r, info);
+ smmu->evtq.cons = reg32;
+ break;
+
+ case VREG32(ARM_SMMU_IRQ_CTRL):
+ reg32 = smmu->irq_ctrl;
+ vreg_reg32_update(®32, r, info);
+ smmu->irq_ctrl = reg32;
+ break;
+
+ case VREG64(ARM_SMMU_GERROR_IRQ_CFG0):
+ reg = smmu->gerror_irq_cfg0;
+ vreg_reg64_update(®, r, info);
+ smmu->gerror_irq_cfg0 = reg;
+ break;
+
+ case VREG64(ARM_SMMU_EVTQ_IRQ_CFG0):
+ reg = smmu->evtq_irq_cfg0;
+ vreg_reg64_update(®, r, info);
+ smmu->evtq_irq_cfg0 = reg;
+ break;
+
+ case VREG32(ARM_SMMU_GERRORN):
+ reg = smmu->gerrorn;
+ vreg_reg64_update(®, r, info);
+ smmu->gerrorn = reg;
+ break;
+
+ default:
+ printk(XENLOG_G_ERR
+ "%pv: vSMMUv3: unhandled write r%d offset %"PRIpaddr"\n",
NIT: The vIOMMU is per-domain so it is sufficient to print "%pd".
+ v, info->dabt.reg, (unsigned long)info->gpa & 0xffff);
+ return IO_ABORT;
Per section 6 of the SMMU:
"
For all pages except Page 1, undefined register locations are RES0. For
Page 1, access to undefined/Reserved
register locations is CONSTRAINED UNPREDICTABLE and an implementation
has one of the following behaviors:
[...]
"
Here you seem to implement page0 so the default case should be write
ignore and therefore IO_HANDLED should be returned. BTW, you don't seem
to handle page1. Is this going to be handled later on?
From page1, right now only EVTQ registers are emulated. PRI is not
supported, but might be needed in the future for the PCI support (PRI
queue registers also belong to page1, but not emulated atm)
So I think that page1 will be handled when PCI support is completed.
+ }
+
return IO_HANDLED;
}
static int vsmmuv3_mmio_read(struct vcpu *v, mmio_info_t *info,
register_t *r, void *priv)
{
+ struct virt_smmu *smmu = priv;
+ uint64_t reg;
+
+ switch ( info->gpa & 0xffff )
+ {
+ case VREG32(ARM_SMMU_IDR0):
+ reg = FIELD_PREP(IDR0_S1P, 1) | FIELD_PREP(IDR0_TTF, 2) |
As the page-table will be used by the HW, shouldn't TTF reflect what the
HW supports? This would allow the vIOMMU to work for 32-bit domains.
If my understanding is correct, Xen SMMU driver only supports AArch64
table format, so I think that we can't advertise 32-bit table format in
the emulation layer even if the hardware supports it.
+ FIELD_PREP(IDR0_COHACC, 0) | FIELD_PREP(IDR0_ASID16, 1) |
Here you set COHACC to 0 which means the guest OS will have to clean the
cache every time. This is safe everywhere, but it will have an impact on
performance. I am not asking to allow COHACC when the HW supports it,
but I think a TODO would be worth.
Sure, I will add comments for this.
For ASID16, shouldn't the value be based on the HW?
Yes, this is wrong since it could advertise it when not supported.
I'll update this.
As an aside, I guess we don't allow BTM because we only expose a single
vSMMU?
Yes, since only a single instance is exposed, BTM support does not make
much sense.
+ FIELD_PREP(IDR0_TTENDIAN, 0) | FIELD_PREP(IDR0_STALL_MODEL, 1) |
For TTENDIAN, it is the same as above.
For STALL_MODEL, I think 1 is ok.
+ FIELD_PREP(IDR0_ST_LVL, 1) | FIELD_PREP(IDR0_TERM_MODEL, 1);
Overall, it feels the value set in IDR0 and IDR1 (below) needs some comment.
+ *r = vreg_reg32_extract(reg, info);
+ break;
+
+ case VREG32(ARM_SMMU_IDR1):
+ reg = FIELD_PREP(IDR1_SIDSIZE, SMMU_IDR1_SIDSIZE) |
+ FIELD_PREP(IDR1_CMDQS, SMMU_CMDQS) |
+ FIELD_PREP(IDR1_EVTQS, SMMU_EVTQS);
+ *r = vreg_reg32_extract(reg, info);
+ break;
+
+ case VREG32(ARM_SMMU_IDR2):
+ goto read_reserved;
+
+ case VREG32(ARM_SMMU_IDR3):
+ reg = FIELD_PREP(IDR3_RIL, 0);
I am not sure why we explicitely need to set RIL but not the other fields?
This is not necessary. I will update this with the regsiter cleared to 0
only.
> + *r = vreg_reg32_extract(reg, info);> + break;
+
+ case VREG32(ARM_SMMU_IDR4):
+ goto read_impl_defined;
+
+ case VREG32(ARM_SMMU_IDR5):
+ reg = FIELD_PREP(IDR5_GRAN4K, 1) | FIELD_PREP(IDR5_GRAN16K, 1) |
+ FIELD_PREP(IDR5_GRAN64K, 1) | FIELD_PREP(IDR5_OAS,
IDR5_OAS_48_BIT);
Similar to the other fields in IDR0, isn't this based on what the HW
supports?
Yes, this definitely needs to reflect the underlying hardware support.
I'll update it.
To summarize for the IDR registers, the intention was to preset the
capabilities to a common configuration matching the features currently
supported by the emulation layer. I will revisit the hardcoded fields
and derive the values from the underlying hardware where appropriate.
Thank you for the comments and insights on this.
+ *r = vreg_reg32_extract(reg, info);
+ break;
+
+ case VREG32(ARM_SMMU_IIDR):
+ *r = vreg_reg32_extract(ARM_SMMU_IIDR_VAL, info);
+ break;
+
+ case VREG32(ARM_SMMU_CR0):
+ *r = vreg_reg32_extract(smmu->cr[0], info);
+ break;
+
+ case VREG32(ARM_SMMU_CR0ACK):
+ *r = vreg_reg32_extract(smmu->cr0ack, info);
+ break;
+
+ case VREG32(ARM_SMMU_CR1):
+ *r = vreg_reg32_extract(smmu->cr[1], info);
+ break;
+
+ case VREG32(ARM_SMMU_CR2):
+ *r = vreg_reg32_extract(smmu->cr[2], info);
+ break;
+
+ case VREG32(ARM_SMMU_STRTAB_BASE):
+ *r = vreg_reg64_extract(smmu->strtab_base, info);
+ break;
+
+ case VREG32(ARM_SMMU_STRTAB_BASE_CFG):
+ *r = vreg_reg32_extract(smmu->strtab_base_cfg, info);
+ break;
+
+ case VREG32(ARM_SMMU_CMDQ_BASE):
+ *r = vreg_reg64_extract(smmu->cmdq.q_base, info);
+ break;
+
+ case VREG32(ARM_SMMU_CMDQ_PROD):
+ *r = vreg_reg32_extract(smmu->cmdq.prod, info);
+ break;
+
+ case VREG32(ARM_SMMU_CMDQ_CONS):
+ *r = vreg_reg32_extract(smmu->cmdq.cons, info);
+ break;
+
+ case VREG32(ARM_SMMU_EVTQ_BASE):
+ *r = vreg_reg64_extract(smmu->evtq.q_base, info);
+ break;
+
+ case VREG32(ARM_SMMU_EVTQ_PROD):
+ *r = vreg_reg32_extract(smmu->evtq.prod, info);
+ break;
+
+ case VREG32(ARM_SMMU_EVTQ_CONS):
+ *r = vreg_reg32_extract(smmu->evtq.cons, info);
+ break;
+
+ case VREG32(ARM_SMMU_IRQ_CTRL):
+ case VREG32(ARM_SMMU_IRQ_CTRLACK):
+ *r = vreg_reg32_extract(smmu->irq_ctrl, info);
+ break;
+
+ case VREG64(ARM_SMMU_GERROR_IRQ_CFG0):
+ *r = vreg_reg64_extract(smmu->gerror_irq_cfg0, info);
+ break;
+
+ case VREG64(ARM_SMMU_EVTQ_IRQ_CFG0):
+ *r = vreg_reg64_extract(smmu->evtq_irq_cfg0, info);
+ break;
+
+ case VREG32(ARM_SMMU_GERROR):
+ *r = vreg_reg64_extract(smmu->gerror, info);
+ break;
+
+ case VREG32(ARM_SMMU_GERRORN):
+ *r = vreg_reg64_extract(smmu->gerrorn, info);
+ break;
+
+ default:
+ printk(XENLOG_G_ERR
+ "%pv: vSMMUv3: unhandled read r%d offset %"PRIpaddr"\n",
+ v, info->dabt.reg, (unsigned long)info->gpa & 0xffff);
+ return IO_ABORT;
+ }
+
+ return IO_HANDLED;
+
+ read_impl_defined:
+ printk(XENLOG_G_DEBUG
+ "%pv: vSMMUv3: RAZ on implementation defined register offset
%"PRIpaddr"\n",
+ v, info->gpa & 0xffff);
+ *r = 0;
+ return IO_HANDLED;
+
+ read_reserved:
+ printk(XENLOG_G_DEBUG
+ "%pv: vSMMUv3: RAZ on reserved register offset %"PRIpaddr"\n",
+ v, info->gpa & 0xffff);
+ *r = 0;
return IO_HANDLED;
}
@@ -39,6 +321,10 @@ static int vsmmuv3_init_single(struct domain *d, paddr_t addr, paddr_t size)
return -ENOMEM;
smmu->d = d;
+ smmu->cmdq.q_base = FIELD_PREP(Q_BASE_LOG2SIZE, SMMU_CMDQS);
+ smmu->cmdq.ent_size = CMDQ_ENT_DWORDS * DWORDS_BYTES;
+ smmu->evtq.q_base = FIELD_PREP(Q_BASE_LOG2SIZE, SMMU_EVTQS);
+ smmu->evtq.ent_size = EVTQ_ENT_DWORDS * DWORDS_BYTES;
I understand why we initialize ent_size. But I am not sure to understand
why we need to initialize q_base. Can you clarify?
Actually there's no need to initialize q_base fields, these will be set
during configuration. I will remove this.
register_mmio_handler(d, &vsmmuv3_mmio_handler, addr, size, smmu);
Cheers,
Best regards,
Milan
|