[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 14/26] xen/riscv: add very early virtual APLIC (vAPLIC) initialization support





On 6/3/26 4:54 PM, Jan Beulich wrote:
On 08.05.2026 16:43, Oleksii Kurochko wrote:
--- a/xen/arch/riscv/aplic.c
+++ b/xen/arch/riscv/aplic.c
@@ -295,6 +295,11 @@ static void cf_check aplic_set_irq_type(struct irq_desc 
*desc,
      spin_unlock(&aplic.lock);
  }
+static unsigned int cf_check aplic_irq_num(void)
+{
+    return aplic_info.num_irqs;
+}
+
  static const hw_irq_controller aplic_xen_irq_type = {
      .typename     = "aplic",
      .startup      = aplic_irq_startup,
@@ -309,6 +314,7 @@ static const struct intc_hw_operations aplic_ops = {
      .host_irq_type       = &aplic_xen_irq_type,
      .handle_interrupt    = aplic_handle_interrupt,
      .set_irq_type        = aplic_set_irq_type,
+    .irq_nums            = aplic_irq_num,

Hook handler names and respective field names would preferably match up. It's
unclear why the field uses some kind of plural(?), while the function uses
singular.

I will do s/aplic_irq_num/aplic_irq_nums.


--- a/xen/arch/riscv/domain.c
+++ b/xen/arch/riscv/domain.c
@@ -11,6 +11,7 @@
  #include <asm/bitops.h>
  #include <asm/cpufeature.h>
  #include <asm/csr.h>
+#include <asm/intc.h>
  #include <asm/riscv_encoding.h>
  #include <asm/vtimer.h>
@@ -155,14 +156,10 @@ int arch_vcpu_create(struct vcpu *v)
      if ( (rc = vcpu_vtimer_init(v)) )
          goto fail;
- /*
-     * As interrupt controller (IC) is not yet implemented,
-     * return an error.
-     *
-     * TODO: Drop this once IC is implemented.
-     */
-    rc = -EOPNOTSUPP;
-    goto fail;
+    ASSERT(v->domain->arch.vintc->ops->vcpu_init);
+
+    if ( (rc = v->domain->arch.vintc->ops->vcpu_init(v)) )

I don't understand this model of the use of ASSERT(). As previously said
(more than once) - you'll crash anyway if any of the involved pointers is
NULL. If you really think an up-front check is better, then why would you
check only the leaf of the pointer chain, and not also vintc and ops?
(Once you do you'll then likely notice that there are more assertions
than actual code.)

Let me show an example. For the test I will comment ASSERT() here + make ->vcpu_init = NULL. Then the following dump will be occurred because of NULL pointer dereference:

(XEN) scause : 000000000000000c Unhandled exception[Instruction Page Fault]
(XEN) htval     : 0000000000000000 htinst    : 0000000000000000
(XEN) hedeleg   : 0000000000000000 hideleg   : 0000000000000000
(XEN) hstatus   : 0000000200000000 [ ]
(XEN) hgatp     : 0000000000000000
(XEN) hstateen0 : 0000000000000000
(XEN) stvec     : ffffffffc00397f0 vstvec    : 0000000000000000
(XEN) sepc      : 0000000000000000 vsepc     : 0000000000000000
(XEN) stval     : 0000000000000000 vstval    : 0000000000000000
(XEN) status    : 0000000200000120 vsstatus  : 0000000a00000000
(XEN) satp      : 80000000000806c5
(XEN) vscause   : 0000000000000000 [Instruction Address Misaligned]
(XEN) ra  : ffffffffc0038f92 sp  : ffffffffc00a6580
(XEN) gp  : 0000000000000000 tp  : ffffffffc005e9c0
(XEN) t0  : ffffffffc04c9f78 t1  : 0000000052464e43
(XEN) t2  : 0000000000000000 s0  : ffffffffc00a65a0
(XEN) s1  : 000000323feaa000 a0  : 000000323feaa000
(XEN) a1  : 000000321a0010a8 a2  : 0000000000000000
(XEN) a3  : ffffffffc005c978 a4  : 0000000000000000
(XEN) a5  : 0000000000000000 a6  : 0000000000000000
(XEN) a7  : 000000321a0010a0 s2  : 0000000000000000
(XEN) s3  : 0000000000000000 s4  : 0000000000000000
(XEN) s5  : 0000000000000000 s6  : ffffffffc00a6638
(XEN) s7  : ffffffffc00a6638 s8  : ffffffffc00a6658
(XEN) s9  : ffffffffffffe000 s10 : 00000032105f4328
(XEN) s11 : 0000000000000000 t3  : 0000000000000001
(XEN) t4  : 0000000000000000 t5  : 0000000000080000
(XEN) t6  : 0000000000000001

So sepc = 0x0 => the CPU attempted to fetch and execute an instruction at address NULL.

stval = 0x0 => Faulting address = 0x0 (confirms null fetch).

During writing of that I reliased that I have also ra register which will tell where the bad call came from and then using addr2line/nm/objdump/whatever it still could be recoverable a place where NULL dereference happened in this case:

$ addr2line -e xen/xen-syms 0xffffffffc0038f92
/build/xen/arch/riscv/domain.c:218

So I agree now that such ASSERT should be dropped and shouldn't be used in such cases. I will drop it here and in the similar places where I added ASSERT for the reason as it is hard to identify a place where fault happened.

+unsigned int intc_irq_nums(void)
+{
+    ASSERT(intc_hw_ops && intc_hw_ops->irq_nums);
+
+    return intc_hw_ops->irq_nums();
+}

You use this to set domains' properties. As indicated before, I view it as
wrong to do so for any domain, besides perhaps Dom0 / hwdom. If you want to
do so nevertheless, at the very least I'd expect something to be said about
such a decision in the description.

If you could explain how this is expected to work for non-Dom0/hwdom domains, I would consider reworking it.

Basically, I don't understand how the following scenario is supposed to work. Let's say the host interrupt controller can manage 15 interrupts, while the guest interrupt controller supports only 7. If we want to pass through 8 devices to the guest, what should happen in that case? Should Xen simply report that the 8th device cannot be passed through because the guest supports only 7 IRQs?

Another concern is related to 1:1 IRQ mapping. Suppose I want to pass through a UART device whose IRQ number is typically greater than 10. In that case, it seems Xen would again have to report that the device cannot be passed through because its interrupt number exceeds the number of IRQs supported by the guest interrupt controller. This could be addressed by introducing a non-1:1 IRQ mapping between the host and guest, but the current dom0less codebase appears to assume a 1:1 IRQ mapping (unless I am mistaken).

Considering that virtual interrupt controllers use the maximum possible number of interrupts supported by the interrupt controller, the concerns mentioned above are unlikely to arise for a long time, if ever, unless support for features such as migration is introduced. Therefore, I think it would be reasonable to remove intc_irq_nums() and avoid using it to initialize virtual interrupts or domain properties.

+
+static const struct vintc_ops vintc_ops = {
+    .vcpu_init = vcpu_vaplic_init,
+};
+
+int __init domain_vaplic_init(struct domain *d)

Why __init, and why is there no caller?

The caller is in follow-up patch. I will add that to commit message.

Considering that domain_vintc_init() isn't __init from where domain_vaplic_init() is called then __init should be dropped here. I will do that.

Plus why is the vCPU-init a hook,
but the domain init is not? Either you mean to allow for other ICs, or
you you don't.

IIUC your question domain_vaplic_init() ins't a hook because vaplic structure is allocated dynamically so vintc, vintc->ops and/or vintc->init_ops aren't initialized at the moment when vintc->{ops or init_ops}->domain_vaplic_init() is used in domain_vintc_init() (which is introduced in the follow up patch).


+{
+    struct vaplic *vaplic = xvzalloc(struct vaplic);
+
+    if ( !vaplic )
+        return -ENOMEM;
+
+    d->arch.vintc = &vaplic->vintc;
+    d->arch.vintc->ops = &vintc_ops;
+
+    vaplic->regs.domaincfg = APLIC_DOMAINCFG_IE | APLIC_DOMAINCFG_DM;
+
+    d->arch.vintc->irq_nums = min(intc_irq_nums(),
+                                  VAPLIC_NUM_SOURCES + 0U);
+
+
+    return 0;
+}
+
+void __init domain_vaplic_deinit(struct domain *d)
+{
+    struct vaplic *vaplic = to_vaplic(d);
+
+    xvfree(vaplic);
+}

And d->arch.vintc turns into a dangling pointer. The way you arrange data
types, you can't use XVFREE() here, but imo you really want to make sure
the function is idempotent.

I will do then:

void domain_vaplic_deinit(struct domain *d)
{
    struct vaplic *vaplic;

    if ( !d->arch.vintc )
        return;

    vaplic = to_vaplic(d);
    d->arch.vintc = NULL;
    xvfree(vaplic);
}

Thanks.

~ Oleksii



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.