|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v1 19/27] xen/riscv: emulate guest writes to virtual APLIC MMIO
On 14.04.2026 18:04, Oleksii Kurochko wrote:
> On 4/2/26 4:18 PM, Jan Beulich wrote:
>> On 10.03.2026 18:08, Oleksii Kurochko wrote:
>>> @@ -127,6 +137,164 @@ int vaplic_map_device_irqs_to_domain(struct domain *d,
>>> return 0;
>>> }
>>>
>>> +static void vaplic_dm_update_target(const unsigned long hart_id, uint32_t
>>> *iprio)
>>> +{
>>> + *iprio &= APLIC_TARGET_IPRIO_MASK;
>>> + *iprio |= (hart_id << APLIC_TARGET_HART_IDX_SHIFT);
>>> +}
>>> +
>>> +static void vaplic_update_target(const struct imsic_config *imsic,
>>> + const int guest_id,
>>> + const unsigned long hart_id, uint32_t
>>> *value)
>>> +{
>>> + unsigned long group_index;
>>> + unsigned int hhxw = imsic->group_index_bits;
>>> + unsigned int lhxw = imsic->hart_index_bits;
>>> + unsigned int hhxs = imsic->group_index_shift - IMSIC_MMIO_PAGE_SHIFT *
>>> 2;
>>> + unsigned long base_ppn = imsic->msi[hart_id].base_addr >>
>>> IMSIC_MMIO_PAGE_SHIFT;
>>> +
>>> + group_index = (base_ppn >> (hhxs + 12)) & (BIT(hhxw, UL) - 1);
>>
>> And there's no constant available to make this literal 12 more descriptive?
>
> As it was used in aplic_set_irq_affinity() - IMSIC_MMIO_PAGE_SHIFT could
> be used here.
>
>>
>>> + *value &= APLIC_TARGET_EIID_MASK;
>>> + *value |= guest_id << APLIC_TARGET_GUEST_IDX_SHIFT;
>>> + *value |= hart_id << APLIC_TARGET_HART_IDX_SHIFT;
>>> + *value |= group_index << (lhxw + APLIC_TARGET_HART_IDX_SHIFT) ;
>>> +}
>>
>> Both functions returning void right now, why would they need to return their
>> result via indirection?
>
> No specific reason. Do you think it would be better just to return value
> instead? I am okay to rework that.
Rule of thumb is: Use return value in preference to indirection when the
returned value isn't needed for some other purpose.
>>> +#define CALC_REG_VALUE(base) \
>>> +{ \
>>> + uint32_t index; \
>>> + uint32_t tmp_val; \
>>
>> Combine these two, or have the variables have initializers?
>>
>>> + index = regval_to_irqn(offset - base); \
>>
>> There's no "offset" declared or passed into here, nor ...
>>
>>> + tmp_val = APLIC_REG_GET(priv->regs, aplic_addr) &
>>> ~auth_irq_bmp[index]; \
>>
>> ... "priv", nor ...
>>
>>> + value &= auth_irq_bmp[index]; \
>>> + value |= tmp_val; \
>>
>> ... "value". It may remain like this, but then it wants putting inside the
>> sole function that uses it, and be #undef-ed at the end of the function.
>>
>>> +}
>>
>> Please wrap in do/while(0), for use sites to be required to have semicolons
>> (and hence look like normal statements). Or make it a statement expression
>> properly returning the calculated value.
>
> I will put the following inside the function + undef at the end:
>
> #define CALC_REG_VALUE(base) do { \
> \
Nit: Why this extra line?
>>> +static int cf_check vaplic_emulate_store(const struct vcpu *vcpu,
>>> + unsigned long addr, uint32_t
>>> value)
>>> +{
>>> + struct vaplic *vaplic = to_vaplic(vcpu->domain->arch.vintc);
>>> + struct aplic_priv *priv = vaplic->base.info->private;
>>> + uint32_t offset = addr & APLIC_REG_OFFSET_MASK;
>>
>> See ./CODING_STYLE as to uses of fixed-width types.
>>
>>> + unsigned long aplic_addr = addr - priv->paddr_start;
>>> + const uint32_t *auth_irq_bmp = vcpu->domain->arch.vintc->private;
>>> +
>>> + switch ( offset )
>>> + {
>>> + case APLIC_SETIP_BASE ... APLIC_SETIP_LAST:
>>
>> And (taking this just as example) any misaligned accesses falling in this
>> range
>> are fine?
>
> Do you mean something like 0x1C02 instead of 0x1C00 or 0x1C04?
Yes.
>>> + /*
>>> + * As sourcecfg register starts from 1:
>>> + * 0x0000 domaincfg
>>> + * 0x0004 sourcecfg[1]
>>> + * 0x0008 sourcecfg[2]
>>> + * ...
>>> + * 0x0FFC sourcecfg[1023]
>>> + * It is necessary to calculate an interrupt number by substracting
>>
>> Nit: subtracting
>>
>>> + * of APLIC_DOMAINCFG instead of APLIC_SOURCECFG_BASE.
>>> + */
>>> + if ( !AUTH_IRQ_BIT(regval_to_irqn(offset - APLIC_DOMAINCFG)) )
>>> + /* interrupt not enabled, ignore it */
>>
>> Throughout the series: Please adhere to ./CODING_STYLE.
>>
>>> + return 0;
>>> +
>>> + break;
>>
>> And any value is okay to write?
>
> No, it should be in a range
> [APLIC_SOURCECFG_SM_INACTIVE,APLIC_SOURCECFG_SM_LEVEL_LOW].
>
> I will add the check before break:
> if ( value > APLIC_SOURCECFG_SM_LEVEL_LOW )
> {
> gdprintk(XENLOG_WARNING,
> "value(%u) is incorrect for sourcecfg register\n",
> value);
> value = APLIC_SOURCECFG_SM_INACTIVE;
> }
And why would writing APLIC_SOURCECFG_SM_INACTIVE be any better, when
that's not what the guest wanted? Simply ignore such writes, unless the
spec mandates specific behavior for out-of-range avlues?
>>> + case APLIC_TARGET_BASE ... APLIC_TARGET_LAST:
>>> + struct vcpu *target_vcpu = NULL;
>>> +
>>> + /*
>>> + * Look at vaplic_emulate_load() for explanation why
>>> + * APLIC_GENMSI is substracted.
>>> + */
>>
>> There's no vaplic_emulate_load() - how can I go look there?
>
> It is introduced in the next patch.
As before - it should be possible to review patch series strictly
sequentially. Further, what if this patch gets committed, and the other
gets delayed by several months?
>>> + if ( !AUTH_IRQ_BIT(regval_to_irqn(offset - APLIC_GENMSI)) )
>>> + /* interrupt not enabled, ignore it */
>>> + return 0;
>>> +
>>> + for ( int i = 0; i < vcpu->domain->max_vcpus; i++ )
>>
>> unsigned int
>>
>>> + {
>>> + struct vcpu *v = vcpu->domain->vcpu[i];
>>> +
>>> + if ( v->vcpu_id == (value >> APLIC_TARGET_HART_IDX_SHIFT) )
>>> + {
>>> + target_vcpu = v;
>>> + break;
>>> + }
>>> + }
>>> +
>>> + ASSERT(target_vcpu);
>>
>> What guarantees the pointer to be non-NULL? The incoming value can be
>> arbitrary, afaict.
>
> I didn't understand your point. It is just checking that target_vcpu has
> been found. If after for() loop the value of target_vcpu is still NULL
> then something wrong in Xen.
If that's true, then the assertion is fine to have. I can't help the
impression though that a guest could pick a value such that you can't
possibly find the target vCPU. Asserting on guest controlled input is
not okay, as was said several times before.
>>> + if ( !(vaplic->regs.domaincfg & APLIC_DOMAINCFG_DM) )
>>> + {
>>> +
>>> vaplic_dm_update_target(cpuid_to_hartid(target_vcpu->processor),
>>> + &value);
>>> + }
>>> + else
>>> + vaplic_update_target(priv->imsic_cfg,
>>> + vcpu_guest_file_id(target_vcpu),
>>> + cpuid_to_hartid(target_vcpu->processor),
>>> + &value);
>>
>> I'm struggling with the naming here: When DM is clear, a function with "dm"
>> in the name is called.
>
> it means direct (delivery) mode. Maybe it is better to put dm at the end
> of the function name? Or it is just better to change it to something else?
Without a better understanding of what is wanted, all I can say is that
calling something with "dm" in its name when the condition says it's not
"dm" is confusing.
>>> + default:
>>> + panic("%s: unsupported register offset: %#x\n", __func__, offset);
>>
>> Crashing the host for the guest doing something odd? It's odd that the
>> function
>> only ever returns 0 anyway - it could simply return an error here (if the
>> itention is to not ignore such writes).
>
> But maybe it is a legal offset and we really want to support it?
Still not a reason to crash the entire host?
> Even if I will return just error then a caller site will want to do
> something with this error -> for example, kill domain or panic() again.
> Maybe panic is to much and just domain should be crashed here:
>
> default:
> gdprintk(XENLOG_WARNING,
> "Unhandled APLIC write at offset %#x (value %#x)\n",
> offset, value);
> domain_crash(vcpu->domain);
> return 0;
> ?
This would already be better. You shouldn't use gdprintk() with
domain_crash() though. Please take a look at domain_crash()'s
definition - you'll then see what to do, I suppose.
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |