[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/2] amd/iommu: Always atomically update DTE



Le 17/11/2025 à 14:05, Teddy Astie a écrit :
> Le 13/11/2025 à 12:38, Jan Beulich a écrit :
>> On 12.11.2025 16:37, Teddy Astie wrote:
>>> amd_iommu_set_root_page_table chooses between updating atomically
>>> and non-atomically depending on whether the DTE is active or not.
>>>
>>> This choice existed mostly because cx16 wasn't supposed always available
>>> until [1]. Thus we don't need to threat the non-atomic path in a special
>>> way anymore.
>>>
>>> By rearranging slightly the atomic path, we can make it cover all the
>>> cases
>>> which improves the code generation at the expense of systematically
>>> performing
>>> cmpxchg16b.
>>>
>>> Also remove unused raw64 fields of ldte, and the deprecated comment
>>> as the
>>> function actually behaves in a more usual way and can't return >0.
>>>
>>> [1] 2636fcdc15c7 "x86/iommu: check for CMPXCHG16B when enabling IOMMU"
>>>
>>> Signed-off-by: Teddy Astie <teddy.astie@xxxxxxxxxx>
>>> ---
>>>   xen/drivers/passthrough/amd/iommu_map.c | 78 ++++++++-----------------
>>>   1 file changed, 25 insertions(+), 53 deletions(-)
>>>
>>> diff --git a/xen/drivers/passthrough/amd/iommu_map.c b/xen/drivers/
>>> passthrough/amd/iommu_map.c
>>> index 320a2dc64c..e3165d93aa 100644
>>> --- a/xen/drivers/passthrough/amd/iommu_map.c
>>> +++ b/xen/drivers/passthrough/amd/iommu_map.c
>>> @@ -154,69 +154,41 @@ static void set_iommu_ptes_present(unsigned
>>> long pt_mfn,
>>>       unmap_domain_page(table);
>>>   }
>>> -/*
>>> - * This function returns
>>> - * - -errno for errors,
>>> - * - 0 for a successful update, atomic when necessary
>>> - * - 1 for a successful but non-atomic update, which may need to be
>>> warned
>>> - *   about by the caller.
>>> - */
>>>   int amd_iommu_set_root_page_table(struct amd_iommu_dte *dte,
>>>                                     uint64_t root_ptr, uint16_t
>>> domain_id,
>>>                                     uint8_t paging_mode, unsigned int
>>> flags)
>>>   {
>>>       bool valid = flags & SET_ROOT_VALID;
>>> -    if ( dte->v && dte->tv )
>>> -    {
>>> -        union {
>>> -            struct amd_iommu_dte dte;
>>> -            uint64_t raw64[4];
>>> -            __uint128_t raw128[2];
>>> -        } ldte = { .dte = *dte };
>>> -        __uint128_t res, old = ldte.raw128[0];
>>> -        int ret = 0;
>>> -
>>> -        ldte.dte.domain_id = domain_id;
>>> -        ldte.dte.pt_root = paddr_to_pfn(root_ptr);
>>> -        ldte.dte.iw = true;
>>> -        ldte.dte.ir = true;
>>> -        ldte.dte.paging_mode = paging_mode;
>>> -        ldte.dte.v = valid;
>>> -
>>> -        res = cmpxchg16b(dte, &old, &ldte.raw128[0]);
>>> -
>>> -        /*
>>> -         * Hardware does not update the DTE behind our backs, so the
>>> -         * return value should match "old".
>>> -         */
>>> -        if ( res != old )
>>> -        {
>>> -            printk(XENLOG_ERR
>>> -                   "Dom%d: unexpected DTE %016lx_%016lx (expected
>>> %016lx_%016lx)\n",
>>> -                   domain_id,
>>> -                   (uint64_t)(res >> 64), (uint64_t)res,
>>> -                   (uint64_t)(old >> 64), (uint64_t)old);
>>> -            ret = -EILSEQ;
>>> -        }
>>> +    union {
>>> +        struct amd_iommu_dte dte;
>>> +        __uint128_t raw128[2];
>>> +    } ldte = { .dte = *dte };
>>> +    __uint128_t res, old = ldte.raw128[0];
>>> -        return ret;
>>> -    }
>>> +    ldte.dte.domain_id = domain_id;
>>> +    ldte.dte.pt_root = paddr_to_pfn(root_ptr);
>>> +    ldte.dte.iw = true;
>>> +    ldte.dte.ir = true;
>>> +    ldte.dte.paging_mode = paging_mode;
>>> +    ldte.dte.tv = true;
>>> +    ldte.dte.v = valid;
>>> +
>>> +    res = cmpxchg16b(dte, &old, &ldte.raw128[0]);
>>> -    if ( valid || dte->v )
>>> +    /*
>>> +     * Hardware does not update the DTE behind our backs, so the
>>> +     * return value should match "old".
>>> +     */
>>> +    if ( res != old )
>>>       {
>>> -        dte->tv = false;
>>> -        dte->v = true;
>>> -        smp_wmb();
>>> +        printk(XENLOG_ERR
>>> +                "Dom%d: unexpected DTE %016lx_%016lx (expected
>>> %016lx_%016lx)\n",
>>> +                domain_id,
>>> +                (uint64_t)(res >> 64), (uint64_t)res,
>>> +                (uint64_t)(old >> 64), (uint64_t)old);
>>
>> Indentation is now off by 1 here.
>>
>>> +        return -EILSEQ;
>>
>> The downside of this is that all updates can now take this path. Yes,
>> this shouldn't
>> be possible to be taken, but it's a (minor) concern nevertheless. At
>> the very least
>> such a downside wants, imo, mentioning in the description, even if for
>> nothing else
>> than to make clear that it was a deliberate choice.
>>
>
> Apparently it doesn't build (debian-bookworm-gcc-arm32-randconfig
> catched it).
> ARM does provide MAX_VIRT_CPUS and GUEST_MAX_VCPUS which is 128 or
> lower, but that doesn't map (or not properly) with what we have in x86
> (MAX_VIRT_CPUS=8192 is PV-specific, and GUEST_MAX_VCPUS doesn't exist).
>
> I am not sure what to do, looks like many things are redundant here.
>
>> Jan
>>
>

Oops, sent the wrong message. Please ignore.


--
Teddy Astie | Vates XCP-ng Developer

XCP-ng & Xen Orchestra - Vates solutions

web: https://vates.tech





 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.