Re: [for 4.22 v5 10/18] xen/riscv: implement p2m_set_range()
On 14.11.2025 18:04, Oleksii Kurochko wrote:
> On 11/10/25 3:53 PM, Jan Beulich wrote:
>> On 20.10.2025 17:57, Oleksii Kurochko wrote:
>>> +#define GFN_MASK(lvl) (P2M_PAGETABLE_ENTRIES(lvl) - 1UL)
>> If I'm not mistaken, this is a mask with the low 10 or 12 bits set.
>
> I'm not sure I fully understand you here. With the current implementation,
> it returns a bitmask that corresponds to the number of index bits used
> at each level. So, if P2M_ROOT_LEVEL = 2, then:
>   GFN_MASK(0) = 0x1ff (9-bit GFN width for level 0)
>   GFN_MASK(1) = 0x1ff (9-bit GFN width for level 1)
>   GFN_MASK(2) = 0x7ff (11-bit GFN width for level 2)
Oh, sorry, 9 and 11 bits is what I meant.
> Or do you mean that GFN_MASK(lvl) should return something like this:
>   GFN_MASK_(0) = 0x1FF000       (0x1ff << 0xc)
>   GFN_MASK_(1) = 0x3FE00000     (GFN_MASK_(0) << 9)
>   GFN_MASK_(2) = 0x1FFC0000000  (GFN_MASK_(1) << 9 + extra 2 bits)
Yes.
> And then here ...
>
>> That's not really something you can apply to a GFN, unlike what the name
>> suggests.
>
> That is why virtual address should be properly shifted before, something
> like it is done in calc_offset():
Please can we stop calling guest physical addresses "virtual address"?
> (va >> P2M_LEVEL_SHIFT(lvl)) & GFN_MASK(lvl);
>
> ...
> (va & GFN_MASK_(lvl)) >> P2M_LEVEL_SHIFT(lvl) ?
> In this option more shifts will be needed.
It's okay to try to limit the number of shifts needed, but the macros need
naming accordingly.
> Would it be better to just rename GFN_MASK() to P2M_PT_INDEX_MASK()? Or,
> maybe, even just P2M_INDEX_MASK().
Perhaps. I would recommend though that you take a look at other ports'
naming. In x86, for example, we have l<N>_table_offset().
>>> --- a/xen/arch/riscv/p2m.c
>>> +++ b/xen/arch/riscv/p2m.c
>>> @@ -9,6 +9,7 @@
>>> #include <xen/rwlock.h>
>>> #include <xen/sched.h>
>>> #include <xen/sections.h>
>>> +#include <xen/xvmalloc.h>
>>>
>>> #include <asm/csr.h>
>>> #include <asm/flushtlb.h>
>>> @@ -17,6 +18,43 @@
>>> #include <asm/vmid.h>
>>>
>>> unsigned char __ro_after_init gstage_mode;
>>> +unsigned int __ro_after_init gstage_root_level;
>> Like for mode, I'm unconvinced of this being a global (and not per-P2M /
>> per-domain).
>
> The question then is whether we will really have (or want) cases where
> the gstage mode differs per domain / per p2m.
Can you explain to me why you think we wouldn't want that, sooner or later?
>>> +/*
>>> + * The P2M root page table is extended by 2 bits, making its size 16KB
>>> + * (instead of 4KB for non-root page tables). Therefore, P2M root page
>>> + * is allocated as four consecutive 4KB pages (since alloc_domheap_pages()
>>> + * only allocates 4KB pages).
>>> + */
>>> +#define ENTRIES_PER_ROOT_PAGE \
>>> + (P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL) / P2M_ROOT_ORDER)
>>> +
>>> +static inline unsigned int calc_offset(unsigned int lvl, vaddr_t va)
>> Where would a vaddr_t come from here? Your input are guest-physical
>> addresses,
>> if I'm not mistaken.
>
> You are right. Would it be right to use 'paddr_t gpa' here? Or is paddr_t
> supposed to be used only with machine physical addresses?
In x86 we use paddr_t in such cases. Arm iirc additionally has gaddr_t.
>>> +#define P2M_MAX_ROOT_LEVEL 4
>>> +
>>> +#define P2M_DECLARE_OFFSETS(var, addr) \
>>> + unsigned int var[P2M_MAX_ROOT_LEVEL] = {-1};\
>>> + for ( unsigned int i = 0; i <= gstage_root_level; i++ ) \
>>> + var[i] = calc_offset(i, addr);
>> This surely is more than just "declare", and it's dealing with all levels no
>> matter whether you actually will use all offsets.
>
> I will rename P2M_DECLARE_OFFSETS() to P2M_BUILD_LEVEL_OFFSETS().
>
> But how can I know which offset I will actually need to use?
> If we take the following loop as an example:
>   for ( level = P2M_ROOT_LEVEL; level > target; level-- )
>   {
>       /*
>        * Don't try to allocate intermediate page tables if the mapping
>        * is about to be removed.
>        */
>       rc = p2m_next_level(p2m, !removing_mapping,
>                           level, &table, offsets[level]);
>       ...
>   }
>
> It walks from P2M_ROOT_LEVEL down to target, where target is determined
> at runtime.
>
> If you mean that, for example, when the G-stage mode is Sv39, there is no
> need to allocate an array with 4 entries (or 5 entries if we consider
> Sv57, in which case P2M_MAX_ROOT_LEVEL should be updated), because Sv39
> only uses 3 page table levels - then yes, in theory it could be smaller.
> But I don't think it is a real issue if the offsets[] array on the stack
> has a few extra unused entries.
>
> If preferred, I could allocate the array dynamically based on
> gstage_root_level. Would that be better?
Having a few unused entries isn't a big deal imo. What I'm not happy with
here is that you may _initialize_ more entries than actually needed. I have
no good suggestion within the conceptual framework you use for page walking
(the same issue iirc exists in host page table walks, just that the
calculations there are cheaper).
Jan