[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 10/18] xen/riscv: implement p2m_set_range()
On 25.09.2025 22:08, Oleksii Kurochko wrote: > On 9/20/25 1:36 AM, Jan Beulich wrote: >> On 17.09.2025 23:55, Oleksii Kurochko wrote: >>> +static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn) >>> +{ >>> + unsigned long root_table_indx; >>> + >>> + root_table_indx = gfn_x(gfn) >> P2M_LEVEL_ORDER(P2M_ROOT_LEVEL); >>> + if ( root_table_indx >= P2M_ROOT_PAGES ) >>> + return NULL; >>> + >>> + /* >>> + * The P2M root page table is extended by 2 bits, making its size 16KB >>> + * (instead of 4KB for non-root page tables). Therefore, p2m->root is >>> + * allocated as four consecutive 4KB pages (since alloc_domheap_pages() >>> + * only allocates 4KB pages). >>> + * >>> + * To determine which of these four 4KB pages the root_table_indx falls >>> + * into, we divide root_table_indx by >>> + * P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1). >>> + */ >>> + root_table_indx /= P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1); >> The subtraction of 1 here feels odd: You're after the root table's >> number of entries, i.e. I'd expect you to pass just P2M_ROOT_LEVEL. >> And the way P2M_PAGETABLE_ENTRIES() works also suggests so. > > The purpose of this line is to select the page within the root table, which > consists of 4 consecutive pages. However, > P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL) > returns 2048, so root_table_idx will always be 0 after devision, which is not > what we want. > > As an alternative, P2M_PAGETABLE_ENTRIES(0) could be used, since it always > returns 512. Dividing root_table_idx by 512 then yields the index of the page > within the root table, which is made up of 4 consecutive pages. > > Does it make sense now? Yes and no. I understand what you're after, but that doesn't make the use of P2M_PAGETABLE_ENTRIES() (with an arbitrary level as argument) correct. This calculation wants doing by solely using properties of the top level. > The problem may occur with DECLARE_OFFSET(), which can produce an incorrect > index within the root page table. Since the index is in the range [0, 2047], > it becomes an issue if the value is greater than 511, because DECLARE_OFFSET() > does not account for the larger range of the root index. > > I am not sure whether it is better to make DECLARE_OFFSET() generic enough > for both P2M and Xen page tables, or to provide a separate > P2M_DECLARE_OFFSET() > and use it only in P2M-related code. > Also, it could be an option to move DECLARE_OFFSET() from asm/page.h header > to riscv/pt.c and define another one DECLARE_OFFSETS in riscv/p2m.c. > > Do you have a preference? Not really, no. I don't like DECLARE_OFFSETS() anyway. >>> +static int p2m_set_entry(struct p2m_domain *p2m, >>> + gfn_t gfn, >>> + unsigned long page_order, >>> + mfn_t mfn, >>> + p2m_type_t t) >> Nit: Indentation. >> >>> +{ >>> + unsigned int level; >>> + unsigned int target = page_order / PAGETABLE_ORDER; >>> + pte_t *entry, *table, orig_pte; >>> + int rc; >>> + /* >>> + * A mapping is removed only if the MFN is explicitly set to >>> INVALID_MFN. >>> + * Other MFNs that are considered invalid by mfn_valid() (e.g., MMIO) >>> + * are still allowed. >>> + */ >>> + bool removing_mapping = mfn_eq(mfn, INVALID_MFN); >>> + DECLARE_OFFSETS(offsets, gfn_to_gaddr(gfn)); >>> + >>> + ASSERT(p2m_is_write_locked(p2m)); >>> + >>> + /* >>> + * Check if the level target is valid: we only support >>> + * 4K - 2M - 1G mapping. >>> + */ >>> + ASSERT(target <= 2); >>> + >>> + table = p2m_get_root_pointer(p2m, gfn); >>> + if ( !table ) >>> + return -EINVAL; >>> + >>> + for ( level = P2M_ROOT_LEVEL; level > target; level-- ) >>> + { >>> + /* >>> + * Don't try to allocate intermediate page table if the mapping >>> + * is about to be removed. >>> + */ >>> + rc = p2m_next_level(p2m, !removing_mapping, >>> + level, &table, offsets[level]); >>> + if ( (rc == P2M_TABLE_MAP_NONE) || (rc == P2M_TABLE_MAP_NOMEM) ) >>> + { >>> + rc = (rc == P2M_TABLE_MAP_NONE) ? -ENOENT : -ENOMEM; >>> + /* >>> + * We are here because p2m_next_level has failed to map >>> + * the intermediate page table (e.g the table does not exist >>> + * and they p2m tree is read-only). >> I thought I commented on this or something similar already: Calling the >> p2m tree "read-only" is imo misleading. > > I will change then "read-only" to "not allocatable". That'll be only marginally better: What's "allocatable"? Why not something like "... does not exist and none should be allocated"? Or maybe simply omit this part of the comment? >>> + /* >>> + * Free the entry only if the original pte was valid and the base >>> + * is different (to avoid freeing when permission is changed). >>> + * >>> + * If previously MFN 0 was mapped and it is going to be removed >>> + * and considering that during removing MFN 0 is used then `entry` >>> + * and `new_entry` will be the same and p2m_free_subtree() won't be >>> + * called. This case is handled explicitly. >>> + */ >>> + if ( pte_is_valid(orig_pte) && >>> + (!mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) || >>> + (removing_mapping && mfn_eq(pte_get_mfn(*entry), _mfn(0)))) ) >>> + p2m_free_subtree(p2m, orig_pte, level); >> I continue to fail to understand why the MFN would matter here. > > My understanding is that if, for the same GFN, the MFN changes fromMFN_1 to > MFN_2, then we need to update any references on the page referenced by > |orig_pte| to ensure the proper reference counter is maintained for the page > pointed to byMFN_1. > >> Isn't the >> need to free strictly tied to a VALID -> NOT VALID transition? A permission >> change simply retains the VALID state of an entry. > > It covers a case when removing happens and probably in this case we don't need > to check specifically for mfn(0) case "mfn_eq(pte_get_mfn(*entry), _mfn(0))", > but it would be enough to check that pte_is_valid(entry) instead: > ... > (removing_mapping && !pte_is_valid(entry)))) ) > > Or only check removing_mapping variable as `entry` would be invalided by the > code above anyway. So we will get: > + if ( pte_is_valid(orig_pte) && > + (!mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) || > removing_mapping) ) > + p2m_free_subtree(p2m, orig_pte, level); > > Does it make sense now? Not really, sorry. Imo the complicated condition indicates that something is wrong (or at least inefficient) here. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |