[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 10/18] xen/riscv: implement p2m_set_range()
On 9/26/25 9:07 AM, Jan Beulich wrote:
On 25.09.2025 22:08, Oleksii Kurochko wrote:On 9/20/25 1:36 AM, Jan Beulich wrote:On 17.09.2025 23:55, Oleksii Kurochko wrote:+static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn) +{ + unsigned long root_table_indx; + + root_table_indx = gfn_x(gfn) >> P2M_LEVEL_ORDER(P2M_ROOT_LEVEL); + if ( root_table_indx >= P2M_ROOT_PAGES ) + return NULL; + + /* + * The P2M root page table is extended by 2 bits, making its size 16KB + * (instead of 4KB for non-root page tables). Therefore, p2m->root is + * allocated as four consecutive 4KB pages (since alloc_domheap_pages() + * only allocates 4KB pages). + * + * To determine which of these four 4KB pages the root_table_indx falls + * into, we divide root_table_indx by + * P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1). + */ + root_table_indx /= P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1);The subtraction of 1 here feels odd: You're after the root table's number of entries, i.e. I'd expect you to pass just P2M_ROOT_LEVEL. And the way P2M_PAGETABLE_ENTRIES() works also suggests so.The purpose of this line is to select the page within the root table, which consists of 4 consecutive pages. However, P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL) returns 2048, so root_table_idx will always be 0 after devision, which is not what we want. As an alternative, P2M_PAGETABLE_ENTRIES(0) could be used, since it always returns 512. Dividing root_table_idx by 512 then yields the index of the page within the root table, which is made up of 4 consecutive pages. Does it make sense now?Yes and no. I understand what you're after, but that doesn't make the use of P2M_PAGETABLE_ENTRIES() (with an arbitrary level as argument) correct. This calculation wants doing by solely using properties of the top level. Got it, thanks. Then I will use solely properties of the top level. +static int p2m_set_entry(struct p2m_domain *p2m, + gfn_t gfn, + unsigned long page_order, + mfn_t mfn, + p2m_type_t t)Nit: Indentation.+{ + unsigned int level; + unsigned int target = page_order / PAGETABLE_ORDER; + pte_t *entry, *table, orig_pte; + int rc; + /* + * A mapping is removed only if the MFN is explicitly set to INVALID_MFN. + * Other MFNs that are considered invalid by mfn_valid() (e.g., MMIO) + * are still allowed. + */ + bool removing_mapping = mfn_eq(mfn, INVALID_MFN); + DECLARE_OFFSETS(offsets, gfn_to_gaddr(gfn)); + + ASSERT(p2m_is_write_locked(p2m)); + + /* + * Check if the level target is valid: we only support + * 4K - 2M - 1G mapping. + */ + ASSERT(target <= 2); + + table = p2m_get_root_pointer(p2m, gfn); + if ( !table ) + return -EINVAL; + + for ( level = P2M_ROOT_LEVEL; level > target; level-- ) + { + /* + * Don't try to allocate intermediate page table if the mapping + * is about to be removed. + */ + rc = p2m_next_level(p2m, !removing_mapping, + level, &table, offsets[level]); + if ( (rc == P2M_TABLE_MAP_NONE) || (rc == P2M_TABLE_MAP_NOMEM) ) + { + rc = (rc == P2M_TABLE_MAP_NONE) ? -ENOENT : -ENOMEM; + /* + * We are here because p2m_next_level has failed to map + * the intermediate page table (e.g the table does not exist + * and they p2m tree is read-only).I thought I commented on this or something similar already: Calling the p2m tree "read-only" is imo misleading.I will change then "read-only" to "not allocatable".That'll be only marginally better: What's "allocatable"? Why not something like "... does not exist and none should be allocated"? Or maybe simply omit this part of the comment? Agree, "allocatable" could be also confusing. Perhaps, just omitting will be fine. + /* + * Free the entry only if the original pte was valid and the base + * is different (to avoid freeing when permission is changed). + * + * If previously MFN 0 was mapped and it is going to be removed + * and considering that during removing MFN 0 is used then `entry` + * and `new_entry` will be the same and p2m_free_subtree() won't be + * called. This case is handled explicitly. + */ + if ( pte_is_valid(orig_pte) && + (!mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) || + (removing_mapping && mfn_eq(pte_get_mfn(*entry), _mfn(0)))) ) + p2m_free_subtree(p2m, orig_pte, level);I continue to fail to understand why the MFN would matter here.My understanding is that if, for the same GFN, the MFN changes fromMFN_1 to MFN_2, then we need to update any references on the page referenced by |orig_pte| to ensure the proper reference counter is maintained for the page pointed to byMFN_1.Isn't the need to free strictly tied to a VALID -> NOT VALID transition? A permission change simply retains the VALID state of an entry.It covers a case when removing happens and probably in this case we don't need to check specifically for mfn(0) case "mfn_eq(pte_get_mfn(*entry), _mfn(0))", but it would be enough to check that pte_is_valid(entry) instead: ... (removing_mapping && !pte_is_valid(entry)))) ) Or only check removing_mapping variable as `entry` would be invalided by the code above anyway. So we will get: + if ( pte_is_valid(orig_pte) && + (!mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) || removing_mapping) ) + p2m_free_subtree(p2m, orig_pte, level); Does it make sense now?Not really, sorry. Imo the complicated condition indicates that something is wrong (or at least inefficient) here. Then, in the case of a VALID -> VALID transition, where the MFN is changed for the same PTE, should something be done with the old MFN (e.g., calling If we only need to care about the VALID -> NOT VALID transition, doesn’t that mean
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |