Re: [PATCH v4 10/18] xen/riscv: implement p2m_set_range()
On 9/20/25 1:36 AM, Jan Beulich wrote:
> On 17.09.2025 23:55, Oleksii Kurochko wrote:
>> --- a/xen/arch/riscv/p2m.c
>> +++ b/xen/arch/riscv/p2m.c
>> @@ -16,6 +16,7 @@
>>  #include <asm/riscv_encoding.h>
>>
>>  unsigned long __ro_after_init gstage_mode;
>> +unsigned int __ro_after_init gstage_root_level;
>>
>>  void __init gstage_mode_detect(void)
>>  {
>> @@ -53,6 +54,7 @@ void __init gstage_mode_detect(void)
>>          if ( MASK_EXTR(csr_read(CSR_HGATP), HGATP_MODE_MASK) == mode )
>>          {
>>              gstage_mode = mode;
>> +            gstage_root_level = modes[mode_idx].paging_levels - 1;
>>              break;
>>          }
>>      }
>> @@ -210,6 +212,9 @@ int p2m_init(struct domain *d)
>>      rwlock_init(&p2m->lock);
>>      INIT_PAGE_LIST_HEAD(&p2m->pages);
>>
>> +    p2m->max_mapped_gfn = _gfn(0);
>> +    p2m->lowest_mapped_gfn = _gfn(ULONG_MAX);
>> +
>>      /*
>>       * Currently, the infrastructure required to enable CONFIG_HAS_PASSTHROUGH
>>       * is not ready for RISC-V support.
>> @@ -251,13 +256,287 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>      return rc;
>>  }
>>
>> +/*
>> + * Find and map the root page table. The caller is responsible for
>> + * unmapping the table.
>
> With the root table being 4 pages, "the root table" is slightly misleading
> here: You never map the entire table.

I will then update the comment to:

/*
 * Map one of the four root pages of the P2M root page table.
 *
 * The P2M root page table is larger than normal (16KB instead of 4KB),
 * so it is allocated as four consecutive 4KB pages. This function selects
 * the appropriate 4KB page based on the given GFN and returns a mapping
 * to it.
 *
 * The caller is responsible for unmapping the page after use.
 *
 * Returns NULL if the calculated offset into the root table is invalid.
 */

>> + * The function will return NULL if the offset into the root table is
>> + * invalid.
>> + */
>> +static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn)
>> +{
>> +    unsigned long root_table_indx;
>> +
>> +    root_table_indx = gfn_x(gfn) >> P2M_LEVEL_ORDER(P2M_ROOT_LEVEL);
>> +    if ( root_table_indx >= P2M_ROOT_PAGES )
>> +        return NULL;
>> +
>> +    /*
>> +     * The P2M root page table is extended by 2 bits, making its size 16KB
>> +     * (instead of 4KB for non-root page tables). Therefore, p2m->root is
>> +     * allocated as four consecutive 4KB pages (since alloc_domheap_pages()
>> +     * only allocates 4KB pages).
>> +     *
>> +     * To determine which of these four 4KB pages the root_table_indx falls
>> +     * into, we divide root_table_indx by
>> +     * P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1).
>> +     */
>> +    root_table_indx /= P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1);
>
> The subtraction of 1 here feels odd: You're after the root table's number
> of entries, i.e. I'd expect you to pass just P2M_ROOT_LEVEL. And the way
> P2M_PAGETABLE_ENTRIES() works also suggests so.

The purpose of this line is to select the page within the root table, which
consists of 4 consecutive pages. However, P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL)
returns 2048, so root_table_indx would always be 0 after the division, which
is not what we want.

As an alternative, P2M_PAGETABLE_ENTRIES(0) could be used, since it always
returns 512. Dividing root_table_indx by 512 then yields the index of the
page within the root table, which is made up of 4 consecutive pages. Does it
make sense now? (A small standalone sketch of this arithmetic is included
below.)

The problem may occur with DECLARE_OFFSET(), which can produce an incorrect
index within the root page table. Since the index is in the range [0, 2047],
it becomes an issue if the value is greater than 511, because
DECLARE_OFFSET() does not account for the larger range of the root index.

I am not sure whether it is better to make DECLARE_OFFSET() generic enough
for both P2M and Xen page tables, or to provide a separate
P2M_DECLARE_OFFSET() and use it only in P2M-related code.
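[For illustration, a minimal self-contained sketch of the arithmetic described
above, assuming the 16KB root table with 2048 entries split across four 4KB
pages of 512 entries each; the constant names are placeholders, not the
actual Xen macros:]

/*
 * Standalone sketch (not the actual Xen code): how a root-table index in
 * [0, 2047] selects one of the four consecutive 4KB root pages.
 */
#include <assert.h>
#include <stdio.h>

#define ENTRIES_PER_4K_PAGE 512   /* what P2M_PAGETABLE_ENTRIES(0) yields */
#define ROOT_TABLE_ENTRIES  2048  /* 16KB root table, extended by 2 bits */
#define ROOT_PAGES          (ROOT_TABLE_ENTRIES / ENTRIES_PER_4K_PAGE)

int main(void)
{
    /* A few sample root-table indexes and the root page they land in. */
    unsigned long samples[] = { 0, 511, 512, 1023, 2047 };

    for ( unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++ )
    {
        unsigned long indx = samples[i];
        unsigned long page = indx / ENTRIES_PER_4K_PAGE;   /* which 4KB page */
        unsigned long offset = indx % ENTRIES_PER_4K_PAGE; /* slot inside it */

        assert(page < ROOT_PAGES);
        printf("root index %4lu -> page %lu, offset %3lu\n",
               indx, page, offset);
    }

    return 0;
}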
Also, it could be an option to move DECLARE_OFFSET() from the asm/page.h
header to riscv/pt.c and define another one, DECLARE_OFFSETS, in riscv/p2m.c.
Do you have a preference?

>> +/*
>> + * Insert an entry in the p2m. This should be called with a mapping
>> + * equal to a page/superpage.
>> + */
>
> I don't follow this comment: There isn't any mapping being passed in, is
> there?

I think this comment should be dropped. It was about the requested mapping
having to be equal to a page/superpage (4k, 2m, 1g); the correct order is
always guaranteed by p2m_mapping_order(). (A standalone sketch of the
order-to-level arithmetic is included further below.)

>> +static int p2m_set_entry(struct p2m_domain *p2m,
>> +                           gfn_t gfn,
>> +                           unsigned long page_order,
>> +                           mfn_t mfn,
>> +                           p2m_type_t t)
>
> Nit: Indentation.

>> +{
>> +    unsigned int level;
>> +    unsigned int target = page_order / PAGETABLE_ORDER;
>> +    pte_t *entry, *table, orig_pte;
>> +    int rc;
>> +    /*
>> +     * A mapping is removed only if the MFN is explicitly set to INVALID_MFN.
>> +     * Other MFNs that are considered invalid by mfn_valid() (e.g., MMIO)
>> +     * are still allowed.
>> +     */
>> +    bool removing_mapping = mfn_eq(mfn, INVALID_MFN);
>> +    DECLARE_OFFSETS(offsets, gfn_to_gaddr(gfn));
>> +
>> +    ASSERT(p2m_is_write_locked(p2m));
>> +
>> +    /*
>> +     * Check if the level target is valid: we only support
>> +     * 4K - 2M - 1G mapping.
>> +     */
>> +    ASSERT(target <= 2);
>> +
>> +    table = p2m_get_root_pointer(p2m, gfn);
>> +    if ( !table )
>> +        return -EINVAL;
>> +
>> +    for ( level = P2M_ROOT_LEVEL; level > target; level-- )
>> +    {
>> +        /*
>> +         * Don't try to allocate intermediate page table if the mapping
>> +         * is about to be removed.
>> +         */
>> +        rc = p2m_next_level(p2m, !removing_mapping,
>> +                            level, &table, offsets[level]);
>> +        if ( (rc == P2M_TABLE_MAP_NONE) || (rc == P2M_TABLE_MAP_NOMEM) )
>> +        {
>> +            rc = (rc == P2M_TABLE_MAP_NONE) ? -ENOENT : -ENOMEM;
>> +            /*
>> +             * We are here because p2m_next_level has failed to map
>> +             * the intermediate page table (e.g. the table does not exist
>> +             * and the p2m tree is read-only).
>
> I thought I commented on this or something similar already: Calling the
> p2m tree "read-only" is imo misleading.

I will then change "read-only" to "not allocatable".

>> +             * It is a valid case
>> +             * when removing a mapping as it may not exist in the
>> +             * page table. In this case, just ignore it.
>
> I fear the "it" has no reference; aiui you mean "ignore the lookup
> failure", but the comment isn't worded to refer to that by "it".

I will update the comment accordingly.
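[Returning to the page orders mentioned above: a minimal self-contained
sketch of the order/level arithmetic, assuming 4KB base pages and
PAGETABLE_ORDER of 9 (512 entries per level); the names are illustrative,
not the actual Xen macros:]

/*
 * Standalone sketch (not the actual Xen code): how a mapping's page order
 * maps to a page-table level ("target") and a mapping size.
 */
#include <stdio.h>

#define PAGE_SHIFT      12 /* 4KB base pages */
#define PAGETABLE_ORDER 9  /* 512 entries per page-table level */

int main(void)
{
    /* Orders for 4K, 2M and 1G mappings respectively. */
    unsigned long orders[] = { 0, 9, 18 };

    for ( unsigned int i = 0; i < sizeof(orders) / sizeof(orders[0]); i++ )
    {
        unsigned long page_order = orders[i];
        unsigned int target = page_order / PAGETABLE_ORDER;
        unsigned long long size = 1ULL << (PAGE_SHIFT + page_order);

        /* target 0/1/2 corresponds to 4K/2M/1G, hence ASSERT(target <= 2). */
        printf("page_order %2lu -> target level %u, mapping size %llu bytes\n",
               page_order, target, size);
    }

    return 0;
}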
>> +            rc = removing_mapping ? 0 : rc;
>> +            goto out;
>> +        }
>> +
>> +        if ( rc != P2M_TABLE_NORMAL )
>> +            break;
>> +    }
>> +
>> +    entry = table + offsets[level];
>> +
>> +    /*
>> +     * If we are here with level > target, we must be at a leaf node,
>> +     * and we need to break up the superpage.
>> +     */
>> +    if ( level > target )
>> +    {
>> +        panic("Shattering isn't implemented\n");
>> +    }
>> +
>> +    /*
>> +     * We should always be there with the correct level because all the
>> +     * intermediate tables have been installed if necessary.
>> +     */
>> +    ASSERT(level == target);
>> +
>> +    orig_pte = *entry;
>> +
>> +    if ( removing_mapping )
>> +        p2m_clean_pte(entry, p2m->clean_dcache);
>> +    else
>> +    {
>> +        pte_t pte = p2m_pte_from_mfn(mfn, t);
>> +
>> +        p2m_write_pte(entry, pte, p2m->clean_dcache);
>> +
>> +        p2m->max_mapped_gfn = gfn_max(p2m->max_mapped_gfn,
>> +                                      gfn_add(gfn, BIT(page_order, UL) - 1));
>> +        p2m->lowest_mapped_gfn = gfn_min(p2m->lowest_mapped_gfn, gfn);
>> +    }
>> +
>> +    p2m->need_flush = true;
>> +
>> +    /*
>> +     * Currently, the infrastructure required to enable CONFIG_HAS_PASSTHROUGH
>> +     * is not ready for RISC-V support.
>> +     *
>> +     * When CONFIG_HAS_PASSTHROUGH=y, iommu_iotlb_flush() should be done
>> +     * here.
>> +     */
>> +#ifdef CONFIG_HAS_PASSTHROUGH
>> +# error "add code to flush IOMMU TLB"
>> +#endif
>> +
>> +    rc = 0;
>> +
>> +    /*
>> +     * Free the entry only if the original pte was valid and the base
>> +     * is different (to avoid freeing when permission is changed).
>> +     *
>> +     * If previously MFN 0 was mapped and it is going to be removed
>> +     * and considering that during removing MFN 0 is used then `entry`
>> +     * and `new_entry` will be the same and p2m_free_subtree() won't be
>> +     * called. This case is handled explicitly.
>> +     */
>> +    if ( pte_is_valid(orig_pte) &&
>> +         (!mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) ||
>> +          (removing_mapping && mfn_eq(pte_get_mfn(*entry), _mfn(0)))) )
>> +        p2m_free_subtree(p2m, orig_pte, level);
>
> I continue to fail to understand why the MFN would matter here. Isn't the
> need to free strictly tied to a VALID -> NOT VALID transition? A permission
> change simply retains the VALID state of an entry.

My understanding is that if, for the same GFN, the MFN changes from MFN_1 to
MFN_2, then we need to update any references on the page referenced by MFN_1.

It covers the case when a removal happens, and in this case we probably don't
need to check specifically for the mfn(0) case,
"mfn_eq(pte_get_mfn(*entry), _mfn(0))"; it would be enough to check
!pte_is_valid(entry) instead:
      ... (removing_mapping && !pte_is_valid(entry)))) )
Or only check the removing_mapping variable, as `entry` would be invalidated
by the code above anyway. So we will get:

+    if ( pte_is_valid(orig_pte) &&
+         (!mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) ||
+          removing_mapping) )
+        p2m_free_subtree(p2m, orig_pte, level);

Does it make sense now? (A minimal sketch of this condition is below.)
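[For illustration, a minimal self-contained model of the resulting freeing
condition, using simplified placeholder types and helpers rather than the
actual Xen ones:]

/* Standalone model (not the actual Xen code) of the freeing decision. */
#include <stdbool.h>
#include <stdio.h>

struct fake_pte { bool valid; unsigned long mfn; };

/*
 * Free the old subtree iff the original entry was valid and either the
 * MFN changed or the mapping is being removed; a permission-only change
 * keeps the same MFN and must not free.
 */
static bool should_free(struct fake_pte orig, struct fake_pte cur,
                        bool removing_mapping)
{
    return orig.valid && (cur.mfn != orig.mfn || removing_mapping);
}

int main(void)
{
    struct fake_pte orig = { true, 42 };

    /* Permission-only change: same MFN, not removing -> no free. */
    printf("perm change: %d\n",
           should_free(orig, (struct fake_pte){ true, 42 }, false));
    /* MFN changed: free the subtree referencing the old MFN. */
    printf("mfn change:  %d\n",
           should_free(orig, (struct fake_pte){ true, 7 }, false));
    /* Removal: the entry was invalidated; its MFN field may read as 0. */
    printf("removal:     %d\n",
           should_free(orig, (struct fake_pte){ false, 0 }, true));

    return 0;
}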
~ Oleksii