[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2 15/17] xen/riscv: Implement superpage splitting for p2m mappings


  • To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 24 Jul 2025 09:58:17 +0200
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Alistair Francis <alistair.francis@xxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 24 Jul 2025 07:58:27 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 23.07.2025 21:51, Oleksii Kurochko wrote:
> 
> On 7/22/25 6:02 PM, Jan Beulich wrote:
>> On 22.07.2025 16:57, Oleksii Kurochko wrote:
>>> On 7/21/25 3:34 PM, Jan Beulich wrote:
>>>> On 17.07.2025 18:37, Oleksii Kurochko wrote:
>>>>> On 7/2/25 11:25 AM, Jan Beulich wrote:
>>>>>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>>>>>> Add support for down large memory mappings ("superpages") in the RISC-V
>>>>>>> p2m mapping so that smaller, more precise mappings ("finer-grained 
>>>>>>> entries")
>>>>>>> can be inserted into lower levels of the page table hierarchy.
>>>>>>>
>>>>>>> To implement that the following is done:
>>>>>>> - Introduce p2m_split_superpage(): Recursively shatters a superpage into
>>>>>>>      smaller page table entries down to the target level, preserving 
>>>>>>> original
>>>>>>>      permissions and attributes.
>>>>>>> - __p2m_set_entry() updated to invoke superpage splitting when inserting
>>>>>>>      entries at lower levels within a superpage-mapped region.
>>>>>>>
>>>>>>> This implementation is based on the ARM code, with modifications to the 
>>>>>>> part
>>>>>>> that follows the BBM (break-before-make) approach. Unlike ARM, RISC-V 
>>>>>>> does
>>>>>>> not require BBM, so there is no need to invalidate the PTE and flush the
>>>>>>> TLB before updating it with the newly created, split page table.
>>>>>> But some flushing is going to be necessary. As long as you only ever do
>>>>>> global flushes, the one after the individual PTE modification (within the
>>>>>> split table) will do (if BBM isn't required, see below), but once you 
>>>>>> move
>>>>>> to more fine-grained flushing, that's not going to be enough anymore. Not
>>>>>> sure it's a good idea to leave such a pitfall.
>>>>> I think that I don't fully understand what is an issue.
>>>> Whether a flush is necessary after solely breaking up a superpage is arch-
>>>> defined. I don't know how much restrictions the spec on possible hardware
>>>> behavior for RISC-V. However, the eventual change of (at least) one entry
>>>> of fulfill the original request will surely require a flush. What I was
>>>> trying to say is that this required flush would better not also cover for
>>>> the flushes that may or may not be required by the spec. IOW if the spec
>>>> leaves any room for flushes to possibly be needed, those flushes would
>>>> better be explicit.
>>> I think that I still don't understand why what I wrote above will work as 
>>> long
>>> as global flushes is working and will stop to work when more fine-grained 
>>> flushing
>>> is used.
>>>
>>> Inside p2m_split_superpage() we don't need any kind of TLB flush operation 
>>> as
>>> it is allocation a new page for a table and works with it, so no one could 
>>> use
>>> this page at the moment and cache it in TLB.
>>>
>>> The only question is that if it is needed BBM before staring using splitted 
>>> entry:
>>>           ....
>>>           if ( !p2m_split_superpage(p2m, &split_pte, level, target, 
>>> offsets) )
>>>           {
>>>               /* Free the allocated sub-tree */
>>>               p2m_free_subtree(p2m, split_pte, level);
>>>
>>>               rc = -ENOMEM;
>>>               goto out;
>>>           }
>>>
>>> ------> /* Should be BBM used here ? */
>>>           p2m_write_pte(entry, split_pte, p2m->clean_pte);
>>>
>>> And I can't find anything in the spec what tells me to use BBM here like Arm
>>> does:
>> But what you need is a statement in the spec that you can get away without. 
>> Imo
>> at least.
> 
> In the spec. it is mentioned that:
>    It is permitted for multiple address-translation cache entries to co-exist 
> for the same
>    address. This represents the fact that in a conventional TLB hierarchy, it 
> is possible for
>    multiple entries to match a single address if, for example, a page is 
> upgraded to a
>    superpage without first clearing the original non-leaf PTE’s valid bit and 
> executing an
>    SFENCE.VMA with rs1=x0, or if multiple TLBs exist in parallel at a given 
> level of the
>    hierarchy. In this case, just as if an SFENCE.VMA is not executed between 
> a write to the
>    memory-management tables and subsequent implicit read of the same address: 
> it is
>    unpredictable whether the old non-leaf PTE or the new leaf PTE is used, 
> but the behavior is
>    otherwise well defined.
> The phrase*"but the behavior is otherwise well defined"* emphasizes that even 
> if the TLB sees
> two versions (the old and the new), the architecture guarantees stability, 
> and the behavior
> remains safe — though unpredictable in terms of which translation will be 
> used.
> And I think that this unpredictability is okay, at least, in the case if 
> superpage splitting
> and therefore TLB flushing can be deferred since the old pages (which are 
> used for old mapping)
> still exist and the permissions of the new entries match those of the 
> original ones.
> Also, it seems like there clearing PTE before TLB flushing isn't need too.
> 
> Does it make sense?

Yes, I think this indeed is sufficient as a spec requirement.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.