Re: Pinned, non-revocable mappings of VRAM: will bad things happen?
On 4/20/26 19:03, Demi Marie Obenour wrote:
> On 4/20/26 04:49, Christian König wrote:
>> On 4/17/26 21:35, Demi Marie Obenour wrote:
>>> ...
>>> Are any of the following reasonable options?
>>>
>>> 1. Change the guest kernel to only map (and thus pin) a small subset
>>>    of VRAM at any given time. If unmapped VRAM is accessed, the guest
>>>    traps the page fault, evicts an old VRAM mapping, and creates a
>>>    new one.
>>
>> Yeah, that could potentially work.
>>
>> This is basically what we do in the host kernel driver when we can't
>> resize the BAR for some reason. In that use case VRAM buffers are
>> shuffled in and out of the CPU-accessible window of VRAM on demand.
>
> How much is this going to hurt performance?

Hard to say. Resizing the BAR can easily give you 10-15% more performance
in some use cases, but that involves physically transferring the data
using DMA. For this solution we basically only have to transfer a few
messages between host and guest. No idea how performant that is.

>>> 2. Pretend that resizable BAR is not enabled, so the guest doesn't
>>>    think it can map much of VRAM at once. If resizable BAR is enabled
>>>    on the host, it might be possible to split the large BAR mapping
>>>    in a lot of ways.
>>
>> That won't work. The userspace parts of the driver stack don't care
>> how large the BAR used to access VRAM with the CPU is.
>>
>> The expectation is that the kernel driver makes things CPU-accessible
>> as needed in the page fault handler.
>>
>> It is still a good idea for your solution #1 to report the amount of
>> "pin-able" VRAM to the userspace stack as the CPU-visible VRAM limit,
>> so that test cases and applications try to lower their usage of VRAM,
>> e.g. use system memory bounce buffers when possible.
>
> That makes sense.
>
>>> Or does Xen really need to allow the host to handle guest page faults?
>>> That adds a huge amount of complexity to trusted and security-critical
>>> parts of the system, so it really is a last resort. Putting the
>>> complexity into the guest virtio-GPU driver is vastly preferable if
>>> it can be made to work well.
>>
>> Well, the nested page fault handling KVM offers has proven to be
>> extremely useful. So if Xen can't do this, it is clearly lacking an
>> important feature.
>
> I agree. However, it is a lot of work to implement, which is why I'm
> looking for alternatives if possible.
>
> KVM is part of the Linux kernel, so it can just call the Linux kernel
> functions used to handle userspace page faults. Xen is separate from
> Linux, so it can't do that. Instead, it will need to:
>
> 1. Determine that the fault needs to be handled by another VM, and
>    the ID of the VM that needs to handle the fault.
> 2. Send a message to that VM asking it to handle the fault.
> 3. Block the vCPU until it gets a response.
>
> Then the VM owning the memory will need to call the page fault handler
> and provide the memory to Xen. Xen then needs to:
>
> 4. Map the memory into the nested page tables of the VM that faulted.
> 5. Resume the vCPU.
>
>> But I have one question: if Xen has a problem handling faults from the
>> guest on the host, then how does that work for system memory mappings?
>>
>> There is really no difference between VRAM and system memory in the
>> handling for the GPU driver stack.
>>
>> Regards,
>> Christian.
>
> Generally, Xen makes the frontend (usually an unprivileged VM)
> responsible for providing mappings to the backend (usually the host).
> That is possible with system RAM but not with VRAM, because Xen has
> no awareness of VRAM. To Xen, VRAM is just a PCI BAR.

No, that doesn't work with system memory allocations of GPU drivers
either.
We already had it multiple times that people tried to be clever and
incremented the page reference counter on driver-allocated system memory,
and then were totally surprised that this can result in security issues
and data corruption. I seriously hope that this isn't the case here
again.

As far as I know, Xen already has support for accessing VMAs with
VM_PFNMAP; otherwise I don't know how access to driver-allocated system
memory could work at all. Accessing VRAM is pretty much the same use case
as far as I can see.

Regards,
Christian.

> KVM runs in the same kernel as the GPU driver. Xen doesn't, and that
> is the source of the extra complexity.