
Re: Pinned, non-revocable mappings of VRAM: will bad things happen?


  • To: Demi Marie Obenour <demiobenour@xxxxxxxxx>, Christian König <christian.koenig@xxxxxxx>, dri-devel@xxxxxxxxxxxxxxxxxxxxx, Xen developer discussion <xen-devel@xxxxxxxxxxxxxxxxxxxx>, linux-media@xxxxxxxxxxxxxxx
  • From: Val Packett <val@xxxxxxxxxxxxxxxxxxxxxx>
  • Date: Tue, 21 Apr 2026 13:55:14 -0300
  • Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx>, "Pelloux-Prayer, Pierre-Eric" <Pierre-eric.Pelloux-prayer@xxxxxxx>
  • Delivery-date: Tue, 21 Apr 2026 16:55:36 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 4/20/26 4:12 PM, Demi Marie Obenour wrote:
> On 4/20/26 14:53, Christian König wrote:
>> On 4/20/26 20:46, Demi Marie Obenour wrote:
>>> On 4/20/26 13:58, Christian König wrote:
>>>> On 4/20/26 19:03, Demi Marie Obenour wrote:
>>>>> On 4/20/26 04:49, Christian König wrote:
>>>>>> On 4/17/26 21:35, Demi Marie Obenour wrote:
>>>>>>> ...
>>>>>>> Are any of the following reasonable options?
>>>>>>>
>>>>>>> 1. Change the guest kernel to only map (and thus pin) a small subset
>>>>>>>    of VRAM at any given time.  If unmapped VRAM is accessed, the guest
>>>>>>>    traps the page fault, evicts an old VRAM mapping, and creates a
>>>>>>>    new one.
>>>>>> Yeah, that could potentially work.
>>>>>>
>>>>>> This is basically what we do in the host kernel driver when we can't
>>>>>> resize the BAR for some reason. In that use case, VRAM buffers are
>>>>>> shuffled in and out of the CPU-accessible window of VRAM on demand.
>>>>> How much is this going to hurt performance?
>>>> Hard to say; resizing the BAR can easily give you 10-15% more
>>>> performance in some use cases.
>>>>
>>>> But that involves physically transferring the data using a DMA. For
>>>> this solution we basically only have to transfer a few messages
>>>> between host and guest.
>>>>
>>>> No idea how performant that is.
>>> In this use case, 20-30% performance penalties are likely to be
>>> "business as usual".
>> Well, that is quite a bit.
>>
>>> Close to native performance would be ideal, but to be useful it just
>>> needs to beat software rendering by a wide margin, and not cause data
>>> corruption or vulnerabilities.
>> That should still easily be the case; even trivial use cases are
>> multiple orders of magnitude faster on GPUs compared to software
>> rendering.
> Makes sense.  If only GPUs supported easy and flexible virtualization
> the way CPUs do :(.
>
>>>>>> But I have one question: when Xen has a problem handling faults from
>>>>>> the guest on the host, how does that work for system memory mappings?
>>>>>>
>>>>>> There is really no difference between VRAM and system memory in the
>>>>>> handling in the GPU driver stack.
>>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>> Generally, Xen makes the frontend (usually an unprivileged VM)
>>>>> responsible for providing mappings to the backend (usually the host).
>>>>> That is possible with system RAM but not with VRAM, because Xen has
>>>>> no awareness of VRAM.  To Xen, VRAM is just a PCI BAR.
>>>> No, that doesn't work with system memory allocations of GPU drivers
>>>> either.
>>>>
>>>> We have already seen it multiple times that people tried to be clever
>>>> and incremented the page reference counter on driver-allocated system
>>>> memory, and were totally surprised that this can result in security
>>>> issues and data corruption.
>>>>
>>>> I seriously hope that this isn't the case here again. As far as I
>>>> know, Xen already has support for accessing VMAs with VM_PFNMAP;
>>>> otherwise I don't know how access to driver-allocated system memory
>>>> could work at all.
>>>>
>>>> Accessing VRAM is pretty much the same use case as far as I can see.
>>>>
>>>> Regards,
>>>> Christian.
>>> The Xen-native approach would be for system memory allocations to
>>> be made using the Xen driver and then imported into the virtio-GPU
>>> driver via dmabuf.  Is there any chance this could be made to happen?
>> That could be. Adding Pierre-Eric to comment, since he knows that use
>> case much better than I do.
>>
>>> If it's a lost cause, then how much is the memory overhead of pinning
>>> everything ever used in a dmabuf?  It should be possible to account
>>> pinned host memory against a guest's quota, but if that leads to an
>>> unusable system it isn't going to be good.
>> That won't work at all.
>>
>> We have use cases where you *must* migrate a DMA-buf to VRAM, or
>> otherwise the GPU can't use it.
>>
>> A simple scanout to a monitor is such a use case, for example; that is
>> usually not possible from system memory.
> Direct scanout isn't a concern here.
>
>>> Is supporting page faults in Xen the only solution that will be viable
>>> long-term, considering the tolerance for very substantial performance
>>> overheads compared to native?  AAA gaming isn't the initial goal here.
>>> Qubes OS already supports PCI passthrough for that.
>> We have had AAA gaming working on Xen through native contexts for quite
>> a while.
>>
>> Pierre-Eric can tell you more about that.
>>
>> Regards,
>> Christian.
> I've heard of that, but last I checked it required downstream patches
> to Xen, Linux, and QEMU.  I don't know if any of those have been
> upstreamed since, but I believe that upstreaming the Xen and Linux
> patches (or rewriting them and upstreaming the rewritten versions) would
> be necessary.  Qubes OS (which I don't work for anymore but still want
> to help with this) almost certainly won't be using QEMU for GPU stuff.

Yeah, our plan is to use xen-vhost-frontend[1] + vhost-device-gpu, ported/extended/modified as necessary. (I already have xen-vhost-frontend itself working on amd64 PVH with purely xenbus-based hotplug/configuration, and I'm currently working on cleaning up and submitting the necessary patches.)

I'm curious to hear more details about how AMD has it working, but last time I checked there weren't any missing pieces in Xen or Linux that we'd need. The AMD downstream changes were mostly related to QEMU.

As for the memory management concerns, I would like to remind everyone once again that the pinning of GPU dmabufs in regular graphics workloads would be *very* short-lived. In GPU paravirtualization (native contexts or venus or whatever else) the guest mostly operates on *opaque handles* that refer to buffers owned by the host GPU process. The typical rendering process (roughly) only involves submitting commands to the GPU that refer to memory using these handles. Only upon mmap() would a buffer be pinned/granted to the guest, and those mappings are typically only used for *uploads*, where the guest immediately does its memcpy() and unmaps the buffer.

So I'm not worried about (unintentionally) pinning too much GPU driver memory.

In terms of deliberate denial-of-service attacks from the guest against the host, the only reasonable response is:

¯\_(ツ)_/¯

CPU-mapping lots of GPU memory is far from the only DoS vector; the GPU commands themselves can easily wedge the GPU core in a million ways (and last time I checked, amdgpu was noooot so good at recovering from hangs).


[1]: https://github.com/vireshk/xen-vhost-frontend

~val
