Re: Graphical glitches (not refreshing?) with Linux's xe driver + Xen 4.19
On Wed, Jun 17, 2026 at 10:30:08PM +0200, Marek Marczykowski-Górecki wrote:
> On Mon, Mar 02, 2026 at 12:19:04PM +0100, Marek Marczykowski-Górecki wrote:
> > On Tue, Feb 24, 2026 at 04:58:25PM +0100, Marek Marczykowski-Górecki wrote:
> > > On Fri, Feb 13, 2026 at 02:23:06AM +0100, Marek Marczykowski-Górecki wrote:
> > > > On Thu, Feb 12, 2026 at 04:11:50PM +0100, Roger Pau Monné wrote:
> > > > > On Tue, Feb 10, 2026 at 07:06:20PM +0100, Marek Marczykowski-Górecki wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Recently I started testing compatibility with Intel Lunar Lake. This is
> > > > > > the first one that uses "xe" instead of "i915" Linux driver for iGPU.
> > > > > > I test it with Qubes OS 4.3, which uses Xen 4.19.4 and PV dom0 running
> > > > > > Linux 6.17.9 in this test.
> > > > >
> > > > > Not sure it's going to help a lot, but does using a PVH dom0 make any
> > > > > difference?
> > > >
> > > > Ok, now with the correct Xen version, it's better with PVH dom0. At
> > > > least on the login screen and few applications (from both dom0 and domU)
> > > > I don't see the glitches anymore. I can't do a full test, because PCI
> > > > passthrough doesn't seem to work with PVH dom0 on Xen 4.19 - and I need
> > > > it to start most VMs.
> > > >
> > > > So, if the above test is representative, it's only about PV dom0.
> > >
> > > Some further observations:
> > >
> > > 1. My initial impression that Xen 4.17.6 is not affected is false.
> > > Apparently I got lucky and didn't wait long enough for glitches to
> > > appear. Unfortunately this means I have no way to bisect this...
> > >
> > > 1a. Updated test procedure - either:
> > > - start Qubes OS in full (including default system domUs) and try to
> > >   open an app in one of them (for example file manager or pdf viewer)
> > > - start Linux up to lightdm login page, log in, log out, click on a
> > >   few lightdm menus (session type selector, poweroff menu etc)
> > >
> > > The second version works even if toolstack version in dom0 doesn't match
> > > Xen version. If no glitches are observed after doing either of those
> > > procedures, assume it's good.
> > >
> > > 2. Xen staging is affected too. As well as Xen staging-4.19 without
> > > any qubes patches.
> > >
> > > 3. After enabling CONFIG_DEBUG in Xen, the xe.ko fails to load firmware:
> > >
> > > xe 0000:00:02.0: [drm] Tile0: GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: load failed: status = 0x40000056, time = 0ms, freq = 1850MHz (req 1850MHz), done = -1
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: load failed: status: Reset = 0, BootROM = 0x2B, UKernel = 0x00, MIA = 0x00, Auth = 0x01
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: firmware production part check failure
> > > xe 0000:00:02.0: [drm] *ERROR* Tile0: GT0: Failed to initialize uC (-EPROTO)
> > > xe 0000:00:02.0: probe with driver xe failed with error -71
> > >
> > > CONFIG_DEBUG is the only change between "xe.ko loads fine but there are
> > > glitches later on" and "xe.ko fails to load at all". Full console logs:
> > > https://gist.github.com/marmarek/47b5e62a2cdbae6678c2aecc5283cd3f, there
> > > are 3 files:
> > > - CONFIG_DEBUG=n
> > > - CONFIG_DEBUG=y
> > > - CONFIG_DEBUG=y + iommu=debug
> > >
> > > 4. Updating to Linux 7.0-rc1 doesn't help, for example:
> > > https://openqa.qubes-os.org/tests/168119#step/desktop_linux_manager_create_qube/11
> > >
> > > Generally, it does feel like a bug in xe.ko, but I can't exclude some issue
> > > on Xen side too (especially given point 3 above).
> >
> > After waiting some time (Linux 6.19.5 this time), Xen CONFIG_DEBUG=n, I get
> > some timeout messages:
> >
> > [    8.122120] xe 0000:00:02.0: [drm] [ENCODER:204:DDI A/PHY A] failed to retrieve link info, disabling eDP
> > [    8.148476] xe 0000:00:02.0: [drm] Tile0: GT0: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
> > [    8.803845] xe 0000:00:02.0: [drm] Tile0: GT0: ccs1 fused off
> > [    8.804208] xe 0000:00:02.0: [drm] Tile0: GT0: ccs2 fused off
> > [    8.804556] xe 0000:00:02.0: [drm] Tile0: GT0: ccs3 fused off
> > [    8.822426] xe 0000:00:02.0: [drm] Tile0: GT1: Using GuC firmware from xe/lnl_guc_70.bin version 70.53.0
> > [    8.827140] xe 0000:00:02.0: [drm] Tile0: GT1: Using HuC firmware from xe/lnl_huc.bin version 9.4.13
> > [    8.829478] xe 0000:00:02.0: [drm] Tile0: GT1: Using GSC firmware from xe/lnl_gsc_1.bin version 104.0.5.1429
> > [    8.852923] xe 0000:00:02.0: [drm] Tile0: GT1: vcs1 fused off
> > [    8.853513] xe 0000:00:02.0: [drm] Tile0: GT1: vcs2 fused off
> > [    8.854090] xe 0000:00:02.0: [drm] Tile0: GT1: vcs3 fused off
> > [    8.854706] xe 0000:00:02.0: [drm] Tile0: GT1: vcs4 fused off
> > [    8.855310] xe 0000:00:02.0: [drm] Tile0: GT1: vcs5 fused off
> > [    8.855904] xe 0000:00:02.0: [drm] Tile0: GT1: vcs6 fused off
> > [    8.856495] xe 0000:00:02.0: [drm] Tile0: GT1: vcs7 fused off
> > [    8.857079] xe 0000:00:02.0: [drm] Tile0: GT1: vecs1 fused off
> > [    8.857675] xe 0000:00:02.0: [drm] Tile0: GT1: vecs2 fused off
> > [    8.858272] xe 0000:00:02.0: [drm] Tile0: GT1: vecs3 fused off
> > [    8.975881] xe 0000:00:02.0: [drm] Registered 3 planes with drm panic
> > [    8.976586] [drm] Initialized xe 1.1.0 for 0000:00:02.0 on minor 0
> > [    8.980882] ACPI: video: Video Device [GFX0] (multi-head: yes rom: no post: no)
> > [    9.033754] xe 0000:00:02.0: [drm] Tile0: GT1: found GSC cv104.1.0
> > ...
> > [ 1218.319232] xe 0000:00:02.0: [drm] Tile0: GT0: Engine reset: engine_class=rcs, logical_mask: 0x1, guc_id=3
> > [ 1218.319890] xe 0000:00:02.0: [drm] Tile0: GT0: Timedout job: seqno=9883, lrc_seqno=9883, guc_id=3, flags=0x0 in Xorg [3245]
> > [ 1218.320736] xe 0000:00:02.0: [drm] Xe device coredump has been created
> > [ 1218.321140] xe 0000:00:02.0: [drm] Check your /sys/class/drm/card0/device/devcoredump/data
> > [ 1222.285626] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
> > [ 1232.525685] xe 0000:00:02.0: [drm] *ERROR* flip_done timed out
> > [ 1232.526280] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] commit wait timed out
> > [ 1242.765717] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
> > [ 1253.005696] xe 0000:00:02.0: [drm] *ERROR* flip_done timed out
> > [ 1253.006248] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] commit wait timed out
> > [ 1263.245599] xe 0000:00:02.0: [drm] *ERROR* [CRTC:88:pipe A] flip_done timed out
> >
> > The glitches appear much earlier, though.
> > Would content of /sys/class/drm/card0/device/devcoredump/data be useful
> > for debugging this?

Yes, it would. Jobs hanging can be a bug anywhere in the stack (e.g.,
hardware bug, KMD bug, UMD bug, application bug, etc.) but the
devcoredump would give us some hints.

> >
> > Full log at https://openqa.qubes-os.org/tests/168813/file/serial0.txt
> > (warning, almost 200MB of those errors...)
>
> The issue still happens with Linux 7.0.12. Current log (quite similar to
> the previous one):
> https://openqa.qubes-os.org/tests/184602/logfile?filename=serial0.txt

Hmm, the 'not started' messages in the dmesg are a bit concerning, as
this really shouldn't be possible to trigger even if user space is doing
something wrong. Can you file a gitlab issue against Xe here:

https://gitlab.freedesktop.org/drm/xe/kernel/issues/new

TBH, I have no idea if running Xen / Qubes OS + Xe is something anyone
at Intel has tried out, so please include instructions on how to
reproduce, and we will see if someone on the engineering team can take a
look at this and, if issues in the Xe KMD exist, try to get these fixed.

Matt

> > Not long after GPU errors, nvme driver fails due to full swiotlb.
> > Any ideas?
>
> --
> Best Regards,
> Marek Marczykowski-Górecki
> Invisible Things Lab
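
Since the serial logs linked in this thread run to roughly 200MB, a short
summary of which xe messages occur, and when each first shows up, may be
easier to attach to a gitlab issue than the raw file. The following is a
minimal Python sketch of that idea, not an existing tool. The function name
summarize_xe_log and the category labels are invented for illustration, and
the regular expressions are derived only from the excerpts quoted in this
thread, not from the xe source, so messages with different wording (such as
the 'not started' ones, whose exact text is not quoted here) are not counted.

```python
import re
from collections import Counter

# Message fragments taken from the log excerpts quoted in this thread.
# The category names on the left are illustrative labels, not xe terminology.
PATTERNS = {
    "guc_load_failed": re.compile(r"load failed: status = 0x[0-9a-fA-F]+"),
    "engine_reset": re.compile(r"Engine reset: engine_class=\w+"),
    "timedout_job": re.compile(r"Timedout job: seqno=\d+"),
    "flip_done_timeout": re.compile(r"flip_done timed out"),
    "commit_wait_timeout": re.compile(r"commit wait timed out"),
}

# Kernel timestamp prefix such as "[ 1218.319232]".
TIMESTAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def summarize_xe_log(lines):
    """Count known xe messages and record when each kind first appears.

    `lines` is any iterable of log lines, so a large file can be streamed.
    Returns (counts, first_seen), where first_seen maps a category to the
    kernel timestamp (in seconds) of the first matching line that has one.
    """
    counts = Counter()
    first_seen = {}
    for line in lines:
        if "[drm]" not in line:
            continue  # skip anything that is not a DRM message
        for name, pattern in PATTERNS.items():
            if not pattern.search(line):
                continue
            counts[name] += 1
            if name not in first_seen:
                stamp = TIMESTAMP.search(line)
                if stamp:
                    first_seen[name] = float(stamp.group(1))
    return counts, first_seen
```

Passing an open file object, for example one opened with
open("serial0.txt", errors="replace"), streams the log line by line instead
of loading it into memory; errors="replace" is there on the assumption that
a serial console capture may contain bytes that are not valid UTF-8.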