Re: Assertion '!is_idle_vcpu(v)' failed after 'Remove fully_eager_fpu' commit on EFI
On 12.06.2026 16:32, Andrew Cooper wrote:
> On 12/06/2026 3:20 pm, Jan Beulich wrote:
>> On 12.06.2026 16:18, Andrew Cooper wrote:
>>> On 12/06/2026 3:11 pm, Marek Marczykowski-Górecki wrote:
>>>> On Fri, Jun 12, 2026 at 03:53:49PM +0200, Anthony PERARD wrote:
>>>>> Hi,
>>>>>
>>>>> Since commit dba44e051209 ("x86: Remove fully_eager_fpu"), I can't boot
>>>>> a machine and get assertion '!is_idle_vcpu(v)' failed instead. It's
>>>>> netbooted and EFI.
>>>>>
>>>>> Xen call trace:
>>>>> [<ffff82d04033da2c>] R vcpu_save_fpu+0x65/0xdc
>>>>> [<ffff82d04029c5c4>] S efi_rs_enter+0x37/0x16a
>>>>> [<ffff82d04029c7e3>] F efi_get_time+0x19/0xb2
>>>>> [<ffff82d04047cbf0>] F init_xen_time+0x1e3/0x2b4
>>>>> [<ffff82d040477a49>] F __start_xen+0x1d71/0x24b8
>>>>> [<ffff82d0402043e7>] F __high_start+0xb7/0xc0
>>>>>
>>>>> Assertion '!is_idle_vcpu(v)' failed at arch/x86/i387.c:195
>>>>>
>>>>> A few more lines from Xen:
>>>>> CPU Vendor: Intel, Family 6 (0x6), Model 86 (0x56), Stepping 3 (raw
>>>>> 00050663)
>>>>> Bootloader: GRUB 2.06
>>>>> [...]
>>>>> Enabling APIC mode. Using 2 I/O APICs
>>>>> ENABLING IO-APIC IRQs
>>>>> -> Using old ACK method
>>>>> ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
>>>>> TSC deadline timer enabled
>>>>> Assertion '!is_idle_vcpu(v)' failed at arch/x86/i387.c:195
>>>>>
>>>>> Commit this Xen is built from: 50936ea05660.
>>>> Interesting, the efi_get_time() way is nowadays a fallback if cmos one
>>>> isn't advertised. Can you try adding `cmos-rtc-probe`?
>>>>
>>>> Anyway, surely it shouldn't crash... The commit you mentioned has "No
>>>> functional change intended", but well...
>>> Well, no intended change. It was a very big patch.
>>>
>>> Nothing should ever be using efi_get_time(). It's unusable (i.e.
>>> crashing) on hundreds of millions of machines.
>>>
>>> So, while we obviously do need to fix the assertion, this is "only"
>>> collateral damage from having fallen into the efi_get_time() path in the
>>> first place. That wants investigating too.
>> Perhaps a reduced-hardware system with ACPI_FADT_NO_CMOS_RTC set?
>
> The identified system is a Broadwell-D.
>
> Come to think of it, there were some systems of that era which (falsely)
> claimed to have no CMOS. (An HP Haswell Blade comes to mind, but it
> will be a similar chipset.)
>
>> On such systems efi_get_time() would better work properly.
>
> Wouldn't that have been nice. On the bug I looked at at the time, it
> was just as broken as prior systems.
>
> It's a vicious positive feedback cycle. Windows and Linux ignore
> efi_get_time() entirely because it's broken in a way you can't probe
> for, and as a result the codepath gets 0 testing by OEMs/ISVs and nothing
> gets fixed.
Do Linux and Windows then ignore ACPI_FADT_NO_CMOS_RTC on such systems? Else
how would they establish wallclock time there?
Jan
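
For readers following the thread, the wallclock-source decision being discussed can be sketched roughly as below. This is only an illustration of the logic as described in the messages above (CMOS RTC unless the FADT says there is none, with `cmos-rtc-probe` overriding that, and EFI GetTime as the fallback), not Xen's actual code. The function `pick_wallclock_source()`, the enum, and the parameters are hypothetical names; only the flag `ACPI_FADT_NO_CMOS_RTC` (bit 5 of the FADT IA-PC boot flags) comes from the ACPI headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 5 of the FADT IA-PC boot architecture flags (name as in ACPICA). */
#define ACPI_FADT_NO_CMOS_RTC (1u << 5)

/* Hypothetical enum for this sketch; not a Xen type. */
enum wallclock_source { WC_CMOS, WC_EFI };

/*
 * Hypothetical helper modelling the choice discussed in the thread:
 *  - boot_flags:       FADT IA-PC boot flags as reported by firmware
 *  - force_cmos_probe: models booting with `cmos-rtc-probe`
 *  - cmos_probe_ok:    whether a probe of the CMOS RTC found a working clock
 */
static enum wallclock_source
pick_wallclock_source(uint16_t boot_flags, bool force_cmos_probe,
                      bool cmos_probe_ok)
{
    /* Firmware does not claim the RTC is absent: use CMOS. */
    if ( !(boot_flags & ACPI_FADT_NO_CMOS_RTC) )
        return WC_CMOS;

    /* Firmware claims "no CMOS RTC", but we were asked to probe anyway. */
    if ( force_cmos_probe && cmos_probe_ok )
        return WC_CMOS;

    /* Otherwise fall back to the EFI runtime service (efi_get_time() path). */
    return WC_EFI;
}
```

Under this model, a system that (perhaps falsely) sets ACPI_FADT_NO_CMOS_RTC ends up on the EFI path unless the probe is forced and succeeds, which matches the call trace in the original report.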