
Re: Assertion '!is_idle_vcpu(v)' failed after 'Remove fully_eager_fpu' commit on EFI


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 12 Jun 2026 16:45:41 +0200
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>
  • Delivery-date: Fri, 12 Jun 2026 14:45:51 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 12.06.2026 16:32, Andrew Cooper wrote:
> On 12/06/2026 3:20 pm, Jan Beulich wrote:
>> On 12.06.2026 16:18, Andrew Cooper wrote:
>>> On 12/06/2026 3:11 pm, Marek Marczykowski-Górecki wrote:
>>>> On Fri, Jun 12, 2026 at 03:53:49PM +0200, Anthony PERARD wrote:
>>>>> Hi,
>>>>>
>>>>> Since commit dba44e051209 ("x86: Remove fully_eager_fpu"), I can't boot
>>>>> a machine and get assertion '!is_idle_vcpu(v)' failed instead. It's
>>>>> netbooted and EFI.
>>>>>
>>>>> Xen call trace:
>>>>>    [<ffff82d04033da2c>] R vcpu_save_fpu+0x65/0xdc
>>>>>    [<ffff82d04029c5c4>] S efi_rs_enter+0x37/0x16a
>>>>>    [<ffff82d04029c7e3>] F efi_get_time+0x19/0xb2
>>>>>    [<ffff82d04047cbf0>] F init_xen_time+0x1e3/0x2b4
>>>>>    [<ffff82d040477a49>] F __start_xen+0x1d71/0x24b8
>>>>>    [<ffff82d0402043e7>] F __high_start+0xb7/0xc0
>>>>>
>>>>> Assertion '!is_idle_vcpu(v)' failed at arch/x86/i387.c:195
>>>>>
>>>>> A few more lines from Xen:
>>>>>     CPU Vendor: Intel, Family 6 (0x6), Model 86 (0x56), Stepping 3 (raw 00050663)
>>>>>     Bootloader: GRUB 2.06
>>>>>     [...]
>>>>>     Enabling APIC mode.  Using 2 I/O APICs
>>>>>     ENABLING IO-APIC IRQs
>>>>>      -> Using old ACK method
>>>>>      ..TIMER: vector=0xF0 apic1=0 pin1=2 apic2=-1 pin2=-1
>>>>>     TSC deadline timer enabled
>>>>>     Assertion '!is_idle_vcpu(v)' failed at arch/x86/i387.c:195
>>>>>
>>>>> Commit this Xen is built from: 50936ea05660.
>>>> Interesting, the efi_get_time() way is nowadays a fallback if cmos one
>>>> isn't advertised. Can you try adding `cmos-rtc-probe`?
>>>>
>>>> Anyway, surely it shouldn't crash... The commit you mentioned has "No
>>>> functional change intended", but well...
>>> Well, no intended change.  It was a very big patch.
>>>
>>> Nothing should ever be using efi_get_time().  It's unusable (i.e.
>>> crashing) on hundreds of millions of machines.
>>>
>>> So, while we obviously do need to fix the assertion, this is "only"
>>> collateral damage from having fallen into the efi_get_time() path in the
>>> first place.  That wants investigating too.
>> Perhaps a reduced-hardware system with ACPI_FADT_NO_CMOS_RTC set?
> 
> The identified system is a Broadwell-D.
> 
> Come to think of it, there were some systems of that era which (falsely)
> claimed to have no CMOS.  (An HP Haswell Blade comes to mind, but it
> will be a similar chipset.)
> 
>> On such systems efi_get_time() would better work properly.
> 
> Wouldn't that have been nice.  On the bug I looked at at the time, it
> was just as broken as prior systems.
> 
> It's a vicious positive feedback cycle.  Windows and Linux ignore
> efi_get_time() entirely because it's broken in a way you can't probe
> for, and as a result the codepath gets 0 testing by OEMs/ISVs and nothing
> gets fixed.

Do Linux and Windows then ignore ACPI_FADT_NO_CMOS_RTC on such systems? Else
how would they establish wallclock time there?

Jan
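
For readers following the thread, the fallback being discussed can be sketched
roughly as below. This is a simplified model, not Xen's actual code: the enum,
the function name pick_wallclock_source() and the force_probe parameter are
hypothetical, and the sketch assumes that a forced probe (as with the
cmos-rtc-probe option mentioned above) always finds a working RTC. Only the
ACPI_FADT_NO_CMOS_RTC bit position is taken from the ACPI FADT definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 5 of the FADT IA-PC Boot Architecture Flags: "CMOS RTC Not Present". */
#define ACPI_FADT_NO_CMOS_RTC (1u << 5)

/* Hypothetical names, for illustration only. */
enum wallclock_source {
    WALLCLOCK_CMOS_RTC,     /* read the legacy CMOS RTC */
    WALLCLOCK_EFI_GET_TIME, /* fall back to the EFI runtime service */
};

/*
 * boot_flags:  FADT IA-PC boot architecture flags as reported by firmware.
 * force_probe: models an override that probes for the CMOS RTC even when
 *              firmware claims there is none (assumed to succeed here).
 */
static enum wallclock_source pick_wallclock_source(uint16_t boot_flags,
                                                   bool force_probe)
{
    /* Use the RTC unless firmware says it is absent and nobody overrides. */
    if ( !(boot_flags & ACPI_FADT_NO_CMOS_RTC) || force_probe )
        return WALLCLOCK_CMOS_RTC;

    return WALLCLOCK_EFI_GET_TIME;
}
```

In this model, a system whose firmware sets the flag (correctly or not) ends up
on the EFI path unless the override is given, which matches the situation
described in the report.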



 

