|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 4/6] x86/idle: Implement a new MWAIT IPI-elision algorithm
On 03/07/2025 3:07 pm, Jan Beulich wrote:
> On 03.07.2025 13:59, Andrew Cooper wrote:
>> On 03/07/2025 10:01 am, Jan Beulich wrote:
>>> On 02.07.2025 16:41, Andrew Cooper wrote:
>>>> @@ -452,6 +466,10 @@ void mwait_idle_with_hints(unsigned int eax, unsigned
>>>> int ecx)
>>>> mwait(eax, ecx);
>>>> spec_ctrl_exit_idle(info);
>>>> }
>>>> +
>>>> + alternative_io("movb $0, %[in_mwait]",
>>>> + "", X86_BUG_MONITOR,
>>>> + [in_mwait] "=m" (stat->in_mwait));
>>> This doesn't strictly need to be an alternative, does it? We can save a
>>> memory write in the buggy case, but that likely makes at most a marginal
>>> difference.
>> It doesn't *need* to be an alternative. It could be:
>>
>> if ( !cpu_bug_monitor )
>> ACCESS_ONCE(stat->in_mwait) = true;
>>
>> but getting rid of the branch is an advantage too.
> That's the other alternative blob. What I mean that here it could simply
> be
>
> ACCESS_ONCE(stat->in_mwait) = false;
>
> without any conditional.
It could. Or it could be consistent with it's other half.
>
>>>> --- a/xen/arch/x86/include/asm/hardirq.h
>>>> +++ b/xen/arch/x86/include/asm/hardirq.h
>>>> @@ -5,7 +5,19 @@
>>>> #include <xen/types.h>
>>>>
>>>> typedef struct {
>>>> - unsigned int __softirq_pending;
>>>> + /*
>>>> + * The layout is important. Any CPU can set bits in
>>>> __softirq_pending,
>>>> + * but in_mwait is a status bit owned by the CPU. softirq_mwait_raw
>>>> must
>>>> + * cover both, and must be in a single cacheline.
>>>> + */
>>>> + union {
>>>> + struct {
>>>> + unsigned int __softirq_pending;
>>>> + bool in_mwait;
>>>> + };
>>>> + uint64_t softirq_mwait_raw;
>>>> + };
>>> To guard against someone changing e.g. __softirq_pending to unsigned long
>>> while ignoring the comment, how about adding a BUILD_BUG_ON() checking both
>>> parts of the union are the same size (which of course would require naming
>>> either the union field within the containing struct or its struct member)?
>> That turns out to be very hard because of the definition of
>> softirq_pending() being common. More below.
> Hmm, yes, I see. Then maybe
>
> BUILD_BUG_ON(offsetof(irq_cpustat_t, in_mwait) +
> sizeof(irq_stat[0].in_mwait) >
> offsetof(irq_cpustat_t, softirq_mwait_raw) +
> sizeof(irq_stat[0].softirq_mwait_raw));
>
> (or something substantially similar using e.g. typeof())?
>
>>>> @@ -9,4 +11,36 @@
>>>> #define HVM_DPCI_SOFTIRQ (NR_COMMON_SOFTIRQS + 4)
>>>> #define NR_ARCH_SOFTIRQS 5
>>>>
>>>> +/*
>>>> + * Ensure softirq @nr is pending on @cpu. Return true if an IPI can be
>>>> + * skipped, false if the IPI cannot be skipped.
>>>> + *
>>>> + * We use a CMPXCHG covering both __softirq_pending and in_mwait, in
>>>> order to
>>>> + * set softirq @nr while also observing in_mwait in a race-free way.
>>>> + */
>>>> +static always_inline bool arch_pend_softirq(unsigned int nr, unsigned int
>>>> cpu)
>>>> +{
>>>> + uint64_t *ptr = &irq_stat[cpu].softirq_mwait_raw;
>>>> + uint64_t old, new;
>>>> + unsigned int softirq = 1U << nr;
>>>> +
>>>> + old = ACCESS_ONCE(*ptr);
>>>> +
>>>> + do {
>>>> + if ( old & softirq )
>>>> + /* Softirq already pending, nothing to do. */
>>>> + return true;
>>>> +
>>>> + new = old | softirq;
>>>> +
>>>> + } while ( (old = cmpxchg(ptr, old, new)) != new );
>>> Don't you mean
>>>
>>> } while ( (new = cmpxchg(ptr, old, new)) != old );
>>>
>>> here
>> No. I'm pretty sure I don't.
>>
>>> (with new latched ahead of the loop and old set from new inside it),
>>> like we have it elsewhere when we avoid the use of yet another variable,
>>> e.g. in inc_linear_{entries,uses}()?
>> Eww, no. Having new and old backwards like that is borderline
>> obfucation, and is unnecessary cognitive complexity for the reader.
> Why backwards? You want to compare the (original) value read against the
> expected old value. The way you wrote it you'll do at least one extra
> loop iteration, as you wait for the fetched (original) value to equal
> "new".
Hmm, yes, that's not quite what I intended, but I'm also not happy
writing anything of the form "new = cmpxchg()". It's plain wrong for
the semantics of cmpxchg.
~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |