|
|
|
|
|
|
|
|
|
|
xen-devel
Re: [Xen-devel] [PATCH] x86: optimize this_cpu()
>>> On 13.07.10 at 16:26, Keir Fraser <keir.fraser@xxxxxxxxxxxxx> wrote:
> On 13/07/2010 14:35, "Jan Beulich" <JBeulich@xxxxxxxxxx> wrote:
>
>> Besides the .text space savings of over 2.5k on x86-64 (1.5k for
>> x86-32) this removes a load (plus a lea on x86-64) from various
>> frequently executed code paths, and finally provides a reason (other
>> than legibility) to prefer this_cpu() over per_cpu() in all places
>> where smp_processor_id() isn't being called anyway.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxxxx>
>>
>> --- 2010-06-15.orig/xen/include/asm-x86/current.h 2010-07-13
>> 14:38:21.000000000 +0200
>> +++ 2010-06-15/xen/include/asm-x86/current.h 2010-07-13 15:12:37.000000000
>> +0200
>> @@ -17,6 +17,10 @@ struct vcpu;
>> struct cpu_info {
>> struct cpu_user_regs guest_cpu_user_regs;
>> unsigned int processor_id;
>> + unsigned long per_cpu_offset;
>> +#ifdef __x86_64__
>> + unsigned long __pad_for_stack_bottom;
>> +#endif
>
> That's just nasty. If we need the structure to be 16-byte aligned then we
> should achieve it via __attribute__((__aligned__(16))). And if we add that
> we may as well not ifdef it, I'm sure the up to 12 bytes padding on i386
> won't cause stack overflow.
I indeed considered this apparently cleaner alternative first, but
no, __attribute__((__aligned__())) isn't the right solution here:
For one, the has no effect due to the way get_cpu_info()
calculates its result. Second, sizeof(struct cpu_info) would change
(to a value divisible by 16), and thus offsetof(struct cpu_info,
guest_cpu_user_regs.es) would become indivisible by 16
(triggering the BUG_ON((get_stack_bottom() & 15) != 0) in
cpu_init()) without any way of making it so again.
I found it quite odd that, without any special comment to that
effect, I couldn't just add a single field to struct cpu_info without
causing breakage. The apparently odd extra padding field at
least provides a slight hint towards issues here. A similar issue
is that there is a silent requirement of "current_vcpu" being the
last field...
Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
|
|
|
|
|