
To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH 00/12] cpumask handling scalability improvements
From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Date: Thu, 20 Oct 2011 16:49:22 +0100
In-reply-to: <4EA0583B020000780005C8A8@xxxxxxxxxxxxxxxxxxxx>
References: <4EA03FF4020000780005C797@xxxxxxxxxxxxxxxxxxxx> <CAC5F83E.235C3%keir.xen@xxxxxxxxx> <4EA0583B020000780005C8A8@xxxxxxxxxxxxxxxxxxxx>

On 20/10/11 16:19, Jan Beulich wrote:
>>>> On 20.10.11 at 17:09, Keir Fraser <keir.xen@xxxxxxxxx> wrote:
>> On 20/10/2011 14:36, "Jan Beulich" <JBeulich@xxxxxxxx> wrote:
>>
>>> This patch set makes some first steps towards eliminating the old cpumask
>>> accessors, replacing them with ones that don't require the full NR_CPUS
>>> bits to be allocated (which is obviously pretty wasteful when NR_CPUS is
>>> high but the actual number of CPUs is low or moderate).
>>>
>>> 01: introduce and use nr_cpu_ids and nr_cpumask_bits
>>> 02: eliminate cpumask accessors referencing NR_CPUS
>>> 03: eliminate direct assignments of CPU masks
>>> 04: x86: allocate IRQ actions' cpu_eoi_map dynamically
>>> 05: allocate CPU sibling and core maps dynamically
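
(To make the first few patches concrete, here is a rough sketch of the
accessor change being described -- using the Linux-derived names these bits
come from; the exact Xen signatures may differ:)

    /* Old style: every mask statically embeds NR_CPUS bits, i.e. 512
     * bytes per mask when NR_CPUS=4096, however few CPUs are present. */
    cpumask_t mask;
    cpus_clear(mask);
    cpu_set(cpu, mask);

    /* New style: a cpumask_var_t is allocated to cover only
     * nr_cpumask_bits (nr_cpu_ids rounded up), e.g. 8 bytes when
     * nr_cpu_ids <= 64. */
    cpumask_var_t vmask;
    if ( !zalloc_cpumask_var(&vmask) )
        return -ENOMEM;
    cpumask_set_cpu(cpu, vmask);
    /* ... use vmask ... */
    free_cpumask_var(vmask);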
>> I'm not sure about this. We can save ~500 bytes per cpumask_t when
>> NR_CPUS=4096 and actual nr_cpus<64. But how many cpumask_t's do we typically
>> have dynamically allocated all at once? Let's say we waste 2kB per VCPU and
>> per IRQ, and we have a massive system with ~1k VCPUs and ~1k IRQs -- we'd
>> save ~4MB in that extreme case. But such a large system probably actually
>> will have a lot of CPUs. And also a lot of memory, such that 4MB is quite
>> insignificant.
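
(Spelling the arithmetic out: a NR_CPUS=4096 mask is 4096/8 = 512 bytes
against 8 bytes once nr_cpus < 64, so ~504 bytes saved per mask; at ~4
masks (2kB) per VCPU or IRQ, (1k VCPUs + 1k IRQs) * 2kB ~= 4MB.)
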
> It's not only the memory savings, but also the time saved by
> manipulating less space.
>
>> I suppose there is a second argument that it shrinks the containing
>> structures (struct domain, struct vcpu, struct irq_desc, ...) and maybe
>> helps reduce our order!=0 allocations?
> Yes - that's what made me start taking over these Linux bits. What I
> sent here just continues on that route. I was really hoping that we
> wouldn't leave this in a half-baked state.
>
>> By the way, I think we could avoid the NR_CPUS copying overhead everywhere
>> by having the cpumask.h functions respect nr_cpu_ids but continue to
>> return NR_CPUS as the sentinel value (e.g., end of loop, or no bit found).
>> This would not need to change tonnes of code. It only gets part of the
>> benefit (reducing CPU time overhead) but is more palatable?
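
(Roughly what I take this suggestion to mean -- not actual patch code, and
the real signatures may differ:)

    /* Hypothetical accessor: bound the scan by nr_cpu_ids, but keep
     * reporting NR_CPUS for "no further bit", so existing loop
     * terminators continue to work unmodified. */
    static inline int cpumask_next(int n, const cpumask_t *srcp)
    {
        int cpu = find_next_bit(srcp->bits, nr_cpu_ids, n + 1);
        return (cpu < nr_cpu_ids) ? cpu : NR_CPUS;
    }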
> That would be possible, but would again leave us in a somewhat
> incomplete state. (Note that I did leave NR_CPUS in the stop-
> machine logic.)
>
>>> 06: allow efficient allocation of multiple CPU masks at once
>> That is utterly hideous, and for an insignificant saving.
> I was afraid you would say that, and I'm not fully convinced
> either. But I wanted to give it a try to see how bad it is. The
> more significant saving here really comes from not allocating
> the CPU masks at all for unused irq_desc-s.
>
> Jan
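
(I haven't cross-checked patch 06 itself, so this is only a guess at the
shape of such an interface -- the helper name is made up:)

    /* Hypothetical: back several masks with a single allocation, so an
     * irq_desc needing three masks pays for one xmalloc rather than
     * three, and an unused irq_desc pays for none. */
    cpumask_var_t masks[3];
    if ( !alloc_cpumask_vars(masks, ARRAY_SIZE(masks)) ) /* made-up helper */
        return -ENOMEM;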

The saving from not allocating masks for unused irq_desc's (and irq_cfg's)
will be significant in the general case: roughly 3 * NR_UNUSED_IRQs *
sizeof(mask), given that the average system leaves most of its 224 IRQs per
CPU unused.
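
(For instance, at NR_CPUS=4096 each mask is 512 bytes, so 3 * 224 * 512
bytes ~= 336kB of dead mask space per CPU's worth of unused IRQs.)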

However, I am against moving the masks out of irq_desc (perhaps this is
the C++ coder inside me).

Would an acceptable alternative be to change irq_desc to use
cpumask_var_t's and allocate them on first use? (I have not spent long
thinking about this, so it is possible that the extra NULL-pointer checks
on the IRQ path might be counterproductive.)
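
Something like the following is what I have in mind -- field and helper
names are illustrative only, not existing code:

    /* Hypothetical lazy allocation: the cpumask_var_t fields in
     * irq_desc stay NULL until the IRQ is first set up. */
    static int irq_desc_alloc_masks(struct irq_desc *desc) /* made-up name */
    {
        if ( desc->affinity )
            return 0;                       /* already allocated */
        if ( !zalloc_cpumask_var(&desc->affinity) )
            return -ENOMEM;
        if ( !zalloc_cpumask_var(&desc->pending_mask) )
        {
            free_cpumask_var(desc->affinity);
            desc->affinity = NULL;
            return -ENOMEM;                 /* the extra checks in question */
        }
        return 0;
    }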

~Andrew


-- 
Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer
T: +44 (0)1223 225 900, http://www.citrix.com


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel