xen-devel
Re: [Xen-devel] Re: NUMA and SMP
Thank you for your reply, I see.
Does Xen support NUMA-aware guest Linux now, or is that planned for the future?
Another question, which should maybe be another topic:
what is the function of xc_map_foreign_range() in /tools/libxc/xc_linux.c?
Does xc_map_foreign_range() mmap memory shared with another domain,
or with domain0, or something else?
Could you help me?
Thanks in advance
Petersson, Mats wrote:
>> -----Original Message-----
>> From: tgh [mailto:tianguanhua@xxxxxxxxxx]
>> Sent: 20 March 2007 13:10
>> To: Emmanuel Ackaouy
>> Cc: Petersson, Mats; Anthony Liguori; xen-devel; David
>> Pilger; Ryan Harper
>> Subject: Re: [Xen-devel] Re: NUMA and SMP
>>
>> I am puzzled: what is page migration?
>> Thank you in advance
>>
>
> I'm not entirely sure it's the correct term, but I used it to indicate that
> if you allocate some memory local to processor X, and then later on the page
> is used by processor Y, then one could consider "moving" the page from the
> memory region of X to the memory region of Y. So you "migrate" the page from
> one processor to another. This is of course not a "free" operation, and it's
> only really helpful if the memory is accessed many times (and not cached each
> time it's accessed).
>
> A case where this can be done "almost for free" is when a page is swapped
> out and, on return, the page is allocated from the processor that made the
> access. But of course, if you're looking for ultimate performance, swapping
> is a terrible idea - so making small optimizations in memory management when
> you're losing tons of cycles by swapping is meaningless as an overall
> performance gain.
>
> --
> Mats
>
>> Emmanuel Ackaouy wrote:
>>
>>> On the topic of NUMA:
>>>
>>> I'd like to dispute the assumption that a NUMA-aware OS can actually
>>> make good decisions about the initial placement of memory in a
>>> reasonable hardware ccNUMA system.
>>>
>>> How does the OS know on which node a particular chunk of memory
>>> will be most accessed? The truth is that unless the application or
>>> person running the application is herself NUMA-aware and can provide
>>> placement hints or directives, the OS will seldom beat a round-robin /
>>> interleave or random placement strategy.
>>>
>>> To illustrate, consider an app which lays out a bunch of data in memory
>>> in a single thread and then spawns worker threads to process it.
>>>
>>> Is the OS to place memory close to the initial thread? How can it
>>> possibly know how many threads will eventually process the data?
>>>
>>> Even if the OS knew how many threads will eventually crunch the data,
>>> it cannot possibly know at placement time if each thread will work on
>>> an assigned data subset (and if so, which one) or if it will act as a
>>> pipeline stage with all the data being passed from one thread to the
>>> next.
>>>
>>> If you go beyond initial memory placement or start considering memory
>>> migration, then it's even harder to win because you have to pay copy
>>> and stall penalties during migrations. So you have to be real smart
>>> about predicting the future to do better than your ~10-40% memory
>>> bandwidth and latency hit associated with doing simple memory
>>> interleaving on a modern hardware-ccNUMA system.
>>>
>>> And it gets worse for you when your app is successfully taking
>>> advantage of the memory cache hierarchy, because its performance is
>>> less impacted by raw memory latency and bandwidth.
>>>
>>> Things also get more difficult on a time-sharing host with competing
>>> apps.
>>>
>>> There is a strong argument for making hypervisors and OSes NUMA
>>> aware in the sense that:
>>> 1- They know about system topology
>>> 2- They can export this information up the stack to applications
>>>    and users
>>> 3- They can take in directives from users and applications to
>>>    partition the host and place some threads and memory in specific
>>>    partitions.
>>> 4- They use an interleaved (or random) initial memory placement
>>>    strategy by default.
>>>
>>> The argument that the OS on its own -- without user or application
>>> directives -- can make better placement decisions than round-robin
>>> or random placement is -- in my opinion -- flawed.
>>>
>>> I also am skeptical that the complexity associated with page
>>> migration strategies would be worthwhile: If you got it wrong the
>>> first time, what makes you think you'll do better this time?
>>>
>>> Emmanuel.
>>>
>>>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel