xen-devel
Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split
Andre, George,
One interesting observation: I think the problem always occurred when
a new cpupool was created and the first cpu was moved to it.
I think my previous assumption regarding the master_ticker was not too bad.
I think the master_ticker of the new cpupool is somehow becoming active
before the scheduler is properly initialized. This could happen if
enough time passes between alloc_pdata for the cpu to be moved and the
critical section in schedule_cpu_switch().
The solution should be to activate the timers only if the scheduler is
ready for them.
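
A toy model of that idea, for illustration only. This is not Xen code: ToyScheduler, ToyTicker, init_pdata and the "ready" flag are hypothetical names, and the real fix would live in the C scheduler code. The sketch only shows why a ticker that can fire before the per-cpu data is set up is dangerous, and how refusing to activate it until the scheduler reports ready avoids that.

```python
class ToyScheduler:
    """Hypothetical stand-in for a per-cpu scheduler instance."""

    def __init__(self):
        self.pdata = None   # per-cpu data, not yet allocated
        self.ready = False  # set once initialization has finished

    def init_pdata(self):
        # Allocate the per-cpu data and only then mark the scheduler ready.
        self.pdata = {"ticks": 0}
        self.ready = True

    def tick(self):
        # Touches pdata; raises TypeError if pdata is still None.
        self.pdata["ticks"] += 1


class ToyTicker:
    """Hypothetical periodic timer that calls into the scheduler."""

    def __init__(self, sched, guarded):
        self.sched = sched
        self.guarded = guarded
        self.active = False

    def activate(self):
        # Guarded variant: refuse to start until the scheduler is ready.
        if self.guarded and not self.sched.ready:
            return False
        self.active = True
        return True

    def fire(self):
        # A timer expiry; does nothing while the ticker is inactive.
        if self.active:
            self.sched.tick()


def unguarded_fire_crashes():
    """Return True if an unguarded ticker fails on an uninitialized scheduler."""
    ticker = ToyTicker(ToyScheduler(), guarded=False)
    ticker.activate()
    try:
        ticker.fire()
    except TypeError:
        return True
    return False
```

With guarded=True, activate() returns False until init_pdata() has run, so an early fire() is a no-op instead of a crash.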
George, do you think the master_ticker should be stopped in suspend_ticker
as well? I still see potential problems for entering deep C-States. I think
I'll prepare a patch which will keep the master_ticker active for the
C-State case and migrate it for the schedule_cpu_switch() case.
Juergen
On 02/09/11 14:51, Andre Przywara wrote:
George Dunlap <George.Dunlap@xxxxxxxxxxxxx> wrote:
On Tue, Feb 8, 2011 at 4:33 PM, Andre Przywara
<andre.przywara@xxxxxxx> wrote:
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v34 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v34 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v24 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v25 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v39 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 29
Interesting -- what seems to happen here is that as cpus are disabled,
vcpus are "shovelled" in an accumulative fashion from one cpu to the
next:
* v18,34,42 start on cpu 24.
* When 24 is brought down, they're all migrated to 25; then when 25 is
brought down, to 26, then to 27
* v24 is running on cpu 27, so when 27 is brought down, v24 is added
to the mix
* v3 is running on cpu 28, so all of them plus v3 are shoveled onto
cpu 29.
While that behavior may not be ideal, it should certainly be bug-free.
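
One way to check this pattern against the log is to group the quoted cpu_disable_scheduler lines by source cpu. The following is a minimal sketch: the function name vcpus_by_cpu and the regular expression are assumptions made for this example, and the input is just the excerpt quoted above, not the full attached log.

```python
import re
from collections import defaultdict

# Excerpt copied from the quoted (XEN) console output.
LOG = """\
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v34 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 24
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v34 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 25
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 26
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v24 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v42 from cpu 27
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v18 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v25 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v32 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v39 from cpu 28
(XEN) cpu_disable_scheduler: Migrating d0v3 from cpu 29
"""

# Matches "Migrating d<domain>v<vcpu> from cpu <cpu>".
LINE_RE = re.compile(
    r"cpu_disable_scheduler: Migrating d(\d+)v(\d+) from cpu (\d+)"
)


def vcpus_by_cpu(log):
    """Map each source cpu to the (domain, vcpu) pairs migrated off it."""
    groups = defaultdict(list)
    for match in LINE_RE.finditer(log):
        dom, vcpu, cpu = (int(x) for x in match.groups())
        groups[cpu].append((dom, vcpu))
    return dict(groups)


if __name__ == "__main__":
    groups = vcpus_by_cpu(LOG)
    for cpu in sorted(groups):
        names = ", ".join("d%dv%d" % pair for pair in groups[cpu])
        print("cpu %d: %s" % (cpu, names))
```

Running it over the complete attached log rather than this excerpt would show, per cpu, which vcpus were moved at each step.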
Another interesting thing to note is that the bug happened on pcpu 32,
but there were no advertised migrations from that cpu.
Andre, can you fold the attached patch into your testing?
Sorry, but that bug (and its output) didn't trigger on two tries.
Instead I now saw two occurrences of the "migration failed, must retry
later" message. Interestingly enough, it does not seem to be fatal. The
first time it triggers, the numa-split even completes; then, after I roll
it back and repeat it, the message shows again, but it crashes later on
that old BUG_ON().
See the attached log for more details.
Thanks for the try, anyway.
Regards,
Andre.
Thanks for all your work on this.
I am glad for all your help. I am only starting to really understand the
scheduler, so your support is much appreciated.
-George
--
Juergen Gross Principal Developer Operating Systems
TSP ES&S SWE OS6 Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28 Internet: ts.fujitsu.com
D-80807 Muenchen Company details: ts.fujitsu.com/imprint.html