[Xen-devel] sedf scheduler may cause a CPU fatal trap
Hello,
I played with the sEDF scheduler included in xen-3.0-testing.hg and
everything works fine except for a fatal CPU trap that appeared
several times. Here is what I did on an SMP (two-processor) machine:
I started two unprivileged domains and compiled a kernel in
each of them using the command:
# time sh -c "make O=/home/guill/build/k2614 oldconfig \
&& make O=/home/guill/build/k2614"
1) Two domains with default sEDF parameters (seems to be best-effort):
     |  domain 1   |  domain 2  |
     |-------------|------------|
real | 11m43.034s  | 11m46.293s |
user | 10m20.220s  | 10m25.140s |
sys  |  1m08.330s  |  1m09.100s |
     ----------------------------
xentop showed that domain 1 was using around 99% of the CPU, and so
was domain 2.
2) Two domains with 20ms/5ms (ie 25% of CPU time) and 20ms/15ms (ie 75%
of CPU time) with no extra time:
xm sched-sedf 1 20000000 5000000 0 0 0
xm sched-sedf 2 20000000 15000000 0 0 0
     |  domain 1   |  domain 2  |
     |-------------|------------|
real | 45m35.626s  | 15m04.808s |
user | 41m04.300s  | 13m37.940s |
sys  |  4m24.050s  |  1m25.160s |
     ----------------------------
The xentop showed that domain1 was using around 25% of the CPU
whereas domain2 was using around 75%.
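
As a rough sanity check on these numbers, here is a minimal sketch. It
assumes that the first two values passed to xm sched-sedf are the period
and the slice in nanoseconds (which matches the 20ms/5ms and 20ms/15ms
description), and that build time scales inversely with the CPU share.
The helper names are made up for illustration only.

```python
def cpu_share(period_ns, slice_ns):
    """Fraction of CPU time reserved by a (period, slice) pair."""
    return slice_ns / period_ns


def to_seconds(minutes, seconds):
    """Convert a `time`-style 'XmY.YYYs' reading to seconds."""
    return minutes * 60 + seconds


def predicted_runtime(baseline_s, share):
    """Naive estimate: runtime grows as 1/share (an assumption, not a sEDF guarantee)."""
    return baseline_s / share


# "real" times from test 1 (baseline) and test 2 (measured), per domain.
cases = [
    ("domain 1", 5_000_000, to_seconds(11, 43.034), to_seconds(45, 35.626)),
    ("domain 2", 15_000_000, to_seconds(11, 46.293), to_seconds(15, 4.808)),
]

for name, slice_ns, baseline_s, measured_s in cases:
    share = cpu_share(20_000_000, slice_ns)
    estimate_s = predicted_runtime(baseline_s, share)
    print(f"{name}: share={share:.0%} "
          f"estimate={estimate_s:.0f}s measured={measured_s:.0f}s")
```

Under these assumptions the estimates come out within a few percent of
the measured "real" times, so the reservations seem to be enforced as
expected when extra time is disabled.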
3) Two domains with 20ms/5ms (ie 25% of CPU time) and 20ms/15ms (ie 75%
of CPU time) with extra time:
xm sched-sedf 1 20000000 5000000 0 1 0
xm sched-sedf 2 20000000 15000000 0 1 0
     |  domain 1   |  domain 2  |
     |-------------|------------|
real | 11m48.687s  | 11m50.909s |
user | 10m36.870s  | 10m36.180s |
sys  |  1m08.320s  |  1m09.540s |
     ----------------------------
With extra time enabled, xentop showed that domain 1 and domain 2 were
each using around 97% of the CPU.
4) Two domains with 20ms/5ms (i.e. 25% of CPU time) and 20ms/15ms (i.e. 75%
of CPU time) without extra time, but with the scheduling parameters
changed once the compilation in the second domain had finished:
xm sched-sedf 1 20000000 5000000 0 0 0
xm sched-sedf 2 20000000 15000000 0 0 0
When the second domain finished its job:
xm sched-sedf 1 20000000 0 0 1 0
xm sched-sedf 2 20000000 0 0 1 0
When I changed the parameters, the Xen hypervisor crashed and I got the
following error:
(XEN) CPU: 1
(XEN) EIP: e008:[<ff108d7e>] __qdivrem+0x4e/0x580
(XEN) EFLAGS: 00010046 CONTEXT: hypervisor
(XEN) eax: 00000001 ebx: 00000000 ecx: 00000000 edx: 00000000
(XEN) esi: c4b40000 edi: 00000004 ebp: 00000000 esp: ff1afd94
(XEN) cr0: 8005003b cr3: 6d236000
(XEN) ds: e010 es: e010 fs: 0000 gs: 0033 ss: e010 cs: e008
(XEN) Xen stack trace from esp=ff1afd94:
(XEN) 00000002 00000001 00007100 ff1afe20 00000989 0000ff1f 00000002 00000009
(XEN) 00000002 00000001 0000c000 ff1afde0 ff1afdfc ff1afe18 00000000 00000000
(XEN) 00000000 00000000 00000000 00000000 00000991 ff1afe38 00000571 00000000
(XEN) 00000000 00000000 00000000 0000ff1f 0000c000 00000000 00000000 00000000
(XEN) 00000000 00000000 00000000 0000c944 00004000 00000004 00000000 ff1b5e84
(XEN) ff1b6d84 ffbfa980 ff10c8b0 00000000 c4b40000 00000004 00000000 ff1092ff
(XEN) c4b40000 00000004 00000000 00000000 00000000 ff1ad080 b46d68de ff1b5e88
(XEN) ff1b5e80 c4b40000 00000004 ff10e443 c4b40000 00000004 00000000 00000000
(XEN) ff1afee4 b2a993ef 000012b4 ff10d898 00001000 00000001 ff1b5080 ffbfa980
(XEN) ff1b5e80 b2aea7e3 000012b4 ff10d8c0 b2aea7e3 000012b4 ff1b5080 00000080
(XEN) 0000efff 0000fe80 e6525499 000012b4 00000001 ff1924a0 b3355354 000012b4
(XEN) ffbfa988 00000001 ffbfa990 ffbfa998 00000096 00000001 bfb12eb8 00000096
(XEN) 00000000 00000000 ff174010 ff174010 b2aea7e3 000012b4 ff1aff74 ff10ec3b
(XEN) ff1aff74 b2aea7e3 000012b4 00000033 0000000c 00000000 00000000 ff12111d
(XEN) 0000000c 00000000 00055080 ff1b5080 00000080 00000000 00000001 ff1b5080
(XEN) ff1affb4 00000000 ff1249ce ff1affb4 ff1affb4 00000020 00000000 00000080
(XEN) b7efa860 00000005 bfb12eb8 ff10f732 00000005 bfb12eb8 ff1b5080 ff1354c6
(XEN) b7ef8ff4 00000000 00000001 b7efa860 00000005 bfb12eb8 00000000 000d0000
(XEN) b7e2e549 00000073 00010286 bfb12e90 0000007b 0000007b 0000007b 00000000
(XEN) 00000033 00000001 ff1b5080
(XEN) Xen call trace:
(XEN) [<ff108d7e>] __qdivrem+0x4e/0x580
(XEN) [<ff10c8b0>] runq_comp+0x0/0x70
(XEN) [<ff1092ff>] __divdi3+0x4f/0xa0
(XEN) [<ff10e443>] desched_extra_dom+0x1f3/0x210
(XEN) [<ff10d898>] sedf_do_schedule+0x228/0x260
(XEN) [<ff10d8c0>] sedf_do_schedule+0x250/0x260
(XEN) [<ff10ec3b>] __enter_scheduler+0x7b/0x2e0
(XEN) [<ff12111d>] mod_l1_entry+0x9d/0xf0
(XEN) [<ff1249ce>] do_general_protection+0xbe/0x180
(XEN) [<ff10f732>] do_softirq+0x32/0x50
(XEN) [<ff1354c6>] process_softirqs+0x6/0x8
(XEN)
(XEN) ************************************
(XEN) CPU1 FATAL TRAP 0 (divide error), ERROR_CODE 0000, IN INTERRUPT CONTEXT.
(XEN) System shutting down -- need manual reset.
(XEN) ************************************
This fatal trap doesn't appear if we use
xm sched-sedf 1 20000000 5000000 0 1 0
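
The call trace shows a 64-bit division (__divdi3) reached from
desched_extra_dom, and the trap only shows up when the slice is set to 0
with extra time enabled. One plausible reading, which I have not checked
against the sEDF source, is that some per-domain score is computed by
dividing by the slice. The sketch below only illustrates that kind of
failure and a possible guard; the function names and the formula are
hypothetical, not Xen's actual code.

```python
def extra_util_score(period_ns, slice_ns):
    """Hypothetical score: scaled ratio of period to slice (an assumption)."""
    return (period_ns << 10) // slice_ns


def guarded_extra_util_score(period_ns, slice_ns):
    """Same hypothetical score, but refuses to divide when no slice is reserved."""
    if slice_ns == 0:
        return None  # caller would need another policy for slice-less domains
    return (period_ns << 10) // slice_ns


# Parameters that work in the report: period 20ms, slice 5ms.
print(extra_util_score(20_000_000, 5_000_000))

# Parameters that trigger the trap in the report: period 20ms, slice 0.
try:
    extra_util_score(20_000_000, 0)
except ZeroDivisionError:
    print("division by zero, analogous to the divide error trap")

print(guarded_extra_util_score(20_000_000, 0))
```

If the scheduler does something along these lines, rejecting or
special-casing a zero slice when the parameters are set would avoid the
trap, but that is a guess until someone who knows the code confirms it.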
Has anyone else seen this problem? I can reproduce the bug on my Xeon
x86_64 box, so I can provide more information.
Hope this helps,
Best regards,
Guillaume
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
- [Xen-devel] sedf scheduler may cause a CPU fatal trap,
Guillaume Thouvenin <=