xen-devel
Re: [Xen-devel] dom0 hang
Hi Kevin/Yu:
acpi_processor_idle()
{
    sched_tick_suspend();
    /*
     * sched_tick_suspend may raise TIMER_SOFTIRQ by __stop_timer,
     * which will break the later assumption of no softirq pending,
     * so add do_softirq
     */
    if ( softirq_pending(smp_processor_id()) )
        do_softirq();    <===============
    local_irq_disable();
    if ( softirq_pending(smp_processor_id()) )
    {
        local_irq_enable();
        sched_tick_resume();
        cpufreq_dbs_timer_resume();
        return;
    }
Wouldn't the do_softirq() call invoke the scheduler with the tick suspended,
and the scheduler then context switch to another vcpu0 (with *_BOOST), which
would result in the stuck vcpu I described?
thanks
Mukesh
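The ordering asked about above can be modelled in a few lines of C. This is only a sketch of the hypothesis in the question, not Xen code: struct cpu_model, model_do_softirq(), model_idle_entry() and guest_runs_without_tick() are hypothetical names, and the model assumes that a pending schedule request always switches away from the idle vcpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; none of these fields exist in Xen. */
struct cpu_model {
    bool tick_running;      /* is the per-CPU accounting tick armed? */
    bool schedule_pending;  /* models a pending schedule softirq */
    bool running_idle;      /* is the idle vcpu the current vcpu? */
};

/* Models do_softirq(): a pending schedule request switches to a guest vcpu. */
static void model_do_softirq(struct cpu_model *c)
{
    if (c->schedule_pending) {
        c->schedule_pending = false;
        c->running_idle = false;
    }
}

/*
 * Models the start of the idle path: suspend the tick, then run softirqs.
 * Returns true if the idle vcpu is still current afterwards, false if the
 * model switched to a guest vcpu before any resume call was reached.
 */
bool model_idle_entry(struct cpu_model *c)
{
    assert(c != NULL);
    c->tick_running = false;      /* stands in for sched_tick_suspend() */
    if (c->schedule_pending)
        model_do_softirq(c);
    return c->running_idle;
}

/* The suspected bad state: a guest vcpu is running but nothing ticks. */
bool guest_runs_without_tick(const struct cpu_model *c)
{
    return !c->running_idle && !c->tick_running;
}
```

Starting from a CPU with a schedule request pending, model_idle_entry() returns false and guest_runs_without_tick() reports true, which is the state the question describes. Whether the real code can reach that state is exactly what is being asked.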
Mukesh Rathor wrote:
ah, i totally missed csched_tick():
    if ( !is_idle_vcpu(current) )
        csched_vcpu_acct(cpu);
yeah, looks like that's what is going on. i'm still waiting to
reproduce. at first glance, looking at c/s 19460, seems like
suspend/resume, well at least the resume, should happen in
csched_schedule().....
thanks,
Mukesh
George Dunlap wrote:
[Oops, adding back in distro list, also adding Kevin Tian and Yu Ke
who wrote cs 19460]
The functionality I was talking about, subtracting credits and
clearing BOOST, happens in csched_vcpu_acct() (which is different than
csched_acct()). vcpu_acct() is called from csched_tick(), which
should still happen every 10ms on every cpu.
The patch I referred to (cs 19460) disables and re-enables tickers in
xen/arch/x86/acpi/cpu_idle.c:acpi_processor_idle() every time the
processor idles. I can't see anywhere else that tickers are disabled,
so it's probably something not properly re-enabling them again.
Try applying the attached patch to see if that changes anything. (I'm
on the road, so I can't repro the lockup issue.) If that doesn't
work, try disabling c-states and see if that helps. Then at least
we'll know where the problem lies.
-George
On Thu, Jul 2, 2009 at 10:10 PM, Mukesh Rathor<mukesh.rathor@xxxxxxxxxx> wrote:
That seems to suspend only csched_pcpu.ticker, i.e. csched_tick, which
only sorts the local runq.
Again, we are concerned about csched_priv.master_ticker, which calls
csched_acct, correct? So I can trace that?
thanks,
mukesh
George Dunlap wrote:
Ah, I see that there's been some changes to tick stuff with the
c-state (e.g., cs 19460). It looks like they're supposed to be going
still, but perhaps the tick_suspend() and tick_resume() aren't being
called properly. Let me take a closer look.
-George
On Thu, Jul 2, 2009 at 8:14 PM, Mukesh Rathor<mukesh.rathor@xxxxxxxxxx> wrote:
George Dunlap wrote:
On Thu, Jul 2, 2009 at 4:19 AM, Mukesh Rathor<mukesh.rathor@xxxxxxxxxx> wrote:
dom0 hang:
vcpu0 is trying to wake up a task and in try_to_wake_up() calls
task_rq_lock(). Since the task has cpu set to 1, it gets the runq lock
for vcpu1. Next it calls resched_task(), which results in sending an IPI
to vcpu1. For that, vcpu0 gets into the HYPERVISOR_event_channel_op
HCALL and is waiting to return. Meanwhile, vcpu1 got running, and is
spinning on its runq lock in "schedule():spin_lock_irq(&rq->lock);",
which vcpu0 is holding (while waiting to return from the HCALL).
As I had noticed before, vcpu0 never gets scheduled in xen. So
looking further into xen:
xen:
Both vcpus are on the same runq, in this case cpu1. But the
priority of vcpu1 has been set to CSCHED_PRI_TS_BOOST. As a result,
the scheduler always picks vcpu1, and vcpu0 is starved. Also, I see in
kdb that the scheduler timer is not set on cpu 0. That would've
allowed csched_load_balance() to kick in on cpu0. [Also, on
cpu1, the accounting timer, csched_tick, is not set. Although
csched_tick() is running on cpu0, it only checks the runq for cpu0.]
Looks like c/s 19500 changed csched_schedule():
- ret.time = MILLISECS(CSCHED_MSECS_PER_TSLICE);
+ ret.time = (is_idle_vcpu(snext->vcpu) ?
+ -1 : MILLISECS(CSCHED_MSECS_PER_TSLICE));
The quickest fix for us would be to just back that out.
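To make the effect of that change concrete, here is a minimal C sketch. It rests on two assumptions that are not stated in the thread: that the caller of csched_schedule() treats a negative time as "do not arm the scheduler timer", and that CSCHED_MSECS_PER_TSLICE is 30. The helper names and the NO_TIMER marker are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MILLISECS(ms)            ((int64_t)(ms) * 1000000LL) /* ms -> ns */
#define CSCHED_MSECS_PER_TSLICE  30    /* assumed value, for illustration */
#define NO_TIMER                 (-1)  /* hypothetical marker: timer not armed */

/* Slice requested before the change: always one full time slice. */
int64_t slice_before(bool next_is_idle)
{
    (void)next_is_idle;
    return MILLISECS(CSCHED_MSECS_PER_TSLICE);
}

/* Slice requested after the change: -1 when the idle vcpu is picked. */
int64_t slice_after(bool next_is_idle)
{
    return next_is_idle ? -1 : MILLISECS(CSCHED_MSECS_PER_TSLICE);
}

/* Assumed caller behaviour: a negative slice leaves the timer unarmed. */
int64_t scheduler_timer_deadline(int64_t now, int64_t slice)
{
    assert(now >= 0);
    return slice < 0 ? NO_TIMER : now + slice;
}
```

Under these assumptions a cpu that picks the idle vcpu gets no scheduler timer after the change, so nothing forces it back into the scheduler until some other event arrives. That would be consistent with the scheduler timer missing on cpu 0 in kdb.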
BTW, just a comment on the following (all in sched_credit.c):
    if ( svc->pri == CSCHED_PRI_TS_UNDER &&
         !(svc->flags & CSCHED_FLAG_VCPU_PARKED) )
    {
        svc->pri = CSCHED_PRI_TS_BOOST;
    }
combined with
    if ( snext->pri > CSCHED_PRI_TS_OVER )
        __runq_remove(snext);
Setting CSCHED_PRI_TS_BOOST as pri of vcpu seems dangerous. To me,
since csched_schedule() never checks for time accumulated by a
vcpu at pri CSCHED_PRI_TS_BOOST, that is the same as pinning a vcpu to a
pcpu. If that vcpu never makes progress, essentially, the system
has lost a physical cpu. Optionally, csched_schedule() should always
check for cpu time accumulated and reduce the priority over time.
I can't tell right off if it already does that. Or something like
that :)... my 2 cents.
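The starvation concern can be reproduced with a toy run queue in C. This is a sketch under simplifying assumptions, not the credit scheduler: the priority values, the 100-credit debit per round and the names pick_next() and count_runs() are all illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative priorities; a higher value wins. */
enum { PRI_OVER = -2, PRI_UNDER = -1, PRI_BOOST = 0 };

struct vcpu_model {
    int id;
    int pri;
    int credit;
};

/* Index of the highest-priority entry; ties go to the earliest entry. */
size_t pick_next(const struct vcpu_model *runq, size_t n)
{
    assert(runq != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (runq[i].pri > runq[best].pri)
            best = i;
    return best;
}

/*
 * Make `rounds` scheduling decisions and count how often vcpu `id` runs.
 * With tick_enabled, the vcpu that ran is debited and drops to OVER once
 * its credit goes negative; without it, priorities never change.
 */
int count_runs(struct vcpu_model *runq, size_t n, int id, int rounds,
               bool tick_enabled)
{
    int runs = 0;
    for (int r = 0; r < rounds; r++) {
        size_t i = pick_next(runq, n);
        if (runq[i].id == id)
            runs++;
        if (tick_enabled) {
            runq[i].credit -= 100;
            if (runq[i].credit < 0)
                runq[i].pri = PRI_OVER;
        }
    }
    return runs;
}
```

With vcpu0 at UNDER and vcpu1 at BOOST on the same queue, vcpu0 never runs while the tick is off, and it starts running once the tick has pushed vcpu1 into OVER.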
Hmm... what's supposed to happen is that eventually a timer tick will
interrupt vcpu1. If cpu1 is set to be "active", then it will be
debited 10ms worth of credit. Eventually, it will go into OVER, and
lose BOOST. If it's "inactive", then when the tick happens, it will
be set to "active" and be debited 10ms again, setting it directly into
OVER (and thus also losing boost).
Can you see if the timer ticks are still happening, and perhaps put
some tracing in to verify that what I described above is happening?
-George
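The accounting behaviour described in the message above can be written down as a small C model. It is a sketch of that description only: the credit amounts, the assumption that an inactive vcpu starts with zero credit, and the names tick_account() and ticks_until_over() are illustrative and not taken from sched_credit.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative priorities; a higher value wins. */
enum pri { PRI_OVER = -2, PRI_UNDER = -1, PRI_BOOST = 0 };

#define CREDITS_PER_TICK 100  /* assumed cost of one 10ms tick */

struct vcpu_model {
    enum pri pri;
    int credit;
    bool active;
};

/* One accounting tick charged to the vcpu that is currently running. */
void tick_account(struct vcpu_model *v)
{
    assert(v != NULL);
    if (!v->active)
        v->active = true;         /* inactive vcpus become active first */
    v->credit -= CREDITS_PER_TICK;
    if (v->credit < 0)
        v->pri = PRI_OVER;        /* out of credit: BOOST is lost too */
}

/* Number of ticks until the vcpu reaches OVER, or -1 if it never does. */
int ticks_until_over(struct vcpu_model *v, int max_ticks)
{
    assert(v != NULL);
    for (int t = 1; t <= max_ticks; t++) {
        tick_account(v);
        if (v->pri == PRI_OVER)
            return t;
    }
    return -1;
}
```

In this model an active vcpu with 250 credits reaches OVER on the third tick, an inactive one with no credit reaches it on the first, and a vcpu that receives no ticks at all keeps BOOST indefinitely.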
George,
Is that in csched_acct()? Looks like that's somehow gotten removed. If
true, then maybe that's the fundamental problem to chase.
Here's what the trq looks like when hung, not in any schedule
function:
[0]xkdb> dtrq
CPU[00]: NOW:0x00003f2db9af369e
1: exp=0x00003ee31cb32200 fn:csched_tick data:0000000000000000
2: exp=0x00003ee347ece164 fn:time_calibration data:0000000000000000
3: exp=0x00003ee69a28f04b fn:mce_work_fn data:0000000000000000
4: exp=0x00003f055895e25f fn:plt_overflow data:0000000000000000
5: exp=0x00003ee353810216 fn:rtc_update_second data:ffff83007f0226d8
CPU[01]: NOW:0x00003f2db9af369e
1: exp=0x00003ee30b847988 fn:s_timer_fn data:0000000000000000
2: exp=0x00003f1b309ebd45 fn:pmt_timer_callback data:ffff83007f022a68
thanks
Mukesh
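One way to read the kdb output above is to compare each exp value with NOW. The C sketch below does that for csched_tick on CPU 0 and s_timer_fn on CPU 1, assuming both fields are system time in nanoseconds (the dump itself does not state the unit). The macro names and overdue_seconds() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the kdb output above. */
#define KDB_NOW           UINT64_C(0x00003f2db9af369e)
#define CPU0_CSCHED_TICK  UINT64_C(0x00003ee31cb32200)
#define CPU1_S_TIMER_FN   UINT64_C(0x00003ee30b847988)

/*
 * Whole seconds by which a timer is overdue, assuming nanosecond units.
 * Returns 0 if the expiry is not in the past.
 */
uint64_t overdue_seconds(uint64_t now, uint64_t expiry)
{
    assert(now != 0);
    if (expiry >= now)
        return 0;
    return (now - expiry) / UINT64_C(1000000000);
}
```

Under the nanosecond assumption, both timers expired roughly 320 seconds before the dump was taken, so neither had been serviced for several minutes at that point.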
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel