Re: [Xen-devel] [xen-unstable test] 6374: regressions - FAIL

from [Tim Deegan]

To:	Jan Beulich <JBeulich@xxxxxxxxxx>
Subject:	Re: [Xen-devel] [xen-unstable test] 6374: regressions - FAIL
From:	Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Date:	Mon, 14 Mar 2011 10:52:28 +0000
Cc:	George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, "George@xxxxxxxxxx" <George@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
In-reply-to:	<4D7DFE76020000780003634E@xxxxxxxxxxxxxxxxxx>
References:	<osstest-6374-mainreport@xxxxxxx> <19834.24888.630582.491364@xxxxxxxxxxxxxxxxxxxxxxxx> <20110314100223.GE24523@xxxxxxxxxxxxxxxxxxxxxxx> <4D7DFE76020000780003634E@xxxxxxxxxxxxxxxxxx>

At 10:39 +0000 on 14 Mar (1300099174), Jan Beulich wrote:
> > I think this hang comes because although this code:
> > 
> >             cpu = cycle_cpu(CSCHED_PCPU(nxt)->idle_bias, nxt_idlers);
> >             if ( commit )
> >                CSCHED_PCPU(nxt)->idle_bias = cpu;
> >             cpus_andnot(cpus, cpus, per_cpu(cpu_sibling_map, cpu));
> > 
> > removes the new cpu and its siblings from cpus, cpu isn't guaranteed to
> > have been in cpus in the first place, and none of its siblings are
> > either since nxt might not be its sibling.
> 
> I had originally spent quite a while to verify that the loop this is in
> can't be infinite (i.e. there's going to be always at least one bit
> removed from "cpus"), and did so again during the last half hour
> or so.

I'm pretty sure there are possible passes through this loop that don't
remove any cpus, though I haven't constructed the full history that gets
you there.  But the cpupool patches you suggest in your other email look
like much stronger candidates for this hang.
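
To make the "no cpus removed" case concrete, here is a minimal toy model of the
cpus_andnot() step from the quoted code.  It is only a sketch: the 4-CPU
topology, the toy_mask_t type and the toy_* helper names are assumptions made
up for illustration and are not Xen code.  It shows that clearing the siblings
of a cpu that was never in "cpus" leaves the mask unchanged.

```c
#include <assert.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is present. */
typedef unsigned int toy_mask_t;

#define TOY_NR_CPUS 4

/* Assumed topology: CPUs {0,1} share one core, CPUs {2,3} share another. */
static const toy_mask_t toy_sibling_map[TOY_NR_CPUS] = {
    0x3u, 0x3u, 0xCu, 0xCu
};

/* Models cpus_andnot(cpus, cpus, per_cpu(cpu_sibling_map, cpu)). */
static toy_mask_t toy_remove_siblings(toy_mask_t cpus, unsigned int cpu)
{
    assert(cpu < TOY_NR_CPUS);
    return cpus & ~toy_sibling_map[cpu];
}

/* Returns 1 if removing cpu's siblings cleared at least one bit of cpus. */
static int toy_pass_makes_progress(toy_mask_t cpus, unsigned int cpu)
{
    return toy_remove_siblings(cpus, cpu) != cpus;
}
```

With cpus = {0,1} and a chosen cpu of 2, toy_pass_makes_progress() returns 0,
so a loop that only terminates once "cpus" is empty would not advance on that
pass.  Whether the real scheduler can actually reach such a state is exactly
the open question above.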

> > which guarantees that nxt will be removed from cpus, though I suspect
> > this means that we might not pick the best HT pair in a particular core.
> > Scheduler code is twisty and hurts my brain so I'd like George's
> > opinion before checking anything in.
> 
> No - that was precisely done the opposite direction to get
> better symmetry of load across all CPUs. With what you propose,
> idle_bias would become meaningless.

I don't see why it would.  As I said, having picked a core we
might not iterate to pick the best cpu within that core, but the
round-robining effect is still there.  And even if not, I figured a
hypervisor crash is worse than a suboptimal scheduling decision. :)
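
A rough sketch of the round-robining effect I mean, modelled on the
cycle_cpu()/idle_bias lines quoted earlier.  The rr_* names, the 4-CPU limit
and the -1 return for an empty mask are assumptions for illustration only;
this is not the Xen implementation.

```c
#include <assert.h>

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is idle. */
typedef unsigned int rr_mask_t;

#define RR_NR_CPUS 4

/*
 * Toy cycle: return the next CPU set in mask strictly after start,
 * wrapping around (so start itself is considered last).
 * Returns -1 if the mask is empty (an assumption of this sketch).
 */
static int rr_cycle_cpu(unsigned int start, rr_mask_t mask)
{
    assert(start < RR_NR_CPUS);
    for (unsigned int i = 1; i <= RR_NR_CPUS; i++) {
        unsigned int cpu = (start + i) % RR_NR_CPUS;
        if (mask & (1u << cpu))
            return (int)cpu;
    }
    return -1;
}

/* Pick an idler; if commit is set, remember it so the next pick moves on. */
static int rr_pick(unsigned int *idle_bias, rr_mask_t idlers, int commit)
{
    int cpu = rr_cycle_cpu(*idle_bias, idlers);
    if (cpu >= 0 && commit)
        *idle_bias = (unsigned int)cpu;
    return cpu;
}
```

Starting from a bias of 0 with all four CPUs idle, committed picks walk
through 1, 2, 3, 0 in turn, while uncommitted picks keep returning the same
CPU.  The rotation comes from storing the bias, independently of which
siblings are later cleared from "cpus".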

Tim.

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)
