Re: [Xen-devel] [Patch] continue_hypercall_on_cpu rework using t

To:	"Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx>, Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
Subject:	Re: [Xen-devel] [Patch] continue_hypercall_on_cpu rework using tasklets
From:	Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Date:	Fri, 16 Apr 2010 08:13:37 +0100
Cc:	"xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, "Yu, Ke" <ke.yu@xxxxxxxxx>
Delivery-date:	Fri, 16 Apr 2010 00:14:37 -0700
Envelope-to:	www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to:	<789F9655DD1B8F43B48D77C5D30659731D79778B@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
List-help:	<mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id:	Xen developer discussion <xen-devel.lists.xensource.com>
List-post:	<mailto:xen-devel@lists.xensource.com>
List-subscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe:	<http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender:	xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
Thread-index:	AcrccV74GaC9GZ9lTbWk9MTZP2+EwQAAhsPFAABQK/sAAk0GUAADclFDACFVn6AACNGn5w==
Thread-topic:	[Xen-devel] [Patch] continue_hypercall_on_cpu rework using tasklets
User-agent:	Microsoft-Entourage/12.24.0.100205

On 16/04/2010 07:42, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:

> Thanks for pointing out the stop_machine_run deadlock issue.
> 
> After more consideration and internally discussion, seems the key point is,
> the tasklet softirq is something like getting a lock for the current vcpu's
> state(i.e. no one else could change that state unless this softirq is
> finished).
>
> So any block action in softirq context, not just vcpu_pause_sync,
> is dangerous and should be avoided (we can't get a lock and do block action
> per my understanding).
> This is because vcpu's state can only be changed by schedule softirq (am I
> right on this?), while schedule softirq can't prempt other softirq. So, more
> generally, anything that will be updated by a softirq context, and will be
> syncrhozed/blocking waitied in xen's vcpu context is in fact a implicit lock
> held by the softirq.

I think you're at risk of over-analysing the situation. You cannot safely
synchronously pause vcpus from within softirq context. I'm not sure we can
extrapolate further than that.

> To the tricky bug on the stop_machine_run(), I think it is in fact similar to
> the cpu_add_remove_lock. The stop_machine_run() is a block action, so we must
> make sure no one will be blocking to get the lock that is held by
> stop_machine_run() already. At that time, we change all components that try to
> get the cpu_add_remove_lock to be try_lock.
> 
> The changes caused by the tasklet is, a new implicit lock is added, i.e. the
> vcpu's state.

Let me be clear: the deadlock I described in stop_machine_run() is *not*
introduced by the c_h_o_c reimplementation. It has been there all along.
Nothing in my description of the deadlock depended on the implementation of
c_h_o_c: it depended only on the baheviour of stop_machine_run itself, which
does not internally use c_h_o_c.

> The straightforward method is like the cpu_add_remove_lock: make everything
> that waiting for the vcpu state change to do softirq between the checking.

Possible, could have impacts of its own of course. We couldn't do
SCHEDULE_SOFTIRQ as we would lose the caller's context, and the caller could
not hold locks when pausing others (although I suppose they generally can't
anyway).

> Maybe the cleaner way is your previous suggestion, that is, put the
> stop_machine_run() in the idle_vcpu(), another way is, turn back to the
> original method, i.e. do it in the schedule_tail.

Argh. That would be annoying!

Another possibility is to... shudder... introduce a timeout. Wait only e.g.,
1ms for all vcpus to enter the STOPMACHINE_PREPARE state and if they don't,
cancel the operation and return -EBUSY. There are already a bunch of
opportunities to return 'spurious' -EBUSY already in the cpu-offline paths,
so tools already need to know to retry at least a certain number of times.
Undoubtedly this is the dirty solution, but it is easy-ish to implement and
should only be allowing us to avoid an extremely rare deadlock possibility.
It would just be critical to pick a large enough timeout!

> Also We are not sure why the continue_hypercall_on_cpu is changed to use
> tasklet. What's the benifit for it? At least I think this is quite confusing,
> because per our understanding, usually hypercall is assumed to execut in
> current context, while this change break the assumption. So any hypercall that
> may use this _c_h_o_c, and any function called by that hypercall, should be
> aware of this, I'm not sure if this is really so correct, at least it may
> cause trouble if someone use this without realize the limitation. From Juergen
> Gross's mail, it seems for cpupool, but I have no idea of the cpupool :-(

The implementation is simpler and lets us get rid of the complexity around
vcpu affinity logic. There aren't that many users of c_h_o_c and most are
still trivially correct. I don't think the changes to c_h_o_c have
introduced any more deadlocks, apart from the freeze_domains issue (which I
believe I have now fixed).

 -- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

WARNING - OLD ARCHIVES

xen-devel

Re: [Xen-devel] [Patch] continue_hypercall_on_cpu rework using tasklets