
RE: [Xen-devel] [Patch] continue_hypercall_on_cpu rework using tasklets




>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>Sent: Friday, April 16, 2010 3:14 PM
>To: Jiang, Yunhong; Juergen Gross
>Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Yu, Ke
>Subject: Re: [Xen-devel] [Patch] continue_hypercall_on_cpu rework using 
>tasklets
>
>On 16/04/2010 07:42, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>
>> Thanks for pointing out the stop_machine_run deadlock issue.
>>
>> After more consideration and internal discussion, it seems the key point is
>> that the tasklet softirq is something like holding a lock on the current
>> vcpu's state (i.e. no one else can change that state until this softirq has
>> finished).
>>
>> So any blocking action in softirq context, not just vcpu_pause_sync,
>> is dangerous and should be avoided (we can't hold a lock and then block,
>> per my understanding).
>> This is because the vcpu's state can only be changed by the schedule
>> softirq (am I right on this?), while the schedule softirq can't preempt
>> other softirqs. So, more generally, anything that is updated by softirq
>> context, and is synchronously/blockingly waited on in Xen's vcpu context,
>> is in fact an implicit lock held by the softirq.
>
>I think you're at risk of over-analysing the situation. You cannot safely
>synchronously pause vcpus from within softirq context. I'm not sure we can
>extrapolate further than that.
>
>> As for the tricky bug in stop_machine_run(), I think it is in fact similar
>> to the cpu_add_remove_lock. stop_machine_run() is a blocking action, so we
>> must make sure no one blocks waiting for a lock that stop_machine_run()
>> already holds. Back then, we changed all components that take the
>> cpu_add_remove_lock to use try_lock instead.
>>
>> The change caused by the tasklet is that a new implicit lock is added, i.e.
>> the vcpu's state.
>
>Let me be clear: the deadlock I described in stop_machine_run() is *not*
>introduced by the c_h_o_c reimplementation. It has been there all along.
>Nothing in my description of the deadlock depended on the implementation of
>c_h_o_c: it depended only on the behaviour of stop_machine_run itself, which
>does not internally use c_h_o_c.

Oops, yes, this should be there already, without c_h_o_c.
So, if another vcpu is trying to vcpu_pause() this vcpu, it will deadlock 
because this vcpu can't be paused, right? (I assume the same reasoning 
applies to hvmop_flush_tlb_all.)

If this is true, then how about checking !vcpu_runnable(current) in 
stop_machine_run()'s stopmachine_set_state()? If this check fails, it means 
someone is trying to change our state and a potential deadlock may happen, so 
we can cancel this stop_machine_run(). This would at least be a bit cleaner 
than the timeout mechanism.
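
Very roughly something like this (untested, written from memory of 
stop_machine.c, so the field names may not match the tree exactly):

    static int stopmachine_set_state(enum stopmachine_state state)
    {
        atomic_set(&stopmachine_data.done, 0);
        stopmachine_data.state = state;

        while ( atomic_read(&stopmachine_data.done) != stopmachine_data.nr_cpus )
        {
            /* Someone has paused us / is changing our state; spinning on in
             * softirq context risks the deadlock above, so bail out and let
             * stop_machine_run() cancel the operation with -EBUSY. */
            if ( !vcpu_runnable(current) )
                return -EBUSY;
            cpu_relax();
        }

        return 0;
    }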

>
>> The straightforward method is like the cpu_add_remove_lock: make everything
>> that waits for the vcpu state to change process softirqs between checks.
>
>Possible, could have impacts of its own of course. We couldn't do
>SCHEDULE_SOFTIRQ as we would lose the caller's context, and the caller could
>not hold locks when pausing others (although I suppose they generally can't
>anyway).
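
What I meant is roughly this pattern in the waiter (sketch only; the wait 
condition is a made-up stand-in for whatever the pausing path currently 
spins on):

    while ( !target_vcpu_descheduled(v) )   /* hypothetical predicate */
        process_pending_softirqs();         /* services pending tasklets etc.
                                               (but not SCHEDULE_SOFTIRQ), so
                                               the stop_machine work queued on
                                               this CPU can still make
                                               progress */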
>
>> Maybe the cleaner way is your previous suggestion, that is, running
>> stop_machine_run() in the idle vcpu; another way is to go back to the
>> original method, i.e. doing it in schedule_tail.
>
>Argh. That would be annoying!

It seems doing it in schedule_tail would not help with this deadlock issue.

>
>Another possibility is to... shudder... introduce a timeout. Wait only e.g.,
>1ms for all vcpus to enter the STOPMACHINE_PREPARE state and if they don't,
>cancel the operation and return -EBUSY. There are already a bunch of
>opportunities to return 'spurious' -EBUSY in the cpu-offline paths,
>so tools already need to know to retry at least a certain number of times.
>Undoubtedly this is the dirty solution, but it is easy-ish to implement and
>it would only be needed to avoid an extremely rare deadlock possibility.
>It would just be critical to pick a large enough timeout!
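
For reference, I guess that would be the same wait loop as in my sketch 
above, but with a deadline (untested; NOW()/MILLISECS() usage from memory):

    s_time_t deadline = NOW() + MILLISECS(1);

    while ( atomic_read(&stopmachine_data.done) != stopmachine_data.nr_cpus )
    {
        if ( NOW() > deadline )
            return -EBUSY;   /* tools on the cpu-offline path already
                                retry on -EBUSY */
        cpu_relax();
    }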
>
>> Also, we are not sure why continue_hypercall_on_cpu was changed to use a
>> tasklet. What's the benefit of it? At least I think this is quite confusing,
>> because per our understanding a hypercall is usually assumed to execute in
>> the current context, while this change breaks that assumption. So any
>> hypercall that may use c_h_o_c, and any function called by that hypercall,
>> should be aware of this. I'm not sure this is really correct; at least it
>> may cause trouble if someone uses this without realizing the limitation.
>> From Juergen Gross's mail, it seems to be for cpupools, but I have no idea
>> about cpupools :-(
>
>The implementation is simpler and lets us get rid of the complexity around
>vcpu affinity logic. There aren't that many users of c_h_o_c and most are
>still trivially correct. I don't think the changes to c_h_o_c have
>introduced any more deadlocks, apart from the freeze_domains issue (which I
>believe I have now fixed).

Yes, c_h_o_c does not introduce any more deadlocks; my concern is that its 
name may be confusing if one tries to use it for another hypercall. After 
all, if a hypercall, or any function used by that hypercall, depends on 
current, it should not use c_h_o_c.
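
For example (a hypothetical handler with made-up names, just to show the 
pitfall):

    /* Hypothetical continuation handler -- not real code. */
    static long my_op_helper(void *data)
    {
        /* With the tasklet-based c_h_o_c this runs from the tasklet softirq
         * on the target CPU, in whatever vcpu context happens to be current
         * there, so 'current' is no longer the vcpu that issued the
         * hypercall.  The line below looks correct but is not: */
        struct domain *d = current->domain;   /* not the calling domain! */
        return do_my_op(d, data);             /* made-up call */
    }

    /* In the hypercall handler: */
    return continue_hypercall_on_cpu(cpu, my_op_helper, data);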

--jyh

>
> -- Keir
>

