>>> On 15.03.11 at 10:21, Juergen Gross <juergen.gross@xxxxxxxxxxxxxx> wrote:
> On 03/15/11 10:01, Jan Beulich wrote:
>>>>> On 15.03.11 at 09:46, Juergen Gross <juergen.gross@xxxxxxxxxxxxxx> wrote:
>>> On 03/15/11 08:57, Jan Beulich wrote:
>>>>>>> On 15.03.11 at 06:50, Juergen Gross <juergen.gross@xxxxxxxxxxxxxx> wrote:
>>>>> On 03/14/11 16:03, Jan Beulich wrote:
>>>>>>>>> On 14.03.11 at 15:39, Juergen Gross <juergen.gross@xxxxxxxxxxxxxx> wrote:
>>>>>>> On multi-thread multi-core systems an endless loop can occur in
>>>>>>> vcpu_migrate() with credit scheduler. Avoid this loop by changing the
>>>>>>> interface of pick_cpu to indicate a repeated call in this case.
>>>>>>
>>>>>> But you're not changing in any way the loop that doesn't get
>>>>>> exited - did you perhaps read my original description as the
>>>>>> pick function itself looping (which - afaict - it doesn't)?
>>>>>
>>>>> I'm changing the way the pick_cpu function is reacting on multiple calls
>>>>> in a loop. If I've understood the idle_bias correctly, updating it in each
>>>>> loop iteration did result in returning another cpu for each call.
>>>>> By updating idle_bias only once, it should return the same cpu in
>>>>> subsequent calls. This should exit the loop in vcpu_migrate.
>>>>
>>>> You're only decreasing the likelihood of a live lock, as the return
>>>> value of pick_cpu not only depends on idle_bias.
>>>
>>> Hmm, then another solution would be to let pick_cpu really return the
>>> proposed cpu from the first iteration, if it doesn't contradict the
>>> allowed settings. It could be sub-optimal, but I don't think this is
>>> critical, as vcpu_migrate is called rarely.
>>>
>>> Patch attached.
>>
>> That candidate-is-valid check seems absolutely independent of the
>> particular scheduler used, and hence could be done in the (sole)
>> caller, thus not requiring any change to the scheduler interface.
>>
>> Which at once would eliminate unnecessary calls into pick_cpu (i.e.
>> you'd call it a second time only if the previously selected CPU really
>> is no longer valid to be used for that vCPU).
>
> True.
>
> The patch seems to become smaller :-)

This looks good to me now, and it makes it quite obvious that there
is a likely exit path from the loop (it can only livelock now if
v->cpu_affinity and/or v->domain->cpupool->cpu_valid are
constantly changing, which could only be due to a misbehaving
administrator).

Jan
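
For illustration only, here is a minimal, self-contained model of the
caller-side check discussed in this thread. It is not the attached patch
and not the Xen code: the bitmask type, struct vcpu_model, pick_cpu_model(),
cpu_still_valid(), choose_cpu() and NR_CPUS_MODEL are all hypothetical
stand-ins, and locking is left out entirely. The point it tries to show is
that the pick function is called again only when the previous candidate is
no longer allowed, so the loop exits as soon as the masks stop changing,
even if the pick function itself returns a different CPU on every call.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 8          /* hypothetical CPU count for the model */

typedef unsigned int cpumask;    /* hypothetical stand-in: one bit per CPU */

struct vcpu_model {
    cpumask affinity;            /* models v->cpu_affinity */
    cpumask pool_valid;          /* models v->domain->cpupool->cpu_valid */
};

static int pick_calls;           /* number of calls into the pick function */

/*
 * Hypothetical pick function. It deliberately starts its search at a
 * different CPU on every call, mimicking a scheduler whose answer is
 * not stable across repeated calls.
 */
static int pick_cpu_model(const struct vcpu_model *v)
{
    cpumask allowed = v->affinity & v->pool_valid;
    int start = pick_calls++;

    for (int i = 0; i < NR_CPUS_MODEL; i++) {
        int cpu = (start + i) % NR_CPUS_MODEL;
        if (allowed & (1u << cpu))
            return cpu;
    }
    return -1;                   /* no CPU is currently allowed */
}

/* The scheduler-independent "candidate is valid" check, done by the caller. */
static bool cpu_still_valid(const struct vcpu_model *v, int cpu)
{
    assert(cpu < NR_CPUS_MODEL);
    if (cpu < 0)
        return false;
    return (((v->affinity & v->pool_valid) >> cpu) & 1u) != 0;
}

/*
 * Model of the retry loop: keep the previous candidate if it is still
 * valid, and call the pick function again only if it is not.
 * max_rounds bounds the model; it returns -1 if no valid CPU was settled on.
 */
static int choose_cpu(const struct vcpu_model *v, int max_rounds)
{
    int candidate = -1;

    for (int round = 0; round < max_rounds; round++) {
        /* The real code would (re)take the relevant locks at this point. */
        if (cpu_still_valid(v, candidate))
            return candidate;
        candidate = pick_cpu_model(v);
    }
    return -1;
}
```

With affinity 0x6 (CPUs 1 and 2) and pool_valid 0xE (CPUs 1 to 3), a call
such as choose_cpu(&v, 4) picks once, finds the candidate still valid on
the next round, and returns without calling the pick function a second
time.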
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel