Keir, thanks for your detailed suggestion!
But I suspect that even if we backport all of the kernel infrastructure, we may
still need some changes for this. As I understand it, in the kernel the task is
migrated after the offline CPU is dead and there is no lazy state sync, which
means we would still need to fix this issue.
I had a short discussion with Kevin: maybe we can sync the state in
cpu_disable_scheduler if current is idle, and then set a flag so that we will
not sync again in the context switch later. If current is not idle, we can
leave it to the context switch to do the sync for us. I will investigate
further to see how many changes are needed.
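
To make the idea a bit more concrete, here is a rough toy model in C. It is
only a sketch of the control flow, not Xen code: struct vcpu and struct pcpu
are simplified stand-ins, and the names state_synced, sync_lazy_state,
model_cpu_disable_scheduler and model_context_switch are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not Xen's. */
struct vcpu {
    int  id;
    bool is_idle;
    int  saved;            /* how many times this vcpu's state was saved */
};

struct pcpu {
    struct vcpu *current;    /* vcpu currently scheduled on this CPU */
    struct vcpu *lazy_owner; /* vcpu whose state is still live in registers */
    bool state_synced;       /* proposed flag: state already flushed for offline */
    int  sync_calls;         /* number of times a sync was actually performed */
};

/* Flush the lazily-held state back to the vcpu that owns it. */
static void sync_lazy_state(struct pcpu *c)
{
    c->sync_calls++;
    if (c->lazy_owner != NULL) {
        c->lazy_owner->saved++;
        c->lazy_owner = NULL;
    }
}

/* Offline path: sync eagerly only when the idle vcpu is current. */
static void model_cpu_disable_scheduler(struct pcpu *c)
{
    assert(c->current != NULL);
    if (c->current->is_idle) {
        sync_lazy_state(c);
        c->state_synced = true;  /* tell the later context switch to skip it */
    }
    /* If current is not idle, the normal context switch will do the sync. */
}

/* Later context switch: sync unless the offline path already did. */
static void model_context_switch(struct pcpu *c, struct vcpu *next)
{
    if (!c->state_synced)
        sync_lazy_state(c);
    c->current = next;
}
```

In this model, when the idle vcpu is current the guest state is saved exactly
once in model_cpu_disable_scheduler and the later switch skips the sync; when
a guest vcpu is current, nothing happens at disable time and the switch saves
it. Where the flag would really live, and when it gets cleared, is something I
still need to work out.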
--jyh
>-----Original Message-----
>From: Keir Fraser [mailto:keir.fraser@xxxxxxxxxxxxx]
>Sent: Thursday, April 01, 2010 6:25 PM
>To: Jiang, Yunhong
>Cc: Jan Beulich; xen-devel@xxxxxxxxxxxxxxxxxxx
>Subject: Re: [PATCH] [RFC] Fix a small window on CPU online/offline
>
>On 01/04/2010 10:22, "Jiang, Yunhong" <yunhong.jiang@xxxxxxxxx> wrote:
>
>> The following code tries to sync the vcpu context in stop_machine_run()
>> context,
>> so that the vCPU will get the context synced. However, it still does not
>> resolve issue c. I'm considering marking curr_vcpu() as idle as well, so
>> that idle_task_exit() will not try to sync the context again, but I suspect
>> that is not the right way.
>>
>> Any suggestion?
>
>Painful though it is to say it, the answer may be to run the stop_machine in
>a 'hypervisor thread' context, as Linux does. That sidesteps the whole
>issue, but of course would need us to introduce a whole bunch of
>infrastructure which we don't have in Xen presently.
>
>Another approach would be to defer a lot of what goes on in __cpu_disable()
>until play_dead()/cpu_exit_clear(). You could do the stop_machine_run()
>invocation from there, knowing that you can sync guest state before zapping
>the cpu_online_map... Actually this approach does start to unravel and need
>quite a lot of subtle changes itself!
>
>I would probably investigate option A (a more Linux-y stop_machine_run) but
>instead of the kthread_create() infrastructure I would consider extending
>the purpose of our 'idle vcpus' to provide a sufficiently more generic
>'hypervisor vcpu context'. For our purposes that would mean:
> 1. Allow the scheduling priority of an idle vcpu to be changed to
>highest-priority (would mean some, hopefully not very major, scheduler
>surgery).
> 2. Add a hook to the idle loop to call out to a hook in stop_machine.c.
>
>Then you would loop over all online CPUs, like in Linux, whacking up the
>priority and getting the idle vcpu to call back to us. Umm, we would also
>need some kind of wait_for_completion() mechanism, which might look a bit
>like aspects of continue_hypercall_on_cpu() -- we would pause the running
>vcpu, change schedule_tail hook, and exit. We would then get our pause count
>dropped when the stop_machine stuff is done, and re-gain control via
>schedule_tail.
>
>Well, it wouldn't be a trivial patch by any means, but potentially not *too*
>bad, and less fragile than some other approaches? I think it's a major
>benefit that it takes us closer to Linux semantics, as this stuff is
>fragile, and we're already quite a way down the road of Linux but currently
>our stop_machine is just a bit half-arsed and causing us problems.
>
>By the way, you could consider that c/s 21049 starts to take us down this
>path: the spin_trylock()/create_continuation() semantics is not totally
>dissimilar to Linux's mutex_lock (in which other softirqs/vcpus can get to
>run while we wait for the lock to be released), which are used for the
>cpu-hotplug related locks in Linux.
>
> -- Keir
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel