>From: Keir Fraser [mailto:Keir.Fraser@xxxxxxxxxxxx]
>Sent: February 4, 2008 1:44
>This is a good thing to have, and my main comments are
>stylistic. Firstly, I
>think the name and interface change compared with Linux's stop_machine
>mechanism is gratuitous. I would prefer us to have the same
>also probably use the same or similar names throughout the
>where appropriate). I don't think the more flexible interface
>is useful -- certainly finding an application for
>rendezvousing a set of
>CPUs and running a function on a subset of those, with
>optional irq disable,
>is a bit of a brain bender. :-) So, stick to stop_machine interface,
>semantics, and naming....
Let me explain why the Linux stop_machine semantics are a bit unwieldy
for Xen, and why my changes diverge far enough from the Linux logic
that I ended up choosing a new name. :-)
Linux creates kernel threads and schedules them onto individual CPUs
to carry out the synchronization, and the thread that executes (*fn)
is the one that conducts the whole process. For example, suppose
_stop_machine is called on cpu_i with cpu_j as the target to execute
(*fn). stop_machine first creates a kernel thread bound to cpu_j and
then waits for completion on cpu_i, which causes a context switch.
When that kernel thread is scheduled on cpu_j, it creates
(max_cpus - 1) kernel threads on the remaining CPUs and waits for them
to be scheduled. Once the stop threads are running on all CPUs, cpu_j
first commands the other CPUs to stop activity, then executes (*fn),
and finally commands the others to resume. As a result, the original
flow on cpu_i is woken up on completion.
Xen has no dynamically created vcpus, so cpu_i can only send a
notification to cpu_j and wait for cpu_j to check it at some point.
A softirq bit is the natural choice, since do_softirq runs at a safe
point: just before resuming to the guest, or in idle_loop.
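To make the softirq idea concrete, here is a minimal single-threaded
sketch of the notification pattern. It is only a model, not Xen code:
every name in it (sim_raise_softirq, sim_do_softirq,
RENDEZVOUS_SOFTIRQ, rendezvous_action) is hypothetical, and a real
implementation would need atomic bit operations and an IPI to kick the
target CPU.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS            4
#define NR_SOFTIRQS        8
#define RENDEZVOUS_SOFTIRQ 3   /* hypothetical bit number */

typedef void (*softirq_fn)(unsigned int cpu);

static uint32_t     pending[NR_CPUS];      /* one pending bitmask per CPU */
static softirq_fn   handlers[NR_SOFTIRQS];
static unsigned int handled_on[NR_CPUS];   /* demo bookkeeping only */

/* Register the handler that runs when bit nr is found pending. */
static void sim_open_softirq(unsigned int nr, softirq_fn fn)
{
    assert(nr < NR_SOFTIRQS);
    handlers[nr] = fn;
}

/* cpu_i side: mark bit nr pending on the target CPU and return at once. */
static void sim_raise_softirq(unsigned int cpu, unsigned int nr)
{
    assert(cpu < NR_CPUS && nr < NR_SOFTIRQS);
    pending[cpu] |= UINT32_C(1) << nr;
}

/* cpu_j side: called only at a safe point; drains every pending bit. */
static void sim_do_softirq(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    for (unsigned int nr = 0; nr < NR_SOFTIRQS; nr++) {
        uint32_t bit = UINT32_C(1) << nr;

        if (!(pending[cpu] & bit))
            continue;
        pending[cpu] &= ~bit;          /* clear before running the handler */
        if (handlers[nr])
            handlers[nr](cpu);
    }
}

/* Example handler: records that the rendezvous request was seen. */
static void rendezvous_action(unsigned int cpu)
{
    handled_on[cpu]++;
}
```

The point of the model is that raising the bit does nothing by itself:
the handler runs only when the target CPU next reaches its own safe
point and calls the drain function.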
That makes it cumbersome to let cpu_j conduct the stop process.
First, cpu_i needs to block itself after sending the notification to
cpu_j, or else cpu_i would not handle the softirq later when cpu_j
asks it to. But block on what? Maybe an event channel... But the
continuation stack mechanism in Xen means more dirty work would have
to be added, since previous frames are lost and the stack is reset to
the bottom after resume. Alternatively, cpu_i could temporarily poll
for softirqs in a check loop, but that may still cause a context
switch if a schedule softirq is triggered in between.
All of the above made me reluctant to follow the Linux semantics, and
further thought led me to ask: why not let cpu_i conduct the stop
process directly, and then let the concerned CPUs call (*fn) by adding
a new action, ***_INVOKE? This way cpu_i doesn't need to be cut off
from its current flow, and once stop_machine returns, all the work
that has to be done in a stopped environment is already complete.
Going further, we could even conduct CPU hotplug in one call, with
all the other APs invoking cpu_disable at the same time at S3.
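The flow I have in mind can be sketched roughly as below. This is an
illustration under loud assumptions, not the patch itself: POSIX
threads stand in for physical CPUs, CPU 0 plays cpu_i, interrupt
disabling appears only as comments, and all names (rdv, RDV_INVOKE,
sim_stop_and_invoke, run_demo, count_call) are made up for the
example. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS 4   /* CPU 0 plays cpu_i, the initiator */

enum rdv_state { RDV_IDLE, RDV_STOP, RDV_INVOKE, RDV_RESUME };

typedef void (*rdv_fn)(unsigned int cpu, void *data);

static struct {
    atomic_int    state;   /* action currently announced by the initiator */
    atomic_int    acks;    /* remote CPUs that reached the current action */
    unsigned long mask;    /* CPUs that must call fn during INVOKE */
    rdv_fn        fn;
    void         *data;
} rdv;

static atomic_int fn_calls[NR_CPUS];

static void announce(enum rdv_state s)
{
    atomic_store(&rdv.acks, 0);
    atomic_store(&rdv.state, (int)s);
}

static void wait_acks(int n)
{
    while (atomic_load(&rdv.acks) != n)
        sched_yield();
}

static void wait_state(enum rdv_state s)
{
    while (atomic_load(&rdv.state) != (int)s)
        sched_yield();
}

/* Remote CPU: stands in for the softirq handler run at a safe point. */
static void *remote_cpu(void *arg)
{
    unsigned int cpu = (unsigned int)(uintptr_t)arg;

    wait_state(RDV_STOP);              /* irqs would be disabled here */
    atomic_fetch_add(&rdv.acks, 1);
    wait_state(RDV_INVOKE);
    if (rdv.mask & (1UL << cpu))
        rdv.fn(cpu, rdv.data);
    atomic_fetch_add(&rdv.acks, 1);
    wait_state(RDV_RESUME);            /* irqs would be re-enabled here */
    atomic_fetch_add(&rdv.acks, 1);
    return NULL;
}

/* Initiator: drives every phase itself and never leaves its own flow. */
static void sim_stop_and_invoke(unsigned long mask, rdv_fn fn, void *data)
{
    rdv.mask = mask;
    rdv.fn = fn;
    rdv.data = data;
    announce(RDV_STOP);                /* initiator would disable irqs too */
    wait_acks(NR_CPUS - 1);
    announce(RDV_INVOKE);
    if (mask & 1UL)
        fn(0, data);
    wait_acks(NR_CPUS - 1);
    announce(RDV_RESUME);
    wait_acks(NR_CPUS - 1);
    atomic_store(&rdv.state, (int)RDV_IDLE);
}

/* Example fn: counts invocations per CPU and in total. */
static void count_call(unsigned int cpu, void *data)
{
    atomic_fetch_add(&fn_calls[cpu], 1);
    atomic_fetch_add((atomic_int *)data, 1);
}

/* Run one rendezvous; return how many fn calls had completed by the
 * time the initiator's call returned. */
static int run_demo(unsigned long mask)
{
    pthread_t t[NR_CPUS];
    atomic_int total = 0;
    int seen;

    for (unsigned int cpu = 1; cpu < NR_CPUS; cpu++) {
        int rc = pthread_create(&t[cpu], NULL, remote_cpu,
                                (void *)(uintptr_t)cpu);
        assert(rc == 0);
        (void)rc;
    }
    sim_stop_and_invoke(mask, count_call, &total);
    seen = atomic_load(&total);        /* read before joining the threads */
    for (unsigned int cpu = 1; cpu < NR_CPUS; cpu++)
        pthread_join(t[cpu], NULL);
    return seen;
}
```

Calling run_demo(0x6) asks CPUs 1 and 2 to run fn. The count is read
before the threads are joined, which is the property I care about: by
the time the call returns on cpu_i, the work in the stopped
environment has already been done.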
Having reached that conclusion, I really thought it was different from
the Linux stop_machine semantics, and so I rewrote the code, including
the name, to mark the difference. Although S3 is the only user of this
interface so far, I do think it will be useful in other places later,
whenever one CPU wants to kick another to do something at a safe point.
>Also, I'd prefer this to be implemented in
>common/stop_machine.c if at all
>possible. It's not really x86 specific. Certainly I do not
>want it in smp.c,
>as that file is full enough already of random cruft with no other home.
>Oh, also I think you are missing local_irq_disable() on the
>CPU that calls
>on_rendezvous_cpus(). Like the Linux implementation you should
>it at the same time you signal other CPUs to do so.
>Apart from that it's a good idea, and I'll look more closely
>at how you tie
>it in to CPU hotplug when you resubmit it.
So before resubmitting, I'll wait for your further comments on the
explanation above.
Xen-devel mailing list