This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen

To: Haitao Shan <maillists.shan@xxxxxxxxx>
Subject: Re: [Xen-devel] Re: [PATCH 1/4] CPU online/offline support in Xen
From: Frank van der Linden <Frank.Vanderlinden@xxxxxxx>
Date: Wed, 10 Sep 2008 10:05:06 -0600
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, "Shan, Haitao" <haitao.shan@xxxxxxxxx>, Keir Fraser <keir.fraser@xxxxxxxxxxxxx>
Delivery-date: Wed, 10 Sep 2008 09:05:40 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <481ad8630809100559k2ecdb5ffidab0a2754f0cf869@xxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <823A93EED437D048963A3697DB0E35DE01BE83CC@xxxxxxxxxxxxxxxxxxxxxxxxxxxx> <C4ED6370.26F21%keir.fraser@xxxxxxxxxxxxx> <481ad8630809100559k2ecdb5ffidab0a2754f0cf869@xxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
User-agent: Thunderbird (X11/20080616)
Haitao Shan wrote:
Agree. Placing migration in stop_machine context will definitely make
our jobs easier. I will start making a new patch tomorrow. :)
I placed the migration code outside the stop_machine_run context partly
because I was not quite sure how long it would take to migrate all the
vcpus away. If it takes too much time, all useful work is blocked,
since all CPUs are in the stop_machine context. Of course, I borrowed
the idea from the kernel, which also led me to that decision.

2008/9/10 Keir Fraser <keir.fraser@xxxxxxxxxxxxx>:
I feel this is more complicated than it needs to be.

How about clearing VCPUs from the offlined CPU's runqueue from the very end
of __cpu_disable()? At that point all other CPUs are safely in softirq
context with IRQs disabled, and we are running on the correct CPU (being
offlined). We could have a hook into the scheduler subsystem at that point
to break affinities, assign to different runqueues, etc. We would just need
to be careful not to try an IPI. :-) This approach would not need a
cpu_schedule_map (which is really increasing code fragility imo, by creating
possible extra confusion about which cpumask is the right one to use in a
given situation).

My feeling, unless I've missed something, is that this would make the patch
quite a bit smaller and with a smaller spread of code changes.

 -- Keir
This would also address some problems I saw with the patch: race conditions around migration of VCPUs, because other CPUs may call runq_tickle, or a hypercall may come in and change the VCPU affinity, since things are done in two stages.

The changes I have are more complicated, because I was working off 3.1.4, which is our current Xen version. It doesn't have things like stop_machine_run. But if the patch is simplified in this manner, it is easier for us to use, and we can just backport things like stop_machine_run for the time being.

The other issue I was seeing was that cpu_up sometimes did not succeed in actually getting a CPU to boot. But there have been a few fixes to smpboot.c, so I'll have to see if that always works now.

- Frank

Xen-devel mailing list