
Re: [Xen-devel] [PATCH] Yield to VCPU hcall, spinlock yielding

To: habanero@xxxxxxxxxx
Subject: Re: [Xen-devel] [PATCH] Yield to VCPU hcall, spinlock yielding
From: Bryan S Rosenburg <rosnbrg@xxxxxxxxxx>
Date: Wed, 8 Jun 2005 16:49:47 -0400
Cc: Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>, ryanh@xxxxxxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, hohnbaum@xxxxxxxxxx, Orran Y Krieger <okrieg@xxxxxxxxxx>
Delivery-date: Wed, 08 Jun 2005 20:48:11 +0000
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <200506081411.06506.habanero@xxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx

habanero@xxxxxxxxxxxxxxxxxxxxxxx wrote on 06/08/2005 03:11:06 PM:

> > In our original posting, we proposed that the Linux interrupt handler
> > for preemption notifications would create (or unblock) a
> > high-priority kernel thread which would then yield back to the
> > hypervisor.  To Linux on other CPUs, the de-scheduled CPU would
> > appear to be busy running the high-priority thread, and all real work
> > that that CPU had been doing would be eligible for stealing.
> IMO, I don't think this alone is enough to encourage task migration.  
> The primary motivator to steal is a 25% or more load imbalance, and one
> extra fake kernel thread will probably not be enough to trigger this.

The kernel thread is needed at the very least to ensure that all user programs on the de-scheduled CPU are available for migration.  In an important case, a program on the de-scheduled CPU holds a futex, and another CPU goes idle because its program blocks on the futex.  We'd want the idle CPU to pick up the futex holder, and I'm assuming (with very little actual knowledge) that the Linux scheduler would make that happen.
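As a rough illustration of the futex case, here is a toy model in Python. It is not Linux scheduler code: `Cpu`, `preempt_notify`, and `idle_pull` are hypothetical names, and the rule that an idle CPU may take only queued (not currently running) tasks is a simplifying assumption about how stealing would behave.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Cpu:
    """Toy run queue: one running task plus a list of runnable, waiting tasks."""
    name: str
    running: Optional[str] = None
    queued: List[str] = field(default_factory=list)


def preempt_notify(cpu: Cpu, placeholder: str = "yield-thread") -> None:
    """Model the proposal: the high-priority placeholder thread takes the CPU,
    and whatever was running goes back onto the run queue."""
    if cpu.running is not None:
        cpu.queued.append(cpu.running)
    cpu.running = placeholder


def idle_pull(idle: Cpu, busy: Cpu) -> Optional[str]:
    """Assumed stealing rule: an idle CPU takes one queued task from another
    CPU, but never the task that CPU is currently running."""
    if idle.running is not None or not busy.queued:
        return None
    task = busy.queued.pop(0)
    idle.running = task
    return task


cpu0 = Cpu("cpu0", running="futex-holder")  # about to be de-scheduled
cpu1 = Cpu("cpu1")                          # idle: its task blocked on the futex

before = idle_pull(cpu1, cpu0)  # holder is still "running", nothing to take
preempt_notify(cpu0)            # placeholder thread displaces the holder
after = idle_pull(cpu1, cpu0)   # holder is now queued and can be taken
```

In this model the futex holder only becomes available to the idle CPU once the placeholder thread has displaced it, which is the point of the kernel thread. Whether the real scheduler pulls it promptly is a separate question.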

> To solve this and other issues, I believe we need an extra modifier to
> the Linux kernel cpus' load value, which Xen could modify to hint the
> kernel what cpus' relative processing power is.  The Linux kernel
> scheduler's per cpu load values would be something like (max_cpu_power
> / cpu_power * nr_running).  Xen could update cpu_power for a number of
> situations, a "long" preemption, a much faster alternative to a vcpu
> hot-unplug (don't unplug, just set cpu_power to 0), and to normalize
> load values for vcpus which have different time-slice lengths on the
> physical cpus.  
> I would hope something like this could also be used without Xen on Linux
> so it has wider appeal.  One thing that comes to mind is normalizing
> cpus' load when some cpus may be "speed stepped" down for power
> management or heat issues.
> -Andrew

I'd view your "cpu_power" proposal as orthogonal to (or perhaps complementary to) our ideas on preemption notification.  It's aimed more at load-balancing and fair scheduling than specifically at the problems that arise with the preemption of lock holders.  On the apparent CPU speed issue, does Linux account in any way for different interrupt loads on different processors?  Is a program just out of luck if it happens to get scheduled on a processor with heavy interrupt traffic, or will Linux notice that it's not making the same progress as its peers and shuffle things around?  It seems that your cpu_power proposal might have something to contribute here.
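To make the quoted load formula concrete, here is a minimal sketch in Python. The function names, the handling of `cpu_power == 0`, and the way the 25% imbalance test is expressed are assumptions made for illustration; none of this is taken from the Linux scheduler source.

```python
import math


def weighted_load(nr_running: int, cpu_power: float, max_cpu_power: float) -> float:
    """Per-CPU load as in the quoted formula: max_cpu_power / cpu_power * nr_running.

    Assumption: a cpu_power of 0 (the "don't unplug, just set cpu_power to 0"
    case) is treated as infinite load so the CPU is never a migration target;
    the formula as quoted would otherwise divide by zero.
    """
    if cpu_power <= 0:
        return math.inf
    return max_cpu_power / cpu_power * nr_running


def is_imbalanced(busy_load: float, other_load: float, threshold: float = 0.25) -> bool:
    """Assumed reading of the '25% or more load imbalance' rule from the quote."""
    return busy_load >= other_load * (1.0 + threshold)


# Two VCPUs with four runnable tasks each; one gets half the physical CPU time.
full = weighted_load(nr_running=4, cpu_power=100, max_cpu_power=100)
half = weighted_load(nr_running=4, cpu_power=50, max_cpu_power=100)
print(full, half, is_imbalanced(half, full))
```

With equal `nr_running`, the halved `cpu_power` alone is enough to make the slower VCPU look imbalanced under this reading, so tasks would migrate away from it without any fake thread being involved.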

- Bryan
