RE: [Xen-devel] unnecessary VCPU migration happens again

To: "Emmanuel Ackaouy" <ack@xxxxxxxxxxxxx>, "Xu, Anthony" <anthony.xu@xxxxxxxxx>
Subject: RE: [Xen-devel] unnecessary VCPU migration happens again
From: "Petersson, Mats" <Mats.Petersson@xxxxxxx>
Date: Wed, 6 Dec 2006 15:13:59 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, xen-ia64-devel <xen-ia64-devel@xxxxxxxxxxxxxxxxxxx>
In-reply-to: <20061206140135.GA19638@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>
 

> -----Original Message-----
> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx 
> [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of 
> Emmanuel Ackaouy
> Sent: 06 December 2006 14:02
> To: Xu, Anthony
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; xen-ia64-devel
> Subject: Re: [Xen-devel] unnecessary VCPU migration happens again
> 
> Hi Anthony.
> 
> Could you send xentrace output for scheduling operations
> in your setup?
> 
> Perhaps we're being a little too aggressive spreading
> work across sockets. We do this on vcpu_wake right now.
> 
> I'm not sure I understand why HVM VCPUs would block
> and wake more often than PV VCPUs though. Can you
> explain?

Whilst I don't know the specifics of the original poster's setup, I can
tell you why HVM and PV guests see a differing number of scheduling
operations...

Every time you get an IOIO/MMIO vmexit that leads to a qemu-dm
interaction, you'll get a context switch. So for an average IDE block
read/write (for example) on x86, you get 4-5 IOIO intercepts to send the
command to qemu, then an interrupt is sent to the guest to indicate that
the operation is finished, followed by a 256 x 16-bit IO read/write of
the sector content (which is normally just one IOIO intercept unless the
driver is "stupid"). This means around a dozen or so schedule operations
to do one disk IO operation.
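
To make the intercept count concrete, here is a rough sketch of the
guest-side PIO sequence for a single-sector IDE read. This is not taken
from Xen or qemu-dm; the port numbers are simply the standard IDE
taskfile, the exact number of intercepts depends on which registers the
guest driver actually writes, and outb/insw are spelled out as inline
assembly only to keep the sketch self-contained:

/*
 * Rough sketch only -- not code from Xen or qemu-dm.  It shows the
 * guest-side PIO sequence for one 28-bit LBA IDE sector read; under an
 * HVM guest every access to the emulated IDE ports (0x1F0-0x1F7) is an
 * IOIO vmexit that gets forwarded to qemu-dm in dom0.
 */
#include <stdint.h>

static inline void outb(uint16_t port, uint8_t val)
{
    __asm__ __volatile__("outb %0, %1" : : "a"(val), "Nd"(port));
}

static inline void insw(uint16_t port, void *buf, uint32_t count)
{
    __asm__ __volatile__("rep insw"
                         : "+D"(buf), "+c"(count)
                         : "d"(port)
                         : "memory");
}

void ide_pio_read_sector(uint32_t lba, void *buf)
{
    outb(0x1F2, 1);                           /* sector count      - intercept */
    outb(0x1F3, lba & 0xFF);                  /* LBA bits 0-7      - intercept */
    outb(0x1F4, (lba >> 8) & 0xFF);           /* LBA bits 8-15     - intercept */
    outb(0x1F5, (lba >> 16) & 0xFF);          /* LBA bits 16-23    - intercept */
    outb(0x1F6, 0xE0 | ((lba >> 24) & 0x0F)); /* drive + LBA 24-27 - intercept */
    outb(0x1F7, 0x20);                        /* READ SECTORS cmd  - intercept */

    /* ... guest blocks here until qemu-dm raises the completion IRQ ... */

    /* 256 x 16-bit reads of the data port: "rep insw" is normally one
     * IOIO intercept; a driver doing 256 separate inw() calls would take
     * 256 of them. */
    insw(0x1F0, buf, 256);
}

Each of those port accesses traps to the hypervisor, wakes qemu-dm in
dom0 and blocks the guest VCPU until the emulation completes, which is
where the extra block/wake scheduling operations come from.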

The same operation in PV (or using a PV driver in an HVM guest, of
course) would require a single transaction from DomU to Dom0 and back,
so only two schedule operations.
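
For comparison, a PV-style block transaction is, in outline, just one
request on a shared ring plus one event-channel notification. The
structure and function names below are hypothetical (the real interface
is the blkif protocol with grant tables); they are only meant to show
the shape of it:

#include <stdint.h>

/* Hypothetical, simplified ring -- not the real blkif layout. */
struct blk_request {
    uint64_t sector;      /* first sector of the transfer              */
    uint32_t nr_sectors;  /* length of the transfer                    */
    uint32_t gref;        /* grant reference of the guest data page    */
    uint8_t  write;       /* 0 = read, 1 = write                       */
    uint8_t  id;          /* echoed back in the backend's response     */
};

struct blk_ring {
    struct blk_request req[32];  /* requests, indexed modulo 32        */
    volatile uint32_t  req_prod; /* producer index, bumped by frontend */
    volatile uint32_t  req_cons; /* consumer index, bumped by backend  */
};

/* Placeholder for the event-channel send (EVTCHNOP_send) that would
 * actually wake the backend in dom0. */
static void notify_backend(int evtchn)
{
    (void)evtchn;
}

static void submit_read(struct blk_ring *ring, int evtchn,
                        uint64_t sector, uint32_t nr_sectors, uint32_t gref)
{
    struct blk_request *req = &ring->req[ring->req_prod % 32];

    req->sector     = sector;
    req->nr_sectors = nr_sectors;
    req->gref       = gref;
    req->write      = 0;
    req->id         = (uint8_t)(ring->req_prod & 0xFF);

    __sync_synchronize();   /* request must be visible before the index */
    ring->req_prod++;

    notify_backend(evtchn); /* one wakeup of dom0; the domU VCPU can now
                             * block until the response event arrives   */
}

So the guest blocks and wakes once per request rather than once per
emulated register access, hence only the two schedule operations.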

The same "problem" occurs of course for other hardware devices such as
network, keyboard, mouse, where a transaction consists of more than a
single read or write to a single register. 

--
Mats
> 
> If you could gather some scheduler traces and send
> results, it will give us a good idea of what's going
> on and why. The multi-core support is new and not
> widely tested so it's possible that it is being
> overly aggressive or perhaps even buggy.
> 
> Emmanuel.
> 
> 
> On Fri, Dec 01, 2006 at 06:11:32PM +0800, Xu, Anthony wrote:
> > Emmanuel,
> > 
> > I found that unnecessary VCPU migration happens again.
> > 
> > 
> > My environment is:
> > 
> > IPF, two sockets, two cores per socket, 1 thread per core.
> > 
> > There are 4 cores in total.
> > 
> > There are 3 domains, and they are all UP,
> > so there are 3 VCPUs in total.
> > 
> > One is domain0,
> > the other two are VTI domains.
> > 
> > I found there are lots of migrations.
> > 
> > 
> > This is caused by the code segment below in the function csched_cpu_pick.
> > When I comment out this code segment, there is no migration in the
> > above environment.
> > 
> > 
> > 
> > I have a short analysis of this code.
> > 
> > This code handles multi-core and multi-thread, which is very good:
> > if two VCPUs run on LPs which belong to the same core, the
> > performance is bad, so if there are free LPs, we should let these
> > two VCPUs run on different cores.
> > 
> > This code may work well with a para-domain,
> > because a para-domain is seldom blocked;
> > it may only block due to the guest calling the "halt" instruction.
> > This means that if an idle VCPU is running on an LP,
> > no non-idle VCPU is about to run on this LP.
> > In this environment, I think the code below should work well.
> > 
> > 
> > But in an HVM environment, an HVM VCPU is blocked by IO operations.
> > That is to say, if an idle VCPU is running on an LP, maybe an HVM
> > VCPU is blocked, and that HVM VCPU will run on this LP when it is
> > woken up.
> > In this environment, the code below causes unnecessary migrations.
> > I think this defeats the goal of this code segment.
> > 
> > On the IPF side, migration is time-consuming, so it causes some
> > performance degradation.
> > 
> > 
> > I have a proposal, though it may not be a good one.
> > 
> > We can change the meaning of an idle LP:
> > 
> > an idle LP means an idle VCPU is running on this LP, and there is no
> > VCPU blocked on this LP (if such a blocked VCPU were woken up, it
> > would run on this LP).
> > 
> > 
> > 
> > --Anthony
> > 
> > 
> >         /*
> >          * In multi-core and multi-threaded CPUs, not all idle execution
> >          * vehicles are equal!
> >          *
> >          * We give preference to the idle execution vehicle with the most
> >          * idling neighbours in its grouping. This distributes work across
> >          * distinct cores first and guarantees we don't do something stupid
> >          * like run two VCPUs on co-hyperthreads while there are idle cores
> >          * or sockets.
> >          */
> >         while ( !cpus_empty(cpus) )
> >         {
> >             nxt = first_cpu(cpus);
> > 
> >             if ( csched_idler_compare(cpu, nxt) < 0 )
> >             {
> >                 cpu = nxt;
> >                 cpu_clear(nxt, cpus);
> >             }
> >             else if ( cpu_isset(cpu, cpu_core_map[nxt]) )
> >             {
> >                 cpus_andnot(cpus, cpus, cpu_sibling_map[nxt]);
> >             }
> >             else
> >             {
> >                 cpus_andnot(cpus, cpus, cpu_core_map[nxt]);
> >             }
> > 
> >             ASSERT( !cpu_isset(nxt, cpus) );
> >         }
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel