WARNING - OLD ARCHIVES

This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/
   
 
 
Xen 
 
Home Products Support Community News
 
   
 

xen-devel

Re: [Xen-devel] atropos scheduler broken

To: Steven Hand <Steven.Hand@xxxxxxxxxxxx>
Subject: Re: [Xen-devel] atropos scheduler broken
From: Ian Pratt <Ian.Pratt@xxxxxxxxxxxx>
Date: Tue, 26 Oct 2004 10:01:33 +0100
Cc: Diwaker Gupta <diwakergupta@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxxx, Ian.Pratt@xxxxxxxxxxxx
Delivery-date: Tue, 26 Oct 2004 10:01:34 +0100
Envelope-to: Steven.Hand@xxxxxxxxxxxx
In-reply-to: Your message of "Tue, 26 Oct 2004 08:36:50 BST." <E1CMLsw-00083A-00@xxxxxxxxxxxxxxxxx>
> >o first I create 2 VMs -- VM1 and VM2
> >o then I change their atropos params as follows:
> >$ xm atropos 1 10 100 1 1
> >$ xm atropos 2 70 100 1 1
> >Ideally, this should guarantee that VM1 gets 10ns of CPU time every
> >100ns, and VM2 gets 70ns every 100ns, and that any left over CPU time
> >will be shared between the 2.


You might find the following program useful while testing out the
scheduler. It prints the amount of CPU it's getting once a
second. Atropos was working fine for CPU bound domains a few
months back, but had some fairly odd behaviour for IO intensive
domains. Because no one has been using it its probably rotted a
bit. The original algorithm (used in the Nemesis OS) worked just
fine, so this is just an implementation issue.

Ian

/******************************************************************************
 * slurp.c
 * 
 * Slurps spare CPU cycles and prints a percentage estimate every second.
 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


/* rpcc: get full 64-bit Pentium TSC value */
static __inline__ unsigned long long int rpcc(void) 
{
    unsigned int __h, __l;
    __asm__ __volatile__ ("rdtsc" :"=a" (__l), "=d" (__h));
    return (((unsigned long long)__h) << 32) + __l;
}


/*
 * find_cpu_speed:
 *   Interrogates /proc/cpuinfo for the processor clock speed.
 * 
 *   Returns: speed of processor in MHz, rounded down to nearest whole MHz.
 */
#define MAX_LINE_LEN 50
int find_cpu_speed(void)
{
    FILE *f;
    char s[MAX_LINE_LEN], *a, *b;

    if ( (f = fopen("/proc/cpuinfo", "r")) == NULL ) goto out;

    while ( fgets(s, MAX_LINE_LEN, f) )
    {
        if ( strstr(s, "cpu MHz") )
        {
            /* Find the start of the speed value, and stop at the dec point. */
            if ( !(a=strpbrk(s,"0123456789")) || !(b=strpbrk(a,".")) ) break;
            *b = '\0';
            fclose(f);
            return(atoi(a));
        }
    }

 out:
    fprintf(stderr, "find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz");
    exit(1);
}


int main(void)
{
    int mhz, i;

    /*
     * no_preempt_estimate is our estimate, in clock cycles, of how long it
     * takes to execute one iteration of the main loop when we aren't
     * preempted. 50000 cycles is an overestimate, which we want because:
     *  (a) On the first pass through the loop, diff will be almost 0,
     *      which will knock the estimate down to <40000 immediately.
     *  (b) It's safer to approach real value from above than from below --
     *      note that this algorithm is unstable if n_p_e gets too small!
     */
    unsigned int no_preempt_estimate = 50000;

    /*
     * prev  = timestamp on previous iteration;
     * this  = timestamp on this iteration;
     * diff  = difference between the above two stamps;
     * start = timestamp when we last printed CPU % estimate;
     */
    unsigned long long int prev, this, diff, start;

    /*
     * preempt_time = approx. cycles we've been preempted for since last stats
     *                display.
     */
    unsigned long long int preempt_time = 0;

    /* Required in order to print intermediate results at fixed period. */
    mhz = find_cpu_speed();
    printf("CPU speed = %d MHz\n", mhz);

    start = prev = rpcc();

    for ( ; ; )
    {
        /*
         * By looping for a while here we hope to reduce affect of getting
         * preempted in critical "timestamp swapping" section of the loop.
         * In addition, it should ensure that 'no_preempt_estimate' stays
         * reasonably large which helps keep this algorithm stable.
         */
        for ( i = 0; i < 10000; i++ );

        /*
         * The critical bit! Getting preempted here will shaft us a bit,
         * but the loop above should make this a rare occurrence.
         */
        this = rpcc();
        diff = this - prev;
        prev = this;

        /* if ( diff > (1.5 * preempt_estimate) */
        if ( diff > no_preempt_estimate + (no_preempt_estimate>>1) )
        {
            /* We were probably preempted for a while. */
            preempt_time += diff - no_preempt_estimate;            
        }
        else
        {
            /*
             * Looks like we weren't preempted -- update our time estimate:
             * New estimate = 0.75*old_est + 0.25*curr_diff
             */
            no_preempt_estimate =
                (no_preempt_estimate>>1) + (no_preempt_estimate>>2) +
                (diff>>2);
        }
            
        /* Dump CPU time every second. */
        if ( (this - start) / mhz > 1000000 ) 
        { 
            printf("Slurped %.2f%% CPU, TSC %08x\n", 
                   100.0*((this-start-preempt_time)/((double)this-start)),
                   this);
            start = this;
            preempt_time = 0;
        }
    }

    return(0);
}

<Prev in Thread] Current Thread [Next in Thread>