xen-devel

[Xen-devel] Re: credit scheduler error rates as reported by HP and UCSD

To: m+Ian.Pratt@xxxxxxxxxxxx, ackaouy@xxxxxxxxx, lucy.cherkasova@xxxxxx, xen-devel@xxxxxxxxxxxxxxxxxxx, ncmike@xxxxxxxxxx
Subject: [Xen-devel] Re: credit scheduler error rates as reported by HP and UCSD
From: Lucy Cherkasova <lucy@xxxxxxxxxxxxxxxx>
Date: Thu, 12 Apr 2007 09:22:27 -0700 (PDT)

Hi Mike,

> 
> My first observation is that the credit scheduler will select a vcpu
> that has exceeded its credit when there is no other work to be done on
> any of the other physical cpus in the system.

In the version of the paper that you read and refer to, we deliberately
restricted the three-scheduler comparison to a 1-CPU machine:
the goal was to compare the "BASIC" scheduler functionality.
I will present some more results for the 2-CPU case at the Xen Summit.

> 
> In light of the paper, with very low allocation targets for vcpus, it
> is not surprising that the positive allocation errors can be quite
> large. It is also not surprising that the errors (and error
> distribution) decrease with larger allocation targets. 

Because we used a 1-CPU machine, the explanation of this phenomenon is different
(it is not related to load balancing of VCPUs), and the Credit scheduler
can/should be made more precise.
What our paper does not show is the original error distribution for Credit
("original" meaning as it was first released). The results that you see in
the paper are with the next, significantly improved version by Emmanuel.
I believe that there is still significant room for improvement.

> 
> None of this explains the negative allocation errors, where the vcpu's
> received less than their pcpu allotments. I speculate that a couple of
> circumstances may contribute to negative allocation errors:
> 
> very low weights attached to domains will cause the credit scheduler
> to attempt to pause vcpus almost every accounting cycle. vcpus may
> therefore not have as many opportunities to run as frequently as
> possible. If the ALERT measurement method is different, or has a
> different interval, than the credit scheduler's 10ms tick and 30ms
> accounting cycle, negative errors may result in the view of ALERT. 

The ALERT benchmark sets the allocation of a SINGLE domain (on a 1-CPU machine,
with no other competing domains while the benchmark runs) to a chosen
target CPU allocation, e.g., 20%, in the non-work-conserving mode.
This means that the CPU allocation is CAPPED at 20%. This single domain runs
"slurp" (a tight CPU loop, 1 process) to consume the allocated CPU share.

The monitoring part of ALERT just collects the measurements from the system
using both XenMon and xentop with 1 second reporting granularity.
Since 1 sec is so much larger than the 30 ms slices, it should be possible
to get a very accurate CPU allocation for larger CPU allocation targets.
However, for a 1% CPU allocation you have an immediate error, because
Credit will allocate a 30 ms slice (that is 3% of 1 sec). If Credit
used 10 ms slices, then the error would be (theoretically) bounded
by 1%.
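
To make the arithmetic above concrete, here is a minimal sketch. It assumes
that exactly one whole slice runs inside one measurement window; the function
names and parameters are hypothetical and are not part of Xen, Credit, or ALERT.

```python
def slice_share_pct(slice_ms, window_ms):
    """Share of the measurement window taken by one slice, in percent."""
    return 100.0 * slice_ms / window_ms


def one_slice_error(target_pct, slice_ms, window_ms):
    """Signed error (measured - target), in percentage points, assuming
    exactly one whole slice runs during the window (an assumption made
    for illustration only)."""
    return slice_share_pct(slice_ms, window_ms) - target_pct


# A 30 ms slice inside a 1 sec window is 3% of the window.
print(slice_share_pct(30, 1000))       # 3.0
# With a 1% target, one 30 ms slice overshoots by 2 percentage points.
print(one_slice_error(1.0, 30, 1000))  # 2.0
# A 10 ms slice (assumed here for comparison) would be 1% of the window.
print(slice_share_pct(10, 1000))       # 1.0
```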

The expectation is that each 1 sec measurement should show 20% CPU
utilization for this domain.

We run ALERT for different CPU allocation targets, from 1% to 90%.
The reported error is the error between the targeted CPU allocation and
the measured CPU allocation at 1 sec granularity.
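
As a rough sketch of this error computation: the snippet below assumes the
error is the signed difference (measured minus target) in percentage points,
so that a positive value means the domain got more than its target. Whether
the paper also normalizes by the target is not stated in this message. The
function name and the sample values are made up for illustration.

```python
def allocation_errors(target_pct, measured_pcts):
    """Signed error per 1 sec sample, in percentage points.

    Positive: the domain received more CPU than the target.
    Negative: the domain received less CPU than the target.
    """
    return [measured - target_pct for measured in measured_pcts]


# Made-up 1 sec utilization samples for a 20% target (not real data).
samples = [20.0, 21.5, 19.0]
print(allocation_errors(20.0, samples))  # [0.0, 1.5, -1.0]
```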

> 
> I/O activity: if ALERT performs I/O activity, the test, even though
> it is "cpu intensive", may cause domu to block on dom0 frequently,
> meaning it will idle more, especially if dom0 has a low credit
> allocation.

There is no I/O activity. ALERT's functionality is very specific, as
described above: nothing else is happening in the system.


> 
> Questions: how does ALERT measure actual cpu allocation? Using Xenmon?

As I've mentioned above, we have measurements from both XenMon and xentop;
they are very close for these experiments.

> How does ALERT exercise the domain?

ALERT runs "slurp", a cpu-hungry loop, which will "eat"
as much CPU as you allocate to it. It is a single process application.
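
For readers unfamiliar with this kind of load generator, a stand-in might look
like the sketch below. This is not the actual "slurp" source; it is a
hypothetical single-process busy loop, bounded by a duration so that it
terminates, and its only work besides counting is reading the clock.

```python
import time


def burn_cpu(seconds):
    """Spin in a tight loop for roughly `seconds` of wall-clock time and
    return the number of iterations completed. No I/O is performed."""
    deadline = time.monotonic() + seconds
    iterations = 0
    while time.monotonic() < deadline:
        iterations += 1
    return iterations


# Under a CPU cap, fewer iterations should complete per wall-clock second.
print(burn_cpu(0.1) > 0)
```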


> The paper didn't mention the
> actual system calls and hypercalls the domains are making when running
> ALERT.

There are none: it is a pure user-space benchmark.


Best regards, Lucy
