Naoki,
Thank you for your work! The results look really good.
Overall, I think the scheduler needs some design work before these
patches can go in. At this point, no one fully understands the
principles on which it's supposed to run. I've been taking a close
look at the unmodified scheduler (notably trying to understand the
anomalies pointed out by Atsushi), and I think it's clear that there
are some flaws in the logic.
Before making a large change like this, I think we should do several things:
* Try to describe exactly what the scheduler is currently doing, and why
* If there are inconsistencies, fix them
* Extend that description to cover your proposed changes to the boost mechanism
Your changes, though demonstrably effective, make the scheduler much
more complicated. If no one understands it now, it will be even harder
to understand with your changes unless we set down some very clear
documentation of how the algorithm is supposed to work. Specifically,
we need to document:
* What different workloads need from the scheduler, e.g.:
+ Timeslices long enough for cpu-bound workloads to warm up the cache effectively
+ Fast response for "latency-sensitive" workloads, especially in the
face of multiple latency-sensitive workloads
+ Fairness with respect to weight
* At a high level, what we'd like to see happen
* How the individual mechanisms work (a rough sketch follows this list):
+ Credits: when they are added and subtracted
+ Priorities: when they are changed and why
+ Preemption: when a cpu-bound process gets preempted
+ Active / passive status: when and why a vcpu is switched from one to the other
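
For reference, here is a rough sketch of how I currently understand
those mechanisms to fit together. This is illustrative C only, not the
actual sched_credit.c code: the names (my_vcpu, CREDITS_PER_TICK,
CREDITS_PER_PERIOD, etc.) and the constants are invented, and the
active/passive bookkeeping is left out entirely.

/*
 * Illustrative sketch only -- NOT the real sched_credit.c.  Names and
 * constants are invented; active/passive accounting is omitted.
 */
enum prio { PRIO_BOOST, PRIO_UNDER, PRIO_OVER }; /* lower value = higher priority */

struct my_vcpu {
    int       credits;  /* spent while running, replenished by weight   */
    int       weight;   /* relative share, as set by xm sched-credit -w */
    enum prio prio;
};

#define CREDITS_PER_TICK   100   /* debited from the running vcpu each tick      */
#define CREDITS_PER_PERIOD 3000  /* handed out across all vcpus each accounting  */

/* Credits (subtraction): the vcpu on the cpu pays for its time. */
static void tick(struct my_vcpu *running)
{
    running->credits -= CREDITS_PER_TICK;
    if (running->credits < 0)
        running->prio = PRIO_OVER;   /* ran past its allocation */
}

/* Credits (addition) and priorities: each accounting period, credits are
 * divided among vcpus in proportion to weight, and priority is recomputed. */
static void account(struct my_vcpu *vcpus[], int n)
{
    int total_weight = 0, i;

    for (i = 0; i < n; i++)
        total_weight += vcpus[i]->weight;
    if (total_weight == 0)
        return;

    for (i = 0; i < n; i++) {
        struct my_vcpu *v = vcpus[i];
        v->credits += CREDITS_PER_PERIOD * v->weight / total_weight;
        v->prio = (v->credits > 0) ? PRIO_UNDER : PRIO_OVER;
    }
}

/* Boost: a vcpu that wakes up (e.g. on I/O) and is not in OVER is
 * promoted so that latency-sensitive work runs quickly. */
static void wake(struct my_vcpu *v)
{
    if (v->prio != PRIO_OVER)
        v->prio = PRIO_BOOST;
}

/* Preemption: a waking vcpu preempts the running one only if it now has
 * strictly higher priority (a smaller enum value in this sketch). */
static int should_preempt(const struct my_vcpu *waking,
                          const struct my_vcpu *running)
{
    return waking->prio < running->prio;
}

Whether the real code matches this picture (and where it deviates) is
exactly what the design document should pin down.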
I've been intending to do this for a couple of weeks now, but I've got
some other patches I need to get cleaned up and submitted first.
Hopefully those will be finished by the end of the week. This is my
very next priority.
Once I have the "design" document, I can describe your changes in
reference to it, and we can discuss them at a design level.
I have a couple of specific comments on your patches that I'll put
inline in other e-mails.
Thank you for your work, and your patience.
-George
On Thu, Dec 18, 2008 at 2:49 AM, NISHIGUCHI Naoki
<nisiguti@xxxxxxxxxxxxxx> wrote:
> Hi all,
>
> In almost the same environment as the paper, I experimented with the credit
> scheduler (original and modified versions).
> The results are described below.
>
> Unfortunately, my previous patches did not give good results.
>
> I found some problems in my previous patches, so I revised them and
> experimented with the revised version again. With the revised patches, the
> results were good.
> In particular, please look at the results of ex7. With the revised version,
> the I/O bandwidth per guest grows correctly according to dom0's weight.
>
> I'll post the revised patches later.
>
> Thanks,
> Naoki Nishiguchi
>
> ---------- results ----------
> experimental environment:
> HP dc7800 US/CT(Core2 Duo E6550 2.33GHz)
> Multi-processor: disabled
> Xen: xen 3.3.0 release
> dom0: CentOS 5.2
>
> I ran the following experiments from the paper:
> ex3: burn x7, ping x1
> ex5: stream x7, ping x1
> ex7: stream x3, burn x3, ping x1
> ex8: stream x3, ping+burn x1, burn x3
>
> original credit scheduler
> ex3
> burn(%): 14 14 14 14 14 14 14
> ping(ms): 19.7(average) 0.1 - 359
> ex5
> stream(Mbps): 144.05 141.19 137.81 137.01 137.30 138.76 142.21
> ping(ms): 8.2(average) 7.84 - 8.63
> ex7
> stream(Mbps): 33.74 27.74 34.70
> burn(%): 28 28 28 (by guess)
> ping(ms): 238(average) 1.78 - 485
> ex7(xm sched-credit -d 0 -w 512)
> There was no change in the result.
> ex8
> stream(Mbps): 9.98 11.32 10.61
> ping+burn: ping 264.9ms(average) 20.3 - 547, burn 24%
> burn(%): 24 24 24
>
>
> modified version (previous patches)
> ex3
> burn(%): 14 14 14 14 14 14 14
> ping(ms): 0.17(average) 0.136 - 0.202
> ex5
> stream(Mbps): 143.90 141.79 137.15 138.43 138.37 130.33 143.36
> ping(ms): 7.2(average) 4.85 - 8.95
> ex7
> stream(Mbps): 2.33 2.18 1.87
> burn(%): 32 32 32 (by guess)
> ping(ms): 373.7(average) 68.0 - 589
> ex7(xm sched-credit -d 0 -w 512)
> There was no change in the result.
> ex7(xm sched-credit -d 0 -m 100 -r 20)
> stream(Mbps): 114.49 117.59 115.76
> burn(%): 24 24 24
> ping(ms): 1.2(average) 0.158 - 65.1
> ex8
> stream(Mbps): 1.31 1.09 1.92
> ping+burn: ping 387.7ms(average) 92.6 - 676, burn 24% (by guess)
> burn(%): 24 24 24 (by guess)
>
>
> revised version
> ex3
> burn(%): 14 14 14 14 14 14 14
> ping(ms): 0.18(average) 0.140 - 0.238
> ex5
> stream(Mbps): 142.57 139.03 137.50 136.77 137.61 138.95 142.63
> ping(ms): 8.2(average) 7.86 - 8.71
> ex7
> stream(Mbps): 143.63 132.13 131.77
> burn(%): 24 24 24
> ping(ms): 32.2(average) 1.73 - 173
> ex7(xm sched-credit -d 0 -w 512)
> stream(Mbps): 240.06 204.85 229.23
> burn(%): 18 18 18
> ping(ms): 7.0(average) 0.412 - 73.9
> ex7(xm sched-credit -d 0 -m 100 -r 20)
> stream(Mbps): 139.74 134.95 135.18
> burn(%): 23 23 23
> ping(ms): 15.1(average) 1.87 - 95.4
> ex8
> stream(Mbps): 118.15 106.71 116.37
> ping+burn: ping 68.8ms(average) 1.86 - 319, burn 19%
> burn(%): 19 19 19
> ----------
>
> NISHIGUCHI Naoki wrote:
>>
>> Thanks for your information.
>>
>> George Dunlap wrote:
>>>
>>> There was a paper earlier this year about scheduling and I/O performance:
>>> http://www.cs.rice.edu/CS/Architecture/docs/ongaro-vee08.pdf
>>>
>>> One of the things he noted was that if a driver domain is accepting
>>> network packets for multiple VMs, we sometimes get the following
>>> pattern:
>>> * driver domain wakes up, starts processing packets. Because it's in
>>> "over", it doesn't get boosted.
>>> * Passes a packet to VM 1, waking it up. It runs in "boost",
>>> preempting the (now lower-priority) driver domain.
>>> * Other packets (possibly even for VM 1) sit in the driver domain's
>>> queue, waiting for it to get cpu time.
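
To make that sequence concrete, here is a rough trace expressed with
the simplified model from the sketch earlier in this mail. Again, this
is illustrative only; driver_dom, vm1 and the helpers are made-up
names, not Xen structures or API.

/* Rough trace of the anomaly quoted above, using the simplified model
 * sketched earlier in this mail (illustrative only, not Xen code). */
static void anomaly_trace(struct my_vcpu *driver_dom, struct my_vcpu *vm1)
{
    /* 1. The driver domain has already spent its credits, so it is in
     *    OVER; wake() refuses to boost it, and it queues behind every
     *    UNDER/BOOST vcpu before it can process its packets. */
    driver_dom->prio = PRIO_OVER;
    wake(driver_dom);                       /* stays PRIO_OVER: no boost */

    /* 2. When it finally runs, it delivers one packet to VM 1 and wakes
     *    it.  VM 1 still has credit, so it is promoted to BOOST ...     */
    wake(vm1);                              /* becomes PRIO_BOOST        */

    /* 3. ... and BOOST outranks OVER, so VM 1 immediately preempts the
     *    driver domain.  The remaining packets (possibly more for VM 1)
     *    sit in the driver domain's queue until it gets cpu time again. */
    if (should_preempt(vm1, driver_dom)) {
        /* vm1 runs; driver_dom waits with a non-empty packet queue */
    }
}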
>>
>> I haven't read the paper yet, but I think our approach is effective against
>> this problem.
>> However, if the driver domain consumes too much cpu time, we can't prevent
>> it from dropping to "over" priority. Otherwise, we can keep it at "under"
>> or "boost" priority.
>>
>>> Their tests, for 3 networking guests and 3 cpu-intensive guests,
>>> showed a 40% degradation in performance due to this problem. While
>>> we're thinking about the scheduler, it might be worth seeing if we can
>>> solve this.
>>
>> Firstly, I'd like to read the paper.
>>
>> Regards,
>> Naoki Nishiguchi
>>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel