This is an archived copy of the Xen.org mailing list, which we have preserved to ensure that existing links to archives are not broken. The live archive, which contains the latest emails, can be found at http://lists.xen.org/


Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest do

To: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, "James (song wei)" <jsong@xxxxxxxxxx>
Subject: Re: [Xen-devel] How can Xen trigger a context switch in an HVM guest domain?
From: XiaYubin <xiayubin@xxxxxxxxx>
Date: Tue, 3 Nov 2009 09:43:28 +0800
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
James and George, thank you both! The breakpoint approach is interesting;
I hadn't even thought of it :)

OK, I'm going to verify my idea with a simpler approach first. Before
a VM in preempting-state runs, I will set a timer so that Xen regains
control every 100us (maybe longer for the first iteration). The timer
handler will check whether the preempting VM is in kernel mode or user
mode. If it is in user mode with cpuhog's CR3 loaded, it will be
scheduled out. If the number of iterations goes beyond some threshold
(say 5), the VM will also be scheduled out. This seems much simpler
than the breakpoint approach, and more accurate than a single 1ms
timer. It adds some overhead, but preemption is not supposed to occur
frequently, and fairness is more important.
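
To make the policy above concrete, here is a minimal sketch of the
timer-handler decision, written in Python rather than Xen's C. It is
only a model: VcpuSnapshot, should_deschedule and run_preempting_state
are hypothetical names, not Xen APIs, and it assumes the handler can
read the guest's privilege level and CR3 from the saved guest state. It
also treats reaching the threshold as the cut-off, which is one reading
of "beyond".

```python
from dataclasses import dataclass

TICK_US = 100   # polling period from the text (100us)
MAX_TICKS = 5   # iteration threshold from the text ("say 5")


@dataclass
class VcpuSnapshot:
    """Hypothetical view of guest state sampled by the timer handler."""
    in_user_mode: bool  # True if the guest was interrupted in user mode
    cr3: int            # guest CR3 (page-table base) at the sample point


def should_deschedule(snap, hog_cr3s, ticks_elapsed):
    """Return True if the preempting VM should be scheduled out now.

    hog_cr3s is the set of CR3 values already classified as CPU hogs;
    ticks_elapsed counts timer ticks since preempting-state began.
    """
    # Rule 1: user mode with a CPU hog's address space means the kernel
    # I/O work is over and the hog is running.
    if snap.in_user_mode and snap.cr3 in hog_cr3s:
        return True
    # Rule 2: bound the total preempting time regardless of guest state.
    return ticks_elapsed >= MAX_TICKS


def run_preempting_state(samples, hog_cr3s):
    """Replay sampled states; return microseconds until descheduling."""
    for tick, snap in enumerate(samples, start=1):
        if should_deschedule(snap, hog_cr3s, tick):
            return tick * TICK_US
    # No rule fired before the samples ran out.
    return len(samples) * TICK_US


if __name__ == "__main__":
    HOG = 0x1000  # made-up CR3 value for cpuhog
    trace = [VcpuSnapshot(False, HOG),  # kernel: handling the ACK
             VcpuSnapshot(False, HOG),  # kernel: sending next packet
             VcpuSnapshot(True, HOG)]   # back in user mode as cpuhog
    print(run_preempting_state(trace, {HOG}))  # prints 300
```

In this model the VM is descheduled at the first tick that sees cpuhog
in user mode, so cpuhog runs for at most one polling period in
preempting-state, and a guest that stays in the kernel is cut off after
MAX_TICKS * TICK_US.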

The thread problem also exists on Linux. Currently I have no good idea
for how to identify different threads from the hypervisor's
perspective. I have a dream that one day those OS guys will export
this information to the VMM, a dream that one day our children will
live in a world where virtualization rules. I have a dream today :)



On Tue, Nov 3, 2009 at 12:05 AM, George Dunlap
<George.Dunlap@xxxxxxxxxxxxx> wrote:
> OK, so you want to allow a VM to run so that it can do packet
> processing in the kernel, but once it's done in the kernel you want to
> preempt the VM again.
> An idea I was going to try out is that if a VM receives an interrupt
> (possibly only certain interrupts, like network), let it run for a
> very short amount of time (say, 1ms or 500us).  That should be enough
> for it to do its basic packet processing (or audio processing, video
> processing, whatever).  True, you're going to run the "cpu hog" during
> that time, but that will be debited against time he'll run later.  (I
> haven't tested this idea yet. It may work better with some credit
> algorithms than others.)
> The problem with inducing a guest to call schedule():
> * It may not have any other runnable processes, or it may choose the
> same process to run again; so it may not switch the cr3 anyway.
> * The only reliable way to do it without some kind of
> paravirtualization (even if only a kernel driver) would be to give it a
> timer interrupt, which may mess up other things on the system, such as
> the system time.
> If you're really keen to preempt on return to userspace, you could try
> something like the following.  Before delivering the interrupt, note
> the EIP the guest is at.  If it's in user space, set a hardware
> breakpoint at that address.  Then deliver the interrupt.  If the guest
> calls schedule(), you can catch the CR3 switch; if it returns to the
> same process, it will hit the breakpoint.
> Two possible problems:
> * For reasons of ancient history, the iret instruction may set the RF
> flag in the EFLAGS register, which will cause the breakpoint not to
> fire after the guest iret.  You may need to decode the instruction and
> set the breakpoint at the instruction after, or something like that.
> * I believe Windows doesn't do a CR3 switch if it does a *thread*
> switch.  If so, on a thread switch you'll get neither the CR3 switch
> nor the breakpoint (since the other thread is probably running
> somewhere else).
> Peace,
>  -George
> On Sun, Nov 1, 2009 at 5:54 AM, XiaYubin <xiayubin@xxxxxxxxx> wrote:
>> Hi, George,
>> Thank you for your reply. Actually, I'm looking for a generic
>> mechanism for cooperative scheduling. Being independent of the guest
>> OS would make such a mechanism more convincing and practical, just as
>> with the balloon driver.
>> Maybe you are wondering why I asked such a weird question, so let me
>> describe it in more detail. My current work is based on "Task-aware
>> VM scheduling", which was published at VEE'09. By monitoring CR3
>> changes at the VMM level, Xen can gather information about tasks' CPU
>> consumption to identify CPU hogs and I/O tasks. The task-aware
>> mechanism therefore offers a more fine-grained scheduler than the
>> original VCPU-level scheduler, as a single VCPU may run a mix of CPU
>> hogs and I/O tasks.
>> Imagine there are n VMs. One of them, named mix-VM, runs two tasks:
>> cpuhog and iotask (network). The other VMs, named CPU-VMs, run just
>> cpuhog. All VMs use PV drivers (the GPLPV drivers for Windows).
>> Here's what is supposed to happen when iotask receives a network
>> packet: the NIC raises an IRQ, which is passed to Xen; domain-0 then
>> sends an inter-domain event to mix-VM, which is likely to be in the
>> run-queue. Xen then schedules it to run immediately and sets its state
>> to preempting-state. Right after that, the mix-VM *should* schedule
>> iotask to process the incoming packet, and then schedule cpuhog after
>> processing. When CR3 changes to cpuhog's, Xen knows that the mix-VM
>> has finished I/O processing (here we assume that the priority of
>> cpuhog is usually lower than that of iotask in most OSes), and
>> schedules the mix-VM out to end its preempting-state. The mix-VM can
>> therefore preempt other VMs to process I/O ASAP, while keeping the
>> preempting time as short as possible to preserve fairness. The point
>> is: cpuhog should not run in preempting-state.
>> However, a problem arises when the mix-VM sends packets. When iotask
>> sends a large amount of data (using TCP), it blocks and waits to be
>> woken up after the guest kernel has sent all the data, which may be
>> split into thousands of TCP packets. The mix-VM will receive an ACK
>> packet every time it sends a packet, which makes it enter
>> preempting-state. Note that at this moment, the CR3 of mix-VM is
>> cpuhog's (as it is the only running process). After the guest kernel
>> processes the ACK packet and sends the next packet, it switches to
>> user mode, which means cpuhog gets to run in preempting-state. The
>> point is: as there is no CR3 change, Xen gets no chance to run.
>> One way is to add a hook at user/kernel mode switches, so that Xen
>> can catch the moment when cpuhog gets to run. However, this costs too
>> much. Another way is to force a VM to reschedule when it enters
>> preempting-state. It will then trap to Xen when CR3 changes, and Xen
>> can end its preempting-state when it schedules cpuhog to run. That's
>> why I want to trigger a guest context switch from Xen. I don't really
>> care *which* process it will switch to; I just want to give Xen a
>> chance to run. The point is: is there a better/simpler way to solve
>> this problem?
>> I hope I described the problem clearly. Would you please give more
>> details about the idea of a "reschedule event channel"? Thanks!
>> --
>> Yubin
>> On Sat, Oct 31, 2009 at 11:20 PM, George Dunlap
>> <George.Dunlap@xxxxxxxxxxxxx> wrote:
>>> Context switching is a choice the guest OS has to make, and how that's
>>> done will differ based on the operating system.  I think if you're
>>> thinking about modifying the guest scheduler, you're probably better
>>> off starting with Linux.  Even if there's a way to convince Windows to
>>> call schedule() to pick a new process, I'm not sure you'll be able to
>>> tell it *which* process to choose.
>>> As far as mechanism on Xen's side, it would be easy enough to allocate
>>> a "reschedule" event channel for the guest, such that whenever you
>>> want to trigger a guest reschedule, just raise the event channel.
>>>  -George
>>> On Sat, Oct 31, 2009 at 11:02 AM, XiaYubin <xiayubin@xxxxxxxxx> wrote:
>>>> Hi, all,
>>>> As I'm doing some research on cooperative scheduling between Xen and
>>>> guest domains, I want to know in how many ways Xen can trigger a
>>>> context switch inside an HVM guest domain (which runs Windows in my
>>>> case). Do I have to write a driver (like the balloon driver)? Or is
>>>> a user process enough? Or is there an even simpler way?
>>>> All your suggestions are appreciated. Thanks! :)
>>>> --
>>>> Yubin
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>>>> http://lists.xensource.com/xen-devel