

Re: [Xen-devel] x86's context switch ordering of operations

To: "Keir Fraser" <keir.fraser@xxxxxxxxxxxxx>
Subject: Re: [Xen-devel] x86's context switch ordering of operations
From: "Jan Beulich" <jbeulich@xxxxxxxxxx>
Date: Tue, 29 Apr 2008 14:39:16 +0100
Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
In-reply-to: <C43CD826.17FA6%keir.fraser@xxxxxxxxxxxxx>
References: <48173326.76E4.0078.0@xxxxxxxxxx> <C43CD826.17FA6%keir.fraser@xxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> Keir Fraser <keir.fraser@xxxxxxxxxxxxx> 29.04.08 14:50 >>>
>On 29/4/08 13:39, "Jan Beulich" <jbeulich@xxxxxxxxxx> wrote:
>> To do so, I was considering using {un,}map_domain_page() from
>> the context switch path, but there are two major problems with the
>> ordering of operations:
>> - for the outgoing task, 'current' is being changed before the
>> ctxt_switch_from() hook is being called
>> - for the incoming task, write_ptbase() happens only after the
>> ctxt_switch_to() hook was already called
>> I'm wondering whether there are hidden dependencies that require
>> this particular (somewhat non-natural) ordering.
>ctxt_switch_{from,to} exist only in x86 Xen and are called from a single
>hook point out from the common scheduler. Thus either they both happen
>before, or both happen after, current is changed by the common scheduler. It

Maybe I'm mistaken (or it is being done twice with no good reason), but
I see a set_current(next) in x86's context_switch() ...
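
To make the ordering in question concrete, here is a toy model of the sequence as described in the quoted text above. This is only a sketch and not Xen code: every toy_* name is hypothetical, and the step labels merely borrow the names used in this thread. It records what 'current' and the loaded page table base are at the moment each step runs.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: all toy_* names are hypothetical, none of this is Xen code. */
#define TOY_MAX_STEPS 8

struct toy_vcpu { int id; };

struct toy_trace {
    const char *step[TOY_MAX_STEPS];  /* label of each recorded step */
    int current_id[TOY_MAX_STEPS];    /* value of 'current' when the step ran */
    int ptbase_id[TOY_MAX_STEPS];     /* whose page tables were loaded then */
    int count;
};

static int toy_current;  /* stands in for 'current' */
static int toy_ptbase;   /* stands in for the loaded page table base */

static void toy_record(struct toy_trace *t, const char *name)
{
    assert(t->count < TOY_MAX_STEPS);
    t->step[t->count] = name;
    t->current_id[t->count] = toy_current;
    t->ptbase_id[t->count] = toy_ptbase;
    t->count++;
}

/* Replays the ordering described in this mail for a switch prev -> next. */
static void toy_context_switch(struct toy_trace *t,
                               const struct toy_vcpu *prev,
                               const struct toy_vcpu *next)
{
    toy_current = prev->id;
    toy_ptbase = prev->id;
    t->count = 0;

    toy_current = next->id;              /* 'current' changes first */
    toy_record(t, "set_current");
    toy_record(t, "ctxt_switch_from");   /* runs with current == next */
    toy_record(t, "ctxt_switch_to");     /* runs on prev's page tables */
    toy_ptbase = next->id;               /* page tables switch last */
    toy_record(t, "write_ptbase");
}
```

In this model the "from" step already sees the incoming vCPU as 'current', and the "to" step still runs on the outgoing vCPU's page tables. Those are the two properties that would get in the way of using {un,}map_domain_page() from either hook.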

>took a while for the scheduler interfaces to settle down to something both
>x86 and ia64 were happy with so I'm not particularly excited about revisiting
>them. I'm not sure why you'd want to map_domain_page() on context switch
>anyway. The map_domain_page() 32-bit implementation is inherently per-domain

If pages mapped that way survive context switches, then it would
certainly be possible to map them once and keep them until no longer
needed. Doing this during context switch was more of an attempt to
conserve virtual address space (so that other vCPU-s of the same guest
not using this functionality would be less likely to run out
of space). The background is that I think it will also be necessary
to extend MAX_VIRT_CPUS beyond 32 at some not too distant point
(at least in dom0, for CPU frequency management - or do you have
another scheme in mind for dealing with systems that have more than
32 CPU threads?), resulting in more pressure on the address space.
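
As a rough illustration of that pressure, the sketch below counts how many mapping slots remain for transient use when each vCPU keeps some number of pages permanently mapped. It is not Xen code: the sketch_free_slots() name and the slot counts used in the example are assumptions made purely for the arithmetic, and they say nothing about the real size of the per-domain mapping area.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only.
 * total_slots:     page-sized slots in an assumed mapping window
 * vcpus:           number of vCPU-s in the domain
 * pinned_per_vcpu: pages each vCPU keeps mapped across context switches
 * Returns the number of slots left for transient mappings (0 if exhausted).
 */
static unsigned long sketch_free_slots(unsigned long total_slots,
                                       unsigned long vcpus,
                                       unsigned long pinned_per_vcpu)
{
    unsigned long pinned = vcpus * pinned_per_vcpu;

    assert(total_slots > 0);
    return pinned >= total_slots ? 0 : total_slots - pinned;
}
```

With an assumed window of 1024 slots, 32 vCPU-s holding one page each leave 992 slots, and 128 vCPU-s leave 896. Mapping only around the context switch would instead keep the pinned count near the number of vCPU-s actually running, which was the motivation for the original idea.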

>> 2) The implementation in the hypervisor seems to have added yet another
>> scalability issue (on 32-bit), as this is being carried out using
>> map_domain_page_global() - if there are sufficiently many guests with
>> sufficiently many vCPU-s, there just won't be any space left at some
>> point. This worries me especially in the context of seeing a call to
>> sh_map_domain_page_global() that is followed by a BUG_ON() checking
>> whether the call failed.
>The hypervisor generally assumes that vcpu_info's are permanently and
>globally mapped. That obviously places an unavoidable scalability limit for
>32-bit Xen. I have no problem with telling people who are concerned about
>the limit to use 64-bit Xen instead.

I know your position here, but - have all 32-on-64 migration/save/restore
issues been resolved by now (that is, can the tools now deal with
domains of either size, no matter whether using a 32- or 64-bit dom0)? If
not, there may be reasons beyond that of needing vm86 mode that
might force people to stay with 32-bit Xen. (I certainly agree that there
are unavoidable limitations, but obviously there is a big difference
between requiring 64 bytes and 4k per vCPU for this particular

