
xen-devel

[Xen-devel] x86's context switch ordering of operations

To: <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] x86's context switch ordering of operations
From: "Jan Beulich" <jbeulich@xxxxxxxxxx>
Date: Tue, 29 Apr 2008 13:39:34 +0100
In the process of devising a reasonable mechanism to support some
advanced debugging features for pv guests (last exception record
MSRs, last branch stack MSRs after #DE, DS area), I was considering
adding another shared state area (similar to the relocated vCPU info,
but read-only to the guest and not permanently mapped). The
hypervisor could store there any relevant information that would
otherwise be destroyed before the guest is able to pick it up, as well
as state that the CPU is to use but that the guest must not be able to
modify directly. The area would be extensible to a reasonable degree to
support future hardware enhancements.

To do so, I was considering using {un,}map_domain_page() from
the context switch path, but there are two major problems with the
ordering of operations:
- for the outgoing task, 'current' is changed before the
ctxt_switch_from() hook is called
- for the incoming task, write_ptbase() happens only after the
ctxt_switch_to() hook has already been called
I'm wondering whether there are hidden dependencies that require
this particular (somewhat unnatural) ordering.
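
To make the two ordering problems concrete, here is a toy model in C. It
is not Xen source: it only records step names in a log. The only
relations taken from the description above are that 'current' changes
before ctxt_switch_from() and that write_ptbase() comes after
ctxt_switch_to(). The relative order of the two middle steps, the
"natural" variant, and all helper names (record, index_of, and so on)
are assumptions made for illustration.

```c
/* Toy model of the ordering discussed above; NOT actual Xen code. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_EVENTS 8

static const char *events[MAX_EVENTS];
static int n_events;

/* Append one step name to the log. */
void record(const char *what)
{
    assert(n_events < MAX_EVENTS);
    events[n_events++] = what;
}

void reset_log(void)
{
    n_events = 0;
}

/* Position of a step in the log, or -1 if it was never recorded. */
int index_of(const char *what)
{
    for (int i = 0; i < n_events; i++)
        if (strcmp(events[i], what) == 0)
            return i;
    return -1;
}

/* Ordering as described in the mail. */
void switch_as_described(void)
{
    record("set_current(next)");      /* 'current' already points at next */
    record("ctxt_switch_from(prev)"); /* hook runs with the wrong 'current' */
    record("ctxt_switch_to(next)");   /* hook runs on the old page tables */
    record("write_ptbase(next)");     /* next's page tables loaded last */
}

/* Hypothetical ordering a mapping-based hook would prefer. */
void switch_natural(void)
{
    record("ctxt_switch_from(prev)");
    record("set_current(next)");
    record("write_ptbase(next)");
    record("ctxt_switch_to(next)");
}

void print_log(void)
{
    for (int i = 0; i < n_events; i++)
        printf("%d: %s\n", i, events[i]);
}
```

Calling reset_log(), then one of the two switch functions, then
print_log() from a small main() shows the sequence. In the first
variant, a hook that maps a page relative to 'current' or to the active
page tables would see the wrong vCPU on the way out and the wrong
address space on the way in.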

While looking into this, I noticed two things that I'm not quite clear
on regarding VCPUOP_register_vcpu_info:

1) How does the vcpu_info_mfn stored in the hypervisor survive
migration or save/restore? The mainline Linux code, which uses this
hypercall, doesn't appear to make any attempt to revert to using the
default location during suspend or to set up the alternate location
again during resume (but of course I'm not sure that guest is
save/restore/migrate ready in the first place). I would imagine it to
be difficult, at the least, for the guest to manage its state after
resume unless the hypervisor has restored the previously established
alternative placement.

2) The implementation in the hypervisor seems to have added yet another
scalability issue (on 32-bit), as this is being carried out using
map_domain_page_global() - if there are sufficiently many guests with
sufficiently many vCPUs, there just won't be any space left at some
point. This worries me especially in the context of seeing a call to
sh_map_domain_page_global() that is followed by a BUG_ON() checking
whether the call failed.
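
The exhaustion concern can be sketched with a toy fixed-size pool in C.
This is not Xen code: GLOBAL_SLOTS is a made-up capacity (the real size
of the global mapping area is not given here), and toy_map_global(),
toy_register_vcpu_info() and toy_register_all() are hypothetical
stand-ins. The point is only that one permanent slot per vCPU runs out
once guests times vCPUs exceeds the pool, and that a caller can report
an error instead of treating failure as a bug.

```c
/* Toy model of a finite global-mapping pool; NOT actual Xen code. */
#include <assert.h>
#include <errno.h>

#define GLOBAL_SLOTS 1024u   /* hypothetical capacity, for illustration */

static unsigned int slots_used;

void toy_reset(void)
{
    slots_used = 0;
}

/* Stand-in for a global mapping: returns a slot index, or -1 if full. */
int toy_map_global(void)
{
    assert(slots_used <= GLOBAL_SLOTS);
    if (slots_used == GLOBAL_SLOTS)
        return -1;
    return (int)slots_used++;
}

/*
 * Stand-in for registering one vCPU's info page. Failure is passed
 * back to the caller as -ENOMEM rather than asserted away.
 */
int toy_register_vcpu_info(void)
{
    if (toy_map_global() < 0)
        return -ENOMEM;
    return 0;
}

/* Register every vCPU of every guest; returns how many succeeded. */
unsigned int toy_register_all(unsigned int guests,
                              unsigned int vcpus_per_guest)
{
    unsigned int ok = 0;

    for (unsigned int g = 0; g < guests; g++)
        for (unsigned int v = 0; v < vcpus_per_guest; v++) {
            if (toy_register_vcpu_info() != 0)
                return ok;   /* pool exhausted: stop, do not crash */
            ok++;
        }
    return ok;
}
```

With this made-up capacity, 8 guests of 32 vCPUs each fit, while 64
guests of 32 vCPUs each would need 2048 slots and stop at 1024.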

Jan

