xen-devel

To: "Ian Campbell" <Ian.Campbell@xxxxxxxxxx>
Subject: Re: [Xen-devel] [PATCH, v2] add privileged/unprivileged kernel feature indication
From: "Jan Beulich" <JBeulich@xxxxxxxxxx>
Date: Tue, 19 Jul 2011 11:24:04 +0100
Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Delivery-date: Tue, 19 Jul 2011 03:25:08 -0700
Envelope-to: www-data@xxxxxxxxxxxxxxxxxxx
In-reply-to: <1311067737.20648.100.camel@xxxxxxxxxxxxxxxxxxxxxx>
List-help: <mailto:xen-devel-request@lists.xensource.com?subject=help>
List-id: Xen developer discussion <xen-devel.lists.xensource.com>
List-post: <mailto:xen-devel@lists.xensource.com>
List-subscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=subscribe>
List-unsubscribe: <http://lists.xensource.com/mailman/listinfo/xen-devel>, <mailto:xen-devel-request@lists.xensource.com?subject=unsubscribe>
References: <4E2563D8020000780004E28F@xxxxxxxxxxxxxxxxxxxx> <1311067737.20648.100.camel@xxxxxxxxxxxxxxxxxxxxxx>
Sender: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx
>>> On 19.07.11 at 11:28, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
> On Tue, 2011-07-19 at 10:00 +0100, Jan Beulich wrote:
>> With our switching away from supporting 32-bit Dom0 operation, users
>> complained that attempts (perhaps due to lack of knowledge of that
>> change) to boot the no longer privileged kernel in Dom0 resulted in
>> apparently silent failure. To make the mismatch explicit and visible,
>> add feature flags that the kernel can set to indicate which modes of
>> operation it supports. For backward compatibility, absence of both
>> feature flags is taken to indicate a kernel that may be capable of
>> operating in both modes.
>> 
>> v2: Due to the way elf_xen_parse_features() worked up to now (getting
>> fixed here), adding feature indications to the old, string-based ELF
>> note would make the respective kernel unusable on older hypervisors.
> 
> What was the failure mode? Can we not fix it (with suitable backport
> recommendations) rather than adding a new duplicated interface?

Adding a supported feature Xen doesn't understand leads to a
"cannot load Dom0 kernel" failure without any indication of what was
actually wrong with the kernel.

The fix is trivial (this patch's change to elf_xen_parse_features()),
but expecting everyone to backport this to (perhaps very) old
hypervisors didn't seem realistic to me.
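
A minimal sketch of the tolerant parsing described above, not the
patch's actual code. It assumes a '|'-separated feature string in which
a leading '!' marks a feature the kernel requires; the feature table,
the bit assignment by table index, and the name parse_features() are
hypothetical and only serve the illustration. Unknown supported
features are skipped, while an unknown required feature still fails.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical feature table for illustration; bit N is table index N. */
static const char *const known_features[] = {
    "writable_page_tables",
    "auto_translated_physmap",
    "supervisor_mode_kernel",
};
#define NR_KNOWN (sizeof(known_features) / sizeof(known_features[0]))

/*
 * Parse a '|'-separated feature string (assumed syntax: a leading '!'
 * marks a feature the kernel requires).
 * Unknown supported features are ignored; an unknown required feature
 * makes the function return -1. Returns 0 otherwise.
 */
int parse_features(const char *s, unsigned int *supported,
                   unsigned int *required)
{
    while (*s) {
        size_t len = strcspn(s, "|");          /* length of this token */
        int req = (len > 0 && s[0] == '!');    /* required-feature marker */
        const char *name = s + req;
        size_t nlen = len - req;
        size_t i;

        for (i = 0; i < NR_KNOWN; i++) {
            if (strlen(known_features[i]) == nlen &&
                memcmp(known_features[i], name, nlen) == 0) {
                *supported |= 1u << i;
                if (req)
                    *required |= 1u << i;
                break;
            }
        }

        /* Only an unknown *required* feature is fatal. */
        if (i == NR_KNOWN && req)
            return -1;

        s += len;
        if (*s == '|')
            s++;
    }
    return 0;
}
```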

>> For that reason, a new ELF Note is being introduced that allows
>> specifying supported features as a bit array instead (with features
>> unknown to the hypervisor simply ignored, as now also done by
>> elf_xen_parse_features(), whereas here unknown kernel-required
>> features still keep the kernel [and hence VM] from booting).
>  
>> +    case XEN_ELFNOTE_SUPPORTED_FEATURES:
>> +        for ( i = 0; i < XENFEAT_NR_SUBMAPS; ++i )
> 
> There needs to be some negotiation of what the kernel thought
> XENFEAT_NR_SUBMAPS was or else if/when we have enough features to bump
> the number we risk running off the end of the array on older kernels.

No. The kernel simply specifies its (approximate) notion through the
note's data size. Xen just reads as much as it understands and (as
written in the description) ignores the rest.
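
The size-based scheme can be sketched roughly as follows. This is an
illustration under assumptions, not the patch's code: NR_SUBMAPS stands
in for the hypervisor's XENFEAT_NR_SUBMAPS, read_supported() is a
hypothetical name, and the note payload is assumed to be an array of
32-bit words in host byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the number of submaps this hypervisor knows about. */
#define NR_SUBMAPS 2

/*
 * Merge the kernel's supported-feature words into f_supported[].
 * desc/descsz describe the note payload. Words beyond NR_SUBMAPS are
 * ignored; words the kernel did not supply are treated as zero.
 */
void read_supported(const void *desc, size_t descsz,
                    uint32_t f_supported[NR_SUBMAPS])
{
    size_t avail = descsz / sizeof(uint32_t);  /* words the kernel wrote */
    size_t i;

    for (i = 0; i < NR_SUBMAPS; i++) {
        uint32_t word = 0;

        if (i < avail)
            memcpy(&word, (const unsigned char *)desc + i * sizeof(word),
                   sizeof(word));
        f_supported[i] |= word;
    }
}
```

Neither side needs to know the other's submap count: a newer kernel's
extra words are never read, and an older kernel's shorter note simply
leaves the remaining submaps empty.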

>> +        {
>> +            parms->f_supported[i] |= val;
> 
> val is a uint64_t so we don't support more than two submaps, which is ok
> for now but the elf note needs to include a way to grow beyond that in a
> forward and backward compatible way (lest we grow a third interface for
> this in the future!).

Hmm, indeed - we'd have to improve elf_note_numeric() to be
forward compatible.

> Notes have a length field and we support 1, 2 and 4 byte numerical notes
> but here I think we need to add support for arbitrary length arrays of
> numerical values.

Easiest would seem to be to have the caller (optionally) specify a unit
size and index. What do you think?
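
One possible shape for such an accessor, sketched under assumptions:
the name note_numeric_array() and its signature are hypothetical, the
payload is taken to be in host byte order, and out-of-range elements
read as zero so that older, shorter notes remain valid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: treat the note payload as an array of
 * `unit`-byte values and return element `idx`.
 * Returns 0 if the element lies beyond the payload or if `unit` is
 * not one of 1, 2, 4 or 8.
 */
uint64_t note_numeric_array(const void *desc, size_t descsz,
                            unsigned int unit, unsigned int idx)
{
    const unsigned char *p = desc;
    uint16_t v16;
    uint32_t v32;
    uint64_t v64;

    /* Element idx must lie entirely within the payload. */
    if (unit == 0 || (size_t)idx >= descsz / unit)
        return 0;

    p += (size_t)idx * unit;

    switch (unit) {
    case 1:
        return *p;
    case 2:
        memcpy(&v16, p, sizeof(v16));
        return v16;
    case 4:
        memcpy(&v32, p, sizeof(v32));
        return v32;
    case 8:
        memcpy(&v64, p, sizeof(v64));
        return v64;
    default:
        return 0;
    }
}
```

A caller reading feature submaps would then loop over idx with a unit
of 4, which removes the two-submap limit imposed by a single uint64_t.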

>>  /* x86: pirq can be used by HVM guests */
>> -#define XENFEAT_hvm_pirqs           10
>> +#define XENFEAT_hvm_pirqs                 10
>> +
>> +/* privileged operation is supported */
>> +#define XENFEAT_privileged                11
>> +
>> +/* un-privileged operation is supported */
>> +#define XENFEAT_unprivileged              12
> 
> This still strikes me as odd because unprivileged is a subset of
> privileged (I understand the backwards compatibility argument for having
> it this way though). Really XENFEAT_unprivileged is the
> "XENFEAT_privileged feature bit is supported" meta-feature flag.

No, I don't view it that way - in the Linux ports, the meaning of
the respective config options is such that privileged includes
unprivileged, but that's a guest OS decision, not one the interface
should dictate.

The only implication made here is that the (otherwise meaningless)
absence of both flags gets taken as if both flags were set.
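
Expressed as code, that rule might look like the sketch below. The two
bit numbers come from the quoted patch; kernel_can_boot() and its
parameters are hypothetical, and only feature submap 0 is considered.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit numbers as proposed in the quoted patch. */
#define XENFEAT_privileged   11
#define XENFEAT_unprivileged 12

/*
 * Decide whether a kernel advertising `supported` (feature submap 0)
 * may be started in the requested mode. A kernel that sets neither
 * flag predates them and is treated as if it had set both.
 */
bool kernel_can_boot(uint32_t supported, bool as_dom0)
{
    const uint32_t priv   = UINT32_C(1) << XENFEAT_privileged;
    const uint32_t unpriv = UINT32_C(1) << XENFEAT_unprivileged;

    if (!(supported & (priv | unpriv)))
        return true;               /* legacy kernel: assume both modes */

    return (supported & (as_dom0 ? priv : unpriv)) != 0;
}
```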

> Perhaps this should be a separate elf note? If present then it is 0 or 1
> to indicate support for running privileged and if absent we assume it is
> supported. This would also remove the need for:
> 
>> @@ -278,7 +278,8 @@ DO(xen_version)(int cmd, XEN_GUEST_HANDL
>>          switch ( fi.submap_idx )
>>          {
>>          case 0:
>> -            fi.submap = 0;
>> +            fi.submap = 1U << (IS_PRIV(current->domain) ?
>> +                               XENFEAT_privileged : XENFEAT_unprivileged);
> 
> Which is information already exposed to the guest via start_info.

Yeah, I just wanted this for completeness. If it's deemed bad to do so,
it could simply be dropped.

> Do we really mean "privileged" here, or do we mean "dom0"? I know the
> two are tied together today but will this be the case as we disaggregate
> more and more? Does this flag really mean "can drive APICs and run ACPI
> code etc"? That is distinct from the ability to drive hardware
> generally.

No, this is really only meaningful as a Dom0-capability indication in
today's sense. Splitting this will probably render the whole
privileged/unprivileged distinction bogus, and hence kernels supporting
this would probably be required to just always set both flags.

If the feature naming is a problem, we could certainly rename them
to "dom0" and "domU", to distinguish the two ways of getting
loaded.

Jan
