Re: [Xen-API] Comments on Xen API document

On Tue, Aug 15, 2006 at 03:21:45AM +0100, John Levon wrote:

> 
> I've only recently joined this mailing list, so I'm sure some of my questions
> here are old hat - in which case apologies. Same goes for ones that are a
> result of my limited knowledge about XML RPC.
> 
> These comments are based on version 0.4 from the wiki page.
> 
> Generally - there's no clear versioning. In particular a description of how
> to find out what version of this API the host expects, and what it will
> return.  Ideally it would degrade to the version that the client reports as
> being able to handle. The API docs should in the future clearly label what
> revision things were added in. One thing that might be useful is clearly
> specifying stability of each item. This nicely allows for experimental
> features, whilst labelling truly stable entries clearly.
>
> I think you mention it, but all calls need to clearly define the failure
> modes and what gets returned on failure.

Yes, both versioning and failure modes and codes are on the todo list -- we
definitely intend to specify this stuff properly before release.

> It also wasn't clear to me what happens if I try to set several parameters
> in a single XML RPC call (presuming that's possible?). If the second one
> fails, what do I get back in terms of the error struct?

No, you can't do this.  In general, we want to keep the calls "atomic", in
that the call either succeeds or it fails and changes no state.  The policy on
handling failures in the middle of a set of operations is something that we
want to push to the client.

> 1.1
> 
> The add_to_* primitives for sets should probably return the equivalent of
> EEXIST, not just silently succeed: it's always better to provide more
> information to a client. The operation may be a logic error.

OK, that's reasonable.  I'll put that in.

> 1.3.1
> 
> ErrorDescription isn't specified well enough to program against.

What do you mean by this?  We still haven't specified the actual error codes,
their meaning, and the parameters that the code implies, but other than that I
thought it was well enough specified.  Could you be more specific?

> 1.4.2
> 
> What happens to the session if a connection is terminated or idles out?

This isn't in the document, but I think the behaviour we'll go for is that
sessions can time out, and any call could potentially return a "NoSuchSession"
error, and so the client must be prepared to authenticate again.

Note that session objects are independent of the underlying transport, so
there's no reason why a TCP timeout should necessarily cause a session
timeout.

> 1.4.3
> 
> The persistent tasks per user behaviour should be called out here explicitly,
> rather than the comment later on in the document.

OK.  The task stuff is still a bit up in the air, and could do with more work,
certainly.

> The specification implies that a failed task must have its Progress set to
> "100". This seems odd.

Odd maybe, but not harmful, I think.  The intention is that you continue to
poll until you see 100, and then parse the result for success or failure codes
etc -- it's not intended that 100 indicates success.

> More widely, has there been consideration of indicating
> /phases/ of operation? Often this is more useful than a generic slider value.

No, no-one's considered that, as far as I know.  What would you expect here?
Would we end up with a whole new set of codes to indicate task progress, on
top of the error codes?

> 1.5
> 
> I don't understand the expanded XML for the do_login_with_password() - the
> paramters are not name/value pairs. If this is a general design, how are
> optional parameters to be handled?

Optional parameters aren't handled at all, deliberately, as they add
complexity that I don't think we need, and would make life difficult for
bindings to languages that don't have them (like C).

Any call that takes a struct could in the future have optional entries, since
those are defined using name-value pairs, but again, we've not allowed that in
the specification yet.

> "getname_label" is a result, presumably, of the odd "name/label" naming of
> the parameter. This mapping is not defined in the document, and it's a
> little odd that we have "getname_label" but "get_uuid".

Yes, that's a typo -- it ought to read get_name_label -- and we certainly need
to define that mapping properly.

> Probably an XML RPC thing, but there's no discussions of transactions in the
> document. Can I change a batch of things and assure it's done atomically?

No -- we want the client to make rollback / error handling decisions -- Xend
simply isn't the right place for that kind of policy.  Atomicity is impossible
to guarantee for a lot of the parameters in any case, because Xen doesn't
offer a transactional interface either.

> 1.7
> 
> Do zombie domains need to be explicitly called out in the lifecycle?

Hmmm, maybe.  I'll have a think.  Zombies are more related to the _domain_
rather than the _VM_ -- you could restart the VM even with a zombie lying
around, there are just some additional resources being consumed -- so I don't
think that they really belong in that lifecycle.

> 2.3.3
> 
> These enums would be easier to read if placed by the calls that used them.

OK, I'll see if I can rejig it.  There might be places where enums are used by
more than one class.

> enum vm_power_state - "power" is a very odd notion for a VM. Why not just
> "vm_state" ?

I'm not wedded to the name -- I just wanted to avoid having a completely
generic name like "state".  Perhaps another term would be better?

> enum on_crash_behaviour - it's likely useful to have a host-wide setting for
> whether domains should core dump. How would that interact?

We could core dump if either setting said so, I guess, but there's no per-host
setting for this in the API at the moment.  Isn't this something that we could
punt to the higher-level tools?

> enum boot_type - it's not clear to me how a client would make use of knowing
> that it's "kernel_internal"

Well the user needs to know, because they need to make sure that they don't
delete the kernel from the guest's rootfs in that case!

> 2.6.1
> 
> is_a_template - "is_template" seems more natural

OK.

> memory/* needs units

Yes, they do, certainly.

> VCPUs/params needs a detailed description of what it can/does contain

It's meant to be a free-form string, interpreted by whatever scheduler is
running.

> VCPUS/utilisation - a percentage presumably?

Um, load average I think, like top, but I'll check with our scheduler expert
and see if we can get more interesting stats out.

> bios/cdrom - I'm not sure what this does...

Yes, we've dropped that one already.

> kernel/kernel et al - depending on bootloader, is this the copied kernel path
> on dom0 filesystem, or the path inside the domU filesystem? e.g. pygrub,
> domuloader

Yes, that's right.

> tools_version - needs defining explicitly?

We'll certainly need to define some permanent entries, but we wanted this to
be open for expansion later too.  Obviously we'll guarantee that there will be
a "Xend" entry, but we wanted for other version numbers to be able to slot in
here too, such as the version of PV drivers inside guests.

> There's no mapping from VM vcpus to the host_cpu's it's pinned on.

Pinning is something that would get encoded into VCPUs/params, when the
scheduler supports such functionality.

> 2.6.2
> 
> What's the interaction with the on shutdown behaviour for the shutdown RPCs?

The specified after_shutdown action takes place once the shutdown has
completed, regardless of whether you use clean_shutdown or hard_shutdown.

> suspend/resume - I'm a little confused about where the domain save file
> actually goes, and how 'xm' will specify it

It doesn't "go" anywhere defined -- that's an internal issue for Xend to deal
with.  We're moving away from the model where you can save and restore to
files and move them around willy-nilly -- people have a tendency to corrupt
their filesystems by starting the same save file twice, for example.

If you actually want the save file, you need some kind of "export" call, which
we haven't yet defined.

> 2.7.1
> 
> What is name/label intended to be for a host?

The tools could initially set it from something obvious like the machine's DNS
name, but it's meant to be human-readable and settable -- it's for the admin
to specify, ultimately.

> It's not actually entirely clear to me what a host /is/.

One host == one Xen == one Xend.  Does that clear it up?

> What does "list" do? Note that early in the document you refer to a
> non-existent ListAllVMs RPC.

Yes, the doc needs fixing so that it's consistent.  I think we've dropped
listAllVMs in favour of Host.get_resident_vms, which returns a set of
references to the VMs available on that host (i.e. a list of VM UUIDs).

> 'disable' and 'enable' seem perhaps a bit too generic name-wise.

What would you prefer?

> 2.11
> 
> I'm not clear on what an SR actually is and its relationship to its contained
> VDIs. Could the document clarify this somewhat?

Sure, we could clarify this.  An SR is a storage segment, like a partition, or
an LVM volume, or a LUN on a SAN.  I'll get our storage guys to specify this a
little better than I can manage!  With blocktap or loopback files, you can
have multiple VDIs within an SR, one backing file per VDI, hence the
containment relationship.

> type,location - what are its possible values?

That's still up in the air, I think.  That's something else that needs to go
on the todo.

> 2.12.1
> 
> "shareable" just wins in a googlefight against "sharable"

Ah, googlefight, the OED for the 21st Century!  We Brits (the pedantic ones,
at least) tend not to pay much attention to googlefights, but in this case I
prefer it with the 'e' anyway ;-)  I'll change that.

> 2.12.2
> 
> Why is a snapshot defined as living in the same SR?

So that copy-on-write implementations of snapshotting can work.  If you want a
snapshot in a separate SR, you need to explicitly copy it, using the call that
I've forgotten to put in ;-)

> 2.13.1
> 
> I don't think vbd_mode or driver_type are specified.

They're enums -- perhaps they dropped off your copy, but in my more recent one
they're definitely there.  vbd_mode is RO or RW, and driver_type is ioemu or
paravirtualised.

Thanks for your feedback, it's been very useful,

Ewan.

_______________________________________________
xen-api mailing list
xen-api@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/cgi-bin/mailman/listinfo/xen-api
WARNING - OLD ARCHIVES

xen-api

Re: [Xen-API] Comments on Xen API document