On Tue, 21 Sep 2010, Ian Jackson wrote:
> This document describes the vbd device numbering and naming. I've
> posted versions of it before. It should be in docs/misc, so here is a
> patch to add it.
>
> This is currently an RFC because the section near the bottom about the
> behaviour of Linux guests needs to be checked for accuracy. In
> particular, it would be good for Stefano or Jeremy to confirm the
> behaviour of current pvops kernels.
>
> Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Stefano Stabellini <Stefano.Stabellini@xxxxxxxxxxxxx>
> Cc: Ian Campbell <Ian.Campbell@xxxxxxxxxxxxx>
> Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
>
> diff -r 77a3da957017 docs/misc/block-numbering-naming.txt
> --- /dev/null Thu Jan 01 00:00:00 1970 +0000
> +++ b/docs/misc/block-numbering-naming.txt Tue Sep 21 16:20:53 2010 +0100
> @@ -0,0 +1,124 @@
> +Xen guest interface
> +-------------------
> +
> +A Xen guest can be provided with block devices. These are always
> +provided as Xen VBDs; for HVM guests they may also be provided as
> +emulated IDE or SCSI disks.
> +
> +The abstract interface involves specifying, for each block device:
> +
> + * Nominal disk type: Xen virtual disk (aka xvd*, the default); SCSI
> + (sd*); IDE (hd*).
> +
> + For HVM guests, each whole-disk hd* and sd* device is made
> + available _both_ via emulated IDE resp. SCSI controller, _and_ as a
> + Xen VBD. The HVM guest is entitled to assume that the IDE or SCSI
> + disks available via the emulated IDE controller target the same
> + underlying devices as the corresponding Xen VBD (ie, multipath).
> +
> + For PV guests every device is made available to the guest only as a
> + Xen VBD. For these domains the type is advisory, for use by the
> + guest's device naming scheme.
> +
> + The Xen interface does not specify what name a device should have
> + in the guest (nor what major/minor device number it should have in
> + the guest, if the guest has such a concept).
> +
It should be made clear that, for HVM guests, specifying xvd* in the VM
config file means the user is requesting a PV-only disk with no
corresponding emulated disk.
As a consequence, using only xvd* disks in an HVM config file is a
mistake: grub (or any other bootloader) would have no emulated disk to
boot the OS from.
> + * Disk number, which is a nonnegative integer,
> + conventionally starting at 0 for the first disk.
> +
> + * Partition number, which is a nonnegative integer where by
> + convention partition 0 indicates the "whole disk".
> +
> + Normally for any disk _either_ partition 0 should be supplied in
> + which case the guest is expected to treat it as they would a native
> + whole disk (for example by putting or expecting a partition table
> + or disk label on it);
> +
> + _Or_ only non-0 partitions should be supplied in which case the
> + guest should expect storage management to be done by the host and
> + treat each vbd as it would a partition or slice or LVM volume (for
> + example by putting or expecting a filesystem on it).
> +
> + Non-whole disk devices cannot be passed through to HVM guests via
> + the emulated IDE or SCSI controllers.
> +
> +
> +Configuration file syntax
> +-------------------------
> +
> +The config file syntaxes are, for example
> +
> + d0 d0p0 xvda Xen virtual disk 0 partition 0 (whole disk)
> + d1p2 xvda2 Xen virtual disk 1 partition 2
Shouldn't this be xvdb2? Disk 1 is the second disk, so it should be xvdb.
> + d536p37 xvdtq37 Xen virtual disk 536 partition 37
> + sdb3 SCSI disk 1 partition 3
> + hdc2 IDE disk 2 partition 2
> +
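As a cross-check of the examples above, here is a rough Python sketch of
how such a name could be split into (type, disk, partition). The helper
name parse_vdev and the regular expressions are illustrative only, not
code from xend or libxl. It assumes the letter suffix counts a=0 ... z=25,
aa=26 and so on, which is consistent with xvdtq being disk 536.

```python
import re

def parse_vdev(name):
    """Hypothetical helper: split a vdev name into (type, disk, partition)."""
    # d<disk>[p<partition>] form: always a Xen virtual disk.
    m = re.fullmatch(r"d(\d+)(?:p(\d+))?", name)
    if m:
        return ("xvd", int(m.group(1)), int(m.group(2) or 0))

    # xvd*/sd*/hd* form: letters give the disk, trailing digits the partition.
    m = re.fullmatch(r"(xvd|sd|hd)([a-z]+)(\d*)", name)
    if not m:
        raise ValueError("unrecognised vdev name: %r" % name)
    kind, letters, part = m.groups()

    # Letters are read as a bijective base-26 number (a=1 ... z=26),
    # then shifted down by one so that "a" is disk 0.
    n = 0
    for ch in letters:
        n = n * 26 + (ord(ch) - ord("a") + 1)
    return (kind, n - 1, int(part or 0))
```

With this reading, parse_vdev("d1p2") and parse_vdev("xvdb2") give the
same result, while "xvda2" is disk 0 partition 2.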
> +The d*p* syntax is not supported by xm/xend.
> +
> +To cope with guests which predate this scheme we therefore preserve
> +the existing facility to specify the xenstore numerical value directly
> +by putting a single number (hex, decimal or octal) in the domain
> +config file instead of the disk identifier.
> +
> +
> +Concrete encoding in the VBD interface (in xenstore)
> +----------------------------------------------------
> +
> +The information above is encoded in the concrete interface as an
> +integer (in a canonical decimal format in xenstore), whose value
> +encodes the information above as follows:
> +
> + 1 << 28 | disk << 8 | partition xvd, disks or partitions 16 onwards
> + 202 << 8 | disk << 4 | partition xvd, disks and partitions up to 15
> + 8 << 8 | disk << 4 | partition sd, disks and partitions up to 15
> + 3 << 8 | disk << 6 | partition hd, disks 0..1, partitions 0..63
> + 22 << 8 | (disk-2) << 6 | partition hd, disks 2..3, partitions 0..63
> + 2 << 28 onwards reserved for future use
> + other values less than 1 << 28 deprecated / reserved
> +
> +The 1<<28 format handles disks up to (1<<20)-1 and partitions up to
> +255. It will be used only where the 202<<8 format does not have
> +enough bits.
> +
> +Guests MAY support any subset of the formats above except that if they
> +support 1<<28 they MUST also support 202<<8. PV-on-HVM drivers MUST
> +support at least one of 3<<8 or 8<<8; 3<<8 is recommended.
> +
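To make the table above concrete, a minimal Python sketch of the encoding
follows. The function name encode_vbd is made up for illustration; it
simply applies the formulas from the table and prefers the 202<<8 format
for xvd whenever the disk and partition numbers fit, as the text
describes.

```python
def encode_vbd(kind, disk, partition):
    """Illustrative only: compute the xenstore integer for a vbd."""
    if disk < 0 or partition < 0:
        raise ValueError("disk and partition must be nonnegative")

    if kind == "xvd":
        if disk < 16 and partition < 16:
            return (202 << 8) | (disk << 4) | partition
        # Extended format, used only when 202<<8 lacks the bits.
        if disk < (1 << 20) and partition < 256:
            return (1 << 28) | (disk << 8) | partition
    elif kind == "sd":
        if disk < 16 and partition < 16:
            return (8 << 8) | (disk << 4) | partition
    elif kind == "hd":
        if partition < 64:
            if disk < 2:
                return (3 << 8) | (disk << 6) | partition
            if disk < 4:
                return (22 << 8) | ((disk - 2) << 6) | partition

    raise ValueError("no non-deprecated encoding for %s disk %d partition %d"
                     % (kind, disk, partition))
```

For example, encode_vbd("xvd", 0, 0) is 51712 (202 << 8), and
encode_vbd("xvd", 536, 37) needs the extended format and gives
(1 << 28) | (536 << 8) | 37.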
> +Some software has provided essentially Linux-specific encodings for
> +SCSI disks beyond disk 15 partition 15, and IDE disks beyond disk 3
> +partition 63. These vbds, and the corresponding encoded integers, are
> +deprecated.
> +
> +Guests SHOULD ignore numbers that they do not understand or
> +recognise. They SHOULD check supplied numbers for validity.
> +
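The guest side could look roughly like the Python sketch below. The name
decode_vbd is hypothetical, and returning None is just one way of
"ignoring" a number; the sketch accepts only the five formats listed in
the table and treats everything else (reserved or deprecated values) as
not understood.

```python
def decode_vbd(value):
    """Illustrative only: return (type, disk, partition), or None if the
    number is not one of the supported formats."""
    if value < 0:
        return None

    top = value >> 28
    if top == 1:
        # Extended xvd format: 20 bits of disk, 8 bits of partition.
        return ("xvd", (value >> 8) & 0xFFFFF, value & 0xFF)
    if top != 0:
        return None  # 2 << 28 onwards is reserved for future use

    major, low = value >> 8, value & 0xFF
    if major == 202:
        return ("xvd", low >> 4, low & 0xF)
    if major == 8:
        return ("sd", low >> 4, low & 0xF)
    if major == 3:
        return ("hd", low >> 6, low & 0x3F)
    if major == 22:
        return ("hd", (low >> 6) + 2, low & 0x3F)
    return None  # other values below 1 << 28: deprecated / reserved
```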
> +
> +Notes on Linux as a guest
> +-------------------------
> +
> +Very old Linux guests (PV and PV-on-HVM) are able to "steal" the
> +device numbers and names normally used by the IDE and SCSI
> +controllers, so that writing "hda1" in the config file results in
> +/dev/hda1 in the guest. These systems interpret the xenstore integer
> +as
> + major << 8 | minor
> +where major and minor are the Linux-specific device numbers. Some old
> +configurations may depend on deprecated high-numbered SCSI and IDE
> +disks. This does not work in recent versions of Linux.
> +
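For reference, the old interpretation amounts to the split shown in this
small Python sketch (the function name is made up). It works for the
non-deprecated hd and sd encodings because those were chosen to coincide
with the native Linux major/minor numbers.

```python
def legacy_major_minor(value):
    """Split a xenstore integer the way very old Linux guests do."""
    return value >> 8, value & 0xFF

# "hda1" is encoded as 3 << 8 | 0 << 6 | 1, i.e. Linux major 3, minor 1.
hda1 = legacy_major_minor((3 << 8) | 1)

# "hdb" (disk 1) is 3 << 8 | 1 << 6, i.e. major 3, minor 64.
hdb = legacy_major_minor((3 << 8) | (1 << 6))

# "sdb3" is 8 << 8 | 1 << 4 | 3, i.e. major 8, minor 19.
sdb3 = legacy_major_minor((8 << 8) | (1 << 4) | 3)
```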
> +So for Linux PV guests, users are recommended to supply xvd* devices
> +only. Modern PV drivers will map these to identically-named devices
> +in the guest.
> +
> +For Linux HVM guests using PV-on-HVM drivers, users are recommended to
> +supply as few hd* devices as possible and use pure xvd* devices for
> +the rest. Modern PV-on-HVM drivers will map the hd* devices to
> +/dev/xvdHDa etc.
> +
Modern PV-on-HVM drivers will map the hd* devices to /dev/xvd* etc., so
"hda1" in the config file results in /dev/xvda1 in the guest.
> +Some Linux HVM guests with broken PV-on-HVM drivers do not cope
> +properly if both hda and hdc are supplied, nor with both hda and xvda,
> +because they map the bottom 8 bits of the xenstore integer
> +directly to the Linux guest's device number and throw away the rest;
> +they can crash due to minor number clashes.
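The clash is easy to see from the encodings. In the Python sketch below,
broken_minor is a made-up name for what the paragraph above says the
broken drivers do, namely keep only the bottom 8 bits:

```python
# Whole-disk encodings taken from the table in the document.
HDA = 3 << 8      # hd disk 0, partition 0
HDC = 22 << 8     # hd disk 2, partition 0
XVDA = 202 << 8   # xvd disk 0, partition 0

def broken_minor(value):
    """Keep only the bottom 8 bits and discard the rest."""
    return value & 0xFF

# All three devices end up with the same minor number.
clash = {name: broken_minor(v)
         for name, v in (("hda", HDA), ("hdc", HDC), ("xvda", XVDA))}
```

All three whole disks collapse to minor 0, which is why supplying hda
together with hdc or xvda trips these drivers up.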
>
> --
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel