# HG changeset patch
# User Robb Romans <3r@xxxxxxxxxx>
# Node ID 8e4bcebfca9a944a3dfe9392813da995162a9b28
# Parent 23470c8ea113a3a55edb9f6d98c16467bdf10c43
Separate file for docs/src/user/domain_filesystem.tex

Signed-off-by: Robb Romans <3r@xxxxxxxxxx>

diff -r 23470c8ea113 -r 8e4bcebfca9a docs/src/user.tex
--- a/docs/src/user.tex Mon Sep 19 20:26:38 2005
+++ b/docs/src/user.tex Mon Sep 19 20:57:21 2005
@@ -73,255 +73,9 @@
 %% Chapter Domain Management Tools moved to domain_mgmt.tex
 \include{src/user/domain_mgmt}
-
-\chapter{Domain Filesystem Storage}
-
-It is possible to directly export any Linux block device in dom0 to
-another domain, or to export filesystems / devices to virtual machines
-using standard network protocols (e.g. NBD, iSCSI, NFS, etc). This
-chapter covers some of the possibilities.
-
-
-\section{Exporting Physical Devices as VBDs}
-\label{s:exporting-physical-devices-as-vbds}
-
-One of the simplest configurations is to directly export
-individual partitions from domain 0 to other domains. To
-achieve this use the \path{phy:} specifier in your domain
-configuration file. For example a line like
-\begin{quote}
-\verb_disk = ['phy:hda3,sda1,w']_
-\end{quote}
-specifies that the partition \path{/dev/hda3} in domain 0
-should be exported read-write to the new domain as \path{/dev/sda1};
-one could equally well export it as \path{/dev/hda} or
-\path{/dev/sdb5} should one wish.
-
-In addition to local disks and partitions, it is possible to export
-any device that Linux considers to be ``a disk'' in the same manner.
-For example, if you have iSCSI disks or GNBD volumes imported into
-domain 0 you can export these to other domains using the \path{phy:}
-disk syntax. E.g.:
-\begin{quote}
-\verb_disk = ['phy:vg/lvm1,sda2,w']_
-\end{quote}
-
-
-
-\begin{center}
-\framebox{\bf Warning: Block device sharing}
-\end{center}
-\begin{quote}
-Block devices should typically only be shared between domains in a
-read-only fashion otherwise the Linux kernel's file systems will get
-very confused as the file system structure may change underneath them
-(having the same ext3 partition mounted rw twice is a sure fire way to
-cause irreparable damage)! \Xend will attempt to prevent you from
-doing this by checking that the device is not mounted read-write in
-domain 0, and hasn't already been exported read-write to another
-domain.
-If you want read-write sharing, export the directory to other domains
-via NFS from domain0 (or use a cluster file system such as GFS or
-ocfs2).
-
-\end{quote}
-
-
-\section{Using File-backed VBDs}
-
-It is also possible to use a file in Domain 0 as the primary storage
-for a virtual machine. As well as being convenient, this also has the
-advantage that the virtual block device will be {\em sparse} --- space
-will only really be allocated as parts of the file are used. So if a
-virtual machine uses only half of its disk space then the file really
-takes up half of the size allocated.
-
-For example, to create a 2GB sparse file-backed virtual block device
-(actually only consumes 1KB of disk):
-\begin{quote}
-\verb_# dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1_
-\end{quote}
-
-Make a file system in the disk file:
-\begin{quote}
-\verb_# mkfs -t ext3 vm1disk_
-\end{quote}
-
-(when the tool asks for confirmation, answer `y')
-
-Populate the file system e.g. by copying from the current root:
-\begin{quote}
-\begin{verbatim}
-# mount -o loop vm1disk /mnt
-# cp -ax /{root,dev,var,etc,usr,bin,sbin,lib} /mnt
-# mkdir /mnt/{proc,sys,home,tmp}
-\end{verbatim}
-\end{quote}
-
-Tailor the file system by editing \path{/etc/fstab},
-\path{/etc/hostname}, etc (don't forget to edit the files in the
-mounted file system, instead of your domain 0 filesystem, e.g. you
-would edit \path{/mnt/etc/fstab} instead of \path{/etc/fstab} ). For
-this example put \path{/dev/sda1} to root in fstab.
-
-Now unmount (this is important!):
-\begin{quote}
-\verb_# umount /mnt_
-\end{quote}
-
-In the configuration file set:
-\begin{quote}
-\verb_disk = ['file:/full/path/to/vm1disk,sda1,w']_
-\end{quote}
-
-As the virtual machine writes to its `disk', the sparse file will be
-filled in and consume more space up to the original 2GB.
-
-{\bf Note that file-backed VBDs may not be appropriate for backing
-I/O-intensive domains.} File-backed VBDs are known to experience
-substantial slowdowns under heavy I/O workloads, due to the I/O handling
-by the loopback block device used to support file-backed VBDs in dom0.
-Better I/O performance can be achieved by using either LVM-backed VBDs
-(Section~\ref{s:using-lvm-backed-vbds}) or physical devices as VBDs
-(Section~\ref{s:exporting-physical-devices-as-vbds}).
-
-Linux supports a maximum of eight file-backed VBDs across all domains by
-default. This limit can be statically increased by using the {\em
-max\_loop} module parameter if CONFIG\_BLK\_DEV\_LOOP is compiled as a
-module in the dom0 kernel, or by using the {\em max\_loop=n} boot option
-if CONFIG\_BLK\_DEV\_LOOP is compiled directly into the dom0 kernel.
-
-
-\section{Using LVM-backed VBDs}
-\label{s:using-lvm-backed-vbds}
-
-A particularly appealing solution is to use LVM volumes
-as backing for domain file-systems since this allows dynamic
-growing/shrinking of volumes as well as snapshot and other
-features.
-
-To initialise a partition to support LVM volumes:
-\begin{quote}
-\begin{verbatim}
-# pvcreate /dev/sda10
-\end{verbatim}
-\end{quote}
-
-Create a volume group named `vg' on the physical partition:
-\begin{quote}
-\begin{verbatim}
-# vgcreate vg /dev/sda10
-\end{verbatim}
-\end{quote}
-
-Create a logical volume of size 4GB named `myvmdisk1':
-\begin{quote}
-\begin{verbatim}
-# lvcreate -L4096M -n myvmdisk1 vg
-\end{verbatim}
-\end{quote}
-
-You should now see that you have a \path{/dev/vg/myvmdisk1}
-Make a filesystem, mount it and populate it, e.g.:
-\begin{quote}
-\begin{verbatim}
-# mkfs -t ext3 /dev/vg/myvmdisk1
-# mount /dev/vg/myvmdisk1 /mnt
-# cp -ax / /mnt
-# umount /mnt
-\end{verbatim}
-\end{quote}
-
-Now configure your VM with the following disk configuration:
-\begin{quote}
-\begin{verbatim}
- disk = [ 'phy:vg/myvmdisk1,sda1,w' ]
-\end{verbatim}
-\end{quote}
-
-LVM enables you to grow the size of logical volumes, but you'll need
-to resize the corresponding file system to make use of the new
-space. Some file systems (e.g. ext3) now support on-line resize. See
-the LVM manuals for more details.
-
-You can also use LVM for creating copy-on-write clones of LVM
-volumes (known as writable persistent snapshots in LVM
-terminology). This facility is new in Linux 2.6.8, so isn't as
-stable as one might hope. In particular, using lots of CoW LVM
-disks consumes a lot of dom0 memory, and error conditions such as
-running out of disk space are not handled well. Hopefully this
-will improve in future.
-
-To create two copy-on-write clone of the above file system you
-would use the following commands:
-
-\begin{quote}
-\begin{verbatim}
-# lvcreate -s -L1024M -n myclonedisk1 /dev/vg/myvmdisk1
-# lvcreate -s -L1024M -n myclonedisk2 /dev/vg/myvmdisk1
-\end{verbatim}
-\end{quote}
-
-Each of these can grow to have 1GB of differences from the master
-volume. You can grow the amount of space for storing the
-differences using the lvextend command, e.g.:
-\begin{quote}
-\begin{verbatim}
-# lvextend +100M /dev/vg/myclonedisk1
-\end{verbatim}
-\end{quote}
-
-Don't let the `differences volume' ever fill up otherwise LVM gets
-rather confused. It may be possible to automate the growing
-process by using \path{dmsetup wait} to spot the volume getting full
-and then issue an \path{lvextend}.
-
-In principle, it is possible to continue writing to the volume
-that has been cloned (the changes will not be visible to the
-clones), but we wouldn't recommend this: have the cloned volume
-as a `pristine' file system install that isn't mounted directly
-by any of the virtual machines.
-
-
-\section{Using NFS Root}
-
-First, populate a root filesystem in a directory on the server
-machine. This can be on a distinct physical machine, or simply
-run within a virtual machine on the same node.
-
-Now configure the NFS server to export this filesystem over the
-network by adding a line to \path{/etc/exports}, for instance:
-
-\begin{quote}
-\begin{small}
-\begin{verbatim}
-/export/vm1root 1.2.3.4/24 (rw,sync,no_root_squash)
-\end{verbatim}
-\end{small}
-\end{quote}
-
-Finally, configure the domain to use NFS root. In addition to the
-normal variables, you should make sure to set the following values in
-the domain's configuration file:
-
-\begin{quote}
-\begin{small}
-\begin{verbatim}
-root = '/dev/nfs'
-nfs_server = '2.3.4.5' # substitute IP address of server
-nfs_root = '/path/to/root' # path to root FS on the server
-\end{verbatim}
-\end{small}
-\end{quote}
-
-The domain will need network access at boot time, so either statically
-configure an IP address (Using the config variables \path{ip},
-\path{netmask}, \path{gateway}, \path{hostname}) or enable DHCP (
-\path{dhcp='dhcp'}).
-
-Note that the Linux NFS root implementation is known to have stability
-problems under high load (this is not a Xen-specific problem), so this
-configuration may not be appropriate for critical servers.
+%% Chapter Domain Filesystem Storage moved to domain_filesystem.tex
+\include{src/user/domain_filesystem}
+
 \part{User Reference Documentation}

diff -r 23470c8ea113 -r 8e4bcebfca9a docs/src/user/domain_filesystem.tex
--- /dev/null Mon Sep 19 20:26:38 2005
+++ b/docs/src/user/domain_filesystem.tex Mon Sep 19 20:57:21 2005
@@ -0,0 +1,243 @@
+\chapter{Domain Filesystem Storage}
+
+It is possible to directly export any Linux block device in dom0 to
+another domain, or to export filesystems or devices to virtual
+machines using standard network protocols (e.g.\ NBD, iSCSI, NFS,
+etc.). This chapter covers some of the possibilities.
+
+
+\section{Exporting Physical Devices as VBDs}
+\label{s:exporting-physical-devices-as-vbds}
+
+One of the simplest configurations is to directly export individual
+partitions from domain~0 to other domains. To achieve this, use the
+\path{phy:} specifier in your domain configuration file. For
+example, a line like
+\begin{quote}
+  \verb_disk = ['phy:hda3,sda1,w']_
+\end{quote}
+specifies that the partition \path{/dev/hda3} in domain~0 should be
+exported read-write to the new domain as \path{/dev/sda1}; one could
+equally well export it as \path{/dev/hda} or \path{/dev/sdb5} should
+one wish.
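+
+The \path{disk} option is a list (domain configuration files are
+Python scripts), so a domain may be given several block devices at
+once. As an illustrative sketch (the device names here are
+hypothetical), the following exports one partition read-write and a
+second volume read-only, using the \path{r} access mode:
+\begin{quote}
+  \verb_disk = ['phy:hda3,sda1,w', 'phy:vg/lvm1,sdb1,r']_
+\end{quote}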
+
+In addition to local disks and partitions, it is possible to export
+any device that Linux considers to be ``a disk'' in the same manner.
+For example, if you have iSCSI disks or GNBD volumes imported into
+domain~0 you can export these to other domains using the \path{phy:}
+disk syntax. E.g.:
+\begin{quote}
+  \verb_disk = ['phy:vg/lvm1,sda2,w']_
+\end{quote}
+
+\begin{center}
+  \framebox{\bf Warning: Block device sharing}
+\end{center}
+\begin{quote}
+  Block devices should typically only be shared between domains in a
+  read-only fashion, otherwise the Linux kernel's file systems will
+  get very confused as the file system structure may change
+  underneath them (having the same ext3 partition mounted \path{rw}
+  twice is a sure-fire way to cause irreparable damage)! \Xend\ will
+  attempt to prevent you from doing this by checking that the device
+  is not mounted read-write in domain~0, and hasn't already been
+  exported read-write to another domain. If you want read-write
+  sharing, export the directory to other domains via NFS from
+  domain~0 (or use a cluster file system such as GFS or ocfs2).
+\end{quote}
+
+
+\section{Using File-backed VBDs}
+
+It is also possible to use a file in Domain~0 as the primary storage
+for a virtual machine. As well as being convenient, this also has the
+advantage that the virtual block device will be \emph{sparse} ---
+space will only really be allocated as parts of the file are used. So
+if a virtual machine uses only half of its disk space then the file
+really takes up half of the size allocated.
+
+For example, to create a 2GB sparse file-backed virtual block device
+(which actually consumes only about 1KB of disk):
+\begin{quote}
+  \verb_# dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1_
+\end{quote}
+
+Make a file system in the disk file:
+\begin{quote}
+  \verb_# mkfs -t ext3 vm1disk_
+\end{quote}
+
+(When the tool asks for confirmation, answer `y'.)
+
+Populate the file system, e.g.\ by copying from the current root:
+\begin{quote}
+\begin{verbatim}
+# mount -o loop vm1disk /mnt
+# cp -ax /{root,dev,var,etc,usr,bin,sbin,lib} /mnt
+# mkdir /mnt/{proc,sys,home,tmp}
+\end{verbatim}
+\end{quote}
+
+Tailor the file system by editing \path{/etc/fstab},
+\path{/etc/hostname}, and so on. Don't forget to edit the files in
+the mounted file system rather than those in your domain~0
+filesystem; e.g.\ you would edit \path{/mnt/etc/fstab} instead of
+\path{/etc/fstab}. For this example, specify \path{/dev/sda1} as the
+root device in fstab.
+
+Now unmount (this is important!):
+\begin{quote}
+  \verb_# umount /mnt_
+\end{quote}
+
+In the configuration file set:
+\begin{quote}
+  \verb_disk = ['file:/full/path/to/vm1disk,sda1,w']_
+\end{quote}
+
+As the virtual machine writes to its `disk', the sparse file will be
+filled in and consume more space, up to the original 2GB.
+
+{\bf Note that file-backed VBDs may not be appropriate for backing
+  I/O-intensive domains.} File-backed VBDs are known to experience
+substantial slowdowns under heavy I/O workloads, due to the I/O
+handling by the loopback block device used to support file-backed VBDs
+in dom0. Better I/O performance can be achieved by using either
+LVM-backed VBDs (Section~\ref{s:using-lvm-backed-vbds}) or physical
+devices as VBDs (Section~\ref{s:exporting-physical-devices-as-vbds}).
+
+Linux supports a maximum of eight file-backed VBDs across all domains
+by default. This limit can be statically increased by using the
+\emph{max\_loop} module parameter if CONFIG\_BLK\_DEV\_LOOP is
+compiled as a module in the dom0 kernel, or by using the
+\emph{max\_loop=n} boot option if CONFIG\_BLK\_DEV\_LOOP is compiled
+directly into the dom0 kernel.
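+
+For example, to allow up to 64 loop devices when the loop driver is
+built as a module (a sketch, assuming no file-backed VBDs are
+currently attached, since the module cannot be unloaded while its
+devices are in use):
+\begin{quote}
+\begin{verbatim}
+# rmmod loop
+# modprobe loop max_loop=64
+\end{verbatim}
+\end{quote}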
+
+
+\section{Using LVM-backed VBDs}
+\label{s:using-lvm-backed-vbds}
+
+A particularly appealing solution is to use LVM volumes as backing
+for domain file-systems, since this allows dynamic growing/shrinking
+of volumes as well as snapshots and other features.
+
+To initialize a partition to support LVM volumes:
+\begin{quote}
+\begin{verbatim}
+# pvcreate /dev/sda10
+\end{verbatim}
+\end{quote}
+
+Create a volume group named `vg' on the physical partition:
+\begin{quote}
+\begin{verbatim}
+# vgcreate vg /dev/sda10
+\end{verbatim}
+\end{quote}
+
+Create a logical volume of size 4GB named `myvmdisk1':
+\begin{quote}
+\begin{verbatim}
+# lvcreate -L4096M -n myvmdisk1 vg
+\end{verbatim}
+\end{quote}
+
+You should now see that you have \path{/dev/vg/myvmdisk1}. Make a
+filesystem, mount it and populate it, e.g.:
+\begin{quote}
+\begin{verbatim}
+# mkfs -t ext3 /dev/vg/myvmdisk1
+# mount /dev/vg/myvmdisk1 /mnt
+# cp -ax / /mnt
+# umount /mnt
+\end{verbatim}
+\end{quote}
+
+Now configure your VM with the following disk configuration:
+\begin{quote}
+\begin{verbatim}
+ disk = [ 'phy:vg/myvmdisk1,sda1,w' ]
+\end{verbatim}
+\end{quote}
+
+LVM enables you to grow the size of logical volumes, but you'll need
+to resize the corresponding file system to make use of the new space.
+Some file systems (e.g.\ ext3) now support online resize. See the LVM
+manuals for more details.
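+
+For example, to add 1GB to \path{/dev/vg/myvmdisk1} and then grow the
+ext3 file system to match (a sketch; depending on your kernel and
+e2fsprogs versions the online-resize tool may be \path{ext2online} or
+\path{resize2fs}, and older setups may require the file system to be
+unmounted first):
+\begin{quote}
+\begin{verbatim}
+# lvextend -L+1G /dev/vg/myvmdisk1
+# resize2fs /dev/vg/myvmdisk1
+\end{verbatim}
+\end{quote}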
+
+You can also use LVM for creating copy-on-write (CoW) clones of LVM
+volumes (known as writable persistent snapshots in LVM terminology).
+This facility is new in Linux 2.6.8, so isn't as stable as one might
+hope. In particular, using lots of CoW LVM disks consumes a lot of
+dom0 memory, and error conditions such as running out of disk space
+are not handled well. Hopefully this will improve in future.
+
+To create two copy-on-write clones of the above file system you would
+use the following commands:
+
+\begin{quote}
+\begin{verbatim}
+# lvcreate -s -L1024M -n myclonedisk1 /dev/vg/myvmdisk1
+# lvcreate -s -L1024M -n myclonedisk2 /dev/vg/myvmdisk1
+\end{verbatim}
+\end{quote}
+
+Each of these can grow to have 1GB of differences from the master
+volume. You can grow the amount of space for storing the differences
+using the \path{lvextend} command, e.g.:
+\begin{quote}
+\begin{verbatim}
+# lvextend -L+100M /dev/vg/myclonedisk1
+\end{verbatim}
+\end{quote}
+
+Don't ever let the `differences volume' fill up, otherwise LVM gets
+rather confused. It may be possible to automate the growing process
+by using \path{dmsetup wait} to spot the volume getting full and then
+issue an \path{lvextend}.
+
+In principle, it is possible to continue writing to the volume that
+has been cloned (the changes will not be visible to the clones), but
+we wouldn't recommend this: have the cloned volume as a `pristine'
+file system install that isn't mounted directly by any of the virtual
+machines.
+
+
+\section{Using NFS Root}
+
+First, populate a root filesystem in a directory on the server
+machine. This can be on a distinct physical machine, or simply run
+within a virtual machine on the same node.
+
+Now configure the NFS server to export this filesystem over the
+network by adding a line to \path{/etc/exports}, for instance:
+
+\begin{quote}
+  \begin{small}
+\begin{verbatim}
+/export/vm1root 1.2.3.4/24(rw,sync,no_root_squash)
+\end{verbatim}
+  \end{small}
+\end{quote}
+
+Finally, configure the domain to use NFS root. In addition to the
+normal variables, you should make sure to set the following values in
+the domain's configuration file:
+
+\begin{quote}
+  \begin{small}
+\begin{verbatim}
+root = '/dev/nfs'
+nfs_server = '2.3.4.5'       # substitute IP address of server
+nfs_root = '/path/to/root'   # path to root FS on the server
+\end{verbatim}
+  \end{small}
+\end{quote}
+
+The domain will need network access at boot time, so either statically
+configure an IP address using the config variables \path{ip},
+\path{netmask}, \path{gateway} and \path{hostname}, or enable DHCP
+(\path{dhcp='dhcp'}).
+
+Note that the Linux NFS root implementation is known to have stability
+problems under high load (this is not a Xen-specific problem), so this
+configuration may not be appropriate for critical servers.
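+
+After editing \path{/etc/exports} as above, remember to tell the NFS
+server to re-read it and verify what is being exported (a sketch
+using the standard nfs-utils tools):
+\begin{quote}
+\begin{verbatim}
+# exportfs -ra
+# showmount -e localhost
+\end{verbatim}
+\end{quote}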