Chapter 3: Provisioning domUs

From PrgmrWiki
Revision as of 05:09, 19 June 2010 by Nyck (talk | contribs) (Converting VMware Disk Images)
You can suck Linux right out of the air, as it were, by downloading the right files and
putting them in the right places, but there probably are not more than a few hundred
people in the world who could create a functioning Linux system in that way.
—Neal Stephenson, In the Beginning Was the Command Line

Up until now, we’ve focused on administering the dom0, leaving the specifics of domU creation up to the virt-install tool. However, you’ll probably need to build a domU image from scratch on occasion. There are plenty of good reasons for this—perhaps you want an absolutely minimal Linux environment to use as a base for virtual private server (VPS) hosting setups. Maybe you’re deploying some custom application—a server appliance— using Xen. It might just seem like a good way to keep systems patched. Possibly you need to create Xen instances without the benefit of a network connection.

Just as there are many reasons to want custom filesystem images, there are many ways to make the images. We’ll give detailed instructions for some that we use frequently, and briefly mention some others, but it would be impossible to provide an exhaustive list (and very boring besides). The goal of this chapter is to give you an idea of the range of options you have in provisioning domU filesystems, a working knowledge of the principles, and just enough step-by-step instruction to get familiar with the processes.

A Basic DomU Configuration

All of the examples that we’re presenting here should work with a basic—in fact, downright skeletal—domU config file. Something along the lines of this should work:

kernel = /boot/vmlinuz-2.6-xen.gz
vif = ['']
disk = ['phy:/dev/targetvg/lv,sda,w']

This specifies a kernel, a network interface, and a disk, and lets Xen use defaults for everything else. Tailor the variables, such as volume group and kernel name, to your site. As we mention elsewhere, we recommend including other variables, such as a MAC and IP address, but we’ll omit them during this chapter for clarity so we can focus on creating domU images.

NOTE

This doesn’t include a ramdisk. Either add a ramdisk= line or include xenblk (and

xennet if you plan on accessing the network before modules are available) in your kernel. When we compile our own kernels, we usually include the xenblk and xennet drivers directly in the kernel. We only use a ramdisk to satisfy the requirements of the distro

kernels.

If you’re using a modular kernel, which is very likely, you’ll also need to ensure that the kernel has a matching set of modules that it can load from the domU filesystem. If you’re booting the domU using the same kernel as the dom0, you can copy over the modules like this (if the domU image is mounted on /mnt):

# mkdir -p /mnt/lib/modules
# cp -a /lib/modules/`uname -r` /mnt

Note that this command only works if the domU kernel is the same as the dom0 kernel! Some install procedures will install the correct modules automatically; others won’t. No matter how you create the domU, remember that modules need to be accessible from the domU, even if the kernel lives in the dom0. If you have trouble, make sure that the kernel and module versions match, either by booting from a different kernel or copying in different modules.

Selecting a Kernel

Traditionally, one boots a domU image using a kernel stored in the dom0 filesystem, as in the sample config file in the last section. In this case, it’s common to use the same kernel for domUs and the dom0. However, this can lead to trouble—one distro’s kernels may be too specialized to work properly with another distro. We recommend either using the proper distro kernel, copying it into the dom0 filesystem so the domain builder can find it, or compiling your own generic kernel.

Another possible choice is to download Xen’s binary distribution, which includes precompiled domU kernels, and extracting an appropriate domU kernel from that.

Alternatively (and this is the option that we usually use when dealing with distros that ship Xen-aware kernels), you can bypass the entire problem of kernel selection and use PyGRUB to boot the distro’s own kernel from within the domU filesystem. For more details on PyGRUB, see Chapter 7. PyGRUB also makes it more intuitive to match modules to kernels by keeping both the domU kernel and its corresponding modules in the domU.

Quick-and-Dirty Install via tar

Let’s start by considering the most basic install method possible, just to get an idea of the principles involved. We’ll generate a root filesystem by copying files out of the dom0 (or an entirely separate physical machine) and into the domU. This approach copies out a filesystem known to work, requires no special tools, and is easy to debug. However, it’s also likely to pollute the domU with a lot of unnecessary stuff from the source system and is kind of a lot of work.

A good set of commands for this “cowboy” approach might be:

# xm block-attach 0 duncan.img /dev/xvda1 w 0
# mke2fs -j /dev/xvda1
# mount /dev/xvda1 /mnt
# cd /
# tar -c -f - --exclude /home --exclude /mnt --exclude /tmp --exclude \
  /proc --exclude /sys --exclude /var | ( cd /mnt/ ; tar xf - )
# mkdir /mnt/sys
# mkdir /mnt/proc
NOTE

Do all this as root.

These commands, in order, map the backing file to a virtual device in the dom0, create a filesystem on that device, mount the filesystem, and tar up the dom0 root directory while omitting /home, /mnt, /tmp, /proc, /sys, and /var. The output from this tar command then goes to a complementary tar used to extract the file in /mnt. Finally, we make some directories that the domU will need after it boots. At the end of this process, we have a self-contained domU in duncan.img.

Why This Is Not the Best Idea

The biggest problem with the cowboy approach, apart from its basic inelegance, is that it copies a lot of unnecessary stuff with no easy way to clear it out. When the domU is booted, you could use the package manager to remove things or just delete files by hand. But that’s work, and we are all about avoiding work.

Stuff to Watch Out For

There are some things to note:

  • You must mkdir /sys and /proc or else things will not work properly.
The issue here is that the Linux startup process uses /sys and /proc to discover and configure hardware—if, say, /proc/mounts doesn’t exist, the boot scripts will become extremely annoyed.
  • You may need to mknod /dev/xvda b 220 0.
/dev/xvd is the standard name for Xen virtual disks, by analogy with the hd and sd device nodes. The first virtual disk is /dev/xvda, which can be partitioned into /dev/xvda1, and so on. The command
# /mknod /dev/xvda b 220 0
creates the node /dev/xvda as a block device (b) with major number 220 (the number reserved for Xen VBDs) and minor number 0 (because it’s xvda—the first such device in the system).
NOTE

On most modern Linux systems, udev makes this unnecessary.

  • You may need to edit /etc/inittab and /etc/securettys so that /dev/xvc0 works as the console and has a proper getty.
We’ve noticed this problem only with Red Hat’s kernels: for regular XenSource kernels (at least through 3.1) the default getty on tty0 should work without further action on your part. If it doesn’t, read on!
The term console is something of a holdover from the days of giant time-sharing machines, when the system operator sat at a dedicated terminal called the system console. Nowadays, the console is a device that receives system administration messages—usually a graphics device, sometimes a serial console.
In the Xen case, all output goes to the Xen virtual console, xvc0. The xm console command attaches to this device with help from xenconsoled. To log in to it, Xen’s virtual console must be added to /etc/inittab so that init knows to attach a getty.1 Do this by adding a line like the following:
xvc:2345:respawn:/sbin/agetty -L xvc0
(As with all examples in books, don’t take this construction too literally! If you have a differently named getty binary, for example, you will definitely want to use that instead.)
You might also, depending on your policy regarding root logins, want to add /dev/xvc0 to /etc/securetty so that root will be able to log in on it. Simply append a line containing the device name, xvc0, to the file.

Using the Package Management System with an Alternate Root

Another way to obtain a domU image would be to just run the setup program for your distro of choice and instruct it to install to the mounted domU root. The disadvantage here is that most setup programs expect to be installed on a real machine, and they become surly and uncooperative when forced to deal with paravirtualization.

Nonetheless, this is a viable process for most installers, including both RPM and Debian-based distros. We’ll describe installation using both Red Hat’s and Debian’s tools.

Red Hat, CentOS, and Other RPM-Based Distros

On Red Hat–derived systems, we treat this as a package installation, rather than a system installation. Thus, rather than using anaconda, the system installer, we use yum, which has an installation mode suitable for this sort of thing. First, it’s easiest to make sure that SELinux is disabled or nonenforcing because its extended permissions and policies don’t work well with the installer.2 The quickest way to do this is to issue echo 0 >/selinux/enforce. A more permanent solution would be to boot with selinux=0 on the kernel command line.

NOTE: Specify kernel parameters as a space-separated list on the “module” line that loads the Linux kernel—either in /boot/grub/menu.lst or by pushing e at the GRUB menu.

When that’s done, mount your target domU image somewhere appropriate. Here we create the logical volume malcom in the volume group scotland and mount it on /mnt:

# lvcreate -L 4096 -n malcom scotland
# mount /dev/scotland/malcom /mnt/

Create some vital directories, just as in the tar example:

# cd /mnt
# mkdir proc sys etc

Make a basic fstab (you can just copy the one from dom0 and edit the root device as appropriate—with the sample config file mentioned earlier, you would use /dev/sda):

# cp /etc/fstab /mnt/etc
# vi /mnt/etc/fstab

Fix modprobe.conf, so that the kernel knows where to find its device drivers. (This step isn’t technically necessary, but it enables yum upgrade to properly build a new initrd when the kernel changes—handy if you’re using PyGRUB.)

# echo "alias scsi_hostadapter xenblk\nalias eth0 xennet" > /mnt/etc/modprobe.conf

At this point you’ll need an RPM that describes the software release version and creates the yum configuration files—we installed CentOS 5, so we used centos-release-5-2.el5.centos.i386.rpm.

# wget http://mirrors.prgmr.com/os/centos/5/os/i386/CentOS/centos-release-5.el5.centos.i386.rpm
# rpm -ivh --nodeps --root /mnt centos-release-5-2.el5.centos.i386.rpm

In this case we just used the first mirror that we could find. You may want to look at a list of CentOS mirrors and pick a more suitable one. Next we install yum under the new install tree. If we don’t do this before installing other packages, yum will complain about transaction errors:

# yum --installroot=/mnt -y install yum

Now that the directory has been appropriately populated, we can use yum to finish the install.

# yum --installroot=/mnt -y groupinstall Base

And that’s really all there is to it. Create a domU config file as normal.

Debootstrap with Debian and Ubuntu

Debootstrap is quite a bit easier. Create a target for the install (using LVM or a flat file), mount it, and then use debootstrap to install a base system into that directory. For example, to install Debian Etch on an x68_64 machine:

# mount /dev/scotland/banquo /mnt
# debootstrap --include=ssh,udev,linux-image-xen-amd64 etch /mnt http://mirrors.easynews.com/
linux/debian

Note the --include= option. Because Xen’s networking requires the hotplug system, the domU must include a working install of udev with its support scripts. (We’ve also included SSH, just for convenience and to demonstrate the syntax for multiple items.) If you are on an i386 platform, add libc6-xen to the include list. Finally, to ensure that we have a compatible kernel and module set, we add a suitable kernel to the include= list. We use linux-imagexen- amd64. Pick an appropriate kernel for your hardware.

If you want to use PyGRUB, create /mnt/etc/modules before you run debootstrap, and put in that file:

xennet
xenblk

Also, create a /mnt/boot/grub/menu.lst file as for a physical machine. If you’re not planning to use PyGRUB, make sure that an appropriate Debian kernel and ramdisk are accessible from the dom0, or make sure that modules matching your planned kernel are available within the domU. In this case, we’ll copy the sdom0 kernel modules into the domU.

# cp -a /lib/modules/<domU kernel version> /mnt/lib/modules

When that’s done, copy over /etc/fstab to the new system, editing it if necessary:

# cp /etc/fstab /mnt/etc

Renaming Network Devices

Debian, like many systems, uses udev to tie eth0 and eth1 to consistent physical devices. It does this by assigning the device name (ethX) based on the MAC address of the Ethernet device. It will do this during debootstrap—this means that it ties eth0 to the MAC of the box you are running debootstrap on. In turn, the domU’s Ethernet interface, which presumably has a different MAC address, will become eth1.3 You can avoid this by removing /mnt/etc/udev/rules.d/z25_persistent-net.rules, which contains the stored mappings between MAC addresses and device names. That file will be recreated next time you reboot. If you only have one interface, it might make sense to remove the file that generates it, /mnt/etc/udev/rules.d/z45_persistent-net-generator.rules.

# rm /mnt/etc/udev/rules.d/z25_persistent-net.rules

Finally, unmount the install root. Your system should then essentially work. You may want to change the hostname and edit /etc/inittab within the domU’s filesystem, but these are purely optional steps.

# umount /mnt

Test the new install by creating a config file as previously described (say, /etc/xen/banquo) and issuing:

# xm create -c /etc/xen/banquo

QEMU Install

Our favorite way to create the domU image—the way that most closely simulates a real machine—is probably to install using QEMU and then take the installed filesystem and use that as your domU root filesystem. This allows you, the installer, to leverage your years of experience installing Linux. Because it’s installing in a virtual machine as strongly partitioned as Xen’s, the install program is very unlikely to do anything surprising and even more unlikely to interact badly with the existing system. QEMU also works equally well with all distros and even non-Linux operating systems.

QEMU does have the disadvantage of being slow. Because KQEMU (the kernel acceleration module) isn’t compatible with Xen, you’ll have to fall back to software-only full emulation. Of course, you can use this purely for an initial image-creation step and then copy the pristine disk images around as needed, in which case the speed penalty becomes less important.

QEMU’S RELATION TO XEN

    You may already have noted that QEMU gets mentioned fairly often in connection
with Xen. There’s a good reason for this: The two projects complement each other.
Although QEMU is a pure, or classic, full emulator, there’s some overlap in QEMU’s
and Xen’s requirements. For example, Xen can use QCOW images for its disk
emulation, and it uses QEMU fully virtualized drivers when running in hardware
virtualization mode. QEMU also furnishes some code for the hardware virtualization
built into the Linux kernel, KVM (kernel virtual machine)* and win4lin, on the theory
that there’s no benefit in reinventing the wheel.
    Xen and QEMU aren’t the same, but there’s a general consensus that they
complement each other well, with Xen more suited to high-performance production
environments, and QEMU is aimed more at exact emulation. Xen’s and QEMU’s
developers have begun sharing patches and working together. They’re distinct
projects, but Xen developers have acknowledged that QEMU “played a critical
role in Xen’s success.”†

* Although we don’t cover KVM extensively, it’s another interesting virtualization technology.
More information is available at the KVM web page, http://kvm.sf.net/.
† Liguori, Anthony, “Merging QEMU-DM upstream,” http://www.xen.org/files/xensummit_4/
Liguori_XenSummit_Spring_2007.pdf.

This technique works by running QEMU as a pure emulator for the duration of the install, using emulated devices. Begin by getting and installing QEMU. Then run:

# qemu -hda /dev/scotland/macbeth -cdrom slackware-11.0-install-dvd.iso -boot d

This command runs QEMU with the target device—a logical volume in this case—as its hard drive and the install medium as its virtual CD drive. (The Slackware ISO here, as always, is just an example—install whatever you like.) The -boot d option tells QEMU to boot from the emulated CD drive.

Now install to the virtual machine as usual. At the end, you should have a completely functional domU image. Of course, you’re still going to have to create an appropriate domU config file and handle the other necessary configuration from the dom0 side, but all of that is reasonably easy to automate.

One last caveat that bears repeating because it applies to many of these install methods: If the domU kernel isn’t Xen-aware, then you will have to either use a kernel from the dom0 or mount the domU and replace its kernel.

virt-install—Red Hat’s One-Step DomU Installer

Red Hat opted to support a generic virtualization concept rather than a specific technology. Their approach is to wrap the virtualization in an abstraction layer, libvirt. Red Hat then provides support software that uses this library to take the place of the virtualization package-specific control software.4 (For information on the management end of libvirt, virt-manager, see Chapter 6.)

For example, Red Hat includes virsh, a command-line interface that controls virtual machines. xm and virsh do much the same thing, using very similar commands. The advantage of virsh and libvirt, however, is that the virsh interface will remain consistent if you decide to switch to another virtualization technology. Right now, for example, it can control QEMU and KVM in addition to Xen using a consistent set of commands.

The installation component of this system is virt-install. Like virsh, it builds on libvirt, which provides a platform-independent wrapper around different virtualization packages. No matter which virtualization backend you’re using, virt-install works by providing an environment for the standard network install method: First it asks the user for configuration information, then it writes an appropriate config file, makes a virtual machine, loads a kernel from the install medium, and finally bootstraps a network install using the standard Red Hat installer, anaconda. At this point anaconda takes over, and installation proceeds as normal.

Unfortunately, this means that virt-install only works with networkaccessible Red Hat–style directory trees. (Other distros don’t have the install layout that the installer expects.) If you’re planning to standardize on Red Hat, CentOS, or Fedora, this is okay. Otherwise, it could be a serious problem.

Although virt-install is usually called from within Red Hat’s virt-manager GUI, it’s also an independent executable that you can use manually in an interactive or scripted mode. Here’s a sample virt-install session, with our inputs in bold.

# /usr/sbin/virt-install

Would you like a fully virtualized guest (yes or no)? This will allow you to
run unmodified operating systems. no

What is the name of your virtual machine? donalbain

How much RAM should be allocated (in megabytes)? 512

What would you like to use as the disk (path)? /mnt/donalbain.img

How large would you like the disk (/mnt/donalbain.img) to be (in gigabytes)? 4

Would you like to enable graphics support? (yes or no) no

What is the install location?

ftp://mirrors.easynews.com/linux/centos/4/os/i386/

Most of these inputs are self-explanatory. Note that the install location can be ftp://, http://, nfs:, or an SSH-style path (user@host:/path). All of these can be local if necessary—a local FTP or local HTTP server, for example, is a perfectly valid source. Graphics support indicates whether to use the virtual framebuffer—it tweaks the vfb= line in the config file.

Here’s the config file generated from that input:

name = "donalbain"
memory = "512"
disk = ['tap:aio:/mnt/donalbain.img,xvda,w', ]
vif = [ 'mac=00:16:3e:4b:af:c2, bridge=xenbr0', ]
uuid = "162910c8-2a0c-0333-2349-049e8e32ba90"
bootloader = "/usr/bin/pygrub"
vcpus = 1
on_reboot = 'restart'
on_crash = 'restart'

There are some niceties about virt-install’s config file that we’d like to mention. First, note that virt-install accesses the disk image using the tap driver for improved performance. (For more details on the tap driver, see Chapter 4.)

It also exports the disk as xvda to the guest operating system, rather than as a SCSI or IDE device. The generated config file also includes a randomly generated MAC for each vif, using the 00:16:3e prefix assigned to Xen. Finally, the image boots using PyGRUB, rather than specifying a kernel within the config file.

Converting VMware Disk Images

One of the great things about virtualization is that it allows people to distribute virtual appliances—complete, ready-to-run, preconfigured OS images. VMware has been pushing most strongly in that direction, but with a little work, it’s possible to use VMware’s prebuilt virtual machines with Xen.

PYGRUB, PYPXEBOOT, AND FRIENDS

    The principle behind PyGRUB, pypxeboot, and similar programs is that they allow
Xen’s domain builder to load a kernel that isn’t directly accessible from the dom0
filesystem. This, in turn, improves Xen’s simulation of a real machine. For example,
an automated provisioning tool that uses PXE can provision Xen domains without
modification. This becomes especially important in the context of domU images
because it allows the image to be a self-contained package—plop a generic config
file on top, and it’s ready to go.
    Both PyGRUB and pypxeboot take the place of an analogous utility for physical
machines: GRUB and PXEboot, respectively. Both are emulations written in Python,
specialized to work with Xen. Both acquire the kernel from a place where the
ordinary loader would be unable to find it. And both can help you, the hapless Xen
administrator, in your day-to-day life.
    For more notes on setting up PyGRUB, see Chapter 7. For more on pypxeboot,
see “Installing pypxeboot” on page 38.

Other virtualization providers, by and large, use disk formats that do more than Xen’s—for example, they include configuration or provide snapshots. Xen’s approach is to leave that sort of feature to standard tools in the dom0. Because Xen uses open formats and standard tools whenever possible, its disk images are simply . . . filesystems.5

Thus, the biggest part of converting a virtual appliance to work with Xen is in converting over the disk image. Fortunately, qemu-img supports most of the image formats you’re likely to encounter, including VMware’s .vmdk, or Virtual Machine Disk format.

The conversion process is pretty easy. First, get a VMware image to play with. There are some good ones at http://www.vmware.com/appliances/directory/.

Next, take the image and use qemu-img to convert it to a QCOW or raw image:

# qemu-img convert foo.vmdk -o qcow hecate.qcow

This command duplicates the contents of foo.vmdk in a QCOW image (hence the -o qcow, for output format) called hecate.qcow. (QCOW, by the way, is a disk image format that originates with the QEMU emulator. It supports AES encryption and transparent decompression. It’s also supported by Xen. More details on using QCOW images with Xen are in Chapter 4.) At this point you can boot it as usual, loading the kernel via PyGRUB if it’s Xenaware or if you’re using HVM, or using a standard domU kernel from within the dom0 otherwise.

Unfortunately, this won’t generate a configuration suitable for booting the image with Xen. However, it should be easy to create a basic config file that uses the QCOW image as its root device. For example, here’s a fairly minimal generic config that relies on the default values to the extent possible:

name = "hecate"
memory = 128
disk = ['tap:qcow:/mnt/hecate.img,xvda,w' ]
vif = [ '' ]
kernel = "/boot/vmlinuz-2.6-xenU"

Note that we’re using a kernel from the dom0 filesystem rather than loading the kernel from the VMware disk image with PyGRUB, as we ordinarily suggest. This is so we don’t have to worry about whether or not that kernel works with Xen.

RPATH’S RBUILDER: A NEW APPROACH

    RPath is kind of interesting. It probably doesn’t merit extended discussion, but their
approach to building virtual machines is cool. Neat. Elegant.
    RPath starts by focusing on the application that the machine is meant to run and
then uses software that determines precisely what the machine needs to run it by
examining library dependencies, noticing which config files are read, and so on.
The promise of this approach is that it delivers compact, tuned, refined virtual
machine images with known characteristics—all while maintaining the high degree
of automation necessary to manage large systems.
    Their website is http://rpath.org/. They’ve got a good selection of prerolled
VMs, aimed at both testing and deployment. (Note that although we think their
approach is worth mentioning, we are not affiliated with rPath in any way. You
may want to give them a shot, though.)

Mass Deployment

Manual Deployment

===QEMU and Your Existing Infrastructure === ===Installing pypxeboot === ===Automated Installs the Red Hat Way ===

And Then . . .