Hacking diskimage-builder for Fun and Profit
Introduction
For the past year I have been responsible for redesigning a multi-region OpenStack deployment and keeping it up to date. I will eventually go in-depth on what that entailed, but for now I will focus on my first challenge on that infrastructure: keeping the cloud images provided to the users up to date and customized according to defined local policies.
To achieve this, and after researching my choices, I chose diskimage-builder. It may seem overkill at first glance, but its modular architecture called to my OCD nature: you can make your own modules that depend on existing modules, so yes, it seemed perfect for the job. Besides, it is made by the OpenStack team, so it is more than trustworthy.
Before I continue, let me be clear: this tool is used for generating stripped-down, bootable Linux cloud images, which you can use on any platform that supports booting the resulting filesystem. These images are cloud-ready and should be able to retrieve any metadata you configure them to fetch from your cloud provider.
At the end of this article you should be able to generate your weekly up to date cloud images in whatever CI solution you master =)
Requirements
I recommend having some background before delving into this use case, as it can get very complex when you need to debug more advanced generation scenarios (e.g. CI). The cloud images provided by your favorite distribution should be enough to satisfy most cloud computing needs. This process helps reduce the cloud image footprint to your bare needs, and allows you to customize it beyond the distribution's provided cloud images.
That said, I advise:
- experience with chroot (I recommend reading the Gentoo documentation)
- experience with loop devices (specifically using losetup)
- Python experience, especially regarding virtual environments
- know your Linux File System (LFS)
- know your way around manipulating Bash variables
- have minor experience with cloud-init
- and as always, use Git
diskimage-builder
As stated in its documentation, diskimage-builder is a tool for automatically building customized operating-system images for use in clouds and other environments.
It includes support for building images based on many major distributions and can produce cloud-images in all common formats (qcow2, vhd, raw, etc), bare metal file-system images and ram-disk images. These images are composed from the many included elements; diskimage-builder acts as a framework to easily add your own elements for even further customization.
One of the main reasons I chose diskimage-builder was its modular design, but I quickly discovered that it is also used extensively by the TripleO project and within OpenStack Infrastructure, which made it a perfect fit for our OpenStack infrastructure.
Elements
"An element is a particular set of code that alters how the image is built, or runs within the chroot to prepare the image." [2]
"Images are built using a chroot and bind mounted /proc /sys and /dev. The goal of the image building process is to produce blank slate machines that have all the necessary bits to fulfill a specific purpose in the running of an OpenStack cloud. Images produce either a filesystem image with a label of cloudimg-rootfs, or can be customised to produce whole disk images (but will still contain a filesystem labelled cloudimg-rootfs). Once the file system tree is assembled a loopback device with filesystem (or partition table and file system) is created and the tree copied into it. The file system created is an ext4 filesystem just large enough to hold the file system tree and can be resized up to 1PB in size." [2]
Take this in... envision it. Now we will split the image build process into its ten phases, which each element can hook into if it needs to. [3]
- root.d
- extra-data.d
- pre-install.d
- install.d
- post-install.d
- post-root.d
- block-device.d
- pre-finalise.d
- finalise.d
- cleanup.d
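To make the list concrete: in an element, each phase is simply a directory holding executable scripts. The following is a minimal sketch that scaffolds an empty element with one directory per phase. The element name my_element and the scratch location are hypothetical, and a real element only needs the directories for the phases it actually uses.

```shell
#!/bin/bash
# Sketch: scaffold an empty element with one directory per build phase.
# "my_element" is a hypothetical name; ELEMENT_ROOT is a scratch location.
set -eu

ELEMENT_ROOT="$(mktemp -d)/my_element"

# The ten phases, in the order listed above.
PHASES="root.d extra-data.d pre-install.d install.d post-install.d
post-root.d block-device.d pre-finalise.d finalise.d cleanup.d"

for phase in $PHASES; do
    mkdir -p "$ELEMENT_ROOT/$phase"
done

# Show the resulting layout.
ls -1 "$ELEMENT_ROOT"
```

Scripts placed in these directories are usually given a numeric prefix (for example 10-something) so that their execution order within a phase is explicit.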
root.d
Create or adapt the initial root filesystem content. This is where alternative distribution support is added, or customisations such as building on an existing image.
Only one element can use this at a time unless particular care is taken not to blindly overwrite but instead to adapt the context extracted by other elements.
- runs:
- outside chroot
- inputs:
- $ARCH=i386 OR amd64 OR armhf OR arm64
- $TARGET_ROOT=/path/to/target/workarea
extra-data.d
Pull in extra data from the host environment that hooks may need during image creation. This should copy any data (such as SSH keys, http proxy settings and the like) somewhere under $TMP_HOOKS_PATH.
- runs:
- outside chroot
- inputs:
- $TMP_HOOKS_PATH
- outputs:
- None
Contents placed under $TMP_HOOKS_PATH will be available at /tmp/in_target.d inside the chroot.
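As an illustration, an extra-data.d hook could look roughly like the sketch below. The script name, the my_base subdirectory and the copied file are all hypothetical; only $TMP_HOOKS_PATH comes from diskimage-builder, and the fallback value exists solely so the sketch can run on its own.

```shell
#!/bin/bash
# Sketch of an extra-data.d hook, e.g. extra-data.d/10-copy-motd (hypothetical name).
set -eu

# diskimage-builder sets TMP_HOOKS_PATH during a real build; the fallback
# below is only here so this sketch can be run standalone.
TMP_HOOKS_PATH="${TMP_HOOKS_PATH:-$(mktemp -d)}"

# Stand-in for a file living on the build host (hypothetical content).
HOST_FILE="$(mktemp)"
echo "Built by dib-tutorial" > "$HOST_FILE"

# Copy it somewhere under TMP_HOOKS_PATH so in-chroot phases can reach it.
mkdir -p "$TMP_HOOKS_PATH/my_base"
cp "$HOST_FILE" "$TMP_HOOKS_PATH/my_base/motd"
```

With this in place, a later in-chroot script would find the file at /tmp/in_target.d/my_base/motd.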
pre-install.d
Run code in the chroot before customisation or packages are installed. A good place to add apt repositories.
- runs:
- in chroot
install.d
Runs after pre-install.d in the chroot. This is a good place to install packages, chain into configuration management tools or do other image specific operations.
- runs:
- in chroot
post-install.d
Run code in the chroot. This is a good place to perform tasks you want to handle after the OS/application install but before the first boot of the image. Some examples of use would be:
- Run chkconfig to disable unneeded services
- Clean the cache left by the package manager to reduce the size of the image.
- runs:
- in chroot
post-root.d
Run code outside the chroot. This is a good place to perform tasks that cannot run inside the chroot and must run after installing things. The root filesystem content is rooted at $TMP_BUILD_DIR/mnt.
- runs:
- outside chroot
block-device.d
Customise the block device that the image will be made on (for example to make partitions). Runs after the target tree has been fully populated but before the cleanup.d phase runs.
- runs:
- outside chroot
- inputs:
- $IMAGE_BLOCK_DEVICE={path}
- $TARGET_ROOT={path}
- outputs:
- $IMAGE_BLOCK_DEVICE={path}
pre-finalise.d
Final tuning of the root filesystem, outside the chroot. Filesystem content has been copied into the final file system which is rooted at $TMP_BUILD_DIR/mnt. You might do things like re-mount a cache directory that was used during the build in this phase (with subsequent unmount in cleanup.d).
- runs:
- outside chroot
finalise.d
Perform final tuning of the root filesystem. Runs in a chroot after the root filesystem content has been copied into the mounted filesystem: this is an appropriate place to reset SELinux metadata, install grub bootloaders and so on.
Because this happens inside the final image, it is important to limit operations here to only those necessary to affect the filesystem metadata and image itself. For most operations, post-install.d is preferred.
- runs:
- in chroot
cleanup.d
Perform cleanup of the root filesystem content. For instance, temporary settings to use the image build environment HTTP proxy are removed here in the dpkg element.
- runs:
- outside chroot
- inputs:
- $ARCH=i386 OR amd64 OR armhf OR arm64
- $TARGET_ROOT=/path/to/target/workarea
cloud-init
Cloud-init is a method developed by Canonical for cross-platform cloud instance initialization. It is supported across major public cloud providers, provisioning systems for private cloud infrastructure, and bare-metal installations.
Cloud instances are initialized from a disk image and instance cloud metadata.
Cloud-init will identify the cloud it is running on during boot, read any provided metadata from the cloud and initialize the system accordingly. This may involve setting up the network and storage devices to configuring SSH access key and many other aspects of a system. [4]
The cloud-init documentation lists its availability for different Linux distributions as well as its compatibility with different cloud execution environments.
We should note that there are two main metadata types, EC2 and OpenStack. They have different API specifications, but you can usually find support for OpenStack metadata links across different public clouds, or even for both. Be sure to know your cloud provider!
We will not go much into how cloud-init operates; just know that it exists and what purpose it serves. Of course, you are free to explore the documentation, just be careful with the rabbit-hole, Alice =)
Creating our Image
For this article we will be building a CentOS 8 cloud-ready image.
CentOS released its version 8 cloud images last month and, conveniently, the diskimage-builder developers have already added the much-needed changes to their centos-minimal element. However, I noticed that the cloud-init package was missing from the generated image (I don't know if it is also missing from the upstream image), so we should also add that package to our image, or else we will not have our precious SSH key added to access any launched instances.
Setup
First we should get the tutorial files:
t0rrant@testing:~$ wget https://implement.pt/files/hacking-diskimage-builder-for-fun-and-profit/dib-tutorial.tar.gz
t0rrant@testing:~$ tar xzf dib-tutorial.tar.gz
t0rrant@testing:~$ cd dib-tutorial
Now we should create a Python virtual environment using virtualenvwrapper and install diskimage-builder and its dependencies.
t0rrant@testing:~/dib-tutorial$ mkvirtualenv dib-elements
(dib-elements) t0rrant@testing:~/dib-tutorial$ pip install diskimage-builder
Project Structure
For modularity and learning purposes we will have two elements, centos-eight and my_base. The my_base element will contain settings that may be common between different images, be it CentOS, Debian, Ubuntu, Gentoo, or any others, whereas the centos-eight element will take care only of things specific to CentOS 8.
Let us start with the my_base element.
We should have a directory structure that looks something like this:
my_base/
├── element-deps
├── environment.d
│ └── 10-cloud-init-datasources
├── package-installs.yaml
├── pkg-map
├── post-install.d
│ └── 01-cloud-init-override
└── README.rst
The element-deps file describes which other elements should be run alongside ours:
|
|
Besides the "mandatory" base and vm elements, we have some important elements that we do not need to configure.
- growroot - grow the root partition on first boot [5]
- dhcp-all-interfaces - autodetect network interfaces during boot and configure them for DHCP [6]
- enable-serial-console - start getty on active serial consoles. [7]
- openssh-server - ensures that openssh server is installed and enabled during boot [8]
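Putting these together with the cloud-init-datasources, package-installs and pkg-map elements that my_base also relies on, its element-deps file might look like the sketch below. This is a reconstruction from the elements named in this section, not necessarily the exact file shipped in the tutorial archive; the format is simply one element name per line.

```shell
# Sketch: write a plausible my_base/element-deps (one element name per line).
set -eu
ELEMENT_DIR="$(mktemp -d)/my_base"   # scratch copy, not the tutorial tree
mkdir -p "$ELEMENT_DIR"

cat > "$ELEMENT_DIR/element-deps" <<'EOF'
base
vm
growroot
dhcp-all-interfaces
enable-serial-console
openssh-server
cloud-init-datasources
package-installs
pkg-map
EOF

cat "$ELEMENT_DIR/element-deps"
```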
Using the cloud-init-datasources element allows us to configure where cloud-init fetches the metadata it feeds to our instance; for this, the DIB_CLOUD_INIT_DATASOURCES environment variable must be set at image creation. One way of doing this is using the environment.d stage, and in our example we set up only the OpenStack data source. Let us create the 10-cloud-init-datasources file inside environment.d:
|
|
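A minimal sketch of that file, assuming it only needs to export the variable with the single OpenStack data source described above:

```shell
# environment.d/10-cloud-init-datasources (sketch)
# Files in environment.d are sourced into the build environment.
# Restrict cloud-init to the OpenStack data source only.
export DIB_CLOUD_INIT_DATASOURCES="OpenStack"
```

Several data sources can be given as a comma-separated list if your cloud provider exposes more than one.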
Using both package-installs and pkg-map elements gives us a flexible way to manage package installation across different distributions. We use pkg-map to map an arbitrary name to a specific package name in a specific distribution family, or release, and with package-installs we define which of those arbitrary names we want installed in our image. So we create the package-installs.yaml file in the root of our element:
|
|
and the corresponding package mapping in the pkg-map file, in json format:
|
|
In our case this maps every name using the entries under ["family"]["redhat"], except for the ntp package, which we specified to be chrony for CentOS 8.
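To illustrate how the two files fit together, here is a small sketch that writes both into a scratch directory. Only the ntp to chrony mapping for CentOS 8 comes from this article; the vim entry and the exact package list are hypothetical placeholders.

```shell
# Sketch: generate a package-installs.yaml and a matching pkg-map.
set -eu
ELEMENT_DIR="$(mktemp -d)/my_base"   # scratch copy, not the tutorial tree
mkdir -p "$ELEMENT_DIR"

# Arbitrary names we want installed (these are our labels, not package names).
cat > "$ELEMENT_DIR/package-installs.yaml" <<'EOF'
ntp:
vim:
EOF

# Resolve those labels per distribution family, with a release-specific
# exception: on CentOS 8 the "ntp" label resolves to chrony.
cat > "$ELEMENT_DIR/pkg-map" <<'EOF'
{
  "family": {
    "redhat": {
      "ntp": "ntp",
      "vim": "vim-enhanced"
    }
  },
  "release": {
    "centos": {
      "8": {
        "ntp": "chrony"
      }
    }
  }
}
EOF
```

The more specific "release" entry wins over the "family" entry, which is what lets one element serve several releases of the same family.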
For consistency between different elements using the my_base element, we may want to override the cloud-init configuration that comes with each distro's cloud-init version. To do that we should use the post-install.d stage. Let us create the 01-cloud-init-override file in the post-install.d directory:
|
|
Note: the files under post-install.d should have executable mode activated, or they will not be executed.
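A quick way to guard against that mistake is sketched below. The hook body is only a placeholder (the real script would rewrite the cloud-init configuration), and the scratch directory stands in for your element tree.

```shell
# Sketch: create a hook, make it executable, and check for non-executable hooks.
set -eu
ELEMENT_DIR="$(mktemp -d)/my_base"   # scratch copy, not the tutorial tree
mkdir -p "$ELEMENT_DIR/post-install.d"
HOOK="$ELEMENT_DIR/post-install.d/01-cloud-init-override"

# Placeholder body; the real script would override the cloud-init configuration.
cat > "$HOOK" <<'EOF'
#!/bin/bash
set -eu
echo "overriding cloud-init configuration (placeholder)"
EOF

chmod +x "$HOOK"

# Print any hook in this phase that is still not executable (expect no output).
find "$ELEMENT_DIR/post-install.d" -type f ! -perm -u+x -print
```

Running the find command against your own elements before a build (or in CI) catches hooks that would otherwise be silently skipped.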
Now we can move on to our centos-eight element. Here is our directory structure:
centos-eight/
├── element-deps
├── package-installs.yaml
├── pkg-map
├── post-install.d
│ └── 50-cloud-init-config
└── README.rst
Taking a look at the dependencies (element-deps file) for our final element, centos-eight:
|
|
We can see that we are including the my_base element we created earlier and two new elements, and that we are repeating the use of package-installs and pkg-map. We include them again because in this element we want to specify a special case: we want the cloud-init package to be installed. We do not do this in the my_base element, as other distributions may already have cloud-init installed, and they may or may not use the package manager, so we leave that option open.
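Based on that description, the element-deps file for centos-eight would contain something like the sketch below, where the two new elements are centos-minimal and epel (both described further down). This is inferred from the text rather than copied from the tutorial archive.

```shell
# Sketch: write a plausible centos-eight/element-deps.
set -eu
ELEMENT_DIR="$(mktemp -d)/centos-eight"   # scratch copy, not the tutorial tree
mkdir -p "$ELEMENT_DIR"

cat > "$ELEMENT_DIR/element-deps" <<'EOF'
my_base
centos-minimal
epel
package-installs
pkg-map
EOF

cat "$ELEMENT_DIR/element-deps"
```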
Let us configure package-installs.yaml for centos-eight:
|
|
and also the pkg-map:
|
|
The centos-minimal element will create a minimal image based on CentOS. The use of this element will require ‘yum’ and ‘yum-utils’ to be installed if you are using Ubuntu or Debian where you generate the image. Nothing additional is needed on Fedora or CentOS. [9]
The epel element installs the Extra Packages for Enterprise Linux (EPEL) repository GPG key as well as configuration for yum. [10]
In this case we want to make sure some of cloud-init's options are added to our already overridden configuration, so we will use the same stage as in the my_base element, but with a different priority so that it runs afterwards. Here we name the file 50-cloud-init-config and place it within the post-install.d directory:
|
|
That is it for the centos-eight element, and we are now (almost) ready to generate our customized cloud-ready image.
Generating the image
Now that we have our elements set up, we can set up our environment variables:
- we define where disk-image-create can find our custom elements
- we define the main distro we are creating
- we define the distro's release (this is relevant for *-minimal elements)
(dib-elements) t0rrant@testing:~/dib-tutorial$ export ELEMENTS_PATH=elements
(dib-elements) t0rrant@testing:~/dib-tutorial$ export DISTRO=centos
(dib-elements) t0rrant@testing:~/dib-tutorial$ export DIB_RELEASE=8
Finally we can create our image:
(dib-elements) t0rrant@testing:~/dib-tutorial$ disk-image-create -x --no-tmpfs -o centos8.qcow2 block-device-mbr centos-eight
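For the weekly CI builds mentioned in the introduction, the steps above could be wrapped in a small script along these lines. This is only a sketch: the DRY_RUN switch and the dated image name are assumptions made for illustration, while the environment variables and the disk-image-create arguments are the ones used above.

```shell
#!/bin/bash
# Sketch of a CI wrapper around the build command above.
# DRY_RUN and the dated image name are illustrative assumptions.
set -eu

# Same environment as in the manual steps.
export ELEMENTS_PATH="${ELEMENTS_PATH:-elements}"
export DISTRO="centos"
export DIB_RELEASE="8"

# Tag each build with the date so weekly images do not overwrite each other.
IMAGE_NAME="centos8-$(date +%Y%m%d).qcow2"
BUILD_CMD="disk-image-create -x --no-tmpfs -o $IMAGE_NAME block-device-mbr centos-eight"

if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: only print what would be executed.
    echo "$BUILD_CMD"
else
    # Word splitting is intentional here; no argument contains spaces.
    $BUILD_CMD
fi
```

Running it with DRY_RUN=0 inside the virtual environment performs the actual build; the default only prints the command, which is handy when debugging the CI job itself.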
We can now upload the generated image to our cloud platform, for example:
(dib-elements) t0rrant@testing:~/dib-tutorial$ openstack image create my-centos-eight --file centos8.qcow2 --disk-format=qcow2 --container-format=bare
Pick up some java, wheat or malt, you've earned it! Then go play with the image you have just uploaded and see if you can customize it even further.
Conclusion
In this article we went through all of the components required to create a cloud-ready image (almost) from scratch. We saw what elements are in diskimage-builder and which phases they go through in the image building process. We talked briefly about cloud-init and metadata types, and finished with an in-depth tutorial on building a customized cloud-ready CentOS 8 image.
Hopefully this was useful and you could replicate every example =)
Feel free to leave comments below, any improvement to this and other articles is always welcome.
References
[1] diskimage-builder - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[2] Developer Guide (Design) - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[3] Developer Guide (Developing Elements) - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[4] cloud-init Documentation, cloud-init 20.1.
[5] growroot element - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[6] dhcp-all-interfaces element - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[7] enable-serial-console element - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[8] openssh-server element - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[9] centos-minimal element - OpenStack Documentation, diskimage-builder 2.33.1.dev11.
[10] epel element - OpenStack Documentation, diskimage-builder 2.33.1.dev11.