What I Learned Building a Toy VPS Cloud - Part 1
How does a VM actually get provisioned in something like EC2 or DigitalOcean? I had a rough idea, but the only way to really know was to build a small version myself. The result is Cumulus — a toy VPS cloud running on my laptop. The source is on GitHub.
This is the first post in the series, and it covers the foundation: getting a single VM running with QEMU and configured enough to log in.
The product
I wanted to learn a few specific things, and those became the requirements for Cumulus.
- Manage VMs — see what’s running, start and stop them, destroy them when I’m done.
- Configure instance sizes — pick how many vCPUs, how much RAM, how big the disk.
- Take snapshots — cloud providers offer point-in-time backups from a single click, and they’re fast. How does that actually work?
- Choose operating systems — every cloud lets you pick from a handful of base images, and the new VM comes up already configured on first boot. How do they do that?
- SSH into a fresh VM — when you create a machine on DigitalOcean or Hostinger, it boots with SSH already set up for your user and your public key. How does that get wired up?
- Connect via VNC — open a graphical console of any VM right in the browser, no client install. How is that wired up?
If I can build all of that, I’ll have a real mental model of how a cloud provider works under the hood — and a feel for the problems they actually have to solve.
Architecture
I had an architecture in mind from the start. My development machine is a Mac, but I wanted the design to extend to Linux hosts as well, so I ended up with three components:
- Control Plane — the source of truth for which VMs should exist and what specs they should have. It should be platform agnostic with respect to the hosts and VMs.
- Node Agent — a daemon that runs on each host and actually launches the VMs, taking orders from the Control Plane.
- Web UI — some things just look cooler when you can click around and see things happening. Mostly for ease of operation.
Plenty of things changed as the project grew, but this shape stayed intact.
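To make the split concrete, here's a hypothetical sketch of the kind of record the Control Plane might hold as its source of truth. All the field names here are invented for illustration, not Cumulus's actual schema:

```python
from dataclasses import dataclass

# Hypothetical sketch of a Control Plane record: the desired state of one VM.
# The Node Agent's job is to make reality on its host match records like this.
@dataclass
class VMSpec:
    name: str
    vcpus: int
    mem_gib: int
    disk_gib: int
    image: str   # base image to boot from, e.g. an Ubuntu cloud image
    node: str    # which host should run it

spec = VMSpec(name="vm-001", vcpus=2, mem_gib=2, disk_gib=10,
              image="noble-server-cloudimg-arm64.img", node="laptop")
```

Nothing in the record says *how* to run the VM — that's the point. The Control Plane stays declarative and platform agnostic, and each Node Agent translates the spec into whatever its hypervisor needs.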
Running VMs
There are several options for running VMs locally. I picked QEMU because it didn’t need much setup and exposed enough internals to let me poke at the parts I cared about. For a real production system I’d probably reach for libvirt, which seems to have a more declarative way of handling things.
QEMU
QEMU is the heart of Cumulus, though in theory it could be swapped for any other VM backend. So my first task was to figure out how it worked and how to drive it for what I needed. After some trial and error, I landed on a command that would actually launch a VM, something like:
qemu-system-aarch64 \
-accel hvf \
-machine virt \
-cpu host \
-m 2G \
-drive file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-serial unix:serial.sock,server \
-display none
There’s a lot happening here. The first thing to note is the command name itself: QEMU ships a separate binary per target architecture (e.g., qemu-system-x86_64), so you pick the one matching the architecture you want the guest to have. Let’s go through each of the parameters:
- -accel hvf: sets the acceleration backend. HVF is Apple’s Hypervisor.framework, the macOS-native virtualization layer; on Linux you would use KVM, Xen, or MSHV, according to the QEMU documentation. This is important because it tells QEMU not to emulate the CPU in software, meaning the guest’s instructions run on the actual hardware. If we omit this, QEMU falls back to TCG (Tiny Code Generator), which emulates the guest CPU in software and is much slower.
- -machine virt: the kind of machine being emulated. You can run qemu-system-aarch64 -machine help to see the whole list, but the gist of it is that you could pick, say, a Raspberry Pi and QEMU would emulate it down to its quirks and limits. virt is a generic, purely virtual board with no such baggage.
- -cpu host: passes the host’s exact CPU model through to the guest, no translation. Since I’m running on a Mac, the guest OS will see an Apple arm64 CPU.
- -m 2G: sets the memory size to 2 GiB.
- -drive file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on: attaches the UEFI firmware as a read-only raw image through a parallel flash (pflash) interface.
- -serial unix:serial.sock,server: exposes the guest’s serial console as a Unix socket on the host and blocks until a client connects to it. Inside the guest, the serial port is a piece of (virtual) hardware that Linux exposes as a TTY and reads and writes bytes through.
- -display none: don’t open a graphical framebuffer window on my desktop. The idea is that once Cumulus has more features, we connect to the VM via SSH or VNC instead.
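As an aside, this flag list is essentially what the Node Agent ends up producing: it turns an instance spec into a QEMU argument vector before spawning the process. A minimal sketch in Python — hypothetical names, and the real agent does more (unique socket paths, error handling):

```python
# Hypothetical sketch: build a QEMU argv list from a few instance parameters,
# roughly what a node agent would do before handing it to subprocess.Popen.
FIRMWARE = "/opt/homebrew/share/qemu/edk2-aarch64-code.fd"  # assumed macOS/Homebrew path

def qemu_args(name, mem_gib, disk=None):
    args = [
        "qemu-system-aarch64",
        "-accel", "hvf",          # hardware-assisted on macOS; KVM on Linux
        "-machine", "virt",
        "-cpu", "host",
        "-m", f"{mem_gib}G",
        "-drive", f"file={FIRMWARE},if=pflash,format=raw,readonly=on",
        "-serial", f"unix:{name}-serial.sock,server",
        "-display", "none",
    ]
    if disk:
        # optional root disk, attached the same way we'll do below
        args += ["-drive", f"file={disk},if=virtio"]
    return args
```

A list of strings rather than one shell string avoids quoting bugs and makes the command easy to log and diff against the spec.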
So at this point, we can connect to the serial socket from another terminal tab on the machine:
socat -,rawer,escape=0x1d unix-connect:serial.sock
Connecting drops us into the UEFI shell (the escape=0x1d option lets you detach from socat with Ctrl+]):
UEFI Interactive Shell v2.2
EDK II
UEFI v2.70 (EDK II, 0x00010000)
map: No mapping found.
Press ESC in 1 seconds to skip startup.nsh or any other key to continue.
Shell>
And this is expected, because the machine has no bootable disk attached to it. So the next step is disks. We could do what we’d do with a physical machine: attach an empty volume, grab an operating system ISO, and install it, like I used to do with my own machines back in the day. However, there is a better way: some Linux distros provide cloud images (Ubuntu’s are here: https://ubuntu.com/server/docs/explanation/clouds/find-cloud-images/) which already have the installation step done for us. So we can do something like this:
curl -LO https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-arm64.img
This downloads an Ubuntu cloud image in the qcow2 format. qcow2 is QEMU’s disk image format: instead of representing a raw block device byte-for-byte, it stores the virtual disk in a regular file and allocates space as data is written. It also supports features like snapshots, compression, and backing files, which makes it common for VM and cloud images.
The extension is .img, but if you inspect the file with qemu-img info noble-server-cloudimg-arm64.img, you’ll see that it’s in qcow2 format. So now we repeat the previous command, adding a new drive backed by that file:
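You can also see this without qemu-img: qcow2 files start with a fixed 4-byte magic, "QFI\xfb". A quick Python sketch that detects the format regardless of the file extension:

```python
# qcow2 images start with the 4-byte magic "QFI\xfb" (0x514649fb),
# so the format is easy to detect no matter what the extension says.
QCOW2_MAGIC = b"QFI\xfb"

def is_qcow2(path):
    with open(path, "rb") as f:
        return f.read(4) == QCOW2_MAGIC

# e.g. is_qcow2("noble-server-cloudimg-arm64.img") should be True
# for the Ubuntu cloud image downloaded above.
```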
qemu-system-aarch64 \
-accel hvf \
-machine virt \
-cpu host \
-m 2G \
-drive file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-drive file=noble-server-cloudimg-arm64.img,if=virtio \
-serial unix:serial.sock,server \
-display none
Once we do this and connect again, we see Ubuntu booting. That’s really fun! If we wait long enough, the login prompt appears. And here’s the gotcha: these images ship with no default user or password, so we can’t log in.
How to Log in
The image we just booted knows nothing about us, so we need a way to inject that configuration into it. Enter cloud-init.
Cloud-init is a package that comes pre-installed on these distros and runs at boot time to turn the generic image into a specifically configured VM. Cloud-init supports several datasources (it can fetch data from a specific IP or domain, for example), but there is also the NoCloud datasource (https://docs.cloud-init.io/en/latest/reference/datasources/nocloud.html), which, among other things, searches for a filesystem labeled cidata containing two files: user-data and meta-data. So we need a way to attach those two files to the machine before it boots.
As the name implies, user-data contains the parameters for the user to be configured; a minimal example is:
#cloud-config
users:
  - name: cumulus
    plain_text_passwd: cumulus
    lock_passwd: false
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
We also have the meta-data file, for which a minimal example is:
instance-id: cumulus-vm-001
We can package those files into a file system and pass it to QEMU with the following command:
mkisofs -output seed.iso -volid cidata -rock user-data meta-data
I used mkisofs to build an ISO 9660 CD-ROM image in seed.iso, with the volume identifier cidata (the label cloud-init’s NoCloud datasource expects) and the two files. The -rock flag enables the Rock Ridge extension, which allows some Unix filesystem features; without it, the filenames would be all-uppercase, DOS-style. The two files end up at the root of the “CD-ROM”, and we can now boot the VM passing -cdrom seed.iso.
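To sanity-check the label without mounting anything, you can read it straight out of the image: the ISO 9660 primary volume descriptor lives at byte offset 32768 (sector 16), and the volume identifier is the space-padded 32-byte field at offset 40 within it. A small Python sketch:

```python
# Read the volume identifier from an ISO 9660 image.
# The primary volume descriptor (PVD) sits at sector 16 (byte 32768);
# bytes 1..5 of it are the magic "CD001", and the volume identifier is
# the space-padded 32-byte field at offset 40 within the descriptor.
def iso_volume_id(path):
    with open(path, "rb") as f:
        f.seek(32768)
        pvd = f.read(2048)
    if pvd[1:6] != b"CD001":
        raise ValueError("not an ISO 9660 image")
    return pvd[40:72].decode("ascii").strip()

# e.g. iso_volume_id("seed.iso") should give back the -volid we passed
# to mkisofs (cloud-init accepts the cidata label case-insensitively).
```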
qemu-system-aarch64 \
-accel hvf \
-machine virt \
-cpu host \
-m 2G \
-drive file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-drive file=noble-server-cloudimg-arm64.img,if=virtio \
-cdrom seed.iso \
-serial unix:serial.sock,server \
-display none
At this point, we can connect to the serial socket, wait for the boot to finish, log in as the cumulus user, and voilà: we have a more or less functional VM.
What’s Next?
In theory, we could wrap this command into the Cumulus agent and have the Control Plane drive it — and that’s what I ended up doing. But along the way I kept hitting things VPS providers handle that I hadn’t yet thought about: scheduling VMs across hosts, snapshots, picking an operating system, SSH key injection, browser-based VNC consoles. Each gets its own post in the series.