What I Learned Building a Toy VPS Cloud - Part 2
In Part 1 we got an Ubuntu VM running on QEMU and configured it enough to log in. The way we interacted with it was through the serial console — we ran socat against a Unix socket, watched the kernel boot, and got a login prompt.
On the other side of that socket is the guest, in other words the VM. The setup is no different from plugging a serial cable into a physical machine’s console port. Virtualization just hands us a Unix socket instead of a connection via a serial port.
But in a real cloud, the management plane doesn’t talk to the guest like that. When you click “Stop” on EC2, AWS doesn’t SSH in and run shutdown. It couldn’t, even if it wanted to: what if the VM is frozen and needs a hard reset? AWS is talking to the hypervisor itself, not to the OS running inside the VM.
That hypervisor-side channel has a name: QMP, the QEMU Machine Protocol. Through it we can:
- query VM state (running, paused, shutting down)
- take a snapshot of the disk
- reset the VM
- add or remove virtual devices
Enabling QMP
We add one line to the command from Part 1: -qmp unix:monitor.sock,server,nowait.
qemu-system-aarch64 \
-accel hvf \
-machine virt \
-cpu host \
-m 2G \
-drive file=/opt/homebrew/share/qemu/edk2-aarch64-code.fd,if=pflash,format=raw,readonly=on \
-drive file=noble-server-cloudimg-arm64.img,if=virtio \
-cdrom seed.iso \
-serial unix:serial.sock,server \
-qmp unix:monitor.sock,server,nowait \
-display none
There’s a lot packed into that new option. Let’s break it down:
- -qmp: tells QEMU to expose its management protocol on a socket. There’s also another way, using -monitor [1], which is a human-readable command line. QMP is the JSON-based counterpart, designed for programs to talk to.
- unix:monitor.sock: the socket lives at monitor.sock in the working directory.
- server: QEMU listens on the socket. Without it, QEMU switches to client mode and tries to connect to an already-listening socket instead. That is useful when a management daemon owns the socket and spawns QEMU into it, but it’s not what we want here.
- nowait: unlike Part 1’s -serial unix:serial.sock,server, this one doesn’t block waiting for a client. We don’t want to gate VM boot on someone listening to QMP; it’s something we connect to opportunistically.
Now we can run the command again to boot the VM. We’ve got two channels into the same machine: serial.sock for the guest, monitor.sock for QEMU.
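Since nowait means QEMU never waits for us, a program that wants to talk to QMP has to cope with the socket not being there yet. Here is a minimal sketch of that, assuming a simple retry loop is good enough; connect_when_ready and its timeout and interval parameters are names made up for this example, not part of QEMU or Cumulus.

```python
import socket
import time


def connect_when_ready(path, timeout=5.0, interval=0.1):
    """Retry connecting to a Unix socket until a listener shows up.

    Hypothetical helper: `path` would be monitor.sock in our setup.
    """
    deadline = time.monotonic() + timeout
    while True:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            sock.connect(path)
            return sock
        except (FileNotFoundError, ConnectionRefusedError):
            # Socket file missing, or present with nobody listening yet.
            sock.close()
            if time.monotonic() >= deadline:
                raise TimeoutError(f"no listener on {path} after {timeout}s")
            time.sleep(interval)
```

Calling connect_when_ready("monitor.sock") right after spawning QEMU would then return a connected socket as soon as QEMU has created it.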
Establishing the connection
With the QMP socket exposed, we now need to connect to it and complete a handshake before QEMU will accept any commands.
From another terminal, connect with socat:
socat -,rawer,escape=0x1d unix-connect:monitor.sock
QEMU sends the first message the moment we connect:
{"QMP": {"version": {"qemu": {"micro": 0, "minor": 2, "major": 9}, "package": "v9.2.0"}, "capabilities": ["oob"]}}
This is the server greeting. Most of the fields are self-explanatory, with the exception of the capabilities.
This is not important at the moment, but as a curiosity: oob (out-of-band) in the greeting means QEMU supports out-of-band execution on this connection. Once a client enables it, commands marked as out-of-band can be processed even while the connection is busy with something else (like a long-running command). Without it, QMP handles commands strictly in order, so a slow one blocks everything behind it.
At this point the connection is in capabilities negotiation mode, and QEMU rejects every other command. To leave that mode and switch to command mode, we send qmp_capabilities:
{"execute": "qmp_capabilities"}
QEMU replies:
{"return": {}}
Now the connection is established and we can send other commands.
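The same handshake can be scripted. The sketch below assumes each QMP message arrives on its own line, which is how QEMU emits them in practice; read_message, handshake, and connect are names invented for this example.

```python
import json
import socket


def read_message(reader):
    """Read one line-delimited JSON message from the QMP stream."""
    line = reader.readline()
    if not line:
        raise ConnectionError("QMP connection closed")
    return json.loads(line)


def handshake(sock, reader):
    """Consume the greeting, send qmp_capabilities, return the greeting body."""
    greeting = read_message(reader)
    if "QMP" not in greeting:
        raise RuntimeError(f"unexpected greeting: {greeting!r}")
    sock.sendall(json.dumps({"execute": "qmp_capabilities"}).encode() + b"\n")
    reply = read_message(reader)
    if "return" not in reply:
        raise RuntimeError(f"capability negotiation failed: {reply!r}")
    return greeting["QMP"]


def connect(path="monitor.sock"):
    """Open the QMP socket and complete the handshake."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(path)
    # Keep a single reader per connection so buffered data is never lost.
    reader = sock.makefile("r", encoding="utf-8")
    handshake(sock, reader)
    return sock, reader
```

The reader is created once and passed around on purpose: a buffered file object may already hold the next message, so creating a fresh one for every read could drop data.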
QMP commands in action
The simplest and most useful thing we can do is query the VM status. We send:
{"execute": "query-status"}
QEMU responds:
{"return": {"status": "running", "running": true}}
To pause the VM:
{"execute": "stop"}
QEMU returns {"return": {}} and immediately emits an event:
{"timestamp": {"seconds": 1778539198, "microseconds": 420264}, "event": "STOP"}
The virtual CPUs halt and guest RAM freezes in place. From inside the VM, time just doesn’t pass. If we keep the VM stopped for an hour and then resume it, its wall clock will be an hour behind.
Running query-status now shows the new state:
{"return": {"status": "paused", "running": false}}
To resume the VM, we send cont:
{"execute": "cont"}
QEMU returns {"return": {}} and pushes a matching event:
{"timestamp": {"seconds": 1778539312, "microseconds": 118402}, "event": "RESUME"}
vCPUs start ticking again, devices wake up, and the guest picks up at exactly the instruction it was about to execute.
This is the kind of interaction QMP gives us: operations that live at the QEMU layer and don’t depend on the guest operating system at all.
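Typing JSON into socat gets old quickly, so here is a rough sketch of a helper that sends one command and waits for its response. It assumes the handshake is already done and that messages are line-delimited; execute, writer, and reader are names chosen for this example. Events that show up before the response are collected and returned alongside it, while events that arrive afterwards (like STOP above) simply stay on the stream for a later read.

```python
import json


def execute(writer, reader, name, arguments=None):
    """Send one QMP command; return (result, events seen while waiting)."""
    command = {"execute": name}
    if arguments:
        command["arguments"] = arguments
    writer.write(json.dumps(command) + "\n")
    writer.flush()

    events = []
    while True:
        line = reader.readline()
        if not line:
            raise ConnectionError("QMP connection closed")
        message = json.loads(line)
        if "event" in message:
            # Unsolicited event: set it aside and keep waiting.
            events.append(message)
        elif "error" in message:
            raise RuntimeError(message["error"].get("desc", str(message["error"])))
        elif "return" in message:
            return message["return"], events
```

With a connected socket, writer and reader could be sock.makefile("w", encoding="utf-8") and sock.makefile("r", encoding="utf-8"). Pausing and checking the VM then becomes execute(writer, reader, "stop") followed by execute(writer, reader, "query-status").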
Interacting with the VM
So far we’ve used QMP to read the VM status and pause/resume it. We can also send a shutdown signal to the guest.
{"execute": "system_powerdown"}
QEMU acknowledges with {"return": {}} and sends an ACPI power-down signal to the guest — same effect as pressing the physical power button on a machine. The OS catches the signal and runs its shutdown sequence.
Almost immediately, QEMU pushes an event:
{"timestamp": {"seconds": 1778539749, "microseconds": 374873}, "event": "POWERDOWN"}
Once the guest has finished shutting down, QEMU pushes a second event:
{"timestamp": {"seconds": 1778539887, "microseconds": 228866}, "event": "SHUTDOWN", "data": {"guest": true, "reason": "guest-shutdown"}}
The "guest": true part is important: the shutdown originated from inside the guest (the OS responded to the ACPI signal), not from QEMU forcing it. That’s the difference between a graceful shutdown and a hard shutdown.
The QEMU process then exits.
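A management program would typically send system_powerdown and then wait for SHUTDOWN before treating the VM as stopped. The sketch below shows only the waiting part, assuming line-delimited messages; wait_for_event and describe_shutdown are hypothetical helpers, and there is no timeout here, which a real control plane would want so it can fall back to a hard stop.

```python
import json


def wait_for_event(reader, name):
    """Read messages until an event called `name` arrives; return it."""
    for line in reader:
        if not line.strip():
            continue
        message = json.loads(line)
        if message.get("event") == name:
            return message
    raise ConnectionError(f"stream closed before {name} arrived")


def describe_shutdown(event):
    """Summarize who initiated a SHUTDOWN event and why."""
    data = event.get("data", {})
    origin = "guest" if data.get("guest") else "host"
    reason = data.get("reason", "unknown")
    return f"{origin}-initiated shutdown (reason: {reason})"
```

Feeding it the two events above, wait_for_event(reader, "SHUTDOWN") skips POWERDOWN and returns the SHUTDOWN message, which describe_shutdown reports as guest-initiated.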
The Protocol
Notice that commands, responses, events, and the server greeting are different objects.
- Commands: JSON objects we send to QEMU. Shape: {"execute": "name", "arguments": {...}}. See Issuing Commands.
- Responses: what QEMU sends back per command, exactly once. Either {"return": ...} on success or {"error": ...} on failure. See Commands Responses.
- Events: unsolicited messages QEMU pushes when something happens in the VM. Shape: {"event": "NAME", "timestamp": {...}, "data": {...}}. We’ve seen STOP, RESUME, POWERDOWN, SHUTDOWN. See Asynchronous Events.
- Server greeting: the first message QEMU sends on connect, announcing version and supported capabilities (covered above in Establishing the connection). See Server Greeting.
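Because each kind of message from QEMU has its own top-level key, a client can tell them apart with a handful of checks. A small sketch, where classify is a name made up for this example:

```python
def classify(message):
    """Return which kind of QMP message a decoded JSON object is."""
    if "QMP" in message:
        return "greeting"
    if "event" in message:
        return "event"
    if "return" in message or "error" in message:
        return "response"
    raise ValueError(f"not a recognized QMP message: {message!r}")
```

A read loop could use this to route events to a subscriber queue while handing responses back to whoever issued the command.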
Wrapping up
The VM has two open channels now: serial for the guest, QMP for QEMU itself. Through QMP we have query, pause/resume, graceful and hard shutdowns, snapshots, device hotplug, and event subscription.
So now we have another building block of Cumulus. We can already start the machine, and interact with it via QEMU’s API, but we need a lot more.
If you’re curious, stay tuned for the rest of the series.
[1] Initially I didn’t know about QMP, so Cumulus went the -monitor route with some text parsing on top.