Building GoShip: Why I’m Building a VM Control Plane in Go

Most modern infrastructure treats virtual machines as implementation details — hidden beneath container abstractions and serverless platforms. GoShip started as a personal learning project to treat VMs as first-class primitives again and dive deep into the Linux virtualization stack — KVM, QEMU, and libvirt — instead of only consuming it through higher-level platforms.

I run personal projects across my dev machine, homelab, and a few VPS instances. At first, I was simply looking for stronger isolation. Port collisions, network conflicts, and resource contention kept getting in the way. For years, the default answer was containers: throw everything into Docker, share the kernel, and call it a day.

While digging into isolation options, I ran into a less obvious problem: “just use containers” often means “trust the host more than you probably should.” Container escapes are a well-documented vulnerability class in shared-kernel environments. Incidents such as CVE-2019-5736 (runc) and CVE-2022-0492 (cgroups breakout) show that namespace-based isolation ultimately depends on the integrity of the host kernel and runtime. Mechanisms like seccomp, AppArmor, SELinux, and rootless containers reduce attack surface, but they do not change the fundamental architectural boundary: containers share a kernel; virtual machines execute behind a hardware-enforced boundary. Isolation is never absolute. It is always a tradeoff between performance, complexity, and trust assumptions.

This does not make containers insecure; it simply means their isolation model differs fundamentally from hardware-enforced virtualization.

Around the same time, I was searching for a strong technical theme to shape future master’s research. The thesis I landed on is simple: containers are excellent, but they are not always the best isolation primitive. I documented that direction in the GoShip design doc.

I’ve loved containers since before Docker existed. I experimented with LXC early on, and when Docker emerged I was giving talks and organizing meetups to spread the word — not to mention my time as a maintainer of tsuru. This isn’t a rejection of containers. It’s a recognition of boundaries.

For specific security, multi-tenant, and confidential workload scenarios, a VM-first model can be the right starting point.

I also have a specific interest in confidential computing and VM-scoped AI workloads. When inference runs on shared infrastructure, both the data and the model weights need protection. Confidential VMs change the trust model in a way that aligns naturally with a VM-scoped control plane: memory is encrypted, attestation precedes secret injection, and the host is no longer implicitly trusted.

I also spent years working in telecom and VoIP environments, where predictable latency and tight hardware integration matter. That industry has traditionally favored VM-based deployments. Container-native approaches are improving, but they introduce different tradeoffs.

So I started building a small control plane that treats the VM as the primary unit of isolation — built in tiny, auditable increments where every step is runnable and explainable.


The GoShip Thesis

GoShip is built around one core invariant:

A project owns one or more virtual machines. All workloads of that project execute inside those VMs.

The model is intentionally simple:

  • Project = isolation boundary
  • One VM per project per node (a project that spans several nodes gets one VM on each)
  • Apps run only inside project VMs
  • Control plane never executes workloads directly

This model optimizes for explicit trust boundaries, debuggability, upstream alignment, and clear failure domains.
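To make the invariant concrete, here is a minimal sketch of how the "one VM per project per node" rule could be enforced in Go. The names `VM`, `Cluster`, `Place`, and `ErrVMExists` are hypothetical and exist only for illustration; they are not taken from the GoShip codebase.

```go
package main

import (
	"errors"
	"fmt"
)

// VM is one virtual machine owned by a project on a specific node.
// (Hypothetical type, for illustration only.)
type VM struct {
	Project string
	Node    string
}

// Cluster tracks placements, keyed by project name and then node name.
type Cluster struct {
	vms map[string]map[string]VM
}

// ErrVMExists is returned when a project already has a VM on the node.
var ErrVMExists = errors.New("project already has a VM on this node")

// NewCluster returns an empty placement table.
func NewCluster() *Cluster {
	return &Cluster{vms: make(map[string]map[string]VM)}
}

// Place registers a VM for project on node and rejects a second VM
// for the same project on the same node.
func (c *Cluster) Place(project, node string) (VM, error) {
	if _, ok := c.vms[project][node]; ok {
		return VM{}, ErrVMExists
	}
	if c.vms[project] == nil {
		c.vms[project] = make(map[string]VM)
	}
	vm := VM{Project: project, Node: node}
	c.vms[project][node] = vm
	return vm, nil
}

func main() {
	c := NewCluster()
	_, err := c.Place("alpha", "node-1")
	fmt.Println("first VM on node-1:", err)
	_, err = c.Place("alpha", "node-2")
	fmt.Println("first VM on node-2:", err)
	_, err = c.Place("alpha", "node-1")
	fmt.Println("second VM on node-1:", err)
}
```

Keeping the rule in one small function means every code path that creates a VM goes through the same check, which fits the "explicit trust boundaries" goal.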

GoShip does not attempt to hide virtualization. It exposes CPU topology, memory configuration, device modeling, and confidential VM flags, because hiding these details makes debugging harder, not easier.

GoShip is primarily aimed at people who want to understand and build VM-centric systems, not hide them behind layers of abstraction.


The Linux Virtualization Stack

Linux virtualization is not a single component. It is a layered system composed of hardware CPU extensions, a kernel module, a userspace emulator, and a management layer. Each layer has a strict responsibility boundary.

Understanding those boundaries is essential if you want to build a control plane that is correct.

Hardware Virtualization

Modern CPUs provide virtualization extensions:

  • Intel VT-x
  • AMD-V (SVM)

These extensions introduce a dedicated guest execution mode, instructions that enter it (VMLAUNCH and VMRESUME on Intel, VMRUN on AMD), hardware-triggered exits back to the host, and hardware-managed control structures (VMCS on Intel, VMCB on AMD).
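On an x86 Linux host, the kernel advertises these extensions as CPU flags in /proc/cpuinfo: `vmx` for Intel VT-x and `svm` for AMD-V. The sketch below looks for either flag. The function name `virtExtension` is my own, and the parsing assumes the x86 layout of /proc/cpuinfo (other architectures use different field names).

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// virtExtension scans /proc/cpuinfo-style text and returns the first
// hardware virtualization flag it finds: "vmx" (Intel VT-x),
// "svm" (AMD-V), or "" if neither is advertised.
func virtExtension(cpuinfo string) string {
	for _, line := range strings.Split(cpuinfo, "\n") {
		key, value, ok := strings.Cut(line, ":")
		if !ok || strings.TrimSpace(key) != "flags" {
			continue
		}
		for _, flag := range strings.Fields(value) {
			if flag == "vmx" || flag == "svm" {
				return flag
			}
		}
	}
	return ""
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read /proc/cpuinfo:", err)
		return
	}
	switch virtExtension(string(data)) {
	case "vmx":
		fmt.Println("Intel VT-x available")
	case "svm":
		fmt.Println("AMD-V available")
	default:
		fmt.Println("no hardware virtualization flag found")
	}
}
```

A missing flag does not always mean the CPU lacks the feature; it can also be disabled in firmware or hidden by an outer hypervisor.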

At runtime, the cycle is straightforward:

  1. The guest executes instructions normally.
  2. A privileged event triggers a VMEXIT.
  3. Control transfers to the host.
  4. The hypervisor handles the exit reason.
  5. Execution resumes inside the guest.

The VMEXIT boundary is one of the most important concepts in virtualization. It is where isolation, performance, and security converge. Certain privileged instructions, device accesses, and page faults can trigger VMEXIT events. The cost of that transition determines how fast the VM feels. The correctness of the hypervisor’s handling determines whether isolation holds.
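The enter, exit, handle, resume cycle can be sketched as a dispatch loop. This is a simulation, not real KVM code: `fakeVCPU` replays a scripted list of exits in place of a real vCPU, and `runLoop` is a hypothetical name. The three constants mirror the values of KVM_EXIT_IO, KVM_EXIT_HLT, and KVM_EXIT_MMIO from the Linux KVM API.

```go
package main

import "fmt"

// ExitReason mirrors a few KVM_EXIT_* values from the Linux KVM API.
type ExitReason int

const (
	ExitIO   ExitReason = 2 // KVM_EXIT_IO: guest accessed an I/O port
	ExitHLT  ExitReason = 5 // KVM_EXIT_HLT: guest executed HLT
	ExitMMIO ExitReason = 6 // KVM_EXIT_MMIO: guest accessed emulated device memory
)

// fakeVCPU replays a scripted list of exits. It stands in for a real vCPU.
type fakeVCPU struct {
	script []ExitReason
	next   int
}

// Run "enters the guest" and returns the reason for the next exit.
func (v *fakeVCPU) Run() ExitReason {
	if v.next >= len(v.script) {
		return ExitHLT // script exhausted: treat as a halt so the loop ends
	}
	r := v.script[v.next]
	v.next++
	return r
}

// runLoop resumes the guest until it halts and counts exits per reason.
func runLoop(v *fakeVCPU) map[ExitReason]int {
	handled := make(map[ExitReason]int)
	for {
		reason := v.Run() // guest runs, an event forces an exit, control returns here
		handled[reason]++
		switch reason { // the hypervisor inspects the exit reason
		case ExitIO, ExitMMIO:
			// A real hypervisor would emulate the device access here,
			// then fall through to the next iteration to resume the guest.
		case ExitHLT:
			return handled
		}
	}
}

func main() {
	vcpu := &fakeVCPU{script: []ExitReason{ExitIO, ExitMMIO, ExitIO, ExitHLT}}
	counts := runLoop(vcpu)
	fmt.Println("I/O exits:", counts[ExitIO])
	fmt.Println("MMIO exits:", counts[ExitMMIO])
	fmt.Println("halt exits:", counts[ExitHLT])
}
```

Every iteration of this loop is a round trip across the boundary, which is why workloads that exit frequently feel slow.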

Without hardware support, virtualization relies on binary translation or paravirtualization, which increases complexity and reduces performance predictability.

KVM: Linux as a Hypervisor

KVM (Kernel-based Virtual Machine) adds hypervisor capabilities to the Linux kernel.

KVM:

  • Exposes /dev/kvm
  • Maps guest memory that userspace registers
  • Creates virtual CPUs
  • Handles VMEXIT events
  • Interfaces directly with CPU virtualization extensions

KVM does not emulate disks, NICs, or other peripherals. It provides the execution engine.

From userspace, creating a VM means opening /dev/kvm, creating a VM instance, registering guest memory, creating vCPUs, and entering guest execution mode. But something still needs to emulate disks, NICs, PCI devices, firmware, and everything else the guest OS expects to find.
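The first of those steps, opening /dev/kvm and talking to it with an ioctl, fits in a few lines of Go. This is a Linux-only sketch that stops at querying the API version; it does not create a VM. The helper name `kvmAPIVersion` is my own. The request number is KVM_GET_API_VERSION, defined in the kernel headers as `_IO(KVMIO, 0x00)` with `KVMIO = 0xAE`.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// kvmGetAPIVersion is KVM_GET_API_VERSION: _IO(KVMIO, 0x00), KVMIO = 0xAE.
const kvmGetAPIVersion = 0xAE00

// kvmAPIVersion opens the KVM device node at path and asks the kernel
// for the KVM API version.
func kvmAPIVersion(path string) (int, error) {
	f, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		return 0, err
	}
	defer f.Close()

	v, _, errno := syscall.Syscall(syscall.SYS_IOCTL, f.Fd(), kvmGetAPIVersion, 0)
	if errno != 0 {
		return 0, errno
	}
	return int(v), nil
}

func main() {
	v, err := kvmAPIVersion("/dev/kvm")
	if err != nil {
		fmt.Println("KVM not usable:", err)
		return
	}
	fmt.Println("KVM API version:", v)
}
```

The KVM API documentation tells applications to refuse to run unless this call returns 12. The later steps (creating the VM, registering memory, creating vCPUs) use further ioctls on the file descriptors this first one leads to.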

That is where QEMU enters.

QEMU: Device Emulation in Userspace

QEMU runs as a regular Linux process. In a KVM-backed VM, QEMU allocates memory, emulates devices, and issues ioctls to /dev/kvm. KVM executes the guest code on actual hardware.

From the host’s perspective, a VM is just a process:

qemu-system-x86_64
├── uses /dev/kvm
├── maps guest memory
├── emulates devices
└── runs vCPUs

This design has important implications. VMs can be inspected with normal Linux tools — ps, top, strace. cgroups apply naturally. Resource accounting is explicit: the QEMU process owns the VM’s memory and CPU time. Security boundaries are enforced by hardware (VMEXIT) plus kernel (process isolation), not just by convention.

GoShip intentionally does not bypass this model. A VM is a process. You can observe it, limit it, and kill it with standard tooling. That is a feature, not a limitation.
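Because a VM is a process, finding running VMs needs nothing more than /proc. The sketch below lists PIDs whose command name starts with `qemu-system-`. The function names are hypothetical, and the prefix match is an assumption about how QEMU was launched: the kernel truncates the name in /proc/<pid>/comm, and a management layer can be configured to rename the process.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// isQEMUComm reports whether a /proc/<pid>/comm value looks like a
// QEMU system emulator.
func isQEMUComm(comm string) bool {
	return strings.HasPrefix(strings.TrimSpace(comm), "qemu-system-")
}

// findQEMU returns the PIDs under procRoot whose comm looks like QEMU.
func findQEMU(procRoot string) ([]int, error) {
	entries, err := os.ReadDir(procRoot)
	if err != nil {
		return nil, err
	}
	var pids []int
	for _, e := range entries {
		pid, err := strconv.Atoi(e.Name())
		if err != nil {
			continue // not a process directory
		}
		comm, err := os.ReadFile(filepath.Join(procRoot, e.Name(), "comm"))
		if err != nil {
			continue // process exited or is not readable
		}
		if isQEMUComm(string(comm)) {
			pids = append(pids, pid)
		}
	}
	return pids, nil
}

func main() {
	pids, err := findQEMU("/proc")
	if err != nil {
		fmt.Println("cannot scan /proc:", err)
		return
	}
	fmt.Println("QEMU processes:", pids)
}
```

From a PID, the usual per-process files apply: /proc/<pid>/status for memory, /proc/<pid>/cgroup for cgroup membership, and so on.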

libvirt: Declarative VM Management

Managing QEMU directly is fragile. The qemu-system-x86_64 command accepts hundreds of flags. One wrong flag and you get a VM that boots but cannot reach the network, or uses the wrong firmware, or maps memory incorrectly.

libvirt provides:

  • Stable APIs across QEMU versions
  • Persistent domain definitions
  • Storage pool management
  • Network abstraction
  • XML-based configuration

Instead of invoking QEMU manually, you define a VM declaratively:

<domain type='kvm'>
  <name>project-alpha</name>
  <memory unit='MiB'>2048</memory>
  <vcpu placement='static'>4</vcpu>

  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='1'/>
  </cpu>

  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/goship/images/alpha.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>

libvirt translates this into the corresponding QEMU command line. It keeps definition, persistence, execution, and introspection as separate concerns: you can define a VM, inspect its XML, start it, stop it, and redefine it, all through a consistent API. GoShip generates this XML programmatically from Go templates.

This separation is critical for building an auditable control plane. When a VM behaves unexpectedly, you inspect the XML. The XML is the contract.
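As a rough sketch of what template-based generation can look like, the example below renders the domain definition shown earlier with `text/template`, then parses the result back with `encoding/xml` as a cheap well-formedness check. The `DomainSpec` type and both function names are hypothetical, not GoShip's actual code. Note that `text/template` does not escape XML, so values containing `<`, `&`, or quotes would need escaping before use.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"text/template"
)

// DomainSpec holds the values substituted into the template.
// (Hypothetical type, for illustration only.)
type DomainSpec struct {
	Name      string
	MemoryMiB int
	VCPUs     int
	DiskPath  string
}

var domainTmpl = template.Must(template.New("domain").Parse(`<domain type='kvm'>
  <name>{{.Name}}</name>
  <memory unit='MiB'>{{.MemoryMiB}}</memory>
  <vcpu placement='static'>{{.VCPUs}}</vcpu>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='{{.VCPUs}}' threads='1'/>
  </cpu>
  <devices>
    <disk type='file' device='disk'>
      <source file='{{.DiskPath}}'/>
      <target dev='vda' bus='virtio'/>
    </disk>
  </devices>
</domain>
`))

// renderDomain fills the template with spec and returns the XML text.
func renderDomain(spec DomainSpec) (string, error) {
	var buf bytes.Buffer
	if err := domainTmpl.Execute(&buf, spec); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// domainName parses the XML back and returns the <name> element.
// A parse error here means the template produced malformed XML.
func domainName(doc string) (string, error) {
	var d struct {
		Name string `xml:"name"`
	}
	if err := xml.Unmarshal([]byte(doc), &d); err != nil {
		return "", err
	}
	return d.Name, nil
}

func main() {
	doc, err := renderDomain(DomainSpec{
		Name:      "project-alpha",
		MemoryMiB: 2048,
		VCPUs:     4,
		DiskPath:  "/var/lib/goship/images/alpha.qcow2",
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(doc)
}
```

Rendering to a string before handing it to libvirt keeps the contract inspectable: the exact XML can be logged, diffed, or stored alongside the VM.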

The Full Stack

Putting everything together:

GoShip Control Plane (Go)
        │
        ▼
    Node Agent
        │
        ▼
libvirt (userspace daemon)
        │
        ▼
  QEMU (userspace process)
        │
        ▼
   KVM (kernel module)
        │
        ▼
CPU virtualization extensions

Each layer has a strict boundary. Understanding these separations is essential for debugging. When something fails, the layer boundaries tell you where to look. A networking issue inside the VM? Check QEMU’s device emulation and the libvirt network XML. A performance problem with vCPU scheduling? Check KVM’s handling of VMEXITs. A permissions error on disk access? Check libvirt’s security driver configuration.


Other Key Technologies

The virtualization stack handles VM lifecycle. But two more technologies are essential before a VM becomes useful: provisioning its identity on first boot, and giving the host a way to talk to it.

Cloud-init: First-Boot Identity Injection

Cloud-init is widely adopted across cloud and virtualization ecosystems. The idea is simple:

  1. The VM image ships with cloud-init pre-installed.
  2. On first boot, cloud-init looks for a data source containing configuration.
  3. It applies that configuration: sets the hostname, injects SSH keys, creates users, runs scripts.
  4. On subsequent boots, it checks the instance ID. If it has not changed, it skips re-provisioning.

For local VMs without a metadata API, the NoCloud data source is the right choice. NoCloud works by attaching a small ISO image to the VM as a CDROM. Cloud-init looks for a disk with the volume label cidata and reads two YAML files from it: meta-data (machine identity) and user-data (configuration directives).

The beauty of this approach is that the VM image stays generic. All customization happens through external data injected at creation time. One base image serves every VM. The provisioning data is a sidecar — attached at creation, consumed on first boot, ignored afterward.
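Here is a minimal sketch of generating the two NoCloud files in Go. The function name `noCloudSeed` and the example values are hypothetical. `instance-id` and `local-hostname` are standard meta-data keys, and `hostname` and `ssh_authorized_keys` are standard cloud-config keys. Packing the files into an ISO labeled `cidata` is left to an external tool and is not shown.

```go
package main

import "fmt"

// noCloudSeed builds the contents of the two files the NoCloud data
// source reads: meta-data (machine identity) and user-data
// (configuration directives in cloud-config format).
func noCloudSeed(instanceID, hostname, sshKey string) (metaData, userData string) {
	metaData = fmt.Sprintf("instance-id: %s\nlocal-hostname: %s\n", instanceID, hostname)
	userData = fmt.Sprintf(
		"#cloud-config\nhostname: %s\nssh_authorized_keys:\n  - %s\n",
		hostname, sshKey,
	)
	return metaData, userData
}

func main() {
	// Example values only; a real key and instance ID would come from
	// the control plane's state.
	meta, user := noCloudSeed("alpha-001", "alpha", "ssh-ed25519 AAAAEXAMPLE user@host")
	fmt.Print("--- meta-data ---\n", meta)
	fmt.Print("--- user-data ---\n", user)
}
```

Changing `instance-id` is what makes cloud-init treat the VM as a new instance and provision it again, so the control plane should keep it stable for the lifetime of a VM.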

Virtio-serial: Talking to a VM Without a Network

GoShip needs a structured communication channel between the host control plane and the in-VM agent. The channel must work without depending on the VM’s network stack — if the network is misconfigured or not yet up, the control plane still needs to reach the agent.

After evaluating SSH (too heavy — requires key management, sshd, network reachability), QEMU Guest Agent (fixed command set, not extensible), virtio-vsock (requires kernel module support, CID management), and 9P/virtiofs (awkward for request/response), I chose virtio-serial.

The architecture is simple:

┌────────────────────────┐        ┌────────────────────────┐
│          HOST          │        │           VM           │
│                        │        │                        │
│  GoShip Control Plane  │        │   GoShip guest agent   │
│           │            │        │           │            │
│  net.Dial("unix",      │        │  open("/dev/virtio-    │
│    "goship.sock")      │        │    ports/goship.0")    │
│           │            │        │           │            │
└───────────┼────────────┘        └───────────┼────────────┘
            │      virtio-serial channel      │
            └─────────────────────────────────┘
                 (QEMU bridges both sides)

On the host side, libvirt creates a Unix domain socket. On the guest side, the virtio-serial channel appears as a character device. QEMU bridges the two — bytes written to the socket appear on the device, and vice versa. Full-duplex, low overhead, custom protocol, libvirt-native. It hit the sweet spot for a control channel.
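To illustrate the host side, here is a sketch of a ping/pong exchange using newline-delimited JSON. The wire format (`{"type":"ping"}`) and all names are assumptions made for this example, not GoShip's actual protocol. The demo uses `net.Pipe` in place of the real channel so it runs anywhere; against a real VM, the connection would come from `net.Dial("unix", ...)` on the socket libvirt creates.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"net"
)

// Message is a hypothetical newline-delimited JSON envelope.
type Message struct {
	Type string `json:"type"`
}

// ping sends {"type":"ping"} over conn and returns the reply's type.
func ping(conn io.ReadWriter) (string, error) {
	if err := json.NewEncoder(conn).Encode(Message{Type: "ping"}); err != nil {
		return "", err
	}
	line, err := bufio.NewReader(conn).ReadBytes('\n')
	if err != nil {
		return "", err
	}
	var reply Message
	if err := json.Unmarshal(line, &reply); err != nil {
		return "", err
	}
	return reply.Type, nil
}

// fakeAgent answers a single ping with a pong. It stands in for the
// in-VM agent reading from the virtio-serial character device.
func fakeAgent(conn io.ReadWriteCloser) {
	defer conn.Close()
	var req Message
	if err := json.NewDecoder(conn).Decode(&req); err != nil || req.Type != "ping" {
		return
	}
	_ = json.NewEncoder(conn).Encode(Message{Type: "pong"})
}

// loopbackPing runs ping against fakeAgent over an in-memory pipe.
func loopbackPing() (string, error) {
	host, guest := net.Pipe()
	defer host.Close()
	go fakeAgent(guest)
	return ping(host)
}

func main() {
	reply, err := loopbackPing()
	if err != nil {
		fmt.Println("ping failed:", err)
		return
	}
	fmt.Println("agent replied:", reply)
}
```

Because `ping` only depends on `io.ReadWriter`, the same function works over the Unix socket on the host and over an in-memory pipe in tests.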


Why Go

I picked Go for very practical reasons.

System programming ergonomics. Reading host state, handling process lifecycle, and wiring external APIs feels natural in Go. The standard library covers most of what a control plane needs.

Great fit for control planes. context, concurrency primitives, error handling, and tooling all help with operational code. Go’s tooling and deployment model make it well-suited for building small, auditable infrastructure binaries.

Small, explicit code paths. I can keep the code boring and direct, which is exactly what I want for infrastructure. No magic, no hidden control flow.

I love Go. I have used it for years, so I can move quickly without fighting the language.

Upstream-aligned by design. GoShip uses KVM, QEMU, and libvirt directly and respects their boundaries, rather than hiding them behind a hand-wavy abstraction.


The Build Strategy

From day one, GoShip followed a strict constraint:

Every step must be independently runnable and explainable.

This looks slower at first, and honestly, it is. But I have learned that “fast” infrastructure code is often just deferred confusion. The baby-step approach pays off when something breaks and the architecture is still understandable.

Here are the thirteen steps that make up v0:

Step  Concept             What You Build
1     libvirt connection  CLI skeleton, goshipctl version with libvirt info
2     Host capabilities   CPU topology, hugepages, KVM, confidential computing
3     Domain XML          VM blueprint generation, goshipctl generate-xml
4     VM lifecycle        Create/destroy VMs, CoW disk images
5     Cloud-init          VM provisioning via ISO
6     Virtio-serial       Host-VM JSON communication channel
7     GoShip Init         Per-VM guest provisioning + minimal agent ping/pong
8     State store         Domain types, JSON persistence
9     Runtime interface   Abstraction layer wiring Steps 1-6
10    Project CLI         project create/list/delete/info
11    Docker in VM        Container management via SDK
12    App CLI             app deploy/list/stop/remove/logs/exec
13    Process mode        Direct process supervision, binary upload, auto-restart

What Comes Next

In Part 2, I walk through all thirteen steps as a deep technical journal — from connecting to libvirt and parsing host capabilities, through VM lifecycle and cloud-init, across the virtio-serial boundary into GoShip Init, and finally up to deploying containers and processes inside VMs.

Every step includes the code, the design decisions, and the debugging stories.