October 2025
What if a kernel was just an event loop? No scheduler, no memory protection, no blocking calls - just an async runtime that happens to run on bare metal.
There's kernels, there's microkernels, there's nanokernels, there's unikernels, and then there's what I'm building, which is so tiny, so under-featured, so unsafe that I'm not even sure it qualifies as a kernel. But it does allow for logging, timers, random number generation, block storage, UDP/IP networking, and running freestanding C code targeting its minimal interface. So, I'm going to call it a "picokernel" and I'd like to tell you about it.
Why build another minimal kernel? There are learning kernels out there - xv6, OS/161, and others - but they're typically tied to a single architecture or focused on traditional Unix-style abstractions. There are unikernels like MirageOS, but that one is written in OCaml. Embedded RTOSes are portable but often proprietary or baroque, and they tend to lean into threads and the same Unix-style abstractions.
But, forget the reasons, I just thought it'd be fun and cute, and I was right!
I wanted to see what a minimal, portable kernel looks like. Minimal as in does the bare minimum for the functionality I want. Portable as in works across x86, ARM, RISC-V, 32-bit and 64-bit (and has a hope of running on 32-bit microcontrollers as well as hosted on other operating systems).
The basic structure is an async runtime. Everything is cooperative, event-driven, and non-blocking. There's no preemption, no scheduling, no memory protection. It's closer to embedded firmware than to traditional kernels. This simplicity makes the entire system easy to reason about: requests go in, completions come out, state machines tick forward.
Think of it as a portable async runtime that happens to run on bare metal, rather than a traditional kernel that has async I/O.
What does it do? Not much, but not nothing: logging, timers, random number generation, block storage, UDP/IP networking, and running freestanding C user code against its minimal interface.
What does it not do? Preemption, scheduling, memory protection, blocking calls - none of that.
At the moment, it's a work-in-progress. The structure is there but I'm still working on the finer points of racy interrupt handling and synchronization barriers around memory-mapped IO.
What kind of "machines" does it run on? Currently it runs on ARM, x86, RISC-V, both 32-bit and 64-bit, under QEMU, with virtio (PCI and MMIO) devices.
Here's the QEMU command for arm64, attaching virtio MMIO devices for an RNG, a block device (i.e. a hard drive), and a network device. The commands for the other architectures are similar.
qemu-system-aarch64 \
-machine virt -cpu cortex-a57 \
-m 128M -smp 1 \
-nographic -nodefaults -no-user-config -no-reboot \
-kernel build/arm64/kernel.elf \
-serial stdio \
-device virtio-rng-device \
-device virtio-blk-device,drive=hd0 \
-drive file=/tmp/drive.img,if=none,id=hd0,format=raw,cache=none \
-device virtio-net-device,netdev=net0 \
-netdev user,id=net0,hostfwd=udp::8888-10.0.2.15:8080
Portability hinges on a clean abstraction boundary. Each platform implementation (arm64, x86, etc.) must provide a minimal set of capabilities to the platform-agnostic kernel:
a boot path that lands in kmain() (the portable kernel entry point), platform_uart_write() for log output, and platform_wfi() to wait for an interrupt (or a timeout). That's it. No MMU setup, no memory allocator, no thread scheduler. The kernel lives in a single address space with no protection boundaries.
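Concretely, the contract might look something like this sketch - the exact signatures are my guesses, inferred from the event loop shown later, not the repo's verbatim API:

#include <stddef.h>
#include <stdint.h>

typedef struct platform platform_t;  // opaque per-platform state (assumed name)

// Provided by the portable kernel; the platform boots the machine, then calls:
void kmain(void *platform_boot_ctx);

// Provided by each platform (arm64, x86, ...); called by the portable kernel:
void platform_uart_write(const char *buf, size_t len);   // log/console output
uint64_t platform_wfi(platform_t *platform, uint64_t timeout_ms);
// ^ sleep until an interrupt fires or timeout_ms elapses; returns the current
//   time in milliseconds, which the event loop feeds back into its next tick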
If you want to try it yourself:
git clone https://github.com/rsepassi/vmos
cd vmos
make run PLATFORM=arm64 PORT=8888
You should see the kernel boot, initialize virtio devices, and run whatever's
in kmain_usermain() (kernel/user.c). The default user code requests random
numbers and prints them, runs a read/write test on a file-backed block device,
uses ARP to communicate to the QEMU gateway, and starts a UDP echo server.
Try sending a UDP packet to the echo server (in a separate terminal):
echo "hello" | nc -u localhost 8888
You should see your message echoed back in both the nc output and the kernel logs.
PLATFORM can be one of rv32, rv64, arm32, arm64, x32, x64.
To use PCI devices instead of MMIO:
make run PLATFORM=arm64 USE_PCI=1
All you need is make, clang, and qemu-system-*. clang handles the
cross-compilation.
Here's what it looks like to request some random numbers (timers, storage read/write/flush, and UDP send/recv are similar):
typedef struct {
  krng_req_t rng_req;
  uint8_t random_buf[32];
} kuser_t;

static kuser_t g_user;

static void on_random_ready(kwork_t *work);

void kmain_usermain(kernel_t* k) {
  kuser_t* ctx = &g_user;

  // Submission
  // 1. Configure the work item with a callback
  kwork_init(&ctx->rng_req.work, KWORK_OP_RNG_READ, ctx, on_random_ready, 0);

  // 2. Configure the request
  ctx->rng_req.buffer = ctx->random_buf;
  ctx->rng_req.length = 32;

  // 3. Submit to queue (non-blocking!)
  KASSERT(ksubmit(k, &ctx->rng_req.work) == KERR_OK);
}

// 4. Callback fires when ready
static void on_random_ready(kwork_t *work) {
  kuser_t *ctx = work->ctx;
  krng_req_t *req = KCONTAINER_OF(work, krng_req_t, work);

  KASSERT(work->result == KERR_OK);
  KASSERT(req->completed == 32);

  printk("Random bytes: ");
  for (size_t i = 0; i < 32; i++) {
    printk_hex8(ctx->random_buf[i]);
  }
  printk("\n");
}
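Here's a sketch of what a one-shot timer could look like under the same pattern. It's illustrative only: ktimer_req_t, KWORK_OP_TIMER, and timeout_ms are my guesses at names, not necessarily the actual API.

// Hypothetical one-shot timer, same submit/callback shape as the RNG request.
typedef struct {
  ktimer_req_t timer_req;   // assumed request type
} kuser_timer_t;

static kuser_timer_t g_timer;

static void on_timer_fired(kwork_t *work) {
  KASSERT(work->result == KERR_OK);
  printk("timer fired\n");
}

static void start_timer(kernel_t *k) {
  kuser_timer_t *ctx = &g_timer;
  // 1. Work item + callback, 2. request parameters, 3. non-blocking submit
  kwork_init(&ctx->timer_req.work, KWORK_OP_TIMER, ctx, on_timer_fired, 0);
  ctx->timer_req.timeout_ms = 1000;   // fire once, a second from now
  KASSERT(ksubmit(k, &ctx->timer_req.work) == KERR_OK);
}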
So the entire system is based on async completions, entirely cooperatively scheduled, and entirely in a single address space: no synchronous calls, no context switches, no preemption.
Where you see completions, there must be an event loop. There sure is. Here's "kmain", the kernel entry point:
static kernel_t g_kernel;

void kmain(void *platform_boot_ctx) {
  // Banner
  printk("\n\n=== KMAIN ===\n\n");

  // Initialize
  kernel_t *k = &g_kernel;
  kmain_init(k, platform_boot_ctx);

  // User kickoff
  kmain_usermain(k);

  // Event loop
  while (1) {
    // tick: Process completions, expire timers, run callbacks
    kmain_tick(k, k->current_time_ms);

    // next_delay: When's the next timer?
    uint64_t timeout = KMIN(kmain_next_delay(k), 2000);

    // wfi: Wait for interrupt (or timeout)
    k->current_time_ms = platform_wfi(&k->platform, timeout);
  }
}
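For intuition, a tick conceptually drains whatever the drivers and interrupt handlers have marked complete, expires due timers, and runs the user callbacks. The following is an illustrative sketch of that pattern, not the repo's actual internals - completion_queue_pop(), expire_timers(), and the callback field are all made up:

// Illustrative only - not the real kmain_tick().
static void example_tick(kernel_t *k, uint64_t now_ms) {
  kwork_t *work;
  // Drain work items that drivers/ISRs have completed since the last tick...
  while ((work = completion_queue_pop(k)) != NULL) {
    work->callback(work);             // e.g. on_random_ready()
  }
  // ...then fire any timers whose deadlines have passed.
  expire_timers(k, now_ms);
}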
How about some line counts?
Makefile     208
kernel/      1.5KLOC
driver/      1.5KLOC
platform/
  arm32/     1.5KLOC
  arm64/     1.5KLOC
  rv32/      1.5KLOC
  rv64/      1.5KLOC
  x32/       3KLOC
  x64/       3KLOC
It's all freestanding C11 code, no external source dependencies, and the tooling is make, sh, clang, qemu.
(In the repo, you'll see vendor/monocypher, which provides ChaCha20 for the
kernel's CSPRNG, seeded by hardware randomness, but it's not core to the
kernel).
The (stripped) kernel.elf binaries weigh in at:
arm32 128K
arm64 128K
rv32 64K
rv64 65K
x32 56K
x64 60K
Debugging is printk-driven, but crashes are straightforward: llvm-objdump on
the ELF shows exactly where the PC landed. The workflow is tight: edit, make run, see output, repeat. Most bugs are caught by assertions or manifest as
immediate crashes rather than subtle corruption, thanks to the single address
space and (current) lack of concurrency.
What's the security model? There isn't one.
No memory protection, no privilege levels, no isolation boundaries. Everything runs in a single address space with full hardware access. A bug in user code can corrupt kernel state. A stray pointer can overwrite interrupt handlers. There's no defense against malicious or buggy code.
VMOS is suitable for:
VMOS is not suitable for:
The lack of kernel-enforced safety doesn't mean applications must be unsafe. Several approaches could provide memory safety and concurrency safety at the application level:
Language-level safety: Compile from memory-safe languages (Rust, Pony) or checked C subsets to VMOS's C API. The application becomes safe even though the kernel isn't.
WebAssembly: Run all application code in a Wasm runtime. The kernel becomes just a capabilities provider to sandboxed code.
Verified code: Formally verify critical components, proving memory safety without runtime overhead.
These approaches let you choose your safety/complexity tradeoff rather than forcing one on everyone.
There's something fundamental about an async runtime that makes it feel like the "right" abstraction for this level of system programming. It's not just about performance or simplicity - it's about alignment with how hardware actually works.
Hardware doesn't manage itself via "threads". It has state and events. A disk controller doesn't "block" waiting for a sector to be read - it starts the operation, goes idle, and fires an interrupt when done. A network card doesn't sit in a loop polling for packets - it DMAs them into memory and signals completion. Even the CPU itself - when there's no work, it halts until an interrupt arrives.
Traditional kernels paper over this event-driven reality with blocking abstractions. They create threads, schedulers, and context switches to present a synchronous programming model. Our little picokernel just gives it to you straight: reality is asynchronous.
An async runtime embraces the hardware's native behavior: submit work, go idle, wake on interrupt, process completions. No context to save because we never left. No scheduler because we're cooperative. No lock contention because we're single-threaded (for now; and my multi-core plan is lockless). The result is code that maps almost directly to what the hardware is doing.
This reminds me of Leslie Lamport's paper Computation and State Machines, where he argues that state machines are a universal framework for representing and reasoning about computation. Async runtimes are a fitting execution plane for state machines. Messages/interrupts/completions come in, we process them (along with whatever timers expired) and update internal state, we send new messages/requests out, and then wait for the next round. This isn't just a programming pattern - it's a reflection of the hardware's fundamental operation.
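In code, that ends up as user state machines whose transitions are driven entirely by completions. Here's a sketch using the same RNG API as above - the only new assumptions are that a request can be resubmitted from its own callback and that we stash the kernel pointer in the context:

// Sketch: gather four batches of random bytes, one completion at a time.
// The "state" is just a counter; each completion updates it and either
// submits the next request or stops.
typedef struct {
  kernel_t  *k;             // stashed so the callback can resubmit
  int        batches_left;
  krng_req_t rng_req;
  uint8_t    buf[32];
} rng_loop_t;

static rng_loop_t g_loop;

static void rng_loop_step(kwork_t *work) {
  rng_loop_t *sm = work->ctx;
  KASSERT(work->result == KERR_OK);
  if (--sm->batches_left == 0) {
    printk("rng loop done\n");        // terminal state: nothing left to submit
    return;
  }
  // Not done: submit again (assumes requests are reusable); we'll be
  // called back here on the next completion.
  KASSERT(ksubmit(sm->k, &sm->rng_req.work) == KERR_OK);
}

static void rng_loop_start(kernel_t *k) {
  g_loop.k = k;
  g_loop.batches_left = 4;
  kwork_init(&g_loop.rng_req.work, KWORK_OP_RNG_READ, &g_loop, rng_loop_step, 0);
  g_loop.rng_req.buffer = g_loop.buf;
  g_loop.rng_req.length = 32;
  KASSERT(ksubmit(k, &g_loop.rng_req.work) == KERR_OK);
}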
The clarity this brings to reasoning about the system is remarkable. Want to know what the kernel does? Look at the event loop. Want to trace a network packet? Follow it from interrupt -> completion queue -> callback. No hidden preemption points, no mysterious wakeups, no "it depends on the scheduler."
It'd be great to get this running on microcontrollers too, where I think it's a great fit. It already supports 32-bit, it does no dynamic allocation, and it doesn't depend on any standard library (only the freestanding C headers).
I'd like for this same runtime and user API to work in a "hosted" form as well. That is, the same user code works on other operating systems too - iOS, Android, Linux, Windows, MacOS, BSD. (Maybe even the browser.) The platform expectations are quite minimal: log output, a clock, and a way to sleep until I/O completes or a timeout expires.
That's all quite doable, via kqueue (iOS, MacOS, BSD),
epoll (Linux, Android), io_uring (Linux), IOCP (Windows).
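For instance, on Linux the platform_wfi() piece maps almost directly onto epoll_wait(). A minimal sketch, assuming the hosted platform struct just carries an epoll file descriptor (my assumption, not the repo's layout):

#include <stdint.h>
#include <sys/epoll.h>
#include <time.h>

typedef struct { int epoll_fd; } platform_t;

static uint64_t now_ms(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

uint64_t platform_wfi(platform_t *p, uint64_t timeout_ms) {
  struct epoll_event events[16];
  // epoll_wait() is the hosted stand-in for "wait for interrupt": sleep until
  // an event source (tap fd, timerfd, ...) is ready or the timeout expires.
  // The event loop caps the timeout (e.g. at 2000 ms), so the int cast is safe.
  int n = epoll_wait(p->epoll_fd, events, 16, (int)timeout_ms);
  (void)n;  // a real implementation would hand the ready events to the drivers
  return now_ms();
}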
So the dream would be a single, totally event-driven codebase that works across bare metal and all these operating systems.
For cloud deployment, the goal is to test on providers that allow custom kernels. Additionally, I want to build a minimal Linux image where init is just a kvmtool launch of this kernel - that way, we go anywhere Linux goes, with the full security and isolation of a nested VM.
I've also been banging around on a new C-based state machine library that helps organize async code without language support, so I'll present that here soon.
Also, I think this runtime and maybe the state machine library would make for nice compilation targets. A C-like language with async/await, maybe something akin to Zig, could compile down to VMOS's simple async API.
Here's what's on my roadmap, though we'll see how far I decide to go - one item being QEMU with hardware acceleration (-accel hvf/kvm).

If you think this is neat and would like to chat about it, please drop me a line.