I Vibe Coded an Operating System
TL;DR
Over one weekend day, I more or less vibe coded an operating system. More precisely, I built a scripting-first OS prototype in which a Python bootloader loads a Pebble runtime, injects a Pebble shell, and then hands more and more user-visible behavior to Pebble code. There are no native binaries here in the usual sense; programs are source files, or source compiled to bytecode, and the whole system is organized around that fact.
The project lives here: https://github.com/DavidLXu/PebbleOS. Even the name PebbleOS was suggested by Codex, which feels fitting in retrospect, because the project itself grew through a loop of design, refactoring, and implementation with Codex rather than through one big up-front plan. What started as a toy interpreter wrapped in a shell gradually turned into something much more interesting: a small system where the language is no longer just running inside the environment, but increasingly defining the environment itself.
1. The Real Pivot Was Architectural
PebbleOS did not start as a serious operating-system project. The original idea was much smaller: create some files, edit them, and run a tiny language from them. In that phase, the system was essentially a toy interpreter with a shell-like wrapper. The filesystem was still simple, the language was still small, and the whole project was exploratory rather than structural. What changed was not just feature count, but where policy started to live.
Once Pebble became expressive enough to describe parts of its own environment, the center of gravity moved. Pebble stopped being just the guest language and started becoming the system language. That is the key architectural move in this project. From that point on, the interesting question was no longer “what syntax should I add next?” but “what parts of the operating system can be pushed upward into Pebble, leaving Python as the substrate rather than the place where OS policy accumulates?”
2. What Actually Happens at Boot
The current boot path is already much closer to an OS bootstrapping story than a simple REPL launcher. When I run python3 main.py, Python acts as the hidden bootloader layer. It selects a filesystem mode, mounts the host runtime tree under system/..., loads system/runtime.peb, injects the source of system/shell.peb as data, calls boot(), and only then enters the interactive shell. That detail matters because it means the visible shell is no longer primarily defined by Python control flow. Prompt text, help behavior, launcher policy, and built-in command behavior now live in Pebble-managed files.
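As a rough illustration, the boot order described above can be sketched in Python. This is not the real main.py: the `ToyRuntime` class, the `boot_sequence` function, and the placeholder file contents are all hypothetical, and only the order of the steps comes from the description.

```python
# Hypothetical sketch of the boot order; names are illustrative, not PebbleOS's real API.

class ToyRuntime:
    """Stand-in for the Pebble runtime loaded from system/runtime.peb."""

    def __init__(self, runtime_source):
        self.runtime_source = runtime_source
        self.shell_source = None
        self.booted = False

    def inject_shell(self, shell_source):
        # The shell arrives as data, not as Python control flow.
        self.shell_source = shell_source

    def boot(self):
        if self.shell_source is None:
            raise RuntimeError("shell source must be injected before boot()")
        self.booted = True


def boot_sequence(files, fs_mode="hostfs"):
    """Run the boot steps in order and return (runtime, log of steps)."""
    log = [f"select filesystem mode: {fs_mode}"]
    log.append("mount runtime tree under system/")
    runtime = ToyRuntime(files["system/runtime.peb"])
    log.append("load system/runtime.peb")
    runtime.inject_shell(files["system/shell.peb"])
    log.append("inject system/shell.peb as data")
    runtime.boot()
    log.append("call boot()")
    log.append("enter interactive shell")
    return runtime, log


# Placeholder contents; the real files hold Pebble source.
files = {
    "system/runtime.peb": "# runtime source placeholder",
    "system/shell.peb": "# shell source placeholder",
}
runtime, log = boot_sequence(files)
```

The point of the sketch is the ordering: the shell is handed over as source text before `boot()` runs, so the interactive loop only starts once the Pebble side has everything it needs.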
This is also why I think PebbleOS has a real bootstrap flavor now, even if it is not self-hosting in the compiler-toolchain sense. system/runtime.peb defines the shared runtime surface. system/shell.peb defines the interactive shell. Even the default editor experience now routes through Pebble-side code such as system/nano.peb. In other words, later Pebble programs run inside a world that earlier Pebble system code has already helped define.
3. Pebble Is a System Language, Not Just a Script Syntax
Technically, Pebble is still a small language, but it is now clearly a system-facing one. It uses Python-style indentation and currently supports loops, functions, lists, dicts, modules, try/except, raise, and first-class user-defined functions. There are built-in modules such as math, text, os, random, memory, and heap, and there are also file-based user modules imported directly from the active Pebble filesystem. That combination is important: Pebble is not just a shell macro language. It already has enough structure to express runtime helpers, command behavior, and parts of system control flow.
The execution model is also explicitly split. run FILE [ARGS...] executes Pebble source in interpreter mode, while exec FILE [ARGS...] compiles source to bytecode and runs it on the bytecode VM. Both are still present, but they are increasingly treated as compatibility launchers rather than the preferred user interface. The preferred model is now closer to COMMAND [ARGS...], with the shell choosing how to launch programs. This is one place where the scripting-first nature of PebbleOS really shows: there is still no native binary format, only source and bytecode. The system is built around artifacts that an interpreter or VM executes, not native executables.
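To make the run versus exec distinction concrete, here is a minimal sketch that evaluates the same arithmetic source two ways: once by walking the syntax tree, once by compiling to stack bytecode and executing that. It assumes Python's own `ast` parser as a stand-in for the Pebble parser, and the opcode names are invented for the example.

```python
import ast
import operator

# Illustrative only: Python's parser stands in for the Pebble parser,
# and the opcode names below are made up for this sketch.
BINARY_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}
HANDLERS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul}


def interpret(node):
    """'run' path: evaluate the syntax tree directly."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
        return BINARY_OPS[type(node.op)](interpret(node.left), interpret(node.right))
    raise ValueError("unsupported expression")


def compile_node(node, code):
    """'exec' path, step 1: flatten the tree into stack bytecode."""
    if isinstance(node, ast.Constant):
        code.append(("PUSH", node.value))
    elif isinstance(node, ast.BinOp) and type(node.op) in OPCODES:
        compile_node(node.left, code)
        compile_node(node.right, code)
        code.append((OPCODES[type(node.op)], None))
    else:
        raise ValueError("unsupported expression")
    return code


def execute(code):
    """'exec' path, step 2: run the bytecode on a small stack machine."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(HANDLERS[op](left, right))
    return stack.pop()


def run_source(text):
    return interpret(ast.parse(text, mode="eval").body)


def exec_source(text):
    return execute(compile_node(ast.parse(text, mode="eval").body, []))
```

Both launchers must agree on results; what differs is the artifact that gets executed, a tree in one case and a flat instruction list in the other.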
4. The Filesystem Is the First Real OS Boundary
The filesystem is where PebbleOS first stopped feeling like a demo. Instead of a flat file list, it now exposes a rooted filesystem with current working directory tracking, absolute and relative paths, . and .. normalization, mounted runtime files under system/..., and a visible /dev bootstrap view. The public API for all of this lives on the Pebble side in system/runtime.peb, even though Python still supplies the raw host bridge underneath. That split is important: Python provides primitive filesystem operations, but Pebble now decides how they compose into the filesystem the user actually sees.
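The path handling described here (a current working directory, absolute and relative paths, and . and .. normalization) can be sketched in a few lines of Python. The function name `normalize` and the choice to clamp .. at the root are assumptions for illustration; the real logic lives in Pebble code.

```python
def normalize(path, cwd="/"):
    """Resolve path against cwd and collapse '.' and '..' segments."""
    if not path.startswith("/"):
        path = cwd.rstrip("/") + "/" + path
    parts = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue
        if segment == "..":
            if parts:
                parts.pop()  # assumption: '..' at the root stays at the root
            continue
        parts.append(segment)
    return "/" + "/".join(parts)
```

For example, `normalize("notes/../a.peb", "/home")` resolves to `/home/a.peb`, and any number of `..` segments from the root still yields `/`.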
The multi-mode storage design is also much more deliberate than it looks at first glance. hostfs gives a direct host-backed rooted filesystem for speed and easy debugging. mfs and mfs-import move user files into Pebble-managed in-memory storage, with explicit sync rather than hidden persistence. vfs-import and vfs-persistent push further by making a Pebble virtual filesystem the session or persistent source of truth. In Pebble-managed modes, there is even a shadow-file bridge for legacy paths like run and nano: a VFS-backed file may be copied into a temporary host file, operated on, then copied back. That is a very practical transitional design, and it captures the spirit of the project well: keep the substrate small, but let the visible semantics migrate upward.
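The shadow-file bridge can be sketched as a copy-out, operate, copy-back helper. In this hedged example the virtual filesystem is modeled as a plain dict, and `with_shadow_file` and `append_line` are hypothetical names; the real bridge may differ in how it names and cleans up temporary files.

```python
import os
import tempfile


def with_shadow_file(vfs, path, operate):
    """Copy vfs[path] to a temp host file, run operate(host_path), copy back."""
    fd, host_path = tempfile.mkstemp(suffix=".peb")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(vfs.get(path, ""))
        operate(host_path)  # a legacy tool that only understands host paths
        with open(host_path, "r", encoding="utf-8") as handle:
            vfs[path] = handle.read()
    finally:
        os.remove(host_path)  # the shadow copy never outlives the operation


def append_line(host_path):
    """Stand-in for an editor session that modifies the host file."""
    with open(host_path, "a", encoding="utf-8") as handle:
        handle.write("second line\n")


# The VFS is modeled as a dict from path to text (an assumption for this sketch).
vfs = {"/home/demo.peb": "first line\n"}
with_shadow_file(vfs, "/home/demo.peb", append_line)
```

The VFS stays the source of truth: the host file exists only for the duration of the call, and whatever the legacy tool wrote is folded back in afterward.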
5. Shell, Launcher, and Login Semantics Are Now Part of the Design
One reason this project feels more technical than a typical toy shell is that launch rules and session semantics are now explicitly documented and encoded. The command lookup order is defined: first shell builtins, then PATH lookup in /system/bin and /system/sbin, then /bin/... compatibility mapping, then direct Pebble program launch from the current directory with .peb implied. The preferred launch surface is COMMAND [ARGS...] or COMMAND &, while run, exec, runbg, and execbg remain bootstrap compatibility entrypoints that still expose interpreter-versus-bytecode distinctions.
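The lookup order reads naturally as a chain of checks. The following Python sketch assumes a set of known file paths, a `.peb` suffix on PATH entries, and a caller-supplied `compat` dict for the /bin/... mapping, since the text does not spell out how that mapping is stored. All function and parameter names are hypothetical.

```python
def resolve_command(name, builtins, files, compat, cwd="/"):
    """Return (kind, target) following the documented lookup order."""
    # 1. Shell builtins win over everything else.
    if name in builtins:
        return ("builtin", name)
    # 2. PATH lookup in /system/bin, then /system/sbin.
    for directory in ("/system/bin", "/system/sbin"):
        candidate = f"{directory}/{name}.peb"
        if candidate in files:
            return ("path", candidate)
    # 3. Compatibility mapping (its contents are supplied by the caller).
    if name in compat and compat[name] in files:
        return ("compat", compat[name])
    # 4. Direct launch from the current directory, with .peb implied.
    local = f"{cwd.rstrip('/')}/{name}.peb"
    if local in files:
        return ("local", local)
    return ("not-found", name)


# Example data, invented for the sketch.
builtins = {"cd", "help"}
files = {"/system/bin/ls.peb", "/system/sbin/mount.peb", "/home/hello.peb"}
compat = {"/bin/ls": "/system/bin/ls.peb"}
```

Because the order is fixed, a program in the current directory can never shadow a builtin or a system command of the same name.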
Login behavior is similarly specified instead of left as an implementation accident. A shell session ensures bootstrap files such as /etc/profile, /etc/passwd, /etc/group, and /etc/fstab, then loads /etc/profile into the current session. The shell maintains a mutable environment map, supports set, export, env, and source, and even allows single-command assignment prefixes such as FOO=bar env. There is also a deliberately temporary bootstrap compromise in which source FILE and sh FILE currently execute in the same session state rather than as truly separate shell processes. That may sound like a small detail, but it is exactly the sort of technical distinction that matters when a shell grows into a real process model.
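The single-command assignment prefix is a small feature with a precise scoping rule, so a sketch may help. This Python example assumes whitespace-separated words with no quoting, and it assumes (mirroring POSIX shells, not confirmed by the text) that a line consisting only of assignments updates the session. `run_with_prefix` and `env_command` are hypothetical names.

```python
def run_with_prefix(line, session_env, commands):
    """Run 'NAME=value ... COMMAND ARGS' with assignments scoped to one command."""
    words = line.split()
    overrides = {}
    # Peel leading NAME=value words off the front of the line.
    while words and "=" in words[0] and words[0].split("=", 1)[0].isidentifier():
        key, value = words.pop(0).split("=", 1)
        overrides[key] = value
    if not words:
        # Assumption: assignments with no command persist in the session.
        session_env.update(overrides)
        return None
    # The command sees a merged copy; the session map itself is untouched.
    command_env = {**session_env, **overrides}
    return commands[words[0]](words[1:], command_env)


def env_command(args, env):
    """Toy 'env' builtin: list the environment it was launched with."""
    return sorted(f"{key}={value}" for key, value in env.items())


session = {"HOME": "/home"}
output = run_with_prefix("FOO=bar env", session, {"env": env_command})
```

Here `FOO=bar env` shows `FOO` in its output, yet `FOO` is absent from the session afterward, which is exactly the distinction between a prefix assignment and `set` or `export`.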
6. ABI First: Moving Policy Out of Python
The most technically important document in the repository may actually be the ABI plan, because it states the core rule of the whole architecture: new system features should first define a Pebble-visible ABI and only then map missing primitives to the host. That is the opposite of the usual “just add one more Python helper” trap. The host function inventory is already being classified into syscall families such as fs, proc, term, clock, and error, while the target Pebble-side ABI expands that into fs, proc, thread, term, clock, memory, service, and net.
This is why the Pebble kernel layer now exists. Modules like system/kernel/syscall.peb, proc.peb, thread.peb, and term.peb are not just abstractions for neatness; they are the mechanism that keeps shell and runtime code from depending directly on raw host function names. The current mapping is still transitional, and the kernel modules still delegate back into Python-hosted primitives, but the boundary is now named, documented, and increasingly enforced. That matters a lot. Once the boundary exists, the project can evolve toward a Pebble-defined OS without constantly leaking shell policy back into Python.
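One way to picture the boundary is a table keyed by syscall family and name, with host primitives registered behind it. The sketch below is a Python illustration under stated assumptions: `SyscallTable`, the `_host_*` functions, and the placeholder file contents are invented, and the real kernel modules are written in Pebble.

```python
import time

# Placeholder host state; the file content is invented for the sketch.
_HOST_FILES = {"/etc/profile": "# placeholder profile\n"}


def _host_read_file(path):
    return _HOST_FILES[path]


def _host_monotonic():
    return time.monotonic()


class SyscallTable:
    """Maps (family, name) to a host primitive so callers never name it directly."""

    def __init__(self):
        self._table = {}

    def register(self, family, name, host_function):
        self._table[(family, name)] = host_function

    def call(self, family, name, *args):
        try:
            handler = self._table[(family, name)]
        except KeyError:
            raise LookupError(f"no syscall {family}.{name}") from None
        return handler(*args)


syscalls = SyscallTable()
syscalls.register("fs", "read", _host_read_file)
syscalls.register("clock", "now", _host_monotonic)


def fs_read(path):
    """What shell or runtime code would call instead of the raw host function."""
    return syscalls.call("fs", "read", path)
```

Swapping `_host_read_file` for a Pebble-managed implementation later would change one registration, not every call site, which is the practical payoff of naming the boundary first.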
7. Processes, Threads, and TTY Are Where the System Gets Real
PebbleOS does not yet have a full Linux-like process table. Right now it still operates through two host-backed execution forms: VM-backed bytecode tasks and host-managed background worker jobs. But it already has a Pebble-visible transition layer for process semantics. The process context shape includes fields such as pid, ppid, pgid, sid, cwd, argv, env, uid, gid, umask, and path, and the current process states include ready, running, foreground, done, halted, and error. That is not a finished process model yet, but it is already more than ad hoc background jobs.
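The process context and state list map directly onto a record type. This Python sketch uses the field and state names from the text; the default values, the inheritance rules in `spawn_child`, and the reading of `path` as the program's file path are assumptions made for the example.

```python
from dataclasses import dataclass, replace

# State names as listed in the text.
STATES = ("ready", "running", "foreground", "done", "halted", "error")


@dataclass
class ProcessContext:
    pid: int
    ppid: int
    pgid: int
    sid: int
    cwd: str
    argv: list
    env: dict
    path: str            # assumed here to mean the program's file path
    uid: int = 0         # illustrative defaults, not taken from the project
    gid: int = 0
    umask: int = 0o022
    state: str = "ready"


def spawn_child(parent, pid, argv, path):
    """Assumed inheritance: child keeps cwd, ids, and a copy of the parent's env."""
    return replace(
        parent,
        pid=pid,
        ppid=parent.pid,
        argv=list(argv),
        env=dict(parent.env),
        path=path,
        state="ready",
    )


def set_state(proc, state):
    if state not in STATES:
        raise ValueError(f"unknown process state: {state}")
    proc.state = state


shell = ProcessContext(pid=1, ppid=0, pgid=1, sid=1, cwd="/",
                       argv=["shell"], env={"HOME": "/"}, path="/system/shell.peb")
child = spawn_child(shell, 2, ["hello"], "/home/hello.peb")
set_state(child, "running")
```

Even this small record makes the difference from ad hoc background jobs visible: every task carries its own identity, working directory, and environment.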
Threading is similarly in a bootstrap-but-real phase. The current thread ABI exposes thread_spawn_source, thread_spawn(func, args), thread_join, thread_status, thread_self, thread_yield, and thread_list, plus mutex operations such as mutex_create, mutex_lock, mutex_try_lock, and mutex_unlock. Internally this is still built on the existing VM task scheduler rather than a final POSIX-style thread model, but it already has explicit thread states like blocked-input, blocked-tty, and blocked-mutex. This is one of the places where Pebble’s first-class function values stopped being a language nicety and became a systems feature.
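A cooperative scheduler with a scheduler-visible blocked-mutex state can be sketched with Python generators. This is a toy model, not the PebbleOS scheduler: threads are generators that yield requests, the request tuples are invented, and only the method names `thread_spawn` and `mutex_create` and the state name blocked-mutex echo the text.

```python
from collections import deque


class Scheduler:
    def __init__(self):
        self.threads = {}    # tid -> generator
        self.status = {}     # tid -> state string
        self.ready = deque()
        self.owner = {}      # mutex id -> owning tid, or None
        self.waiters = {}    # mutex id -> deque of blocked tids
        self.events = []     # (tid, state) records for blocking transitions
        self.next_tid = 1

    def mutex_create(self):
        mid = len(self.owner) + 1
        self.owner[mid] = None
        self.waiters[mid] = deque()
        return mid

    def thread_spawn(self, func, args=()):
        tid = self.next_tid
        self.next_tid += 1
        self.threads[tid] = func(*args)
        self.status[tid] = "ready"
        self.ready.append(tid)
        return tid

    def run(self):
        while self.ready:
            tid = self.ready.popleft()
            self.status[tid] = "running"
            try:
                request = next(self.threads[tid])
            except StopIteration:
                self.status[tid] = "done"
                continue
            self._handle(tid, request)

    def _handle(self, tid, request):
        op, mid = request
        if op == "lock" and self.owner[mid] is not None:
            # Contended: park the thread in a scheduler-visible blocked state.
            self.status[tid] = "blocked-mutex"
            self.events.append((tid, "blocked-mutex"))
            self.waiters[mid].append(tid)
            return
        if op == "lock":
            self.owner[mid] = tid
        elif op == "unlock":
            # Hand the mutex straight to the longest-waiting thread, if any.
            waiting = self.waiters[mid]
            self.owner[mid] = waiting.popleft() if waiting else None
            if self.owner[mid] is not None:
                self.status[self.owner[mid]] = "ready"
                self.ready.append(self.owner[mid])
        self.status[tid] = "ready"
        self.ready.append(tid)


def worker(name, mutex, log):
    yield ("lock", mutex)
    log.append(f"{name} enter")
    yield ("yield", None)   # give up the CPU while still holding the mutex
    log.append(f"{name} exit")
    yield ("unlock", mutex)


sched = Scheduler()
mutex = sched.mutex_create()
log = []
first = sched.thread_spawn(worker, ("a", mutex, log))
second = sched.thread_spawn(worker, ("b", mutex, log))
sched.run()
```

Passing the worker function itself to `thread_spawn` is where first-class functions earn their keep: the scheduler needs nothing more than a callable and its arguments.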
TTY and input handling are where the runtime complexity really becomes visible. PebbleOS now distinguishes line input from key input, uses cooked mode for input()-style prompts and raw mode for read_key() / read_key_timeout()-style interaction, and surfaces those waits as scheduler-visible blocking states rather than hiding them inside host-only special cases. The system also maintains a per-task key queue so full-screen interactive apps such as nano do not lose fast input during redraw. Add to that the /dev bootstrap layer exposing /dev/tty, /dev/stdin, /dev/stdout, /dev/stderr, and /dev/null, and you get something that is still incomplete but unmistakably operating-system-shaped.
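The per-task key queue is easy to model: keys that arrive while a task is busy are buffered in order, and a read either returns the next key or reports that the task would block. In this Python sketch the class name, method names, and the use of `None` to signal "would block" are assumptions.

```python
from collections import deque


class KeyQueues:
    """One FIFO of pending keys per task id."""

    def __init__(self):
        self._queues = {}

    def push_key(self, task_id, key):
        # Called by the host input layer whenever a key arrives.
        self._queues.setdefault(task_id, deque()).append(key)

    def read_key(self, task_id):
        """Return the next buffered key, or None if the task would block."""
        queue = self._queues.get(task_id)
        return queue.popleft() if queue else None


queues = KeyQueues()
for key in "abc":   # keys typed while the app is busy redrawing
    queues.push_key(7, key)
drained = [queues.read_key(7) for _ in range(4)]
```

A scheduler built on this would treat the `None` result as the cue to mark the task blocked-tty until the next `push_key` for that task.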
8. Pebble Now Owns More of Its Runtime Model
The memory model is another place where the project got more technical in a quiet but important way. Pebble now exposes a Pebble-managed memory module representing flat logical RAM cells, and a Pebble-managed heap module representing a simple arena allocator layered on top of that memory. This is not hardware memory and it is not a page system, but it creates a language-visible runtime memory model that Pebble code can manipulate directly. That is a real architectural step, because it separates the semantics of runtime memory from Python’s host process memory.
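A flat cell memory with an arena allocator on top can be sketched compactly. The Python below is a model of the idea only: the class and method names are invented, and the real memory and heap modules are Pebble-managed and may expose a different interface.

```python
class Memory:
    """Flat logical RAM: a fixed number of cells addressed from 0."""

    def __init__(self, size):
        self.cells = [0] * size

    def _check(self, address):
        # Explicit bounds check so negative addresses do not wrap around.
        if not 0 <= address < len(self.cells):
            raise IndexError(f"address out of range: {address}")

    def load(self, address):
        self._check(address)
        return self.cells[address]

    def store(self, address, value):
        self._check(address)
        self.cells[address] = value


class ArenaHeap:
    """Bump allocator: hands out consecutive cells and frees them all at once."""

    def __init__(self, memory, base=0):
        self.memory = memory
        self.base = base
        self.top = base

    def alloc(self, count):
        if count <= 0:
            raise ValueError("allocation size must be positive")
        if self.top + count > len(self.memory.cells):
            raise MemoryError("arena exhausted")
        address = self.top
        self.top += count
        return address

    def reset(self):
        self.top = self.base


memory = Memory(16)
heap = ArenaHeap(memory)
first_block = heap.alloc(4)
second_block = heap.alloc(2)
memory.store(second_block, 99)
```

An arena keeps the allocator trivial: there is no per-block free, only a reset, which suits a runtime that wants a language-visible memory model before it wants a general-purpose heap.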
The bytecode VM has also been pushed in a more explicit direction. In exec mode, the VM now carries an explicit value_stack, a frame_stack, and frame records rather than leaning entirely on incidental Python temporaries. Python still owns the concrete implementation, but Pebble is slowly gaining semantic independence over the runtime structures that matter. That is the deeper pattern of the whole project: not physical independence yet, but semantic independence first.
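To show what an explicit value stack and frame stack buy, here is a minimal stack VM in Python. The names `value_stack` and `frame_stack` follow the text; the `Frame` layout, the opcode set, and the sample program are invented for the sketch and are not the Pebble bytecode format.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One call record: where to resume, plus that call's local variables."""
    return_pc: int
    locals: dict = field(default_factory=dict)


def run(program, entry=0):
    value_stack = []
    frame_stack = [Frame(return_pc=-1)]   # returning from the base frame halts
    pc = entry
    while pc >= 0:
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":
            value_stack.append(arg)
        elif op in ("ADD", "MUL"):
            right = value_stack.pop()
            left = value_stack.pop()
            value_stack.append(left + right if op == "ADD" else left * right)
        elif op == "STORE":
            frame_stack[-1].locals[arg] = value_stack.pop()
        elif op == "LOAD":
            value_stack.append(frame_stack[-1].locals[arg])
        elif op == "CALL":
            frame_stack.append(Frame(return_pc=pc))
            pc = arg
        elif op == "RET":
            pc = frame_stack.pop().return_pc
        else:
            raise ValueError(f"unknown opcode: {op}")
    return value_stack.pop()


# main: push 6, call the function at index 3, then return its result.
# function at 3: computes x * x + 1 for the argument on the value stack.
program = [
    ("PUSH", 6), ("CALL", 3), ("RET", None),
    ("STORE", "x"), ("LOAD", "x"), ("LOAD", "x"), ("MUL", None),
    ("PUSH", 1), ("ADD", None), ("RET", None),
]
result = run(program)
```

Nothing in `run` relies on Python's own call stack for Pebble-level calls: the callee's locals and return address live in `frame_stack`, which is the kind of semantic independence the paragraph describes.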
9. What Codex Was Actually Useful For
Most of this was built in a loop with Codex, but the useful interaction was not “write feature X” in isolation. The more valuable pattern was: inspect the current repository, read the architecture docs, understand which layer a change belongs to, then push one subsystem forward without losing the internal story of the project. Sometimes that meant adding or refactoring commands. More often it meant tightening the boundary between Python and Pebble, restating the launcher model, clarifying process or TTY semantics, or turning an implicit rule into an explicit one.
That is also why I increasingly think of AI agents as force multipliers for individual builders. If you already have a technical direction in your head, and you can tell when an implementation is right or wrong, an agent like Codex dramatically compresses the distance between idea and working system. The real superpower is not that it replaces judgment; it is that it lets intent propagate into code, refactors, documentation, and iteration loops much faster than one person could usually sustain alone. In that sense, it can feel a little like saying what you want and watching the system begin to exist.
So “vibe coding” is funny, but slightly misleading. The project moved fast, yes, and a lot of it was built in one concentrated weekend day. But what happened here was not random prompting followed by a pile of disconnected features. What happened was that a project that might once have taken months of stop-and-go solo time could be compressed into a single intense day, because the bottleneck shifted. The hard part was no longer typing every line myself; the hard part was having a coherent idea, choosing the right abstractions, and steering the system as it took shape. The result is not just “more code, faster,” but a repository that ended up with a recognizable architecture: a Python bootloader substrate, a Pebble-defined runtime and shell, a filesystem model with multiple semantics, a growing ABI, and the beginnings of a process/thread/TTY story that can actually be extended rather than rewritten from scratch.
10. Current State
PebbleOS is still a transition-stage system. It does not yet have full process isolation, a complete Linux-style process table, mature permission and user models, networking, packaging, or a real service layer. But it is also no longer a toy shell around a guest language. It has a rooted filesystem, a Pebble-defined shell, a bytecode VM, a Pebble-visible ABI, a bootstrap thread model, a TTY/input model, device-style paths under /dev, and a memory/heap layer that increasingly belongs to the language rather than the host.
That is why I think the most accurate description is not “I made a toy OS,” but something more specific: PebbleOS is a scripting-first operating-system experiment in which the language is gradually taking ownership of the system above a shrinking Python substrate. That is a much more technical claim than the title suggests, but it is the one I actually care about.
