Building a Real-Time Web IDE with xterm.js and Monaco

FlowKoi Team
01 Mar, 2026
Engineering

FlowKoi’s web interface lets you watch AI workflows execute in real time, edit workflow files in the browser, and interact with the container terminal, all without installing anything locally. Under the hood, it relies on two open-source libraries: xterm.js for terminal emulation and Monaco Editor for code editing. Combining them into a cohesive, real-time IDE meant solving several engineering problems: bridging a container terminal to the browser, keeping terminal state alive across UI changes, and keeping the file tree and editor in sync with the container.

Terminal Architecture with xterm.js

xterm.js is a terminal emulator that runs in the browser. It renders a full terminal UI, handles ANSI escape codes for colors and cursor movement, and processes keyboard input. It is only the frontend, though: it needs a backend process to supply output and receive input.

FlowKoi’s execution backend runs a PTY (pseudo-terminal) process inside the Docker container and bridges it to the browser over WebSocket. The data flow looks like this:

1. The browser opens a WebSocket connection to the execution backend.
2. The backend spawns a PTY process (using the node-pty library) inside the container.
3. Keystrokes from xterm.js are sent over WebSocket to the backend, which writes them to the PTY.
4. Output from the PTY is sent back over WebSocket and rendered by xterm.js.
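
The two directions of this data flow can be sketched as a small bridging function. This is a minimal illustration rather than FlowKoi’s actual code: PtyLike and SocketLike are hypothetical stand-ins for the parts of a node-pty process and a WebSocket that the bridge touches, and killing the PTY when the socket closes is an assumption.

```typescript
// Hypothetical stand-in for the parts of a PTY process used here.
interface PtyLike {
  onData(listener: (chunk: string) => void): void;
  write(data: string): void;
  kill(): void;
}

// Hypothetical stand-in for the parts of a WebSocket used here.
interface SocketLike {
  send(data: string): void;
  onMessage(listener: (data: string) => void): void;
  onClose(listener: () => void): void;
}

// Wire both directions of the terminal data flow together.
function bridgePtyToSocket(pty: PtyLike, socket: SocketLike): void {
  // PTY output -> browser, where xterm.js renders it.
  pty.onData((chunk) => socket.send(chunk));

  // Keystrokes from the browser -> PTY input.
  socket.onMessage((data) => pty.write(data));

  // Assumption: stop the shell when the browser disconnects
  // so that it does not linger inside the container.
  socket.onClose(() => pty.kill());
}
```

In a real backend, the pty argument would wrap the object returned by node-pty’s spawn function and the socket would come from a WebSocket server library, with thin adapters mapping their event APIs onto these two interfaces.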

This gives users a fully interactive terminal session in the browser with the same fidelity as a local terminal.

A Critical Lesson: Never Unmount the Terminal

One of the hardest bugs we encountered was terminal sessions breaking when users switched tabs. The root cause was React’s component lifecycle: switching tabs unmounted the terminal component, and its xterm.js instance was disposed along with its internal state. Remounting created a fresh instance that could not reconnect to the existing WebSocket session.

The fix was to keep the terminal component always mounted and toggle visibility with CSS (using an invisible class) instead of conditional rendering. This preserves the terminal instance, the WebSocket connection, and the scroll buffer across UI state changes.
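
The idea can be reduced to a helper that decides a panel’s CSS classes instead of deciding whether the panel is rendered at all. This is a sketch under assumptions: the panel ids and the base class name panel are hypothetical, and invisible is assumed to map to visibility: hidden, as it does in Tailwind CSS.

```typescript
// Hypothetical panel ids; the real tab names are not given in this post.
type PanelId = "terminal" | "editor";

// Every panel stays in the tree; only its classes change.
// Assumption: "invisible" applies `visibility: hidden`.
function panelClassName(panel: PanelId, active: PanelId): string {
  const base = "panel"; // hypothetical base class
  return panel === active ? base : `${base} invisible`;
}
```

In JSX, the terminal would then be wrapped as <div className={panelClassName("terminal", activeTab)}> rather than rendered through a condition such as {activeTab === "terminal" && <Terminal />}, so React never unmounts it.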

Code Editing with Monaco Editor

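Monaco is the editor that powers VS Code. Embedding it in a web application gives you syntax highlighting, minimap navigation, and multi-cursor editing out of the box, plus IntelliSense for the languages that ship with a built-in language service (TypeScript, JavaScript, JSON, CSS, and HTML).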

FlowKoi uses Monaco for editing workflow files (CLAUDE.md, workflow.md, tools, and environment configuration) directly in the browser. File changes are saved as Firestore artifacts and synced back to the container via the file watcher.

For Markdown files like workflow.md, we also integrate Milkdown — a WYSIWYG Markdown editor — as an alternative editing mode. Users can switch between raw Markdown in Monaco and rich text in Milkdown depending on their preference.

File Watching with chokidar

The execution backend uses chokidar to watch the container’s working directory for file changes. When the AI agent creates or modifies a file, chokidar detects the change and the backend syncs the updated file to Firestore as an artifact. As a result, the browser’s file tree and editor content update in near real time as the workflow executes.

Certain paths are excluded from watching: the instanceDataPath (default output/), node_modules/, .git/, and credential files. This prevents noisy or sensitive files from flooding the artifact system.
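
An exclusion rule like this can be expressed as a single predicate over paths relative to the working directory. The sketch below is illustrative only: the function name is hypothetical, and the credential patterns are assumed examples, since the post does not list which files count as credentials.

```typescript
// Assumed examples of credential files; the real list is not given.
const CREDENTIAL_PATTERNS: RegExp[] = [
  /(^|\/)\.env(\..*)?$/,        // .env, .env.local, ...
  /\.pem$/,                     // private keys and certificates
  /(^|\/)credentials\.json$/,
];

// Returns true when a path (relative to the working directory)
// should not be watched or synced as an artifact.
function isIgnoredPath(relativePath: string, instanceDataPath = "output/"): boolean {
  const normalized = relativePath.replace(/\\/g, "/").replace(/^\.\//, "");
  const dataDir = instanceDataPath.replace(/\/+$/, "");

  // Everything under the instance data directory is excluded.
  if (dataDir !== "" && (normalized === dataDir || normalized.startsWith(dataDir + "/"))) {
    return true;
  }

  // node_modules/ and .git/ are excluded at any depth.
  const segments = normalized.split("/");
  if (segments.includes("node_modules") || segments.includes(".git")) {
    return true;
  }

  return CREDENTIAL_PATTERNS.some((pattern) => pattern.test(normalized));
}
```

chokidar’s ignored option accepts a function, so a predicate of this shape could be passed to it once each reported path has been made relative to the watched directory.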

Building the File Tree

The file tree in the sidebar is built from a flat list of artifact paths. A recursive buildFileTree() function converts paths like tools/analyze.sh and tools/fetch.py into a nested tree structure. A recursive TreeNode React component renders the tree with expand/collapse controls. All directories are auto-expanded on initial load so users can see the full project structure at a glance.
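
A recursive tree builder along these lines might look as follows. This is a minimal sketch, not FlowKoi’s implementation: the FileNode shape and the insertPath and collectDirectoryPaths helpers are hypothetical, and only the buildFileTree name comes from the description above.

```typescript
// Hypothetical node shape for the sidebar tree.
interface FileNode {
  name: string;
  path: string;
  isDirectory: boolean;
  children: FileNode[];
}

// Recursively insert the remaining path segments below `parent`.
function insertPath(parent: FileNode, segments: string[]): void {
  if (segments.length === 0) return;
  const [head, ...rest] = segments;
  let child = parent.children.find((c) => c.name === head);
  if (!child) {
    const path = parent.path ? `${parent.path}/${head}` : head;
    // A segment with more segments after it must be a directory.
    child = { name: head, path, isDirectory: rest.length > 0, children: [] };
    parent.children.push(child);
  }
  insertPath(child, rest);
}

// Convert a flat list of artifact paths into a nested tree.
function buildFileTree(paths: string[]): FileNode[] {
  const root: FileNode = { name: "", path: "", isDirectory: true, children: [] };
  for (const p of paths) {
    insertPath(root, p.split("/").filter((s) => s.length > 0));
  }
  return root.children;
}

// Collect every directory path, e.g. to mark all of them as expanded.
function collectDirectoryPaths(nodes: FileNode[]): string[] {
  const result: string[] = [];
  for (const node of nodes) {
    if (node.isDirectory) {
      result.push(node.path, ...collectDirectoryPaths(node.children));
    }
  }
  return result;
}
```

For the paths tools/analyze.sh and tools/fetch.py, buildFileTree returns a single tools directory node with two file children, and collectDirectoryPaths yields the list of directories that a TreeNode component could treat as expanded on initial load.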

Key Prop for Editor Remounting

When the user clicks a different file in the tree, Monaco needs to load the new file’s content with the correct language mode. Rather than imperatively updating the editor’s model, we use React’s key prop set to the selected file path. Changing the key forces React to unmount and remount the editor component, which cleanly initializes it with the new file. This approach is simpler and less error-prone than trying to swap editor models in place.
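
The per-file inputs for this approach can be derived from the selected path alone. The sketch below is an illustration under assumptions: the extension table, the function names, and the returned shape are hypothetical, although the language ids themselves (markdown, python, shell, and so on) are ones Monaco ships with.

```typescript
// Assumed extension-to-language table; the ids are Monaco language ids.
const LANGUAGE_BY_EXTENSION = new Map<string, string>([
  ["md", "markdown"],
  ["py", "python"],
  ["sh", "shell"],
  ["json", "json"],
  ["yaml", "yaml"],
  ["yml", "yaml"],
  ["ts", "typescript"],
  ["js", "javascript"],
]);

// Pick a Monaco language id from a file path, falling back to plain text.
function languageForPath(path: string): string {
  const fileName = path.split("/").pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return "plaintext"; // no extension, or a dotfile
  const extension = fileName.slice(dot + 1).toLowerCase();
  return LANGUAGE_BY_EXTENSION.get(extension) ?? "plaintext";
}

// Everything a hypothetical <Editor> wrapper needs for one file.
// Using the path as `key` makes React remount the editor per file.
function editorMountFor(path: string, content: string) {
  return { key: path, language: languageForPath(path), initialValue: content };
}
```

A component would pass these values explicitly, for example <Editor key={m.key} language={m.language} defaultValue={m.initialValue} />. Because a remount discards the previous editor instance, per-file state such as undo history and cursor position starts fresh each time a file is opened.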

Putting It Together

The combination of a persistent terminal, a real-time file watcher, and a browser-based code editor creates an experience where you can launch a workflow, watch the AI work in the terminal, see new files appear in the sidebar, and open them in the editor — all without leaving the browser. It is the kind of tight feedback loop that makes AI workflow development feel interactive rather than batch-oriented.