Building an online code editor
Implement the sandboxed run with cgroup/seccomp limits, a warm sandbox pool, live output streaming, and the timeout/cleanup that contains runaway code.
The execution worker
A worker takes a run job, grabs a sandbox, copies in the code, executes with hard limits, streams output, and always tears the sandbox down:
def run_job(job):
    sandbox = pool.acquire()                 # pre-warmed micro-VM/container
    try:
        sandbox.write_files(job.files)
        proc = sandbox.exec(
            cmd=lang_run_cmd(job.language),
            limits=Limits(cpu="1", mem="256m", pids=64, wall_time="10s"),
            network="none", read_only_root=True, user="nobody", seccomp=PROFILE,
        )
        stream_output(job.id, proc)          # push stdout/stderr live
        return proc.wait(timeout=10)         # SIGKILL on timeout
    finally:
        pool.destroy(sandbox)                # ephemeral: never reuse across users
The finally block is the safety net: the sandbox is destroyed even when the run fails or times out, so it is never reused by another user and no state leaks.
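The timeout-plus-cleanup shape of the worker can be tried locally with nothing but the standard library. The sketch below is an illustration under loud assumptions: a plain child process and a temporary directory stand in for the sandbox, so it provides no isolation at all, and run_with_deadline is a hypothetical name, not part of the worker above. It only shows that the deadline is enforced by the parent and that cleanup runs on every path.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_with_deadline(source: str, wall_time: float = 10.0) -> dict:
    """Run Python source in a child process with a parent-enforced deadline.

    ASSUMPTION: the child process and temp dir are stand-ins for a real
    sandbox. Nothing here isolates the code; it only shows the control flow.
    """
    workdir = tempfile.mkdtemp(prefix="run-")  # stand-in for an ephemeral sandbox
    try:
        path = os.path.join(workdir, "main.py")
        with open(path, "w") as f:
            f.write(source)
        proc = subprocess.Popen(
            [sys.executable, path],
            cwd=workdir,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        try:
            output, _ = proc.communicate(timeout=wall_time)
            timed_out = False
        except subprocess.TimeoutExpired:
            proc.kill()                       # SIGKILL on POSIX; direct child only
            output, _ = proc.communicate()    # reap the child, keep partial output
            timed_out = True
        return {"exit_code": proc.returncode, "timed_out": timed_out, "output": output}
    finally:
        shutil.rmtree(workdir, ignore_errors=True)  # always throw the box away
```

Calling run_with_deadline("while True:\n    pass\n", wall_time=0.5) returns with timed_out set to True after roughly half a second, and the working directory is gone either way.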
Enforcing limits (defense in depth)
- cgroups: cpu quota, memory.max (OOM-kill on breach), pids.max (block fork bombs)
- seccomp: whitelist syscalls; block ptrace, mount, network syscalls
- namespaces: isolated PID/network/mount/user namespaces
- filesystem: read-only root + a small writable tmpfs scratch; no host mounts
- wall clock: a hard kill timer outside the sandbox (don't trust in-sandbox timing)
Memory breach → the kernel OOM-kills the process; CPU overrun → throttled then timed out; fork bomb → blocked by the PID cap. All enforced by the host kernel, not the user’s code.
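To make the cgroup part concrete, here is a minimal sketch of how the limits from the worker example (1 CPU, 256 MiB, 64 PIDs) could map onto cgroup v2 control files. The file names cpu.max, memory.max and pids.max are the kernel's; the helper names cgroup_v2_settings and apply_settings, and the 100 ms period, are assumptions made for illustration. A real sandbox runtime would normally do this for you.

```python
from pathlib import Path


def cgroup_v2_settings(cpu_cores: float, mem_bytes: int, max_pids: int,
                       period_us: int = 100_000) -> dict[str, str]:
    """Translate human-level limits into cgroup v2 file contents."""
    quota_us = int(cpu_cores * period_us)      # CPU time allowed per period
    return {
        "cpu.max": f"{quota_us} {period_us}",  # format: "<quota> <period>" in microseconds
        "memory.max": str(mem_bytes),          # hard cap; breach triggers the OOM killer
        "pids.max": str(max_pids),             # fork() fails once the cap is reached
    }


def apply_settings(cgroup_dir: Path, settings: dict[str, str]) -> None:
    """Write each value into its control file (needs privileges on a real cgroup)."""
    for name, value in settings.items():
        (cgroup_dir / name).write_text(value)


if __name__ == "__main__":
    print(cgroup_v2_settings(cpu_cores=1, mem_bytes=256 * 1024 * 1024, max_pids=64))
```

Keeping the translation separate from the write makes the limits easy to unit-test without root.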
Warm pool for fast starts
Cold-booting a sandbox per run adds latency. Keep a pool of pre-initialized sandboxes (base image + language runtime loaded) and hand one out instantly; replace each consumed sandbox in the background:
import threading

class SandboxPool:
    def acquire(self):
        sb = self.ready.get()                          # instant if warm
        threading.Thread(target=self._refill).start()  # boot a replacement in the background
        return sb
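The fragment above leaves out how the pool is filled and refilled. A self-contained sketch follows, assuming the ready set is a queue.Queue and that a caller-supplied factory boots one sandbox; WarmPool, factory and size are hypothetical names chosen for this example.

```python
import queue
import threading
from typing import Any, Callable, Optional


class WarmPool:
    """Keeps `size` pre-booted sandboxes ready and replaces each one handed out."""

    def __init__(self, factory: Callable[[], Any], size: int):
        self._factory = factory            # boots one sandbox (slow in real life)
        self._ready: queue.Queue = queue.Queue()
        for _ in range(size):              # pay the boot cost up front
            self._ready.put(factory())

    def acquire(self, timeout: Optional[float] = None) -> Any:
        sb = self._ready.get(timeout=timeout)   # instant if warm, blocks if empty
        # Replace the consumed sandbox off the request path.
        threading.Thread(target=self._refill, daemon=True).start()
        return sb

    def _refill(self) -> None:
        self._ready.put(self._factory())


if __name__ == "__main__":
    pool = WarmPool(factory=object, size=2)     # object() stands in for a sandbox
    print(pool.acquire() is not pool.acquire())
```

One design point worth noting: acquire blocks when the pool is drained, which gives natural backpressure, but a production pool would probably also bound the number of concurrent refills.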
Streaming output live
Programs print incrementally and may read stdin, so don’t buffer to completion — forward in real time over a WebSocket:
def stream_output(job_id, proc):
    for chunk in proc.stdout_stream():
        ws.send(job_id, {"type": "stdout", "data": chunk})
    # stdin flows the other way for interactive runs
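A runnable approximation of the same idea, assuming a local child process in place of the sandbox and a plain callback in place of the WebSocket. The send parameter and the final "exit" message type are assumptions for this sketch, not part of the protocol above.

```python
import subprocess
import sys
from typing import Callable, List


def stream_process(cmd: List[str], send: Callable[[dict], None]) -> int:
    """Forward a child's output line by line as it is produced, then its exit code."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # interleave stderr with stdout
        text=True,
        bufsize=1,                  # line-buffered on the reading side
    )
    assert proc.stdout is not None
    for line in proc.stdout:        # yields each line as soon as it arrives
        send({"type": "stdout", "data": line})
    code = proc.wait()
    send({"type": "exit", "code": code})   # hypothetical end-of-run message
    return code


if __name__ == "__main__":
    # -u keeps the child from buffering its own output.
    stream_process([sys.executable, "-u", "-c", "print('a'); print('b')"], print)
```

The child must also avoid buffering its output (hence -u here); otherwise the reader sees everything in one burst at exit no matter how eagerly it forwards.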
The queue and autoscaling
The API enqueues runs; workers pull. Autoscale the worker fleet on queue depth and CPU, and enforce per-user concurrency limits (rate limiter) so one user can’t monopolize capacity. A global cap protects the cluster.
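One way to combine the per-user limit with the global cap is a small admission check in front of the queue. This is a single-process sketch with hypothetical names (Admission, try_acquire, release); a real fleet would likely keep these counters in a shared store so every API node sees the same numbers.

```python
import threading
from collections import defaultdict


class Admission:
    """Admits a run only if both the user's limit and the global cap allow it."""

    def __init__(self, per_user: int, global_cap: int):
        self.per_user = per_user
        self.global_cap = global_cap
        self._running = defaultdict(int)   # user_id -> runs in flight
        self._total = 0
        self._lock = threading.Lock()

    def try_acquire(self, user_id: str) -> bool:
        with self._lock:
            if self._total >= self.global_cap:
                return False               # cluster is full: shed load
            if self._running[user_id] >= self.per_user:
                return False               # this user already has enough runs
            self._running[user_id] += 1
            self._total += 1
            return True

    def release(self, user_id: str) -> None:
        with self._lock:
            if self._running[user_id] > 0:  # ignore unmatched releases
                self._running[user_id] -= 1
                self._total -= 1


if __name__ == "__main__":
    adm = Admission(per_user=2, global_cap=3)
    print([adm.try_acquire("alice") for _ in range(3)])
```

The worker would call release in the same finally block that destroys the sandbox, so a crashed run cannot leak a slot.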
Failure handling and abuse
- Runaway code → cgroup limits + wall-clock kill contain it; the sandbox is destroyed regardless.
- Sandbox escape attempt → micro-VM/gVisor boundary + seccomp; defense in depth so one layer failing isn’t catastrophic.
- Crash mid-run → jobs are delivered at-least-once, so a crashed run can simply be retried; reruns are safe because the sandbox is ephemeral and has no side effects. Otherwise, surface a clear failure to the user.
- Resource exhaustion attack (many heavy runs) → per-user quotas + global cap + queue backpressure.
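The crash-mid-run case can be sketched as a bounded retry loop. The assumptions here: WorkerCrashed is a hypothetical exception meaning the infrastructure failed (as opposed to the user's program exiting non-zero, which is a normal result and not retried), and execute is whatever function actually performs the run.

```python
from typing import Any, Callable


class WorkerCrashed(Exception):
    """Hypothetical signal: the worker or sandbox died, not the user's code."""


def run_at_least_once(job: Any, execute: Callable[[Any], Any],
                      max_attempts: int = 3) -> dict:
    """Retry infrastructure failures; rerunning is safe because runs have no side effects."""
    for attempt in range(1, max_attempts + 1):
        try:
            result = execute(job)
            return {"status": "ok", "attempts": attempt, "result": result}
        except WorkerCrashed:
            continue                       # fresh sandbox, same job
    # Out of attempts: tell the user plainly instead of hanging.
    return {
        "status": "failed",
        "attempts": max_attempts,
        "error": "the run could not be completed; please try again",
    }
```

Any exception other than WorkerCrashed propagates unchanged, so genuine bugs in the worker are not silently retried.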
The takeaway
Concrete signals: ephemeral sandboxes destroyed after each run, kernel-enforced cgroup/seccomp/namespace limits + wall-clock kill, a warm pool for latency, live streaming, and per-user quotas. The reusable lesson: never trust the workload — isolate it, cap it, and throw the box away. (This is exactly what an in-browser judge like Lyte Code’s own runner does, scaled to a server fleet.)