Pond — hardened, trusted, still-functional runs

Design plan for executing agent runs in pond so they are hardened (untrusted code + untrusted agent + untrusted worker host can’t harm pond, other tenants, or the host) and trusted (pond can rely on the result), while staying a real, modifiable coding harness (the agent can read/write files, run builds & tests, install deps, and call the model). Companion to pond-migration.md.

Grounded in a survey of what already exists. Where the survey’s current-state claims are uncertain, they are marked (verify); confirm them before relying on them.

The tension

Hardening and “a fully functional coding harness” pull opposite ways: real isolation wants no filesystem, no network, no privileges; a coding agent legitimately needs a writable tree, a toolchain, package installs, and the model API. The resolution is layered, tiered isolation with a default-deny egress allowlist — lock everything, then open exactly the holes a harness needs, and match isolation strength to how much you trust the host.

Threat model (decided)

Actors and trust:

Actor	Trusted?	Why it matters
Target repository code	No	Builds/tests run arbitrary code; may be malicious.
The agent itself	No	Prompt-injectable; can be steered to exfiltrate/pivot.
Worker host (BYO / third-party)	No	Pairing/enrollment lets anyone attach a worker.
Orchestrator	Semi	Bearer-authed; today results are trusted on its say-so.
Pond (control plane)	Yes	The only fully trusted component.

Two distinct threats, often conflated — keep them separate:

T1 — untrusted agent/code on a trusted host. Defended by a strong in-VM sandbox + egress allowlist + sealed secrets + quotas. This is the common case and the bulk of the value.
T2 — untrusted host. A malicious worker operator can read anything the job touches after it’s decrypted on their box (source, unsealed creds, outputs) and can fabricate results. Isolation cannot fix this — a sandbox protects the host from the job, not the job from the host. T2 needs trust tiering + result attestation + data minimization, and for sensitive work on untrusted hardware, confidential computing (below).

Trust tiers + routing

Route by the existing two axes (tags for deployment sharding, capabilities for features) plus a new trust facet:

Tier A — pond-operated pool (tag: pool:trusted, KVM hosts). Runs untrusted code for any tenant. Host is trusted, so isolation protects the host + enforces cross-tenant separation. Sensitive / multi-tenant runs land here. Default for product-driven runs.
Tier B — BYO / untrusted host (tag: pool:byo). Only runs work whose inputs that operator is already entitled to see (self-hosted, single-tenant — “bring your own compute for your own repos”). Results are attested and verified before pond trusts them. Never receives another tenant’s data.
Tier C — confidential (future, capability: sandbox.cc-snp / cc-tdx). Untrusted host can run sensitive work because AMD SEV-SNP / Intel TDX keeps the guest memory opaque to the host and pond gets a remote-attestation quote before releasing secrets. The only honest answer to “sensitive workload on hardware I don’t control.” Firecracker alone does not provide this.

Pond is fail-closed: a run’s required tier is part of its sandbox spec; a job only dispatches to a worker whose tags/capabilities satisfy it (extends the existing _SANDBOX_PROFILE_MIN_CAP routing).
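A minimal sketch of the fail-closed dispatch rule described above, assuming a simplified in-memory model. The `Worker`, `SandboxSpec`, `eligible`, and `dispatch` names and the `sandbox.kata` capability string are hypothetical; only the `pool:trusted` / `pool:byo` tags come from this plan. It is not the existing `_SANDBOX_PROFILE_MIN_CAP` code.

```python
from dataclasses import dataclass, field

# Tier -> required worker tag. Tag strings follow this document; the
# mapping itself is an illustrative assumption.
TIER_TAGS = {"A": "pool:trusted", "B": "pool:byo"}


@dataclass(frozen=True)
class Worker:
    name: str
    tags: frozenset = field(default_factory=frozenset)
    capabilities: frozenset = field(default_factory=frozenset)
    operator: str = ""  # tenant that enrolled a BYO worker (hypothetical field)


@dataclass(frozen=True)
class SandboxSpec:
    required_tier: str
    tenant: str
    required_caps: frozenset = field(default_factory=frozenset)


def eligible(spec: SandboxSpec, worker: Worker) -> bool:
    """Return True only when the worker positively satisfies the spec."""
    tag = TIER_TAGS.get(spec.required_tier)
    if tag is None or tag not in worker.tags:
        return False  # unknown tier or missing tag: refuse, never guess
    if spec.required_tier == "B" and (
        not spec.tenant or worker.operator != spec.tenant
    ):
        return False  # a BYO worker only sees its own operator's work
    return spec.required_caps <= worker.capabilities


def dispatch(spec: SandboxSpec, workers: list):
    """Pick the first eligible worker, or None so the job stays queued."""
    for worker in workers:
        if eligible(spec, worker):
            return worker
    return None
```

The point of the shape is that there is no fallback branch: a job with no eligible worker is not downgraded to a weaker tier, it simply does not dispatch.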

Isolation backend: “Firecracker-grade”, delivered pragmatically

VM-level isolation is correct (shared-kernel containers are insufficient for untrusted code on a multi-tenant pool). But ship it so the harness stays an editable OCI image, not a hand-baked rootfs:

Preferred: Kata Containers (Firecracker VMM). microVM hardware boundary + normal container images. “Modify the environment” = edit a Dockerfile, not a kernel. KVM-required.
Alt: gVisor (runsc). Lighter ops, runs on more hosts; occasional syscall-compat gaps — acceptable for most coding harnesses.
Dev fallback: passthrough (the existing NoneProvider) for local/darwin where there’s no KVM. Never for untrusted multi-tenant.

These become sandbox providers/backends alongside today’s NoneProvider + DockerProvider (swarm/src/sandbox.py), selected by the sandbox.{profile, backend} block already on DispatchJob. Raw Firecracker stays possible via a backend but isn’t the default — the image ergonomics are what keep the harness “modifiable and functional.”

Sandbox profile catalog

Profiles fix the isolation posture (immutable); only resource/image knobs are overridable — mirrors today’s _SANDBOX_OVERRIDABLE = (memory, cpus, pids, image).

Profile	Filesystem	Network	Use
`dev` (existing `none`)	host	open	local/darwin only
`review` (existing `untrusted-code-read`)	ro checkout	broker-only	read-only analysis
`harness` (new `untrusted-code-write`)	rw throwaway copy	allowlist egress	the full coding harness (edit/build/test/install)

harness is the new profile that makes “still functional” true: a disposable read-write copy of the checkout (so edits + node_modules are fine and discarded after), an allowlist egress (so pip/npm install + git + the model API work), dropped caps + no-new-privileges + non-root user, microVM backend, and resource quotas. Output (edit_result.json) is captured from the rw copy.
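One way the "posture is immutable, only resource/image knobs are overridable" rule could be enforced is sketched below. The overridable tuple mirrors the `_SANDBOX_OVERRIDABLE` list quoted above; the profile contents, default values, and the `resolve_profile` helper are assumptions made for illustration.

```python
from types import MappingProxyType

# Mirrors the tuple named in the text.
SANDBOX_OVERRIDABLE = ("memory", "cpus", "pids", "image")

# Read-only profile catalog; the concrete values are placeholders.
PROFILES = MappingProxyType({
    "harness": MappingProxyType({
        "filesystem": "rw-copy",
        "network": "allowlist",
        "memory": "4g",
        "cpus": 2,
        "pids": 512,
        "image": "harness:latest",
    }),
})


def resolve_profile(name: str, overrides: dict) -> dict:
    """Merge overrides into a profile, rejecting any posture change."""
    if name not in PROFILES:
        raise ValueError(f"unknown sandbox profile: {name!r}")
    illegal = sorted(set(overrides) - set(SANDBOX_OVERRIDABLE))
    if illegal:
        raise ValueError(f"posture keys are not overridable: {illegal}")
    return {**PROFILES[name], **overrides}
```

For example, `resolve_profile("harness", {"memory": "8g"})` would succeed, while `resolve_profile("harness", {"network": "open"})` would raise rather than silently loosen the network posture.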

Egress: default-deny allowlist proxy (decided)

Default-deny; open exactly:

allow: <model API host>            # e.g. api.anthropic.com
       <package registries>        # pypi.org, files.pythonhosted.org,
                                    # registry.npmjs.org, crates.io, …
       <the target's own git remote>
deny:  169.254.0.0/16 (metadata)   10.0.0.0/8  172.16.0.0/12
       192.168.0.0/16  ::1 loopback  everything else

Enforcement point matters per tier. On Tier A the proxy is a pond-/pool-controlled egress gateway the microVM is forced through (a malicious agent can’t bypass it). On Tier B the allowlist lives in the guest’s network config: it constrains the agent, but a malicious host can ignore it (that’s the T2 reality; mitigate via data minimization, not network policy).
The registry allowlist is operator-editable per project/workflow (this is part of “modifiable”) with safe defaults; adding a host is an explicit, audited config change.
Pairs with the broker: the model key is reached only via the broker URL, which is itself on the allowlist.
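The allow/deny decision above can be sketched with the standard `ipaddress` module. This is an illustration under assumptions, not the proxy's actual code: the `egress_allowed` helper is hypothetical, the host set is just the examples listed above, and checking the resolved addresses against the deny ranges (so an allowlisted name cannot be pointed at an internal address) is an added assumption about how the two lists combine.

```python
import ipaddress

# Example hosts from the allowlist above; a real list is operator-editable.
ALLOWED_HOSTS = {
    "api.anthropic.com",
    "pypi.org",
    "files.pythonhosted.org",
    "registry.npmjs.org",
    "crates.io",
}

# Deny ranges from the block above (metadata/link-local, RFC 1918, ::1).
DENIED_NETS = [
    ipaddress.ip_network(net)
    for net in (
        "169.254.0.0/16",
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "::1/128",
    )
]


def egress_allowed(host: str, resolved_ips: list[str]) -> bool:
    """Default-deny: the host must be allowlisted and no resolved address
    may fall inside a denied range."""
    if host.lower().rstrip(".") not in ALLOWED_HOSTS:
        return False
    if not resolved_ips:
        return False  # nothing resolved: refuse rather than guess
    for raw in resolved_ips:
        try:
            ip = ipaddress.ip_address(raw)
        except ValueError:
            return False  # unparseable address: fail closed
        if any(ip in net for net in DENIED_NETS):
            return False
    return True
```

Every failure path returns `False`, so a parsing error or an empty DNS answer never turns into an accidental allow.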

Secret flow + data minimization

Keep the sealed-box + broker design: model creds resolved at dispatch (project→org), Fernet at rest, sealed to the worker pubkey per job, broker sidecar holds the real key so the agent env only ever has a dummy + BROKER_URL (swarm/src/sandbox.py broker plan, crypto/sealed.py, harness runtime files {{SECRET:NAME}}/{{BROKER_URL}}).
H0 RESULT (2026-06-21) — sealing IS fully wired and fail-closed; the wire is safe. Traced live: seal() runs server-side at claim time on POST /orch/seal (model creds, orch.py:648) and POST /orch/bundle/redeem (bundle key, orch.py:745), each requiring the worker’s enrollment-pinned 32-byte pubkey (else ok=false, no payload). The /orch/poll response carries no secret — only a cred_request intent (credential_name + project_id), a source_bundle boolean, and a one-time bundle token (orch.py:233-255). The orchestrator wires BackendClient.seal as the claim-time callback (swarm/src/http.py); the worker unseals with its private key (swarm/src/security.py). No cleartext fallback if the pubkey is absent — the job just runs without the secret. The survey’s “exercised only by tests / cleartext on the wire” was a stale sealed.py docstring (now corrected), not a real gap.
Data minimization for Tier B (T2): an untrusted host can read anything it decrypts. So BYO workers get only their own operator’s data; sensitive / cross-tenant secrets + source never route to pool:byo. Broker-only network keeps the key out of the agent, but not out of a malicious host — don’t oversell it.

Result trust / attestation

Today results are trusted because they arrived over an orchestrator bearer token — no proof the worker ran the job (survey §7). For untrusted hosts that’s not enough. Add, in order of cost:

Signed results. Worker signs (job_id, input_sha, output_sha, rc) with the key already pinned at enrollment (pubkey pinning + optional fingerprint exist). Pond verifies the signature + that input_sha matches the bundle it issued. Cheap; closes result fabrication by anyone other than the keyholder.
Output integrity + bounds. Hash the artifact set; sequence + size-cap the stdout/stderr chunks (today they’re byte-counted but unbounded and unordered) so a worker can’t DoS via gigabytes of output.
Reproducibility / quorum (Tier B sensitive). Re-run high-stakes jobs on a Tier-A worker (or a second BYO) and compare output hashes; divergence → distrust.
Remote attestation (Tier C). SEV-SNP/TDX quote bound to the guest measurement, verified before secret release. The real fix for “trust a result from hardware I don’t own.”
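The "output integrity + bounds" item above can be sketched as a small receiver-side collector: chunks must arrive in sequence, the total is size-capped, and a running SHA-256 yields a value that could serve as `output_sha`. The `OutputCollector` class, its method names, and the choice of SHA-256 are assumptions for illustration; signature verification is deliberately left out of this sketch.

```python
import hashlib


class OutputCollector:
    """Accept stdout/stderr chunks in strict order, up to a byte cap."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.next_seq = 0
        self.total = 0
        self._digest = hashlib.sha256()

    def accept(self, seq: int, chunk: bytes) -> None:
        # Reject gaps, replays, and reordering.
        if seq != self.next_seq:
            raise ValueError(
                f"out-of-order chunk: got {seq}, expected {self.next_seq}"
            )
        # Reject before buffering so a worker cannot exhaust storage.
        if self.total + len(chunk) > self.max_bytes:
            raise ValueError("output size cap exceeded")
        self._digest.update(chunk)
        self.total += len(chunk)
        self.next_seq += 1

    def output_sha(self) -> str:
        """Hex digest over all accepted chunks, in order."""
        return self._digest.hexdigest()
```

Because the digest is order-sensitive, the same hash can be compared across a re-run for the quorum check, assuming the job's output is deterministic.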

Resource governance (per-run quotas)

Make the optional Docker knobs enforced defaults, add what’s missing:

cgroup defaults per profile: memory, cpus, pids-limit (fork-bomb), block-io.
ulimits: nofile, nproc, core=0, stack — none today.
disk quota on the rw copy + tmpfs caps (checkout extraction can fill disk).
wall-clock budget per run (not just per-stage timeout) → auto-cancel.
cost ceiling: enforce max_cost_cents pre-emptively — the telemetry rollup already computes run cost; cancel when it crosses the budget instead of only recording it after the fact.
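The wall-clock and cost items above amount to a check that runs while the job is in flight instead of after it. A minimal sketch, assuming a hypothetical `RunBudget` object that a supervisor loop polls; only `max_cost_cents` is a name taken from this plan, and how cost updates reach the object is left open.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunBudget:
    max_wall_seconds: float
    max_cost_cents: int
    started_at: float = field(default_factory=time.monotonic)
    spent_cents: int = 0

    def record_cost(self, cents: int) -> None:
        """Add newly observed spend (e.g. from a telemetry rollup)."""
        self.spent_cents += cents

    def violation(self, now: Optional[float] = None) -> Optional[str]:
        """Return a cancel reason if a budget is exhausted, else None."""
        now = time.monotonic() if now is None else now
        if now - self.started_at >= self.max_wall_seconds:
            return "wall-clock budget exceeded"
        if self.spent_cents >= self.max_cost_cents:
            return "cost ceiling reached"
        return None
```

A supervisor would call `violation()` on each tick and cancel the run on the first non-`None` reason; `time.monotonic` is used so wall-clock adjustments on the host cannot extend the budget.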

What keeps it a functional, modifiable harness

The hardening is designed around the harness, not against it:

rw throwaway checkout copy → edits, builds, node_modules, generated files all work; discarded after the run (rw-checkout-copy posture exists).
allowlist egress incl. registries + git → pip/npm/cargo install and git operations succeed; only exfil/pivot is blocked.
editable OCI image (Kata/gVisor) → operators bake/modify the toolchain via a Dockerfile; per-profile image override already allowed.
harness runtime files → the agent CLI (claude/codex) + its config land in $HOME via the existing {{MODEL}}/{{BROKER_URL}}/{{SECRET:NAME}} templating.
generous-but-bounded resources + wall-clock + cost ceiling → big enough for real builds, capped so a run can’t run away.
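To illustrate the runtime-file templating mentioned above, here is a small sketch of a renderer for the three placeholder forms. The placeholder syntax comes from this plan; the `render_runtime_file` function, the regular expression, and the choice to substitute a dummy value for every `{{SECRET:NAME}}` (consistent with the broker holding the real key) are assumptions, not the existing implementation.

```python
import re

# Matches {{MODEL}}, {{BROKER_URL}}, and {{SECRET:NAME}}.
_PLACEHOLDER = re.compile(r"\{\{(MODEL|BROKER_URL|SECRET:[A-Za-z0-9_]+)\}\}")


def render_runtime_file(
    template: str,
    model: str,
    broker_url: str,
    dummy_secret: str = "dummy-key",
) -> str:
    """Fill placeholders so the agent only ever sees a dummy credential."""

    def substitute(match: re.Match) -> str:
        token = match.group(1)
        if token == "MODEL":
            return model
        if token == "BROKER_URL":
            return broker_url
        # Any {{SECRET:NAME}}: never write the real key into the sandbox.
        return dummy_secret

    return _PLACEHOLDER.sub(substitute, template)
```

Anything that does not match one of the three forms is left untouched, so an unknown placeholder stays visible in the rendered file instead of being silently dropped.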

Implementation status

Everything in this model is shipped and fail-closed today: trust-tier routing, the three sandbox profiles, the gVisor / Kata-Firecracker isolation backends (fail-closed when the runtime is absent — never a silent downgrade to runc), the default-deny egress allowlist (per-job --internal network + operator-editable rules), per-job credential sealing + the broker sidecar, encrypted source bundles, signed result attestation (Ed25519, pinned at enrollment), and per-run resource / wall-clock / cost governance. The mechanisms are specified in docs/specs/.

The one deferred piece is a confidential tier (SEV-SNP / TDX with remote-attestation-gated secret release), for running sensitive workloads on a fully untrusted host — additive, and only needed if that’s a hard requirement.