An agent operating system for coding runs: turn a rough request into a structured run.json,
evidence-backed phases, one ready-to-paste /goal, mechanical gates, and a report an engineer can inspect.
$ /supergoal redesign billing settings
Stage 0 memory + tools loaded
Stage 2 repo recon completed
Stage 4 sliced into 5 phases
Stage 5 compiled run.json
Stage 6 kernel validated
PREFLIGHT_GREEN
/goal "Execute .supergoal/billing-settings-H7x3 until AUDIT_COMPLETE, RUN_REPORT_WRITTEN, and SUPERGOAL_RUN_COMPLETE."
Supergoal is not a framework, server, or UI library. It is a skill package that gives Claude Code and Codex a disciplined run kernel for non-trivial software work: a manifest, scoped phases, command evidence, failure events, mechanical gates, and a final audit.
The key move is simple: /goal carries the short end-state, while the real contract lives on disk.
The executor keeps returning to run.json, phase specs, evidence files, and the protocol instead of trusting chat memory.
v1 gives every run a small operating system: manifest, event stream, evidence vault, gates, audit, and report.
run.json: Canonical state for phases, command ids, allowed paths, criteria classes, deliverables, and status.
events.jsonl: Append-only black box recorder for starts, gates, failures, audits, and report generation.
evidence/: Command logs, diffs, screenshots, and proof files the phase gate can inspect.
sg.py: Standard-library kernel for validation, event recording, phase gates, audit, resume, and reports.
Each layer removes one class of agent failure: vague plans, invisible work, unbounded edits, weak recovery, and unverifiable completion.
The planner writes run.json before any execution. Phases, dependencies, allowed paths, command ids, deliverables, and trust debt become a contract, not a suggestion.
{
  "schema_version": "1.0",
  "phase": {
    "id": 1,
    "allowed_paths": ["src/auth/"],
    "commands": ["test"]
  }
}
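To make "a contract, not a suggestion" concrete, here is a minimal sketch of the kind of shape check a kernel could run on that excerpt. It is not the validator in sg.py: the function name `validate_manifest` is hypothetical, and it checks only the four fields visible in the excerpt, since the full schema is not shown here.

```python
import json

# Hypothetical helper, not part of sg.py: it checks only the fields
# that appear in the manifest excerpt above.
def validate_manifest(manifest: dict) -> list:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    if not isinstance(manifest.get("schema_version"), str):
        problems.append("schema_version must be a string")
    phase = manifest.get("phase")
    if not isinstance(phase, dict):
        problems.append("phase must be an object")
        return problems
    if not isinstance(phase.get("id"), int):
        problems.append("phase.id must be an integer")
    for key in ("allowed_paths", "commands"):
        value = phase.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"phase.{key} must be a list of strings")
    return problems

example = json.loads(
    '{"schema_version": "1.0", "phase": {"id": 1, '
    '"allowed_paths": ["src/auth/"], "commands": ["test"]}}'
)
print(validate_manifest(example))  # → []
```

Returning a list of problems rather than raising on the first one lets a planner fix every gap in a single pass.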
Command output, diff summaries, screenshots, and audit notes live in evidence/phase-N/. The transcript can summarize proof; the run keeps the proof.
evidence/
`-- phase-3/
    |-- commands/test.log
    |-- diffs/summary.txt
    `-- screenshots/mobile.png
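A gate can only "keep the proof" if it checks that the files are really there. The sketch below assumes the `evidence/phase-N/` layout shown above; the helper name `missing_evidence` and the rule that empty files count as missing are illustrative assumptions, not documented sg.py behavior.

```python
import tempfile
from pathlib import Path

# Hypothetical helper: assumes the evidence/phase-N/ layout shown above.
def missing_evidence(run_root, phase, required):
    """Return the required evidence files that are absent or empty for one phase."""
    phase_dir = Path(run_root) / "evidence" / f"phase-{phase}"
    missing = []
    for rel in required:
        target = phase_dir / rel
        # Assumption: an empty log proves nothing, so treat it as missing.
        if not target.is_file() or target.stat().st_size == 0:
            missing.append(rel)
    return missing

# Demo in a throwaway directory: one log exists, the diff summary does not.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "evidence" / "phase-3" / "commands" / "test.log"
    log.parent.mkdir(parents=True)
    log.write_text("exit 0\n", encoding="utf-8")
    result = missing_evidence(tmp, 3, ["commands/test.log", "diffs/summary.txt"])

print(result)  # → ['diffs/summary.txt']
```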
sg.py gate-phase checks required evidence, command exit markers, changed files, and trust debt before SUPERGOAL_PHASE_DONE.
PHASE_GATE_VERIFY pass
TRUST_DEBT phase 3: 1/8 trust-prior (12%)
SCOPE_DRIFT: none
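The scope part of that gate can be sketched as a comparison between changed files and a phase's `allowed_paths`. This is an illustration under assumptions rather than the sg.py implementation: the function name `scope_drift` is hypothetical, entries are treated as directory or file paths, not glob patterns, and the output format for the drift case is invented for the example.

```python
from pathlib import PurePosixPath

# Hypothetical helper: treats each allowed_paths entry as a directory
# (or exact file) path, not as a glob pattern.
def scope_drift(changed_files, allowed_paths):
    """Return the changed files that fall outside every allowed path."""
    allowed = [PurePosixPath(p) for p in allowed_paths]
    drift = []
    for name in changed_files:
        path = PurePosixPath(name)
        # Compare whole path components so "src/authz/" does not match "src/auth/".
        if not any(path == a or a in path.parents for a in allowed):
            drift.append(name)
    return drift

changed = ["src/auth/login.py", "src/authz/roles.py"]
drift = scope_drift(changed, ["src/auth/"])
print("SCOPE_DRIFT: " + (", ".join(drift) if drift else "none"))
# → SCOPE_DRIFT: src/authz/roles.py
```

Matching on path components instead of string prefixes is the main design choice: a plain `startswith("src/auth")` test would wrongly accept `src/authz/roles.py`.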
Failures write events. Resume does not ask the next agent to infer where the run died; it prints the exact next phase, gap, or blocked reason.
{"type":"failure.probe","phase":2,"status":"fail"}
{"type":"audit.fail","data":{"gaps":["missing deliverable"]}}
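An append-only JSON Lines log is what makes this kind of resume possible. The sketch below reuses the two event shapes shown above; the helper names (`append_event`, `read_events`, `last_failure`) are hypothetical, and the real resume logic in sg.py may weigh more than the most recent failure event.

```python
import json
import tempfile
from pathlib import Path

def append_event(log_path, event):
    """Append one event as a single JSON line; existing lines are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, separators=(",", ":")) + "\n")

def read_events(log_path):
    """Load every event from the log, skipping blank lines."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]

def last_failure(events):
    """Return the most recent failure-type event, or None.

    Simplification: this does not check whether a later event resolved it.
    """
    for event in reversed(events):
        if event.get("type", "").startswith(("failure.", "audit.fail")):
            return event
    return None

with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "events.jsonl"
    append_event(log, {"type": "failure.probe", "phase": 2, "status": "fail"})
    append_event(log, {"type": "audit.fail", "data": {"gaps": ["missing deliverable"]}})
    events = read_events(log)
    failure = last_failure(events)

print(failure["type"])  # → audit.fail
```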
report.html turns a long autonomous session into a review artifact: phase status, event history, evidence counts, and the boundary between mechanical proof and human judgment.
AUDIT_COMPLETE
RUN_REPORT_WRITTEN .supergoal/run/report.html
SUPERGOAL_RUN_COMPLETE
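The ordering of those three markers can be checked mechanically. The following sketch assumes each marker is the first token on its own output line, as in the transcript above; the function name `run_is_complete` is hypothetical and is not taken from sg.py.

```python
REQUIRED_BEFORE_COMPLETE = ("AUDIT_COMPLETE", "RUN_REPORT_WRITTEN")
FINAL_MARKER = "SUPERGOAL_RUN_COMPLETE"

# Hypothetical helper: assumes each marker is the first token on its line.
def run_is_complete(output_lines):
    """True only if the final marker appears after both required markers."""
    seen = set()
    for line in output_lines:
        tokens = line.split()
        marker = tokens[0] if tokens else ""
        if marker == FINAL_MARKER:
            return all(m in seen for m in REQUIRED_BEFORE_COMPLETE)
        if marker in REQUIRED_BEFORE_COMPLETE:
            seen.add(marker)
    return False

transcript = [
    "AUDIT_COMPLETE",
    "RUN_REPORT_WRITTEN .supergoal/run/report.html",
    "SUPERGOAL_RUN_COMPLETE",
]
print(run_is_complete(transcript))                    # → True
print(run_is_complete(["SUPERGOAL_RUN_COMPLETE"]))    # → False
```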
Supergoal makes the planning explicit before execution starts, then keeps execution bound to a structured run contract.
Memory, tools, repo state, risks.
run.json, events, evidence vault.
Executor resumes from disk, not chat memory.
Commands, scope, evidence, trust debt.
Retry, fix spec, or blocked handoff.
report.html exposes what passed and what still needs judgment.
Detect memory, tools, repo state, active runs, and whether the work is greenfield or brownfield.
Scan the codebase and environment so the plan reflects the project instead of guessing from the prompt.
Create run.json, markdown mirrors, phase specs, and the evidence vault.
Supergoal prints one /goal line for you to paste. Slash commands remain user-triggered, so the handoff does not pretend the skill can start the run itself.
Every phase must pass evidence, command, scope, and trust-debt checks before done.
The protocol is intentionally blunt: phases must be measurable, evidence must exist on disk, and the final audit checks the working tree rather than trusting the conversation.
Required files and command logs must exist before a phase can print SUPERGOAL_PHASE_DONE.
Changed files are checked against each phase's allowed_paths; drift prints SCOPE_DRIFT.
Criteria are labeled mechanical, human, or trust-prior; weak proof is visible.
After audit, Supergoal writes report.html with phases, events, evidence counts, and trust debt.
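Making weak proof visible comes down to counting criteria by class. The sketch below tallies the three labels named above and formats a line like the sample gate output; the function name `trust_debt_line` is hypothetical. The truncating percentage is an assumption, chosen because the sample shows 12% for 1/8; the actual rounding rule is not documented here.

```python
from collections import Counter

# Hypothetical helper: formats a line shaped like the sample gate output.
def trust_debt_line(phase, criteria_classes):
    """Summarize how many of a phase's criteria rest on trust-prior proof."""
    counts = Counter(criteria_classes)
    total = len(criteria_classes)
    prior = counts["trust-prior"]
    # Assumption: truncate the percentage, which matches "1/8 ... (12%)".
    percent = 100 * prior // total if total else 0
    return f"TRUST_DEBT phase {phase}: {prior}/{total} trust-prior ({percent}%)"

classes = ["mechanical"] * 5 + ["human"] * 2 + ["trust-prior"]
print(trust_debt_line(3, classes))
# → TRUST_DEBT phase 3: 1/8 trust-prior (12%)
```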
run.json, state, protocol, evidence, and phase specs live on disk.
/goal drives the whole run.
AUDIT_COMPLETE and RUN_REPORT_WRITTEN must appear before completion.
The report is generated under the run root, not hosted externally. It turns a long agent session into a reviewable state summary.
| Phase | Status | Gate | Evidence |
|---|---|---|---|
| Foundation | complete | pass | 9 files |
| States & edges | complete | pass | 12 files |
| Polish & Harden | complete | pass | 8 files |
Supergoal should make clean success obvious, but it should make imperfect runs even more inspectable.
Every phase has evidence, commands exited cleanly, audit found no gaps, report was written.
AUDIT_COMPLETE
Audit identified a missing deliverable, wrote a focused fix spec, reran, and completed cleanly.
audit.fail -> audit.pass
The run stops with probe history and exact next action rather than pretending the task is done.
FAILURE_HANDOFF
The phase tried to touch a file outside allowed_paths; the gate flagged it before completion.
SCOPE_DRIFT
This site is static. The repo can publish it from site/ with GitHub Actions and no frontend build step.
/plugin marketplace add https://github.com/robzilla1738/supergoal.git
/plugin install supergoal@supergoal
/reload-plugins
mkdir -p ~/.codex/skills
git clone https://github.com/robzilla1738/supergoal /tmp/supergoal-clone
cp -R /tmp/supergoal-clone/skills/supergoal ~/.codex/skills/
rm -rf /tmp/supergoal-clone
git add site .github/workflows/pages.yml
git commit -m "Add GitHub Pages site"
git push origin main
# In GitHub: Settings -> Pages -> Source -> GitHub Actions
Supergoal ships the skill and its runtime assets. Tests, docs, and this website stay in the repository.
supergoal/
├── skills/supergoal/
│ ├── SKILL.md
│ ├── scripts/
│ │ ├── claim-run.sh
│ │ ├── sg.py
│ │ └── repo-state.sh
│ ├── templates/
│ │ ├── ROADMAP.md
│ │ ├── STATE.md
│ │ └── PROTOCOL.md
│ └── references/
├── tests/
│ ├── sg-run-kernel.test.sh
│ ├── claim-run.test.sh
│ └── repo-state.test.sh
├── site/
└── .github/workflows/pages.yml
SKILL.md: The main instruction surface that defines stages, intake, recon, plan review, and handoff behavior.
sg.py: Validates manifests, records events, gates phases, audits deliverables, resumes runs, and writes reports.
repo-state.sh: Checks deliverables against the complete working tree, including untracked files.
PROTOCOL.md: Defines the autonomous phase loop, evidence vault, gate commands, recovery blocks, audit, and report markers.