█▀▀▄ █▀▀▄ ▄▀▀▄ ▀█▀ █  █ ▄▀▀▄ █▄  █
█▄▄█ █▄▄▀ █  █  █  █▄▄█ █  █ █ █ █
█    █  █ ▀▄▄▀  █  █  █ ▀▄▄▀ █  ▀█

docs-first project generator for AI-assisted Python development

A batteries-included uv Python project generator with AI alignment tooling that makes AI-powered development dramatically faster.

$ uv tool install "git+https://github.com/jackedney/prothon"
uv · ruff · ty · claude · pytest · hypothesis · copier · poe · bandit · mutmut · vulture · complexipy
Core Insight

AI alignment isn't about better prompts — it's about giving AI a durable source of truth and a verification loop that catches when code drifts from intent.

01 Modern Python Scaffolding

One command. Full toolchain. Same quality bar for human and AI code. 9 quality tools on every commit. copier update pulls upstream improvements without losing local changes.

poe check — single gate for hooks, CI, and AI
9 quality tools, including ruff, ty, pytest, hypothesis, mutmut, bandit, vulture, complexipy
uv, pyproject.toml-only, src/ layout, py.typed marker
copier update for upstream template improvements
single quality gate: pre-commit, CI, and AI agents all run the same poe check, which fans out to ruff, ty, pytest, hypothesis, mutmut, bandit, vulture, and complexipy; a sketch of the task wiring follows.
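A sketch of how the generated pyproject.toml might wire that gate together, using poethepoet's sequence-task syntax. Task names and flags here are illustrative, not the template's literal contents; hypothesis runs through pytest, so it needs no task of its own.

[tool.poe.tasks]
lint       = "ruff check ."     # style + lint
typecheck  = "ty check"         # static types
test       = "pytest"           # unit + hypothesis property tests
security   = "bandit -r src"    # security scan
deadcode   = "vulture src"      # unused code
complexity = "complexipy src"   # cognitive complexity
mutation   = "mutmut run"       # mutation testing
check      = ["lint", "typecheck", "test", "security", "deadcode", "complexity", "mutation"]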
prothon new my-project
my-project/
├── pyproject.toml          # uv, poethepoet, all tool config
├── .python-version
├── .gitignore
├── .pre-commit-config.yaml  # 9 quality tools
├── AGENTS.md               # canonical AI instructions
├── CLAUDE.md               # symlink to AGENTS.md
├── docs/
│   ├── SPEC.md              # requirements (highest authority)
│   ├── DESIGN.md            # architecture (traces to SPEC)
│   └── PATTERNS.md          # conventions (must not contradict SPEC/DESIGN)
├── src/my_project/
│   ├── __init__.py
│   └── py.typed              # PEP 561 marker
└── tests/

02 Hierarchical Documentation

Three documents with strict authority. Higher overrides lower. Each has a dedicated conversational skill that presents one decision at a time and hard-rejects content belonging at a different level.

Skills hard-reject content from other levels
SPEC change triggers DESIGN, then PATTERNS review
Conflicts resolve at doc level before code
After design or patterns, harmonizer cross-references all three levels
authority chain: SPEC.md (highest) → DESIGN.md (traces up to SPEC) → PATTERNS.md (must not conflict); changes cascade top-down. An illustrative trace follows.
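For illustration only (hypothetical content, not template output), one requirement traced down the chain might read:

docs/SPEC.md      →  "R3: users authenticate with short-lived JWTs"
docs/DESIGN.md    →  "Auth service (traces to R3): PyJWT, 15-minute access tokens"
docs/PATTERNS.md  →  "auth code lives in src/auth/; every handler gets a failure-path test"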

03 Design Workflow

Seven commands, run in sequence. Each launches an interactive session scoped to a single concern. Each produces a versioned artifact in the repo.

new — scaffold a fresh project from the template
init — add prothon to an existing project
spec — "What are you building, who is it for, and why?"
design — researches tech, presents trade-offs
patterns — code style, testing, conventions
execute — fresh subagents, verifies promises
compliance — evidence tables, code vs docs
workflow: new / init (bootstrap) → spec (what & why) → design (architecture) → patterns (code style) → execute (build it) → compliance (verify it). A sample session follows.
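Assuming each step runs as a prothon subcommand, as new does in section 01 (the command names come from the list above; the annotations are illustrative):

$ prothon new my-project    # or: prothon init, inside an existing repo
$ prothon spec              # interactive Q&A → docs/SPEC.md
$ prothon design            # trade-off research → docs/DESIGN.md
$ prothon patterns          # conventions → docs/PATTERNS.md
$ prothon execute           # writes the promise file, then fresh subagents build
$ prothon compliance        # evidence tables: code vs docs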

04 Drift Detection & Reconciliation

After design and patterns commands, subagent systems fire automatically to maintain doc consistency and generate reference material.

Harmonizer catches contradictions, scope creep, unchosen tech
Amends lower doc. SPEC never touched.
Tech Researcher generates reference skills from Context7, web, training data
Compliance runs at checkpoints: PASS/FAIL/PARTIAL with file:line evidence
auto-fire gates: harmonizer (doc ↔ doc, after design/patterns) · tech researcher (generates skills, after design). SPEC is never amended. An illustrative compliance entry follows.
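For illustration (the report layout here is hypothetical; only the PASS/FAIL/PARTIAL verdicts and file:line evidence come from the docs above):

SPEC §3 JWT auth       PASS     src/auth/handler.py:42 · tests/test_auth.py:10
SPEC §4 rate limiting  PARTIAL  middleware present, no tests (src/middleware.py:15)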

05 Skills Collection

After DESIGN is written, tech researcher generates reference skills for your exact stack. Queries Context7 live docs, falls back to web search, then training knowledge. Current material, not generic training data.

tech-* — library usage, idioms, gotchas, version-specific APIs
style-* — naming conventions, import organization, type annotations
optim-* — performance patterns, GPU batching, subprocess management
domain-* — field-specific concepts: geospatial, ML, finance, etc.
Auto-loaded during execution — no manual context switching

Example: ML + geospatial project

.agents/skills/
├── tech-pytorch.md
├── tech-fastapi.md
├── tech-polars.md
├── style-python.md
├── optim-gpu.md
└── domain-geospatial.md
research pipeline: DESIGN.md → Context7, then web → tech-* / style-* / optim-* / domain-* skills → auto-loaded during execute. A hypothetical excerpt follows.
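As a hypothetical excerpt (invented here for illustration, not actual generated output), a tech-* skill might open like:

# tech-polars.md
- Prefer lazy frames: pl.scan_parquet(...) over pl.read_parquet(...); call collect() once, at the end
- Version-specific: groupby() was renamed group_by() in polars 0.19
- Gotcha: with_columns() returns a new frame; nothing mutates in place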

06 Execution Promises

Before execution starts, the planner writes change_promise.toml — a contract that declares exactly what each task will produce. This turns open-ended code generation into a bounded, verifiable process.

Files to create, modify, remove — declared upfront
Line predictions force thinking through scope
Checked against git with ±30% or ±30 lines tolerance (sketched below)
3 attempts per task, fresh context each
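A minimal sketch of that tolerance rule in Python, assuming the looser of the two bounds wins; this illustrates the check as described above, not prothon's actual implementation:

def promise_satisfied(predicted: int, actual: int) -> bool:
    # Pass when the git-measured line delta is within 30% of the
    # prediction or within 30 absolute lines, whichever is looser.
    return abs(actual - predicted) <= max(round(predicted * 0.30), 30)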
1 Plan — read all docs + skills, scan the codebase for gaps, write the promise file
2 Execute — fresh subagent per task: implement → check → commit, verify promise (3 retries)
3 Verify — compliance check, full docs vs code, clean up the promise file

Why line predictions? Requiring the AI to predict line counts forces thoughtful scoping. If it predicts 50 lines but writes 300, either the plan was sloppy or execution went sideways: at 50 predicted, the tolerance rule accepts roughly 20 to 80 lines, so 300 fails the gate outright.

Example

docs/change_promise.toml
[metadata]
base_commit = "a3f2c1b"

[[tasks]]
title = "Add auth handler"
goal = "Implement JWT auth"
success_criteria = "Tests pass"
files_to_create = ["src/auth/handler.py"]
files_to_modify = ["src/__init__.py"]
files_to_remove = []
expected_lines_added = 85
expected_lines_removed = 0
context_files = ["src/config.py"]
doc_sections = ["docs/DESIGN.md#auth"]
reference_skills = ["tech-pyjwt"]
dependencies = []
completed = false
attempts = 0

[[tasks]]
title = "Add auth tests"
goal = "Test auth flows"
success_criteria = "100% coverage"
files_to_create = ["tests/test_auth.py"]
files_to_modify = []
files_to_remove = []
expected_lines_added = 120
expected_lines_removed = 0
context_files = ["src/auth/handler.py"]
doc_sections = []
reference_skills = []
dependencies = [0]
completed = false
attempts = 0

Don't rely on the AI remembering.
Make the instructions part of the repository.

$ uv tool install "git+https://github.com/jackedney/prothon"