Process artifact Report artifact 2026-07-02

Factory Run Schema v1

A specialist factory needs proof folders, not vibes. This schema defines the local run directory, required metrics, and decision vocabulary that every public artifact should eventually satisfy.

factory
evals
reports

Headline Numbers

Required files

8 config, dataset, train log, evals (baseline and candidate), report, artifact, decision

Decisions

6 ship, reject, retry-data, retry-training, retry-eval, park

First-class outputs

5 data, training, eval, package, report

Competitive Context

System	Metric	Score	Size / Class	Comparable?	Readout
TinyGPT factory schema	required public run files	8	repo-local contract	Direct	Defines the minimum evidence bundle each future public model artifact must carry.
Ad hoc model card only	before/after reproducibility	weak	single document	Directional	Useful for release notes, but insufficient for a factory claim without eval JSON, decision, and blockers.

Direct rows share this artifact's eval setup. Directional rows are useful market context but should not be read as leaderboard claims.

Run folder contract

File	Purpose	Public relevance
config.json	Target, base, method, thresholds	Explains what was attempted
dataset.json	Sources, rows, filtering, heldout	Provenance
eval-baseline.json	Frozen baseline result	Before number
eval-candidate.json	Candidate result	After number
decision.json	Ship/reject/retry call	Honest release status
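
The contract above can be checked mechanically. The following is a minimal sketch, not the factory's actual tooling: it covers only the five files listed in this table and the six-word decision vocabulary from the headline numbers. The function name `check_run_folder` and the top-level `"decision"` key inside decision.json are assumptions made for illustration; the schema text does not specify the internal layout of any file.

```python
import json
from pathlib import Path

# File names taken from the run folder contract table.
REQUIRED_FILES = (
    "config.json",
    "dataset.json",
    "eval-baseline.json",
    "eval-candidate.json",
    "decision.json",
)

# Decision vocabulary taken from the headline numbers.
DECISIONS = {"ship", "reject", "retry-data", "retry-training", "retry-eval", "park"}


def check_run_folder(run_dir):
    """Return a list of problems; an empty list means the folder passes this check."""
    run_dir = Path(run_dir)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (run_dir / name).is_file()
    ]

    decision_path = run_dir / "decision.json"
    if decision_path.is_file():
        try:
            payload = json.loads(decision_path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            problems.append("decision.json is not valid JSON")
        else:
            # ASSUMPTION: the call is stored under a top-level "decision" key.
            value = payload.get("decision") if isinstance(payload, dict) else None
            if value not in DECISIONS:
                problems.append(f"unknown decision: {value!r}")
    return problems
```

Returning a list of problems rather than raising on the first failure lets a report show every gap in a run folder at once, which suits the release-blocker style used in this artifact.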

Release Blockers

Needs a canonical rendered example

The schema is defined, but the website does not yet show a complete run folder as the public example.

Unblock: Promote the SQL routed result into a small report-only rendered artifact.

Next Release Action

Turn the SQL routed result into the first website-native factory report that follows this schema.