Leaderboard

Small models, browser-runnable, ranked. Each entry is a .tinygpt checkpoint you can load and try in one click. The launch benchmark is TinyStories PPL — perplexity on a 50-story holdout. Lower is better. More benchmarks (sort-6, reverse-16, shakespeare-ppl) land next.

TinyStories PPL. Held-out perplexity over 50 stories from a slice of the TinyStories corpus that wasn't seen during gallery training. Score = exp(mean per-byte cross-entropy). Browser-trainable models with ~1–10M params can credibly compete (Eldan & Li 2023 showed 1M-param models cross the coherence threshold on this distribution).
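A minimal sketch of how a score like this could be computed, assuming cross-entropy is measured in nats and averaged over all bytes pooled across the holdout stories (the page does not say whether the mean is pooled or per-story). The names `per_byte_perplexity`, `score_holdout`, and `nll_fn` are hypothetical and not part of any TinyGPT API; `nll_fn` stands in for whatever returns a model's total negative log-likelihood for a byte string.

```python
import math
from typing import Callable, Iterable


def per_byte_perplexity(total_nll_nats: float, total_bytes: int) -> float:
    """Return exp(mean per-byte cross-entropy), with cross-entropy in nats."""
    if total_bytes <= 0:
        raise ValueError("need at least one byte to score")
    return math.exp(total_nll_nats / total_bytes)


def score_holdout(stories: Iterable[str],
                  nll_fn: Callable[[bytes], float]) -> float:
    """Pool NLL and byte counts over all stories, then exponentiate the mean.

    Assumption: nll_fn(data) returns the summed negative log-likelihood
    (in nats) that the model assigns to the UTF-8 bytes of one story.
    """
    total_nll = 0.0
    total_bytes = 0
    for story in stories:
        data = story.encode("utf-8")
        total_nll += nll_fn(data)
        total_bytes += len(data)
    return per_byte_perplexity(total_nll, total_bytes)


def uniform_byte_nll(data: bytes) -> float:
    """Baseline: a model that assigns probability 1/256 to every byte."""
    return len(data) * math.log(256)
```

Under these assumptions the uniform baseline scores about 256, the worst a model that spreads probability evenly over byte values can do, which gives a rough ceiling to compare leaderboard scores against.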
# Model Params Score Train
Train one in your browser tab. Pick a corpus, hit Start, and wait minutes, not days. When the model converges, hit "Download model" to get a .tinygpt file. The gallery cards above are all browser-trained 9.6M-param models; a sort-6 / reverse-16 model can be 100× smaller and beat them. Submission flow ships next. For now, open a PR adding your model to data/gallery/ and we'll score it.
Train one →