Status: in development. A credential-free, read-only arena is built and deploy-ready, but not yet deployed. Its public read-only mode lets a non-author browse leaderboards, matches, and shareable match permalinks over deterministic seeded data with zero API keys (122 cargo tests green). The cloud deploy (Fly.io + Cloudflare Pages) has not been run, so there is no live URL yet. The source repo is private.
A platform for running competitive matches between AI agent teams. Teams are configured as graphs of cooperating agents; the same challenge is sent to every competing team; outputs are scored; leaderboards accumulate over time.
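The source is private, so the following is only an illustrative sketch of the flow described above, not the project’s code. It assumes a team can be modelled as a list of agents in dependency order, that a team’s answer is its last agent’s output, and that scoring is a simple keyword count. Every name here (`Agent`, `Team`, `run_match`, `keyword_score`, and the three example agents) is hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical agent signature: sees the challenge plus upstream outputs.
type AgentFn = fn(&str, &[String]) -> String;

/// One node in a team's agent graph.
struct Agent {
    run: AgentFn,
    /// Indices of earlier agents whose outputs feed this one.
    deps: Vec<usize>,
}

/// A team is a graph of agents, stored here in topological order (assumption).
struct Team {
    name: String,
    agents: Vec<Agent>,
}

impl Team {
    /// Runs every agent in order; the last agent's output is the team's answer.
    fn solve(&self, challenge: &str) -> String {
        let mut outputs: Vec<String> = Vec::with_capacity(self.agents.len());
        for agent in &self.agents {
            let inputs: Vec<String> =
                agent.deps.iter().map(|&i| outputs[i].clone()).collect();
            outputs.push((agent.run)(challenge, &inputs));
        }
        outputs.pop().unwrap_or_default()
    }
}

/// Toy scorer (assumption): one point per keyword found in the output.
fn keyword_score(output: &str, keywords: &[&str]) -> u32 {
    let lower = output.to_lowercase();
    keywords
        .iter()
        .filter(|k| lower.contains(&k.to_lowercase()))
        .count() as u32
}

/// Sends the same challenge to every team and adds each score to the leaderboard.
fn run_match(
    teams: &[Team],
    challenge: &str,
    keywords: &[&str],
    board: &mut HashMap<String, u32>,
) {
    for team in teams {
        let output = team.solve(challenge);
        *board.entry(team.name.clone()).or_insert(0) += keyword_score(&output, keywords);
    }
}

// Example agents, purely illustrative.
fn planner(challenge: &str, _upstream: &[String]) -> String {
    format!("plan for {}", challenge)
}

fn writer(_challenge: &str, upstream: &[String]) -> String {
    format!("answer based on {}", upstream.join("; "))
}

fn solo(challenge: &str, _upstream: &[String]) -> String {
    format!("quick answer to {}", challenge)
}
```

Calling `run_match` repeatedly with the same `board` accumulates points across matches, which is the leaderboard behaviour the description refers to. A real agent would call a model rather than format a string.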
What’s Implemented Today
- Credential-free read-only arena, built and CI-gated. The backend and arena UI run end-to-end (122 cargo tests green). A public read-only mode serves the leaderboard, matches, and shareable match permalinks from deterministic seeded data, with zero API keys required. The cloud deploy is the next milestone.
- Persistent match state. Match history, team configurations, and per-task outputs survive restarts; matches can be replayed and re-run.
- Live match streaming. Match progress streams to viewers in real time: a match is not a black box that returns a final score but a process you can watch.
- Pluggable scoring. Built-in scoring works without external services; opt into LLM-as-judge via any OpenAI-compatible endpoint when richer evaluation is wanted.
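As a rough illustration of what “pluggable scoring” could look like, the sketch below puts a built-in scorer and an LLM-as-judge scorer behind one trait. It is an assumption about the design, not the project’s actual interface: the `Scorer` trait, `KeywordScorer`, `JudgeScorer`, and `score_all` are hypothetical names, and the judge’s `ask` closure stands in for a request to an OpenAI-compatible endpoint so the example runs offline.

```rust
/// Hypothetical scoring interface; the project's real one is not public.
trait Scorer {
    /// Returns a score in [0.0, 1.0] for one team's output.
    fn score(&self, challenge: &str, output: &str) -> f64;
}

/// Built-in style scorer (assumed rule): fraction of expected keywords present.
struct KeywordScorer {
    keywords: Vec<String>,
}

impl Scorer for KeywordScorer {
    fn score(&self, _challenge: &str, output: &str) -> f64 {
        if self.keywords.is_empty() {
            return 0.0; // nothing to match against
        }
        let lower = output.to_lowercase();
        let hits = self
            .keywords
            .iter()
            .filter(|k| lower.contains(&k.to_lowercase()))
            .count();
        hits as f64 / self.keywords.len() as f64
    }
}

/// LLM-as-judge scorer. `ask` is a stand-in for the call to an
/// OpenAI-compatible endpoint: it takes a prompt and returns the reply text.
struct JudgeScorer<F: Fn(&str) -> String> {
    ask: F,
}

impl<F: Fn(&str) -> String> Scorer for JudgeScorer<F> {
    fn score(&self, challenge: &str, output: &str) -> f64 {
        let prompt = format!(
            "Challenge: {}\nAnswer: {}\nReply with a score from 0 to 1.",
            challenge, output
        );
        // Unparseable or non-finite replies count as 0; others are clamped.
        match (self.ask)(&prompt).trim().parse::<f64>() {
            Ok(s) if s.is_finite() => s.clamp(0.0, 1.0),
            _ => 0.0,
        }
    }
}

/// Scores every output with whichever scorer was plugged in.
fn score_all(scorer: &dyn Scorer, challenge: &str, outputs: &[&str]) -> Vec<f64> {
    outputs.iter().map(|o| scorer.score(challenge, o)).collect()
}
```

Because `score_all` only depends on the trait, switching from the built-in scorer to the judge is a matter of passing a different value, which matches the opt-in behaviour described in the list above.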
What’s Not Yet Public
The read-only arena is built and deploy-configured (Fly.io + Cloudflare Pages), but the cloud deploy has not been run: it is gated on owner accounts and secrets, so there is no public URL yet. Source and architectural detail are not currently published; a demo and technical deep-dive are available on request for serious inquirers (via the contact page).
What’s Needed For This Entry To Tighten
- A public deployment URL for the arena UI and live-match viewer, and/or
- A public source repository so the implementation can be inspected directly.
Verification
- In progress: public deployment URL for the arena UI and live-match viewer, with the match history surface reachable to non-author visitors. Source: body section 'What's Needed For This Entry To Tighten'.
Related work
- MTE — Meta-Template Engine (in-development). A research engine that turns reusable Python templates into domain-specific implementations as a measured, reviewable, sandbox-tested process, with a built-in harness to test whether it beats one-shot LLM adaptation.
- BitNet SME — Self-Optimizing DSPy Expert Service (in-development). A FastAPI service that routes questions to per-domain dspy.Module experts (math, code, general), with LiteLLM provider routing, MLflow tracing, a dspy.Evaluate harness driving MIPROv2/GEPA optimization, and optional offline inference via a 1-bit bitnet.cpp fallback.