EMPERO/02 — MODELS · huggingface.co/empero-ai · 6 MODELS · 6 VARIANTS

Frontier
on your GPU.

Reasoning distills from closed frontier models onto small open weights, led by Qwythos-9B-Claude-Mythos-5-1M (1M-token context, native tool use, GGUF v2 local builds) and Qwable-9B-Claude-Fable-5, with Qwythos-27B announced as the next larger Mythos model. Alongside them: pure-PyTorch ports of frontier architectures, so the original weights load, quantize and train without proprietary kernels. Everything runs on a single consumer GPU. Licensed Apache-2.0 or NVIDIA OML.

01 Flagship · Qwythos-9B

QWEN3.5-9B · GGUF V2 · 1M CONTEXT · APACHE-2.0

CURRENT FLAGSHIP · GGUF V2 LIVE · QWYTHOS-27B ANNOUNCED

A 9B that checks
its own work.

Qwythos-9B-Claude-Mythos-5-1M is our shipped flagship: a full-parameter reasoning model distilled from Claude Mythos 5 and Claude Fable traces, with a 1M-token context, native tool use, and a +34.3-point MMLU lift over its base (0.232 → 0.575). The v2 GGUFs are live now, with fixed Qwen3.5 runtime metadata, stronger embedded chat/tool templates, normal Q4/Q5/Q6/Q8/BF16 replacement files, explicit -MTP- variants for draft speculation, and a vision projector. Qwythos-27B is announced as the next larger Mythos model.

Model card on HF ↗

GGUF v2 for llama.cpp / Ollama ↗

Full eval transcripts ↗

+34.3 · MMLU vs base · 0.232 → 0.575

1M · token context · YaRN · default

v2 · GGUF refresh · templates · MTP · vision

27B · announced next · larger Mythos tier

Matched lm-evaluation-harness run (limit 100). Not every benchmark moved up — gpqa-diamond −0.05, arc-challenge roughly flat. Full table on the model card.
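The headline lift is plain subtraction on the two reported MMLU scores. The sketch below reproduces it and, as an aside, computes the worst-case binomial standard error for a 100-example task, which is one way to read a 0.05 move on a single task such as gpqa-diamond. It assumes that `limit 100` caps each task at 100 examples and that examples are independent; the function names are illustrative and are not part of lm-evaluation-harness.

```python
import math

def lift_in_points(base: float, tuned: float) -> float:
    """Difference between two accuracies, expressed in percentage points."""
    return (tuned - base) * 100

def worst_case_std_error(n: int) -> float:
    """Largest possible binomial standard error for n examples (reached at p = 0.5)."""
    return math.sqrt(0.25 / n)

mmlu_lift = lift_in_points(0.232, 0.575)   # the two MMLU scores reported above
noise_floor = worst_case_std_error(100)    # assumption: 100 examples per task

print(f"MMLU lift: +{mmlu_lift:.1f} points")
print(f"Worst-case standard error at n=100: {noise_floor:.3f}")
```

Under those assumptions, a 0.05 change on a 100-example task is on the order of one standard error of a single score, so it is hard to separate from sampling noise.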

Reason → tool → verify · real transcripts, condensed from the published eval set

PROMPT

How many prime numbers are there below 100,000? Verify with python.

TOOL CALLS

›python_executor → wrote and ran a primality test → stdout: 9592

<THINK> · REASONING · 2 rounds · 10.5s

This is a counting problem over a fixed range — the kind of thing I should verify rather than recall. I'll write a primality test, count every prime from 2 through 99,999, and run it in the python_executor rather than trusting a memorised figure.

FINAL ANSWER

There are 9,592 prime numbers below 100,000.

I verified this by computing all primes from 2 through 99,999 using a primality test and counting them.

→ Reaches for a tool to verify rather than recalling a number. Writes the primality test, runs it, reports the checked result.
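The check in this transcript is easy to reproduce. The sketch below is not the code the model wrote (the transcript only says it ran a primality test); it swaps in a Sieve of Eratosthenes, which gives the same count for a fixed range. The function name is illustrative.

```python
def count_primes_below(limit: int) -> int:
    """Count primes p with 2 <= p < limit using a Sieve of Eratosthenes."""
    if limit < 3:
        return 0
    is_prime = bytearray([1]) * limit      # is_prime[i] == 1 means "still a candidate"
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p < limit:
        if is_prime[p]:
            # Cross off every multiple of p, starting at p*p.
            is_prime[p * p :: p] = bytes(len(range(p * p, limit, p)))
        p += 1
    return sum(is_prime)

prime_count = count_primes_below(100_000)
print(prime_count)  # 9592
```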

02 Published models

6 ON HUGGING FACE · OPEN WEIGHTS

FIG. 02 — Qwythos-9B-Claude-Mythos-5-1M · Apache-2.0

Qwythos-9B-Claude-Mythos-5-1M

Our shipped flagship. A full-parameter reasoning model on a deeply uncensored Qwen3.5-9B base, post-trained on 500M+ tokens of Claude Mythos and Claude Fable traces with in-house chain-of-thought. Ships with a 1,048,576-token (1M) context via YaRN by default and native function calling, and self-corrects with tools (7/7 on hard factual prompts spanning math, cybersecurity, pharmacology and biochem). Versus the base, under a matched harness, in points: +34.3 MMLU, +30 gsm8k-strict, +19 gsm8k-flex. GGUF v2 adds fixed runtime metadata, MTP variants and vision-projector support; Qwythos-27B is announced as the next larger Mythos member.
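Function calling generally means the model emits a structured call that the host application executes and feeds back into the conversation. The sketch below shows only that host side. It assumes a JSON call with `name` and `arguments` fields and a hypothetical `python_executor` handler that runs code in a subprocess; the call format, registry and handler are illustrative assumptions, not the model's documented chat template.

```python
import json
import subprocess
import sys

def python_executor(code: str, timeout: float = 10.0) -> str:
    """Run a code string in a fresh Python subprocess and return its stdout.

    Note: a subprocess is not a sandbox; real deployments need isolation.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout.strip()

# Hypothetical registry mapping tool names to host-side handlers.
TOOLS = {"python_executor": python_executor}

def dispatch(tool_call_json: str) -> str:
    """Parse one tool call (assumed JSON format) and execute the named tool."""
    call = json.loads(tool_call_json)
    handler = TOOLS[call["name"]]
    return handler(**call["arguments"])

# Example call in the assumed format; a real model would emit this itself.
example_call = json.dumps({
    "name": "python_executor",
    "arguments": {"code": "print(sum(range(10)))"},
})
tool_output = dispatch(example_call)
print(tool_output)  # 45
```

The returned string would then be appended to the conversation as the tool result so the model can verify its answer against it.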

Benchmarks

MMLU

57.5

GSM8K (strict)

GSM8K (flex)

Open model card →

02b Variants & side projects

GGUF · ABLITERATION · TRIBUTES

Qwythos-9B-Claude-Mythos-5-1M-GGUF

v2 GGUF quantizations of the Qwythos flagship for llama.cpp, Ollama and LM Studio. Fixed Qwen3.5 runtime metadata and embedded chat/tool templates, normal Q4/Q5/Q6/Q8/BF16 replacements, explicit -MTP- variants for draft speculation, plus a vision projector for image input.

OPEN ON HUGGING FACE ↗
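As a rough guide to which Qwythos GGUF quantization fits a given GPU, the weight size scales with bits per weight. The sketch below is back-of-envelope arithmetic only: it assumes a nominal 9 billion parameters and takes the bit width straight from each label, whereas real GGUF files add per-block scales and metadata and so come out somewhat larger. Check the file listing for actual sizes.

```python
PARAMS = 9_000_000_000  # assumption: nominal parameter count for a "9B" model

# Nominal bits per weight implied by each label (assumption: ignores
# per-block scales, metadata, and any tensors kept at higher precision).
NOMINAL_BITS = {"Q4": 4, "Q5": 5, "Q6": 6, "Q8": 8, "BF16": 16}

def nominal_size_gb(params: int, bits: int) -> float:
    """Rough floor on weight size in gigabytes (10^9 bytes)."""
    return params * bits / 8 / 1e9

sizes = {name: nominal_size_gb(PARAMS, bits) for name, bits in NOMINAL_BITS.items()}
for name, gb in sizes.items():
    print(f"{name}: roughly {gb:.1f} GB or more")
```

Context length adds KV-cache memory on top of the weights, which this estimate does not cover.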

Qwable-9B-Claude-Fable-5-GGUF

GGUF quantizations of the Qwable flagship for llama.cpp, Ollama and LM Studio. Run the Claude Fable 5 distill locally on CPU or a single GPU.

OPEN ON HUGGING FACE ↗

openNemo-9B-Claude-Opus-4.6-distill

Reasoning distill of openNemo-9B. SFT + DPO on ~21k Claude Opus 4.6 reasoning traces. NVIDIA OML.

OPEN ON HUGGING FACE ↗
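For readers unfamiliar with the DPO half of the openNemo distill recipe: Direct Preference Optimization trains on pairs of a preferred and a rejected response, pushing the policy to favor the preferred one more strongly than a frozen reference model does. The sketch below is the standard per-pair loss in plain Python. The `beta` value and the example log-probabilities are hypothetical, and nothing here reflects the actual training configuration used for this model.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen margin - rejected margin)))."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) equals softplus(-x); this form stays stable for large |x|.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))

# When the policy matches the reference, the loss is log(2).
neutral = dpo_loss(-5.0, -7.0, -5.0, -7.0)
# When the policy prefers the chosen response more than the reference does, it drops.
improved = dpo_loss(-4.0, -8.0, -5.0, -7.0)
print(neutral, improved)
```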

openNemo-9B-abliterated

Safety alignment removed via Snakehead, an internal abliteration tool for hybrid Mamba2 + sparse-attention architectures. Refusal rate 97% → 13%, KL 0.022. Research use only.

OPEN ON HUGGING FACE ↗

GGUF quantizations

GGUF builds for llama.cpp / Ollama / LM Studio across openNemo-9B, Qwen3.5-9B-Claude-Opus and Qwen3.5-9B-Claude-Code.

OPEN ON HUGGING FACE ↗

TEMPLE2

~63M GPT-2 trained from scratch on public-domain scripture. Built with spare compute as a tribute to Terry A. Davis. Not a serious model — a side project, kept around because it's honest about what it is.

OPEN ON HUGGING FACE ↗

02c Datasets

7 TASKLIST-* CORPORA · UNFILTERED

MiniMax-M3-150k-Mixed

Distillation traces from MiniMax-M3 across code, math, competition math, instruction-following and multiple-choice (25 source sets). Every row is deterministically verified (executed unit tests, exact-answer compare) by our faststill pipeline before it's kept. ~102M tokens.

107k verified rows ↗

tasklist-grok-multilingual-100000x-unfiltered

Multilingual SFT tasks generated against Grok. Largest of the family.

166k rows ↗

tasklist-grok4-multilingual-50000x-unfiltered

Second Grok pass with a refined difficulty distribution.

95.8k rows ↗

tasklist-qwen3.6-pro-11000x-unfiltered

Qwen3.6-Pro generations. Mixed-difficulty SFT.

11.3k rows ↗

tasklist-gemma4b-10000x-unfiltered

Gemma-4B at scale. Smaller-teacher comparison set.

10k rows ↗

tasklist-qwen3.5-9B-7500x-unfiltered

Qwen3.5-9B traces. Pairs with the Qwen distill recipe.

7.7k rows ↗

tasklist-haiku4.5-6000x-unfiltered

Anthropic Haiku 4.5 generations. Smallest set, highest cost-per-row.

5.8k rows ↗
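The deterministic verification described for MiniMax-M3-150k-Mixed can be pictured as a filter that keeps a row only when a mechanical check passes. The sketch below covers just the exact-answer branch. The row fields (`response`, `reference`), the `Answer:` marker and the normalization rules are assumptions made for illustration; they are not the faststill pipeline's actual schema or logic.

```python
def normalize(answer: str) -> str:
    """Canonicalize an answer string: trim, lowercase, drop thousands separators."""
    return answer.strip().lower().replace(",", "")

def extract_final_answer(response: str) -> str:
    """Take the text after the last 'Answer:' marker (hypothetical convention)."""
    return response.rsplit("Answer:", 1)[-1]

def is_verified(row: dict) -> bool:
    """Keep a row only if its final answer exactly matches the reference."""
    return normalize(extract_final_answer(row["response"])) == normalize(row["reference"])

# Two made-up rows: one matches its reference, one does not.
rows = [
    {"response": "Sieve the range. Answer: 9,592", "reference": "9592"},
    {"response": "Rough guess. Answer: 9600", "reference": "9592"},
]
kept = [row for row in rows if is_verified(row)]
print(len(kept), "of", len(rows), "rows kept")
```

A code-task branch would follow the same shape, with `is_verified` running the row's unit tests instead of comparing strings.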

06 — Dispatch

Follow the build.

An occasional dispatch from the lab — progress on Qwythos and Claire, what we found with microverse, new Abacus releases and the one thing we got wrong that week. No hype, no roadmap theatre. Cancel from any line.
