Pure PyTorch,
load it and go.
Frontier architectures re-implemented in vanilla PyTorch so the original weights load with bitsandbytes, train with QLoRA, and run on a single consumer GPU — and reasoning distilled from closed frontier models onto small open Qwen weights. Apache-2.0 or NVIDIA OML.
Reasoning-focused fine-tune of Qwen3.5-9B trained to produce <think>-tagged chains before answering. Distilled from Claude Opus 4.6 + Qwen3.5 reasoning traces (~12.8k examples). QLoRA, single RTX 5090, ~4.5 hours.
Reasoning distill of openNemo-9B. SFT + DPO on ~21k Claude Opus 4.6 reasoning traces. NVIDIA OML.
OPEN ON HUGGING FACE ↗Safety alignment removed via Snakehead, an internal abliteration tool for hybrid Mamba2 + sparse-attention architectures. Refusal rate 97% → 13%, KL 0.022. Research use only.
OPEN ON HUGGING FACE ↗GGUF builds for llama.cpp / Ollama / LM Studio across openNemo-9B, Qwen3.5-9B-Claude-Opus and Qwen3.5-9B-Claude-Code.
OPEN ON HUGGING FACE ↗~63M GPT-2 trained from scratch on public-domain scripture. Built with spare compute as a tribute to Terry A. Davis. Not a serious model — a side project, kept around because it's honest about what it is.
OPEN ON HUGGING FACE ↗First in line for claire.
One letter every other Tuesday — and a single dispatch on the day claire ships with the install line. What we shipped, what we read, the one thing we got wrong. No hype, no roadmap teasers. Cancel from any line.