We work on small-model reasoning, agentic coding, and grammar-correction systems — research that ships into government, media, and education products.
Distillation + RLVR turns a 4B model into a master-level chess reasoner that beats frontier LLMs.
An RLVR recipe that jointly optimises reasoning and self-refinement at +3% training overhead, gaining +11.5 points on AIME after refinement.
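A minimal sketch of how a single verifiable reward could jointly credit the initial solution and its self-refinement; the function names and the 50/50 weighting are illustrative assumptions, not the published method.

```python
def verifiable_reward(answer: str, gold: str) -> float:
    # Binary verifiable reward: exact match against the ground-truth answer.
    return 1.0 if answer.strip() == gold.strip() else 0.0

def joint_reward(initial: str, refined: str, gold: str,
                 refine_weight: float = 0.5) -> float:
    # Score both the first-pass answer and the self-refined answer in one
    # objective, so the policy is optimised for reasoning and refinement
    # together (refine_weight is a hypothetical mixing coefficient).
    r_init = verifiable_reward(initial, gold)
    r_refined = verifiable_reward(refined, gold)
    return (1 - refine_weight) * r_init + refine_weight * r_refined
```

Under this weighting, fixing a wrong first answer during refinement earns partial credit rather than zero, which keeps a gradient signal on the refinement step.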
An industrial mobile-development agent benchmark on a real production iOS codebase: 50 tasks backed by 449 human-verified tests.
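One way such a benchmark could score an agent's patch is to gate each task on its human-verified tests; the schema below is a hypothetical sketch, not the benchmark's actual format.

```python
# Hypothetical task record: the agent's change to the codebase is judged
# by running the human-verified tests listed for that task.
task = {
    "id": "task-001",
    "description": "Fix the crash when the settings screen loads offline.",
    "verified_tests": [
        "SettingsTests/testOfflineLoad",
        "SettingsTests/testCachedState",
    ],
}

def task_passes(results: dict) -> bool:
    # A task counts as solved only if every one of its human-verified
    # tests passes; a missing result is treated as a failure.
    return all(results.get(t, False) for t in task["verified_tests"])
```

Requiring all verified tests per task (rather than a pass rate) keeps the 50-task score strict and binary.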
An open-source sentence-simplification dataset spanning English, Sinhala, Tamil, Thai, and Pashto.
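A sketch of what one record in a multilingual simplification corpus might look like as JSONL; the field names and language-code convention here are assumptions, not the dataset's documented schema.

```python
import json

# Hypothetical record: one complex/simple sentence pair tagged with an
# ISO 639-1 language code ("si" = Sinhala).
record = {
    "lang": "si",
    "complex": "The committee postponed ratification of the treaty.",
    "simple": "The committee delayed approving the treaty.",
}

# One JSON object per line (JSONL) keeps all five languages in a single
# streamable file; ensure_ascii=False preserves non-Latin scripts as-is.
line = json.dumps(record, ensure_ascii=False)
assert json.loads(line) == record
```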