News

Releases, papers, and lab milestones.

Updates from Coolwei AI Lab: research published, products shipped, and moments worth recording.

9 entries · Latest June 2026

Pinned

May 18, 2026 · Major Release

Yanlan V3.0 is live — and it catches almost half the errors V2.0 missed

V3.0 sweeps all 8 headline metrics over V2.0 and takes 43 of 47 total comparisons.

8/8 weighted wins · 43/47 comparisons won · 95.69% F1 · 94.16% perfect correction

2026

5 entries

May 2026 · Major Release

Yanlan V3.0 released

V3.0 wins all 8 headline metrics and 43 of 47 total comparisons against V2.0, setting a stronger bar for pre-publication Chinese correction.

April 2026 · arXiv

ThinkTwice released

Models are like students who grind problem sets but never check their paper: they solve, but they don’t fix. ThinkTwice trains “checking” into a skill, with an 11.5-point gain on AIME pass@4.

March 2026 · arXiv

Grounded Chess Reasoning released

Engines are like master craftsmen who cannot teach: accurate, but unable to explain. Master Distillation gives a 4B model concise puzzle commentary that surpasses its teacher.

March 2026 · arXiv

OasisSimp dataset released

Rewriting official prose into plain language has no yardstick in most languages. OasisSimp covers five languages with 9,519 sentences written by native speakers: the first open evaluation for low-resource simplification.

February 2026 · KDD 2026 (CCF-A)

SWE-Bench Mobile released

50 real iOS feature tasks, 449 human-written tests, ~500K lines of production code, and a best task pass rate of 12%.

2025

3 entries

August 2025 · COLM 2025

SEAM accepted at COLM 2025

A report should read the same in any format; models often disagree with themselves. SEAM quantifies cross-modal inconsistency in 21 vision-language models across chess, molecules, scores, and graphs.

August 2025 · Project Launch

Mobile-Agent-Bench project launched

The start of a long-running collaboration to evaluate coding agents on real mobile production codebases.

June 2025 · Founding

Coolwei AI Lab founded

Focused on safe deployment, evaluation, and real-world applications of large language models.

2024

1 entry

December 2024 · NeurIPS Spotlight

Report Cards receives NeurIPS SoLaR Spotlight

Like a teacher writing comments, Report Cards auto-writes behavior reports for models, and these reports are verified to genuinely help people tell models apart. A NeurIPS SoLaR Spotlight.