NeurIPS Spotlight · December 2024

Report Cards: writing model behavior into verifiable reports

Like a teacher writing comments on a report card, Report Cards automatically generate natural-language behavior reports for models, and those reports are verified to help people tell models apart. A NeurIPS SoLaR Spotlight. The research page carries the full story: background, method, key figures, and links to the paper.

3 report-quality axes
3 automatic metrics
100% automated generation
Spotlight: NeurIPS SoLaR
arXiv: 2409.00844

Models with similar average scores can fail in completely different ways. Report Cards automatically turn model behavior into written reports, then verify the reports themselves.

Launch Highlights

  • Problem: A single average score hides where a model succeeds, fails, and changes behavior.
  • Method: Report Cards generate natural-language behavior summaries and evaluate them with contrastive, Elo, and human scoring.
  • Finding: Strong reports compress many examples into evidence that helps people tell models apart.
  • Why it matters: Evaluation results can feed product review, model choice, and safe deployment, rather than stopping at a leaderboard.
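For readers unfamiliar with the Elo scoring mentioned under Method, here is a minimal sketch of the standard Elo update applied to pairwise comparisons. It is an illustration only and not the paper's exact procedure: the starting rating of 1000, the K-factor of 32, and the names `expected_score`, `update_elo`, `rate`, `model_a`, and `model_b` are all assumptions made for this example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(ratings: dict, winner: str, loser: str, k: float = 32.0) -> dict:
    """Apply one pairwise result in place and return the ratings dict."""
    exp_win = expected_score(ratings[winner], ratings[loser])
    delta = k * (1.0 - exp_win)  # points moved from loser to winner
    ratings[winner] += delta
    ratings[loser] -= delta
    return ratings


def rate(models: list, comparisons: list, start: float = 1000.0, k: float = 32.0) -> dict:
    """Rate models from a sequence of (winner, loser) pairs.

    The start rating and K-factor are illustrative assumptions,
    not values taken from the paper.
    """
    ratings = {m: start for m in models}
    for winner, loser in comparisons:
        update_elo(ratings, winner, loser, k)
    return ratings


# Hypothetical example: one comparison in which model_a is preferred.
ratings = rate(["model_a", "model_b"], [("model_a", "model_b")])
```

Because each update moves the same number of points from the loser to the winner, the total rating is conserved, so the resulting scores are meaningful only relative to one another.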

Continue reading

The research page covers the background, method, key figures, and paper links; for quick sharing, use the illustrated promo copy.
