SEAM accepted at COLM 2025 - News

The same question, drawn as an image or written as text, carries identical information — yet vision-language models often give two different answers. SEAM turns that inconsistency into a controlled measurement.

Launch Highlights

Problem: OCR-style tests that screenshot text into images cannot tell whether a model fails to see or fails to reason.
Method: SEAM uses FEN/boards, SMILES/molecules, ABC/sheet music, and graphs/matrices to preserve semantics.
Finding: Vision usually trails language, and answer agreement across modalities remains far from ideal.
Why it matters: Researchers can separate perception failures from cross-modal reasoning failures.
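To make the "answer agreement across modalities" idea concrete, here is a minimal sketch in Python. It is not SEAM's evaluation code: the function names, the toy answers, and the exact definition of agreement (fraction of questions where the text-input and image-input answers match, regardless of correctness) are assumptions made for illustration.

```python
from typing import Sequence


def accuracy(preds: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold answers."""
    if not gold or len(preds) != len(gold):
        raise ValueError("preds and gold must be non-empty and equal in length")
    return sum(p == g for p, g in zip(preds, gold)) / len(gold)


def agreement_rate(a: Sequence[str], b: Sequence[str]) -> float:
    """Fraction of questions where two modalities give the same answer.

    Correctness is ignored on purpose: this measures consistency only.
    """
    if not a or len(a) != len(b):
        raise ValueError("answer lists must be non-empty and equal in length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Hypothetical toy data: the same four multiple-choice questions,
# posed once as text (e.g. a FEN string) and once as an image (a board).
gold = ["A", "C", "B", "D"]
text_answers = ["A", "C", "B", "A"]
image_answers = ["A", "B", "B", "D"]

text_acc = accuracy(text_answers, gold)
image_acc = accuracy(image_answers, gold)
agreement = agreement_rate(text_answers, image_answers)

print(f"text accuracy:  {text_acc:.2f}")
print(f"image accuracy: {image_acc:.2f}")
print(f"agreement:      {agreement:.2f}")
```

In this toy case both modalities reach the same accuracy, yet they agree on only half of the questions, which is why agreement is worth reporting alongside accuracy.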

Continue reading

The research page covers the background, method, key figures, and paper links; for quick sharing, use the illustrated promo copy.

