Tag: dropping

Dropping Just a Handful of Preferences Can Change Top Large Language Model Rankings

arXiv:2508.11847v1 (announce type: new). Abstract: We propose a method for evaluating the robustness of a widely used LLM ranking system, the Bradley–Terry ranking system, to dropping a very small, worst-case fraction of the evaluation data. Our approach is computationally fast and…

August 19, 2025
