| Rank | Model | Arena Score | CI | 24h Win% | Status |
|---|---|---|---|---|---|
Users enter a question, and four models answer it simultaneously. Users then vote without knowing which model wrote which answer, so model reputation cannot influence the vote.
Scoring uses the Plackett-Luce probabilistic model, a widely used model for ranking data. After each vote, the winner gains points and the losers lose points, so scores track relative model strength as votes accumulate.
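As a rough illustration of how a single vote could move scores under Plackett-Luce, the sketch below takes one gradient step on the log-likelihood of the chosen winner. The function name `plackett_luce_update`, the learning rate `lr`, and the model names are assumptions made for this example; the page does not say how the arena actually fits its scores.

```python
import math

def plackett_luce_update(scores, shown, winner, lr=0.1):
    """One gradient step on the Plackett-Luce log-likelihood of a top-1 vote.

    scores: dict of model name -> log-strength (hypothetical storage format)
    shown:  names of the models that answered in this round
    winner: the name the user voted for
    """
    # Plackett-Luce win probability is a softmax over the models shown.
    # Subtracting the max keeps exp() numerically stable.
    top = max(scores[name] for name in shown)
    weights = {name: math.exp(scores[name] - top) for name in shown}
    total = sum(weights.values())

    updated = dict(scores)
    for name in shown:
        p_win = weights[name] / total
        outcome = 1.0 if name == winner else 0.0
        # Winner moves up by lr * (1 - p), each loser moves down by lr * p.
        updated[name] = scores[name] + lr * (outcome - p_win)
    return updated

# Four evenly matched models; the user votes for model_b.
scores = {"model_a": 0.0, "model_b": 0.0, "model_c": 0.0, "model_d": 0.0}
new_scores = plackett_luce_update(scores, list(scores), winner="model_b")
```

The adjustments within a round sum to zero, and an upset win against a stronger field moves scores more than an expected win does.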
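Model selection uses an Upper Confidence Bound (UCB) algorithm to balance exploration and exploitation. New models, whose scores are still uncertain, get more exposure, which counters the Matthew effect (established models attracting ever more votes).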
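One way to picture this selection step is the classic UCB1 rule, sketched below. The page names UCB but not the exact variant, so the UCB1 formula, the use of win rate as the reward, the exploration constant `c`, and the `stats` layout are all assumptions for illustration.

```python
import math

def ucb_select(stats, k=4, c=1.4):
    """Pick the k models with the highest UCB1 bound.

    stats: dict of model name -> (wins, appearances), a hypothetical format
    """
    total = sum(appearances for _, appearances in stats.values())

    def bound(item):
        _, (wins, appearances) = item
        if appearances == 0:
            return math.inf  # never-shown models are tried first
        win_rate = wins / appearances
        # The bonus shrinks as a model accumulates appearances.
        bonus = c * math.sqrt(math.log(total) / appearances)
        return win_rate + bonus

    ranked = sorted(stats.items(), key=bound, reverse=True)
    return [name for name, _ in ranked[:k]]

stats = {
    "veteran_a": (600, 1000),
    "veteran_b": (500, 1000),
    "veteran_c": (450, 1000),
    "veteran_d": (400, 1000),
    "newcomer": (2, 5),
}
picked = ucb_select(stats)
```

Here `newcomer` is selected despite a lower win rate than `veteran_d`, because five appearances leave its estimate far less certain.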
The system detects abnormal voting patterns: short dwell time, repeated votes, and unusually frequent actions by the same user. Suspicious votes are downweighted or filtered out.
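The three signals above could be combined into a per-vote weight along these lines. This is only a sketch: the `Vote` fields, the thresholds `min_dwell` and `max_hourly`, and the halving penalty are hypothetical values chosen for the example, not the arena's actual rules.

```python
from dataclasses import dataclass

@dataclass
class Vote:
    user_id: str
    dwell_seconds: float   # time spent on the answers before voting
    is_repeat: bool        # same user already voted on this matchup
    votes_last_hour: int   # recent activity by this user

def vote_weight(vote, min_dwell=3.0, max_hourly=30):
    """Return a weight in [0, 1]; 0 means the vote is filtered out."""
    if vote.is_repeat:
        return 0.0  # repeated votes are dropped entirely
    weight = 1.0
    if vote.dwell_seconds < min_dwell:
        weight *= 0.5  # too fast to have read the answers
    if vote.votes_last_hour > max_hourly:
        weight *= 0.5  # unusually frequent activity
    return weight

normal = vote_weight(Vote("u1", dwell_seconds=20.0, is_repeat=False, votes_last_hour=4))
rushed = vote_weight(Vote("u2", dwell_seconds=1.0, is_repeat=False, votes_last_hour=4))
```

A weight like this could then scale each vote's contribution to the score update, so a suspicious vote counts for less without being discarded outright.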