LLM Leaderboard - Real-time AI Model Rankings

Rank	Model	Arena Score	CI	24h Win%	Status

Ranking Rules

1. Blind Comparison

Users enter a question, and four models answer it simultaneously. Users vote without knowing which model wrote which answer, so model identity cannot bias the vote.

2. Plackett-Luce Ranking

Rankings use the Plackett-Luce probabilistic model, a widely used model for ranking data. After each vote, the winner gains points and the losers lose points, so scores track each model's relative strength as estimated from votes.
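As an illustration of how one vote could move scores, here is a minimal sketch of a single online gradient step on the Plackett-Luce log-likelihood. It assumes each vote records only the top choice among the models shown, which is the first stage of the Plackett-Luce model. The function name `pl_update`, the learning rate `lr`, and the model names are hypothetical; the leaderboard's actual fitting procedure is not described on this page.

```python
import math

def pl_update(scores, shown, winner, lr=0.1):
    """Return new scores after one vote (hypothetical online update).

    scores: dict of model name -> log-strength (gamma)
    shown:  names of the models compared in this vote
    winner: the name the user voted for
    """
    # Softmax over the models shown; subtracting the max keeps exp() stable.
    top = max(scores[name] for name in shown)
    exps = {name: math.exp(scores[name] - top) for name in shown}
    total = sum(exps.values())

    updated = dict(scores)
    for name in shown:
        p_win = exps[name] / total              # predicted chance of winning
        outcome = 1.0 if name == winner else 0.0
        # Gradient of log P(winner): winner moves up, losers move down.
        updated[name] = scores[name] + lr * (outcome - p_win)
    return updated

scores = {"model_a": 0.0, "model_b": 0.0, "model_c": 0.0, "model_d": 0.0}
scores = pl_update(scores, list(scores), winner="model_a")
```

The updates within one vote sum to zero, so points gained by the winner are exactly the points lost by the others. An upset win over a stronger model yields a larger gain than an expected win.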

3. UCB-E Dynamic Exposure

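Uses an Upper Confidence Bound (UCB) algorithm to balance exploration and exploitation when choosing which models to show. New models, whose scores are still uncertain, get more exposure, which counters the Matthew effect (established models accumulating ever more votes).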
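A rough sketch of UCB-style exposure follows. It assumes an index of the form score plus an exploration bonus sqrt(a / n), where n is the number of votes a model has received, and it picks the models with the highest index. The constant `a`, the function names, and the sample statistics are all hypothetical; the page does not give the exact formula used.

```python
import math

def ucb_index(score, n_votes, a=2.0):
    """Score plus an exploration bonus that shrinks as votes accumulate."""
    if n_votes == 0:
        return math.inf          # never-shown models are always tried first
    return score + math.sqrt(a / n_votes)

def pick_models(stats, k=4, a=2.0):
    """stats maps model name -> (score, n_votes); returns the k highest-index names."""
    ranked = sorted(
        stats,
        key=lambda name: ucb_index(*stats[name], a=a),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical data: four established models and one with no votes yet.
stats = {
    "veteran_1": (1.20, 5000),
    "veteran_2": (1.10, 4000),
    "veteran_3": (0.90, 4500),
    "veteran_4": (0.80, 3000),
    "newcomer": (0.00, 0),
}
chosen = pick_models(stats)
```

Here the newcomer is selected despite having the lowest score, because its bonus dominates until it has collected enough votes for its estimate to be trusted.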

4. Anti-Cheat

The system detects abnormal voting patterns: short dwell time, repeated votes, and unusually frequent activity from the same user. Suspicious votes are downweighted or filtered out.
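One way such a weighting could look is sketched below. Every threshold here (`min_dwell`, `full_dwell`, `max_votes_per_hour`) and the function itself are assumptions made for illustration; the real rules and cutoffs are not published on this page.

```python
def vote_weight(dwell_seconds, votes_last_hour, is_duplicate,
                min_dwell=3.0, full_dwell=10.0, max_votes_per_hour=60):
    """Return a weight in [0, 1] for one vote (hypothetical rules)."""
    # Repeated votes on the same comparison are filtered outright.
    if is_duplicate:
        return 0.0
    # Votes cast too quickly to have read the answers are filtered too.
    if dwell_seconds < min_dwell:
        return 0.0
    # Short but plausible dwell times are downweighted linearly.
    dwell_factor = min(1.0, dwell_seconds / full_dwell)
    # Users voting faster than the hourly cap are downweighted in proportion.
    if votes_last_hour <= max_votes_per_hour:
        rate_factor = 1.0
    else:
        rate_factor = max_votes_per_hour / votes_last_hour
    return dwell_factor * rate_factor

normal = vote_weight(12.0, 5, False)
rushed = vote_weight(1.0, 5, False)
heavy_user = vote_weight(5.0, 120, False)
```

A weight of 0 corresponds to filtering the vote, and anything between 0 and 1 corresponds to downweighting it.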

Arena Score = γ (Plackett-Luce parameter) | CI = 1.96 × σ (95% confidence half-width) | Vote Weight = f(dwell time, user history)
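The CI formula above can be sketched as follows. This assumes σ is the standard deviation of repeated estimates of a model's γ (for example from resampling the votes), which is an assumption, since the page does not say how σ is obtained. The replicate values are made up for illustration.

```python
import statistics

def confidence_interval(score, sigma, z=1.96):
    """Return (low, high) bounds: score plus or minus z * sigma."""
    half_width = z * sigma
    return score - half_width, score + half_width

# Hypothetical replicate estimates of one model's gamma.
replicates = [1.02, 0.98, 1.05, 0.95, 1.00]
gamma = statistics.mean(replicates)
sigma = statistics.stdev(replicates)   # sample standard deviation
low, high = confidence_interval(gamma, sigma)
```

The factor 1.96 gives a 95% interval under a normal approximation. Two models whose intervals overlap heavily cannot be reliably ordered from the votes collected so far.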