Phi3-Mini Scores 68.07 on Open LLM Leaderboard - Outperforms Llama3 8B, Llama2 70B, Falcon 180B, Mistral 7B, Solar 11B and Qwen 14B

Phi3-Mini scored 68.07 on the Open LLM Leaderboard, beating Llama3 8B, Llama2 70B, Falcon 180B, Mistral 7B, Solar 11B and Qwen 14B (and only a 0.5-point difference from Mixtral 8x7B).

https://twitter.com/OpenLLMLeaders/status/1783494317568352703

With scores like these and the history of the Phi models, some questions come to mind. To start:

Is this model contaminated?

https://preview.redd.it/p90i1020zowc1.png?width=750&format=png&auto=webp&s=216ab287ddf5b10b4409a268e5200554f1ef9f14

💬 Discussion r/LocalLLaMA (22 points, 15 comments)

Bazaroid
