The Science of LLM Benchmarks: Methods, Metrics, and Meanings
Date
January 10, 2024
In this talk, Jonathan discussed LLM benchmarks and the metrics used to evaluate model performance. He addressed intriguing questions such as whether Gemini truly outperformed OpenAI's GPT-4V.