In an exciting development from the xAI team, Grok-2 and Grok-2-Mini have officially secured positions on the LMSys Chatbot Arena leaderboard. Grok-2 has taken the #2 spot, surpassing GPT-4o (May) and tying with the latest Gemini model, driven by over 6,000 community votes.
Meanwhile, Grok-2-Mini has earned the #5 position.
Grok-2 has excelled particularly in mathematical tasks, ranking #1 in this category, and secured the #2 position across various other tasks, including hard prompts, coding, and instruction-following.
Additionally, Grok-2-Mini has undergone significant speed enhancements, now performing twice as fast as before. The boost came after xAI’s inference team completely rewrote the inference stack using SGLang, enabling more efficient multi-host inference and improved accuracy.
The team also introduced new algorithms for computation and communication kernels, alongside better batch scheduling and quantisation, further enhancing the models’ performance.
Several people are still sceptical about the leaderboard performance. They point out that OpenAI’s GPT-4o, which claims the top spot, does not perform as well as Claude 3.5, which is in fifth place. Even so, people who have started experimenting with Grok-2 claim that the model is actually brilliant at coding and maths-related tasks.
Released in beta this month, the Grok-2 family of models is also available for testing on X. It also allows users to generate images using the FLUX.1 image generation model.