Google unveils a new Gemini model.

Google is constantly updating Gemini, releasing new versions of its AI model family every few weeks. The latest version is so good that it topped the LMArena Chatbot Arena leaderboard, beating out the latest version of OpenAI's GPT-4o.

Update: Gemini came under fire after telling users to “die.”

Formerly known as LMSys Arena, Chatbot Arena lets AI labs pit their best models against each other in blind comparisons. Users vote without knowing which model is which until after the voting is over.

Google DeepMind's new model, catchily named Gemini-Exp-1114, matches the latest version of GPT-4o and exceeds the capabilities of OpenAI's o1-preview reasoning model.

The top five models in the arena are all versions of OpenAI or Google models. The first model on the leaderboard from any other company is xAI's Grok 2.

The success of this new model comes as Google finally releases a Gemini app for the iPhone, which beat the ChatGPT app in a seven-round Gemini vs. ChatGPT matchup.

The latest Gemini model appears to perform particularly well on math and visual tasks.

Gemini-Exp-1114 is currently not available on the Gemini app or website. It can only be accessed by signing up for a free Google AI Studio account (a platform for developers who want to try out new ideas).

We also do not know whether this is a version of Gemini 1.5 or an early look at Gemini 2, which is due next month. If it is the latter, the improvements over the previous generation may not be as dramatic as some might expect.

However, benchmarks indicate that it performs well in both technical and creative areas, which would fit with the idea that it is meant to help with reasoning and agent management. It ranks first in math, solving hard problems, creative writing, and vision.

Unlike other benchmarks, Chatbot Arena is based on human perceptions of performance and output quality rather than rigorous testing against data.

Whether this is just a new version of Gemini 1.5 Pro or an early look at the capabilities of Gemini 2, it is going to be an interesting few months in the AI world.
