Claude 3.5 Sonnet vs GPT-4o – Which is Best?

Claude 3.5 Sonnet’s features have sparked debate and interest, and they show why it is a compelling model.


Published on July 5, 2024

by Gopika Raj

Since the release of Anthropic’s Claude 3.5 model family, social media platforms, particularly X, have been abuzz with comparisons and testing of Claude 3.5 Sonnet and OpenAI’s GPT-4o. These models are being evaluated based on their features through various testing methods.

Claude 3.5 Sonnet is the first model in the Claude 3.5 family and was released in June 2024. According to Anthropic, it outperforms Claude 3 Opus, the company’s previous top model, as well as other leading AI models on a range of evaluations. It combines enhanced intelligence with improved speed and efficiency.

The latest model is available for free on Claude.ai and the Claude iOS app, with higher rate limits for Claude Pro and Team plan subscribers. The model can also be accessed via the Anthropic API, Amazon Bedrock, and Google Cloud’s Vertex AI, priced at $3 per million input tokens and $15 per million output tokens.
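To make the pricing concrete, here is a minimal sketch of the arithmetic using only the two per-million-token prices quoted above. The function name `estimate_cost` and the token counts in the example are hypothetical, chosen purely for illustration; actual billing may differ by platform.

```python
# Prices quoted in the article (USD per million tokens).
INPUT_PRICE_PER_MILLION = 3.00
OUTPUT_PRICE_PER_MILLION = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 20,000 input tokens and 2,000 output tokens.
# 20,000 input tokens cost about $0.06 and 2,000 output tokens about $0.03.
print(f"${estimate_cost(20_000, 2_000):.4f}")
```

Because output tokens cost five times as much as input tokens, long generated responses tend to dominate the bill even when the prompt is much larger.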

As Perplexity CEO Aravind Srinivas posted, the model is now available on Perplexity. Running at twice the speed of Claude 3 Opus, Claude 3.5 Sonnet unlocks new possibilities for complex AI applications across reasoning, knowledge, and coding tasks.

OpenAI’s GPT-4o, released earlier in May 2024, has also demonstrated significant improvements over its predecessors, including GPT-3.5. It shows enhanced language understanding, broader knowledge, and better contextual comprehension, often generating more accurate, coherent, and contextually relevant responses.

Does Claude 3.5 Sonnet Outperform GPT-4o?

Several features of Claude 3.5 Sonnet have drawn debate and interest, and they help explain why users rate the model highly. Let’s take a look at a few standout features and compare them to GPT-4o.

Artifacts Feature

Anthropic introduced Artifacts alongside Claude 3.5 Sonnet. The feature expands how users interact with Claude by offering a dedicated window alongside the conversation. When Claude generates content such as code snippets, text documents, or website designs, users can now see a preview of the output.

GPT-4o offered no comparable feature at the time of writing, which makes Claude’s version stand out even more.

Coding Abilities

Some early users claim that coding with Claude 3.5 Sonnet is up to 10 times more efficient and faster than with GPT-4o or any other available LLM, although this is an anecdotal impression rather than a benchmark result. The Artifacts feature adds to the experience by letting you generate and run code directly within your chat.

https://twitter.com/_ann_nguyen/status/1804472602217578744

In a Reddit discussion, many users found that Claude 3.5 Sonnet outperforms GPT-4o in coding tasks, often producing nearly bug-free code on the first try. Claude was also praised for its accuracy in text summarisation and its natural, human-like communication style.

Developing Games From Scratch

Claude 3.5 Sonnet goes far beyond simple text generation. With Artifacts, games can be made playable inline, which makes creating interactive experiences more enjoyable.

For instance, Pietro Schirano, the founder of EverArt AI, used Claude 3.5 Sonnet to create a new and original game designed for quick sessions. It generated Color Cascade, a game where players catch the correct colour from a series of falling shapes, hinting at the advanced capabilities of the model.

Reasoning Capabilities

Claude 3.5 Sonnet shows advanced visual reasoning, surpassing earlier models. It accurately interprets charts, graphs, and imperfect images, making it valuable for retail, logistics, and finance sectors that rely on visual data analysis.

For instance, Muratcan Koylan, a marketing professional, tried Claude 3.5 to analyse financial data and provide trading insights. The model demonstrated impressive capabilities in data extraction, correlation analysis, and generating trading strategies.

It provided detailed predictions for interest rates, the USD Index, and the S&P 500, along with sophisticated trading strategies and potential black swan events.

Users who compared the results with those of other models, such as GPT-4o, were particularly impressed by Claude’s nuanced, context-specific insights and its advanced reasoning capabilities, which they found superior.

Solving Pull Requests

Claude 3.5 Sonnet shows major improvements in coding tasks, especially pull requests. In an internal agentic coding evaluation by Anthropic, it solved 64% of problems, compared with 38% for Claude 3 Opus. This leap demonstrates Sonnet’s enhanced reasoning and coding abilities, making it a potentially valuable tool for collaborative software development.

Alex Albert from Anthropic AI posted on X a demo video of a simple pull request:

To start, if you want to see Claude 3.5 Sonnet in action solving a simple pull request, here's a quick demo video we made.

(voiceover by the one and only @sumbhavsethia) pic.twitter.com/Jaet2oFgvk
— Alex Albert (@alexalbert__) June 20, 2024

He mentions that Claude is starting to get really good at coding and autonomously fixing pull requests, and that it is becoming clear that within a year a large percentage of code will be written by LLMs.

For GPT-4o, by contrast, there is no clear evidence that it can directly solve pull requests. However, there are some related developments and applications of GPT models in the context of GitHub and pull requests.

Final Verdict?

A Reddit discussion compared GPT-4o with Claude 3.5 Sonnet. Users generally found Claude 3.5 Sonnet to be superior to GPT-4o for many tasks, particularly coding and writing.

One user likened Claude to a doctoral candidate and GPT-4o to an intelligent undergraduate or master’s-level student.


Gopika Raj

With a Master's degree in Journalism & Mass Communication, Gopika Raj infuses her technical writing with a distinctive flair. Intrigued by advancements in AI technology and its future prospects, her writing offers a fresh perspective in the tech domain, captivating readers along the way.