Claude 3.5 Sonnet vs GPT 4o: Which AI Leads the Future?

No items found.

October 25, 2024

Like an unending clash of fire and ice, this battle began long ago and continues to this day — Claude 3.5 Sonnet vs GPT 4o. What should you choose? Should you give preference to the nuanced conversational flow of ChatGPT? Or maybe Claude will be the best choice with its focus on providing fast responses and a new Artifact feature?

According to the latest information from Statista, the page chat.openai.com was visited approximately 484.82 million times by users worldwide in January 2024. At the same time, the Claude AI website was visited about 54.4 million times in March, based on information from SimilarWeb. But after the release date of a new model, the numbers jumped to 72.9 million. They are still continuing to grow, generating significant buzz in society as everyone began discussing the new model.

Our team at Tensorway decided to see if the new trending version is worth your attention or if it's better to stick with the trusted ChatGPT 4. To find this out, we conducted a series of tests and examined the models offered by two popular development companies. Now it's time to reveal our results and determine which model truly deserves your attention!

First Look at Claude 3.5 Sonnet

Anthropic AI updated their medium-sized model to the 3.5 version on June 20, 2024. This model is considered to be their best one yet since it is 2x faster and 5x cheaper than the first competitor to the ChatGPT 4 — Claude 3 Opus. In a single interaction, Claude 3.5 Sonnet can take into account 200,000 tokens when generating responses.

This allows for more nuanced and contextually aware interactions, as it can remember details from earlier in the conversation or from the provided text. But there are more fresh things that we will cover down below.

Interactive Code Editor

A pivotal factor in Claude vs GPT4 assessments was to find new unique functions that weren't featured in previous models. Anthropic made a massive step forward in terms of how the interface of an AI is presented to consumers. The experimental feature provides a new way to interact with AI-generated content — an interactive code editor in the other half of the chat screen.

By default, users don't even see the code, so they have to chat with the AI on one side in order to get and change the results on the other. This is probably similar to demos like Devon, but none of these were available to the public before.

Advanced NLP Capabilities

As time passed by, more and more companies began to pay more attention to human-like AI models. The Anthropic company did not waste time and contributed to the understanding of high-quality Natural Language Processing technology.

It can be noticed how well Claude was improved in understanding and generating human language. The new model has become an excellent addition to NLP applications, among which are such popular ones as Salesforce, Slack, and Cohere.

Seamless Integration with Leading Platforms

When we compare Claude vs GPT 4, it's essential to evaluate not just the features but also their integration into other software. Before the update, Claude Sonnet had limited availability and integration with platforms. But today's reality and fast-evolving market force us to look for new ways to connect the user with our product.

Currently, you can find Claude 3.5 Sonnet available on such major platforms as Google Cloud's Vertex AI and Amazon Bedrock. However, we'll dive into this aspect in more detail a little bit later.

Progressive Visual Reasoning Capabilities

One of the newest features in Claude is a more advanced and sophisticated vision model. It differs from previous versions since it can analyze visual data from images more deeply, interpreting charts and graphs.

Also among the highlights of this update is decoding text from images that may not be clear or ideal for optical character recognition. That allows users to automate image analysis, even for visuals that include coding elements.

Improved Code Generation

In the ongoing debate of Chat GPT vs Claude, various aspects, such as code generation, are often highlighted. Based on the latest data provided in internal agentic coding evaluation by Anthropic, Claude 3.5 Sonnet solved 64% of problems, while Claude 3 Opus only 38% of them. In other words, the new model more effectively corrects errors in the code, translates it, and generates a new one based on user requests.

All this allows users to utilize AI more accurately when it comes to automating a huge part of the coding process. Not to mention the easier modernization of old applications. Claude significantly improves the efficiency of codebase transfers between programming languages, offering greater effectiveness than previous model versions.

First Look at ChatGPT 4о

ChatGPT 4о has been available since May 13, 2024. Released by OpenAI, this model differed from previous ones due to its improved ability to understand and generate human-like text while focusing on accuracy and broader general knowledge.

As the authors of this AI model themselves say, it provides "GPT-4-level intelligence". But what does it actually mean for users? Let us break it down for you one by one.

Enhanced Multimodal Capabilities

The OpenAI team has thoroughly approached the issue of processing user requests and added the ability to their AI to recognize images. ChatGPT 4о can easily interact with a variety of data formats, such as diagrams, documents, and other visual content.

Once the visuals have been loaded, the AI can analyze the data and provide broad answers based on the information received. In addition to describing objects, ChatGPT can also help users create presentations, solve visual-based problems, and create images.

Deep Contextual Understanding

The previous version of ChatGPT had problems in understanding nuanced contexts. As a result, a user could receive inaccurate and low-detailed answers. During our analysis of Claude 3 vs GPT 4, we discovered that OpenAI was able to minimize misunderstandings between the user and the AI.

Yes, you may still encounter incorrect answers. However, with the right approach and a more carefully crafted prompt, you can get the desired answer that covers all relevant aspects.

Real-Time Voice & Video Interactions

Now, ChatGPT has a voice, which can also be selected from a fairly diverse range of choices. It is available in such languages as English, French, German, Spanish, Italian, and Japanese. Today, other languages like Ukrainian are also available on the platform. And yet, they remain in the experimental stage due to less natural-sounding outputs.

Users can engage with AI in deep conversations through voice and video interactions. Using the camera, you can get explanations and descriptions of what the AI model sees, and we can say that it copes with those tasks very well.

Efficient Desktop Application

Finally, in our exploration of Claude AI vs GPT 4, we cannot overlook the new ChatGPT desktop app, which is available only for macOS. Now, you don’t have to go to the browser and search for the webpage. All you need to do is tap the headphone icon in the bottom right corner of the desktop app to start a voice conversation.

In other cases, users can also use the keyboard shortcut (Option + Space) to get help from ChatGPT 4о. We honestly think it's more convenient and saves lots of time.

Claude vs GPT 4: In-Depth Comparison

While Claude 3.5 Sonnet and ChatGPT 4o excel in generating human-like text and performing different tasks, they have distinct strengths. The Anthropic AI model has achieved huge success in providing users with robust visual reasoning and enhanced interactivity. It tries to be more engaging for users and advance with our fast-evolving world.

On the other hand, OpenAI created AI that concentrates on producing accurate responses and maintaining a conversational tone. ChatGPT hit the market first and has been winning users over for a long time.

However, there is more that is hidden behind the companies' ideas and fresh features described above. These are performance metrics, application versatility, user interaction capabilities, robust security features, and seamless integration options. All of this we will discuss below in more detail.

Core Capabilities

The first aspect in our comparison of Claude vs GPT 4 is their core features. The newest model of Anthropic prefers to deliver concise and engaging information regarding inquiries as well as complex scientific and technical prompts. Claude 3.5 can also analyze and explain images, including images with code. However, its explanations sometimes can be more difficult to understand. And, of course, we can see that Claude has the artifacts feature, which can't be seen in ChatGPT for now.

As for ChatGPT 4o, the AI tries to offer more elaborate answers and a more detailed but straightforward breakdown, including benefits and drawbacks. This AI model can provide the user with a more understandable explanation when it comes to image reading. Also, the AI model can generate images directly (when integrated with tools like DALL-E). However, Claude 3.5 Sonnet lacks both this and voice features.

Performance & Speed Analysis

Claude 3.5 Sonnet obtained more optimized architecture than its previous versions and even ChatGPT 4o. It can analyze and provide information during complex visual tasks 2x times faster. Also, this AI model has 200,000 tokens supported by the input context window. The maximum number of tokens that can be generated by the model in a single request is 4,096. This overall increases the model's performance in long conversations, saving context throughout the whole session.

In contrast, the ChatGPT 4o model can provide 128K tokens, while the maximum output tokens reach 2,048, which is lower than in the newest Claude version. And yet, it has an output generation speed of around 200 wpm. This is especially true for text queries. For example, the speed of human speech, on average, reaches 150-190 wpm.

Precision & Accuracy Metrics

The debate continues with GPT4 vs Claude, particularly regarding precision and accuracy metrics. We can see that these two AI models are performing almost equally well in these aspects. Claude 3.5 Sonnet and ChatGPT 4o identify around 60-80% of data correctly most of the time. And yet, none of them truly excels in this parameter. The main difference can be seen in specific tasks.

For example, according to our classification test, Claude 3.5 Sonnet performed better. However, in verbal reasoning tasks like analogy questions, specific calculations, and antonym identification, GPT 4o outperformed the Anthropic model. And again, both models had difficulties doing tasks, but now related to numerical, date-related, and factual questions.

Integration Features & Compatibility

Claude is a common choice for integration in various applications, like chatbots and virtual assistants. A robust API makes it easy to integrate AI with websites and even third-party services like Microsoft products. Among API providers, you can find Anthropic, AWS Bedrock, and Google Cloud Vertex AI Model Garden.

OpenAI also created a comprehensive API for easy AI integration with applications and services. It can seamlessly be added to enterprise software. For now, it is provided only by OpenAI.

Security Measures & Protocols

The last but not least aspect of our Claude vs GPT 4 comparison is about security. According to Anthropic's statement, the company does not train its generative models on user-submitted data unless a user gives explicit permission. It also uses TLS protocols to encrypt data during transmission. Moreover, the developers of the new model chose to use AES 256-bit encryption for data storage, which allows users' personal data to be preserved even if the storage media is hacked.

Meanwhile, GPT 4o was rigorously tested during development, which helped identify risks and built mitigations for misuse. The new version of the AI also adheres to a Preparedness Framework, GDPR compliance, and the latest data security practices. It helps to evaluate its safety measures, ensuring privacy protection and avoiding the use of user-submitted data unless explicitly permitted.

Claude 3.5 Sonnet vs GPT 4o Comparison Summary

Parameter	Claude 3.5 Sonnet	GPT 4o
Coding Capability	Completes 78.2% of coding problems correctly.	Successfully completes 72.9% of coding tasks.
Vision Capacities	Excels in visual benchmarks with 67.7% in visual math reasoning, 94.7% in science diagrams, and 95.2% in document visual Q&A. Exhibits slightly lower performance in visual question answering - 94.7%.	Has slightly lower indicators, 63.8% in visual math reasoning, 94.2% in science diagrams, and 92.8% in document visual Q&A. But excels in visual question answering with 69.1%.
Creative Writing	Rated 4.7 out of 5 for engagement and emotional depth in flash fiction prompts. Produces concise, impactful poems with an average rating of 4.5/5 for depth and resonance.	Rated 3.5 out of 5, indicating less emotional engagement. Longer poems rated around 3/5 for engagement, indicating more generic outcomes.
Contextual Understanding	The AI model scored 88.7% in Massive Multitask Language Understanding using 0-shot CoT. It is strong in reasoning across multiple domains.	Claude scores 88.7% in MMLU with 5-shot CoT prompting, and 88.3% in 0-shot. It has better performance in tasks that benefit from contextual understanding and learning from examples.
Response Speed	Generates text at approximately 80 tokens per second (TPS).	Generates text at around 60 TPS, making it slower than Claude 3.5 Sonnet.
Unique Features	Among the latest technologies can be found an Artifacts feature, but it still needs improvement.	ChatGPT 4o can generate user-friendly voice responses, has a desktop app for macOS, and can also create images according to user requests.
Pricing	Claude 3.5 Sonnet is cheaper by about 40.0% since its users have to pay $3.00 per million tokens for input data provided and $15.00 for the output.	The cost of input data provided to the ChatGPT 4o is $3.00 per million tokens, and the output is $15.00.

So, is Claude better than GPT 4?

Actually, the answer depends on your needs and goals. If you want to streamline coding tasks and make more engaging content through longer requests, it might be better to use Claude with its unique Artifacts feature. On the other hand, if you are looking for a versatile model that has a voice and excels in conversational tasks, content creation, and image generation, you may be more interested in ChatGPT 4o.

But the battle still goes on since OpenAI will soon release their new version of ChatGPT Driven by a new technology called OpenAI o1. Meanwhile, Anthropic will fulfill the family of the models and publish Claude 3.5 Haiku and Claude 3.5 Opus later this year. So stay tuned and don't forget to check our latest news for more updates!