
OpenAI Now Has a GPT-4o Mini. Here's Why That Matters

The smaller, more affordable language model is meant to spur AI development.

Lisa Lacy, Former Lead AI Writer
[Image: The logo for OpenAI's latest small language model, GPT-4o Mini. OpenAI/Screenshot by Lisa Lacy/CNET]

ChatGPT maker OpenAI introduced a smaller model called GPT-4o Mini on Thursday, which it says is smarter and cheaper than GPT-3.5 Turbo, an earlier model that was built for simple tasks like dialogue.

OpenAI hopes developers will use GPT-4o Mini to "significantly expand the range of applications built with AI," according to a blog post.

Chatbots like ChatGPT are the interface we use to communicate with large language models, or LLMs, like GPT-4o Mini and the original, much larger GPT-4o. These models are trained to understand how we use language so they can generate content that sounds human.

An LLM can have 1 billion or more parameters, the internal variables a model learns during training that shape how it responds to your prompts. More parameters generally mean an LLM can learn from and understand more, but large models aren't ideal for every situation. They can be expensive and consume a lot of energy because they require expansive server farms and access across the cloud.

A small language model is a compromise of sorts. It offers AI horsepower and speed but doesn't require the same computing resources or cost. Microsoft's Phi-3 Mini, which is built to run on phones and PCs, is one example. Google's Gemini 1.5 Flash, which is designed for high-volume, high-frequency tasks like generating captions and extracting data from forms, is another. Now we have GPT-4o Mini as well.

The skinny on GPT-4o Mini

Both free and paid ChatGPT users can access GPT-4o Mini starting Thursday in place of GPT-3.5, which was released in November 2022.

GPT-4o Mini currently supports text and vision in the OpenAI API, which is what developers use to build new applications based on OpenAI technology. Support for text, image, video and audio inputs and outputs is "coming in the future," the post said.
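For developers, using GPT-4o Mini means pointing an existing Chat Completions request at the new model identifier. As an illustrative sketch (not from the post), the payload below is built as a plain dictionary so it can be inspected without sending anything; an actual call would require the official openai SDK and an API key:

```python
# Sketch of a text-only Chat Completions request for GPT-4o Mini.
# "gpt-4o-mini" is the model identifier; the payload is assembled as a
# dict for illustration rather than sent over the network.

def build_chat_request(prompt: str, model: str = "gpt-4o-mini") -> dict:
    """Assemble the JSON body for a text-only chat completion."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }

request = build_chat_request("Summarize this invoice in one sentence.")

# With the official SDK, this payload would be sent as:
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   response = client.chat.completions.create(**request)
#   print(response.choices[0].message.content)
```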

Enterprise users will have access to GPT-4o Mini starting the week of July 22.

OpenAI said GPT-4o Mini excels at mathematical reasoning and coding and has also demonstrated strength in tasks that require reasoning over text. Financial tech startup Ramp and email app Superhuman tested GPT-4o Mini on tasks like extracting data from files and generating email responses, according to the post.

The new model has a context window of 128,000 tokens, which is a measurement of how much it can remember in a given conversation. By way of comparison, GPT-4o has the same context window, while GPT-3.5 Turbo has a context window of 16,000 tokens.
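To get a feel for what a 128,000-token window holds, a common rule of thumb (an approximation, not from the post) is roughly 4 characters of English text per token; real tokenizers such as OpenAI's tiktoken give exact counts. A quick sketch:

```python
# Rough check of whether a prompt fits in a context window, using the
# common approximation of about 4 characters of English per token.
# A real tokenizer (e.g. tiktoken) would give exact counts.

CONTEXT_WINDOW = 128_000  # tokens, for GPT-4o Mini and GPT-4o
CHARS_PER_TOKEN = 4       # rough average for English text

def estimated_tokens(text: str) -> int:
    """Crude token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_window(text: str, window: int = CONTEXT_WINDOW) -> bool:
    """True if the estimated token count fits in the context window."""
    return estimated_tokens(text) <= window
```

By this estimate, 128,000 tokens works out to roughly half a million characters of conversation the model can keep in view at once.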

GPT-4o Mini costs 15 cents per million input tokens and 60 cents per million output tokens; a million tokens is roughly the equivalent of 2,500 pages in a standard book, according to OpenAI.
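As a back-of-the-envelope illustration (the arithmetic is mine, not OpenAI's), those per-million-token rates translate into per-request costs like this:

```python
# Cost calculator using the GPT-4o Mini rates quoted above:
# $0.15 per million input tokens and $0.60 per million output tokens.

INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single API call at GPT-4o Mini rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A full 128,000-token context window of input plus a 1,000-token reply:
cost = request_cost(128_000, 1_000)
print(f"${cost:.4f}")  # prints $0.0198, i.e. about 2 cents
```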

GPT-4o, which was released in May, costs $5 per million input tokens and $15 per million output tokens.

"We envision a future where models become seamlessly integrated in every app and on every website," the blog post said. "GPT-4o mini is paving the way for developers to build and scale powerful AI applications more efficiently and affordably."

GPT-4o Mini uses the same safety parameters as GPT-4o. However, The Verge reported that it's also the first model to use a safety technique called instruction hierarchy, which teaches the model to prioritize instructions from the developer over prompts supplied by third parties. This is meant to make the model less susceptible to prompt injection attacks from outsiders.