
OpenAI Now Has a GPT-4o Mini. Here's Why That Matters

The smaller, more affordable language model is meant to spur AI development.

Lisa Lacy, Former Lead AI Writer
[Image: The logo for OpenAI's latest small language model, GPT-4o Mini. OpenAI/Screenshot by Lisa Lacy/CNET]

ChatGPT maker OpenAI introduced a smaller model called GPT-4o Mini on Thursday, which it says is smarter and cheaper than GPT-3.5 Turbo, an earlier model that was built for simple tasks like dialogue.

OpenAI hopes developers will use GPT-4o Mini to "significantly expand the range of applications built with AI," according to a blog post.

Chatbots like ChatGPT are the interface we use to communicate with large language models, or LLMs, like GPT-4o Mini and the original, much larger GPT-4o. These models are trained to understand how we use language so they can generate content that sounds human.

An LLM can have 1 billion or more parameters, the internal variables a model learns during training that shape how it responds to your prompts. More parameters generally mean an LLM can learn from and understand more, but large models aren't ideal for every situation. They can be expensive and consume a lot of energy because they require expansive server farms and access across the cloud.

A small language model is a compromise of sorts. It offers AI horsepower and speed but doesn't require the same computing resources or cost. Microsoft's Phi-3 Mini, which is built to run on phones and PCs, is one example. Google's Gemini 1.5 Flash, which is designed for high-volume, high-frequency tasks like generating captions and extracting data from forms, is another. Now we have GPT-4o Mini as well.

The skinny on GPT-4o Mini

Both free and paid ChatGPT users can access GPT-4o Mini starting Thursday in place of GPT-3.5, which was released in November 2022.

GPT-4o Mini currently supports text and vision in the OpenAI API, which is what developers use to build new applications based on OpenAI technology. Support for text, image, video and audio inputs and outputs is "coming in the future," the post said.
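For developers, using GPT-4o Mini means pointing an existing Chat Completions request at the new model identifier. As an illustrative sketch (not from the post), the payload below is built as a plain dictionary so it can be inspected without sending anything; an actual call would require the official openai SDK and an API key:

```python
# Sketch of a text-only Chat Completions request for GPT-4o Mini.
# "gpt-4o-mini" is the model identifier; the payload is assembled as a
# dict for illustration rather than sent over the network.

def build_chat_request(prompt: str, model: str = "gpt-4o-mini") -> dict:
    """Assemble the JSON body for a text-only chat completion."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
    }

request = build_chat_request("Summarize this invoice in one sentence.")

# With the official SDK, this payload would be sent as:
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   response = client.chat.completions.create(**request)
#   print(response.choices[0].message.content)
```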

Enterprise users will have access to GPT-4o Mini starting the week of July 22.

OpenAI said GPT-4o Mini excels at mathematical reasoning and coding and has also demonstrated strength in tasks that require reasoning over text. Financial tech startup Ramp and email app Superhuman tested GPT-4o Mini on tasks like extracting data from files and generating email responses, according to the post.

The new model has a context window of 128,000 tokens, which is a measurement of how much it can remember in a given conversation. By way of comparison, GPT-4o has the same context window, while GPT-3.5 Turbo has a context window of 16,000 tokens.
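To get a feel for what a 128,000-token window holds, a common rule of thumb (an approximation, not from the post) is roughly 4 characters of English text per token; real tokenizers such as OpenAI's tiktoken give exact counts. A quick sketch:

```python
# Rough check of whether a prompt fits in a context window, using the
# common approximation of about 4 characters of English per token.
# A real tokenizer (e.g. tiktoken) would give exact counts.

CONTEXT_WINDOW = 128_000  # tokens, for GPT-4o Mini and GPT-4o
CHARS_PER_TOKEN = 4       # rough average for English text

def estimated_tokens(text: str) -> int:
    """Crude token estimate based on character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_window(text: str, window: int = CONTEXT_WINDOW) -> bool:
    """True if the estimated token count fits in the context window."""
    return estimated_tokens(text) <= window
```

By this estimate, 128,000 tokens works out to roughly half a million characters of conversation the model can keep in view at once.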

GPT-4o Mini costs 15 cents per million input tokens and 60 cents per million output tokens; a million tokens is roughly the equivalent of 2,500 pages in a standard book, according to OpenAI.
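As a back-of-the-envelope illustration (the arithmetic is mine, not OpenAI's), those per-million-token rates translate into per-request costs like this:

```python
# Cost calculator using the GPT-4o Mini rates quoted above:
# $0.15 per million input tokens and $0.60 per million output tokens.

INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single API call at GPT-4o Mini rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A full 128,000-token context window of input plus a 1,000-token reply:
cost = request_cost(128_000, 1_000)
print(f"${cost:.4f}")  # prints $0.0198, i.e. about 2 cents
```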

GPT-4o, which was released in May, costs $5 per million input tokens and $15 per million output tokens.

"We envision a future where models become seamlessly integrated in every app and on every website," the blog post said. "GPT-4o mini is paving the way for developers to build and scale powerful AI applications more efficiently and affordably."

GPT-4o Mini uses the same safety parameters as GPT-4o. However, The Verge reported that it's also the first model to use a safety technique called instruction hierarchy, which teaches the model to prioritize instructions from the developer over prompts supplied by third parties. This is meant to make the model less susceptible to prompt injection attacks from outsiders.