OpenAI’s new model, called o3, replaces o1, which the company introduced in September. Like o1, the new model spends time thinking through a problem in order to give more accurate answers to questions that require step-by-step logical reasoning. OpenAI chose to skip the “o2” moniker because it is already the name of a mobile carrier in the UK.
“We view this as the beginning of the next phase of AI,” said OpenAI CEO Sam Altman in a video on Friday. “Where you can use these models to do increasingly complex tasks that require a lot of reasoning.”
The o3 model scores significantly higher than its predecessor on several measures, OpenAI says, including ones that assess complex coding-related skills and advanced math and science competency. It is three times better than o1 at answering questions posed by ARC-AGI, a benchmark designed to test an AI model’s ability to reason over extremely difficult mathematical and logic problems that it is encountering for the first time.
Google is pursuing a similar line of research. Noam Shazeer, a Google researcher, yesterday revealed in a post on X that the company has developed its own reasoning model, called Gemini 2.0 Flash Thinking. Google’s CEO, Sundar Pichai, called it “our most thoughtful model yet” in his own post. Google’s new model scored highly on SWE-Bench, a test that measures a model’s agentic abilities.
However, OpenAI’s new o3 model is 20 percent better than o1. “o3 blew it out of the water,” says Ofir Press, a postdoctoral scholar at Princeton University who helped develop SWE-Bench. “Very surprising jump, not sure how they did it.”
The two competing models show how fierce the competition between Google and OpenAI has become. OpenAI needs to demonstrate that it can keep making advances as it seeks to attract more investment and build a profitable business. Google, meanwhile, is making a determined effort to show that it remains at the forefront of AI research.
The new models also show how AI companies are increasingly looking beyond simply scaling up their models in order to extract more intelligence from them.
OpenAI says there are two versions of the new model, o3 and o3-mini. The company has not yet made the models publicly available, but it says it will invite outsiders to apply to test them.
OpenAI also provided more information today about the technique used to align o1. The new method, known as deliberative alignment, involves training a model with a set of safety specifications and having it reason over the nature of a request, and the response it is giving, to determine whether that response might violate its guardrails. Because the model reasons about its own output, the approach makes it harder to trick into misbehaving.
Large language models can answer many questions remarkably well, but they often stumble when asked to solve puzzles that require basic math or logic. OpenAI’s o1 incorporates training on step-by-step problem-solving that makes an AI model better able to handle these kinds of problems.
Models that reason over problems will also be important as companies look to deploy so-called AI agents that can reliably figure out how to solve complex problems on a user’s behalf.
“This definitely signifies that we are actually climbing the frontier of utility,” Mark Chen, senior vice president of research at OpenAI, said in today’s video.
“This model is amazing at programming,” Altman added.
The pace of AI announcements has been frenetic lately, as tech giants look to end the year with a breakthrough.
At the beginning of the month, Google released a new Gemini 2.0 model and demonstrated it as a web browsing helper and as an assistant that views the world through a smartphone or a pair of smart glasses.
OpenAI has made several announcements in the run-up to Christmas, including a new version of its video-generating model, a free version of its ChatGPT-powered search engine, and a way to access ChatGPT over the telephone by calling 1-800-ChatGPT.
Update 12/20/24 1:16 am ET: This story has been updated with additional comment and information from OpenAI.