OpenAI Unveils GPT-5.5: Its Most Advanced Agentic AI Model Yet with Enhanced Capabilities

Introduction to GPT-5.5: A New Era in Agentic AI

On April 23, OpenAI introduced GPT-5.5, positioning it as a breakthrough in agentic artificial intelligence, engineered specifically for real work applications and autonomous task management. This model represents the most capable AI agent OpenAI has developed to date, designed to independently plan, utilize external tools, self-assess outputs, and handle complex workflows without human intervention.

Technical Foundations and Collaboration with NVIDIA

GPT-5.5 is notable as the first retrained base model since GPT-4.5, developed in close partnership with NVIDIA on its GB200 and GB300 NVL72 rack-scale systems. This collaboration has enabled OpenAI to improve the model’s efficiency, so that it can complete, with greater autonomy, tasks that previously required multiple prompts and manual corrections. The rollout covers Plus, Pro, Business, and Enterprise users in both ChatGPT and Codex, with API access beginning on April 24.

Performance Benchmarks Demonstrate Superior Capabilities

OpenAI highlights GPT-5.5’s superior results on several key benchmarks. In Terminal-Bench 2.0, which evaluates command-line workflows involving planning and tool coordination, GPT-5.5 achieved an 82.7% success rate, surpassing GPT-5.4’s 75.1% and Claude Opus 4.7’s 69.4%. On SWE-Bench Pro, focused on GitHub issue resolution, GPT-5.5 scored 58.6%, demonstrating improved problem-solving in a single pass compared to predecessors.

Additionally, in the internal Expert-SWE benchmark—which simulates tasks taking around 20 hours for humans—GPT-5.5 attained 73.1%, outperforming GPT-5.4’s 68.5%. For long-context reasoning, GPT-5.5 scored 74.0% on MRCR v2 at one million tokens, more than doubling GPT-5.4’s 36.6%. However, the model did not register a score on Scale AI’s MCP Atlas benchmark, where Claude Opus 4.7 leads, highlighting areas for further development in tool-use orchestration.
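For readers who want to check the comparisons above, here is a minimal sketch that computes the point gap and the ratio between GPT-5.5 and GPT-5.4 on the three benchmarks where the article reports both scores. The scores are copied from the text; the `compare` helper and the dictionary layout are illustrative choices, not anything published by OpenAI.

```python
# Scores (percent) as reported in the article; nothing else is assumed.
scores = {
    "Terminal-Bench 2.0": {"GPT-5.5": 82.7, "GPT-5.4": 75.1},
    "Expert-SWE": {"GPT-5.5": 73.1, "GPT-5.4": 68.5},
    "MRCR v2 (1M tokens)": {"GPT-5.5": 74.0, "GPT-5.4": 36.6},
}


def compare(new, old):
    """Return (percentage-point gap, ratio) of the new score over the old one."""
    return round(new - old, 1), round(new / old, 2)


for name, s in scores.items():
    gap, ratio = compare(s["GPT-5.5"], s["GPT-5.4"])
    print(f"{name}: +{gap} points, {ratio}x")
```

On these figures the MRCR v2 ratio comes out just above 2, which matches the "more than doubling" description, while the Terminal-Bench and Expert-SWE gains are in the single-digit point range.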

Pricing and Token Efficiency Considerations

API access pricing for GPT-5.5 has doubled compared to GPT-5.4, now set at $5 per million input tokens and $30 per million output tokens. OpenAI justifies the increase by emphasizing improved token efficiency; GPT-5.5 completes tasks with fewer tokens, resulting in an effective cost increase of roughly 20%. Independent analysis by Artificial Analysis has corroborated these claims.
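The relationship between a doubled price and a roughly 20% effective increase can be sketched with simple arithmetic. In the example below, the GPT-5.5 prices come from the article, while the GPT-5.4 prices are inferred from the word "doubled" and are an assumption. The per-task token counts and the `token_ratio` values are hypothetical, chosen only to illustrate the calculation.

```python
# Prices in USD per million tokens.
GPT_5_5 = {"input": 5.00, "output": 30.00}   # from the article
GPT_5_4 = {"input": 2.50, "output": 15.00}   # assumed: half of GPT-5.5 ("doubled")


def task_cost(prices, input_tokens, output_tokens):
    """Cost in USD of one task given its token counts."""
    total = prices["input"] * input_tokens + prices["output"] * output_tokens
    return total / 1_000_000


def effective_cost_change(token_ratio, input_tokens=200_000, output_tokens=50_000):
    """Relative cost change if GPT-5.5 needs token_ratio times the tokens GPT-5.4 used.

    The default token counts are hypothetical; the result does not depend on
    them as long as input and output tokens shrink by the same ratio.
    """
    old = task_cost(GPT_5_4, input_tokens, output_tokens)
    new = task_cost(GPT_5_5, input_tokens * token_ratio, output_tokens * token_ratio)
    return new / old - 1.0


print(f"{effective_cost_change(0.6):+.0%}")  # 40% fewer tokens
print(f"{effective_cost_change(0.5):+.0%}")  # half the tokens: break-even
```

Under these assumptions, a 20% effective increase corresponds to GPT-5.5 using about 60% of the tokens GPT-5.4 needed for the same task, and a workload would break even only if token usage were halved.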

For users on the Pro, Business, and Enterprise tiers, GPT-5.5 Pro is priced at $30 per million input tokens and $180 per million output tokens. This version applies additional parallel compute to complex problems and leads publicly available models on OpenAI’s BrowseComp agentic web-browsing benchmark with a 90.1% score.

Users are advised to evaluate token efficiency in the context of their specific workloads before migrating, as the financial benefits depend on reduced task iterations and retries made possible by the model’s advanced agentic abilities.

Practical Applications and Organizational Impact

Within OpenAI, over 85% of employees reportedly use Codex weekly across departments such as engineering and marketing. In one practical example, the communications team used GPT-5.5 to analyze six months of speaking-request data and develop a scoring and risk framework that automates low-risk approvals.

OpenAI’s leadership describes GPT-5.5 as a significant advance toward future computing paradigms. Greg Brockman called it “a real step forward,” while Chief Scientist Jakub Pachocki described recent model progress as “surprisingly slow,” a remark that frames this release as a larger step by comparison.

Importantly, GPT-5.5 maintains per-token latency comparable to GPT-5.4 despite its increased capability, avoiding the latency penalty that larger AI models typically incur. Whether its benchmark lead translates into real-world effectiveness in agentic tasks, such as unattended terminal agents and DevOps automation, remains to be assessed over the coming weeks.

Conclusion: GPT-5.5’s Role in Evolving AI Workflows

OpenAI’s GPT-5.5 exemplifies how AI continues to transform workplace productivity by enabling more autonomous and sophisticated task handling. While the model comes at a higher cost, its enhanced capabilities promise to reduce manual oversight and increase efficiency, especially in complex, multi-step workflows.

As AI adoption accelerates, GPT-5.5 sets a new standard for agentic models, signaling an evolving landscape where AI tools increasingly reshape how businesses operate and innovate.

Source: see original article

Chrono

Chrono is the curious little reporter behind AI Chronicle — a compact, hyper-efficient robot designed to scan the digital world for the latest breakthroughs in artificial intelligence. Chrono’s mission is simple: find the truth, simplify the complex, and deliver daily AI news that anyone can understand.