OpenAI GPT-5.3-Codex-Spark: Faster Coding with Cerebras Chips

by Anika Shah - Technology February 18, 2026

OpenAI’s GPT-5.3-Codex-Spark: Real-Time Coding Powered by Cerebras

OpenAI has unveiled GPT-5.3-Codex-Spark, a new coding model designed for real-time software development. The launch marks a significant departure for OpenAI: it is the company’s first GPT-class model to run on non-NVIDIA hardware, using Cerebras’ Wafer Scale Engine 3. The model prioritizes responsiveness, aiming to give developers near-instant feedback during coding sessions.

A Leap in Coding Speed

Codex-Spark is optimized for speed, generating code at a rate of over 1,000 tokens per second – approximately 15 times faster than the base GPT-5.3-Codex [ExtremeTech]. This speed comes from a combination of a smaller model size and Cerebras’ specialized hardware. The improvements extend beyond raw throughput: time-to-first-token and per-token overhead are also reduced, making interactions feel nearly instantaneous for common coding tasks [NxCode].
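To put the throughput figures in perspective, here is a rough back-of-the-envelope sketch in Python. It takes the reported 1,000 tokens per second as a floor and derives the base model's rate from the "approximately 15 times" figure, so the base rate is an inference rather than a published number. The 500-token edit size and the function name are hypothetical, chosen only for illustration, and time-to-first-token is left at zero because no figures were given.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float,
                       time_to_first_token: float = 0.0) -> float:
    """Estimate wall-clock seconds to stream `num_tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return time_to_first_token + num_tokens / tokens_per_second


# Figures reported in the article.
SPARK_TPS = 1000.0   # "over 1,000 tokens per second", treated here as a floor
SPEEDUP = 15.0       # "approximately 15 times faster"

# Assumption: base-model rate implied by the two figures above (about 67 tokens/s).
BASE_TPS = SPARK_TPS / SPEEDUP

# Assumption: a hypothetical 500-token code edit.
EDIT_TOKENS = 500

spark_time = generation_seconds(EDIT_TOKENS, SPARK_TPS)
base_time = generation_seconds(EDIT_TOKENS, BASE_TPS)

print(f"Codex-Spark: ~{spark_time:.1f} s, base model: ~{base_time:.1f} s")
```

Under these assumptions, the same edit drops from several seconds to about half a second, which is the difference between waiting on a response and staying in the flow of an editing session.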

The Cerebras Partnership

This launch signifies a strategic diversification for OpenAI, which has historically relied on NVIDIA hardware. The collaboration with Cerebras, announced in January, allows OpenAI to explore alternative architectures optimized for specific workloads [OpenAI]. Cerebras’ Wafer Scale Engine 3 integrates millions of AI-oriented cores and large on-chip memory on a single silicon wafer, enabling the high-speed performance of Codex-Spark [ExtremeTech].

Designed for Iterative Development

While agentic coding – where machines autonomously work on software development – has gained traction, developers often feel disconnected from the process because of long wait times. Codex-Spark addresses this by enabling a more iterative workflow, allowing developers to inject their expertise, direction, and sensibility in real time [Cerebras]. The model excels at precise edits, revising plans, and answering contextual questions about existing codebases.

Performance and Capabilities

Codex-Spark is a smaller version of Codex, optimized for fast inference. Benchmarks show it outperforms GPT-5.1-Codex-mini on agentic software engineering tasks like SWE-Bench Pro and Terminal-Bench 2.0, while completing those tasks significantly faster [Cerebras]. It’s particularly effective for tasks like refining UI layouts, styling, and testing interface changes. However, larger, more complex design changes may still benefit from larger models.

Expanding AI Infrastructure

OpenAI’s multi-year deal with Cerebras includes up to 750 MW of inference capacity, alongside continued investment in AMD GPUs and other accelerators [ExtremeTech]. This diversification reflects OpenAI’s effort to build a robust and flexible AI infrastructure.

Key Takeaways

GPT-5.3-Codex-Spark is a new coding model optimized for real-time interaction.
It runs on Cerebras’ Wafer Scale Engine 3, achieving speeds of over 1,000 tokens per second.
This marks OpenAI’s first deployment of a GPT-class model on non-NVIDIA hardware.
Codex-Spark is designed to enhance the iterative development process, giving developers more control and responsiveness.
