GPT-5.3-Codex is a new generation of the Codex model built to handle real, end-to-end work. Instead of focusing solely on writing code, it combines strong coding ability with planning, reasoning, and execution. The model runs faster than earlier versions and handles long, multi-step tasks involving tools and decisions more effectively.
Rather than producing isolated answers, GPT-5.3-Codex behaves more like a working agent. It can stay on task for long periods, adjust its approach midway, and respond to feedback without losing context.
Codex 5.3 Benchmarks
OpenAI’s GPT-5.3-Codex sets new performance standards on real-world coding and agentic benchmarks, outperforming prior models on tests like SWE-Bench Pro and Terminal-Bench 2.0. It also shows substantial gains on the OSWorld and GDPval evaluations, which measure computer use and professional knowledge work, while running about 25% faster than GPT-5.2-Codex. This marks a significant step toward AI that can handle longer, multi-step development tasks and broader software workflows.

Key Features
Here’s what makes OpenAI Codex interesting:
Built With Codex, For Codex
One of the most interesting aspects of GPT-5.3-Codex is that the team used early versions of the model during its own development. Engineers relied on it to debug training runs, investigate failures, and analyze evaluation results. This sped up iteration and uncovered issues earlier in the process.
This self-use is a strong signal of maturity. The OpenAI team not only tested the model on benchmarks but also trusted it in real internal workflows.
From the benchmark image, we can see that GPT-5.3-Codex maintains higher accuracy as output tokens increase. It performs better on longer and more complex tasks, which shows stronger consistency compared to earlier models.
Anthropic also released its new coding model recently. Read all about it in our detailed blog on Claude Opus 4.6.
Beyond Writing Code
GPT-5.3-Codex is designed to handle more than just code generation. It can help with debugging, refactoring, deployment tasks, documentation, data analysis, and even non-coding work like writing specs or preparing reports.
It operates best when given goals rather than detailed instructions. The model can decide what to do next, run commands, inspect outputs, and keep going until the task is complete.
Designed for Safe, Practical Use
To support hands-on work, GPT-5.3-Codex runs inside controlled environments. By default, it works in sandboxes that limit file access and network usage, reducing the risk of unintended damage. The model also pauses and asks for clarification before performing potentially destructive actions.
These choices make it easier to experiment, especially when working on real projects or unfamiliar systems.
Working Together With the Model
Interaction with GPT-5.3-Codex is continuous rather than one-off. As it works, it shares progress, explains decisions, and reacts to feedback. You can interrupt, redirect, or refine the task at any point.
This makes it feel less like a command-based tool and more like a collaborator you supervise.
How to Access Codex 5.3?
Now that the high-level picture is clear, it’s time to move from description to action.
In this section, we’ll try Codex hands-on. We’ll start by downloading and setting it up, then walk through a simple workflow step by step. This will show how GPT-5.3-Codex behaves in practice and how to work with it effectively on real tasks.
Let’s see the steps:
1. Drag the Codex icon into your Applications folder

2. Open Codex

3. Sign in with ChatGPT

4. After signing in, select a folder or git repository on your computer where Codex will work

5. Kick off your first task

6. Select the model and the reasoning level of your choice

Task 1: Text-to-3D Scene Generator
The first task I worked on with Codex was building a simple text-to-3D scene generator. The goal was intentionally minimal. I wanted to test how well Codex could take a loosely defined idea and turn it into a working visual project without overengineering.
The Initial Prompt
The very first prompt I gave Codex was simple:
Build a simple text-to-3D scene generator.
The requirements were clear but limited. It had to be a single HTML file, use Three.js via a CDN, and run directly in the browser with no build tools. The scene needed a text input where a user could describe something like “3 trees and a house”, and the output should be a basic 3D scene using simple shapes, lighting, and slow rotation. I also asked it to start with a minimal working version.
This prompt was meant to test fundamentals, not polish.
First Working Model
Codex created a clean index.html from scratch. It set up a Three.js scene with a camera, lights, ground plane, and a simple animation loop. A text input and submit button were added. Basic keywords like tree, house, cloud, and sun were parsed and mapped to simple shapes. The focus was correctness. The scene loaded, objects appeared, and everything rotated smoothly. The result was already usable.
Iterations
I iterated step by step. I improved parsing so phrases like “3 trees” worked correctly, with a default of one object. Next, I fixed object spacing to prevent overlap and added scene cleanup so each submission rebuilt the scene instead of stacking objects. In another pass, I focused on readability by simplifying comments and clarifying the structure for beginners. Each change was small and quick to implement.
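The parsing and spacing steps above can be sketched in plain JavaScript. This is a minimal illustration, not the code Codex produced: the function names `parseScene` and `layoutPositions`, the keyword table, and the default spacing of 3 units are all assumptions made for the example.

```javascript
// Hypothetical keyword table: the article mentions tree, house, cloud, and sun.
const KNOWN_SHAPES = ["tree", "house", "cloud", "sun"];

// Turn a description such as "3 trees and a house" into [{ type, count }].
function parseScene(text) {
  const words = text.toLowerCase().match(/[a-z]+|\d+/g) || [];
  const objects = [];
  let pendingCount = null;

  for (const word of words) {
    if (/^\d+$/.test(word)) {
      // Remember the number so it applies to the next recognized keyword.
      pendingCount = parseInt(word, 10);
      continue;
    }
    // Naive singularization: "trees" -> "tree", "houses" -> "house".
    const singular = word.endsWith("s") ? word.slice(0, -1) : word;
    if (KNOWN_SHAPES.includes(singular)) {
      // Default to one object when no count was given.
      objects.push({ type: singular, count: pendingCount ?? 1 });
      pendingCount = null;
    }
  }
  return objects;
}

// Spread `total` objects evenly along the x axis, centered on the origin,
// so that neighbours never overlap as long as each is narrower than `spacing`.
function layoutPositions(total, spacing = 3) {
  const offset = ((total - 1) * spacing) / 2;
  return Array.from({ length: total }, (_, i) => i * spacing - offset);
}

console.log(parseScene("3 trees and a house"));
console.log(layoutPositions(3));
```

In this sketch, words outside the keyword table (for example “cone”) produce no objects at all. A real interface might surface such words in a message instead of silently ignoring them.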
Result
By the third version, multiple objects rendered correctly, but it took more time than expected and the result was still not very strong. The scene did clear and rebuild on every submit, but the behavior was inconsistent. In the video, you can also see that when I entered “cone,” nothing changed in the scene. The final output ran in the browser, but it clearly showed that Codex could do more and that the solution was far from its full potential.
Task 2: Space Flight Sandbox
This task focused on building a real-time space flight sandbox with a strong emphasis on structure and performance. The goal was to create a smooth and believable experience where the system could scale without breaking.
Core Gameplay
The player flies a ship in open space with inertial movement. Mouse input controls pitch and yaw, while the keyboard handles thrust, strafe, roll, and reverse. A large asteroid field surrounds the player and continuously streams as the ship moves. The player can fire lasers to destroy asteroids, which split into smaller pieces when hit.
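To make “inertial movement” and asteroid splitting concrete, here is a minimal sketch using plain objects instead of Three.js vectors. The names `stepShip` and `splitAsteroid`, the halving rule, and the `minRadius` cutoff are assumptions for illustration. The article does not show how the generated project implemented either behavior.

```javascript
// Advance the ship by dt seconds. Velocity persists between calls, so the
// ship keeps drifting after thrust stops (inertial movement).
// `thrust` is assumed to be an acceleration vector in world space.
function stepShip(ship, thrust, dt) {
  for (const axis of ["x", "y", "z"]) {
    ship.velocity[axis] += thrust[axis] * dt;        // accelerate first
    ship.position[axis] += ship.velocity[axis] * dt; // then move
  }
  return ship;
}

// Split an asteroid into two pieces of half the radius, offset sideways.
// Pieces that would be smaller than minRadius are not created, so the
// asteroid is simply destroyed (hypothetical rule).
function splitAsteroid(asteroid, minRadius = 0.5) {
  const childRadius = asteroid.radius / 2;
  if (childRadius < minRadius) return [];
  return [-1, 1].map((side) => ({
    radius: childRadius,
    position: {
      ...asteroid.position,
      x: asteroid.position.x + side * childRadius,
    },
  }));
}

const ship = {
  position: { x: 0, y: 0, z: 0 },
  velocity: { x: 0, y: 0, z: 0 },
};
stepShip(ship, { x: 0, y: 0, z: -10 }, 0.5); // burn forward for half a second
stepShip(ship, { x: 0, y: 0, z: 0 }, 1);     // coast: velocity is unchanged
console.log(ship.position, ship.velocity);
```

Because nothing damps the velocity, the ship only slows down when the player applies reverse thrust, which is what gives this style of flight its weighty feel.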
Performance and Structure
Performance was treated as a hard constraint. Asteroids were rendered using InstancedMesh and recycled to maintain a stable instance count. Collision checks relied on a spatial grid to stay efficient. Physics ran on a fixed timestep, while rendering remained smooth and decoupled. No external physics engines or frameworks were used.
System Design
The project followed a clean modular design. Each major system lived in its own file, with main.js handling the scene and loop, ship.js managing flight physics, asteroids.js handling instancing and streaming, weapons.js managing lasers and collisions, and controls.js handling input. This structure remained unchanged throughout development.
Audio Feedback
Audio was added to improve clarity and impact. Laser shots trigger a sharp firing sound, and asteroid hits play a heavier explosion-like thud. All audio uses Three.js Audio and is attached to the camera to stay consistent with the player’s perspective.
Result
The final sandbox is fully playable and stable, but it took much longer to build than expected. The ship feels weighty and responsive, asteroids stream endlessly without performance drops, and lasers feel powerful and visible. However, the development time was noticeably high, possibly because of the reasoning model I chose. After seeing the result, I was not very happy with it: other models do much better, and this could have been made much better overall.
Conclusion
GPT-5.3-Codex shows clear strengths in long, complex tasks and benchmark performance. It behaves more like an agent than a simple code generator. It plans, executes, and adapts over time. Benchmarks suggest strong consistency at scale. However, hands-on work revealed gaps. Some tasks took longer than expected, and results were not always as strong as they could have been. In practice, iteration speed and output quality varied. While the model is powerful and mature, the workflow didn’t always feel optimal. With better choices or tuning, the same tasks could likely be done faster and better.