
GPT-5.4 Leaked: 2M Context Window and Persistent Memory Coming Soon

Jack
March 4, 2026
Tags: OpenAI · ChatGPT · GPT-5.4 · AI · Machine Learning

OpenAI has accidentally revealed its next major model, GPT-5.4, through two separate leaks in its public Codex GitHub repository. While both references were quickly scrubbed via force pushes and edits, screenshots circulated widely across social media, giving us the first concrete details about what’s coming next.

The Leak: How It Happened

The leaks occurred through pull requests in OpenAI’s public Codex GitHub repository. Despite OpenAI’s quick response to remove the references, the AI community was faster—screenshots and discussions spread rapidly across X (formerly Twitter), Reddit, and developer forums. The timing couldn’t be more significant, coming just days after Anthropic’s Claude Opus 4.6 launch and Google’s Gemini 3.1 Flash-Lite release.

What We Know About GPT-5.4

Based on the leaked information and community analysis, here are the key features expected in GPT-5.4:

1. Massive 2M Token Context Window

GPT-5.4 will feature a 2 million token context window—the largest context window in any production AI model to date. This represents a significant leap from GPT-4’s 128K tokens and even surpasses Claude Opus 4.6’s 1M token window.

However, some AI researchers are skeptical. As one developer noted on X: “2M context sounds wild but current transformers still degrade hard past ~200K on complex reasoning. The real test is whether attention scaling actually holds or if it’s just a marketing number.”
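
Whatever the benchmark reality turns out to be, developers can already estimate whether a project would fit in a 2M-token window. The sketch below uses a rough heuristic of ~4 characters per token for English text and code; this is an approximation, not a real tokenizer count, and the file extensions are illustrative:

```python
import os

# Rough heuristic: ~4 characters per token for English/code text.
# This is an approximation; actual tokenizer counts vary by content.
CHARS_PER_TOKEN = 4

def estimate_tokens(root: str, exts=(".py", ".md", ".txt")) -> int:
    """Walk a directory tree and estimate total tokens across text files."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    continue  # skip unreadable files
    return total_chars // CHARS_PER_TOKEN
```

Under this heuristic, a 2M-token window corresponds to roughly 8 MB of raw source text, which is larger than many complete codebases.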

2. Persistent Memory Features

Perhaps the most exciting feature is persistent memory—the ability to maintain state across sessions and conversations. Unlike traditional in-context memory, which resets with every new session, persistent memory means GPT-5.4 will remember:

  • Full project context across weeks or months
  • User preferences and coding styles
  • Tool states and reasoning chains
  • Previous conversations without re-explanation

As one developer enthusiastically commented: “Persistent memory is the real unlock here. Most devs waste hours rebuilding context every session. With this: Keep full project state across weeks, agents remember your exact style/preferences, no more ‘where was I?’ moments.”
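
The "rebuilding context every session" pain is real: today, developers who want continuity have to save and reload state themselves. The sketch below shows that client-side workaround; the field names are illustrative, not an OpenAI schema, and native persistent memory would make this bookkeeping unnecessary:

```python
import json
from pathlib import Path

# Client-side workaround that persistent memory would replace:
# save session state to disk, reload it next time.
# Field names here are illustrative, not an official schema.

def save_state(state: dict, path: Path) -> None:
    """Persist project context, preferences, and history to disk."""
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")

def load_state(path: Path) -> dict:
    """Restore a prior session's state, or start fresh if none exists."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"project_context": "", "preferences": {}, "history": []}
```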

3. Full-Resolution Image Processing

GPT-5.4 will directly process highly detailed image files in PNG, JPEG, and WebP formats without any loss of visual data. This means the model can properly read:

  • Detailed architectural drawings
  • High-density screenshots with small text
  • Complex graphics and diagrams
  • Technical documentation with fine details

Preserving original image bytes helps the system avoid missing crucial visual information that might be lost through compression or downscaling.

4. Priority Speed Tier

OpenAI is introducing a new priority speed tier for faster responses. While details are limited, this likely means premium users will get:

  • Reduced latency for API calls
  • Priority queue access during high-demand periods
  • Faster inference times for time-sensitive applications
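
Until OpenAI publishes tier details, the only way to verify a latency claim is to measure it. A minimal stdlib sketch for comparing a standard endpoint against a priority one; the callable is a stand-in for whatever API wrapper you use (no specific endpoint is assumed):

```python
import time
import statistics

def benchmark_latency(call, runs: int = 20) -> dict:
    """Time repeated calls to compare, e.g., standard vs. priority tiers.

    `call` is any zero-argument function, such as a wrapper around an
    API request (hypothetical here; no specific client is assumed).
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return {
        "p50_s": statistics.median(samples),
        "p95_s": sorted(samples)[int(0.95 * (runs - 1))],
        "max_s": max(samples),
    }
```

Run it against both tiers with identical prompts and compare the p95 figures; median latency alone can hide the queueing delays a priority tier is meant to eliminate.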

The Competitive Landscape

The timing of this leak is no coincidence. OpenAI is facing unprecedented competition:

Anthropic’s Dominance

Claude Opus 4.6 just launched with Agent Teams and a 1M context window. More importantly, Anthropic’s Claude Code now dominates the coding market with an impressive 54% market share. OpenAI clearly cannot afford to fall behind in the developer tools space.

DeepSeek’s Independence

DeepSeek V4 is reportedly training on Huawei hardware, completely outside the NVIDIA ecosystem. This represents a significant shift in AI infrastructure and shows that alternatives to US-based chip manufacturers are becoming viable for frontier model training.

Google’s Speed Play

Google DeepMind just unveiled Gemini 3.1 Flash-Lite, focusing on speed and efficiency. The AI race is no longer just about raw capability—it’s about speed, cost, and specialized use cases.

Release Timeline Predictions

Prediction markets on Manifold give GPT-5.4:

  • 55% chance of shipping before April 2026
  • 74% chance of shipping before June 2026

Given that we’re already in early March 2026, this suggests a release could be imminent—possibly within the next 4-8 weeks.
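
The two market figures also pin down the implied odds for the in-between window. Since "before June" includes "before April", simple subtraction gives the April–May probability, and conditioning on a miss of the April deadline sharpens it:

```python
# Manifold market figures quoted above
p_before_april = 0.55
p_before_june = 0.74

# "Before June" includes "before April", so the difference is the
# probability the release lands in April or May.
p_april_may = p_before_june - p_before_april

# If the April deadline slips, how likely is an April-May release?
p_conditional = p_april_may / (1 - p_before_april)

print(f"{p_april_may:.2f}, {p_conditional:.2f}")  # → 0.19, 0.42
```

In other words, even if GPT-5.4 misses the April window, the markets still see roughly a 42% chance it ships by the end of May.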

What This Means for Developers

If the leaked features are accurate, GPT-5.4 could fundamentally change how developers work with AI:

For Coding Assistants:

  • Persistent memory means no more context rebuilding
  • 2M tokens can hold entire codebases in context
  • Full-resolution image processing helps with UI/UX work

For Enterprise Applications:

  • Long-term memory enables true personalization
  • Priority speed tier ensures reliable performance
  • Massive context window supports complex workflows

For Research and Analysis:

  • Process entire research papers with images
  • Maintain reasoning chains across multiple sessions
  • Handle complex multi-document analysis

The Arms Race Continues

OpenAI’s accidental leak reveals more than just technical specifications—it shows the intense competitive pressure in the AI industry. With Claude dominating coding, DeepSeek building independent infrastructure, and Google pushing speed boundaries, OpenAI needs GPT-5.4 to be a game-changer.

The question isn’t whether GPT-5.4 will be powerful—it’s whether 2M tokens and persistent memory will be enough to win back market share from Anthropic and maintain OpenAI’s position as the AI leader.

Conclusion

While OpenAI hasn’t officially confirmed GPT-5.4’s existence or features, the GitHub leaks provide compelling evidence that a major release is coming soon. The combination of 2M token context, persistent memory, full-resolution image processing, and priority speed tier represents a significant evolution in AI capabilities.

Whether these features will translate into real-world productivity gains remains to be seen. But one thing is clear: the AI arms race is accelerating, and OpenAI is betting big on context size and memory to stay ahead.

Stay tuned for official announcements—if the prediction markets are right, we won’t have to wait long.
