Google just released Gemini 3.1 Pro, and the internet is losing its mind.
Why? Because this model can generate a fully functional Windows 11 WebOS in a single prompt, create polished SVG animations of pelicans riding bicycles, and beat Claude Opus 4.6 and GPT-5.2 across 12 major benchmarks.
And here’s the kicker: the price stays the same as Gemini 3 Pro.
Let’s break down what makes this release so significant.
The Headline Numbers
Google DeepMind released Gemini 3.1 Pro early this morning (February 20, 2026), and the benchmark results are impressive:
- 12 first-place finishes across major evaluations
- 77.1% on ARC-AGI-2 (the notoriously difficult general intelligence benchmark)
- Beats Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.2, and GPT-5.3-Codex on key tests
- Double the performance of Gemini 3 Pro on reasoning tasks
But benchmarks only tell part of the story. The real test is what people are building with it.
The “Holy Sh*t” Demos
1. One-Shot Windows 11 WebOS
AI influencer Chetaslua posted a video showing Gemini 3.1 Pro generating a complete Windows 11 WebOS in a single prompt.
Not a mockup. Not a static screenshot. A working, interactive operating system running in a browser.
The generated system includes:
- Complete application icons
- Start menu with proper layout
- Window management and interaction logic
- Basic system-level applications
Chetaslua’s reaction: “Last time I shared something like this, it was incredibly difficult. Now it’s becoming routine. With agentic systems, we can do almost anything with this model.”
He also posted a comparison video showing Gemini 3.0 Pro’s attempt at the same task. The difference is stark. The 3.0 version produced a bare-bones interface with missing desktop interactions and system apps. The 3.1 version looks like an actual lightweight OS you could use.
2. SVG Animation That Actually Looks Good
Google’s been pushing SVG generation as a killer feature, and Gemini 3.1 Pro delivers.
The classic test case: “Generate an SVG animation of a pelican riding a bicycle.”
Gemini 3 Pro’s result: A pelican-shaped blob on something vaguely bicycle-like. The proportions are off, the physics don’t make sense, and the animation is janky.
Gemini 3.1 Pro’s result: A properly proportioned pelican with realistic body structure, natural riding posture, and a complete bicycle with frame, chain, pedals, and seat. The animation is smooth, the physics make sense, and it actually looks like a scene you’d see in an animated film.
Jiao Sun, the Tsinghua alumnus who developed the SVG generation feature for Gemini 3.1, posted on X: “Incredibly proud.”
Why SVG matters:
Unlike traditional video or raster images, SVG animations are built with pure code. This means:
- They stay sharp at any size
- File sizes are tiny compared to video
- They’re easy to edit and customize
- They can be generated from text descriptions alone
Gemini 3.1 Pro can generate SVG animations for:
- A frog riding a penny-farthing bicycle
- A giraffe driving a tiny car
- An ostrich on roller skates
And in every case, the results are more detailed, more physically plausible, and more visually appealing than previous models.
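To make the "pure code" point concrete, here is a hand-written sketch (not actual model output) of the kind of markup involved: a complete animated scene fits in a few hundred bytes of plain text, scales to any resolution, and can be edited like any other source file.

```python
# A hand-written illustration of code-based SVG animation (not model output):
# a single circle that rolls across the scene using a built-in SMIL animation.
svg = """<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 100">
  <!-- bicycle wheel: a circle that slides across the scene -->
  <circle cx="0" cy="80" r="15" fill="none" stroke="black" stroke-width="2">
    <animate attributeName="cx" from="0" to="200" dur="4s" repeatCount="indefinite"/>
  </circle>
</svg>"""

# Write it out; any browser can render and loop the animation.
with open("wheel.svg", "w") as f:
    f.write(svg)

print(f"{len(svg)} bytes")  # tiny compared to an equivalent video clip
```

Even this toy scene shows why the format is attractive for generation: the model only has to emit text, and every visual property is an editable attribute.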
3. A Minecraft-Style Voxel World in Your Browser
Another developer used Gemini 3.1 Pro to generate a complete VoxelWeb project—a Minecraft-style 3D sandbox that runs directly in the browser.
The generated code includes:
- Start button and UI controls
- Movement controls
- Block interaction logic
- Basic crafting system
It’s not just a tech demo. It’s a functional prototype of a lightweight sandbox game, generated from a text prompt.
4. Visual Illusion Detection
One user tested Gemini 3.1 Pro’s “AgenticVision” capabilities with a tricky image: a photo of a street trash can.
The model didn’t just identify the trash can. It went further:
“When you squint or view this from a distance, the trash, shadows, and contours visually combine to form two cartoon characters sitting side by side.”
Then it broke down the illusion step by step, explaining how different pieces of fabric, trash bags, and shadows correspond to the characters’ heads, bodies, and outlines.
This demonstrates multi-step visual reasoning—the ability to see beyond the literal content of an image and understand how visual elements can be reinterpreted.
5. SimCity-Style Urban Planning App
Google’s UX engineer Michael Chang used Gemini 3.1 Pro to build a realistic city planning application.
The model handled:
- Complex terrain generation
- Infrastructure mapping
- Traffic simulation
- High-quality visualization
The result looks like a professional urban planning tool, not a quick prototype.
6. Real-Time ISS Orbit Dashboard
One developer asked Gemini 3.1 Pro to build a real-time aerospace dashboard tracking the International Space Station.
The model:
- Successfully configured public telemetry data streams
- Visualized the ISS orbital trajectory
- Created an interactive dashboard with live updates
This required understanding complex APIs, data formats, and visualization libraries—all from a natural language prompt.
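The article doesn't say which telemetry source the demo used. One widely used public endpoint is Open Notify's `iss-now.json`, which returns the ISS's current latitude and longitude. As a rough, offline sketch of what parsing that kind of payload looks like (the sample values below are made up):

```python
import json

# Open Notify's public endpoint (http://api.open-notify.org/iss-now.json)
# returns JSON shaped like this. The demo's actual data source wasn't
# specified, so this is just one plausible feed, parsed offline here
# with a fabricated sample payload.
sample = json.loads("""{
  "message": "success",
  "timestamp": 1708387200,
  "iss_position": {"latitude": "45.1234", "longitude": "-93.4567"}
}""")

lat = float(sample["iss_position"]["latitude"])
lon = float(sample["iss_position"]["longitude"])
print(f"ISS at {lat:.2f}, {lon:.2f}")
```

A live dashboard would poll an endpoint like this on an interval and feed the coordinates into a map or globe visualization.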
7. Interactive 3D Starling Flock Simulation
Gemini 3.1 Pro generated code for a 3D simulation of a flock of starlings (murmuration).
But it didn’t stop there. The model also:
- Added gesture tracking so users can control the flock with hand movements
- Generated adaptive background music that changes based on the flock’s dynamics
- Created an immersive, multi-sensory experience
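Flocking simulations like this are typically built on the classic boids rules: separation, alignment, and cohesion. As a rough illustration of the update loop such generated code revolves around, here is a minimal 2D sketch; the function name, weights, and structure are my own choices, not taken from the demo.

```python
def boids_step(positions, velocities, dt=0.1,
               sep_w=0.05, align_w=0.05, coh_w=0.005):
    """One update of the classic boids rules on lists of (x, y) tuples."""
    n = len(positions)
    new_pos, new_vel = [], []
    for i in range(n):
        px, py = positions[i]
        vx, vy = velocities[i]
        sx = sy = ax = ay = cx = cy = 0.0
        for j in range(n):
            if j == i:
                continue
            qx, qy = positions[j]
            # separation: steer away from close neighbors (inverse-square falloff)
            dx, dy = px - qx, py - qy
            d2 = dx * dx + dy * dy + 1e-9
            sx += dx / d2
            sy += dy / d2
            # alignment: drift toward neighbors' average velocity
            ax += velocities[j][0]
            ay += velocities[j][1]
            # cohesion: drift toward neighbors' center of mass
            cx += qx
            cy += qy
        if n > 1:
            ax, ay = ax / (n - 1) - vx, ay / (n - 1) - vy
            cx, cy = cx / (n - 1) - px, cy / (n - 1) - py
        vx += sep_w * sx + align_w * ax + coh_w * cx
        vy += sep_w * sy + align_w * ay + coh_w * cy
        new_vel.append((vx, vy))
        new_pos.append((px + vx * dt, py + vy * dt))
    return new_pos, new_vel

# Three boids, ten simulation steps
pos = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vel = [(0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
for _ in range(10):
    pos, vel = boids_step(pos, vel)
```

The demo's gesture tracking and adaptive music would layer on top of a loop like this, perturbing the weights or targets in response to user input.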
8. Literary-Themed Portfolio Website
When asked to create a modern portfolio website for Emily Brontë’s Wuthering Heights, Gemini 3.1 Pro:
- Analyzed the novel’s atmosphere and themes
- Designed a clean, modern interface that captures the protagonist’s spirit
- Generated a complete, functional website
This demonstrates the model’s ability to translate abstract literary concepts into concrete design decisions.
The Benchmark Battle
Google tested Gemini 3.1 Pro against the current generation of frontier models: Gemini 3 Pro, Claude Opus 4.6, Claude Sonnet 4.6, GPT-5.2, and GPT-5.3-Codex.
Gemini 3.1 Pro took first place on 12 of those major benchmarks.
Reasoning Tests (Where Gemini 3.1 Pro Dominates)
Humanity’s Last Exam: A complex multidisciplinary reasoning test designed to be extremely difficult for AI. Gemini 3.1 Pro outperformed all competitors.
ARC-AGI-2: The gold standard for general intelligence. Gemini 3.1 Pro scored 77.1%, more than double Gemini 3 Pro’s score and ahead of Claude and GPT models.
GPQA Diamond: A test of graduate-level scientific reasoning. Gemini 3.1 Pro took first place.
Coding Tests (Mixed Results)
SWE-Bench Pro and SWE-Bench Verified: These tests measure end-to-end engineering ability—understanding requirements, locating bugs, modifying code, and ensuring functionality in real projects.
Gemini 3.1 Pro scored relatively lower here, suggesting that while it excels at code generation, it may struggle with the messy realities of production codebases.
GDPval-AA Elo: This benchmark measures performance on high-value knowledge work tasks (finance, legal, etc.). Gemini 3.1 Pro outperformed GPT-5.2 and GPT-5.3-Codex, but came in second to Claude Sonnet 4.6.
Tool Use, Multimodal, and Long Context
Gemini 3.1 Pro took first place in:
- τ2-bench (tool use)
- MCP Atlas (tool use)
- BrowseComp (web browsing and information retrieval)
- MMLU (multitask language understanding)
- MRCR v2 (long context understanding)
MMMU-Pro (multimodal understanding): Gemini 3.1 Pro beat Claude and GPT models but came in slightly behind Gemini 3 Pro.
Pricing: Performance Up, Price Stays the Same
Google kept pricing identical to Gemini 3 Pro:
For prompts ≤200k tokens:
- Input: $2 per million tokens (~$0.002 per 1k tokens)
- Output: $12 per million tokens (~$0.012 per 1k tokens)
For prompts >200k tokens:
- Input: $4 per million tokens
- Output: $18 per million tokens
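The tiers above translate into a quick per-request cost estimate. One assumption in this sketch: a prompt over the 200k-token threshold is billed entirely at the higher tier, as with earlier Gemini models; check Google's pricing docs for the exact rule.

```python
def gemini_31_pro_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost for one Gemini 3.1 Pro request.

    Rates come from the pricing table above. Assumption: prompts over
    200k tokens are billed entirely at the higher tier, matching how
    earlier Gemini models handled the threshold.
    """
    if input_tokens <= 200_000:
        in_rate, out_rate = 2.0, 12.0   # $ per million tokens
    else:
        in_rate, out_rate = 4.0, 18.0
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A typical 10k-token prompt with a 2k-token answer:
print(f"${gemini_31_pro_cost(10_000, 2_000):.4f}")  # $0.0440
```

At these rates, even heavy prototyping sessions stay in the cents-per-request range until context sizes get very large.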
This is a significant value proposition. You’re getting a model that beats Claude Opus 4.6 and GPT-5.2 on most benchmarks, at a price point closer to mid-tier models.
Availability
Starting today (February 20, 2026):
For consumers:
- Google AI Pro and Ultra subscribers can use Gemini 3.1 Pro in the Gemini app and NotebookLM
- Free users get 2 queries to Gemini 3.1 Pro
For developers and enterprises:
- AI Studio
- Antigravity (Google’s new agentic development platform)
- Vertex AI
- Gemini Enterprise
- Gemini CLI
- Android Studio (Gemini API preview)
The Team Behind It
Shunyu Yao, a legendary figure from Tsinghua University’s physics department, joined Google DeepMind in September 2025. He announced the new model on X with the comment:
“Better Gemini models are emerging at an unstoppable pace.”
Jiao Sun, another Tsinghua alumnus, developed the SVG generation feature and expressed pride in the results.
What This Means for the AI Industry
Gemini 3.1 Pro’s release highlights a shift in the AI model race.
The focus is moving from general capability comparisons to real-world complex task performance.
It’s no longer enough to score well on benchmarks. Models need to:
- Handle messy, ambiguous real-world problems
- Generate production-ready code and designs
- Understand and manipulate complex visual and spatial information
- Integrate with existing tools and workflows
Google’s recent acceleration reflects this shift:
- Last week: Gemini 3 Deep Think model upgrade
- This week: Gemini 3.1 Pro release
Both updates prioritize professional domains and complex real-world problem-solving.
The implication: AI is moving from “impressive demos” to “core productivity tool in professional domains.”
The Trap Questions Test
We tested Gemini 3.1 Pro with classic trap questions:
Q: “Should I drive or walk to a car wash that’s 100 meters away?” A: Gemini 3.1 Pro correctly reasoned that you should drive, since the car itself is what needs to be at the car wash.
Q: “Can my parents get married?” A: Gemini 3.1 Pro correctly explained that if they’re your parents, they’re likely already married (or were at some point).
These seem trivial, but many AI models fail these tests because they lack common-sense reasoning.
Limitations and Caveats
Despite the impressive results, Gemini 3.1 Pro isn’t perfect:
- SWE-Bench scores are relatively low, suggesting it may struggle with real-world software engineering tasks that require navigating large, messy codebases.
- MMMU-Pro performance is slightly behind Gemini 3 Pro, indicating some trade-offs in multimodal understanding.
- We don’t have long-term reliability data yet. Early demos are impressive, but production use will reveal edge cases and failure modes.
- The model is still in preview, so expect bugs, rate limits, and potential changes.
How to Get Started
For Developers
Access Gemini 3.1 Pro via the Gemini API:
```python
import google.generativeai as genai

# Authenticate with your API key
genai.configure(api_key="your-api-key")

# Select the Gemini 3.1 Pro model
model = genai.GenerativeModel("gemini-3.1-pro")

response = model.generate_content(
    "Generate an SVG animation of a pelican riding a bicycle"
)
print(response.text)
```
For Gemini CLI Users
Update to the latest version and enable preview features:
```shell
# Update Gemini CLI
npm install -g @google/gemini-cli@latest

# Enable preview features
gemini /settings
# Toggle "Preview features" to true

# Select Gemini 3.1 Pro
gemini /model
```
For Gemini App Users
If you’re a Google AI Pro or Ultra subscriber, Gemini 3.1 Pro is now available in the Gemini app. Just start a new conversation—it’s the default model.
Free users can try it twice to see what the hype is about.
Practical Use Cases
Based on the demos and benchmarks, here’s where Gemini 3.1 Pro excels:
1. Rapid Prototyping
Generate functional prototypes of web apps, games, and interactive experiences from text descriptions.
2. Creative Coding
Translate abstract concepts (literary themes, artistic styles) into working code and designs.
3. SVG Animation and Graphics
Create scalable, code-based animations for websites, presentations, and marketing materials.
4. Complex API Integration
Build dashboards and tools that integrate with complex APIs (aerospace data, financial markets, etc.).
5. Visual Reasoning Tasks
Analyze images for subtle patterns, illusions, and spatial relationships that require multi-step reasoning.
6. Interactive Simulations
Generate physics-based simulations (flocking behavior, traffic patterns, etc.) with user interaction.
7. Long-Context Analysis
Process large documents, codebases, or datasets and extract actionable insights.
The Bottom Line
Gemini 3.1 Pro is a significant leap forward for Google’s AI efforts.
Key takeaways:
- ✅ Beats Claude Opus 4.6 and GPT-5.2 on 12 major benchmarks
- ✅ 77.1% on ARC-AGI-2 (double Gemini 3 Pro’s score)
- ✅ Stunning SVG animation generation that actually looks good
- ✅ One-shot complex project generation (WebOS, games, simulations)
- ✅ Same pricing as Gemini 3 Pro ($2/$12 per million tokens)
- ✅ Available now for Pro/Ultra subscribers and developers
Where it falls short:
- ⚠️ Lower SWE-Bench scores suggest challenges with real-world software engineering
- ⚠️ Slightly behind Gemini 3 Pro on MMMU-Pro (multimodal understanding)
- ⚠️ Still in preview with potential bugs and limitations
Who should use it:
- Developers building prototypes and MVPs
- Designers creating interactive experiences
- Researchers working with complex data
- Anyone who needs advanced reasoning and multimodal understanding
Who should wait:
- Teams with mission-critical production systems (wait for stability data)
- Users who need the absolute best multimodal understanding (Gemini 3 Pro may still be better)
- Anyone on a tight budget (free tier only gets 2 queries)
What’s Next?
Google’s rapid release cadence (Gemini 3 Deep Think last week, Gemini 3.1 Pro this week) suggests more updates are coming soon.
Shunyu Yao’s comment—“Better Gemini models are emerging at an unstoppable pace”—hints that this is just the beginning.
The AI model race is accelerating, and the focus is shifting from raw capability to real-world utility. Gemini 3.1 Pro is Google’s bet that complex reasoning, creative generation, and practical tool use are the next frontier.
Based on the early demos, they might be right.
Try Gemini 3.1 Pro:
- Gemini App (Pro/Ultra subscribers)
- AI Studio (Developers)
- Gemini CLI (Command-line users)
Related Reading:
- Claude Sonnet 4.6: Anthropic’s Most Capable Mid-Tier Model
- How to Build a Second Brain with AI
- Best AI Models for Coding in 2026
Stay updated with the latest AI news at ChatGPT2Notion Blog