Latest AI Model Updates (February 2026)
Verified highlights through Feb 13, 2026: OpenAI's GPT-5.3-Codex, Anthropic's Claude Opus 4.6, Mistral Large 3, Google's Gemini 3 Deep Think, Zhipu's GLM-5, Moonshot's Kimi K2.5, MiniMax M2.5, plus DeepSeek V3.2 and Qwen3-Coder-Next.
As of February 13, 2026, frontier AI progress is increasingly about agentic reliability: models that can plan, write, test, refactor, and stay grounded across long tasks. This page has been updated with the latest official releases and rollouts through today.
Note: Despite constant speculation, there are no official GPT-6 / Llama 5 / Grok 5 launch announcements as of Feb 13, 2026. The biggest recent jumps are Claude Opus 4.6, GPT-5.3-Codex, MiniMax M2.5, GLM-5, and Mistral Large 3.
What's New Since November 2025
1) Anthropic Claude Opus 4.6 (Feb 2026)
Latest release: Claude Opus 4.6 shipped in Feb 2026, followed by a Fast mode research preview announced on Feb 7, 2026.
What changed
- Improved performance for coding, reasoning, and agentic tasks
- Available via Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI
- Fast mode introduces a speed/intelligence trade-off for lower-latency agent loops (research preview)
2) OpenAI GPT-5.2 → GPT-5.3-Codex (Dec 2025 – Feb 2026)
Latest releases: GPT-5.2 (Dec 11, 2025), GPT-5.2-Codex (Dec 18, 2025), and GPT-5.3-Codex (Feb 5, 2026).
Highlights
- GPT-5.2: August 2025 knowledge cutoff; up to a 3.2M-token context window and up to 256K output tokens
- GPT-5.2-Codex: Codex-tuned GPT-5.2 variant focused on software engineering workflows
- GPT-5.3-Codex: 25% faster than GPT-5.2-Codex; OpenAI reports new highs on SWE-Bench Pro and Terminal-Bench
3) Google Gemini 3 Deep Think is now available (Feb 2026)
Update: Google expanded Gemini 3 Deep Think availability in Feb 2026, widening access beyond the initial Gemini 3 Pro launch (Nov 18, 2025).
What this means
- Deep Think mode becomes available for Ultra subscribers and via API (per Google)
- Best fit for tasks where extra thinking time pays off: planning, difficult debugging, and research synthesis
4) Mistral AI: European Open-Weight Leadership
Latest Releases: Mistral AI shipped Mistral Large 3 and the Ministral 3 family (Dec 2, 2025), Devstral 2 (Dec 10, 2025), and Voxtral Transcribe 2 (Feb 4, 2026).
Mistral Large 3 (Dec 2025)
- 41B active parameters, 675B total parameters (sparse MoE architecture)
- Ranks #2 among open-source non-reasoning models on the LMArena leaderboard
- 256K context window for extended document processing
- Released under Apache 2.0 license for maximum openness
- Trained on NVIDIA H200 GPUs; delivers frontier-class capabilities
- Available on Mistral AI Studio, Amazon Bedrock, Azure Foundry, Hugging Face
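The "41B active / 675B total" split is the signature of sparse mixture-of-experts routing: a router picks only a few experts per token, so most parameters sit idle on any single forward pass. Here is a minimal illustrative sketch of generic top-k gating; the dimensions and the gating scheme are toy assumptions, not Mistral's actual architecture:

```python
import numpy as np

def topk_moe_forward(x, gate_w, experts, k=2):
    """Route one token through only its top-k experts (sparse MoE).

    x: (d,) token activation; gate_w: (d, n_experts) router weights;
    experts: list of (d, d) matrices standing in for expert FFNs.
    """
    logits = x @ gate_w
    topk = np.argsort(logits)[-k:]        # indices of the k highest-scoring experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only k expert matrices are touched: these are the "active" parameters
    return sum(w * (x @ experts[i]) for w, i in zip(weights, topk))

rng = np.random.default_rng(0)
d, n = 16, 8
x = rng.standard_normal(d)
gate = rng.standard_normal((d, n))
experts = [rng.standard_normal((d, d)) for _ in range(n)]
y = topk_moe_forward(x, gate, experts, k=2)
print(y.shape)
```

With k=2 of 8 experts, only a quarter of the expert weights participate per token, which is why "active" parameter counts are so much smaller than totals.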
Ministral 3 Family (Dec 2025)
- Three compact models: 3B, 8B, and 14B parameters
- Best performance-to-cost ratio in their class
- Designed for edge devices, robotics, and on-device AI
- 14B reasoning variant achieves 85% on AIME 2025
- Optimized for NVIDIA Spark, RTX PCs, and Jetson devices
- Multimodal and multilingual capabilities from the start
Devstral 2 & Voxtral (2025-2026)
- Devstral 2: 24B coding model outperforming Qwen 3 Coder (30B)
- Devstral Small 2: Compact variant for local deployment
- Voxtral Transcribe 2: On-device speech-to-text with real-time diarization
- Voxtral: Privacy-first design that runs entirely on a smartphone or laptop
- Competitive with OpenAI Whisper and Google Cloud Speech on FLEURS benchmark
Ecosystem: Mistral positions itself as Europe's answer to US AI dominance, with $2B+ raised and ~€14B valuation. The company emphasizes efficiency, transparency, and open-source values while maintaining competitive performance.
5) Zhipu AI (Z.ai): China's GLM Family Evolution
Latest Releases: GLM-4.7 (Dec 22, 2025), GLM-5 (Feb 11, 2026), GLM-Image (Jan 2026), and GLM-4.7-Flash (Jan 20, 2026).
GLM-5 (Feb 11, 2026)
- 744B total parameters (more than double GLM-4.7's 355B)
- Trained on 28.5 trillion tokens
- Shifts from "vibe coding" to "agentic engineering"
- Achieves industry-leading scores for open models in coding and agentic tasks
- Surpasses Gemini 3 Pro on some benchmarks according to internal tests
- Still lags Claude Opus 4.6 on coding benchmarks overall
- Available via a $3/month subscription, or free for local deployment
GLM-4.7 (Dec 2025)
- 84.9% on LiveCodeBench (ahead of Claude Sonnet 4.5)
- 73.8% on SWE-bench Verified (highest among open-source models at release)
- 95.7% on AIME 2025 mathematics benchmark
- "Preserved Thinking" maintains reasoning chains across multiple turns
- 42.8% on Humanity's Last Exam (41% improvement over predecessor)
- 87.4% on τ²-Bench for multi-step tool usage
GLM-4.7-Flash (Jan 2026)
- 30B-A3B MoE model (30B total, ~3B active parameters) optimized for local deployment
- Strongest in 30B class for coding and reasoning
- 128K context window for large codebases
- Free tier option via chat.z.ai
- Designed for lightweight deployment with strong performance
GLM Ecosystem Updates
- GLM-Image: Trained entirely on Chinese Huawei Ascend hardware
- AutoGLM-Phone: Multilingual mobile automation framework
- GLM-OCR: High-performance document understanding
- IPO Success: Listed on Hong Kong Stock Exchange (Jan 8, 2026) at HK$116.20
- Stock surged 173% in one month post-IPO to HK$317.80
- US Entity List (Jan 2025) accelerated sovereign AI strategy
Strategic Context: Zhipu (Z.ai) represents China's push for AI sovereignty after US export restrictions. All GLM models from 4.6 onward are trained on domestic Chinese chips (Huawei Ascend, Cambricon, Moore Threads), demonstrating independence from NVIDIA hardware.
6) Moonshot AI: Kimi K2.5 Agent Swarm
Latest Release: Kimi K2.5 released January 27, 2026, introducing groundbreaking Agent Swarm technology.
Kimi K2.5 Key Features
- 1 trillion total parameters, 32B active (MoE architecture)
- Trained on 15 trillion mixed visual and text tokens
- Native multimodal: Vision and language trained together from scratch
- Agent Swarm: Coordinates up to 100 specialized AI agents in parallel
- 4.5x faster task completion compared to sequential processing
- 50.2% on Humanity's Last Exam at 76% lower cost than Claude Opus 4.5
- 78.4% on BrowseComp (vs 60.6% for standard single-agent approach)
Four Operational Modes
- K2.5 Instant: Fast responses for standard queries
- K2.5 Thinking: Extended reasoning for complex problems
- K2.5 Agent: Single autonomous agent for task execution
- K2.5 Agent Swarm: Parallel multi-agent coordination
Technical Innovations
- Visual Coding: Generates code from UI designs and video workflows
- Parallel-Agent Reinforcement Learning: Prevents serial collapse in task execution
- Critical Path Optimization: Measures slowest sub-agent at each stage
- Autonomous image search and layout iteration
- Works best with Kimi Code CLI framework
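The parallel speedup and critical-path framing above can be sketched with plain Python concurrency: wall-clock time collapses to the slowest sub-agent instead of the sum of all of them. The task names and durations below are made up for illustration:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def agent_task(name: str, seconds: float) -> str:
    """Stand-in for one specialized sub-agent (e.g. a search or code worker)."""
    time.sleep(seconds)
    return f"{name}: done in {seconds:.1f}s"

tasks = [("search", 0.3), ("browse", 0.5), ("summarize", 0.2), ("verify", 0.4)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = list(pool.map(lambda t: agent_task(*t), tasks))
elapsed = time.perf_counter() - start

sequential = sum(s for _, s in tasks)   # 1.4s if the agents ran one after another
print(f"parallel wall-clock ~{elapsed:.1f}s vs sequential {sequential:.1f}s")
```

The parallel run finishes in roughly the time of the slowest task (0.5s here), which is exactly what "critical path optimization" targets: shortening the longest sub-agent, not the total work.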
Availability & Pricing
- API: $0.60/M input tokens, $2.50/M output tokens
- Access via kimi.com (browser), Kimi App (mobile), moonshot.ai (API)
- Open-source under Modified MIT License
- Weights available on Hugging Face and GitHub
- Compatible with the OpenAI SDK for easy migration
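Since K2.5 advertises OpenAI-SDK compatibility, migration should mostly mean pointing the standard client at Moonshot's endpoint and swapping the model id. The `"kimi-k2.5"` model id and the endpoint URL below are assumptions; check Moonshot's docs for the real values:

```python
import json

def k2_5_chat_payload(prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-style chat-completion body; "kimi-k2.5" id is an assumption."""
    return {
        "model": "kimi-k2.5",
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

body = json.dumps(k2_5_chat_payload("Review this function for edge cases."))
print(body)
# With the official openai SDK the same fields pass straight through, e.g. (untested):
#   client = OpenAI(base_url="https://api.moonshot.ai/v1", api_key=KEY)
#   client.chat.completions.create(**k2_5_chat_payload("..."))
```

Because the request shape is the standard chat-completions format, existing OpenAI-based tooling needs no structural changes, only a new `base_url`, key, and model id.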
Use Cases: K2.5 excels at multimodal AI agents, visual analysis, web development with autonomous iteration, and tool-augmented agentic workflows. Agent Swarm particularly shines on wide information-gathering tasks that benefit from parallel execution.
7) MiniMax M2.5: The $1/Hour Frontier Model
Latest Release: MiniMax M2.5 released February 12, 2026, positioning itself as "intelligence too cheap to meter."
M2.5 Performance Benchmarks
- 80.2% on SWE-Bench Verified (within 0.6pp of Claude Opus 4.6)
- 51.3% on Multi-SWE-Bench (reported as the industry-best multilingual result)
- 55.4% on SWE-Bench Pro
- 76.3% on BrowseComp (with context management)
- 37% faster than M2.1, matching Claude Opus 4.6 speed
Cost Revolution
- $1 per hour of continuous generation at 100 tokens/second
- 1/10 to 1/20 the cost of GPT-5.3-Codex and Claude Opus 4.6
- M2.5: $0.30/M input, $2.40/M output
- M2.5-Lightning: 100 TPS variant with same pricing
- Makes 24/7 agentic operation economically feasible
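The headline "$1 per hour" follows from simple token arithmetic at the listed prices. A quick check, counting output tokens only (input tokens add the remainder toward ~$1):

```python
def generation_cost_per_hour(tokens_per_sec: float, price_per_million: float) -> float:
    """Dollar cost of one hour of continuous output at a given token rate."""
    tokens_per_hour = tokens_per_sec * 3600
    return tokens_per_hour * price_per_million / 1_000_000

# M2.5 output pricing ($2.40/M tokens) at the quoted 100 tokens/sec:
cost = generation_cost_per_hour(100, 2.40)
print(f"${cost:.2f}/hour")  # $0.86/hour of pure output; input tokens make up the rest
```

360,000 output tokens per hour at $2.40/M comes to about $0.86, consistent with the ~$1/hour figure once input tokens are included.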
Technical Architecture
- 230B total parameters, 10B active (MoE architecture)
- Trained on 200,000+ real-world environments
- Supports 13 programming languages (Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, Ruby)
- Forge RL Framework: Agent-native reinforcement learning with 40x training speedup
- Spec-Writing Tendency: Plans like an architect before coding
Full-Stack Capabilities
- 0-to-1 system design and environment setup
- 1-to-10 system development
- 10-to-90 feature iteration
- 90-to-100 comprehensive code review and testing
- Web, Android, iOS, and Windows platform support
- Server-side APIs, business logic, databases, and more
Office & Productivity
- Word documents and LaTeX-enabled PDFs
- PowerPoint presentations with professional layouts
- Excel spreadsheets with formulas, pivot tables, and charts
- Evaluated on MEWC (Microsoft Excel World Championship) problems
Availability
- Open-sourced on Hugging Face and GitHub
- API access via minimax.io
- Free tier available through OpenHands Cloud (limited time)
- Supports private cluster deployment and fine-tuning
- Compatible with vLLM and SGLang for optimal performance
Market Impact: M2.5 is the first frontier model whose cost is effectively negligible for continuous use. At $1/hour, a developer can run M2.5 around the clock for a full month (720 hours) for roughly $720, a step change for agentic applications and automated workflows.
8) Other Open-Weight Coding Models
DeepSeek V3.2 (Dec 1, 2025): Released with open-source weights alongside a major API price reduction. It continues as a cost-effective alternative, with 88% on AIME 2025 and 82% on GPQA Diamond.
Qwen3-Coder-Next (Feb 2026): Open-weight MoE coder model with up to 256K context (per model card). Designed for large-scale code generation and refactoring tasks.
Industry Landscape & Competitive Position
| Model | Best For | Key Strength | Released |
|---|---|---|---|
| Claude Opus 4.6 | Coding & Agents | Improved coding/reasoning/agentic tasks (plus Fast mode preview) | Feb 2026 |
| GPT-5.3-Codex | Software engineering | Faster agentic coding; new highs on SWE-Bench Pro & Terminal-Bench | Feb 5, 2026 |
| MiniMax M2.5 | Cost-efficient coding | 80.2% SWE-Bench at 1/10 the cost; $1/hour frontier model | Feb 12, 2026 |
| GLM-5 | Agentic engineering | 744B params, sovereign Chinese AI, competitive with Gemini 3 Pro | Feb 11, 2026 |
| Kimi K2.5 | Agent Swarm | 100 parallel agents, 4.5x faster execution, visual agentic intelligence | Jan 27, 2026 |
| Mistral Large 3 | Open-weight flagship | 675B MoE, #2 open-source on LMArena, Apache 2.0 license | Dec 2, 2025 |
| GPT-5.2 | General purpose | Aug 2025 cutoff; very long context and large outputs | Dec 11, 2025 |
| Gemini 3 Deep Think | Deep reasoning | Deep Think mode availability expands (Ultra + API) | Feb 2026 |
| Ministral 3 | Edge AI | 3B/8B/14B compact models, 85% AIME with 14B reasoning variant | Dec 2, 2025 |
| GLM-4.7 | Coding agents | 84.9% LiveCodeBench, Preserved Thinking, $3/month | Dec 22, 2025 |
| DeepSeek V3.2 | Open-weight coding | Open-source weights + major API price reduction | Dec 1, 2025 |
| Qwen3-Coder-Next | Open-weight coding | MoE coder model with up to 256K context | Feb 2026 |
Key Trends in Early 2026
- Cost disruption: MiniMax M2.5 at $1/hour and GLM-4.7 at $3/month force repricing across the industry.
- Agent-first coding: Frontier releases now optimize for long-running software engineering tasks, not just chat.
- Multi-agent systems: Kimi K2.5's Agent Swarm and MiniMax's parallel execution show the future of AI workflows.
- European open-source push: Mistral Large 3 and Ministral 3 establish Europe as a credible alternative to US/China dominance.
- Chinese AI sovereignty: GLM models trained entirely on domestic chips prove independence from US export restrictions.
- Speed/quality knobs: "Fast modes" and tiered inference are becoming standard for agent loops.
- Open-weight momentum: MiniMax M2.5, GLM-5, Mistral Large 3 keep improving and are easier to deploy across platforms.
- Long-context becomes practical: Bigger contexts matter when agents must stay grounded in repo history and docs.
- Benchmark evolution: SWE-bench variants + terminal-style tasks are now mainstream evaluation targets.
- Platform integration: Models ship alongside app/IDE tooling for end-to-end workflows.
- On-device AI: Ministral 3 and Voxtral show frontier capabilities can run on edge devices.
Sources & Official Links
Anthropic
- Anthropic: Claude model release notes (Opus 4.6)
- Anthropic: API release notes (Opus 4.6 + Fast mode)
Mistral AI
- Mistral AI: Mistral 3 announcement
- Mistral AI: Model documentation
- Mistral AI: Latest news (Voxtral Transcribe 2)
Frequently Asked Questions (FAQ)
Is GPT-6 released yet?
As of February 13, 2026, there is no official GPT-6 launch announcement in the sources linked on this page.
Is Llama 5 released?
As of February 13, 2026, Meta's latest official open-weight family is Llama 4; there is no Llama 5 announcement in the sources linked on this page.
Is Grok 5 released?
As of February 13, 2026, xAI's latest public releases include Grok 4.1 and fast variants; there is no Grok 5 launch announcement in the sources linked on this page.
What are the newest coding-focused models?
Recent coding-focused updates include OpenAI's GPT-5.3-Codex, Anthropic's Claude Opus 4.6, MiniMax M2.5, Zhipu's GLM-5, Mistral's Devstral 2, plus open-weight options like DeepSeek V3.2 and Qwen3-Coder-Next.
What should I use for cost-efficient coding?
MiniMax M2.5 at $1/hour and GLM-4.7 at $3/month are the most cost-efficient frontier coding options. For self-hosting, consider Mistral Large 3, Kimi K2.5, or GLM-5, all available as open weights.
What about European AI models?
Mistral AI leads Europe's open-source AI efforts with Mistral Large 3 (675B MoE), Ministral 3 (3B/8B/14B edge models), and Devstral 2 for coding. All released under Apache 2.0 license with strong performance and privacy focus.
What's the best model for multi-agent workflows?
Kimi K2.5's Agent Swarm technology can coordinate up to 100 specialized AI agents in parallel, achieving 4.5x faster execution on complex tasks compared to sequential approaches.
Bottom Line (Feb 2026)
Early 2026 is defined by three major shifts: cost disruption (MiniMax M2.5, GLM-4.7), multi-agent systems (Kimi K2.5's Agent Swarm), and geopolitical AI sovereignty (Chinese models on domestic chips, European open-source push).
The gap between open-weight and proprietary models continues to narrow. MiniMax M2.5 matches Claude Opus 4.6 on SWE-Bench at 1/10 the cost. GLM-5 challenges Gemini 3 Pro. Mistral Large 3 ranks #2 among open-source models. The era of expensive, closed AI may be ending faster than expected.
Next frontier: workflow integration, tool-use robustness, multi-modal agent coordination, and continued price/performance improvements from open-weight models.