UPDATED FEBRUARY 13, 2026

Latest AI Model Updates (February 2026)

Verified highlights through Feb 13, 2026: OpenAI's GPT-5.3-Codex, Anthropic's Claude Opus 4.6, Mistral Large 3, Google's Gemini 3 Deep Think, Zhipu's GLM-5, Moonshot's Kimi K2.5, MiniMax M2.5, plus DeepSeek V3.2 and Qwen3-Coder-Next.

As of February 13, 2026, frontier AI progress is increasingly about agentic reliability: models that can plan, write, test, refactor, and stay grounded across long tasks. This page has been updated with the latest official releases and rollouts through today.

Note: Despite constant speculation, there are no official GPT-6 / Llama 5 / Grok 5 launch announcements as of Feb 13, 2026. The biggest recent jumps are Claude Opus 4.6, GPT-5.3-Codex, MiniMax M2.5, GLM-5, and Mistral Large 3.

What's New Since November 2025

1) Anthropic Claude Opus 4.6 (Feb 2026)

Latest release: Claude Opus 4.6, released in Feb 2026, with a Fast mode research preview announced on Feb 7, 2026.

What changed

  • Improved performance for coding, reasoning, and agentic tasks
  • Available via Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI
  • Fast mode introduces a speed/intelligence trade-off for lower-latency agent loops (research preview)

2) OpenAI GPT-5.2 → GPT-5.3-Codex (Dec 2025 – Feb 2026)

Latest releases: GPT-5.2 (Dec 11, 2025), GPT-5.2-Codex (Dec 18, 2025), and GPT-5.3-Codex (Feb 5, 2026).

Highlights

  • GPT-5.2: August 2025 knowledge cutoff; up to a 3.2M-token context window and up to 256K output tokens
  • GPT-5.2-Codex: Codex-tuned GPT-5.2 variant focused on software engineering workflows
  • GPT-5.3-Codex: 25% faster than GPT-5.2-Codex; OpenAI reports new highs on SWE-Bench Pro and Terminal-Bench

3) Google Gemini 3 Deep Think is now available (Feb 2026)

Update: Google announced Gemini 3 Deep Think availability updates in Feb 2026, expanding access beyond the initial Gemini 3 Pro launch (Nov 18, 2025).

What this means

  • Deep Think mode becomes available for Ultra subscribers and via API (per Google)
  • Best fit for tasks where extra thinking time pays off: planning, difficult debugging, and research synthesis

4) Mistral AI: European Open-Weight Leadership

Latest Releases: Mistral AI released Mistral Large 3 and the Ministral 3 family (Dec 2, 2025), Devstral 2 (Dec 10, 2025), and Voxtral Transcribe 2 (Feb 4, 2026).

Mistral Large 3 (Dec 2025)

  • 41B active parameters, 675B total (sparse MoE architecture)
  • Ranks #2 among open-source non-reasoning models on the LMArena leaderboard
  • 256K context window for extended document processing
  • Released under the Apache 2.0 license
  • Trained on NVIDIA H200 GPUs
  • Available on Mistral AI Studio, Amazon Bedrock, Azure Foundry, and Hugging Face

Ministral 3 Family (Dec 2025)

  • Three compact models: 3B, 8B, and 14B parameters
  • Best performance-to-cost ratio in their class
  • Designed for edge devices, robotics, and on-device AI
  • 14B reasoning variant achieves 85% on AIME 2025
  • Optimized for NVIDIA Spark, RTX PCs, and Jetson devices
  • Multimodal and multilingual capabilities from the start

Devstral 2 & Voxtral (2025-2026)

  • Devstral 2: 24B coding model outperforming Qwen 3 Coder (30B)
  • Devstral Small 2: Compact variant for local deployment
  • Voxtral Transcribe 2: On-device speech-to-text with real-time diarization
  • Voxtral: privacy-first design that runs entirely on a smartphone or laptop
  • Competitive with OpenAI Whisper and Google Cloud Speech on FLEURS benchmark

Ecosystem: Mistral positions itself as Europe's answer to US AI dominance, with $2B+ raised and ~€14B valuation. The company emphasizes efficiency, transparency, and open-source values while maintaining competitive performance.

5) Zhipu AI (Z.ai): China's GLM Family Evolution

Latest Releases: GLM-4.7 (Dec 22, 2025), GLM-5 (Feb 11, 2026), GLM-Image (Jan 2026), and GLM-4.7-Flash (Jan 20, 2026).

GLM-5 (Feb 11, 2026)

  • 744B total parameters (doubled from GLM-4.7's 355B)
  • Trained on 28.5 trillion tokens
  • Shifts from "vibe coding" to "agentic engineering"
  • Achieves industry-leading scores for open models in coding and agentic tasks
  • Surpasses Gemini 3 Pro on some benchmarks according to internal tests
  • Still lags Claude Opus 4.6 on coding benchmarks overall
  • Available at $3/month or free for local deployment

GLM-4.7 (Dec 2025)

  • 84.9% on LiveCodeBench (ahead of Claude Sonnet 4.5)
  • 73.8% on SWE-bench Verified (highest among open-source models at release)
  • 95.7% on AIME 2025 mathematics benchmark
  • "Preserved Thinking" maintains reasoning chains across multiple turns
  • 42.8% on Humanity's Last Exam (41% improvement over predecessor)
  • 87.4% on τ²-Bench for multi-step tool usage

GLM-4.7-Flash (Jan 2026)

  • 30B-A3B MoE model optimized for local deployment
  • Strongest in 30B class for coding and reasoning
  • 128K context window for large codebases
  • Free tier option via chat.z.ai
  • Designed for lightweight deployment with strong performance

GLM Ecosystem Updates

  • GLM-Image: Trained entirely on Chinese Huawei Ascend hardware
  • AutoGLM-Phone: Multilingual mobile automation framework
  • GLM-OCR: High-performance document understanding
  • IPO Success: Listed on Hong Kong Stock Exchange (Jan 8, 2026) at HK$116.20
  • Stock surged 173% in one month post-IPO to HK$317.80
  • US Entity List (Jan 2025) accelerated sovereign AI strategy

Strategic Context: Zhipu (Z.ai) represents China's push for AI sovereignty after US export restrictions. All GLM models from 4.6 onward are trained on domestic Chinese chips (Huawei Ascend, Cambricon, Moore Threads), demonstrating independence from NVIDIA hardware.

6) Moonshot AI: Kimi K2.5 Agent Swarm

Latest Release: Kimi K2.5, released January 27, 2026, introducing Agent Swarm, a parallel multi-agent technology.

Kimi K2.5 Key Features

  • 1 trillion total parameters, 32B active (MoE architecture)
  • Trained on 15 trillion mixed visual and text tokens
  • Native multimodal: Vision and language trained together from scratch
  • Agent Swarm: Coordinates up to 100 specialized AI agents in parallel
  • 4.5x faster task completion compared to sequential processing
  • 50.2% on Humanity's Last Exam at 76% lower cost than Claude Opus 4.5
  • 78.4% on BrowseComp (vs 60.6% for standard single-agent approach)

Four Operational Modes

  • K2.5 Instant: Fast responses for standard queries
  • K2.5 Thinking: Extended reasoning for complex problems
  • K2.5 Agent: Single autonomous agent for task execution
  • K2.5 Agent Swarm: Parallel multi-agent coordination

Technical Innovations

  • Visual Coding: Generates code from UI designs and video workflows
  • Parallel-Agent Reinforcement Learning: Prevents serial collapse in task execution
  • Critical Path Optimization: Measures slowest sub-agent at each stage
  • Autonomous image search and layout iteration
  • Works best with Kimi Code CLI framework
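The staged, parallel execution described above reduces to simple critical-path arithmetic: each stage finishes when its slowest sub-agent finishes, so total wall-clock time is the sum of per-stage maxima rather than the sum of all agent runtimes. A minimal sketch of that idea (the stage layout and durations below are made-up illustration numbers, not measurements of K2.5):

```python
# Critical-path arithmetic for staged parallel agents.
# Each inner list holds the runtimes (seconds) of the sub-agents
# launched in parallel during one stage of a task.
stages = [
    [4.0, 7.0, 5.0],   # stage 1: three agents gather sources in parallel
    [3.0, 3.5],        # stage 2: two agents synthesize results
    [6.0],             # stage 3: one agent writes the final answer
]

# Sequential execution runs every agent one after another.
sequential_time = sum(t for stage in stages for t in stage)

# Parallel execution waits only for the slowest agent per stage
# (the critical path), then moves on to the next stage.
parallel_time = sum(max(stage) for stage in stages)

speedup = sequential_time / parallel_time
print(f"sequential: {sequential_time}s, parallel: {parallel_time}s, "
      f"speedup: {speedup:.2f}x")
```

With these toy numbers the parallel plan takes 16.5s instead of 28.5s; the "critical path optimization" bullet above corresponds to shrinking the `max(stage)` term at each stage.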

Availability & Pricing

  • API: $0.60/M input tokens, $2.50/M output tokens
  • Access via kimi.com (browser), Kimi App (mobile), moonshot.ai (API)
  • Open-source under Modified MIT License
  • Weights available on Hugging Face and GitHub
  • Compatible with OpenAI SDK for easy migration
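At the listed rates ($0.60/M input, $2.50/M output), the cost of a single call is just token counts times prices. A quick back-of-the-envelope helper (the example token counts are hypothetical):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float = 0.60,
                     output_price_per_m: float = 2.50) -> float:
    """Cost of one API call at per-million-token rates."""
    return (input_tokens / 1e6) * input_price_per_m \
         + (output_tokens / 1e6) * output_price_per_m

# e.g. a 100K-token prompt with a 20K-token response:
cost = request_cost_usd(100_000, 20_000)
print(f"${cost:.3f}")  # $0.06 input + $0.05 output = $0.110
```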

Use Cases: K2.5 excels at multi-modal AI agents, visual analysis, web development with autonomous iteration, and tool-augmented agentic workflows. The Agent Swarm technology particularly shines in wide information gathering tasks requiring parallel execution.

7) MiniMax M2.5: The $1/Hour Frontier Model

Latest Release: MiniMax M2.5 released February 12, 2026, positioning itself as "intelligence too cheap to meter."

M2.5 Performance Benchmarks

  • 80.2% on SWE-Bench Verified (within 0.6 percentage points of Claude Opus 4.6)
  • 51.3% on Multi-SWE-Bench (best-in-industry multilingual result)
  • 55.4% on SWE-Bench Pro
  • 76.3% on BrowseComp (with context management)
  • 37% faster than M2.1, matching Claude Opus 4.6's speed

Cost Revolution

  • $1 per hour of continuous generation at 100 tokens/second
  • 1/10 to 1/20 the cost of GPT-5.3-Codex and Claude Opus 4.6
  • M2.5: $0.30/M input, $2.40/M output
  • M2.5-Lightning: 100 TPS variant with same pricing
  • Makes 24/7 agentic operation economically feasible
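The "$1 per hour" figure is consistent with the listed output price: 100 tokens/second of continuous generation yields 360K output tokens per hour, which at $2.40/M costs about $0.86 before input-token charges. A sanity check of that arithmetic, including a 31-day month of round-the-clock use at the quoted $1/hour rate:

```python
# Sanity check of the "$1/hour" claim from the listed rates.
tokens_per_second = 100
output_price_per_m = 2.40                      # USD per million output tokens

tokens_per_hour = tokens_per_second * 3600     # 360,000 output tokens
output_cost_per_hour = tokens_per_hour / 1e6 * output_price_per_m

# A 31-day month of 24/7 operation at roughly $1/hour:
hours_per_month = 24 * 31                      # 744 hours
monthly_cost_at_one_dollar = hours_per_month * 1.0

print(f"output cost/hour: ${output_cost_per_hour:.3f}")
print(f"month of 24/7 use at $1/hour: ${monthly_cost_at_one_dollar:.0f}")
```

The ~$744 monthly figure is where the "run 24/7 for an entire month for less than $750" framing comes from.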

Technical Architecture

  • 230B total parameters, 10B active (MoE architecture)
  • Trained on 200,000+ real-world environments
  • Supports 13 programming languages (Go, C, C++, TypeScript, Rust, Kotlin, Python, Java, JavaScript, PHP, Lua, Dart, and Ruby)
  • Forge RL Framework: Agent-native reinforcement learning with 40x training speedup
  • Spec-Writing Tendency: Plans like an architect before coding

Full-Stack Capabilities

  • 0-to-1 system design and environment setup
  • 1-to-10 system development
  • 10-to-90 feature iteration
  • 90-to-100 comprehensive code review and testing
  • Web, Android, iOS, and Windows platform support
  • Server-side APIs, business logic, databases, and more

Office & Productivity

  • Word documents and LaTeX-enabled PDFs
  • PowerPoint presentations with professional layouts
  • Excel spreadsheets with formulas, pivot tables, and charts
  • Evaluated on MEWC (Microsoft Excel World Championship) problems

Availability

  • Open-sourced on HuggingFace and GitHub
  • API access via minimax.io
  • Free tier available through OpenHands Cloud (limited time)
  • Supports private cluster deployment and fine-tuning
  • Compatible with vLLM and SGLang for optimal performance

Market Impact: M2.5 represents the first frontier model where cost is truly negligible for continuous use. At $1/hour, developers can run M2.5 24/7 for an entire month for less than $750, making it a game-changer for agentic applications and automated workflows.

8) Other Open-Weight Coding Models

DeepSeek V3.2 (Dec 1, 2025): Released V3.2 with open-source weights and announced a major API price reduction. Continues as a cost-effective alternative, with 88% on AIME 2025 and 82% on GPQA Diamond.

Qwen3-Coder-Next (Feb 2026): Open-weight MoE coder model with up to 256K context (per model card). Designed for large-scale code generation and refactoring tasks.

Industry Landscape & Competitive Position

| Model | Best For | Key Strength | Released |
|---|---|---|---|
| Claude Opus 4.6 | Coding & agents | Improved coding/reasoning/agentic tasks (plus Fast mode preview) | Feb 2026 |
| GPT-5.3-Codex | Software engineering | Faster agentic coding; new highs on SWE-Bench Pro & Terminal-Bench | Feb 5, 2026 |
| MiniMax M2.5 | Cost-efficient coding | 80.2% SWE-Bench at 1/10 the cost; $1/hour frontier model | Feb 12, 2026 |
| GLM-5 | Agentic engineering | 744B params, sovereign Chinese AI, competitive with Gemini 3 Pro | Feb 11, 2026 |
| Kimi K2.5 | Agent Swarm | 100 parallel agents, 4.5x faster execution, visual agentic intelligence | Jan 27, 2026 |
| Mistral Large 3 | Open-weight flagship | 675B MoE, #2 open-source on LMArena, Apache 2.0 license | Dec 2, 2025 |
| GPT-5.2 | General purpose | Aug 2025 cutoff; very long context and large outputs | Dec 11, 2025 |
| Gemini 3 Deep Think | Deep reasoning | Deep Think mode availability expands (Ultra + API) | Feb 2026 |
| Ministral 3 | Edge AI | 3B/8B/14B compact models, 85% AIME with 14B reasoning variant | Dec 2, 2025 |
| GLM-4.7 | Coding agents | 84.9% LiveCodeBench, Preserved Thinking, $3/month | Dec 22, 2025 |
| DeepSeek V3.2 | Open-weight coding | Open-source weights + major API price reduction | Dec 1, 2025 |
| Qwen3-Coder-Next | Open-weight coding | MoE coder model with up to 256K context | Feb 2026 |

Key Trends in Early 2026

  • Cost disruption: MiniMax M2.5 at $1/hour and GLM-4.7 at $3/month force repricing across the industry.
  • Agent-first coding: Frontier releases now optimize for long-running software engineering tasks, not just chat.
  • Multi-agent systems: Kimi K2.5's Agent Swarm and MiniMax's parallel execution show the future of AI workflows.
  • European open-source push: Mistral Large 3 and Ministral 3 establish Europe as a credible alternative to US/China dominance.
  • Chinese AI sovereignty: GLM models trained entirely on domestic chips prove independence from US export restrictions.
  • Speed/quality knobs: "Fast modes" and tiered inference are becoming standard for agent loops.
  • Open-weight momentum: MiniMax M2.5, GLM-5, Mistral Large 3 keep improving and are easier to deploy across platforms.
  • Long-context becomes practical: Bigger contexts matter when agents must stay grounded in repo history and docs.
  • Benchmark evolution: SWE-bench variants + terminal-style tasks are now mainstream evaluation targets.
  • Platform integration: Models ship alongside app/IDE tooling for end-to-end workflows.
  • On-device AI: Ministral 3 and Voxtral show frontier capabilities can run on edge devices.

Sources & Official Links

Frequently Asked Questions (FAQ)

Is GPT-6 released yet?

As of February 13, 2026, there is no official GPT-6 launch announcement in the sources linked on this page.

Is Llama 5 released?

As of February 13, 2026, Meta's latest official open-weight family is Llama 4; there is no Llama 5 announcement in the sources linked on this page.

Is Grok 5 released?

As of February 13, 2026, xAI's latest public releases include Grok 4.1 and fast variants; there is no Grok 5 launch announcement in the sources linked on this page.

What are the newest coding-focused models?

Recent coding-focused updates include OpenAI's GPT-5.3-Codex, Anthropic's Claude Opus 4.6, MiniMax M2.5, Zhipu's GLM-5, Mistral's Devstral 2, plus open-weight options like DeepSeek V3.2 and Qwen3-Coder-Next.

What should I use for cost-efficient coding?

MiniMax M2.5 at $1/hour and GLM-4.7 at $3/month represent the most cost-efficient frontier coding models. For self-hosting, consider Mistral Large 3, Kimi K2.5, or GLM-5, all available as open weights.

What about European AI models?

Mistral AI leads Europe's open-source AI efforts with Mistral Large 3 (675B MoE), Ministral 3 (3B/8B/14B edge models), and Devstral 2 for coding. All released under Apache 2.0 license with strong performance and privacy focus.

What's the best model for multi-agent workflows?

Kimi K2.5's Agent Swarm technology can coordinate up to 100 specialized AI agents in parallel, achieving 4.5x faster execution on complex tasks compared to sequential approaches.

Bottom Line (Feb 2026)

Early 2026 is defined by three major shifts: cost disruption (MiniMax M2.5, GLM-4.7), multi-agent systems (Kimi K2.5's Agent Swarm), and geopolitical AI sovereignty (Chinese models on domestic chips, European open-source push).

The gap between open-weight and proprietary models continues to narrow. MiniMax M2.5 matches Claude Opus 4.6 on SWE-Bench at 1/10 the cost. GLM-5 challenges Gemini 3 Pro. Mistral Large 3 ranks #2 among open-source models. The era of expensive, closed AI may be ending faster than expected.

Next frontier: workflow integration, tool-use robustness, multi-modal agent coordination, and continued price/performance improvements from open-weight models.

Last Updated: February 13, 2026

Information compiled from official announcements and release notes (see sources above)