DeepSeek just dropped two insanely powerful AI models that rival GPT-5, and they’re totally free
Chinese artificial intelligence startup DeepSeek released two powerful new AI models on Sunday that the company claims match or exceed the capabilities of OpenAI’s GPT-5 and Google’s Gemini-3.0-Pro, a development that could reshape the competitive landscape between American tech giants and their Chinese challengers.
The Hangzhou-based company launched DeepSeek-V3.2, designed as an everyday reasoning assistant, alongside DeepSeek-V3.2-Speciale, a high-powered variant that achieved gold-medal performance in four elite international competitions: the 2025 International Mathematical Olympiad, the International Olympiad in Informatics, the ICPC World Finals, and the China Mathematical Olympiad.
The release carries profound implications for American technology leadership. DeepSeek has once again demonstrated that it can produce frontier AI systems despite U.S. export controls that restrict China’s access to advanced Nvidia chips, and it has done so while making its models freely available under an open-source MIT license.
“People thought DeepSeek gave a one-time breakthrough but we came back much bigger,” wrote Chen Fang, who identified himself as a contributor to the project, on X (formerly Twitter). The release drew swift reactions online, with one user declaring: “Rest in peace, ChatGPT.”
How DeepSeek’s sparse attention breakthrough slashes computing costs
At the heart of the new release lies DeepSeek Sparse Attention, or DSA, a novel architectural innovation that dramatically reduces the computational burden of running AI models on long documents and complex tasks.
Traditional AI attention mechanisms, the core technology allowing language models to understand context, scale poorly as input length increases. Processing a document twice as long typically requires four times the computation. DeepSeek’s approach breaks this constraint using what the company calls a “lightning indexer” that identifies only the most relevant portions of context for each query, ignoring the rest.
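The two-stage idea is easiest to see in code. Below is a minimal, hypothetical sketch of the pattern the article describes; the function name, the `W_index` weights, and the top-k selection rule are illustrative assumptions, not DeepSeek’s published implementation:

```python
import numpy as np

def sparse_attention(q, K, V, W_index, k=64):
    """Attend a single query q over context (K, V), using a cheap indexing
    pass to select only the top-k most relevant positions."""
    # Stage 1 ("lightning indexer"): a lightweight relevance score per
    # position. A real indexer is far cheaper than full attention; here a
    # single linear map stands in for it.
    scores = K @ (W_index @ q)                    # shape (n,)
    top = np.argsort(scores)[-k:]                 # indices of the k best positions

    # Stage 2: exact softmax attention, restricted to the selected subset.
    logits = (K[top] @ q) / np.sqrt(q.shape[0])   # shape (k,)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    return weights @ V[top]                       # shape (d,)

rng = np.random.default_rng(0)
n, d = 4096, 64
out = sparse_attention(rng.normal(size=d),
                       rng.normal(size=(n, d)),
                       rng.normal(size=(n, d)),
                       0.1 * rng.normal(size=(d, d)))
# Full attention would mix all 4,096 positions; here the expensive softmax
# touches only 64, which is where the long-context savings come from.
```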
According to DeepSeek’s technical report, DSA reduces inference costs by roughly half compared to previous models when processing long sequences. The architecture “substantially reduces computational complexity while preserving model performance,” the report states.
Processing 128,000 tokens, roughly equivalent to a 300-page book, now costs approximately $0.70 per million tokens for decoding, compared to $2.40 for the previous V3.1-Terminus model. That represents a 70% reduction in inference costs.
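The quoted figures are easy to sanity-check; the prices come from the paragraph above, and the rest is arithmetic:

```python
old, new = 2.40, 0.70          # USD per million tokens, decoding
print(f"reduction: {(old - new) / old:.0%}")   # -> 71%, matching the ~70% claim

tokens = 128_000               # one full-context pass
print(f"V3.1-Terminus: ${old * tokens / 1e6:.3f}  V3.2: ${new * tokens / 1e6:.3f}")
# V3.1-Terminus: $0.307  V3.2: $0.090
```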
The 685-billion-parameter models support context windows of 128,000 tokens, making them suitable for analyzing lengthy documents, codebases, and research papers. DeepSeek’s technical report notes that independent evaluations on long-context benchmarks show V3.2 performing on par with or better than its predecessor “despite incorporating a sparse attention mechanism.”
The benchmark results that put DeepSeek in the same league as GPT-5
DeepSeek’s claims of parity with America’s leading AI systems rest on extensive testing across mathematics, coding, and reasoning tasks — and the numbers are striking.
On AIME 2025, a prestigious American mathematics competition, DeepSeek-V3.2-Speciale achieved a 96.0% pass rate, compared to 94.6% for GPT-5-High and 95.0% for Gemini-3.0-Pro. On the Harvard-MIT Mathematics Tournament, the Speciale variant scored 99.2%, surpassing Gemini’s 97.5%.
The standard V3.2 model, optimized for everyday use, scored 93.1% on AIME and 92.5% on HMMT, marginally below the frontier models but achieved with substantially fewer computational resources.
Most striking are the competition results. DeepSeek-V3.2-Speciale scored 35 out of 42 points on the 2025 International Mathematical Olympiad, earning gold-medal status. At the International Olympiad in Informatics, it scored 492 out of 600 points, also gold, ranking 10th overall. The model solved 10 of 12 problems at the ICPC World Finals, placing second.
These results came without internet access or tools during testing. DeepSeek’s report states that “testing strictly adheres to the contest’s time and attempt limits.”
On coding benchmarks, DeepSeek-V3.2 resolved 73.1% of real-world software bugs on SWE-bench Verified, competitive with GPT-5-High at 74.9%. On Terminal Bench 2.0, which measures complex coding workflows, DeepSeek scored 46.4%, well above GPT-5-High’s 35.2%.
The company acknowledges limitations. “Token efficiency remains a challenge,” the technical report states, noting that DeepSeek “typically requires longer generation trajectories” to match Gemini-3.0-Pro’s output quality.
Why teaching AI to think while using tools changes everything
Beyond raw reasoning, DeepSeek-V3.2 introduces “thinking in tool-use”: the ability to reason through problems while simultaneously executing code, searching the web, and manipulating files.
Previous AI models faced a frustrating limitation: each time they called an external tool, they lost their train of thought and had to restart reasoning from scratch. DeepSeek’s architecture preserves the reasoning trace across multiple tool calls, enabling fluid multi-step problem solving.
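To make the contrast concrete, here is a minimal, hypothetical agent loop in the preserved-trace style the article describes. Every name in it (`Step`, `call_model`, `run_tool`, the message roles) is an illustrative assumption, not DeepSeek’s published interface:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Step:
    reasoning: str
    tool_call: Optional[dict]  # None when the model has finished

def solve(task, call_model, run_tool, max_steps=8):
    # One growing transcript: reasoning, tool calls, and tool results all
    # accumulate, so step N can build on the thinking from steps 1..N-1
    # instead of restarting from scratch after every tool call.
    transcript = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = call_model(transcript)
        transcript.append({"role": "assistant", "content": step.reasoning})
        if step.tool_call is None:
            return step.reasoning                 # done: final answer
        result = run_tool(step.tool_call)         # e.g. search, code execution
        transcript.append({"role": "tool", "content": result})
    return transcript[-1]["content"]

# Stub model and tool, just to show the loop executing end to end.
def fake_model(transcript):
    if not any(m["role"] == "tool" for m in transcript):
        return Step("I need to look this up.", {"name": "search", "query": "DSA"})
    return Step("Based on the search result, here is the answer.", None)

print(solve("What is DSA?", fake_model, lambda call: "DSA = DeepSeek Sparse Attention"))
```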
To train this capability, the company built a massive synthetic data pipeline generating over 1,800 distinct task environments and 85,000 complex instructions. These included challenges like multi-day trip planning with budget constraints, software bug fixes across eight programming languages, and web-based research requiring dozens of searches.
The technical report describes one example: planning a three-day trip from Hangzhou with constraints on hotel prices, restaurant ratings, and attraction costs that vary based on accommodation choices. Such tasks are “hard to solve but easy to verify,” making them ideal for training AI agents.
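The “easy to verify” half of that asymmetry is the key to using such tasks for training: checking a candidate plan against hard constraints takes a few lines, even though producing a plan that satisfies them all is a real search problem. A toy illustration, with invented fields and limits:

```python
def verify_trip(plan, budget=500.0, min_rating=4.0):
    """Cheap verifier: total cost within budget, restaurants rated highly enough."""
    total = sum(item["cost"] for item in plan)
    ratings_ok = all(item["rating"] >= min_rating
                     for item in plan if item["kind"] == "restaurant")
    return total <= budget and ratings_ok

plan = [
    {"kind": "hotel", "cost": 320.0, "rating": 4.5},
    {"kind": "restaurant", "cost": 60.0, "rating": 4.2},
    {"kind": "attraction", "cost": 45.0, "rating": 4.8},
]
print(verify_trip(plan))  # True: within budget, restaurant ratings pass
```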
DeepSeek employed real-world tools during training, including actual web search APIs, coding environments, and Jupyter notebooks, while generating synthetic prompts to ensure diversity. The result is a model that generalizes to unseen tools and environments, a critical capability for real-world deployment.
DeepSeek’s open-source gambit could upend the AI industry’s business model
Unlike OpenAI and Anthropic, which guard their most powerful models as proprietary assets, DeepSeek has released both V3.2 and V3.2-Speciale under the MIT license, one of the most permissive open-source licenses available.
Any developer, researcher, or company can download, modify, and deploy the 685-billion-parameter models without restriction. Full model weights, training code, and documentation are available on Hugging Face, the leading platform for AI model sharing.
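For readers who want the weights themselves, fetching them takes a few lines with the `huggingface_hub` library. Note the repository id below is a guess for illustration, not a confirmed identifier, and a 685-billion-parameter checkpoint runs to hundreds of gigabytes:

```python
from huggingface_hub import snapshot_download

# Placeholder repo id, assumed for illustration; check the model card for
# the real identifier before downloading (the full checkpoint is very large).
path = snapshot_download(repo_id="deepseek-ai/DeepSeek-V3.2")
print("weights at", path)
```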
The strategic implications are significant. By making frontier-capable models freely available, DeepSeek undermines competitors charging premium API prices. The Hugging Face model card notes that DeepSeek has provided Python scripts and test cases “demonstrating how to encode messages in OpenAI-compatible format” — making migration from competing services straightforward.
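In practice, “OpenAI-compatible” means the standard `openai` Python client works once it is pointed at a different server. A sketch, with a placeholder base URL and model name (neither is a confirmed endpoint):

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://your-deepseek-host.example/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)
resp = client.chat.completions.create(
    model="deepseek-v3.2",  # placeholder model identifier
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain sparse attention in one sentence."},
    ],
)
print(resp.choices[0].message.content)
```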
For enterprise customers, the value proposition is compelling: frontier performance at dramatically lower cost, with deployment flexibility. But data residency concerns and regulatory uncertainty may limit adoption in sensitive applications — particularly given DeepSeek’s Chinese origins.
Regulatory walls are rising against DeepSeek in Europe and America
DeepSeek’s global expansion faces mounting resistance. In June, Berlin’s data protection commissioner Meike Kamp declared that DeepSeek’s transfer of German user data to China is “unlawful” under EU rules, asking Apple and Google to consider blocking the app.
The German authority expressed concern that “Chinese authorities have extensive access rights to personal data within the sphere of influence of Chinese companies.” Italy ordered DeepSeek to block its app in February. U.S. lawmakers have moved to ban the service from government devices, citing national security concerns.
Questions also persist about U.S. export controls designed to limit China’s AI capabilities. In August, DeepSeek hinted that China would soon have “next generation” domestically built chips to support its models. The company indicated its systems work with Chinese-made chips from Huawei and Cambricon without additional setup.
DeepSeek’s original V3 model was reportedly trained on roughly 2,000 older Nvidia H800 chips, hardware since restricted for export to China. The company has not disclosed what powered V3.2’s training, but its continued advancement suggests export controls alone cannot halt Chinese AI progress.
What DeepSeek’s release means for the future of AI competition
The release arrives at a pivotal moment. After years of massive investment, some analysts question whether an AI bubble is forming. DeepSeek’s ability to match American frontier models at a fraction of the cost challenges assumptions that AI leadership requires enormous capital expenditure.
The company’s technical report reveals that post-training investment now exceeds 10% of pre-training costs, a substantial allocation credited for the reasoning improvements. But DeepSeek acknowledges gaps: “The breadth of world knowledge in DeepSeek-V3.2 still lags behind leading proprietary models,” the report states. The company plans to address this by scaling pre-training compute.
DeepSeek-V3.2-Speciale remains available through a temporary API until December 15, when its capabilities will merge into the standard release. The Speciale variant is designed exclusively for deep reasoning and does not support tool calling, a limitation the standard model addresses.
For now, the AI race between the United States and China has entered a new phase. DeepSeek’s release demonstrates that open-source models can achieve frontier performance, that efficiency innovations can slash costs dramatically, and that the most powerful AI systems may soon be freely available to anyone with an internet connection.
As one commenter on X observed: “Deepseek just casually breaking those historic benchmarks set by Gemini is bonkers.”
The question is no longer whether Chinese AI can compete with Silicon Valley. It’s whether American companies can maintain their lead when their Chinese rival gives comparable technology away for free.