
The AI world is buzzing: Chinese AI powerhouse DeepSeek has unveiled a preview of V4, its highly anticipated new flagship model. The release marks a significant step forward, most notably in the model's ability to process far longer prompts, thanks to a redesigned attention mechanism that handles large amounts of text much more efficiently. And like its predecessors, DeepSeek V4 is released with open weights, meaning anyone can download, run, and even modify it.
This launch is DeepSeek’s most significant moment since January 2025, when its reasoning model, R1, stunned the global AI industry. Despite being trained on limited computing resources, R1’s strong performance and efficiency transformed DeepSeek from a relatively unknown research team into China’s most prominent AI company almost overnight. It even sparked a wave of open-weight model releases from other Chinese AI firms.
DeepSeek has maintained a somewhat low profile since that initial splash. Earlier this month, however, the company effectively teased V4's arrival by adding “expert” and “flash” modes to its online model, sparking speculation about a larger release to come. The return to the frontier follows a period of scrutiny for the company, including notable personnel departures, delays to previous model launches, and increasing attention from both the US and Chinese governments.
While V4 may not shake the AI field the way R1 did, its release matters for three reasons: it promises frontier capabilities at a fraction of the cost, it pushes the boundaries of long-context understanding, and it serves as a showcase for China's burgeoning domestic chip industry. Here is what stands out.
Frontier Performance at a Fraction of the Cost
Just like R1 before it, DeepSeek claims that V4's performance rivals the best models available at a significantly lower price. For developers and companies, that means access to cutting-edge capabilities without steep operational costs, and because the weights are open, they can run and adapt the model on their own terms.
The new model arrives in two distinct versions, both readily available on DeepSeek’s website, through its app, and via API access for developers. V4-Pro is the larger, more robust model, specifically engineered for complex coding tasks and sophisticated agentic workflows. Its counterpart, V4-Flash, is a smaller, more nimble version designed for speed and cost-efficiency in everyday applications.
Both versions feature sophisticated reasoning modes, allowing the model to carefully dissect user prompts and demonstrate its problem-solving process step-by-step. The pricing is where DeepSeek truly stands out: V4-Pro costs just $1.74 per million input tokens and $3.48 per million output tokens, a mere fraction of what comparable models from industry leaders like OpenAI and Anthropic charge. V4-Flash is even more affordable, priced at approximately $0.14 per million input tokens and $0.28 per million output tokens, making it one of the most cost-effective top-tier models on the market.
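At those per-million-token rates, estimating what a workload would cost is simple arithmetic. A minimal sketch, using the prices quoted above; the monthly token volumes in the example are invented purely for illustration:

```python
# USD per 1M tokens (input, output), as quoted for the V4 preview.
PRICES = {
    "V4-Pro":   (1.74, 3.48),
    "V4-Flash": (0.14, 0.28),
}

def monthly_cost(model, input_tokens, output_tokens):
    """Estimate API cost in USD for a given monthly token volume."""
    price_in, price_out = PRICES[model]
    return (input_tokens / 1e6) * price_in + (output_tokens / 1e6) * price_out

# Hypothetical workload: 10M input + 2M output tokens per month.
pro_cost = monthly_cost("V4-Pro", 10_000_000, 2_000_000)      # 17.40 + 6.96
flash_cost = monthly_cost("V4-Flash", 10_000_000, 2_000_000)  # 1.40 + 0.56
```

For that hypothetical workload, V4-Pro comes to about $24 a month and V4-Flash to about $2, which is the kind of gap that makes Flash attractive for high-volume, everyday applications.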
In terms of raw performance, V4 represents a substantial improvement over R1 and positions itself as a strong contender against virtually all major AI models released recently. Company-shared benchmarks show DeepSeek V4-Pro competing head-to-head with leading closed-source models, including Anthropic’s Claude-Opus-4.6, OpenAI’s GPT-5.4, and Google’s Gemini-3.1. When stacked against other open-source alternatives like Alibaba’s Qwen-3.5 or Z.ai’s GLM-5.1, DeepSeek V4 surpasses them all in critical areas such as coding, mathematics, and STEM problems, solidifying its place as one of the most powerful open-source models ever released.
DeepSeek also highlights that V4-Pro now ranks among the strongest open-source models for agentic coding tasks and excels in tests measuring its ability to handle multi-step problems. Its writing capabilities and broad world knowledge also lead the field, according to the company's benchmark results. An internal survey of 85 experienced developers found that over 90% included V4-Pro among their top model choices for coding tasks, underscoring its practical appeal. Moreover, DeepSeek has specifically optimized V4 for popular agent frameworks such as Claude Code, OpenClaw, and CodeBuddy, easing integration for developers.
Revolutionizing Long Context Processing
One of V4’s most significant innovations is its remarkable long context window, which dictates the amount of text the model can process simultaneously. Both V4-Pro and V4-Flash can comfortably handle an astounding 1 million tokens – a capacity large enough to encompass all three volumes of The Lord of the Rings and The Hobbit combined! This impressive context window size is now the default across all DeepSeek services and matches the capabilities offered by cutting-edge versions of models like Gemini and Claude.
This leap isn’t just about size; it’s about a fundamental shift in how the model operates. V4 introduces significant architectural changes, particularly within its attention mechanism – the core feature that helps AI models understand how each part of a prompt relates to the whole. As text grows longer, these internal comparisons become incredibly resource-intensive, making the attention mechanism a primary bottleneck for long-context models.
DeepSeek's approach is to make the model more selective about what it “pays attention to.” Instead of treating all past text as equally important, V4 compresses older information and prioritizes the parts most relevant to the current moment, while still keeping nearby text in full detail so that no critical recent information is missed. This sharply reduces the computational cost of using long contexts.
For instance, when processing 1 million tokens, V4-Pro uses only 27% of the computing power and just 10% of the memory required by its predecessor, V3.2. The reductions for V4-Flash are even more dramatic, consuming only 10% of the computing power and 7% of the memory. In practical terms, this dramatically lowers the cost of building tools that need to analyze vast amounts of material, such as an AI coding assistant that can read an entire codebase or a research agent that can sift through long archives without “forgetting” earlier context.
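The selection idea described above can be sketched in a few lines of Python. Everything here is illustrative: the block size, the top-k cutoff, and the mean-pooling “compression” are stand-ins for whatever DeepSeek actually uses, which the article does not specify.

```python
# Illustrative sketch: attend fully to a recent local window, and keep only
# the top-k most relevant compressed blocks of older context. All parameters
# and the pooling scheme are assumptions, not DeepSeek's actual design.

def mean_pool(vectors):
    """Compress a block of token vectors into one summary vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def dot(a, b):
    """Similarity score between two vectors."""
    return sum(x * y for x, y in zip(a, b))

def select_context(tokens, query, local_window=4, block_size=4, top_k=1):
    """Return the token positions the model would attend to.

    Recent `local_window` tokens are always kept in full detail; older
    tokens are grouped into blocks, scored against the current query via
    a pooled summary, and only the `top_k` best blocks are retained.
    """
    n = len(tokens)
    local = list(range(max(0, n - local_window), n))   # recent, full detail
    older = list(range(0, max(0, n - local_window)))   # compression candidates

    # Group older positions into fixed-size blocks and rank each block's
    # pooled summary by relevance to the query.
    blocks = [older[i:i + block_size] for i in range(0, len(older), block_size)]
    ranked = sorted(
        blocks,
        key=lambda b: dot(mean_pool([tokens[i] for i in b]), query),
        reverse=True,
    )

    # Attend only to the most relevant older blocks plus the local window.
    selected = sorted(i for b in ranked[:top_k] for i in b)
    return selected + local
```

With 16 tokens, a local window of 4, and one selected block of 4, the model attends to 8 positions instead of all 16; scaled up to a million tokens, that same pruning principle is what produces the compute and memory reductions reported above.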
Forging China’s AI Self-Reliance
Significantly, V4 is DeepSeek’s first model optimized for domestic Chinese chips, such as Huawei’s Ascend series. This strategic move transforms the launch into a crucial test of whether China’s homegrown AI industry can begin to lessen its dependence on US chip giant Nvidia. This development was largely anticipated, especially after reports surfaced that DeepSeek bypassed American chipmakers like Nvidia and AMD, granting early access only to Chinese manufacturers – a stark departure from industry norms for pre-release optimization.
Following this, Huawei swiftly confirmed that its Ascend supernode products, based on the Ascend 950 series, would fully support DeepSeek V4. This ensures that companies and individuals wishing to run their own modified versions of DeepSeek V4 can do so seamlessly using Huawei chips. The pressure to integrate domestic hardware is not new; Chinese government officials reportedly recommended that DeepSeek incorporate Huawei chips into its training process.
This aligns with a broader pattern in China’s industrial policy, where strategic sectors are increasingly pushed to align with national self-reliance goals. The urgency is particularly acute in AI, given that US export controls since 2022 have restricted Chinese firms’ access to Nvidia’s most powerful chips. Beijing’s response has been to aggressively accelerate the development of a domestic AI stack, spanning from chips to software frameworks and data centers.
Replacing Nvidia, however, is not a simple chip-for-chip swap. Nvidia’s dominance stems not just from its hardware, but from its robust software ecosystem that developers have spent years building around. Transitioning to Huawei’s Ascend chips necessitates adapting model code, rebuilding tools, and rigorously proving the stability of these new systems for serious deployment.
While DeepSeek has made significant strides, it appears the transition away from Nvidia is not yet complete. The company’s technical report indicates that Chinese chips are primarily being used to run V4 for inference – meaning when the model is asked to complete a task. However, according to experts like Tsinghua University professor Liu Zhiyuan, DeepSeek may have only partially adapted V4’s training process for Chinese chips, suggesting that key long-context features might still rely on Nvidia hardware. Sources close to the matter, speaking anonymously due to political sensitivities, also noted that Chinese chips, while improving, are currently better suited for inference than for the intensive demands of AI model training.
DeepSeek is also strategically linking the future costs of V4 to this hardware shift. The company projects that V4-Pro prices could fall significantly once Huawei’s Ascend 950 supernodes begin shipping at scale in the second half of this year. If successful, this ambitious initiative with V4 could very well be an early, powerful sign that China is effectively building a robust, parallel AI infrastructure.
Source: MIT Tech Review – AI