The affordability of DeepSeek is a myth: The revolutionary AI actually cost $1.6 billion to develop

DeepSeek's chatbot has made a significant impact on the AI market, even causing a notable drop in Nvidia's stock price. The company's AI model, known for its advanced architecture and training methods, has drawn attention for the quality of its answers.

DeepSeek's model stands out due to its use of several cutting-edge technologies:

Multi-token Prediction (MTP): Instead of predicting only the next token, the model predicts several upcoming tokens at once, which improves both accuracy and efficiency.

Mixture of Experts (MoE): DeepSeek V3 uses 256 expert sub-networks, only eight of which are activated for each token. Because most of the model stays idle for any given token, training is faster and performance improves.

Multi-head Latent Attention (MLA): This attention mechanism compresses keys and values into a compact latent representation, which reduces memory use while still letting the model attend to the most important parts of the input and capture nuances in the data.
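To make the Mixture of Experts idea concrete, here is a minimal sketch of top-k expert routing using the 256-expert, eight-active figures quoted above. It is a generic illustration, not DeepSeek's actual router: the `route_token` function, the random scores, and the softmax gate over the selected experts are all assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 256  # expert sub-networks, as quoted in the article
TOP_K = 8          # experts activated per token, as quoted in the article


def route_token(scores, top_k=TOP_K):
    """Hypothetical router: keep the top_k experts and normalize their gate weights."""
    # Indices of the top_k highest-scoring experts.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Softmax over the selected scores only, so the gate weights sum to 1.
    peak = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Stand-in affinity scores for one token (random, purely for illustration).
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

routing = route_token(scores)
active_fraction = TOP_K / NUM_EXPERTS

print(f"Experts used for this token: {len(routing)} of {NUM_EXPERTS}")
print(f"Fraction of experts active: {active_fraction:.2%}")
```

With eight of 256 experts active, only about 3% of the expert sub-networks do any work for a given token, which is where the efficiency gain comes from.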

DeepSeek claims to have trained its powerful neural network, DeepSeek V3, for just $6 million using 2,048 GPUs, but further analysis reveals a far more substantial investment in infrastructure. DeepSeek operates a large computational setup of around 50,000 Nvidia Hopper GPUs, including models such as the H800 and H100, spread across multiple data centers. This infrastructure supports AI training, research, and financial modeling, with a total server investment of about $1.6 billion and operational costs of approximately $944 million.
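A quick back-of-the-envelope comparison shows how small the headline figure is next to the infrastructure numbers. The sketch below uses only the figures quoted above. Adding server investment and operational costs together is a simplifying assumption for illustration, and because the same hardware also serves research and financial modeling, the total should not be read as the cost of a single model.

```python
# Figures as quoted in the article, in US dollars.
CLAIMED_TRAINING_COST = 6_000_000
SERVER_CAPEX = 1_600_000_000
OPERATING_COSTS = 944_000_000

# Assumption for illustration: treat capex and operating costs as one total.
total_infrastructure = SERVER_CAPEX + OPERATING_COSTS

share_of_capex = CLAIMED_TRAINING_COST / SERVER_CAPEX
share_of_total = CLAIMED_TRAINING_COST / total_infrastructure

print(f"Capex plus operating costs: ${total_infrastructure / 1e9:.3f} billion")
print(f"Claimed training cost as share of capex: {share_of_capex:.3%}")
print(f"Claimed training cost as share of total: {share_of_total:.3%}")
```

On these numbers, the claimed $6 million works out to well under half a percent of the server investment alone.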

DeepSeek, a subsidiary of the Chinese hedge fund High-Flyer, was spun off in 2023 to focus on AI technologies. Unlike many startups that rely on cloud providers, DeepSeek owns its data centers, which gives it full control over AI model optimization and lets it roll out innovations faster. Because the company is self-funded, it can also make decisions quickly and flexibly.

DeepSeek attracts top talent from leading Chinese universities, with some researchers earning over $1.3 million annually. However, the company's claim of training DeepSeek V3 for just $6 million is misleading, as this figure only accounts for GPU usage during pre-training and does not include other significant expenses like research, model refinement, data processing, and infrastructure costs.

Since its founding, DeepSeek has invested over $500 million in AI development. Its compact structure lets it implement AI innovations more quickly and effectively than larger, more bureaucratic companies. While DeepSeek's success is driven by substantial investments, technical breakthroughs, and a strong team, the notion of a "revolutionary budget" for AI model development is somewhat overstated.

Nonetheless, DeepSeek's costs are still lower than those of its competitors. For example, DeepSeek spent $5 million on the R1 model, while training OpenAI's GPT-4o cost $100 million.
