The DeepSeek R1 model was trained on NVIDIA H800 AI GPUs, while inference is reportedly handled by Huawei's new Ascend 910C, a Chinese-made AI chip.
One of DeepSeek's research papers reports that the state-of-the-art V3 LLM was trained on a cluster of 2,048 Nvidia H800 GPUs over roughly two months, a total of about 2.8 million GPU hours, roughly one-tenth of the computing power typically cited for comparable frontier models.
The H800 is a cut-down part designed to comply with the U.S. export controls released in 2022; even Nvidia's further restricted HGX H20 reportedly performs extraordinarily well despite its reduced specifications.
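As a rough sanity check, and assuming near-continuous utilization of the cluster, 2,048 GPUs x 24 hours/day x roughly 57 days works out to about 2.8 million GPU hours, consistent with the two-month training window cited in the paper.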