arXiv

How Much Progress Has There Been in NVIDIA Datacenter GPUs?

June 2, 2026 · Emanuele Del Sozzo, Martin Fleming, Kenneth Flamm, Neil Thompson

Title: Assessing Advancements in NVIDIA Datacenter GPUs

Abstract: As modern Graphics Processing Units (GPUs) grow indispensable for a wide array of computational tasks, evaluating their historical and contemporary evolution is critical for predicting future limitations on scientific inquiry. This assessment is especially urgent within the Artificial Intelligence (AI) sector, where intense global rivalry and swift technological strides have prompted the United States to enforce export control measures that restrict international access to cutting-edge AI chips. Accordingly, this study investigates the technical evolution of NVIDIA datacenter GPUs spanning from the mid-2000s to 2025. Key findings reveal doubling periods of 1.43 and 1.67 years for FP16 and FP32 dense operations, respectively, while FP64 doubling times vary between 2.05 and 3.79 years. In contrast, off-chip memory capacity and bandwidth have expanded at a more moderate pace, doubling every 3.29 to 3.41 years. Meanwhile, release prices and power consumption have roughly doubled every 5.03 and 15 years, respectively. Furthermore, a comparative analysis of each year's leading GPUs indicates that although NVIDIA's performance edge is diminishing, the narrowing is not yet large enough to trigger a significant market realignment. Lastly, the paper quantifies the impact of existing U.S. export restrictions and the resulting performance disparities, noting that newly suggested policy adjustments could reduce the gap from 23.6X to 3.54X.
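To make the doubling times in the abstract easier to compare, the following minimal sketch converts a doubling time into the implied growth factor per year and over a longer span. It assumes a constant exponential trend, which is a simplification and not necessarily how the paper fits its estimates. The function names and the dictionary are hypothetical and chosen for illustration; only the numeric values come from the abstract.

```python
# Doubling times in years, copied from the abstract (illustrative selection).
DOUBLING_TIMES_YEARS = {
    "FP16 dense operations": 1.43,
    "FP32 dense operations": 1.67,
    "release price": 5.03,
    "power consumption": 15.0,
}


def annual_growth_factor(doubling_time_years: float) -> float:
    """Per-year multiplier implied by a constant doubling time."""
    if doubling_time_years <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (1.0 / doubling_time_years)


def growth_over(years: float, doubling_time_years: float) -> float:
    """Total multiplier after `years`, assuming the trend stays constant."""
    if doubling_time_years <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (years / doubling_time_years)


def gap_reduction(gap_before: float, gap_after: float) -> float:
    """How many times smaller the gap becomes (before divided by after)."""
    return gap_before / gap_after


if __name__ == "__main__":
    for name, t in DOUBLING_TIMES_YEARS.items():
        print(f"{name}: x{annual_growth_factor(t):.3f} per year, "
              f"x{growth_over(10, t):.1f} over a decade")
    # Export-control gap figures quoted in the abstract: 23.6X and 3.54X.
    print(f"gap shrinks by a factor of {gap_reduction(23.6, 3.54):.2f}")
```

Under this assumption, a shorter doubling time always yields a larger annual multiplier, which is why compute throughput outpaces memory, price, and power in the reported figures.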

Source: arXiv Generated at: 2026-06-02 00:00:00 UTC