AI progress is not mostly driven by progress in hardware efficiency

Lowest compute points at any given time shown in blue, all points measured shown in gray. Efficiency doubling time of 16 months. Hernandez, D. and Brown, T.B. (2020) ‘Measuring the Algorithmic Efficiency of Neural Networks’. arXiv. Available at: https://doi.org/10.48550/arXiv.2005.04305.

Hernandez and Brown show that you can measure algorithmic progress by comparing the number of floating-point operations (the training compute) needed to train a model to the same level of accuracy as a model from a few years earlier, but using a different algorithm. They hold performance constant and analyse how training cost changes.

They found that between 2012 and 2019, the compute needed to train a neural network to AlexNet-level performance decreased by a factor of 44. This result surprised me, given that at the Moore’s Law rate of a doubling every two years, hardware alone would have given a factor of only about 11 over the same period. The factor was similar for other networks, shown below.
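To check these numbers for myself, here is a rough sketch of the arithmetic. It assumes the 2012 to 2019 window is exactly 7 years and that Moore’s Law means a doubling every 24 months; the function names are my own and not from the paper.

```python
import math


def doubling_time_months(factor: float, period_months: float) -> float:
    """Doubling time implied by an overall improvement `factor` over `period_months`."""
    return period_months / math.log2(factor)


def improvement_factor(doubling_months: float, period_months: float) -> float:
    """Overall improvement after `period_months` at a fixed doubling time."""
    return 2 ** (period_months / doubling_months)


# Assumption: 2012 to 2019 is treated as exactly 7 years.
PERIOD_MONTHS = 7 * 12

# Algorithmic efficiency: 44x less compute for AlexNet-level performance.
algo_doubling = doubling_time_months(44, PERIOD_MONTHS)

# Hardware baseline: assumed Moore's Law doubling every 24 months.
moore_factor = improvement_factor(24, PERIOD_MONTHS)

print(f"Implied algorithmic doubling time: {algo_doubling:.1f} months")
print(f"Moore's Law factor over the same period: {moore_factor:.1f}x")
```

Under these assumptions the implied doubling time comes out at about 15.4 months, roughly in line with the 16 months quoted in the paper, and the Moore’s Law factor at about 11.3, which matches the 11x above.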

This result confuses me, since I had assumed that AI progress was mostly predicted and driven by increases in data and compute.

Source: Hernandez, D. and Brown, T.B. (2020) ‘Measuring the Algorithmic Efficiency of Neural Networks’. arXiv. Available at: https://doi.org/10.48550/arXiv.2005.04305.

Links:
So far, the best predictor for AI progress was compute scaling
So far, AI progress is driven by the amount of data and compute during training