Hernandez and Brown show that you can measure algorithmic progress by comparing the training compute (floating-point operations, and hence training cost) that a newer algorithm needs to reach the same accuracy as a model from a few years earlier. In other words, they hold performance constant and analyse how training cost changes.
They found that between 2012 and 2019, the compute needed to train a neural network to AlexNet-level performance decreased by a factor of 44. This surprised me: over the same period, improvement at the original Moore’s Law rate would have yielded a factor of only about 11. The factor was similar for other networks, shown below.
The result also confuses me, since I had thought that progress was mostly driven, and could mostly be predicted, by increases in data and compute.
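To make the comparison concrete, here is a rough back-of-the-envelope check of the two factors. It is only a sketch: I am assuming that 2012 to 2019 counts as 7 years (84 months) and that the "original Moore's Law rate" means a doubling every 24 months. The function names are my own and do not come from the paper.

```python
import math


def doubling_time_months(factor: float, period_months: float) -> float:
    """Doubling time implied by a total improvement `factor` over `period_months`."""
    return period_months / math.log2(factor)


def improvement_factor(period_months: float, doubling_months: float) -> float:
    """Total improvement after `period_months` if efficiency doubles every `doubling_months`."""
    return 2 ** (period_months / doubling_months)


# Assumption: 2012 to 2019 is treated as 7 years.
PERIOD_MONTHS = 7 * 12

# Assumption: Moore's Law is taken as a doubling every 24 months.
MOORE_DOUBLING_MONTHS = 24

algo_doubling = doubling_time_months(44, PERIOD_MONTHS)
moore_factor = improvement_factor(PERIOD_MONTHS, MOORE_DOUBLING_MONTHS)

print(f"Implied doubling time for a 44x gain: {algo_doubling:.1f} months")
print(f"Moore's Law factor over the same period: {moore_factor:.1f}x")
```

Under these assumptions, the 44x gain implies a doubling time of a little over 15 months, while a 24-month doubling gives roughly 11x over the same 7 years, which matches the figure quoted above.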
Source: Hernandez, D. and Brown, T.B. (2020) ‘Measuring the Algorithmic Efficiency of Neural Networks’. arXiv. Available at: https://doi.org/10.48550/arXiv.2005.04305.