Bring The Noise: Embracing Randomness Is the Key to Scaling Up Machine Learning Algorithms
Despite the progress made in scaling up data processing and storage, major bottlenecks still exist when using big data for statistical inference and predictive modeling. Many classic algorithms used for predictive modeling with big data are constrained by the very math that defines them. These algorithms tend to have complexities that do not scale linearly, so doubling your data often means quadrupling the time it takes to learn models. Many of today's statistical modelers have been trained using algorithms and software that have largely been optimized for a "small data" paradigm. New tools and new ways of approaching many machine learning problems are now called for in order to scale up industrial-sized applications. Fortunately, new theory, intuition, and tools are emerging to solve these problems. In this article, we introduce the work of Leon Bottou and John Langford (both coincidentally at Microsoft Research), two leading researchers in the field whose research missions have been to scale up machine learning. The work by Bottou, Langford, and their colleagues shows us that by adopting algorithmic strategies that classically would be considered inferior, we can paradoxically both scale up machine learning and end up with better predictive performance. From a predictive modeling perspective, more data certainly is better, but only if you can use it. The strategies presented here offer practitioners promising options for making that happen.