dstillery

TDWI Accelerate Boston

The leading conference focused on Big Data, Data Science, Analytics, Machine Learning, and AI training and education.

ACCELERATE brings together the brightest minds in data to share their expertise and insight on the future of data science and analytics.

KEYNOTE: Predictability and other Predicaments in Machine Learning Applications

In the context of building predictive models, predictability is usually considered a blessing. After all, that is the goal: to build the model with the highest predictive performance. The rise of ‘big data’ has in fact vastly improved our ability to predict human behavior, thanks to the introduction of much more informative features. In practice, however, the picture is more nuanced. In many applications, the same outcome can be observed for very different reasons. In such mixed scenarios, the model will automatically gravitate toward the scenario that is easiest to predict, at the expense of the others. This holds even if the predictable scenario is far less common or less relevant. We present a number of applications where this happens: clicks on ads performed ‘intentionally’ vs. ‘accidentally’, consumers visiting store locations vs. their phones pretending to be there, and finally customers filling out online forms vs. bots defrauding the advertising industry. In conclusion, the combination of different and highly informative features can have a significantly negative impact on the usefulness of predictive modeling.

Claudia Perlich
Chief Scientist