
by Dstillery Contributor


Considering Privacy in Predictive Modeling Applications


Large-scale data applications are increasingly a part of daily life. For example, the GPS in your phone can tell you the fastest way to the airport by incorporating real-time traffic data, Netflix suggests movies based on your entire viewing history, and the spam filters in email software learn individual spam preferences. Some of these applications rely on methodologies that are more ‘data hungry’ than others. Predictive modeling based on fine-grained data, which powers many of these applications, often requires a great deal of data but relatively little understanding of any individual data point. In this work, we examine some simple modeling decisions that can make predictive modeling more privacy friendly without jeopardizing its performance and ultimate value.