Firing on All Cylinders: The 2017 Big Data Landscape
Read the full article by Matt Turck on MattTurck.com
It feels good to be a data geek in 2017.
Last year, we asked “Is Big Data Still a Thing?”, observing that since Big Data is largely “plumbing”, it has been subject to enterprise adoption cycles that are much slower than the hype cycle. As a result, it took several years for Big Data to evolve from cool new technologies to core enterprise systems actually deployed in production.
In 2017, we’re now well into this deployment phase. The term “Big Data” continues to gradually fade away, but the Big Data space itself is booming. Everywhere we look, we’re seeing anecdotal evidence pointing to more mature products, more substantial adoption in Fortune 1000 companies, and rapid revenue growth for many startups.
Meanwhile, the froth has indisputably moved to the machine learning and artificial intelligence side of the ecosystem. In the last few months, AI experienced a “Big Bang” in collective consciousness not entirely dissimilar to the excitement around Big Data a few years ago, except with even more velocity.
2017 is also shaping up to be an exciting year from another perspective: long-awaited IPOs. The first few months of this year have seen a burst of activity for Big Data startups on that front, with warm reception from the public markets.
All in all, in 2017 the data ecosystem is firing on all cylinders. As we do every year, we’ll use the annual revision of our Big Data Landscape to do a long-form, “State of the Union” roundup of the key trends we’re seeing in the industry.
Let’s dig in.
High-level trends
Big Data + AI = The New Stack
As any VC privileged to see many pitches will attest, 2016 was the year when every startup became a “machine learning company”, “.ai” became the must-have domain name, and the “wait, but we do this with machine learning” slide became ubiquitous in fundraising decks.
Faced with an enormous avalanche of AI press, panels, newsletters and tweets, many people who had a long-standing interest in machine learning reacted the way you do when your local band suddenly becomes huge: on the one hand, pride; on the other hand, a distinct distaste for all the poseurs who show up late to the party, with ensuing predictions of impending gloom.
While it’s easy to poke gentle fun at the trend, the evolution is both undeniable and major: machine learning is quickly becoming a key building block for many applications.
We’re witnessing the emergence of a new stack, where Big Data technologies are used to handle core data engineering challenges, and machine learning is used to extract value from the data (in the form of analytical insights, or actions).
In other words: Big Data provides the pipes, and AI provides the smarts.
Of course, this symbiotic relationship has existed for years, but its implementation was only available to a privileged few.
The democratization of those technologies has now started in earnest. “Big Data + AI” is becoming the default stack upon which many modern applications (whether targeting consumers or enterprise) are being built. Both startups and some Fortune 1000 companies are leveraging this new stack (see, for example, JP Morgan’s “Contract Intelligence” application).
Often, but not always, the cloud is the third leg of the stool. This trend is precipitated by all the efforts of the cloud giants, who are now in an open war to provide access to a machine learning cloud (more on this below).
Does democratization of AI mean commoditization in the short term? The reality is that AI remains technically very hard. While many engineers are scrambling to build AI skills, deep domain experts are, as of now, still in very short supply around the world.
However, there is no reversing this democratization trend, and machine learning is going to evolve from competitive advantage to table stakes sooner or later.
This has consequences both for startups and large companies. For startups: unless you’re building AI software as your final product, it’s quickly going to become meaningless to present yourself as a “machine learning company”. For large organizations: if you’re not actively building a Big Data + AI strategy at this point (either homegrown or by partnering with vendors), you’re exposing yourself to obsolescence. People have been saying this for years about Big Data, but with AI now running on top of it, things are accelerating in earnest.
Enterprise Budgets: Follow the Money
In our conversations with both buyers and vendors of Big Data technologies over the last year, we’ve seen a strong increase in budgets allocated to upgrading core infrastructure and analytics in Fortune 1000 companies, with a key focus on Big Data technologies. Analyst firms seem to concur – IDC expects the Big Data and Analytics market to grow from $130 billion in 2016 to more than $203 billion in 2020.
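For a sense of what that forecast means year over year, here is a minimal back-of-the-envelope sketch. It assumes steady compounding between the two IDC endpoints quoted above, which is our simplification and not something IDC states; the function name `cagr` is ours. On those assumptions the implied growth works out to a bit under 12% a year, and since the 2020 figure is “more than” $203 billion, that should be read as a floor.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two endpoint values.

    Assumes smooth compounding between the endpoints (an illustrative
    simplification, not part of the IDC forecast itself).
    """
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# IDC endpoints quoted in the text, in billions of dollars.
START_2016 = 130.0
END_2020 = 203.0  # "more than" this, so the result is a lower bound

implied_growth = cagr(START_2016, END_2020, 2020 - 2016)
print(f"Implied annual growth: {implied_growth:.1%}")
```

The same function can be pointed at any other pair of market-size estimates to compare forecasts on a like-for-like annual basis.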
Many buyers in Fortune 1000 companies are increasingly sophisticated and discerning when it comes to Big Data technologies. They have done a lot of homework over the last few years, and are now in full deployment mode. This is now true across many industries, not just the more technology-oriented ones.
This acceleration is further propelled by the natural cycle of replacement of older technologies, which happens every few years in large enterprises. What was previously a headwind for Big Data technologies (hard to rip and replace existing infrastructure) is now gradually turning into a tailwind (“we need to replace aging technologies, what’s best in class out there?”).
Certainly, many large companies (“late majority”) are still early in their Big Data efforts, but things now seem to be evolving quickly.
Enterprise Data moving to the Cloud
As recently as a couple of years ago, suggestions that enterprise data could be moving to the public cloud were met with “over my dead body” reactions from large enterprise CIOs, except perhaps as a development environment or to host the odd non-critical, external-facing application.
The tone seems to have started to change, most noticeably in the last year or so. We’re hearing a lot more openness: a gradual acknowledgement that “our customer data is already in the cloud in Salesforce anyway” or that “we’ll never have the same type of cyber-security budget as AWS does”. This is somewhat ironic, considering that security was for many years the major strike against the cloud, but it is a testament to all the hard work that cloud vendors have put into security and compliance (HIPAA, for example).
Undoubtedly, we’re still far from a situation where most enterprise data goes to the public cloud, in part because of legacy systems and regulation.
However, the evolution is noticeable, and will keep accelerating. Cloud vendors will do anything to facilitate it, including sending a truck to get your data.
The 2017 Big Data Landscape
Without further ado, here’s our 2017 landscape.