In the article, he tracks the term’s ascendancy, like many tech phrases before it, from the world of IT into the wider popular vernacular.
“The stack” as a technology organizing principle and indicator of an end-to-end capability has particular resonance for us now, because it frames what predictive data science can do while bringing into focus the structural gaps the enterprise must overcome to benefit from it.
So, what is “the stack” and, particularly, what is it in the context of predictive data science?
I think of the stack as the distinct but interdependent layers of software that need to work together as part of a system to produce a desired result. And, in the world of predictive data science, the stack is designed to put enterprise clients in the driver’s seat – masking the complex code and interactions that make it work – so they can use prediction to accelerate better business outcomes.
Here’s a truth about the stack: in modern software development, we are never building systems or platforms from scratch. Instead, they are built from pre-existing components or packages. Put another way, the stack is those components and packages – along with any custom layers developers add – that we build on top of.
This act of appropriation not only accelerates our ability to focus on the layers of the stack that drive business value, it is turning the lion of Big Data into a lamb – despite what you might be hearing to the contrary. Thanks to the Open Source community, it’s easier than ever to pull in an Open Source component to manage data volume – it is no longer something we, or the enterprise, have to solve with custom engineering.
So, unless you are Facebook or Google, you can start worrying less about the “Big” in Big Data and focusing more on the “Data” – that’s where the real value is. To do so, companies need to get “data smart,” building out a data science capability, bringing in third-party data to better understand customers, and embracing discovery-driven planning, an iterative, learn-by-doing approach to predictive data science that puts wins on the board quickly.
Understanding predictive data science in terms of the stack helps expedite this shift.
You might have heard of stacks like LAMP (Linux/Apache/MySQL/PHP), MEAN (MongoDB/Express.js/AngularJS/Node.js), SMACK (Spark/Mesos/Akka/Cassandra/Kafka), and variants of the Hadoop stack. There isn’t one platform to rule the world, and there isn’t one stack either – but certain stacks become popular for a reason. Predictive data science imposes particular requirements, which led TROVE to favor a modified version of the SMACK stack over others.
Following is my personal inventory of data-science stack considerations:
The million-dollar question, then, is this: once you have a technology stack that can store unique data in appropriate and efficient ways, and frameworks that allow analytics to run at scale, what business problems is that stack actually solving? Again, that’s the part of being data smart that is missing from most of the big platform players.
So how can this be? How has the enterprise bought the equivalent of millions of Big Macs (the original burger stack?) without getting any of the special sauce?
The short answer comes down to semantics. Software platform vendors have played up “Big Data” and the need to “manage it.” And they aren’t wrong, exactly, but managing data is just a step, albeit an important one, towards doing predictive data science. Too many stacks stop short of solving business problems, and that’s like building a rocket without a way to launch it.
The more businesses we talk to, the more we find a universal need for Solvers. We recently worked with a utility client that had made a seven-figure investment in data “infrastructure” software, including an “analytics server” – an impressive technology stack big on “managing data,” but small on business impact. They had essentially bought an expensive data science rocket that was stuck on the launch pad.
We quickly realized they had a lot of valuable data being managed in their stack, but not being put to use. They needed some sauce! So, we layered our platform into their stack and deployed a predictive Solver to flag suspect accounts that were costing them millions of dollars. The approach worked. Without the Solver layer, the dollar-saving potential of the data – i.e. making it useful – had lain dormant, a condition we are finding in abundance across the enterprise.
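To make the idea concrete: a “Solver” in this sense is a model layered on top of already-managed data that turns it into an actionable flag. The sketch below is purely illustrative – the account names, usage figures, and simple z-score rule are invented here and are not TROVE’s actual method – but it captures the shape of the layer: data in, suspect accounts out.

```python
import statistics

def flag_suspect_accounts(usage_by_account, threshold=3.0):
    """Flag accounts whose usage deviates sharply from the population.

    A simple z-score rule stands in for a real predictive model:
    any account more than `threshold` standard deviations from the
    mean is flagged for review.
    """
    values = list(usage_by_account.values())
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    return [
        account
        for account, usage in usage_by_account.items()
        if stdev > 0 and abs(usage - mean) / stdev > threshold
    ]

# Hypothetical monthly usage figures; one account is a clear outlier.
usage = {"A-100": 410, "A-101": 395, "A-102": 402, "A-103": 2990, "A-104": 388}
print(flag_suspect_accounts(usage, threshold=1.5))  # → ['A-103']
```

The point is not the statistics – any anomaly model could sit in that function – but where it sits: on top of the data-management layers, converting stored data into a decision.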
This dormancy is often a product of endless tinkering with the data management stack. It’s o.k. to try to make your stack better; it may even be o.k. to try to perfect it; but it’s not o.k. to do so at the expense of using it. In the lexicon of the enterprise, predictive data science should be an active verb – it gets results, and it gets better the more it is employed. So put it to work!
Put another way, if your company is spending millions of dollars on what is essentially a data management stack, and you’re not feeding it with the right data and not applying the right models for the right business cases on top of it, you are never going to monetize your investment.
That takes Solvers™. Add them to your stack today and prepare for liftoff.