The Long Road from ‘Big Data’ to Just ‘Data’

[originally published in The Transformed Datacenter, a UBM/DeusM publication]

A few weeks ago, editor-in-chief Marilyn Cohodas shared with us the results of a poll showing that while three-fourths of respondents plan to learn something this year about the topic of big data, only 4 percent are planning to actually deploy it.  The takeaway from that report:  Businesses are displaying healthy skepticism about the need for this new technology.

Now, let’s be honest:  Usually, you start to see these sorts of adoption polls during the second phase of a new technology’s marketing campaign.  The first phase is where you’re told about the massive wave of change sweeping the enterprise, and how companies that have listened closely to their customers were inspired to create or build this new technology after gauging consumer demand that’s rising into the stratosphere.  The second phase comes when said demand fails to rise into said stratosphere, and companies begin waging a campaign of embarrassing customers into adopting a new technology before their competitors do.

In other words, you don’t understand what big data is, and whose fault is that?

In the absence of hard and factual definitions, “big data” becomes an ambiguous, vacuous marketing phrase left to blow in the wind, like “cloud computing” or “central intelligence.”  Inevitably, the concept gets filled from the outside, and folks become paranoid about the results.  An entire industry has consolidated around making you afraid of things you don’t understand, and the reason is that it works.

Thus begins the dreaded Third Phase of a new technology’s marketing campaign: an attempt to capitalize on the fear surrounding the undefined technology’s repercussions, while simultaneously making a daredevil U-turn toward positivity.  (Its counterpart in the political realm is best characterized by the memorable phrase, “I’m not a witch.  I’m you.”)  You can always spot the onset of this phase when marketers begin their messages with the word “Imagine.”

Play the John Lennon music in your mind for a moment, while sampling this piece, posted in response to a Computerworld article noting the shortage of big-data expertise in the enterprise:  “Imagine what a symphony and big data have in common... Just like a symphony, effective data management must take into account enterprise business management skills along with the sophisticated technical skills required for data manipulation.”  (And you thought Mozart wasn’t certified.)

Here’s my favorite, from an IBM-sponsored report on big data to a public-sector technology conference:  “Imagine a world with an expanding population but a reduced strain on services and infrastructure; dramatically improved healthcare outcomes with greater efficiency and less investment... a world with more cars, but less congestion; more insurance claims but less fraud; fewer natural resources, but more abundant and less expensive energy.”

A world composed entirely of Fox News’ target audience.

So it should come as no surprise that the backlash against this literally imaginary characterization of the concept looks like it came straight out of MSNBC.  Consider this thought experiment:  “Imagine you had a massive computer database that contained all possible measurements that could ever be made over the entire span of all space and time.”  The author posits this by way of arguing that, by valuing data itself over the model or theory behind it, society devolves to a point where nothing is believed before all the data pertaining to it exists.  Thus, nothing is real.  (And nothing to get hung about.)

The title of another example frames the backlash argument in a nutshell:  “Big Data is our generation’s civil rights issue, and we don’t know it.”  The author here writes, “With the new, data-is-abundant model, we collect first and ask questions later... And this is a dangerous thing.”

Businesses remain skeptical about actually deploying big-data technology because the more they read about it, the less they know about it.  This is not their fault.  Depending on which account they read, the outcome of implementing big data is either the end of all socio-economic hardship or the backslide of civilization toward the Stone Age.

When all we’re talking about is this:  A special operating system developed for search engines makes it possible to store data in large quantities, across multiple volumes, without it having to become structured or indexed first.  A process scheduler (a surprisingly unsophisticated one) enables mathematical processes to be performed on this unstructured data, including analytics, rendering the data useful in some capacity prior to, or in place of, becoming indexed.

That’s “big data.”  Or, as Hortonworks’ director of product marketing Jim Walker explained to me for an earlier article in Enterprise Conversation:

I think big-data is a market group of these newer technologies that are taking advantage of processing in massive, parallel systems: being able to store massive amounts of data, to serve up data very quickly, to cluster machines [together]... What we’ve done is, we’ve started to mix processing with storage.  The market is moving so fast that terms like “big data” are going to become just “data” at some point.  Many organizations are already thinking like that.  It’s becoming a critical piece of the overall enterprise data architecture.  “Big data” is just another conceptual way of thinking about how you’re storing and processing the data.

That said, there are still a lot of organizations that are exploring the edges of what’s possible with these new technologies.  The vast majority of organizations are just getting their arms around it, just starting down the big-data journey with Hadoop and NoSQL.  So [because] the market’s moving very fast, there’s this odd time for things to catch up — this odd kind of race between hype and reality.  Quite honestly, I think the hype overtook the reality a little bit.  [Meanwhile], we’re moving beyond the hype of big data into this world of just data.  And quite honestly, everybody’s thinking like that anyway.