Friday, February 1, 2013

Big Data. Big deal?

Everywhere you turn, someone is writing about big data.  Forbes, the Wall Street Journal, Time, and many others have put the topic on their covers.  But what is big data and, more importantly, why should we care?

Let's start with the definition.  Big data is, well, big.  But the real definition lies not only in volume, but also in variety and velocity.  Variety means that the data comes from many sources.  Look at businesses, for example.  For years they have been reporting on and analyzing their internal financial data.  There is a lot of it, but it comes from just one place: their internal systems.  Now imagine combining that data with external data like Twitter feeds, weather information, population demographics, etc.  The insights you draw from the data may be different when you make it "big".

What about this velocity thing?  Are we about to have a physics discussion?  No, there won't be any discussion of the V = D/T equation here.  For data, velocity is about the rate at which it changes.  Data that changes often gives you more samples over time, and a larger sample size allows for better analysis.  Velocity plays a significant role in making the data "big".

So why should we care?  Is this about a bunch of computer geeks playing with 1s and 0s in a new way to create more job security?  No, not this time.  This big data stuff is real.  By using data that changes often, comes from disparate sources, and has enough samples to draw conclusions, you can derive insights that were not easy to reach without big data.

Just look at our last presidential election.  The Obama team spent millions of dollars to use big data to help win the election.  They created a team of data scientists, tucked them away in a windowless office in Chicago, and asked them to take information from various sources to help predict and manage donor and voter behaviors.  They measured and tracked everything, from how effective an email was depending on who it came from to the tendencies of online donors.  When campaigners went door-to-door, they knew in advance what type of information to leave behind.  They knew that emails from Michelle Obama were much more likely to raise money than emails from Joe Biden when sent to a specific demographic of prospective donors.  Not because Joe Biden is a putz, but because the numbers were right there in front of their eyes.  Television ad content, times, durations, etc. were set as a result of the data, not some gut feel.  They were using the data to drive their decisions.

These data efforts helped the Obama campaign raise over $1 billion and, ultimately, know days before the election which swing states they were going to win.  They knew the outcome of the election well before the first ballot was cast because they knew the voters.  The data told them what the voters were going to do.

The sources for data are wide-ranging.  The US government has a web site dedicated to data.  It may not always meet the velocity requirements (you may have heard that the government can be a little slow at times), but the variety and volume are there.  Access to information on Twitter, Facebook, and other social sites can provide great insights into trends and behaviors.  Digital marketplaces are popping up with people selling access to their data.

With the digitization of our world, the volume, velocity, and variety of data will only grow.  Those who succeed will be those who embrace the concept of big data and use it to drive value.  Simply put, big data can translate to big opportunities.
