So what exactly is Big Data?

In the real-world view, Big Data is the culmination of several years’ worth of data that your company has stored in its data warehouse, as instructed by its DBA, since, well, forever. This data, archived in different locations for safekeeping and possible later use, is extremely valuable to marketing, sales and other decision makers in your organization.

Wikipedia defines Big Data as “a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, curation, storage, search, sharing, transfer, analysis, and visualization.” You will see that definition used in a lot of places.

The success of any company is becoming more and more dependent on unlocking the value of data and turning it into trusted information for critical decision making. The ability to deliver the right information at the right time and in the right context is crucial. Today, organizations are bursting with data, yet most executives would agree they need to improve how they leverage information to prevent multiple versions of the truth, improve trust and control, and respond quickly to change.

If you are an IBM customer, it is very likely you have received some level of education about IBM’s Information Management solutions platform, which includes IBM’s Big Data strategy.

IBM released the Hadoop-based InfoSphere BigInsights in May 2013. Hadoop-based commercial distributions were already available from other vendors such as Cloudera, Hortonworks and MapR, so it was interesting to learn how IBM stacks up against them in the Big Data landscape. I learned more because I had the opportunity to get hands-on with the InfoSphere BigInsights Big Data ecosystem at an IBM boot camp the week of October 7.

Click here to see the October Issue of the iOLAP iNSIGHT Newsletter.

Every successful technology goes through several cycles of invention, discovery, socialization, adoption and continuous improvement. Hadoop is no exception. It has been embraced by early adopters and is now on the “discovery path” for other customers and vendors. Adoption is well supported by third-party vendors who have customized and extended their product offerings with their own Hadoop distributions and implementations to help customers adopt the new technology.

BI professionals are used to working with a wide range of products and platforms, and typically carry a substantial tool belt for working across many different technologies. Over the past couple of months I took the opportunity to experiment with technologies that are entering the data warehousing ecosystem. These included the Cloudera Sandbox, the Hortonworks Sandbox, the IBM BigInsights Sandbox, and Amazon Redshift.

The business world continues to evaluate and implement the cloud for some of its IT requirements. The concept of the cloud as a viable IT storage solution as well as a way to cut costs is gaining momentum. But it might prompt the question: is the cloud the right place for a data warehouse?

This is an interesting question for many, and a problematic question for some.

We’ve all read the articles where someone with a massive data set unleashes the power of Big Data and discovers an awesome and magical insight about their customers that changes the course of history as we know it. We like to call them Magical Big Data Unicorns—so frequently discussed and pursued, but so rarely seen in the wild! We’re not saying they don’t exist, but we would like to suggest a better approach to getting solid ROI from these new-fangled Big Data tools.