Gearing up for the Microsoft Data Insights Summit, Microsoft hosted a Power BI Data Analytics Challenge. We participated by entering the “Just For Fun” category, submitting a quickly thrown-together dashboard with analytics about The Simpsons. The data was sourced from Kaggle – The Simpsons by the Data, and is driven by script lines. In addition, we supplemented the data with images and bios from http://www.simpsonsworld.com/. The Simpsons Data Analysis dashboard allows users to explore character profiles, season appearances, and dialog sentiment. The Character Report tab assesses who’s talking and where they are talking by word count. Images
Recent years have seen the emergence of self-service and data visualization vendors such as Tableau, Power BI and Qlik Sense, which has forced enterprise BI players such as MicroStrategy to improve their tools to compete with these vendors. I have been a solutions architect in the field of BI and have been using MicroStrategy for more than 10 years. From my perspective as an architect, I appreciate the power of the underlying SQL engine in MicroStrategy and do not foresee self-service vendors replacing MicroStrategy as an enterprise BI tool in the coming years. What thrills me most as a Solutions Architect in
In the spirit of exploring new visualizations, I wanted to learn how to create a Sankey diagram within MicroStrategy Desktop (also known as Visual Insight in web). A Sankey diagram is used to visually explain flow or many-to-many relationships with their respective proportions. The visualization needs a source attribute, a target attribute, and a measure to size the path or flow. In order to support the visualization, I grabbed data from http://www.starwars.com/databank. The databank also had images I could leverage as a twist on the visual format. Cross-referencing IMDB, I created a list of popular characters by Star Wars
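As a rough illustration of the three inputs a Sankey diagram needs (source, target, and a measure), here is a minimal Python sketch that aggregates raw rows into weighted links. It is not the MicroStrategy workflow from the post; the function name `build_sankey_links` and the sample rows are hypothetical and are not taken from the databank.

```python
from collections import defaultdict


def build_sankey_links(rows):
    """Sum the measure for each (source, target) pair.

    `rows` is an iterable of (source, target, value) tuples.
    Returns one link per distinct pair, sorted by source then target.
    """
    totals = defaultdict(float)
    for source, target, value in rows:
        totals[(source, target)] += value
    return [
        {"source": s, "target": t, "value": v}
        for (s, t), v in sorted(totals.items())
    ]


# Hypothetical sample rows for illustration only.
rows = [
    ("Character A", "Film 1", 3),
    ("Character A", "Film 2", 1),
    ("Character B", "Film 1", 2),
    ("Character A", "Film 1", 2),
]
links = build_sankey_links(rows)
for link in links:
    print(link["source"], "->", link["target"], link["value"])
```

Each resulting link corresponds to one path in the diagram, with the summed measure determining the width of the flow.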
We were recently asked by a customer to assist with getting their Cloudera environment spun up on Azure. While this has been accomplished several times, we had some unique challenges to solve due to security requirements. This post will cover the major prerequisites and the challenges we faced along the way.
We had Cloudera and Microsoft professional services work with us as we performed the installation for the client.
Star ratings have become a staple for assessing review sentiment online. We see them all the time when we shop and can quickly judge whether a review is worth reading based on the 1 star vs 5 stars attached to it. Given their popularity, I recently included star ratings by product in a dashboard designed for an executive audience. The visualization is intuitive and straight to the point. However, I started wondering: why stars and not other shapes? To test out using other shapes, I kept it simple and fun. Star ratings, or in this case custom shape ratings
I had a requirement to create a document with object prompts. The document was essentially acting as a report builder that could be customized based on subscription. There was an additional requirement to support alias name functionality. I found previous blog posts and TNs on how to create documents with object prompts, but I was running into challenges supporting the alias names. In addition, I noticed that when I selected a smart metric only, if its components were available in the object prompt, both the smart metric and component metric would show up on the document. The functionality differs for
Several years ago, one of our project teams was tasked with building out a full EDW in Amazon Redshift. The solution needed to source from multiple subject areas including sales, supply chain, operations, and customer demographics. We truly had a wide range of variety, velocity, and volume requirements. The sales source was configured to publish large sets of flat files to Amazon S3 throughout the day and large reconciliation batches overnight. Seeing that multiple different types of SCDs would need to be maintained, the lack of native S3 connectivity became a significant roadblock in the workflow. Our team set out
Introduction
Netezza is a data warehousing appliance that uses an Asymmetric Massively Parallel Processing architecture. The Netezza architecture is driven by two fundamental principles: process close to the data source, and do not move data unless absolutely necessary. Of the two principles, the latter is implemented primarily by commodity hardware Field Programmable Gate Arrays (FPGAs). FPGAs play a pivotal role in filtering out data as soon as possible, removing I/O bottlenecks and freeing up valuable downstream components such as memory and processor. Zone Maps are internal data structures of Netezza that enable FPGAs to filter out data. In simplistic terms, a
Business Intelligence as a field has matured rapidly. We are living in an era when we constantly hear buzzwords like Big Data, Prescriptive Analytics and Data Science. In the midst of these catchy phrases, the one phrase that still stands out for me is Self-Service BI. Self-Service BI is defined as an approach that enables business users to access and work with corporate data even if they do not have a background in Business Intelligence. This approach is primarily intended to reduce the dependency of business users on IT for creating their reports. In the
Introduction
Every five to six years, there comes a technology wave, and if you are able to catch it, it will take you a long way. Throughout my career, I’ve ridden several of these waves. MPP data warehouses brought us incredible speed for analytics and a few headaches for data integration. We’re seeing in-memory analytics reduce disk latency. Hadoop-based technologies are opening up new solutions every day for storage and compute workloads, while our source systems are still generating varying degrees of velocity, volume, and variety. As a traditional ETL developer, I would usually try to figure out the