In Part 1 of this 3 part series on Power BI, I discussed the tenets of self-service analytics and how Power BI could be an enabler. In part 2 of this series, the focus will be on the specific capabilities of the Power BI platform. How to perform specific tasks with Power BI will not be addressed, only the capabilities and potential of the overall platform. Part 3 of the series will discuss and review deployment scenarios.
As a review, the major components of Power BI are: Power Query, Power Pivot, Power View, Power Maps (available as Excel plugins), Question & Answer (available via Office 365) and SharePoint as the delivery mechanism. Each of the components is designed to complement the other components of the Power BI platform. From a data management perspective, the components can be categorized as follows:
From a process perspective, the components of the Power BI platform build upon one another in the following manner:
The above process flow serves to illustrate how the components of the Power BI platform may be used to build and deploy a solution. Now the question is what are the capabilities of the components and how can they be leveraged.
Power Query – Data Discover & Transformation
The first questions that are asked when building an analytics solution are what is the goal/purpose of the solution, what data is required and where does the data exist? Power Query provides the capabilities to address these questions.
With Power Query, multiple internal or external data sources can be imported and subsequently transformed to fulfill the defined solution requirements. Query results are stored as tabs within an Excel Power Pivot workbook. The data sources may come from standard sources such as databases, files and web services. Additionally, data may come from external sources such as web pages, and registered data providers such as the Azure Data Marketplace or even Wikipedia.
This provides the capability for data enrichment that has not been readily available in the past. An example would be supplementing internal customer address data with web based public data on average household incomes. Furthermore, transformation rules can be generated that filter or augment the data prior to final analysis. The transformation rules can be saved for future use when the data is refreshed. The following figure illustrates the process.
For registered data providers, Power Query can facilitate data discovery using simple queries such as “Average home selling price in region ABC”. Using this capability Power Query presents samples of candidate data. Based on the samples, the most appropriate can be selected based on requirements. Power Query will then retrieve the associated data, making it available for transformation and analysis.
The following figure shows an example of an online query performed for the “mean income for California counties”. Note that Power Query presents multiple options. A preview of each data source is displayed as it is selected. Once an appropriate source has been chosen, it can be imported into Excel.
Figure 1: Power Query online search
Here is the data that was retrieved
Figure 2: Power Query results selection
Next the data is filtered by Los Angeles
As previously stated, data retrieved via Power Query may be stored as a tab in the associated Excel Power Pivot workbook where further data integration and consolidation may be performed.
Power Pivot – Data Integration and Consolidation
Once data has been identified, it needs to be integrated and consolidated for further analysis. Power Pivot can be used for this task. Power Pivot does not provide some of the data discovery capabilities of Power Query but, it can leverage Power Query data as well as integrate data from other sources. Power Pivot overcomes the data volume limitations of Excel by allowing for the analysis of millions of rows of data from multiple data sources. This is accomplished via a column oriented in-memory analytics engine and data compression.
Each Power Pivot source is materialized as a tab in an Excel Power Pivot workbook. Relationships can then be established among the data sources using common fields. This will require some level of planning and design to ensure that relationships do exist among the data. In addition to defining relationships, Power Pivot supports the Data Analysis Expression (DAX) Language which is used to define calculations and perform data manipulation and analysis. It is possible to define calculated columns and fields that are context aware, key performance indicators and hierarchies. The following figures illustrate how data sources and data relationships are materialized in Power Pivot.
Figure 3: Power Pivot Data Sources
Figure 4: Power Pivot Data Source Relationships
It is also possible to create perspectives (views) of Power Pivot data that may provide a limited subset of fields available in the Power Pivot model. This would be useful to design functional views of the data (Sale, Operations, Finance, etc.)
Power Pivot has an additional component that can be integrated into SharePoint. In a SharePoint environment, Power Pivot query processing and data refresh for published workbooks is enabled through Power Pivot server components that are available in SQL Server 2008 R2 and later. The Power Pivot for SharePoint feature provides services, a management dashboard, library templates, application pages, and reports for using and managing server software. Power Pivot server components are fully integrated in SharePoint. SharePoint administrators use Central Administration to configure and manage Power Pivot services in the farm. Power Pivot workbooks stored in SharePoint are stored in a Power Pivot gallery that provides previews of the contents of Power Pivot based Excel reports.
Power Pivot workbooks can serve as data sources for Excel pivot tables, SQL Server Reporting Services (SSRS), Power View/Power Map visualizations and other Power Pivot workbooks. Power Pivot workbooks also form the foundation for Power BI Question & Answer functionality.
Power View – Data Visualization
Power View facilitates data visualization. Visualizations are developed in Excel and leverage an underlying Power Pivot Model, Tabular Mode SQL Server Analysis Services (SSAS) instance or more recently, a Multidimensional Mode SSAS instance.
Power View allows for the creation of highly complex and interactive visualizations. The associated model (Power Pivot, SSAS Tabular or SSAS Multidimensional) provides the context that drives the visualization.
Several Types of visualization are available in Power View in the forms of:
Tables – Flat Table, Matrix (Pivot Table), Card (Data visualized as a note card)
Bar Chart – Stacked Bar, 100% Stacked Bar, Clustered Bar
Column Chart – Stacked Column, 100% Stacked Column, Clustered Column
Maps – Bing Maps
Other – Line, Scatter, Pie
Slicers – Objects which filter data
Tiles – Provide filtering capability and facilitates nesting of objects
Within Excel, each Power View visualization resides on a separate tab and may contain 1 or more of any of the aforementioned visualization objects. This provides flexibly in creating visualizations that are only limited by one’s imagination. Objects may also be nested. The following figure illustrates some of these capabilities:
Figure 5: Power View Visualization Objects
The above figure depicts a tile object that is filtering on “Bikes”. The Tile object contains a table, column chart and bar chart. On the right hand side of the image is displayed a context sensitive menu that contains formatting options for the selected visualization object.
Power View visualizations can be developed in Excel and uploaded to SharePoint for distribution. Furthermore Power View visualizations can be created directly in SharePoint. When created directly in SharePoint, the Power View visualizations may be exported to PowerPoint with their full interactivity.
Power Maps – Data Visualization
Power Maps extend the data visualization capabilities of Power View with a focus on Geo-spatial data. Data can be visualized as column, bubble, heat map or regional (used with country, region, postal codes) charts with multiple map layers, annotated with text or overlaid with two dimensional charts. Data maybe mapped using standard country, region/province/state, postal codes as well as longitude and latitude. Data to be mapped can be provided by Power Pivot or individual worksheets within an Excel workbook.
A Power Map visualization within an Excel workbook is called a tour and each tour contains individual sheets referred to as scenes. This provides the ability to map multiple areas or scenarios within a single Power Map visualization. Additionally the tour and its associated scenes may be played back and captured as video or screen shots for distribution.
One more dimension of a Power Map visualization is time. A time attribute may be associated with data in a Power Map visualization so that the changes in the data values may be displayed over time. Time can be specified as a day, month, quarter, or year. The associated values may be displayed as point in time, cumulative or static until replaced. Using this feature with the ability to record and playback Power Map tours provides significant visibly and insight into what the data visualizations represent. The following figure shows a Power Map tour and scenes.
Figure 6: Power Maps Visualization
The left side of the visualization displays the Power Map tour and its associated scenes. The tour name is “Product Line” and the scene names are “Product Line Sales” and “Product Line Margin” respectively.
The mapping mechanism for Power Maps is Provided by Bing Maps. This means it is possible to zoom in/out, tilt/rotate and display the visualization in a globe or projected (flat) view.
Self Service BI – Question & Answer
There are times when data is available but not appropriately formatted for presentation or there may be ad hoc questions that need to be asked. This is where Question & Answer (Q&A) becomes relevant. Normally one would just consume a Power Pivot workbook or Power View/Power Map visualization as-is. Q&A moves beyond that. Q&A can be used to query existing Power Pivot workbooks as well as data that is on premises.
Q&A is a component of Power BI that is hosted via Office 365. Data sources can be queried using natural language queries within the context of the data selected data. An example of such a query would be “Total sales for product line 123 and in the United States”. Q&A parses the query as it is entered and presents its version of what it considers is being asked. To best take advantage of Q&A, a vocabulary of synonyms should be established to tie business concepts to the underlying data.
If the same question is asked regularly, then that question may be saved as a featured question in the Q&A catalogue. Also, Power Pivot workbooks can be saved as favorites and presented as featured workbooks for analysis (see the following figure).
Figure 7: Q&A Featured Questions & Reports
Q&A will determine the best way to display query results based on what s being asked. Results may be presented as a single value, table, chart (bar, line) or a map. If required, an option exists to override the query results display format (see the following figure).
In the figure below, underneath the entered query is the Power BI interpretation of the query. On the right hand side of the visualization are the options to change the visualization display format.
What has been presented thus far is a high level overview of Power BI capabilities with the intent of providing ideas as to how it can be deployed as a component of a self-service analytics strategy. With planning and understanding of these capabilities, it is possible to develop and deploy highly interactive, functional and complex analytics solutions that can meet the needs of multiple business roles and scenarios.