The IoT Journey — Visualizing IoT Streaming Data with Power BI

In my previous posts in this series (see posts one, two, three, four, five and six) I discussed the construction of a system to collect sensor data from a device and send that data to the Microsoft Azure IoT Suite. The final step in this journey is to build an application that uses the data we're sending to the cloud. There are many approaches to building the visualization layer (for a complete discussion of this topic, see the IoT Suite Remote Monitoring solution here: https://www.microsoft.com/en-us/server-cloud/remotemonitoring/Index.html), but I wanted to incorporate Microsoft Power BI to demonstrate a Software as a Service (SaaS) approach to visualizing the output of the sensor platform.

Obviously this is overkill for temperature and humidity data from a single sensor, but imagine having a worldwide network of these sensors reporting data. The beauty of the cloud and the SaaS visualization platform is that virtually unlimited capacity is available to you, with very little up-front work to build the solution.

The first step in this process is to obtain and provision a Microsoft Power BI subscription if you don’t already have one. Power BI is available in a free tier that will work for this example, so you do not need to purchase a subscription in order to build the solution.

Step One – Obtain a Power BI Subscription

Sign up for Microsoft Power BI at www.powerbi.com and select the Get Started now button. Follow the instructions on the screen to sign up. Note that you must use a corporate email address (not an @gmail, @outlook or @hotmail address). If you want, you can sign up for a 30-day trial of Office 365, or for a $5 per month plan, and then use that address as your Power BI login. The good news is that after the 30-day trial, Power BI will continue to function normally. Once you've signed up for Power BI and logged in, you'll see the following screen:

image

Once Power BI is successfully provisioned, the next step is to configure the Azure IoT Hub to send data to Power BI so that we can add it to a dataset.

Step Two – Send IoT Data to Power BI

One of the services available in the Azure IoT Suite is Azure Stream Analytics (ASA). ASA is a fully managed cloud service that enables real-time processing of streaming data. It is a very powerful service that, when coupled with Azure Event Hubs, can scale to millions of devices.

For the purpose of this post, we will use ASA to receive data from the IoT Hub that we created in the previous post, and then output the data to a dataset that Power BI reads to build a report and dashboard representing the data sent by our sensors. As mentioned earlier, this is overkill for a single sensor, but it will give you an idea of how simple building this solution is, and of course it can easily scale to a virtually unlimited number of sensors.

As you recall, in the last post we created an IoT Hub (in my case it was named FedIoTHubDemo) and configured it to receive data from our Raspberry Pi device. Now we will use that hub to send data to ASA so that the data can be viewed in Power BI. Open the Azure Management Portal and navigate to the IoT Hub that you created in the previous post.

image

Make a note of the name and the region where the resources are located.

To connect ASA to the IoT Hub, we will perform the following steps:

  • Create a new Stream Analytics Job
  • Configure an Input to ASA
  • Write a Query using the Azure Stream Analytics Query Language
  • Configure an Output to Power BI (as of this writing, this is not supported in the new management portal, so we will use the legacy portal for this step)
  • Start the Job

In the Azure Management Portal, select New in the upper left, select Everything in the Marketplace pane, and then type Stream Analytics Job in the search box:

image

Then select Stream Analytics Job from the search results:

image


Then select Create to open the ASA blade:

image

Give the job a name, select the resource group that you used for the Hub and then select a location for the resources. Then select Create to create the Job. This will start the deployment process and you will receive a message in the portal when the deployment is completed:

image

Once the deployment is complete, click on the job to open the dashboard and settings blade:

image

Note that there are no inputs or outputs configured. For our example, we will configure an Input that uses the IoT Hub that we previously created.

Click on the cloud icon in the essentials section of the pane to open the Quick Start Wizard:

image

Then click Add inputs. Give the input a name, and then select IoT Hub as the source. Make sure the drop-downs are filled with the appropriate names from the IoT Hub that you created earlier. Use the Shared Access Key for the iothubowner policy that you copied to the text file in the last post (or copy it again from the IoT Hub that you created).

image

Once all of the fields are filled out, click Create to create the Input (don't worry about the Consumer Group field; it is named $Default unless you chose a different name previously). The system will create the input and then test it to ensure that it is properly configured.

Once the input is created, click on Step 2 of the quick start wizard, Develop a Query. This will open the query pane with a sample query:

image

The ASA Query Language is very similar to T-SQL, with some extensions specifically for streaming data. In our scenario, the message sent from the Raspberry Pi to the Azure IoT Hub is very basic, consisting of 7 fields (deviceID, temperatureF, temperatureC, humidity, latitude, longitude and timestamp).
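Assuming the device sends its readings as JSON (the format ASA consumes most naturally), a representative payload would look something like this; the values and device name are illustrative, not actual readings:

    {
        "deviceID": "raspberrypi-dht11",
        "temperatureF": 72.5,
        "temperatureC": 22.5,
        "humidity": 45.2,
        "latitude": 38.8977,
        "longitude": -77.0365,
        "timestamp": "2015-11-10T14:32:17Z"
    }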

This message is sent to Azure approximately every 3 seconds. We want a query that collects the data in the stream, groups it appropriately, and assigns a time window to each collection so that we know what period the groupings refer to. The most appropriate grouping fields are deviceID, latitude and longitude, and the time window is driven by the timestamp value. In theory a message arrives every 3 seconds, but there is no guarantee that the Pi will send data on that exact schedule, so we will create a Tumbling Window representing a 5-second interval. (In production we would likely widen the window, since there is no driving need to see the temperature every 5 seconds.) The resulting query will look like this:

image
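As a sketch, a query along these lines produces that grouping; the input alias (SensorInput) and output alias (PowerBIOutput) are placeholders for the names you chose when creating the input and, in the next step, the output:

    -- Average the readings per device and location over 5-second
    -- tumbling windows. SensorInput and PowerBIOutput are placeholder
    -- aliases; substitute your own input and output names.
    SELECT
        deviceID,
        latitude,
        longitude,
        AVG(temperatureF) AS temperaturef,
        AVG(temperatureC) AS temperaturec,
        AVG(humidity) AS humidity,
        System.Timestamp AS [timestamp]
    INTO
        [PowerBIOutput]
    FROM
        [SensorInput] TIMESTAMP BY [timestamp]
    GROUP BY
        deviceID,
        latitude,
        longitude,
        TumblingWindow(second, 5)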

Click Save to save the query. Make sure that the name you reference in the FROM clause matches the name you gave the Input earlier. Typically you would test the query at this time, but testing is not currently supported in the Azure Management Portal. (It should be available soon, and I'll update this post when it is.)

image

Once the query is saved, you will need to launch the legacy Azure Management Portal (http://manage.windowsazure.com), as the current portal does not support creating a Power BI output sink for ASA.

image

Launch the legacy portal and navigate to the Stream Analytics Job that you created.

image

Select the OUTPUTS tab, and add a new output. Select Power BI and then click the next arrow:

image

Then click the Authorize Now button and sign in using the credentials that you used to provision your Power BI environment. This will bring you to the configuration dialog for the output:

image

Note that the values you use for the DataSet and Table will appear in Power BI, so choose names that are meaningful. Click the OK checkmark to create and test the output.

image

Now that the output has been configured, you can exit the legacy portal and return to the new portal where you will see the output you just created listed in the outputs section of the job:

image

Next, in the ASA Job blade, click the Start button, select Now, and click the Start command. This starts the ASA job, which will take several minutes to complete.

Once the job is started, you will see a message in the portal:

image

Now that the job has been started, we will need to start the sensor application on the Raspberry Pi and start sending data to Azure.

Step Three – Send the Data to Azure

This is the easiest step, since we've already built and tested the application. Follow the steps from the previous post to use an SSH client to connect to your Raspberry Pi and launch the DHT11Test application so that it begins sending data to Azure:

image
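If you need a refresher, the session looks something like the following; the host name, application path, and use of sudo are assumptions that depend on how you built the application in the earlier posts:

    ssh pi@raspberrypi.local    # or the IP address of your Pi
    cd ~/DHT11Test              # wherever you built the application
    sudo ./DHT11Test            # GPIO access typically requires root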

Let the application run for a few minutes before proceeding to the next step.

Step Four – Build the Power BI Dashboard

After the application has been running for a while, launch Power BI and expand the workspace on the left side of the screen (note the arrow icon at the bottom left):

image

You will see the DataSet name that you provided above in the Datasets section. Click the dataset to open the designer. Note the table name that you specified earlier, and the fields from the Raspberry Pi application:

image

For the purposes of this post, we will build a simple report and dashboard to visualize our environmental data. Click on the deviceid, temperaturef, temperaturec and timestamp fields. You will see a table displaying each of these values:

image

Note that the numeric values are summarized at the bottom of the table. Since we do not want this, choose each of the numeric fields in the Values section and select Don't Summarize in the drop-down list. Your table should look like this:

image

Since we want to build a nice dashboard instead of a table of values, let's switch to a gauge to display each of the values. In the Visualizations pane, select the Gauge icon. Drag the temperaturec field to the Values box, and in the drop-down, select Average. Leave the other boxes blank. Resize the gauge to make it look nice. Your canvas should now look like this:

image

Now click on a blank area in the report canvas, and then select the humidity field. Click on the Gauge icon in visualizations, and repeat the above step to visualize the average humidity. Repeat for the temperaturef field as well.

image

Now click on a blank area under the gauges, and select the timestamp field. In the Visualizations pane, select a Line Chart, then drag the temperaturef field to the Values box and the timestamp field to the Axis box. Resize the graph and then click on an open area in the canvas. Repeat these steps choosing the humidity field, and then resize the graph. It should look like this:

image

Now, click a blank area under the line graphs and select the Map visualization. Drag the latitude field to the Latitude box, the longitude field to the Longitude box, and the temperaturef field to the Size box. Resize the graph to fit the canvas. It should look like this:

image

You have now built a report using data from the sensors on the Raspberry Pi, streamed to Power BI via Azure Stream Analytics!

On the report canvas, choose File and then Save and then give the report a name. Once the report is saved, click the Pin icon on each graph to pin to the dashboard (the first pin will require you to name the dashboard). Once everything is pinned, select the dashboard in the left side and resize the elements as desired.

Now you have a dashboard connected to the report. Notice that when you click on a dashboard element, it drills down to the report page that contains the visualization.

Spend some time experimenting with various visualizations, and explore the power of Power BI.

Conclusion

In this post we completed the journey by connecting the data collected from sensors on our Raspberry Pi to Power BI by using the Azure IoT Suite.

This showcases how powerful the Azure IoT Suite can be, while solutions built on it remain relatively easy to develop.

If you purchased the sensor kit from Sunfounder that I mentioned in earlier posts, you’ll find that you have several additional sensors to play with that can be added to your Raspberry Pi environment and connected to Power BI.

Using Power BI Desktop to Create Data Visualizations and Explore Data

In previous posts, I discussed how to create an Azure SQL Database using data from the City of Chicago Data Portal. Part of that process showed how Microsoft Excel can be used in conjunction with Power Pivot and Power View to create data visualizations. Using a tool such as Excel to build visualizations is very practical, because many people are comfortable working with Excel and sharing a visualization is simply a matter of sharing the workbook. There are, however, situations where a richer design canvas is required, or where you want to publish your visualizations to a wide audience that may not have Excel installed.

Microsoft has recently released Power BI Desktop, a tool that allows you to create full-featured data visualizations on your desktop and then publish them to the Power BI service (which has a free tier as well as a professional, paid tier), allowing you to share your visualizations with anyone who has a web browser and Internet connectivity.

Downloading and Installing Power BI Desktop

Before attempting to install the Power BI Desktop application, read through the system requirements to make sure that your system will support it. To download, click the download button on the Power BI Desktop page: https://powerbi.microsoft.com/desktop. Save the file to a local folder and then double-click it to install. Follow the instructions on the screen and install the Power BI Desktop application along with any prerequisites that the installer determines it needs. Once the install completes, start Power BI Desktop.

Once the application starts, you are ready to build your first dashboard. For the purposes of this post, we'll recreate the visualizations that we previously built from the same data source using Excel Power View. (See Part 3 in my previous posts.)

Connecting to Azure SQL Database and Loading Data

Click on the Get Data button and select the Microsoft Azure SQL Database option.

Click the Connect button and then enter your Azure SQL Database Server and Database information and then click OK.

When prompted, select Database Authentication and enter your connection credentials.

If you receive an error related to the firewall rules, follow the instructions from Managing the Database in Part 1 and then click the Refresh Preview button to refresh the data connection. If you do not receive an error, skip to the Navigator section below.

Once the refresh is complete you will see a list of objects in the database (your list may look slightly different from mine, as my database contains more objects than I've documented in previous posts).

On the Close & Load button, select Close to return to the blank canvas. When the canvas loads, select the Recent Sources button and select your database connection. This will open the Navigator.

Select the ChicagoCrimes, Community, DateDimension, PoliceStations and SocioEconomicFactors tables and then click the Load button.

The data load will start (during this process, Power BI Desktop loads the data into a local Data Model that uses the same xVelocity engine as Power Pivot) and will take some time to complete.

Once the data is loaded, we will verify that all of the appropriate relationships are in place, and we will use the DAX language to create a very simple calculated measure that makes appropriate visualizations easy to build. You will also want to click the Save command to save the data model at this time.

Managing Data Relationships

When the data load completes, you'll be returned to the Power BI Desktop canvas. You will want to verify that the relationships between the fact table and dimension tables were properly detected. (At the time I'm writing this article, there is an issue with Power BI Desktop where it does not detect the relationship between the DateDimension table and the ChicagoCrimes fact table.) Click the Relationships icon on the left side (the bottom button) to open the data diagram view.

If any relationships are missing (in my case, the relationship between the ChicagoCrimes table and the DateDimension table was not detected), click the Manage Relationships button on the ribbon to open the Manage Relationships dialog.

Click New to create a new relationship. Select the ChicagoCrimes table from the first drop-down and highlight the DateOnly column. In the next drop-down select the DateDimension table and verify that the Date column is selected. Then select OK.

Select Close and verify that the relationship has been created. When all of the relationships are created, the fact table will be connected to each of the dimension tables in the diagram view.

Once the relationships are properly created, we’ll want to add a calculated measure to simplify the creation of visualizations using the model.

Creating a New Calculated Measure

For the purposes of this post, we will create a very simple calculated measure that provides a count of the number of crime reports in the data. To create it, we'll use the DAX language (typically thought of as a Power Pivot language, but since Power BI Desktop uses the same engine, it applies here as well).

Select the Data icon on the left side (the middle icon) to open the data view, and ensure that the ChicagoCrimes table is selected in the Fields area on the right.

Click the New Measure button in the ribbon to enable the formula bar. Enter the DAX expression TotalReports = COUNT([ID]) and then click the checkmark icon to ensure that the expression is correct.
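The same measure in fully qualified form is shown below; if the formula bar complains about the bare [ID] reference, qualify the column with its table name (ChicagoCrimes here, matching the table selected above):

    -- Each row in ChicagoCrimes is a single crime report, so counting
    -- the non-blank ID values yields the total number of reports.
    TotalReports = COUNT(ChicagoCrimes[ID])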


Once the new measure is created, we will want to change the data type of the Year column in the DateDimension table so that appropriate reports can be easily created. Because Power BI Desktop is designed to minimize errors and simplify the creation of reports, there are sometimes manipulations that need to be made to the data in order for Power BI to properly render visualizations of that data. In the case of the DateDimension table, we want to use the Year column as the axis of a timeline. Because the Year column is currently defined as an Integer type, Power BI Desktop will treat the column as a measure, which makes it difficult to use on a chart axis. The simple workaround is to change the data type of the column to Text.

In the fields section, select the DateDimension table and then select the Year column. From the ribbon, change the Data Type to Text.

Once these changes are made, you are ready to create a report based on the model.

Creating a Report

For the purposes of this post, we’ll recreate the visualization that was created in Excel Power View in Part 3 of the earlier blog series. Click the Report icon on the left side (the top icon) to open the report design canvas.

From the Fields area, select the TotalReports measure in the ChicagoCrimes table and then expand the DateDimension table and select the Year column. A column chart will be created on the canvas.

In the Visualizations area, select the Line Chart icon to change the visualization to a line chart. On the canvas, click a blank area and then in the Fields section, select the TotalReports measure as well as the Location Description field in the ChicagoCrimes table. On the canvas, drag the resulting visualization to the right of the line chart and resize it appropriately.

With the new chart selected, in the Visualizations area select the Bar Chart icon to change the visualization type to a bar chart. On the Bar chart, click the ellipsis (3 dots in the upper right) and sort the chart by the TotalReports field.

Click in a blank area on the canvas and then in the Fields area, select TotalReports from the ChicagoCrimes table and then expand the Community table and select the Communityad field. In the Visualizations area select the Map icon to change the visualization to a map, and then from the Fields area drag the Communityad field to the Location area under Visualizations. Resize the visualization appropriately.


After completing all of the steps above you will have a report that details all types of crime reports in Chicago from 2001 until the last date you loaded into your database. To filter the data to a specific type of crime, drag the Primary Type field from the ChicagoCrimes table to the Page Level Filters area under Visualizations. To replicate the report that was created in Excel, select the Homicide type in the filter.

Conclusion

In this blog post, we created a report using Power BI Desktop that connected to a Microsoft Azure SQL Database and detailed Homicides in the City of Chicago from 2001 until the last day of data that you loaded into your database.

Power BI Desktop is a powerful data analysis and visualization tool that allows you to gain insight from your data very quickly, and also share that insight by posting reports to the Power BI Service (future posts will discuss posting reports to the Power BI service).