AdventureWorks DW Series 3: How to Create a Pie Chart Visualization Using Python
Prerequisites to follow this exercise:
- Microsoft SQL Server Express Edition and the AdventureWorks data warehouse: if you don't have Microsoft SQL Server Express and want to install it on your system along with AdventureWorks DW, follow https://instrovate.com/2019/05/22/download-install-free-microsoft-sql-server-install-adventureworks-database-data-warehouse/
- Python installed on your system: if you are new to Python and want to know how to install it via the Anaconda Distribution, you can go through the step-by-step blog I have written on installing Python via Anaconda and getting started with Jupyter Notebook: https://instrovate.com/2019/06/09/python-anaconda-distribution-how-to-download-and-install-it-and-run-the-first-python-program/
Once you have Microsoft SQL Server Express Edition and Python installed on your system, you are good to go ahead and follow the use case and example below.
In the AdventureWorksDW data model, the fact table FactInternetSales holds the transactions, where we find the sales amount for each order transaction.
The data model to fetch the required data is as follows:
The join between FactInternetSales and DimSalesTerritory can be made using the field SalesTerritoryKey.
So, the first step is to write a SQL query that fetches the sum of SalesAmount by SalesTerritoryCountry. The SalesTerritoryCountry column is in the table DimSalesTerritory. So, we join the two tables on the condition mentioned above and use the SQL function SUM with a GROUP BY clause on the column SalesTerritoryCountry to get the total sales amount per country. Below is the SQL query for the same:
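As a minimal sketch of the query described above, the snippet below keeps the SQL in a Python string and tries it against a tiny in-memory SQLite database, so the join and GROUP BY logic can be checked without a SQL Server instance. The ORDER BY clause, the table aliases, and the toy rows are assumptions added for illustration; the amounts are not AdventureWorks figures.

```python
import sqlite3

# Sum of SalesAmount per country, joining the fact table to the territory
# dimension on SalesTerritoryKey. ORDER BY is an assumption added here so the
# row order is predictable.
SALES_BY_COUNTRY_QUERY = """
SELECT t.SalesTerritoryCountry, SUM(f.SalesAmount) AS TotalSales
FROM FactInternetSales AS f
INNER JOIN DimSalesTerritory AS t
    ON f.SalesTerritoryKey = t.SalesTerritoryKey
GROUP BY t.SalesTerritoryCountry
ORDER BY t.SalesTerritoryCountry
"""


def demo_rows():
    """Run the query against made-up rows in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DimSalesTerritory (
            SalesTerritoryKey INTEGER PRIMARY KEY,
            SalesTerritoryCountry TEXT
        );
        CREATE TABLE FactInternetSales (
            SalesTerritoryKey INTEGER,
            SalesAmount REAL
        );
        -- Toy data only: two territories map to the same country.
        INSERT INTO DimSalesTerritory VALUES (1, 'Canada'), (2, 'France'), (3, 'Canada');
        INSERT INTO FactInternetSales VALUES (1, 10.0), (3, 5.0), (2, 7.5), (2, 2.5);
        """
    )
    rows = conn.execute(SALES_BY_COUNTRY_QUERY).fetchall()
    conn.close()
    return rows


print(demo_rows())
```

The same query text should also run on SQL Server, since it only uses a standard join, SUM, and GROUP BY.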
Now we will write Python code to connect to the AdventureWorksDW database stored in Microsoft SQL Server. To learn how to connect Python to Microsoft SQL Server, please refer to the blog below:
So, the Python solution for the above problem begins with making an ODBC connection from Python to Microsoft SQL Server using the pyodbc library. After the connection is established, the Python code executes the above query and fetches the results into a Python data structure. The code for this is as follows:
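The connection step might look roughly like the sketch below. The server name `localhost\SQLEXPRESS`, the database name `AdventureWorksDW`, the driver name `ODBC Driver 17 for SQL Server`, and Windows authentication (`Trusted_Connection=yes`) are all assumptions; adjust them to match your own installation. The helper names `build_connection_string` and `fetch_rows` are hypothetical.

```python
def build_connection_string(
    server=r"localhost\SQLEXPRESS",          # assumed instance name
    database="AdventureWorksDW",             # assumed database name
    driver="ODBC Driver 17 for SQL Server",  # assumed installed ODBC driver
):
    """Assemble an ODBC connection string that uses Windows authentication."""
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=yes;"
    )


def fetch_rows(query, connection_string):
    """Open a connection, run the query, and return all rows."""
    import pyodbc  # imported here so the helper above works without pyodbc

    conn = pyodbc.connect(connection_string)
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        conn.close()  # always release the connection


print(build_connection_string())
```

With a running SQL Server instance, calling `fetch_rows(query, build_connection_string())` with the sales-by-country query text would return one row per country.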
So, once the query has been executed, the next step is to fetch the data. If we run the same query directly in Microsoft SQL Server, the output looks as follows:
This matters when we process the data in Python, because the order of the fields (columns) in each row determines which value is which.
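To illustrate why the column order matters, here is a small sketch that reads each row by position, assuming the query returns the country first and the total second. The rows are placeholders with made-up amounts, and `describe` is a hypothetical helper.

```python
# Placeholder rows in the same shape as the query result: (country, total).
# The amounts are made up for illustration; they are not AdventureWorks figures.
rows = [("Canada", 15.0), ("France", 10.0)]


def describe(row):
    """Format one result row, relying on the column order of the SELECT list."""
    country = row[0]      # first column: SalesTerritoryCountry
    total_sales = row[1]  # second column: summed SalesAmount
    return f"{country}: {total_sales:,.2f}"


summary = [describe(row) for row in rows]
print(summary)
```

If the SELECT list were written in the opposite order, `row[0]` would hold the amount instead of the country, so the indexes have to follow the query.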
To build the visualization, the programmer must be familiar with the basic functionality of the matplotlib library in Python, for which the forum below can be referred to:
Now, to plot the data we need three lists: one for storing the country names, one for storing the total sales amount in the respective country, and one for storing the name of the color that represents each country's sales in the pie chart. Below is the code:
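One way to build the three lists is sketched below. The helper name `build_pie_inputs` and the color palette are assumptions chosen for illustration (any matplotlib color names would do), and the sample rows carry made-up amounts.

```python
def build_pie_inputs(
    rows,
    palette=("gold", "yellowgreen", "lightcoral", "lightskyblue", "orange", "violet"),
):
    """Split (country, total) rows into the three lists a pie chart needs."""
    countries = [row[0] for row in rows]
    # float() is used because database drivers may return money columns as Decimal.
    sales = [float(row[1]) for row in rows]
    # Reuse the palette cyclically if there are more rows than colors.
    colors = [palette[i % len(palette)] for i in range(len(rows))]
    return countries, sales, colors


# Placeholder rows, not real query output.
countries, sales, colors = build_pie_inputs([("Canada", 15), ("France", 10)])
print(countries, sales, colors)
```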
There is another variable used here, explode. The length of this variable is 6, which equals the number of countries we have, as we can see in the SQL query output above. The first 5 values are zero and the last one is 0.1. This means that in the pie chart the sixth slice, which corresponds to the United States, will be exploded (pulled out from the center).
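Rather than typing the six values by hand, the offsets can be derived from the labels, as in this sketch. `build_explode` is a hypothetical helper, and the country list assumes the query output is sorted alphabetically with United States last.

```python
def build_explode(labels, highlight, offset=0.1):
    """Return one offset per label: `offset` for the highlighted label, else 0."""
    return tuple(offset if label == highlight else 0.0 for label in labels)


# Assumed label order (alphabetical, so United States is the sixth slice).
countries = ["Australia", "Canada", "France", "Germany", "United Kingdom", "United States"]
explode = build_explode(countries, "United States")
print(explode)
```

This keeps the exploded slice tied to the country name, so it still points at the right slice if the row order changes.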
Now we have all the required details for plotting. Below is the code to draw the pie chart from the required data:
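A minimal plotting sketch with matplotlib follows. The function name `draw_pie`, the chart title, the percentage labels (`autopct`), and the start angle are assumptions, and the two-slice input is placeholder data rather than the real query result.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; this line can be dropped in Jupyter
import matplotlib.pyplot as plt


def draw_pie(countries, sales, colors, explode):
    """Draw a pie chart with one slice per country and return the figure."""
    fig, ax = plt.subplots()
    wedges, texts, autotexts = ax.pie(
        sales,
        explode=explode,    # per-slice offset from the center
        labels=countries,
        colors=colors,
        autopct="%1.1f%%",  # show each slice's share as a percentage
        startangle=140,
    )
    ax.axis("equal")        # keep the pie circular
    ax.set_title("Internet sales by country")
    return fig, wedges


# Placeholder inputs for illustration only.
fig, wedges = draw_pie(["A", "B"], [1.0, 3.0], ["gold", "violet"], (0.0, 0.1))
fig.savefig("sales_by_country.png")
```

In a Jupyter Notebook, calling `plt.show()` instead of `savefig` displays the chart inline.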
The output of the above code is a pie chart visualization. The visualization we get after code execution is as follows:
AdventureWorks DW Series 4 : How to Create Histogram Visualization Using Python – https://instrovate.com/2019/05/20/adventureworks-dw-series-4-how-to-create-histogram-visualization-using-python/
AdventureWorks DW Series 5 : Box Plot to Identify Outliers and Targeted Customers in Python – https://instrovate.com/2019/05/28/adventureworks-dw-series-5-box-plot-to-identify-outliers-and-targeted-cutomers-in-python/