LOADING

Type to search

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python

To Know more about the Different Corporate Training & Consulting Visit our website www.Instrovate.com Or Email : info@instrovate.com or WhatsApp / Call at +91 74289 52788

Data Analytics Python

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python

Share

Prerequisites To Follow this Exercise :

  1. Microsoft SQL Server Database Express Edition & Adventure Works DataWarehouse – If you Don’t have a Microsoft SQL Server express Database and want to install it in your system and also install AdventureWorks DW , Follow –https://instrovate.com/2019/05/22/download-install-free-microsoft-sql-server-install-adventureworks-database-data-warehouse/
  2. Python Installed in your System : If you are a new user to Python and want to know how to install Python via the Anaconda Distribution , You can go through the step by step Blog i have written to install Python via Anaconda Distribution & start using Jupyter Notebook : https://instrovate.com/2019/06/09/python-anaconda-distribution-how-to-download-and-install-it-and-run-the-first-python-program/

Once you have the Microsoft SQL Server Express Edition and Python Installed in your system you are Good to Go ahead and follow the below Use Case and Example.

 

Bar Chart in Python

In this blog we create a visualization in Python over AdventureWorksDW to showcase the Sales of different Products according to their category and sub category. The visualization will contain the bar chart to showcase the sales for different categories and subcategories of the products.

In the data model of AdventureWorksDW the fact table FactInternetSales has the transactions where in we find the sales amount incurred in each order transaction.

The Data Model to fetch the required data is as follows:

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 29

The join between FactInternetSales and DimProduct can be made using field ProductKey.

The join between DimProduct and DimProductSubCategory can be made using field ProductSubcategoryKey.

The join between DimProductSubCategory and DimProductCategory can be made using field ProductCategoryKey.

So, the first step is to write a sql query that can fetch the sum of SalesAmount based on ProductSubCategory and ProductCategory. Since ProductCategory is the parent for ProductSubCategory so we will take the sum of SalesAmount based on ProductSubCategory with a column giving the value of parent field ProductCategory. The sql query for the same is as follows:

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 30

Now we will write a python code to connect to AdwentureWorksDW database stored in Microsoft SQL server. To learn how to connect python to Microsoft SQL Server please refer to below blog:

https://instrovate.com/2019/04/12/how-to-connect-python-with-sql-server/

So, the python solution for the above problem would begin with making ODBC connection from python to Microsoft SQL server by using the library pyodbc. After connection is established the python code would execute the above query and fetch the results in a python data structure. The code piece for the above solution is as follows:

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 31

So, once we have query executed, now the next is to fetch the data. If we are running the same query from Microsoft SQL server the output will looks like as follows:

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 32

This is relevant when we process data in python as sequence of fields or columns in each row is important.

To make the visualization the programmer must be familiar with the basic functionality of matplotlib library in python for which the below forum can be referred:

https://instrovate.com/forums/topic/plotting-actual-vs-predicted-sales/

Now to plot the data we would need two lists for each plot i.e. one list having data for the X – Axis and one list having data for the Y – Axis. Also to compute the total sales per Category Wise (Parent Filed) we will initialize one dictionary by the name sales. Below is the code;

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 33

In the list sales_cat the sale for each sub-category will be stored.

In product_category the distinct value of each category will be stored.

In product_sub_category the distinct values of each sub category will be stored.

Now we will populate the above data structure with the relevant data that can be used to draw visualization.

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 34

In dictionary sales the key used is product_category and the value is the lit containing the SalesAmount for each subcategory corresponding to that parent category that is there in key part.

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 35

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 36

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 37

So, now we have sales for each sub_category in sales_cat list.

Now the next step is to compute the summation of sales across each parent category. The code to compute total sales for each product category is as follows:

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 38

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 39

So, now we have two types of information:

  • Product category with total sales in each product category
  • Product SubCategory and total sales in each product SubCategory

We will plot these two informations in two different bar charts.

The barh function of the module plt of matplotlib is used to plot bar chart in python. But before doing that we need to perform some kind of pre-processing to get the visualization as we require. The code for pre processing and creating visualization is as below:

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 40

The output visualization is as follows:

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 41

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 42

AdventureWorks Data Warehouse Series#2 : How to Create Bar Chart Visualization Using Python 43

 

If you would like to have additional examples on Data Visualization over AdventureWorks DW using Python you can visit the below link on our Company Website