Python Integration with Tableau And Running Python Scripts inside Tableau

Tableau Integration with Python

During my Corporate Tableau Training in Gurgaon , i get questions many time regarding Python Integration with Tableau . This course is intended to show how to leverage Python to extend Tableau capabilities and visualize outputs from Python.

This is not a Tutorial on Python.

What is Python

  • Python is a widely used general-purpose programming language, popular among academia and industry alike.

  • It provides a wide variety of statistical and machine learning techniques, and is highly extensible.

  • Together, Python and Tableau is the data science dream team to cover any organization’s data analysis needs.

  • In 2013 Tableau introduced the R Integration [since Tableau 8.1], the ability to call R scripts in calculated fields using R Server [ Rserve()].

  • With the release of Tableau 10.1, you can use Python scripts as part of your calculated fields in Tableau, just as you’ve been able to do with R

  •   The Python Integration happens through the   Tableau Python Server – [ TabPy ] .

  • Tableau works with both the version of Python i.e. 2.7 and 3.x. Python 3.x support has been recently added.

You can install Tableau Python Server by either of the below methods.

TabPy Installation Method 1: (using pip install)

If you are familiar with Python environments and have already set one up or prefer not to use Anaconda and just want to start the server process, you can skip the setup script, install the dependencies and run the process directly from the command line.

The manual installation instructions assume either Conda or Python are defined as part of PATH. It is optional but recommended you create a new Conda environment for this project:

conda create –name Tableau-Python-Server python=2.7 anaconda

The example above creates a Python 2.7 environment but Tableau Python Server is supported on both Python 2.7+ and Python 3.5+.

Now activate the newly created environment. On Windows:

activate Tableau-Python-Server

 

You can install TabPy by simply running the following command from cmd.

pip install tabpy-server

 

As the packages are installed, you will see the install locations listed in the command line. These might look like /Users/username/anaconda/envs/Tableau-Python-Server/lib/python2.7/site-packages or \Users\username\anaconda\envs\Tableau-Python-Server\lib\site-packages depending on your environment.

 

cd c:\Users\User Name\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\tabpy_server

And type → startup and press enter

Navigate to the tabpy_server folder under site-packages and run startup.bat or startup.sh on Windows and Linux/MacOS respectively. You can specify a custom port number as an argument e.g. startup.bat 9001.

 

You can stop the server simply by terminating the process (for example, with CTRL-C) or Closing the window.

 

If you are not able to install TabPy from pip install , you can do the installation as below

TabPy Installation Method 2: From Tableau GitHub Site

 

Extract TabPy-master.zip

  • Within the TabPy-master directory, execute setup.bat .

The script carries out the following steps:

  • Downloads and installs Anaconda, under the current user account for example C:\users\yourUserName\ unless Anaconda is in the PATH or has a registry entry. Anaconda provides an exhaustive set of Python packages, including ML-related libraries that can be used by Python code executed by the server.

  • Creates a Python environment named Tableau-Python-Server if one doesn’t already exist. The script then activates the environment.

  • Installs the required Python packages into the new environment, as well as the client package, as it contains common functionality that the server depends on.

  • Initializes the server, displays the install location and instructions on how to start the server next time.

  • This script downloads and installs Python, TabPy and all necessary dependencies.

  • After completion, TabPy starts up and listens on port 9004 or the custom port number that you specified while installing TabPy server

 

  • When you get the below message it means the python server is successfully installed and running fine.

  • Also Note the path for starting python server next time or else you will keep doing the same process again and again

 

  • So only the first time the python server takes time to install. From next time you need to go to your location as above and do startup and this gets started instantaneously.

  • So for next time to start python tableau server (tabpy) ,go to the below path

cd c:\Users\User Name\Anaconda\envs\Tableau-Python-Server\Lib\site-packages\tabpy_server

And type → startup  and press enter

Setting Up Tableau Desktop with Python

 

Configure  a  TabPy Connection on Tableau

On the Help menu in Tableau Desktop choose Settings and Performance > Manage External Service Connection to open the TabPy connection dialog box.

Enter or select a server name using a domain or an IP address.
The drop-down list includes localhost and the server you most recently connected to.
Specify a port. Port 9004 is the default port for TabPy servers.
Click Test Connection.
Click OK.

Pass Expressions to Python

  • In order to let tableau know that the calculations need to go to Python, it must be passed through one of the 4 functions.

  • These 4 functions are : SCRIPT_BOOL , SCRIPT_INT , SCRIPT_REAL , SCRIPT_STR

  • Python Functions are computed as Table calculations in Tableau.

  • Since these are table calculations, all the Fields being passed to Python must be aggregated like Sum(PROFIT), MIN(Profit), Max (Profit), ATTR( Category) etc.

 

Python Functions in Tableau

 

Run a Python script on Tableau

SCRIPT_BOOL

Returns a Boolean result from the specified expression. The expression is passed directly to a running external service instance. In Python expressions, use _argn (with a leading underscore ) to reference parameters ( _arg1, _arg2, etc.).

  • In this python example, _arg1 is equal to SUM([Profit])

  • All the Fields being passed to python must be aggregated like Sum(PROFIT), MIN(Profit), Max (Profit), ATTR( Category) etc..

 

SCRIPT_BOOL : Finding Profit Greater Zero by python

 

PythonBoolPositive : Python Calculated Field Function Code

SCRIPT_BOOL(“

lst= []

for i in _arg1 :

       lst.append(i>0)

return lst

“,

SUM([Profit])

)

Another Simpler Approach :

SCRIPT_BOOL(“return [i > 0 for i in _arg1]”,SUM([Sales]))

 

SCRIPT_INT – Example → Multiply Sales with 2 from Python

 

PythonIntegerMultiplyBy2 : Python Calculated Field Function Code

SCRIPT_INT(“

lst= []

for i in _arg1 :

       lst.append(i*2)

return lst

“,

SUM([Sales])

)

Another Simpler Approach

SCRIPT_REAL(“return [i * 2 for i in _arg1]”,SUM([Sales]))

SCRIPT_REAL – Example  – Finding log of sales by Python

 

PythonRealLog : Python Calculated Field Function Code

SCRIPT_REAL(“

import math

lst = []

for i in _arg1:

   if math.isnan(i):

       lst.append(0)

   else :

       lst.append(math.log(i))

return lst”,

SUM([Sales])

)

 

Another Simpler Approach

SCRIPT_REAL(“import math

return map(math.log,_arg1)”,SUM([Sales]))

 

SCRIPT_STR – Concatenate Two Strings using Python

PythonStringConcatenate : Python Calculated Field Function Code

SCRIPT_STR(“

lst= []

for i in range(0,len(_arg1)) :

       lst.append(_arg1[i]+_arg2[i])

return lst

“,

ATTR([Category]),ATTR([Sub-Category])

)

 

Finding correlation coefficient of Sales & Profit by Python

PythonCorrCoeff: Python Calculated Field Function Code

SCRIPT_REAL(“

import numpy as np

return np.corrcoef(_arg1,_arg2)[0,1]”,

SUM([Sales]),sum([Profit])

)

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top