Actualités d'Argusa

  • joanammiguens

Automate writing multiple outputs with Tableau Prep parameters

Tableau Prep is a powerful tool that makes data preparation simple and intuitive. It provides a visual and straightforward way to combine, shape, and clean data, making it easier for users to start their analysis.


Imagine a scenario where you analyze the sales data of a store. This is a large store with multiple regional managers. Some managers like to analyze their own sales data, so they will regularly ask you to produce a summary dataset specific to them.


Here we will show how you can use Tableau Prep to produce multiple and distinct outputs with summary data. In our scenario, we will be producing one output file for each of the different regional managers, filtered for the data that is relevant to each of them. We will also propose solutions to streamline and automate this workflow, using Tableau Prep’s command line interface and a simple command line script.


In summary, we will follow these main steps:

  • Build a simple Tableau Prep flow that aggregates/summarizes the sales data and writes it to a single excel output file

  • Filter the data in the Prep flow to a single regional manager using a parameter

  • Run the same flow multiple times using the command line, each time changing the parameter and filtering on a different regional manager, to obtain multiple distinct outputs


We will be using Tableau’s Superstore dataset throughout, which you can find in the “Datasources” folder of your “My Tableau Repository” location. Also please note we will be working with excel files for simplicity. In a corporate environment you will most likely work with database tables, but this is not a problem - Tableau Prep allows you to read/write data from/to a variety of databases, so you can easily adapt the example provided. If you would like to know more about using Tableau Prep to write to a database, you can read our previous blog post.



Writing multiple output excel files with Tableau Prep


Let us start with this simple flow:


You can see on the left that we have created a connection to the Superstore dataset excel file. We then brought into the flow the “People” (blue) and “Orders” (orange) tables.


“People” is a simple table with two columns that contain the name of the regional managers and the region they are assigned to:


The “Orders” table is a more complex one, containing the sales data for each order placed with the Superstore (note not all columns are shown in the screenshot):


Let us now go through the different steps of the flow. We first perform a left join between the two tables (“Join” step in orange), using the common field “Region”, which allows us to assign the different orders to each regional manager:


The “Aggregate” step in orange allows us to obtain a summary of the data and provide the managers with a less granular dataset:


In the “Filter manager” step in orange, we use the “Keep-only” option to select the manager we want to produce the summary dataset for. In our example we will start with Fred Suzuki:


Finally, we use the “Output” step to configure writing out an excel file, named “Fred Suzuki Summary Data”, when we run the flow. The data will be written in the “Manager summary data” sheet of the excel file:


Here is what our output file looks like:


At this point, one can imagine two simple ways of moving forward and producing the excel files for the remaining regional managers:

  • We could modify the filtered manager and the name of the output file in the corresponding last two steps of the flow and re-run the flow three more times to obtain the separate summary files for each regional manager.

  • We could also add three more filter steps with dedicated outputs to the Prep flow and produce the files for each manager with one press of the “Run flow” button.

However, both options can become tedious and error prone, especially for a large number of managers. Furthermore, neither option is flexible enough in case one or more of your managers change or you get more managers, in which case you would need to manually adjust and correct your flow. So let us investigate how we can automate this process.



Automation with Tableau Prep parameters


A long-awaited new featured was introduced in Tableau Prep version 2021.4, released at the end of 2021 - parameters. Like Tableau Desktop, parameters in Prep are variables that allow users to replace a value in a calculation or a filter.


Let us create a parameter in our flow using the menu at the top. Our parameter will be called “Manager”, have the type “String”, and will make sure to tick “Prompt for value at run time” (the current value is not so important, as we will ultimately change this multiple times):


We will use this newly created parameter in the “Filter manager” step, where we will remove the specific filter we coded before and replace it by one that requires that the field “Regional Manager” equals the parameter “Manager”. Notice that just like in Tableau Desktop, the calculation shows fields in orange and parameters in purple:


Finally, we will modify the name of the output excel file in the “Output” step, to also include the newly created parameter:


If we attempt to run the flow, we will see the following pop up, which lets us set the value for the parameter “Manager”. We will go with Chuck Magee, to make it different from the default, and then run the flow:


The output file indicates the parameter did the job – we get the file “Chuck Magee Summary Data.xlsx”, filtered to the desired manager:


We will now close the Prep flow and attempt to run it using Tableau Prep’s command line interface. Our simple flow can be run by typing the following on a Command Prompt:

"C:\Program Files\Tableau\Tableau Prep Builder 2021.4\scripts\tableau-prep-cli.bat" -t ManagerFlow.tfl -p Manager.json

Keep in mind you should modify the first part to point to your Tableau Prep install location and the part after -t should match the name of your flow. The json file that appears after -p is what allows us to set the values for the parameters used in the flow via the command line. We will therefore create the file “Manager.json” with the format below. This file will tell Tableau Prep to override the default value of the “Manager” parameter (set directly in the flow) with “Roxanne Rodriguez”:

{"Manager":"Roxanne Rodriguez"}

We can now type the command in the Command Prompt:


Which produces the expected output:


With this setup, we can see that to produce the summary data for specific manager we need to first modify the “Manager.json” file, and then run the Prep flow. So we can now write a command line script that includes this sequence of commands. Our script can even put this sequence inside a loop, that runs through the list of managers. This way the sequence will be executed once for Roxanne, then once more for Fred, etc.


Let us therefore create the list of managers we want to produce the summary data for. We will use a text file, named “ManagerList.txt”. In this case we copied the list directly from the People table, because we will be producing outputs for all four managers:


And now onto our script - here is a very simple version of what it could look like:


To create your version of the script, you can copy and paste the lines below and save them as “runall.cmd” in your working directory (we assume that the script, the Tableau Prep flow, the ManagerList.txt file, the Manager.json file and the output excel files are all saved/written to the same working directory of your choice and that you run the script from this working directory):

FOR /F "tokens=*" %%A IN (ManagerList.txt) DO (
ECHO {"Manager":"%%A"} > Manager.json && "C:\Program Files\Tableau\Tableau Prep Builder 2021.4\scripts\tableau-prep-cli.bat" -t ManagerFlow.tfl -p Manager.json
)

Here is a breakdown of what the script does:


FOR /F "tokens=*" %%A IN (ManagerList.txt) DO ()

This is the loop part of the script. It reads in the ManagerList.txt file and loops over each of the lines in the file, with %%A representing the value of each line. The script will then execute whatever is in parenthesis as many times as there are lines in the ManagerList.txt file (in our case four) – first using %%A equals Sadie Pawthorne (first manager in the list), then using %%A as Chuck Magee (second in the list), etc.


ECHO {"Manager":"%%A"} > Manager.json

This part writes the value of %%A, i.e. the manager name, into the json file, with the format required for the correct setting of the parameter in the flow. The code is inside the parenthesis of the FOR loop, so it will run four times. The first time it will write

{"Manager":"Sadie Pawthorne"} 

into Manager.json, the second time it will write

{"Manager":"Chuck Magee "} 

into the json file, and so on.


&& "C:\Program Files\Tableau\Tableau Prep Builder 2021.4\scripts\tableau-prep-cli.bat" -t ManagerFlow.tfl -p Manager.json

This is the command we ran above directly in the Command Prompt. It simply runs our ManagerFlow.tfl Prep flow using the json file we just wrote. The “&&” symbols indicate that this command will not run unless the previous command runs successfully. So the flow will not run if the manager name was not yet written to Manager.json file, otherwise the Manager parameter will not be set correctly.


To run the script, you can simply type runall.cmd in the Command Prompt:

runall.cmd

Once the script finishes running you should have four different excel files in your working directory, each one containing data from a different regional manager:


This solution is a more efficient and less error-prone approach to the workflow:

  • You only need to configure one output step in your Prep Flow once

  • You produce multiple output files in one go by running a single script

  • The manager names are not hard coded in the Prep flow, so you never need to open or modify it. When managers request data, you simply need to add their names to the list in the text file and run script. You can even add names of managers that did not exist at the time you developed the Prep flow.