Tableau Prep
• Tableau Prep Builder is a tool in the Tableau product suite designed to
make preparing your data easy and intuitive. Use Tableau Prep Builder
to combine, shape, and clean your data for analysis in Tableau.
• Start by connecting to your data from a variety of files, servers, or
Tableau extracts. Connect to and combine data from multiple data
sources. Drag and drop or double-click to bring your tables into the
flow pane, and then add flow steps where you can use familiar
operations such as filter, split, rename, pivot, join, union and more to
clean and shape your data.
• Each step in the process is represented visually in a flow chart that you
create and control. Tableau Prep tracks each operation so that you can
check your work and make changes at any point in the flow.
• When you are finished with your flow, run it to apply the operations to
the entire data set.
Using Tableau Prep
• In the Connections pane, click + and choose Text file (or whichever
connector matches the type of your file).
• Choose the file Orders_South 2015 (for the given example) and click Open.
• The file is loaded into Tableau Prep, as shown on the next slide.
• The first pane, Connections, lists the files that we have loaded into
Tableau Prep.
• Below it, the Tables section shows the tables available in the file
that we have loaded.
• The white area at the top right, containing the flow chart, is the flow
area. It shows how the data moves through the flow.
• Below the flow area are the input options, such as Input, connection,
text options, and field separator.
• The table-like structure at the bottom right describes the dataset that
we have loaded: column names, data types, sample data, and so on.
• Each row of that table has a check box. Clear it for any column that we
don't want in our analysis.
• Type specifies the data type of each column.
• Field Name is the name of the column, which we can change as
required. E.g. Sales to Sale for our data.
• Original Field Name is the name of the column in the source dataset,
which stays the same.
• Changes shows whether any changes have been made to the field.
• Preview shows some sample values.
• There is one more option, Filter Values, which lets us apply a filter
to the data.
• Clicking Filter Values opens a window (shown on the next slide) where
we enter the logic or condition by which we want to filter our data.
• E.g. we only want to display the rows whose Ship Mode is First Class.
• Enter the formula [Ship Mode]='first class' in the window and click
Apply.
• We can also edit or remove the filter applied to our data by right-clicking
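For readers who want to see the same filter outside Tableau Prep, here is a minimal Python sketch. The sample rows and the function name `filter_ship_mode` are made up for illustration, and the case-insensitive comparison is a choice made in this sketch, not a statement about how Tableau Prep compares strings.

```python
# Hypothetical sample rows; the real Orders_South file has many more columns.
orders = [
    {"Order ID": "S-1", "Ship Mode": "First Class"},
    {"Order ID": "S-2", "Ship Mode": "Standard Class"},
    {"Order ID": "S-3", "Ship Mode": "first class"},
]


def filter_ship_mode(rows, wanted):
    """Keep rows whose Ship Mode equals `wanted`, ignoring case and outer spaces."""
    target = wanted.strip().lower()
    return [row for row in rows if row["Ship Mode"].strip().lower() == target]


first_class = filter_ship_mode(orders, "first class")
print([row["Order ID"] for row in first_class])
```

As in the flow, the filter only decides which rows are passed on; the input list itself is left unchanged.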
• Under Input there are several options: Settings, Multiple Files,
Data Sample, and Changes.
• Multiple Files lets us combine (union) several files in a single
input step.
• Data Sample lets us choose how many rows we want to work with.
• Changes is the tab that shows the changes we have made to our data.
• We can also add new files to the flow by dragging them from a folder
into the flow area.
• The orders_south table has two date columns, Order Date and Ship Date.
In orders_central, the same information is spread over six columns: order
day, month and year, and ship day, month and year. This is the first problem.
• orders_south has a Region column, but orders_central does not. This is
also a problem.
• In the orders_west table, the Order Date, Ship Date and Region columns
are all available. But scrolling down through the columns reveals duplicate
columns with the prefix Right. To remove these duplicate columns, clear the
check box in front of each of them. In addition, the state is given as an
abbreviation (AZ) rather than as the full name used in the other tables.
• In orders_east, the values of the Sales column carry a USD prefix, which
the other tables do not have. We need to fix all of these problems in our
dataset as part of cleaning.
Cleaning using Tableau Prep
• Go to the orders_central table in the flow area, click the + and
choose Clean Step.
• The output, shown in the picture, displays the distribution of the
data. E.g. most of the orders use Standard Class.
• The orders_central table has no Region column, so adding one is the
first step of our cleaning process.
• After the Rename Fields option there are three dots (…). Click them to
add a new field.
• Enter Region as the field name, type "Central" in the formula section,
and click Apply and Save.
• A new column named Region, with the value Central, has now been added
to our data, as shown in the picture on the next slide.
Removal of 1st problem in Orders_Central
• The second issue with orders_central is the missing Order Date and Ship
Date columns. To fix this, we again add a new calculated field.
• Enter Order Date as the field name and
MAKEDATE([Order Year],[Order Month],[Order Day]) as the formula, then
click Apply and Save.
• A new field named Order Date has been added to the data, as shown in
the picture on the next slide.
• Similarly, add a new field Ship Date with the formula
MAKEDATE([Ship Year],[Ship Month],[Ship Day]).
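The two fixes for orders_central (a constant Region field and dates built from year, month and day) can be sketched in Python as follows. This is only an illustration of the logic: the sample row and the function name `add_region_and_dates` are assumptions, and `datetime.date` stands in for Tableau's MAKEDATE.

```python
from datetime import date

# Hypothetical row shaped like orders_central: dates split into six columns.
central_rows = [
    {"Order Year": 2018, "Order Month": 5, "Order Day": 14,
     "Ship Year": 2018, "Ship Month": 5, "Ship Day": 19},
]


def add_region_and_dates(row, region="Central"):
    """Return a copy of `row` with Region, Order Date and Ship Date added."""
    new_row = dict(row)  # copy, so the original row is not modified
    new_row["Region"] = region
    new_row["Order Date"] = date(row["Order Year"], row["Order Month"], row["Order Day"])
    new_row["Ship Date"] = date(row["Ship Year"], row["Ship Month"], row["Ship Day"])
    return new_row


cleaned = [add_region_and_dates(row) for row in central_rows]
print(cleaned[0]["Region"], cleaned[0]["Order Date"], cleaned[0]["Ship Date"])
```

Like MAKEDATE, `date(...)` takes its arguments in year, month, day order; an impossible combination such as month 13 raises a `ValueError` in Python.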
Removal of 2nd problem in Orders_Central
• Now that we have Ship Date and Order Date, we no longer need the
separate fields such as order month, year, etc. To remove one, go to that
field, click the three dots, and choose Remove.
• Apply the same procedure to remove ship day, month and year from the
dataset.
• None of the changes we make are applied to the original dataset.
• For Ship Date and Order Date we see summary information rather than
detailed information. To see the detail, click the three dots and choose
Detail.
• There is another issue in the orders_central table: the Discount column
is of type Abc (text) and one of its values is None. If there is no
discount, the value should be 0. To fix this, double-click None and
type 0. Then, to change the data type, click Abc and choose a number type.
• We can see all the changes that we have made on the next slide.
• If we want, we can also rename the first clean step: double-click
Clean 1 and enter a new name, such as "Fixing date and discount".
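The Discount fix combines two operations: replacing the text None with 0 and converting the column from text to a number. A small Python sketch of that logic, using made-up sample values and a hypothetical helper name `clean_discount`:

```python
def clean_discount(value):
    """Turn the text 'None' into 0 and parse any other value as a number."""
    text = str(value).strip()
    if text.lower() == "none":
        return 0.0
    return float(text)


# Hypothetical raw values as they might appear in a text-typed Discount column.
raw_discounts = ["0.2", "None", "0", "0.15"]
discounts = [clean_discount(value) for value in raw_discounts]
print(discounts)
```

The order matters here just as it does in the clean step: the None value has to be replaced before the column can be read as numbers, otherwise the conversion fails on that row.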
Removal of 1st Problem in Orders_East
• The problem in this table is that USD is written alongside the sales
values. To fix it, go to the Sales field, click the three dots, choose
Clean, and then Remove Letters. This removes USD from the sales values.
• After that, convert the data type to a decimal number.
• Now orders_east is fine.
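The same two steps (remove the letters, then convert to a decimal number) can be sketched in Python. The sample values and the helper name `clean_sales` are assumptions for illustration; the regular expression simply drops every ASCII letter, which is a rough stand-in for the Remove Letters option.

```python
import re
from decimal import Decimal


def clean_sales(value):
    """Strip letters (such as the currency code USD) and parse the rest as a decimal."""
    digits = re.sub(r"[A-Za-z]", "", value).strip()
    return Decimal(digits)


# Hypothetical raw Sales values carrying the currency code.
raw_sales = ["USD 261.96", "USD 14.62", "731.94 USD"]
sales = [clean_sales(value) for value in raw_sales]
print(sales)
```

`Decimal` is used here instead of `float` so that currency amounts keep their exact written value.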
Removal of 1st problem in Orders_West
• The problem in this table is the state name. To fix it, go to the State
column -> click the three dots -> Group Values -> Manual Selection.
• Then enter the full state names one by one, pressing Enter after each.
• After changing all the state names, click Done.
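Grouping values by manual selection amounts to a lookup from each abbreviation to its full name. A minimal Python sketch, assuming a small hand-written mapping (only AZ comes from the example above; the other entries and the helper name `expand_state` are illustrative):

```python
# Hand-written mapping; extend it with every abbreviation found in the data.
STATE_NAMES = {
    "AZ": "Arizona",
    "CA": "California",
    "WA": "Washington",
}


def expand_state(value):
    """Replace a known abbreviation with its full name; leave other values as they are."""
    return STATE_NAMES.get(value.strip().upper(), value)


states = [expand_state(value) for value in ["AZ", "CA", "Arizona", "XX"]]
print(states)
```

Values that are not in the mapping pass through unchanged, so any abbreviation that was missed remains visible in the output and can be added later.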
Union of two files
• To union the west and east datasets, drag the clean step of west onto
the clean step of east. A new step representing the union is added.
• Similarly, drag the clean step of central and drop it on the union of
east and west.
• Finally, drag the south data and drop it on the same union step.
• The resulting structure is shown on the next slide.
• As shown in the picture, the Mismatched Fields column on the left lists
all the mismatched columns from the union of the four tables.
• Select the Show only mismatched fields check box to display only
those fields.
• Drag the Discounts field onto Discount to merge the two, and drag
Product onto Product Name.
• No mismatched fields are left, as shown in the picture on the
next slide.
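The union and the merging of mismatched fields can be sketched in Python as stacking rows after mapping each mismatched name onto its counterpart. The two renames come from the steps above; the sample rows, the assignment of field names to particular tables, and the helper names `normalise` and `union` are assumptions for illustration.

```python
# Field merges taken from the steps above: mismatched name -> name to keep.
RENAMES = {"Discounts": "Discount", "Product": "Product Name"}


def normalise(row):
    """Rename mismatched fields so every table uses the same field names."""
    return {RENAMES.get(name, name): value for name, value in row.items()}


def union(*tables):
    """Stack all rows; fields missing from a row are filled with None."""
    rows = [normalise(row) for table in tables for row in table]
    fields = []
    for row in rows:
        for name in row:
            if name not in fields:
                fields.append(name)
    return [{name: row.get(name) for name in fields} for row in rows]


# Hypothetical one-row tables; which table uses which field name is assumed.
east = [{"Order ID": "E-1", "Product Name": "Chair", "Discount": 0.1}]
west = [{"Order ID": "W-1", "Product": "Desk", "Discounts": 0.0}]

combined = union(east, west)
print(list(combined[0]))
print(combined[1])
```

Without the renames, the union would contain both Discount and Discounts, each half empty, which is the situation the Mismatched Fields list points out.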
• Now add a new file: go to + -> Excel file -> Return Reasons_New.
