From the course: Complete Guide to Tableau for Data Scientists

Combining data using relationships

- [Instructor] Often when you're building a visualization, you need to bring in additional data in order to answer the questions and do your analysis. Now, another example here, I'm looking at the Summer Sales broken down by category, but what I want to know is which of our regional managers is responsible for those sales? I want to compare each one of their performance, for example. Now, if we look on our Data pane on the left-hand side, that information isn't in there. There's no manager name, and there's no regional manager. None of that data is currently available to us. We've got the region, so I know where the products were sold, but not the person responsible for it. Luckily, Tableau makes it very easy to combine data together using relationships. We simply define that these two tables are related somehow, and Tableau does all the work of combining that data for us. To see that in action, we're going to go and edit our Data pane, and we're going to add our new table. So I'm going to right-click on the order name, click Edit Data Source, and I can see that there are three other tables within my Excel sheet The orders one I'm already using, that's got the order information. We have a tab here called People. When I click on the View Grid, it shows four regions and the regional manager for each one of those regions. So I know that's the table that I want to bring in. Now to build a relationship, all we do is drag that table into the middle of the view, and a little orange noodle would appear to say that when I release the mouse button, Tableau's going to relate these two tables together. Now at the bottom of the screen, we can see the result of that relationship. Tableau's identified a column in Orders called Region and the same column in People called Region, and makes the assumption that these two tables are related by that column. That's important to note. All Tableau does is match it on the name. It doesn't know the content. So if you[re trying to relate two tables together that just so happen to have a same column name in each one, Tableau will attempt to do that relationship join between it. Now you can remove that join if it's not the correct one and select the correct ones that you want to do the join on. But it's always good just to check to make sure that Tableau's made that correct assumption. But, in this case, it has 'cause it's quite a simple join. So I can see the four values that Tableau's going to put in together. And when I go back to my sheet, we can see that the Date pane has changed slightly. We now see that the Measures and Dimensions are grouped together. So one set is under Orders, and towards the bottom, there's another set under People. So what Tableau's done is in the background, behind the scenes, it's related those two tables together and says that these two are connected via the Region. And then once that's happened, these are just treated as additional fields. So my People and my Regional Manager are exactly the same as all the other data types. So I can basically just now add them into my views. So I could maybe put Regional Manager onto Color. So now my Regional Managers are color-coded blue, orange, red, and green. So I can then see for each one of my subcategories, how each one of my managers has performed. I might want to put the manager on say the column and split out each individual manager. The point is, is that as of now, that extra information is just part of my data. I don't have to think about doing the joins. I don't have to think about how to combine that data. Tableau's doing it for me. Tableau finds the most efficient way of combining that information to create the ultimate view that we're after.

Contents