Joins in Power Query

Joins in Power Query

6 Joins in Power Query

Joins in Power Query can help us with some of the most laborious and tedious tasks in Data Analysis. i.e. Comparing and Combining data sets.

There are 6 Joins in Power Query which help us to Compare and Merge tables containing data. Those Joins are…

  • Left Outer
  • Right Outer
  • Full Outer
  • Inner
  • Left Anti
  • Right Anti

In this section, I will brief each Join using the following tables and later we will see how to use each one of them.

Here, we have two tables containing the list of actors who have worked in the movies Inception and The Dark Knight Rises. As you can see, a few actors have worked in both the films and others have worked in only one.

Michael Caine, for example. He has acted in both films. But Leonardo DiCaprio has acted only in Inception. Similarly, Christian Bale acted only in The Dark Knight Rises.

How can we find out the actors who have worked in both the films?

That is what Inner Join in Power Query is for.

Inner Join

Inner Join in Power Query returns the matching records from both tables

When we merge the above tables using Inner Join, we will get the list of the actors who have acted in both the films. And they are Joseph Gordon-Levitt, Tom Hardy, Michael Caine, Cillian Murphy & Marion Cotillard.

The following Venn diagram will give a more clear picture of the output produced by Inner Join.

Full Outer Join

Full Outer Join returns all the records from both the tables

If you want the complete list of actors from both the tables, apply Full Outer Join.

Left Outer

Left Outer Join will return all records from the First Table and the matching records from the Second Table

Use Left Outer Join for the list of actors in Inception and those who have acted in both films.

Right Outer

Right Outer Join will return all records from Second Table and the matching records from the First Table

In this case, Right Outer Join will return the list of actors in The Dark Knight rises and those who have acted in both films.

Left Anti Join

Left Anti Join will return the records that are present only in the First Table

In this case, Left Anti Join will return the list of actors who have acted only in Inception.

Right Anti Join

Right Anti Join will return the records that are present only in the Second Table

Use Right Anti Join for the list of actors who have acted only in The Dark Knight Rises.

How to use Joins in Power Query

Now that we know what each join is, let’s see how to use these Joins between tables containing data.

I will be using the same tables which you saw above. i.e. list of actors worked in Inception and The Dark Knight Rises.

The following are a few Questions related to the above tables and we are going to find the answers to these questions using different Joins in Power Query

  1. List of actors who acted in both Inception and The Dark Knight Rises?
  2. List of actors worked either in Inception or The Dark Knight Rises?
  3. List of actors who worked only in Inception?
  4. List of actors who worked only in The Dark Knight Rises?

1. List of actors who acted in both Inception and The Dark Knight Rises?

Like I mentioned above, Inner Join will return the records that are present in both tables. When we apply Inner Join between these tables, we will get the list of actors who have worked in both films

To Merge two tables and apply a Join, we need to load that tables into the Power Query Editor. For that,

Step 1

Select a cell in the first table > go to the Data tab in the Excel ribbon > Click on From Table/Range > Click OK in the Create Table dialog

Selected data is loaded into the Power Query Editor.

Step 2

In the Home tab of Power Query Editor, Click on the split button for Close & Load and select Close & Load to…

Step 3

In the Import Data dialog, select Only Create Connection > Click OK

We have established a connection with the first table. Now, we have to repeat the above procedure (Steps 1 to 3) with the second table and create a connection with it.

Once we have created a connection with a particular table, that table will be available in the Power Query Editor.

To find out the matching records in these two tables,

Step 4

In the Data tab of the Excel ribbon > Get Data > Combine Queries > Merge

The Dialog called Merge will get activated.

We have to specify 3 parameters in the Merge dialog.

  1. First Table – Table 1 that contains that the list of actors in the movie Inception
  2. Second Table – Table 2 that contains that the list of actors in the movie The Dark Knight Rises
  3. Join Kind – We need the actors who acted in both the movies, so Inner should be the Join Kind (only matching rows)

Step 5

Specify the Tables and the Join Kind using the drop down menus and click OK.

Note: While applying Merge, you have to select the columns to match. In this case our Tables have a single column each and those columns are selected.

Step 6

Once I click OK, I will have the following table in the Power Query Editor. The list of actors who have acted in both the films.

Step 7

Here the second column called Table2, contains a Table in every row. When expanded, that table will have the same list of actors.

To expand the tables in the second column, Click on the two sided arrow in the Column header > Click OK

The table in second column got expanded and contains the same list of actors.

Step 8

To remove a column before loading this data into the Excel sheet, Right-click on the column header > Select Remove

Step 9

You can also rename the column by double clicking on the column header. Here I have renamed the table as Inner

To load the data into the Excel sheet, Click on the split button for Close & Load > Close & Load to…

Step 10

In the Import Data dialog > Select Existing worksheet > Select the cell where you want to place the data (Here, I have selected the cell F2)

In the cell F2, we have the Excel table containing the list of actors who have acted in both Inception and The Dark Knight Rises.

So that is what an Inner Join does, returns the matching records from both tables.

Now, the second question.

2. List of actors worked either in Inception or The Dark Knight Rises?

Here, we need the complete list of actors.

Full Outer Join can be used for this purpose as it will provide us all records from both the tables.

Repeat the Steps from 1 to 4 and select Full Outer in the Merge dialog

Once I click OK, I will have the following table in the Power Query Editor. The list of actors worked in Inception and a table in every row of the second column.

To expand the tables in the second column, Click on the two sided arrow in the Column header > Click OK

The table in second column got expanded and it looks like this.

To combine the data in both the columns,

go to the Add Column tab of the Power Query Editor > Conditional Column

In the Add Conditional Column dialog, specify the New column name [optional] and create a condition which means…

if the value in the first column (Inception) is ‘null’, then use the value from the other column (Table2.Dark Knight Rises)

When I click OK, I will have a new column called Outer which contains the combined data from both columns.

Repeat the Steps 8 (Remove unwanted columns), 9 (Close & Load To…) and 10 (Import Data) to load the data in the column called Outer into the Excel sheet.

And we have the complete list of actors.

3. List of actors who worked only in Inception

Merge the tables and apply Left Anti Join for the list of actors who have associated only with Inception.

Repeat the Steps from 1 to 4 and select Left Anti in the Merge dialog

Once I click OK, I will have the following table in the Power Query Editor. The list of actors worked only in Inception.

Repeat the Steps 8 (Remove unwanted column), 9 (Close & Load To…) and 10 (Import Data) to load the data in the column called Inception into the Excel sheet.

4. List actors who worked only in The Dark Knight Rises

Right Anti Join when applied between the above tables will provide the list of actors worked in The Dark Knight Rises.

Repeat the Steps from 1 to 4 and select Right Anti Join in the Merge dialog

Once I click OK, I will have the following in the Power Query Editor. The first column is empty and the second column has a table in it.

Click on the double sided arrow in the column header of the second column to expand the table and Click OK

The table got expanded and here we have the list of actors who have worked only the second movie, The Dark Knight Rises.

Rename the column [optional], Repeat the Steps 8 (Remove unwanted column), 9 (Close & Load To…) and 10 (Import Data) to load this data into the Excel sheet.

Left Outer Join and Right Outer Join are relevant when the tables contain more than one column.

Still, I will show you what these two Joins will return when applied between the above tables.

To apply Left Outer Join, Repeat the Steps from 1 to 4 and select Left Outer in the Merge dialog

When I click OK, I will have the following in the Power Query Editor. First column contains the data in Table 1 and the second column contains a table in every row.

When expanded the second column will have the list of matching records from the second table

So, that is what Left Outer Join gives us. All the records from the first table along with the matching records in the second table.

Similarly, Right Outer Join will return all the records from the second table along with the matching records in the first table.


Read about Power Query Editor

Tips & Tricks in Power Query