Select Page

VIDEO: Why and When Merge Queries in Power Query

by | Apr 27, 2020 | Video

Combining queries is one of the most essential tasks in Power BI. There are two options to combine queries – the append and merge functions in Power Query.
In our last video we looked at appending queries, which in plain English means stacking tables on top of each other. Ideally, the tables being appended should have the same structure, i.e. the same number of columns for each query. That way, a resulted table will include all rows of the appended queries.

The second type of combining queries is merge, which is based on matching rows rather than columns. When merging tables, you are joining them side-by-side using a matching column. An output of merge will be a single wider query (not a taller one as in the append function).

Comparing merge and VLOOKUP

Merge is often compared to Excel VLOOKUP and one may wonder why bother with it if it gives the similar result. Well, the truth is the merge function is much more powerful and has better functionality than the good old VLOOKUP.

First and foremost, Power Query can work directly with external data sources. This means when working large amounts of data, stored in databases, CSV and web queries, you can bypass Excel and jump straight into Power BI. This goes back to Excel’s inability to deal with a large number of rows.

Second, the merge can lookup based on multiple conditions. This is something that VLOOKUP can’t do. When there are two tables with two matching columns, but you need a third column from the second table in the first table, the merge function in Power Query would work best.

Third, when there is a need to compare lists from two tables, where some values are in both tables while others are not. For example, there is one list of students who attended one class, the second list of students who attended another class, and your task is to find out who went to both classes. A simple merge using only the matching rows from the two tables (inner-join kind) will get the job done quickly.

Lastly, the merge function can return multiple columns with one query. No more multiple VLOOKUPs when you need to bring in more columns. Again, saves you computing power.

Comparing merge and relationships

Experts say that when working with very large tables (in the 10 million range), it’s better to create relationships between tables, as merge can create some performance issues. Watch one of our previous videos on how to create relationships here.

However, when working with relatively small tables, you’d want to keep data organized without having to create too many relationships, especially when dealing with tables, which have fewer columns. In this case, it’s better to flatten the data by merging two queries simply for readability purposes. For example, when there is a table with information about address, city and province, and a separate table with city, country and region, merge them together to avoid creating too many relationships in the data model.

Another occasion when merge is preferred over a relationship is when there is a need to group values and do aggregation. For instance, when you’re looking to count a number of items sold and a total amount of revenues they generated, the merge function can do that on one of the tables and then show the name of a product on the other table. In this case, it’s similar to the “Group By” operation.

To sum up, the merge queries function joins two or more tables together and provides you with an extra tool to transform your data and make it more organized.

Share This