Power BI - Power Query Vs. DAX

Power BI Power Query Vs. DAX Image

In this article we will compare the two main ways to manipulate data in Power BI - Power Query and DAX (Data Analysis Expressions). Along the way we will look at the strengths and weaknesses of each approach and give suggestions on how to make the most of each method.

What Is Power Query?

Power Query is a tool available in both Microsoft Excel and Microsoft Power BI that allows end users to perform Extract/Transform/Load (ETL) operations on data. The overarching purpose of Power Query is to allow a simple way to "massage" data into data structures that are more amenable to reporting and analytics.

Power Query "feels" like an ETL tool - it handles extraction, conversion of data types, filtering, and other functions that traditional ETL tools excel at.

What Is DAX?

DAX stands for Data Analysis Expressions and is a way to perform dynamic, IN MEMORY operations on data in Power BI. DAX does not filter or perform manipulations on data before it enters memory.

DAX feels less like a query language and more like working with functions in Excel or setting up "group by" columns in SQL. Getting into the DAX mentality is often difficult for people from a more linear programming background as it requires you to think about a calculation and how it might be performed at different aggregation levels of the data.

Power Query and DAX Storage Modes

To understand how Power Query and DAX relate to one another one must first understand the two different "storage modes" that Power BI uses to deal with data - DirectQuery, and Import. Direct Query goes directly (hence the name) against the data source and all manipulations against the data are performed in memory via DAX expressions. Therefore Power Query is not available for tables using DirectQuery mode.

In contrast, Import mode brings data from the data source, processes it, and saves it in its own datastore. In Power BI desktop this datastore is contained in the .pbix file, and in Power BI online it is stored as a separate data set from the report.

Power Query allows you to take data that is imported in Import Mode and perform further manipulations upon it to get the data in the format that you would like so that it is easier to build reports using the Power BI visual designer.

Import Mode
Compatible with Power Query
Compatible with DAX
Data is pulled and saved into a local data store for further manipulation, similar to traditional ETL processes.
Direct Query Mode
Incompatible with Power Query
Compatible with DAX
Data is pulled from the data store into memory and operations are performed in memory but no data is saved locally.

As stated above, Power Query can only be used on imported data, whereas DAX can be used on both imported and DirectQuery data. If you are using DirectQuery you are stuck with DAX. However, if you are importing data you can use either one (or both in some circumstances). Here are some considerations that might make you choose one over the other:

File Size Vs. Performance Tradeoff

DAX queries run on top of existing data and perform calculations in memory and they do not store extra data, so DAX wins hands down in terms of keeping the size of your dataset/pbix file small. On the flip side, all of that calculation can greatly slow down reports.

Power Query only executes when a data refresh is performed, and it stores the results of its data manipulation in static tables. Therefore the "performance hit" of Power Query only happens when the report is refreshed. This makes Power Query ideal for reports that contain a lot of data but are not refreshed frequently.

Power Query
Larger File Sizes
Slower Data Load/Refresh
Faster Performance Once Data Is Loaded
Changing Slicer Values Is Faster Because Data Is Local
DAX
Smaller File Sizes
No Initial Data Load
Much more memory required
Performance tends to be worse when manipulating data because Power BI goes back to the database for every significant change.

Known Aggregations Vs. Flexible Aggregations

Power Query is great when you know from the beginning what kinds of aggregations you will have to perform on data. For example a lot of business reporting is simply rolling up numbers (sales, customer acquisition, etc) to weekly/monthly/quarterly/annual aggregations. Power Query shines in this regard because aggregation tables can be built from the beginning with this well-defined end reporting in mind. Pre-calculating these numbers makes the front-end reports significantly faster.

On the flip side Power Query really does not work when the end user needs to be more flexible in how they aggregate data, or when there are other dynamic filters that will affect the data aggregation. In these cases it makes more sense to use DAX calculated measures which take into consideration any filters that have been put onto the data in the report front-end.

Use Power Query for aggregation when:
Aggregations are known and stable ahead of time
Dashboard GUI responsiveness is key
Power BI file size is not an issue
Use DAX for aggregation when:
Date Ranges/Filters are not known ahead of time
Data Load Times are Key
Power BI files are becoming too large

Development and Maintenance Complexity

Power Query has a very nice visual interface for editing queries and the Power Query "mentality" is very similar to working with SQL, thus it is much easier to pick up than DAX. DAX, by contrast, requires a complete shift in mindset. Therefore it is much easier for a data analyst to pick up Power Query skills than DAX.

As an organization's business intelligence capacity and complexity grows it becomes important also to be able to track changes to code. Neither Power Query nor DAX are particularly easy to track changes and back up code. However, of the two, Power Query is somewhat better as the steps Power Query takes can be expressed as a text-based script in the Power Query M language. These scripts can be copied and pasted into some kind of master file and then added to version control, allowing you to track changes.

Power Query
It is possible to export Power Query queries to text file for source control.
Easier to follow flow of logic from start to finish
Uses a linear programming language (Power Query M)
DAX
The "DAX Mentality" is hard to get at first if you come from a traditional SQL background.
Difficult to use version control or track changes over time.
Much more difficult to debug - many serious Power BI developers turn to external tools such as DAX Studio to develop and debug DAX statements.

What are some common tasks performed with Power Query?

Data Filtering

One area in which Power Query shines is pre-filtering data for a particular report. Often the concern of a particular Power BI report is narrow - show the performance of one particular product, or department, over the course of one year. Using DAX one must query the entire table and then apply the filters in memory every time there is a change in the report parameters/slicers/filters. Power Query allows you to only download the data you want when the report is refreshed so that you do not have to think about applying filters later in the visual aspect of Power BI. Here are some common filters that are implemented in Power Query:

  • Date Ranges - often you are only interested in a particular date range in a report (current year, quarter, etc) and Power Query therefore limits the data just to that date range.
  • Remove Unwanted Data Points - Sometimes you want to focus on a particular sales region, or product line, or customer segment, and Power Query allows you to filter out all others.
  • Remove Deleted/Archived Records - In some database designs deleted/archived records continue to be stored with the rest of the data and are only distinguished by a flag like IsDeleted or IsArchived. Power Query allows you to get rid of those records without having to remember to filter at the report level.
  • Remove invalid data - Let's face it - we are not always working with clean data. Power Query allows you to find and adjust or remove bad data.

Known Data Aggregations

Power Query allows for intuitive grouping of data into separate aggregation tables. This aggregation is particularly powerful when dealing with large sets of data as this allows you to avoid calculating aggregations on the fly in reports.

Data denormalization

Power BI will create a data model for you when you import data, but sometimes it is easier to work from one denormalized data table rather than dealing with the cognitive complexity of many joined tables. Power Query allows you to denormalize the data up front into a few mega-tables that have all the detail you need. Then additional filters can be applied to this denormalized table in DAX to get the results you need.

What are some common tasks performed with DAX?

Data Aggregations in Hierarchical/Drill Down Data Views

DAX is best when you are going to let users explore data freely and aggregate it at different levels in the front end. In this case you can define a custom measure in the data that is calculated at different aggregation levels - the measure calculated itself based on whatever grouping is being used on the data at the time.

Calculations involving dynamic date ranges or many slicers

DAX works well when you are going to look at data based on different date ranges or where you want to filter based on certain data dimensions (sales regions, customer segments, etc).

Dynamic Data Tables

DAX allows you to create data tables "on the fly" in memory based on certain conditions. This can simplify certain reports and visuals such as master/detail views.

Tips for Using Power Query and DAX together to create great reports:

1. Use Power Query to create a local data set that is focused on your set of dashboards/reports. Pre-filtering data and storing it locally using Power Query leads to some great performance gains as the DAX expressions that you might use later will be going against locally stored data rather than a network data store.

2. Use Power Query to aggregate data if you know what kind of aggregations you need ahead of time. If you are doing very standard end of month/quarter/year reports then you can perform those aggregations in Power Query and store the results in a table that can simply be dropped into a front-end report with little fuss.

3. Use Power Query for very common business logic that is repeated across reports. Because it is easier to "version control" and back up Power Query data operations it makes sense to perform as much business logic as possible in Power Query so that you can copy this logic from report to report.

4. Use Power Query to create calendar tables and other lookup tables that can be created beforehand. While it would be nice to have all the important tables already existing in datastores, sometimes it makes sense to create some "local" data tables that can be used to code data. Power Query is great for these local tables such as calendar tables, aggregation categories that vary from report to report, etc.

5. Use Power Query to denormalize data for consumption in the Power BI frontend. Every organization has a few people whose skill set is better than the average data analyst but not quite up to the level of developer. These power users often do well with a dashboard that has many slicers sitting on top of a denormalized data table. You can use Power Query to "dump" all the data from various relational tables into one denormalized "mega table" and then allow these power users to explore the data from the denormalized table.

6. Use DAX to do dynamic calculations and aggregations. DAX is great when you are creating a dashboard for a power user who is doing exploratory data analysis (EDA). In this case you can create all the aggregation calculations as measures built upon data tables and allow the user to set their own filters and aggregation levels.

We hope this overview of Power Query and DAX has been useful to you. If you are in need of more targeted Power BI expertise get in touch using the contact form below and we would be glad to talk to you about your organization's specific needs.
Contact Us

322 North Shore Drive

Building 1B, Suite 200

Pittsburgh, PA 15212

Phone number
+1 (412) 230-8552
info@synthelize.com

I am interested in information on: