## An example on DAX SUMX function which is the perfect row iteration function for a lot of use cases

## What

Often there is a need to calculate a DAX measure in a row level combining data from multiple tables. An example is a margin calculation: the margin percentage in a product level multiplied with the sales currency amount in an event level.

## How

DAX SUMX function is the choice for the job, as said by Microsoft: *Returns the sum of an expression evaluated for each row in a table*. However, there are a couple of pitfalls to this function. These are discussed in the Code section and Theory section of this post.

In addition, one should consider when to use DAX MEASURE, especially SUMX, in the first place. SUMX uses a lot of memory and resources, and typically the operation of SUMX can be done without SUMX and with SUM by making new columns to the data model and then using the simple SUM. You can read more about whether to use DAX MEASURE in my following post: Power BI DAX When to Use Measure VS Calculated Column VS Other Tools. To see a DAX query time benchmark where SUMX is used see my post: The Cost of Relationships, Snowflake vs Star Schema.

## Code

Here are presented two different approaches. Be aware that the 1) CALCULATE approach gives correct results only with a table (to which the SUMX refers to iterate) which has a row id in each row. If there are duplicates, like duplicate sales events, one should either add a row id (best solution for the data model usability and reliability) or use the 2) COLUMN approach of SUMX

--1) CALCULATE approach Measure SUM MARGIN with CALCULATE = SUMX(sales;CALCULATE(SUM(sales[Sales])*MAX(category_margin[Margin]))) --2) COLUMN approach Measure SUM MARGIN with COLUMNS = SUMX(sales;sales[Sales]*RELATED(category_margin[Margin]))

## Theory

With the 1) CALCULATE approach presented above happens context transition (see for example https://www.sqlbi.com/articles/understanding-context-transition/). This means the sales table being iterated is a filter context table = distinct row presentation. If there is a duplicate row, it is iterated as one row. This is not usually the desired behavior when we want to iterate sales events in the row level.

With the 2) COLUMN approach presented above there is no context transition, but one needs to use the RELATED function to refer to another table. As nicely formulated by SQLBI in this post about row context and filter context (must read if you have not done already): “*A row context does not propagate through relationships. If you have a row context in a table, you can iterate the rows of a table on the many side of a relationship using RELATEDTABLE, and you can access the row of a parent table using RELATED.”* This is exactly what is done with SUMX and RELATED in the 2) COLUMN approach.

## Sample Power BI file

Example measures are created under the sales and sales_with_duplicate tables in the example file: https://drive.google.com/file/d/1gw2kfBF4m261rtdygUthqAJiXX7mQPbq/view?usp=sharingc