How to get DISTINCT row based on a column from UNION - powerbi

In Power BI, I have a table based on UNION of 2 different tables:
ResultTable =
UNION (
SELECTCOLUMNS (
'Table1',
"Name", 'Table1'[name] ,
"Number", 'Table1'[number]
) ,
SELECTCOLUMNS (
'Table2',
"Name", 'Table2'[name] ,
"Number", 'Table2'[number]
)
)
Here is the ResultTable output:
Name | Number
-------------
A | 1
A | 2
A | 3
A | 1
A | 2
C | 5
A | 3
B | 4
Can I get distinct rows based on the Number column so that it becomes:
Name | Number
-------------
A | 1
A | 2
A | 3
C | 5
B | 4

Note that you have to specify an aggregation in case there are different names for a single number. This is not the case with your sample data, so either MIN() or MAX() will work. Use this expression as a calculated table:
Distinct Numbers =
SUMMARIZE(
ResultTable,
ResultTable[Number],
"Name", MIN(ResultTable[Name])
)
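To make the grouping concrete, here is a minimal plain-Python sketch (not DAX) of what the SUMMARIZE expression above computes: one row per Number, keeping the alphabetically smallest Name. The helper name `distinct_by_number` is hypothetical, and the rows are copied from the sample output in the question.

```python
# Plain-Python model of SUMMARIZE(ResultTable, [Number], "Name", MIN([Name])).
# Rows are (name, number) pairs copied from the question's ResultTable.
rows = [
    ("A", 1), ("A", 2), ("A", 3), ("A", 1),
    ("A", 2), ("C", 5), ("A", 3), ("B", 4),
]

def distinct_by_number(rows):
    """Return {number: smallest name seen for that number} (hypothetical helper)."""
    result = {}
    for name, number in rows:
        if number not in result or name < result[number]:
            result[number] = name
    return result

grouped = distinct_by_number(rows)
for number, name in grouped.items():
    print(name, number)
```

If one number ever appears with two different names, this keeps only the smaller name, which is the role MIN() plays in the DAX expression.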

Your desired output still describes a simple distinct on the entire UNION result, so just wrap it in DISTINCT:
ResultTable =
DISTINCT (
UNION (
SELECTCOLUMNS (
'Table1',
"Name", 'Table1'[name] ,
"Number", 'Table1'[number]
) ,
SELECTCOLUMNS (
'Table2',
"Name", 'Table2'[name] ,
"Number", 'Table2'[number]
)
)
)
Did you mean to do a group by NUMBER, while implementing some logic for picking the right NAME out of aggregation?
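For comparison, a whole-row distinct can be modeled in a few lines of plain Python (not DAX). This is only a sketch: the split of the sample rows into `table1` and `table2` is an assumption, and `distinct_union` is a hypothetical helper name.

```python
# Plain-Python model of DISTINCT(UNION(...)): drop repeated (name, number) rows.
# The split of the sample rows between the two tables is assumed for illustration.
table1 = [("A", 1), ("A", 2), ("A", 3)]
table2 = [("A", 1), ("A", 2), ("C", 5), ("A", 3), ("B", 4)]

def distinct_union(*tables):
    """Concatenate the tables and keep the first occurrence of each full row."""
    seen = set()
    out = []
    for table in tables:
        for row in table:
            if row not in seen:
                seen.add(row)
                out.append(row)
    return out

result = distinct_union(table1, table2)
print(result)
```

Because the comparison is on the whole row, two rows with the same number but different names would both survive. That is the case where grouping by Number with an aggregation is needed instead.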

Related

DAX PATH function with table variable

In DAX, the PATH function requires parameters which are column references. In the example below, I can use a PATH function referring to the "ID" and "ParentID" columns which are statically defined in the DEFINE statement.
DEFINE
TABLE MyTable = UNION (
ROW ( "ID", 1, "ParentID", BLANK ( ) ),
ROW ( "ID", 2, "ParentID", 1 ),
ROW ( "ID", 3, "ParentID", 1 ),
ROW ( "ID", 4, "ParentID", 2 )
)
EVALUATE
ADDCOLUMNS (
MyTable,
"Path", PATH ( MyTable[ID], MyTable[ParentID] )
)
-- Result
-- ID ParentID Path
-- 1 BLANK 1
-- 2 1 1|2
-- 3 1 1|3
-- 4 2 1|2|4
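As a reference for what the Path column contains, here is a minimal plain-Python sketch (not DAX) that walks each ID up to its root and joins the chain with "|". It only illustrates the result shown above; the dictionary `parents` and the function `path` are hypothetical names, and `None` stands in for BLANK.

```python
# Plain-Python model of PATH(id, parent_id): walk up to the root, then join with "|".
parents = {1: None, 2: 1, 3: 1, 4: 2}  # ID -> ParentID, None stands in for BLANK

def path(node, parents):
    """Return the root-to-node chain as a "|"-delimited string."""
    chain = []
    seen = set()
    while node is not None:
        if node in seen:
            raise ValueError("cycle in parent links")
        seen.add(node)
        chain.append(str(node))
        node = parents[node]
    return "|".join(reversed(chain))

paths = {node: path(node, parents) for node in parents}
print(paths)
```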
My question is how I can refer to columns created dynamically in a table variable? With the below query, I get an error saying that the MyTableVar is not found.
EVALUATE
VAR MyTableVar = UNION (
ROW ( "ID", 1, "ParentID", BLANK ( ) ),
ROW ( "ID", 2, "ParentID", 1 ),
ROW ( "ID", 3, "ParentID", 1 ),
ROW ( "ID", 4, "ParentID", 2 )
)
RETURN ADDCOLUMNS (
MyTableVar,
"Path", PATH ( MyTableVar[ID], MyTableVar[ParentID] ) -- Error: MyTableVar is not found
)
I also tried PATH ( [ID], [ParentID] ), but the error message says [ID] is not a valid reference to a table column.
I researched several articles but could not find a solution.
Microsoft Docs - PATH
DAX GUIDE - PATH
Microsoft Docs - Column and measure references
SQLBI - Table and column references using DAX variables
Microsoft Power BI Community Forum - DAX: Is it possible to refer to columns of a table variable?

ALLEXCEPT not working when filtering blanks

I have a simple problem. My DAX measure does not seem to be working correctly when I filter for non-existing values. Here are some details:
Table:
Column1: A,A,A,A,A,B,B,B,B
Column2: 1,2,3,4,5,1,2,3,5
Measure = CALCULATE ( COUNTROWS ( 'Table' ), ALLEXCEPT ( 'Table', 'Table'[Column1] ) )
The card visual returns the correct row count when I filter by Column1 (any value in the filter pane).
However, it returns the wrong row count when I filter by Column2 = "4" and Column1 = "B" (in the filter pane). It should ignore the filter on Column2, and it does, except when I specifically filter for the value "4". In that case the card visual shows a blank result.
Any ideas why?
Here's the screenshot. I would like to populate that blank cell with "4" (in a single-table data model).
In your case you don't need ALLEXCEPT in your measure. The code below would be fine.
TestMeasure = countrows(Test_Data)
I am assuming that you have a data model like the following.
Table _dim1:
colA
----
A
B
C
Table _dim2:
colB
----
1
2
3
4
5
Table _fact:
colA | colB
-----------
A | 1
A | 2
A | 3
A | 4
A | 5
B | 1
B | 2
B | 3
B | 5
C | 2
C | 3
If you have this, you can get the result you need with the following measures:
Measure3 =
CALCULATE ( COUNTROWS ( _fact ), ALL ( _dim2[colB] ), VALUES ( _fact[colA] ) )
Measure9 =
VAR _1 =
MAX ( _dim2[colB] )
VAR _2 =
CALCULATE (
MAXX (
FILTER ( _dim2, _dim2[colB] <= _1 ),
LASTNONBLANKVALUE ( _dim2[colB], [Measure3] )
),
ALL ( _dim2[colB] )
)
RETURN
_2
Measure10 =
VAR _1 =
MAX ( _dim2[colB] )
VAR _2 =
CALCULATE (
MAXX (
FILTER ( _dim2, _dim2[colB] > _1 ),
FIRSTNONBLANKVALUE ( _dim2[colB], [Measure3] )
),
ALL ( _dim2[colB] )
)
RETURN
IF ( ISBLANK ( [Measure9] ) = TRUE (), _2, [Measure9] )
I don't think you can get there from a single table like the following:
colA | colB
-----------
A | 1
A | 2
A | 3
A | 4
A | 5
B | 1
B | 2
B | 3
B | 5
C | 2
C | 3

PowerBI: How to get distinct count for a column in a table, while grouping for many columns separately

I have a table with multiple date columns and a single label column, as shown by the following code:
Data = DATATABLE (
"Date1", DATETIME,
"Date2", DATETIME,
"Label", STRING,
{
{ "2020-01-01","2020-01-02", "A" },
{ "2020-01-01","2020-01-01", "A" },
{ "2020-01-01","2020-01-02", "B" },
{ "2020-01-01","2020-01-01", "D" },
{ "2020-01-01","2020-01-02", "E" },
{ "2020-01-02","2020-01-01", "A" },
{ "2020-01-02","2020-01-02", "B" },
{ "2020-01-02","2020-01-01", "C" }
}
)
I want to plot a chart of count of distinct labels for each day, when considering date1, as well as when considering date2. These need to be in same plot, as a clustered bar plot, as shown below. This means I need to get the values on a new date column.
The expected result looks like this,
Date | value1 | value2
---------------------------------
1/1/2020 12:00:00 AM | 4 | 3 |
1/2/2020 12:00:00 AM | 3 | 3 |
Current Solution:
I am creating two different tables for each of the counts, as follows
Date1_Count =
ADDCOLUMNS (
ALL ( Data[Date1] ),
"Count",
CALCULATE (
DISTINCTCOUNT ( Data[Label] )
)
)
and
Date2_Count =
ADDCOLUMNS (
ALL ( Data[Date2] ),
"Count",
CALCULATE (
DISTINCTCOUNT ( Data[Label] )
)
)
Then I create a third table with dates as such,
Final_Counts = CALENDAR("2020-01-01", "2020-01-04")
Next, I add relationships between the date columns of the three tables, viz. the Date1_Count table, the Date2_Count table, and the Final_Counts table.
Finally, I combine the data using the RELATED function as follows:
value1 = RELATED(Date1_Count[Count])
value2 = RELATED(Date2_Count[Count])
Question
Is there a simpler solution that does not require creating one table per date column? The current method is not scalable to many date columns.
Assuming you only have a handful of date columns, you just need a single date dimension table and one measure per date column.
Define a date table to use on the x-axis (no relationships to other tables):
DimDate = CALENDAR("2020-01-01", "2020-01-04")
Then define measures that match the various date columns to the date table:
value1 =
CALCULATE (
DISTINCTCOUNT ( Data[Label] ),
Data[Date1] IN VALUES ( DimDate[Date] )
)
and
value2 =
CALCULATE (
DISTINCTCOUNT ( Data[Label] ),
Data[Date2] IN VALUES ( DimDate[Date] )
)
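The logic of these two measures can be checked against the expected result with a small plain-Python sketch (not DAX). The rows are copied from the question's DATATABLE; `distinct_labels` and `dim_date` are hypothetical names, and each chart category is assumed to put exactly one DimDate value in the filter context.

```python
from datetime import date, timedelta

# Rows copied from the question's DATATABLE: (Date1, Date2, Label).
data = [
    (date(2020, 1, 1), date(2020, 1, 2), "A"),
    (date(2020, 1, 1), date(2020, 1, 1), "A"),
    (date(2020, 1, 1), date(2020, 1, 2), "B"),
    (date(2020, 1, 1), date(2020, 1, 1), "D"),
    (date(2020, 1, 1), date(2020, 1, 2), "E"),
    (date(2020, 1, 2), date(2020, 1, 1), "A"),
    (date(2020, 1, 2), date(2020, 1, 2), "B"),
    (date(2020, 1, 2), date(2020, 1, 1), "C"),
]

def distinct_labels(rows, date_index, visible_dates):
    """Count distinct labels among rows whose chosen date column is in visible_dates."""
    return len({row[2] for row in rows if row[date_index] in visible_dates})

# Model of DimDate = CALENDAR("2020-01-01", "2020-01-04").
dim_date = [date(2020, 1, 1) + timedelta(days=i) for i in range(4)]

# One chart category per date: the filter context holds a single DimDate value.
counts = {
    d: (distinct_labels(data, 0, {d}), distinct_labels(data, 1, {d}))
    for d in dim_date
}
for d, (value1, value2) in counts.items():
    print(d, value1, value2)
```

The first two dates reproduce the expected 4 | 3 and 3 | 3. The last two dates give 0 here, whereas DISTINCTCOUNT in DAX returns BLANK when no rows match.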
If you have more than a handful of DateN columns, then you'd probably be best served to reshape your data where you unpivot all those columns.
For just the two you have, the unpivoted data would have one row per original row and date column, with the columns Column, Date, and Label.
In this case, you use Unpivot[Column] as the Legend and only need a single measure:
value =
CALCULATE (
DISTINCTCOUNT ( Unpivot[Label] ),
Unpivot[Date] IN VALUES ( DimDate[Date] )
)
This gives a similar looking result:
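As a rough illustration of the unpivoted shape, here is a plain-Python sketch (not DAX or Power Query). It assumes the Column field holds the original column names "Date1" and "Date2"; the helpers `unpivot` and `distinct_count` are hypothetical names.

```python
from datetime import date

# Wide rows copied from the question: (Date1, Date2, Label).
wide = [
    (date(2020, 1, 1), date(2020, 1, 2), "A"),
    (date(2020, 1, 1), date(2020, 1, 1), "A"),
    (date(2020, 1, 1), date(2020, 1, 2), "B"),
    (date(2020, 1, 1), date(2020, 1, 1), "D"),
    (date(2020, 1, 1), date(2020, 1, 2), "E"),
    (date(2020, 1, 2), date(2020, 1, 1), "A"),
    (date(2020, 1, 2), date(2020, 1, 2), "B"),
    (date(2020, 1, 2), date(2020, 1, 1), "C"),
]

def unpivot(rows):
    """Turn each wide row into one (column_name, date, label) row per date column."""
    long_rows = []
    for date1, date2, label in rows:
        long_rows.append(("Date1", date1, label))  # assumed Column value
        long_rows.append(("Date2", date2, label))  # assumed Column value
    return long_rows

def distinct_count(long_rows, column_name, day):
    """Single measure: distinct labels for one legend entry and one date."""
    return len({label for col, d, label in long_rows if col == column_name and d == day})

long_rows = unpivot(wide)
print(len(long_rows))
print(distinct_count(long_rows, "Date1", date(2020, 1, 1)))
print(distinct_count(long_rows, "Date2", date(2020, 1, 2)))
```

Adding a third date column only adds rows to the unpivoted table. The counting function stays the same, which is the scalability benefit of this layout.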
It is possible to obtain the Final_Counts calculated table in one step, using ADDCOLUMNS to iterate over Data[Date1] and calculating Value1 as the DISTINCTCOUNT over the Data table filtered on the currently iterated Date1.
This works thanks to the CALCULATE statement, which triggers a context transition.
Obtaining Value2 requires creating a new filter context over Date2 using the currently iterated Date1.
First we save the current Date1 in a variable to be used inside CALCULATE in the filter expression on Date2.
We also need REMOVEFILTERS( Data ) to remove the filter context over Date1 set by the context transition.
Final_Counts =
ADDCOLUMNS(
ALL( Data[Date1] ),
"Value1",
CALCULATE(
DISTINCTCOUNT( Data[Label] )
),
"Value2",
VAR CurrentDate = Data[Date1]
RETURN
CALCULATE(
DISTINCTCOUNT( Data[Label] ),
REMOVEFILTERS( Data ),
Data[Date2] = CurrentDate
)
)

Power BI if condition if true then column with date value else NULL

The below table has 2 columns
Where Column A is a Date column and Column B is a Text column where some values are equal to "x" and some are blank.
I need to create an output column which is based on the formula below:
IF (
AND ( ColumnA < EOMONTH ( ColumnA, 3 ), ( ColumnB = "x" ) ),
EOMONTH ( ColumnA, 3 ),
"-"
)
I have written the following DAX formula for it:
Output =
IF (
AND (
ColumnA
< EOMONTH ( DATE ( YEAR ( ColumnA ), MONTH ( ColumnA ), DAY ( ColumnA ) ), 3 ),
( ColumnB = "x" )
),
EOMONTH ( ColumnA, 3 ),
"-"
)
I'm getting an error with this formula saying that NULL is not allowed in this context.
Note: We can leave Blank in place of "x".
How do I write the correct DAX formula to achieve the above?
The problem with your calculation is that you are mixing different data types in the same column.
The Output column is trying to hold both date values and text values, which is why you are getting an error. A column can hold dates or text, but not both at the same time.
To fix your calculation, you need to change the ELSE branch from "-" to BLANK().
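The same idea can be sketched in plain Python (not DAX): every branch returns either a date or an empty value, never text. The function `eomonth` below is a hand-written stand-in for DAX's EOMONTH, `None` stands in for BLANK(), and the sample rows are made up for illustration.

```python
import calendar
from datetime import date

def eomonth(start, months):
    """Last day of the month that is `months` months after `start` (stand-in for EOMONTH)."""
    index = start.year * 12 + (start.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    return date(year, month, calendar.monthrange(year, month)[1])

def output(column_a, column_b):
    """Date three month-ends ahead when column_b is "x", otherwise an empty value."""
    # column_a < eomonth(column_a, 3) always holds, so only column_b decides.
    if column_a < eomonth(column_a, 3) and column_b == "x":
        return eomonth(column_a, 3)
    return None  # stands in for BLANK(), keeping the column date-typed

# Hypothetical sample rows: (ColumnA, ColumnB).
rows = [(date(2021, 1, 15), "x"), (date(2021, 11, 30), ""), (date(2021, 11, 30), "x")]
results = [output(a, b) for a, b in rows]
print(results)
```

Note that a date is always earlier than the end of the month three months later, so the first condition never changes the outcome in this sketch.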

Intersection of Customer product purchase (powerBI)

I need help with producing a count of the intersections between customers and which items they have purchased. For example, if there are 5 products, a customer can purchase any single product or any combination of the 5. Customers can also re-purchase a product at any date - this is where my problem arises as an end user wants to be able to see the intersections for any selected date range.
I have managed to come up with a solution which includes the use of parameters but this is not ideal as the end user does not have access to change any parameters of the report.
I'm open to any solution that does not involve parameters, ideally a slicer with dates would be the best solution
The fields I have on the table are customer_ID, date_ID, and product
Example Data
customer_id date_id product
1 9/11/2018 A
1 10/11/2018 A
1 10/11/2018 B
1 11/11/2018 C
1 11/11/2018 A
2 9/11/2018 C
2 10/11/2018 D
2 11/11/2018 E
2 11/11/2018 A
3 10/11/2018 A
3 10/11/2018 B
3 11/11/2018 A
3 11/11/2018 B
3 11/11/2018 B
4 10/11/2018 A
4 11/11/2018 A
5 9/11/2018 A
5 10/11/2018 B
5 10/11/2018 E
5 10/11/2018 D
5 11/11/2018 C
5 11/11/2018 A
6 9/11/2018 A
6 10/11/2018 A
6 11/11/2018 A
Possible output with different slicer selections
Any help at all would be greatly appreciated
This is pretty tricky since I can't think of a way to use the values of a dynamically calculated table as a field in a visual. (You can create calculated tables, but those aren't responsive to slicers. You can also create dynamically calculated tables inside of a measure, but measures don't return tables, just single values.)
The only way I can think of to do this requires creating a table for every possible product combination. However, if you have N products, then this table has 2^N rows and that blows up fast.
Here's a calculated table that will output all the combinations:
Table2 =
VAR N = DISTINCTCOUNT(Table1[product])
VAR Products = SUMMARIZE(Table1,
Table1[product],
"Rank",
RANKX(ALL(Table1),
Table1[product],
MAX(Table1[product]),
ASC,
Dense
)
)
VAR Bits = SELECTCOLUMNS(GENERATESERIES(1, N), "Bit", [Value])
VAR BinaryString =
ADDCOLUMNS(
GENERATESERIES(1, 2^N),
"Binary",
CONCATENATEX(
Bits,
MOD( TRUNC( [Value] / POWER(2, [Bit]-1) ), 2)
,,[Bit]
,DESC
)
)
RETURN
ADDCOLUMNS(
BinaryString,
"Combination",
CONCATENATEX(Products, IF(MID([Binary],[Rank],1) = "1", [product], ""), "")
)
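The bitmask idea behind this table can be sketched in plain Python (not DAX). This is a simplified model, not a line-by-line translation: here bit i of the counter simply decides whether the i-th product is included, whereas the DAX version reads the bits out of a text string. The function name `all_combinations` is hypothetical.

```python
# Simplified model: bit i of the counter decides whether products[i] is included.
products = ["A", "B", "C", "D", "E"]  # distinct products from the sample data

def all_combinations(products):
    """Return one concatenated combination per counter value in 1..2^N."""
    n = len(products)
    combos = []
    for value in range(1, 2 ** n + 1):  # mirrors GENERATESERIES(1, 2^N)
        picked = [p for i, p in enumerate(products) if (value >> i) & 1]
        combos.append("".join(picked))
    return combos

combos = all_combinations(products)
print(len(combos))
print(combos[:4])
```

With five products this yields 32 rows, and the last counter value (2^N) has no low bits set, so it produces the empty combination. Each additional product doubles the row count, which is why the approach does not scale.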
Then add a calculated column to get the comma-delimited version:
Delimited =
VAR Length = LEN(Table2[Combination])
RETURN
CONCATENATEX(
GENERATESERIES(1,Length),
MID(Table2[Combination], [Value], 1),
","
)
If you put Delimited in the Rows section of a matrix visual and the following measure in the Values section:
customers =
VAR Summary = SUMMARIZE(Table1,
Table1[customer_id],
"ProductList",
CONCATENATEX(VALUES(Table1[product]), Table1[product], ","))
RETURN SUMX(Summary, IF([ProductList] = MAX(Table2[Delimited]), 1, 0))
And filter out any 0 customer values, you should get something like this:
So yeah... not a great solution, especially when N gets big, but maybe better than nothing?
Edit:
In order to work for longer product names, let's use a delimiter in the Combination concatenation:
CONCATENATEX(Products, IF(MID([Binary],[Rank],1) = "1", [product], ""), ",")
(Note the "" to "," change at the end.)
And then rewrite the Delimited calculated column to remove excess commas.
Delimited =
VAR RemoveMultipleCommas =
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(
SUBSTITUTE(Table2[Combination], ",,", ","),
",,", ","),
",,", ","),
",,", ",")
VAR LeftComma = (LEFT(Table2[Combination]) = ",")
VAR RightComma = (RIGHT(Table2[Combination]) = ",")
RETURN
IF(RemoveMultipleCommas <> ",",
MID(RemoveMultipleCommas,
1 + LeftComma,
LEN(RemoveMultipleCommas) - RightComma - LeftComma
), "")
Finally, let's modify the customers measure a bit so it can subtotal.
customers =
VAR Summary = SUMMARIZE(Table1,
Table1[customer_id],
"ProductList",
CONCATENATEX(VALUES(Table1[product]), Table1[product], ","))
VAR CustomerCount = SUMX(Summary, IF([ProductList] = MAX(Table2[Delimited]), 1, 0))
VAR Total = IF(ISFILTERED(Table2[Delimited]), CustomerCount, COUNTROWS(Summary))
RETURN IF(Total = 0, BLANK(), Total)
The Total variable gives the total customer count for the total. Note that I've also set zeros to return as blank so that you don't need to filter out zeros (it will automatically hide those rows).
You can also try this measure to calculate the result.
[Count Of Customers] :=
VAR var_products_selection_count = DISTINCTCOUNT ( Sales[product] )
VAR var_customers = VALUES ( Sales[customer_id] )
VAR var_customers_products_count =
ADDCOLUMNS(
var_customers,
"products_count",
VAR var_products_count =
COUNTROWS (
FILTER (
CALCULATETABLE ( VALUES ( Sales[product] ) ),
CONTAINS (
Sales,
Sales[product],
Sales[product]
)
)
)
RETURN var_products_count
)
RETURN
COUNTROWS (
FILTER (
var_customers_products_count,
[products_count] = var_products_selection_count
)
)
I think I've found a better solution/workaround that doesn't require precomputing all possible combinations. The key is to use a rank/index as a base column and then build off of that.
Since the customer_id is already nicely indexed starting from 1 with no gaps, in this case, I will use that, but if it weren't, then you'd want to create an index column to use instead. Note that there cannot be more distinct product combinations within a given filter context than there are customers since each customer only has a single combination.
For each index/rank we want to find the product combination that is associated with it and the number of customers for that combination.
ProductCombo =
VAR PerCustomer =
SUMMARIZE (
ALLSELECTED ( Table1 ),
Table1[customer_id],
"ProductList",
CONCATENATEX ( VALUES ( Table1[product] ), Table1[product], "," )
)
VAR ProductSummary =
SUMMARIZE (
PerCustomer,
[ProductList],
"Customers",
DISTINCTCOUNT ( Table1[customer_id] )
)
VAR Ranked =
ADDCOLUMNS (
ProductSummary,
"Rank",
RANKX (
ProductSummary,
[Customers] + (1 - 1 / RANKX ( ProductSummary, [ProductList] ) )
)
)
VAR CurrID =
SELECTEDVALUE ( Table1[customer_id] )
RETURN
MAXX ( FILTER ( Ranked, [Rank] = CurrID ), [ProductList] )
What this does is first create a summary table that computes the product list for each customer.
Then you take that table and summarize over the distinct product lists, counting the number of customers that have each particular combination.
Then I add a ranking column to the previous table ordering first by the number of customers and tiebreaking using a dictionary order of the product list.
Finally, I extract the product list from this table where the rank matches the index/rank of the current row.
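The four steps above can be sketched in plain Python (not DAX). The purchases are taken from the sample data with dates dropped, so the whole table is assumed to be selected. Product lists are sorted alphabetically and ties are broken by Python's string order, both of which are assumptions that may not match DAX's ordering exactly. The function name `ranked_combos` is hypothetical.

```python
from collections import Counter

# customer_id -> purchased products (one letter each), from the sample data.
# Dates are omitted, so the whole table is assumed to be selected.
purchases = {1: "AABCA", 2: "CDEA", 3: "ABABB", 4: "AA", 5: "ABEDCA", 6: "AAA"}

def ranked_combos(purchases):
    """Map rank (1-based) to (product list, customer count)."""
    # Step 1: one sorted, de-duplicated product list per customer.
    per_customer = {c: ",".join(sorted(set(p))) for c, p in purchases.items()}
    # Step 2: number of customers per distinct product list.
    summary = Counter(per_customer.values())
    # Step 3: order by customer count (descending), ties by list text (ascending).
    ordered = sorted(summary.items(), key=lambda item: (-item[1], item[0]))
    # Step 4: the row whose index equals the rank picks up that entry.
    return {rank: entry for rank, entry in enumerate(ordered, start=1)}

ranked = ranked_combos(purchases)
for rank, (combo, customers) in ranked.items():
    print(rank, combo, customers)
```

There are six customers but only five distinct combinations, so index 6 has no matching rank. In the measure, that row would come back blank.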
You could do a nearly identical measure for the customer count, but here's the measure I used that's a bit simpler and handles 0 values and the total:
Customers =
VAR PerCustomer =
SUMMARIZE (
ALLSELECTED ( Table1 ),
Table1[customer_id],
"ProductList",
CONCATENATEX ( VALUES ( Table1[product] ), Table1[product], "," )
)
VAR ProductCombo = [ProductCombo]
VAR CustomerCount =
SUMX ( PerCustomer, IF ( [ProductList] = ProductCombo, 1, 0 ) )
RETURN
IF (
ISFILTERED ( Table1[customer_id] ),
IF ( CustomerCount = 0, BLANK (), CustomerCount ),
DISTINCTCOUNT ( Table1[customer_id] )
)
The result looks like this