DAX Query Union Multiple Tables & Return Distinct - powerbi

I have two tables (CompletedJobs & ScriptDetails) and using DAX, I want to return distinct Names that appear in CompletedJobs that do not appear in ScriptDetails.
Here is my SQL Query. Works and return values.
Select Distinct CJ.[Name]
From CompletedJobs CJ
Left Join ScriptDetails SD
ON CJ.[Name]=SD.ActivityName
Where SD.ActivityName IS NULL
I started with creating the following DAX query, but just doing this, I get the following error message:
"A table of multiple values was supplied where a single value was expected"
AdhocJobs = DISTINCT(UNION(SELECTCOLUMNS(CompletedJobs,"Name",CompletedJobs[Name]),SELECTCOLUMNS(ScriptDetails,"Name",ScriptDetails[ActivityName])))
How do I create a DAX query that would replicate the SQL query?

Rather than recreate your SQL, there is DAX that already addresses your specific use case. The EXCEPT function returns a table where rows from the LEFT SIDE table do not appear in the RIGHT SIDE table.
EVALUATE
DISTINCT (
EXCEPT (
SUMMARIZE ( CompletedJobs , CompletedJobs [Name]),
SUMMARIZE ( ScriptDetails , ScriptDetails [ActivityName])
)
)
In this case I used SUMMARIZE to reduce each table down to one column, and then wrapped them with EXCEPT to take only the Names from Completed Jobs that aren't ActivityNames in ScriptDetails.
Hope it helps.

Related

Why using ALLSELECTED() AND FILTER() together?

I am trying to understand the below DAX code:
CALCULATE(
SUM(Revenue[Net])
,FILTER('Center', NOT 'Center'[Acc] IN {"RSM", BLANK() })
,ALLSELECTED()
,VALUES('Customer'[Customer Number])
)
I have the below questions:
What's the use of ALLSELECTED?? By definition ALLSELECTED returns all rows in a table, ignoring any filters that might have been applied inside the query, but keeping filters that come from outside. https://dax.guide/allselected/
So, what's the point of writing FILTER() if its going to be forced to be ignored by the next line (ALLSELECTED)?!?
Also by definition:
CALCULATE is just a expresion followed by filters...
What's the use of VALUES() ? It doesn't appear to be a filter, so how is it even allowed to appear there? (Per definition VALUES(): returns a one-column table that contains the distinct values from the specified column.)
I do not understand what is this returning? is it the SUM() or the VALUES()?
(I come from a SQL background, so any sql-friendly answer is appreciated).
In Dax every filter is a table of values its look similar to INNER JOIN;
ALLSELECTED is useful when you need to keep a row context (this is also a filter in DAX). You can use ALLSELECTED inside FILTER function.
For better understand what engine does you can use a DaxStudio with ServerTiming;
As you see this product one simple statement:
SELECT
SUM ( 'Table'[Cost Centre] )
FROM 'Table'
WHERE
'Table'[Project] NIN ( 'AB' ) ;
You can find useful article by Alberto Ferrari and Marco Russo:
https://www.sqlbi.com/tv/auto-exist-on-clusters-or-numbers-unplugged-22/
If it is only converting DAX queries if your connecting your sql analysis services to in DAX studio. because it is not working for PBI file data

Can I add a filter to a SELECTCOLUMNS so that I can use two different table depending on the filter in DAX

I am using DAX Studio and I would like to add a filter to the table field in SELECTCOLUMNS so that it using two different table depending on the filter's expression result.
In other words what I would like to do is similar to the following :
DEFINE
VAR cond_talble =
SELECTCOLUMNS(
IF(#param1="1",TABLE1,TABLE2),
"column1",[column1],
"column2",[column2]
)
Thank you kindly
there is a work around for this problem but might not be a good one for everyone, and it's to add a column in both tables containing a boolean that is set to true for table1 and 0 for table2 and then (if your trables contains the same columns like me) get a table that is the fruit of the union of both tables and add an if condition on a filter so that you filter with the added column

Combine columns from different tables to make one table Power BI DAX

I have three different tables. I want to select different columns from each of the tables and create one table based on some filters. Have a look at the following dax expression:
FILTER(DISTINCT(SELECTCOLUMNS(Test_Table,"site_key",[site_key],"is_active",[is_active])),[is_active]=TRUE&&[dbsource]=="DB2")
As you can see, I've selected olumns from Test_Table.
Firstly, How can I add columns from the other two tables?
Secondly, please note that these tables are related so I've created a relationship between them based on relevant IDs.
Will I have to create a natural join in DAX as well?
As I mentioned in the comment, SUMMARIZECOLUMNS can probably get you what you're looking for. It might look something like this in your case:
SUMMARIZECOLUMNS (
Test_Table[site_key],
Test_Table_2[industry_group],
Test_Table_2[country],
Test_Table_2[state],
FILTER ( Test_Table, [is_active] = TRUE && [dbsource] = "DB2" )
)

How to JOIN summarized data from two queries into new table in DAX Power BI

I have 2 queries:
Premium:
and Losses:
How can I simply summarize data from Premium query and LEFT JOIN it to summarized data in Losses query using DAX?
In SQL it would be like that:
declare #PremiumTable table (PolicyNumber varchar(50), Premium money)
insert into #PremiumTable values
('Pol1', 100),
('Pol1', 50),
('Pol2', 300),
('Pol3', 500),
('Pol3', 200),
('Pol4',400)
declare #LossesTable table (PolicyNumber varchar(50), Losses money)
insert into #LossesTable values ('Pol1',115),
('Pol1',25),
('Pol2',0),
('Pol3',110),
('Pol3',75)
select p.PolicyNumber,
sum(p.Premium) as Premium,
sum(l.Losses)as Losses
from #PremiumTable p
LEFT JOIN #LossesTable l on p.PolicyNumber = l.PolicyNumber
group by p.PolicyNumber
Result:
I tried using NATURALLEFTOUTERJOIN but it gives me an error:
*An incompatible join column, (''[PolicyNumber]) was detected. 'NATURALLEFTOUTERJOIN' doesn't support joins by using columns with different data types or lineage.*
MyTable =
VAR Premium =
SELECTCOLUMNS(
fact_Premium,
"PolicyNumber",fact_Premium[PolicyNumber],
"Premium", fact_Premium[Premium]
)
VAR Losses =
SELECTCOLUMNS(
fact_Losses,
"PolicyNumber", fact_Losses[PolicyNumber],
"Losses", fact_Losses[PaymentAmount]
)
VAR Result = NATURALLEFTOUTERJOIN(Premium,Losses)
RETURN Result
There are a few interdependent "bugs" or limitations around the use of variables (VAR) and NATURALLEFTOUTERJOIN which makes this a weird case to debug.
Some notable limitations are:
VAR:
Columns in table variables cannot be referenced via
TableName[ColumnName] syntax.
NATURALLEFTOUTERJOIN:
Either:
The relationship between both tables has to be defined before the
join is applied AND the names of the columns that define the
relationship need to be different.
Or:
In order to join two columns with the same name and no relationships,
it is necessary that these columns to have a data lineage.
(I'm a bit confused because the link mentioned do not have a data lineage; while official documentation said only columns from the same source table (have the same lineage) are joined on.)
Come back to this case.
SUMMARIZE should be used instead of SELECTCOLUMNS to obtain summary tables for Premium and Losses, i.e.:
Premium =
SUMMARIZE(
fact_Premium,
fact_Premium[PolicyNumber],
"Premium", SUM(fact_Premium[Premium])
)
Losses =
SUMMARIZE(
fact_Losses,
fact_Losses[PolicyNumber],
"Losses", SUM(fact_Losses[Losses])
)
When we apply NATURALLEFTOUTERJOIN to the above two tables, it'll return error No common join columns detected because of they have no relationship established.
To resolve this, we can make use of TREATAS as suggested in this blog post. But to use TREATAS, we have to reference the column names in Premium and Losses table, so we can't use VAR to declare them, but have to actually instantiate them.
To conclude, the solution would be:
Create calculate tables for Premium and Losses as mentioned above.
Use TREATAS to mimic a data lineage and join Premium table with Losses_TreatAs instead.
MyTable =
VAR Losses_TreatAs = TREATAS(Losses, Premium[PolicyNumber], Losses[Losses])
RETURN NATURALLEFTOUTERJOIN(Premium, Losses_TreatAs)
Results:
There's a sleazy hack that can successfully work around this awful limitation (what were the product designers thinking?).
If you add zeros (e.g. + 0) or concatenate an empty string (e.g. & "") to each join column within SELECTCOLUMNS, it breaks out of the data lineage straitjacket and runs the NATURALLEFTOUTERJOIN just using column names.
You can use this in a Measure to run dynamic logic (based on the query context from filters etc), not just while creating a calculated table.
Here's an tweaked version of your code:
MyTable =
VAR Premium =
SELECTCOLUMNS(
fact_Premium,
"PolicyNumber",fact_Premium[PolicyNumber] & "",
"Premium", fact_Premium[Premium]
)
VAR Losses =
SELECTCOLUMNS(
fact_Losses,
"PolicyNumber", fact_Losses[PolicyNumber] & "",
"Losses", fact_Losses[PaymentAmount]
)
VAR Result = NATURALLEFTOUTERJOIN(Premium,Losses)
RETURN Result
H/T to example #7 on this page, which shows this in code (without really explaining it).
https://www.sqlbi.com/articles/from-sql-to-dax-joining-tables/#code7
Hello I suggest you this way:
in PowerQuery, built up a table with policyNumber like that:
Duplicate Premium table, and remove the premium column on the duplicate. Call it PremiumPol
Duplicate the Losses table, and remove the losses column on duplicate. Call it LossesPol
Then use the button Append Query, to Append PremiumPol and LossesPol. Call it policynumber
Last remove duplicate from the appended tables
Then click on close and Apply
Check that your model is like that:
Then, to add losses and premium on a policy base is trivial, go on and select a table visual and these fields:
the result is like this:
Hope that helps!

DAX Query to Get Distinct Items from Multiple Tables

Problem
I'm trying to generate a table of distinct email addresses from multiple source tables. However, with the UNION statement on the outer part of the statement, it isn't generating a truly distinct list.
Code
Participants = UNION(DISTINCT('Registrations'[Email Address]), DISTINCT( 'EnteredTickets'[Email]))
*Note that while I'm starting with just two source tables, I need to expand this to 3 or 4 by the end of it.
A combination of using VALUES on the table selects plus wrapping the whole statement in one more DISTINCT did the trick:
Participants = DISTINCT(UNION(VALUES('Registrations'[Email Address]), VALUES( 'EnteredTickets'[Email])))
If you want a bridge table with unique values for all different tables, use DISTINCT instead of VALUES:
Participants =
FILTER (
DISTINCT (
UNION (
TOPN ( 0, ROW ("NiceEmail", "asdf") ), -- adds zero rows table with nice new column name
DISTINCT ( 'Registrations'[Email Address] ),
DISTINCT ( 'EnteredTickets'[Email] )
)
),
[NiceEmail] <> BLANK () -- removes all blank emails
)
DISTINCT AND VALUES may lead to different results. Essentially, using VALUES, you are likely to end up with (unwanted) blank value in your list.
Check this documentation:
https://learn.microsoft.com/en-us/dax/values-function-dax#related-functions
You might also like information under this link which you can use to get a specific column name for your table of distinct values:
DAX create empty table with specific column names and no rows