DAX Query to Get Distinct Items from Multiple Tables - powerbi

Problem
I'm trying to generate a table of distinct email addresses from multiple source tables. However, with the UNION statement on the outer part of the statement, it isn't generating a truly distinct list.
Code
Participants = UNION(DISTINCT('Registrations'[Email Address]), DISTINCT( 'EnteredTickets'[Email]))
*Note that while I'm starting with just two source tables, I need to expand this to 3 or 4 by the end of it.

A combination of using VALUES on the table selects plus wrapping the whole statement in one more DISTINCT did the trick:
Participants = DISTINCT(UNION(VALUES('Registrations'[Email Address]), VALUES( 'EnteredTickets'[Email])))
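As a rough analogy for why the outer DISTINCT matters, here is a minimal sketch in Python using the standard sqlite3 module. It assumes that DAX's UNION keeps duplicate rows the way SQL's UNION ALL does; the lower-cased table names, the single email column, and the sample addresses are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE registrations (email TEXT);
    CREATE TABLE entered_tickets (email TEXT);
    INSERT INTO registrations VALUES
        ('ann@example.com'), ('bob@example.com'), ('ann@example.com');
    INSERT INTO entered_tickets VALUES
        ('bob@example.com'), ('cat@example.com');
""")

# Deduplicating each side first still leaves duplicates that span both tables.
inner_only = conn.execute("""
    SELECT email FROM (SELECT DISTINCT email FROM registrations)
    UNION ALL
    SELECT email FROM (SELECT DISTINCT email FROM entered_tickets)
""").fetchall()

# Deduplicating after the union removes them.
outer_distinct = conn.execute("""
    SELECT DISTINCT email FROM (
        SELECT email FROM registrations
        UNION ALL
        SELECT email FROM entered_tickets
    )
""").fetchall()

inner_emails = sorted(row[0] for row in inner_only)
outer_emails = sorted(row[0] for row in outer_distinct)
print(inner_emails)  # bob@example.com appears twice
print(outer_emails)  # each address appears once
```

The same reasoning carries over when a third or fourth source table is added: only the outermost deduplication sees all rows at once.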

If you want a bridge table with unique values for all different tables, use DISTINCT instead of VALUES:
Participants =
FILTER (
    DISTINCT (
        UNION (
            TOPN ( 0, ROW ( "NiceEmail", "asdf" ) ), -- adds zero rows table with nice new column name
            DISTINCT ( 'Registrations'[Email Address] ),
            DISTINCT ( 'EnteredTickets'[Email] )
        )
    ),
    [NiceEmail] <> BLANK () -- removes all blank emails
)
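DISTINCT and VALUES may lead to different results. Essentially, VALUES also returns the blank row that is added to a table when a relationship is invalid (a related table holds values with no match), whereas DISTINCT does not, so with VALUES you are likely to end up with an (unwanted) blank value in your list.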
Check this documentation:
https://learn.microsoft.com/en-us/dax/values-function-dax#related-functions
You might also like information under this link which you can use to get a specific column name for your table of distinct values:
DAX create empty table with specific column names and no rows

Related

Can I add a filter to a SELECTCOLUMNS so that I can use two different table depending on the filter in DAX

I am using DAX Studio and I would like to add a filter to the table argument of SELECTCOLUMNS so that it uses one of two different tables depending on the result of the filter's expression.
In other words, what I would like to do is similar to the following:
DEFINE
    VAR cond_table =
        SELECTCOLUMNS(
            IF(#param1="1",TABLE1,TABLE2),
            "column1",[column1],
            "column2",[column2]
        )
Thank you kindly
There is a workaround for this problem, though it might not suit everyone: add a column to both tables containing a boolean that is set to true for table1 and false for table2. Then (if your tables contain the same columns, as mine do) build a table that is the union of both tables, and apply the IF condition in a filter on the added column.

Sqlite Query to remove duplicates from one column. Removal depends on the second column

Please have a look at the following data example:
In this table, I have multiple columns and no PRIMARY KEY. As the attached image shows, there are a few duplicates in STK_CODE. I want to remove the duplicate rows depending on the (min) column.
According to the image, one stk_code has three different rows. The value in the (min) column is different for each of these duplicate stk_codes, and I want to keep the row which has the minimum value in the (min) column.
I am very new to SQLite and I am linking against it (-lsqlite3) to use it from C++.
Is there any way possible?
Your table has rowid as primary key.
Use it to get the rowids that you don't want to delete:
DELETE FROM comparison
WHERE rowid NOT IN (
    SELECT rowid
    FROM comparison
    GROUP BY STK_CODE
    HAVING (COUNT(*) = 1 OR MIN(CASE WHEN min > 0 THEN min END))
)
This code uses rowid as a bare column and relies on a documented feature of SQLite: when a query uses the MIN() or MAX() aggregate function, the bare columns take their values from the row that contains the min or max value.
See a simplified demo.
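To make the bare-column behaviour concrete, here is a minimal sketch in Python with the standard sqlite3 module. It is a simplified two-step variant, not the exact DELETE above: the note column and the sample rows are invented for illustration, and the HAVING condition on positive values is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE comparison (STK_CODE TEXT, "min" INTEGER, note TEXT)')
conn.executemany(
    "INSERT INTO comparison VALUES (?, ?, ?)",
    [("A", 5, "a-high"), ("A", 3, "a-low"), ("A", 7, "a-top"), ("B", 2, "b-only")],
)

# rowid and note are bare columns; because MIN() is in the result set,
# SQLite fills them from the row that holds the minimum of each group.
survivors = conn.execute(
    'SELECT rowid, STK_CODE, note, MIN("min") FROM comparison GROUP BY STK_CODE'
).fetchall()
keep_ids = [row[0] for row in survivors]

# Delete every row whose rowid is not one of the per-group minimum rows.
placeholders = ",".join("?" * len(keep_ids))
conn.execute(f"DELETE FROM comparison WHERE rowid NOT IN ({placeholders})", keep_ids)

remaining = conn.execute(
    'SELECT STK_CODE, "min", note FROM comparison ORDER BY STK_CODE'
).fetchall()
print(remaining)
```

The sample data deliberately has no ties on the minimum value; when two rows of a group share the minimum, which of them supplies the bare columns is not something this sketch controls.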

Remove duplicates based on sort

I have a customers table with IDs and some datetime columns. But those IDs have duplicates and I just want to analyse distinct ID values.
I tried using groupby but this makes the process very slow.
Due to data sensitivity can't share it.
Any suggestions would be helpful.
I'd suggest using ROW_NUMBER(). This lets you rank the rows by chosen columns, and you can then pick out the first result.
Given you've shared no data or table and column names, here's an example based on the AdventureWorks database. The technique will be the same: you partition by whatever makes the group of rows you want to deduplicate unique (ProductKey below) and order in a way that puts the version you want to keep first (children, birth date and customer key in my example).
USE AdventureWorksDW2017;
WITH CustomersOrdered AS
(
    SELECT S.ProductKey, C.CustomerKey, C.TotalChildren, C.BirthDate
        , ROW_NUMBER() OVER (
            PARTITION BY S.ProductKey
            ORDER BY C.TotalChildren DESC, C.BirthDate DESC, C.CustomerKey ASC
        ) AS CustomerSequence
    FROM dbo.FactInternetSales AS S
    INNER JOIN dbo.DimCustomer AS C
        ON S.CustomerKey = C.CustomerKey
)
SELECT ProductKey, CustomerKey
FROM CustomersOrdered
WHERE CustomerSequence = 1
ORDER BY ProductKey, CustomerKey;
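Applied to the shape described in the question, the same pattern might look like the sketch below, run through Python's sqlite3 module so it is self-contained. The customers table, its id and seen_at columns, and the sample rows are hypothetical, and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, seen_at TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        (1, "2021-01-05"),
        (1, "2021-03-01"),
        (2, "2020-12-31"),
        (3, "2021-02-10"),
        (3, "2021-02-09"),
    ],
)

# Number the rows within each id, newest first, then keep only row 1.
latest_per_id = conn.execute("""
    WITH ranked AS (
        SELECT id, seen_at,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY seen_at DESC) AS rn
        FROM customers
    )
    SELECT id, seen_at
    FROM ranked
    WHERE rn = 1
    ORDER BY id
""").fetchall()
print(latest_per_id)
```

Ordering by seen_at DESC keeps the most recent row per id; switching to ASC would keep the earliest instead.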
You can also just sort the rows by the date column, then select the ID column and remove duplicates.

DAX Query Union Multiple Tables & Return Distinct

I have two tables (CompletedJobs & ScriptDetails) and using DAX, I want to return distinct Names that appear in CompletedJobs that do not appear in ScriptDetails.
Here is my SQL query. It works and returns values.
Select Distinct CJ.[Name]
From CompletedJobs CJ
Left Join ScriptDetails SD
ON CJ.[Name]=SD.ActivityName
Where SD.ActivityName IS NULL
I started with creating the following DAX query, but just doing this, I get the following error message:
"A table of multiple values was supplied where a single value was expected"
AdhocJobs = DISTINCT(UNION(SELECTCOLUMNS(CompletedJobs,"Name",CompletedJobs[Name]),SELECTCOLUMNS(ScriptDetails,"Name",ScriptDetails[ActivityName])))
How do I create a DAX query that would replicate the SQL query?
Rather than recreate your SQL, there is DAX that already addresses your specific use case. The EXCEPT function returns a table where rows from the LEFT SIDE table do not appear in the RIGHT SIDE table.
EVALUATE
DISTINCT (
    EXCEPT (
        SUMMARIZE ( CompletedJobs, CompletedJobs[Name] ),
        SUMMARIZE ( ScriptDetails, ScriptDetails[ActivityName] )
    )
)
In this case I used SUMMARIZE to reduce each table down to one column, and then wrapped them with EXCEPT to take only the Names from CompletedJobs that aren't ActivityNames in ScriptDetails.
Hope it helps.
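For readers coming from SQL, the sketch below (Python's sqlite3 module, with made-up job names) checks on a small sample that the LEFT JOIN ... IS NULL query from the question and a set difference return the same names. SQL's EXCEPT is used here only as an analogy for the DAX function, and the sample contains no NULL names, where the two SQL forms can differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CompletedJobs (Name TEXT);
    CREATE TABLE ScriptDetails (ActivityName TEXT);
    INSERT INTO CompletedJobs VALUES ('backup'), ('backup'), ('report'), ('cleanup');
    INSERT INTO ScriptDetails VALUES ('report'), ('deploy');
""")

# The anti-join from the question: names with no matching ActivityName.
anti_join = conn.execute("""
    SELECT DISTINCT CJ.Name
    FROM CompletedJobs CJ
    LEFT JOIN ScriptDetails SD ON CJ.Name = SD.ActivityName
    WHERE SD.ActivityName IS NULL
""").fetchall()

# The set-difference formulation.
set_difference = conn.execute("""
    SELECT Name FROM CompletedJobs
    EXCEPT
    SELECT ActivityName FROM ScriptDetails
""").fetchall()

anti_join_names = sorted(row[0] for row in anti_join)
except_names = sorted(row[0] for row in set_difference)
print(anti_join_names)
print(except_names)
```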

How to substitute NULL with value in Power BI when joining one to many

In my model I have a table AssignedToUser that doesn't contain any NULL values.
Table TotalCounts contains the number of tasks for each user.
Those two tables are joined on UserGUID, and table TotalCounts contains a NULL value for UserGUID.
When I drop everything into one table visual, there is a NULL value for AssignedToUser.
How can I substitute "POOL" for the NULL value of AssignedToUser?
Under Edit Queries I tried to create an additional column:
if [AssignedToUser] = null then "POOL" else [AssignedToUser]
But that didn't help.
UPDATE:
Thanks Alexis.
I have created the FullAssignedToUsers table, but when I try to create a relationship with TotalCounts on UserGUID, it doesn't like it.
Data in the new table looks like this:
UPDATE:
The .pbix file can be accessed here:
https://www.dropbox.com/s/95frggpaq6tce7q/User%20Open%20Closed%20Tasks%20Experiment.pbix?dl=0
I believe the problem here is that your join has UserGUID values that are not in your AssignedToUsers table.
To correct this, one possibility is to replace your AssignedToUsers table with one that contains all the UserGUID values from the TotalCounts table.
FullAssignedToUsers =
ADDCOLUMNS(VALUES(TotalCounts[UserGUID]),
    "AssignedToUser",
    LOOKUPVALUE(AssignedToUsers[AssignedToUser],
        AssignedToUsers[UserGUID], TotalCounts[UserGUID]))
This should get you the outer join: every UserGUID from TotalCounts, with a blank AssignedToUser where there is no match. You can then create the custom column like you described in the table and use that column in your visual.
You'll probably want to break the relationships with the original AssignedToUsers table and create relationships with the new one instead.
If you don't want to take that extra step, you can do an ISBLANK inside your new table formula.
FullAssignedToUsers =
ADDCOLUMNS(VALUES(TotalCounts[UserGUID]),
    "AssignedToUser",
    IF(
        ISBLANK(
            LOOKUPVALUE(AssignedToUsers[AssignedToUser],
                AssignedToUsers[UserGUID], TotalCounts[UserGUID])),
        "POOL",
        LOOKUPVALUE(AssignedToUsers[AssignedToUser],
            AssignedToUsers[UserGUID], TotalCounts[UserGUID])))
Note: This is equivalent to doing a right outer join merge on the AssignedToUsers table in the query editor and then replacing the nulls with "POOL". I'd actually recommend approaching it that way instead.
Another way to approach it is to pull the AssignedToUser column over to the TotalCounts table in a custom column and use that column in your visual instead.
AssignedToUsers =
IF(ISBLANK(RELATED(AssignedToUsers[AssignedToUser])),
    "POOL",
    RELATED(AssignedToUsers[AssignedToUser]))
This is equivalent to doing a left outer join merge on the TotalCounts table in the query editor, expanding the AssignedToUser column, then replacing nulls with "POOL" in that column.
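The left-outer-join equivalence can be sketched in SQL terms as below, run through Python's sqlite3 module. The Tasks column, the user names, and the GUID values are invented for illustration; COALESCE stands in for the "replace nulls with POOL" step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AssignedToUsers (UserGUID TEXT, AssignedToUser TEXT);
    CREATE TABLE TotalCounts (UserGUID TEXT, Tasks INTEGER);
    INSERT INTO AssignedToUsers VALUES ('u1', 'Ann'), ('u2', 'Bob');
    INSERT INTO TotalCounts VALUES ('u1', 4), ('u2', 2), (NULL, 7);
""")

# Keep every TotalCounts row; rows without a matching user get 'POOL'.
rows = conn.execute("""
    SELECT COALESCE(A.AssignedToUser, 'POOL') AS AssignedToUser, T.Tasks
    FROM TotalCounts T
    LEFT JOIN AssignedToUsers A ON T.UserGUID = A.UserGUID
    ORDER BY T.Tasks
""").fetchall()
print(rows)
```

The row with a NULL UserGUID never matches in the join, so its AssignedToUser comes back NULL and is replaced by the default.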
In DAX, missing values are BLANK(), not null. Try this:
=IF(
    ISBLANK(AssignedToUsers[AssignedToUser]),
    "Pool",
    AssignedToUsers[AssignedToUser]
)