We are having trouble removing unwanted data from a large dataset.
The dataset looks similar to the table below:
Inv_Number | Type | Week | Quarter | Amount | Order
1 | Invoice | W1 | Q1 | 100 | A1233
2 | Invoice | W2 | Q1 | 50 | A100
3 | Invoice | W2 | Q1 | 150 | A567
4 | CR MEMO | W3 | Q2 | -100 | A1233
5 | Invoice | W2 | Q4 | 70 | A345
6 | Invoice | W5 | Q3 | 100 | A1233
7 | CR MEMO | W7 | Q2 | -25 | A100
The expected filtered result should look like:
Type | Week | Quarter | Amount | Order
Invoice | W2 | Q1 | 25 | A100
Invoice | W2 | Q1 | 150 | A567
Invoice | W2 | Q4 | 70 | A345
Invoice | W5 | Q3 | 100 | A1233
Basically, Order is the identifier that links related rows, and we need to remove every CR MEMO row together with the invoice amount it credits (whether the credit is partial or full).
I have tried the following:
HasCredit =
IF (
    CALCULATE (
        SUM ( 'inv'[Amount] ),
        FILTER ( ALL ( 'inv' ), inv[Order] = EARLIER ( inv[Order] ) && inv[Type] = "CR MEMO" )
    )
        + CALCULATE (
            SUM ( inv[Amount] ),
            FILTER ( ALL ( 'inv' ), inv[Order] = EARLIER ( inv[Order] ) && inv[Type] = "ORIGINAL" )
        ) = 0,
    1,
    0
)
Adding the new calculated column to the filter and selecting only 0 (zero) should then give the desired output.
Further explanations:
The dataset contains invoices, and a CR MEMO can be understood as a credited invoice. If one invoice (e.g. Inv_Number = 1) for 100 USD is credited entirely, there will be a new row (e.g. Inv_Number = 4) with Type = CR MEMO. I need to remove those rows, which are linked only by the Order, and present the final output as described above. Keep in mind that one CR MEMO (credited invoice) can cover only part of the amount of the original invoice.
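To make the expected result concrete, here is a minimal Python sketch of the netting logic outside Power BI. It is not a DAX solution. It assumes that credits are applied to the invoices of the same Order in ascending Inv_Number order, which the question does not state; `net_credits` and `ROWS` are names made up for this illustration.

```python
from collections import defaultdict

# Sample rows from the question: (Inv_Number, Type, Week, Quarter, Amount, Order)
ROWS = [
    (1, "Invoice", "W1", "Q1", 100, "A1233"),
    (2, "Invoice", "W2", "Q1", 50, "A100"),
    (3, "Invoice", "W2", "Q1", 150, "A567"),
    (4, "CR MEMO", "W3", "Q2", -100, "A1233"),
    (5, "Invoice", "W2", "Q4", 70, "A345"),
    (6, "Invoice", "W5", "Q3", 100, "A1233"),
    (7, "CR MEMO", "W7", "Q2", -25, "A100"),
]


def net_credits(rows):
    """Drop CR MEMO rows and net their amounts against invoices of the same Order."""
    # Credit still to be absorbed per Order, stored as a positive number.
    credit_left = defaultdict(int)
    for _, kind, _, _, amount, order in rows:
        if kind == "CR MEMO":
            credit_left[order] += -amount

    result = []
    # Assumption: credits hit invoices in ascending Inv_Number order.
    for _, kind, week, quarter, amount, order in sorted(rows):
        if kind != "Invoice":
            continue
        absorbed = min(amount, credit_left[order])
        credit_left[order] -= absorbed
        remaining = amount - absorbed
        if remaining > 0:  # fully credited invoices disappear
            result.append((kind, week, quarter, remaining, order))
    return result


filtered = net_credits(ROWS)
for row in filtered:
    print(" | ".join(str(value) for value in row))
```

With the sample rows this reproduces the four expected lines, including the partially credited A100 invoice with a remaining amount of 25.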
Good day! I have a sample employee table like the one below. I need a DAX formula in Power BI to create a measure that counts the number of direct reports of each employee. For example, the direct report count of GL0001 will be 2 (because GL0001 is the line manager of GL0002 and GL0019, and they report to GL0001), the direct report count of EMP-02023 will be 3, and the direct report count of GL0002 will be 3. Please also help me create measures that count the employees with exactly one direct report and those with fewer than three direct reports.
| Employee ID | Line Manager ID | Layer (of Employee) | Layer (of Line Manager) |
|--------------|-------------------|----------------------|--------------------------|
| EMP-01980 | GL0003 | 4 | 3 |
| EMP-02023 | EMP-02015 | 6 | 5 |
| EMP-01636 | EMP-02015 | 6 | 5 |
| EMP-02138 | EMP-02162 | 6 | 5 |
| EMP-02145 | EMP-01980 | 5 | 4 |
| GL0023 | GL0022 | 5 | 4 |
| GL0001 | | 1 | 0 |
| GL0002 | GL0001 | 2 | 1 |
| GL0003 | GL0002 | 3 | 2 |
| GL0019 | GL0001 | 2 | 1 |
| GL0020 | GL0002 | 3 | 2 |
| GL0024 | GL0002 | 3 | 2 |
| EMP-01918 | EMP-00791 | 9 | 8 |
| EMP-01941 | EMP-00791 | 9 | 8 |
| EMP-02019 | EMP-02156 | 8 | 7 |
| EMP-02024 | EMP-02023 | 7 | 6 |
| EMP-02025 | EMP-02023 | 7 | 6 |
| EMP-03001 | EMP-02023 | 7 | 6 |
Your data doesn't contain an Employee ID row for every Line Manager ID, which means the PATH calculation would not work.
I've assumed your data looks like this:
| Employee ID | Line Manager ID |
|-------------|-----------------|
| 1000001 | |
| 1000002 | 1000001 |
| 1000003 | 1000002 |
| 1000004 | 1000003 |
| 1000005 | 1000004 |
| 1000006 | 1000005 |
| 1000007 | 1000006 |
| 1000008 | 1000007 |
| 1000009 | 1000006 |
| 1000010 | 1000003 |
Using calculated columns, you can compute the path and the path length.
Path
Path = PATH ( 'Table'[Employee ID], 'Table'[Line Manager ID] )
Path Length
Path Length = PATHLENGTH ( [Path] )
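For readers who want to see what these two columns hold, here is a rough Python sketch of the same idea: walk up the manager chain and join the IDs with `|`, top-most ancestor first. The `MANAGER_OF` mapping is a hypothetical stand-in for the assumed table, and `path` and `path_length` are illustrative helpers, not DAX functions.

```python
# Hypothetical employee -> line manager mapping mirroring the assumed table.
MANAGER_OF = {
    1000001: None,
    1000002: 1000001,
    1000003: 1000002,
    1000004: 1000003,
    1000005: 1000004,
    1000006: 1000005,
    1000007: 1000006,
    1000008: 1000007,
    1000009: 1000006,
    1000010: 1000003,
}


def path(employee, manager_of):
    """Return the chain of IDs from the top-most ancestor down to `employee`."""
    chain = []
    current = employee
    # Assumes the hierarchy has no cycles; a cycle would loop forever here.
    while current is not None:
        chain.append(current)
        current = manager_of.get(current)
    return "|".join(str(emp) for emp in reversed(chain))


def path_length(path_text):
    """Number of IDs in a delimited path string."""
    return len(path_text.split("|")) if path_text else 0


example = path(1000010, MANAGER_OF)
print(example, path_length(example))
```

For employee 1000010 the chain runs through 1000003, 1000002 and 1000001, so the path has four entries.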
Edit
In that case, you can use the Line Manager ID column to count direct reports with the calculated column below.
DAX: Calculated Column
CountDirectReport =
VAR EmpId = [Employee ID]
RETURN
    CALCULATE (
        COUNTROWS ( 'Table' ),
        FILTER ( 'Table', [Line Manager ID] = EmpId )
    )
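As a cross-check of the counts quoted in the question, here is a small Python sketch that tallies how often each ID appears as a line manager in the sample table. It also lists managers with exactly one and with fewer than three direct reports, under the assumption that only people with at least one report should be considered (the question does not say how employees with zero reports should be treated).

```python
from collections import Counter

# (Employee ID, Line Manager ID) pairs from the sample table; "" means no manager.
PAIRS = [
    ("EMP-01980", "GL0003"), ("EMP-02023", "EMP-02015"),
    ("EMP-01636", "EMP-02015"), ("EMP-02138", "EMP-02162"),
    ("EMP-02145", "EMP-01980"), ("GL0023", "GL0022"),
    ("GL0001", ""), ("GL0002", "GL0001"),
    ("GL0003", "GL0002"), ("GL0019", "GL0001"),
    ("GL0020", "GL0002"), ("GL0024", "GL0002"),
    ("EMP-01918", "EMP-00791"), ("EMP-01941", "EMP-00791"),
    ("EMP-02019", "EMP-02156"), ("EMP-02024", "EMP-02023"),
    ("EMP-02025", "EMP-02023"), ("EMP-03001", "EMP-02023"),
]

# One tally per manager ID; rows without a manager are skipped.
direct_reports = Counter(manager for _, manager in PAIRS if manager)

# Assumption: only managers with at least one report are listed here.
exactly_one = sorted(m for m, n in direct_reports.items() if n == 1)
fewer_than_three = sorted(m for m, n in direct_reports.items() if n < 3)

print(direct_reports["GL0001"], direct_reports["EMP-02023"], direct_reports["GL0002"])
print(len(exactly_one), len(fewer_than_three))
```

The tallies for GL0001, EMP-02023 and GL0002 come out as 2, 3 and 3, matching the figures in the question.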
I have a dataset where I wish to reflect the totals from a custom SQL query I performed in Tableau. Here is some sample data:
1. I first ran a custom query that joined and unpivoted the data and placed it into groups:
Size Tb Val type Group Sum_AVG SKU Last_Refreshed
270 90.5 Free_Space_TB Group2 90.5 Excel 9/1/2020
270 179.5 Used Group2 179.5 Excel 9/1/2020
814 701 Free_Space_TB Group1 701 Gris 8/1/2020
814 112 Used Group1 112 Gris 8/1/2020
2. Then I aggregated the data by taking the sum of one group and the average of the other group (and finally summed these two values).
The data is being aggregated like this: (SUM_AVG)
zn(sum(if [Group]= 'Group1' then [Val] end))
+
zn(avg(if [Group] = 'Group2' then [Val] end))
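To spell out what this aggregation does, here is a minimal Python sketch applied to the four sample rows, computed once per type. It is only an illustration of the arithmetic, not Tableau syntax; `sum_avg` is a made-up helper, and an empty group contributes 0, which is the role ZN plays in the calculation.

```python
from statistics import mean

# (type, Group, Val) triples taken from the sample rows.
ROWS = [
    ("Free_Space_TB", "Group2", 90.5),
    ("Used", "Group2", 179.5),
    ("Free_Space_TB", "Group1", 701),
    ("Used", "Group1", 112),
]


def sum_avg(rows):
    """Sum of the Group1 values plus the average of the Group2 values."""
    group1 = [val for _, group, val in rows if group == "Group1"]
    group2 = [val for _, group, val in rows if group == "Group2"]
    # An empty group contributes 0, mirroring ZN(...) around each part.
    return sum(group1) + (mean(group2) if group2 else 0)


per_type = {
    kind: sum_avg([row for row in ROWS if row[0] == kind])
    for kind in ("Free_Space_TB", "Used")
}
total = sum(per_type.values())
print(per_type, total)
```

On this sample the per-type values are 791.5 and 291.5, and the total being asked about would be their sum, 1083.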
The view looks like this
Here is the custom query output
Here is my view
The avail and used appear when I hover over, but how would I include the total?
This is the calculation I am using (thanks to help from a SO member):
{SUM({Fixed [type]: ZN(sum(if [Group]= 'Group1' then [Val] end))})
+
sum({Fixed [type]: zn(avg(if [Group] = 'Group2' then [Val] end))})}
I am doing something wrong, because it totals across all the columns (I have more columns in the full dataset), when I just want the total for each column.
(Used was created from using a custom query)
Any assistance is appreciated.
In my opinion, you can do this without changing the underlying view. WINDOW_SUM is a table calculation and always depends on the view/context generated, so I prefer LOD calculations, which do not depend on that context.
I think you should proceed like this. As always, I have changed the sample data to include sufficient detail.
Data used
| Id | Avail | group | used | Date |
|----|-------|--------|------|------------|
| A | 5 | Group1 | 5 | 20-01-2020 |
| A | 20 | Group1 | 20 | 20-01-2020 |
| B | 10 | Group2 | 10 | 20-01-2020 |
| B | 5 | Group2 | 5 | 20-01-2020 |
| B | 5 | Group2 | 5 | 20-01-2020 |
| A | 10 | Group1 | 10 | 20-01-2020 |
| A | 10 | Group1 | 10 | 20-01-2020 |
| B | 5 | Group2 | 5 | 20-01-2020 |
| B | 5 | Group2 | 5 | 20-01-2020 |
| A | 5 | Group1 | 5 | 20-02-2019 |
| A | 20 | Group1 | 20 | 20-02-2019 |
| B | 10 | Group2 | 10 | 20-02-2019 |
| B | 5 | Group2 | 5 | 20-02-2019 |
| B | 5 | Group2 | 5 | 20-02-2019 |
| A | 10 | Group1 | 10 | 20-02-2019 |
| A | 10 | Group1 | 10 | 20-02-2019 |
| B | 5 | Group2 | 5 | 20-02-2019 |
| B | 5 | Group2 | 5 | 20-02-2019 |
Step-1 Pivot generated in Tableau as earlier.
Step-2 Calculated field sum-avg also generated as discussed.
Step-3 View generated.
Step-4 Add another field, total:
{FIXED [Date], [Group]: sum(
{FIXED [Date], [Group], [type]: zn(sum(if [Group]= 'Group1' then [val] end))}
+
{Fixed [Date], [Group], [type]: zn(avg(if [Group] = 'Group2' then [val] end))}
)}
Step-5 Add this field to details on marks card. See the GIF here
The code used in the tooltip is shown below. Obviously, you can tweak it to taste.
Under the <Group> , <AGG(Sum_Avg)> was <type> out of total <SUM(Total)> SKU on <YEAR(Date)>
This solution works:
1. Create a calculated field:
WINDOW_SUM([SUM_AVG])
2. Drag the newly created field to the view.
3. Right-click it and choose ‘Edit Table Calculation’.
4. Select specific dimensions and compute using [Last_Refreshed] and [type].
This lets you compute across cells, giving you the desired result.
I have a table in Power BI called "Dati Popolazione ATTR", which looks something like this:
Region | Province | Town | Population | Males | Females | Attribute
R1 | P1 | T1 | 1000 | 500 | 500 | A1
R1 | P1 | T1 | 1000 | 500 | 500 | A2
R1 | P1 | T1 | 1000 | 500 | 500 | A3
R2 | P2 | T2 | 2000 | 600 | 1400 | A1
R2 | P2 | T2 | 2000 | 600 | 1400 | A2
R2 | P2 | T2 | 2000 | 600 | 1400 | A3
R3 | P3 | T3 | 1500 | 550 | 950 | A1
R3 | P3 | T3 | 1500 | 550 | 950 | A2
R3 | P3 | T3 | 1500 | 550 | 950 | A3
I want to create a quick measure called 'Affinity'. This should have the following calculation:
Affinity = sum of the selected attribute / sum of the selected attribute in absolute terms regardless of any filter
Denominator should not vary if I select any filter.
Can you help me?
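To achieve your aim we need two powerful DAX functions: ALL() and DIVIDE(). ALL() returns the whole table while ignoring any filters applied to it, so the denominator stays constant, and DIVIDE() handles division by zero safely.
The measure below divides the sum of the currently filtered populations by the sum of all populations.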
Affinity =
DIVIDE (
    SUM ( 'Dati Popolazione ATTR'[Population] ),
    SUMX ( ALL ( 'Dati Popolazione ATTR' ), 'Dati Popolazione ATTR'[Population] )
)
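A quick way to sanity-check the measure is to redo the arithmetic by hand. The Python sketch below does that on the sample table: the numerator respects a filter, the denominator always uses every row. `affinity` and its `keep` predicate are hypothetical names for this illustration, and returning `None` for a zero denominator is a stand-in for a blank result.

```python
from fractions import Fraction

# (Region, Attribute, Population) triples taken from the sample table.
ROWS = [
    ("R1", "A1", 1000), ("R1", "A2", 1000), ("R1", "A3", 1000),
    ("R2", "A1", 2000), ("R2", "A2", 2000), ("R2", "A3", 2000),
    ("R3", "A1", 1500), ("R3", "A2", 1500), ("R3", "A3", 1500),
]


def affinity(rows, keep):
    """Filtered population sum divided by the unfiltered population sum."""
    total = sum(pop for _, _, pop in rows)  # ignores the filter entirely
    if total == 0:
        return None  # stand-in for a blank result on division by zero
    selected = sum(pop for region, attr, pop in rows if keep(region, attr))
    return Fraction(selected, total)


region_r1 = affinity(ROWS, lambda region, attr: region == "R1")
attribute_a1 = affinity(ROWS, lambda region, attr: attr == "A1")
print(region_r1, attribute_a1)
```

Filtering on region R1 gives 3000 / 13500 and filtering on attribute A1 gives 4500 / 13500; the denominator is the same in both cases, which is the behaviour the question asks for.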
Here is the sheet for testing: https://docs.google.com/spreadsheets/d/11CoQ_PAtVNQBkbtnHH0xR4bhCQVU-pcz645h1akTQuA/edit?usp=sharing
I have a table like this:
| id | category | irrelevant |
|----|----------|------------|
| 1 | cat1 | FALSE |
| 2 | cat2 | FALSE |
| 3 | | TRUE |
| 4 | cat1 | FALSE |
Each item has an ID and a category or, if it is considered irrelevant, it has no category and the column "irrelevant" is marked as TRUE.
What I would like to do is to write a formula that will return the number of items in each category plus a row with the number of irrelevant items. So in the case above the result would be:
| category | number |
|------------|--------|
| cat1 | 2 |
| cat2 | 1 |
| irrelevant | 1 |
If I try something like:
=QUERY(A1:C5,"select B,count(A) group by B")
I get the correct numbers, but since "irrelevant" is not a category its cell is empty, so the result is:
| category | count id |
|----------|----------|
| | 1 |
| cat1 | 2 |
| cat2 | 1 |
Notice the empty "B2" cell. Is there a way to rename it to "irrelevant" without altering the first table? One thing I tried was just to count the irrelevant items.
=transpose(query(A1:C5, "select count(A) where C = TRUE label count(A) 'irrelevant'"))
which simply returns
| irrelevant | 1 |
And then altering slightly the first formula so it doesn't count the "empty" categories and finally joining both of them in an array:
={
QUERY(A1:C5,"select B,count(A) where B <> '' group by B");
TRANSPOSE(QUERY(A1:C5, "select count(A) where C = TRUE label count(A) 'irrelevant'"))
}
This returns what I want for the example above:
| category | count id |
|------------|----------|
| cat1 | 2 |
| cat2 | 1 |
| irrelevant | 1 |
But this won't work if my original table doesn't have any irrelevant items, which can happen depending on the range I choose to query. For example, if I want to query a table like this:
| id | category | irrelevant |
|----|----------|------------|
| 5 | cat1 | FALSE |
| 6 | cat2 | FALSE |
| 7 | cat2 | FALSE |
| 8 | cat3 | FALSE |
The solution I found will not work. Any suggestions on how I can do that?
Try:
=ARRAYFORMULA(QUERY({IF((B2:B="")*(C2:C<>""), "irrelevant", B2:B), A2:A},
 "select Col1,count(Col2)
  where Col1 is not null
  group by Col1
  label count(Col2)''"))
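The idea behind the formula is to relabel first and count afterwards, so a range without irrelevant items simply produces no "irrelevant" row. Here is a minimal Python sketch of that idea, checked against both tables from the question. `count_categories` is a made-up helper, and it treats a row as irrelevant only when the flag is TRUE, which is slightly stricter than the formula's non-blank test.

```python
from collections import Counter

# (id, category, irrelevant) rows from the two tables in the question.
TABLE_1 = [(1, "cat1", False), (2, "cat2", False), (3, "", True), (4, "cat1", False)]
TABLE_2 = [(5, "cat1", False), (6, "cat2", False), (7, "cat2", False), (8, "cat3", False)]


def count_categories(rows):
    """Relabel blank-category irrelevant rows, then count rows per label."""
    labels = []
    for _id, category, irrelevant in rows:
        if category == "" and irrelevant:
            labels.append("irrelevant")
        elif category != "":
            labels.append(category)
        # Rows with neither a category nor the flag are skipped.
    # Sorted by label, similar to the ordering of a grouped query result.
    return dict(sorted(Counter(labels).items()))


print(count_categories(TABLE_1))
print(count_categories(TABLE_2))
```

The second table yields only cat1, cat2 and cat3, with no error and no empty row.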
I am trying to merge two tables. Table A has an id column, a date column, and an amount value for every date in a period.
Table B has both id and date, but also other columns with details. However, it only has an entry when the details change, so I do not know how to merge the tables with normal joins. For every entry in A, I want the details populated as of the latest date available in B for that ID on or before the date in A.
Table A
| ID | date | amount |
| 1 | 01JAN| 56 |
| 1 | 02JAN| 54 |
| 1 | 03JAN| 23 |
| 1 | 04JAN| 43 |
Table B
| ID | date | details|
| 1 | 01JAN| x |
| 1 | 03JAN| y |
Wanted Output
Table A
| ID | date | amount | details |
| 1 | 01JAN| 56 | x |
| 1 | 02JAN| 54 | x |
| 1 | 03JAN| 23 | y |
| 1 | 04JAN| 43 | y |
For the 02JAN entry, the latest available details as of that date are 'x'; for 03JAN they are 'y'.
Thank you in advance for any guidance you could provide
This will work for the question you have asked literally:
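data want;
    retain details_last;    /* carries the last non-missing details forward */
    merge table1 table2;    /* table1 = Table A, table2 = Table B, both sorted by ID and date */
    by ID date;
    if not missing(details) then details_last = details;
    else details = details_last;
    drop details_last;
run;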
But this will only work if your data meets the conditions you have presented: the dates in Table B must always fall within the date range of Table A and never outside it (i.e. only interpolation, no extrapolation).
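For comparison, the same "latest details on or before the date" lookup can be sketched in Python with a binary search per ID. This is an illustration of the logic rather than a SAS replacement; `asof_details` is a made-up name and the year 2020 is an assumption, since the question only gives day and month. Unlike the data step, it returns `None` when Table A has a date earlier than the first Table B entry for that ID.

```python
from bisect import bisect_right
from datetime import date

# Sample data from the question; the year is an assumption.
TABLE_A = [
    (1, date(2020, 1, 1), 56),
    (1, date(2020, 1, 2), 54),
    (1, date(2020, 1, 3), 23),
    (1, date(2020, 1, 4), 43),
]
TABLE_B = [
    (1, date(2020, 1, 1), "x"),
    (1, date(2020, 1, 3), "y"),
]


def asof_details(table_a, table_b):
    """Attach to each Table A row the latest Table B details on or before its date."""
    history = {}  # id -> (sorted change dates, details at each change date)
    for id_, day, details in sorted(table_b):
        days, values = history.setdefault(id_, ([], []))
        days.append(day)
        values.append(details)

    merged = []
    for id_, day, amount in table_a:
        days, values = history.get(id_, ([], []))
        pos = bisect_right(days, day)  # count of change dates <= day
        details = values[pos - 1] if pos else None  # None: nothing known yet
        merged.append((id_, day, amount, details))
    return merged


want = asof_details(TABLE_A, TABLE_B)
for row in want:
    print(row)
```

On the sample data the details column comes out as x, x, y, y, matching the wanted output.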