Power BI: Compare two tables and get the values that do not match the criteria

I have 2 tables, and I would like to check whether each pair in table 1 (CCSClassCode_Type, Type_Sorting) has a matching pair in table 2 (_CCS Class Type, _Type Sorting).
For example, you can see I got the wrong value in table 1 (CCSClassCode_Type): the right value is XLB, as you can see in table 2 (_CCS Class Type), not ULM.
The idea of table 2 is to check whether people typed the right values. Please note that table 2 (_CCS Class Type) has duplicate values.
Thank you in advance :)

You can calculate it like this:
Table 2 =
VAR trt =
    SELECTCOLUMNS (
        Table_2,
        "XX", COMBINEVALUES ( ",", Table_2[_CCS Class Type], Table_2[_Type Sorting] )
    )
RETURN
    SUMMARIZECOLUMNS (
        Table_1[Column1],
        Table_1[CCSClassCode_Type],
        Table_1[Type_Sorting],
        FILTER (
            ALL ( Table_1[CCSClassCode_Type], Table_1[Type_Sorting] ),
            NOT ( COMBINEVALUES ( ",", Table_1[CCSClassCode_Type], Table_1[Type_Sorting] ) IN trt )
        )
    )
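If you would rather flag the mistyped rows directly on Table_1, a minimal sketch (assuming the same column pairing as above) is a calculated column using CONTAINS:
Mismatch =
NOT CONTAINS (
    Table_2,
    Table_2[_CCS Class Type], Table_1[CCSClassCode_Type],
    Table_2[_Type Sorting], Table_1[Type_Sorting]
)
Rows where Mismatch is TRUE are the ones whose typed pair has no matching pair in Table_2.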

Related

Filter data using IF Statement in Tableau

I have a data source in Tableau that looks something like this:
SKU Backup_Storage
A 5
A 1
B 2
B 3
C 1
D 0
I'd like to create a calculated field in Tableau that performs a SUM calculation if the SKU column contains the string 'A' or 'D', and an AVERAGE calculation if the SKU column contains the letters 'B' or 'C'.
This is what I am doing:
IF CONTAINS(ATTR([SKU]),'A') or
CONTAINS(ATTR([SKU]),'D')
THEN SUM([Backup_Storage])
ELSEIF CONTAINS(ATTR([SKU]),'B') or
CONTAINS(ATTR([SKU]),'C')
THEN AVG([Backup_Storage])
END
UPDATE - desired output would be:
SKU BACKUP
A, D 6 (This is the SUM OF A and D)
B, C 2 (This is the AVG of B and C)
The calculation above validates, but I see NULLs in my data source table.
Any suggestion is appreciated.
I have named the calculated field:
SKU_FILTER_CALCULATION
Basically, an IF THEN ELSE condition works when you test something that is either TRUE or FALSE for each row. Your specified condition is not a proper use case for IF THEN ELSE because SKU can take any of the values. See it like this:
your data
SKU Backup_Storage
A 5
A 1
B 2
B 3
C 1
D 0
Let's name your calculated field CF. CF will take the value A in the first row and output SUM(5) = 5. For the second row it will output SUM(1) = 1, and for the third row onward it will output AVG(2) = 2, AVG(3) = 3, AVG(1) = 1 and SUM(0) = 0 respectively. All of these values just equal [Backup_Storage] itself, and I'm sure that is not what you're trying to get.
If instead you are trying to get SUM(5, 1, 0) + AVG(2, 3, 1) (obviously I have assumed + here), which equals 8, i.e. one single value for the whole dataset, use this calculated field:
SUM(IF CONTAINS([SKU], 'A') OR CONTAINS([SKU], 'D')
THEN [Backup_Storage] END)
+
AVG(IF CONTAINS([SKU], 'B') OR CONTAINS([SKU], 'C')
THEN [Backup_Storage] END)
This will return 8 when put in the view.
Needless to say, if you want any other operator instead of +, you have to change that in CF accordingly.
As per your edited post, I suggest a different methodology: create different groups for the SKUs on which you want to perform different aggregations.
Step 1: Create a group on the SKU field. I have named this group SKUG.
Step 2: Create a calculated field CF as:
SUM(ZN(IF CONTAINS([SKU], 'A') OR CONTAINS([SKU], 'D')
THEN [Backup_Storage] END))
+
AVG(ZN(IF CONTAINS([SKU], 'B') OR CONTAINS([SKU], 'C')
THEN [Backup_Storage] END))
Step 3: Build your desired view.
Good Luck

BigQuery analytic function not giving expected results

I am trying to write a SQL query in BigQuery, and I have a requirement to filter records based on a group-by column and another column in the table.
What I mean is: if the group-by column (column name: mnt) has more than one row for a value, then I have to check the col2 (column name: zel) value and apply a filter saying col2 = 'X', passing only that record; otherwise, i.e. if col1 has only one distinct value per group, don't filter the records.
So I have written a SQL query to do this. I have used row_number as well as rank and dense_rank, but I noticed that the rank, dense_rank and row_number functions return the same value for a group.
Please see the code below:
#standardSQL
WITH t1 AS (
  SELECT
    mnt,
    lif,  -- needed for the join on t1.lif below
    CASE WHEN RANK() OVER (PARTITION BY LTRIM(RTRIM(mnt))
                           ORDER BY LTRIM(RTRIM(mnt)) ASC) > 1
         THEN 'Y' ELSE 'N' END AS flag,
    RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS rn,
    DENSE_RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS drn
  FROM projectname.datasetname.tablename1),
t2 AS (
  SELECT mnt, rel, lif, lts, lokez
  FROM projectname.datasetname.tablename2
  WHERE lts <> "" AND _PARTITIONTIME = TIMESTAMP(CURRENT_DATE())),
t3 AS (
  SELECT lif, lifn, lts, par
  FROM `projectname.datasetname.tablename3`),
t4 AS (
  SELECT rcv
  FROM `projectname.datasetname.tablename4`
  WHERE mes = 'PRO')
SELECT * FROM (
  SELECT
    t1.mnt AS mnt,
    t1.flag,
    t1.rn,
    t1.drn,
    t2.rel AS zel,
    t2.lokez AS ZLOEKZ,
    t4.rcv AS Zrcv
  FROM t1
  LEFT JOIN t2
    ON REPLACE(t1.mnt, '00000000', '') = REPLACE(t2.mnt, '00000000', '')
   AND t1.lif = t2.lif
   AND t2.lts <> ""
   AND CASE WHEN t1.flag = 'Y' AND t2.rel = 'X' THEN 1
            WHEN (t1.flag = 'N' AND t2.rel = t2.rel)
              OR (t1.flag = 'N' AND t2.rel IS NULL) THEN 1
            WHEN t1.flag = 'Y' AND t2.rel <> 'X' THEN 2
            ELSE 3
       END = 1
  LEFT JOIN t3
    ON t1.lif = t3.lif AND t2.lts = t3.lts AND t3.par = 'BA'
  LEFT JOIN t4
    ON t4.rcv = t3.lifn AND t2.lokez IS NULL)
WHERE ZLOEKZ IS NULL
ORDER BY mnt
As you can see, I am using a CASE statement, and even that does not seem to be working. I am pasting the CASE condition below again:
CASE WHEN t1.flag = 'Y' AND t2.rel = 'X' THEN 1
     WHEN (t1.flag = 'N' AND t2.rel = t2.rel)
       OR (t1.flag = 'N' AND t2.rel IS NULL) THEN 1
     WHEN t1.flag = 'Y' AND t2.rel <> 'X' THEN 2
     ELSE 3
END = 1
But the expected record count did not match, so I added the following lines to the SQL to see whether my analytic functions were giving me the result I wanted:
RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS rn,
DENSE_RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS drn
Strangely, for the same mnt number the rank, dense_rank and row_number functions all assign the same value. What am I doing wrong here?
mnt flag rn drn rel lokez rcv
100 N 1 1 X abc 123
100 N 1 1 null xyz 123
100 N 1 1 null def 234
This is my output
I mean, as per my code, for the same mnt number I am seeing the flag set to N instead of Y, and rank and dense_rank give me the same number: for all 3 mnt rows they generate 1 instead of 1, 2, 3 (for the rank function I understand why), but dense_rank should not do that.
I tried to convey the issue as clearly as I could; please let me know if there are any clarifications I can provide.
Any help is appreciated, thanks.
SELECT * EXCEPT(ct) FROM (
  SELECT *, COUNT(*) OVER(PARTITION BY mnt) AS ct
  FROM your_table  -- put your table or subquery here
) WHERE ct = 1 OR zel = 'X'
This is the code snippet for the problem you mentioned; use it in your code according to your logic. (The reason your ranking functions all return 1 is that RANK and DENSE_RANK give tied rows the same value, and because you ORDER BY the same column you PARTITION BY, every row in a partition is a tie. A COUNT(*) window is what tells you whether an mnt group has more than one row.)
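For illustration, here is a self-contained sketch of that snippet run against rows shaped like your sample output (the inline table and its values are made up purely to show the effect):
-- illustrative sample data only
WITH sample AS (
  SELECT '100' AS mnt, 'X' AS zel UNION ALL
  SELECT '100', NULL UNION ALL
  SELECT '100', NULL UNION ALL
  SELECT '200', 'B'
)
SELECT * EXCEPT(ct)
FROM (
  SELECT *, COUNT(*) OVER (PARTITION BY mnt) AS ct
  FROM sample
)
WHERE ct = 1 OR zel = 'X'
-- mnt 200 is kept because it is the only row in its group;
-- of the three mnt 100 rows, only the one with zel = 'X' is kept.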

How to remove null rows from MDX query results

How can I remove the null row from my MDX query results?
Here is the query I'm currently working with
select
non empty
{
[Measures].[Average Trips Per Day]
,[Measures].[Calories Burned]
,[Measures].[Carbon Offset]
,[Measures].[Median Distance]
,[Measures].[Median Duration]
,[Measures].[Rider Trips]
,[Measures].[Rides Per Bike Per Day]
,[Measures].[Total Distance]
,[Measures].[Total Riders]
,[Measures].[Total Trip Duration in Minutes]
,[Measures].[Total Members]
} on columns
,
non empty
{
(
[Promotion].[Promotion Code Name].children
)
} on rows
from [BCycle]
where ([Program].[Program Name].&[Madison B-cycle])
;
(screenshot of the query results, showing a row that appears as null)
That is not actually a null value; it is one of the children of [Promotion].[Promotion Code Name].Children.
You can exclude that particular member from the children using the Except function of MDX.
Example query:
//This query shows the number of orders for all products,
//with the exception of Components, which are not
//sold.
SELECT
[Date].[Month of Year].Children ON COLUMNS,
Except
([Product].[Product Categories].[All].Children ,
{[Product].[Product Categories].[Components]}
) ON ROWS
FROM
[Adventure Works]
WHERE
([Measures].[Order Quantity])
Reference -> https://learn.microsoft.com/en-us/sql/mdx/except-mdx-function?view=sql-server-2017
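Applied to your query, a sketch would look like the following (the member key to exclude is hypothetical; use the key of the member that shows up as the seemingly null row):
select
non empty
{
[Measures].[Rider Trips] // plus the other measures from your query
} on columns
,
non empty
Except
(
[Promotion].[Promotion Code Name].children,
{[Promotion].[Promotion Code Name].&[Unknown]} // hypothetical member key for the blank row
) on rows
from [BCycle]
where ([Program].[Program Name].&[Madison B-cycle])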

Access the previous record to compare the value in DAX POWER BI

I need to access the previous record of the DTH_REFER_PEDID column to make the IF comparison (DTH_REFER_PEDID-1 <> "A").
That is, while reading index X, I need to compare it with index X-1.
Addition_Stats = VAR Atendido_OV = PR_HIST_MOVIM_PEDID[OVITEM_Hist]
VAR linha_anterior2 = CALCULATE(values(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]);filter(PR_HIST_MOVIM_PEDID;EARLIER(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID])))
Return
if(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Month]<PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Month];"Atraso mês ant";
if(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL] = "A" && PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]<=PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Atendido no Prazo";
if((PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]="P"||PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]="L") && PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]<= PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Planejado no prazo";
if(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL]<>"A" && PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]>PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Em atraso";
if(PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL] = "A"
&& linha_anterior2 <>"A"
&& PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Day]>PR_HIST_MOVIM_PEDID[DAT_MAIOR_PLANE].[Day];"Atend fora Prazo"
;IF((PR_HIST_MOVIM_PEDID[OVITEM_Hist]=Atendido_OV)&&(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID]>FIRSTDATE(PR_HIST_MOVIM_PEDID[DTH_REFER_PEDID].[Date]));"A retido";"NA")
)
)
)
)
)
//)
The error displayed is: A circular dependency has been detected: PR_HIST_MOVIM_PEDID [Addition_Stats].
How do I compare DTH_REFER_PEDID-1 <> "A"?
An easy way to work with previous or next records is:
Make sure your data is in a table with a primary key (= ID).
Make a query with all the fields of your table and add one column with ID+1 (or ID-1).
Make another query with the table and the query mentioned above, and make a join between ID and ID+1 (or ID-1). Place all the fields of the table and of the first query, and you end up with all the values in one record. This way you can work with the previous or next values (a sketch is given below).
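In Power BI specifically, a minimal DAX sketch of the same idea (assuming you add an Index column in Power Query sorted by DTH_REFER_PEDID; the Index column name is illustrative, and commas are used as argument separators here, so swap them for ';' in your locale) is a calculated column that looks up the previous row's status:
Prev_STA_ITEM_PEDCL =
LOOKUPVALUE (
    PR_HIST_MOVIM_PEDID[STA_ITEM_PEDCL],    // value to bring back from the previous row
    PR_HIST_MOVIM_PEDID[Index],             // hypothetical Index column added in Power Query
    PR_HIST_MOVIM_PEDID[Index] - 1          // the row immediately before the current one
)
Addition_Stats can then test Prev_STA_ITEM_PEDCL <> "A" instead of the CALCULATE/EARLIER expression, which should avoid the circular dependency.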

Split Pandas Column by values that are in a list

I have three lists that look like this:
age = ['51+', '21-30', '41-50', '31-40', '<21']
cluster = ['notarget', 'cluster3', 'allclusters', 'cluster1', 'cluster2']
device = ['htc_one_2gb','iphone_6/6+_at&t','iphone_6/6+_vzn','iphone_6/6+_all_other_devices','htc_one_2gb_limited_time_offer','nokia_lumia_v3','iphone5s','htc_one_1gb','nokia_lumia_v3_more_everything']
I also have a column in a df that looks like this:
campaign_name
0 notarget_<21_nokia_lumia_v3
1 htc_one_1gb_21-30_notarget
2 41-50_htc_one_2gb_cluster3
3 <21_htc_one_2gb_limited_time_offer_notarget
4 51+_cluster3_iphone_6/6+_all_other_devices
I want to split the column into three separate columns based on the values in the above lists. Like so:
age cluster device
0 <21 notarget nokia_lumia_v3
1 21-30 notarget htc_one_1gb
2 41-50 cluster3 htc_one_2gb
3 <21 notarget htc_one_2gb_limited_time_offer
4 51+ cluster3 iphone_6/6+_all_other_devices
My first thought was to do a simple test like this:
ages_list = []
for i in age:
    if i in df['campaign_name'][0]:
        ages_list.append(i)
print(ages_list)
# ['<21']
I was then going to convert ages_list to a Series and combine it with the remaining two to get the end result above, but I assume there is a more Pythonic way of doing it?
The idea behind this is that you'll create a regular expression based on the values you already have. For example, if you want to build a regular expression that captures any value from your age list, you can do something like '|'.join(age), and likewise for the values you already have in cluster and device.
The device list is a special case because it contains a + sign that conflicts with the regex (+ means "one or more" in a regex), so we can fix this by replacing every + with \+, which means "match a literal +".
import re
import pandas as pd

df = pd.DataFrame({'campaign_name': ['notarget_<21_nokia_lumia_v3', 'htc_one_1gb_21-30_notarget', '41-50_htc_one_2gb_cluster3', '<21_htc_one_2gb_limited_time_offer_notarget', '51+_cluster3_iphone_6/6+_all_other_devices']})

def split_df(row):
    campaign_name = row['campaign_name']
    row['age'] = re.findall('|'.join(age), campaign_name)[0]
    row['cluster'] = re.findall('|'.join(cluster), campaign_name)[0]
    row['device'] = re.findall('|'.join([x.replace('+', r'\+') for x in device]), campaign_name)[0]
    return row

df.apply(split_df, axis=1)
If you want to drop the original column, you can do this:
df.apply(split_df, axis=1).drop('campaign_name', axis=1)
Here I'm assuming that every value is matched by the regex; if that is not the case you can add your own checks (see the sketch below), but you get the idea.
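For that case, here is a minimal sketch of one way to do those checks (helper names like pattern_for, safe_extract and split_df_safe are just illustrative): it escapes every value with re.escape (which also covers the '+' in '51+'), prefers longer device names over their prefixes, and returns None instead of raising an IndexError when nothing matches.
import re

def pattern_for(values):
    # Try longer values first so 'htc_one_2gb_limited_time_offer' wins over its
    # prefix 'htc_one_2gb'; re.escape handles '+', '/', '&' and other special chars.
    return '|'.join(re.escape(x) for x in sorted(values, key=len, reverse=True))

age_pattern = pattern_for(age)
cluster_pattern = pattern_for(cluster)
device_pattern = pattern_for(device)

def safe_extract(pattern, text):
    # Return the first match, or None when the pattern does not match at all.
    matches = re.findall(pattern, text)
    return matches[0] if matches else None

def split_df_safe(row):
    name = row['campaign_name']
    row['age'] = safe_extract(age_pattern, name)
    row['cluster'] = safe_extract(cluster_pattern, name)
    row['device'] = safe_extract(device_pattern, name)
    return row

df = df.apply(split_df_safe, axis=1)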