Unnesting the columns in BigQuery which can have null elements - google-cloud-platform

with a1 as (
select 1 as num,
[1,2] as nested,
2 as cost
),
a2 as (
select 2 as num,
[4,7] as nested,
2 as cost
),
a3 as (
select 3 as num,
[9,8] as nested,
2 as cost
),
a4 as (
select 4 as num,
[4, 6, 8] as nested,
2 as cost
),
a5 as (
select 5 as num,
[19, 11] as nested,
2 as cost
),
a6 as (
select 6 as num,
[] as nested,
2 as cost
),
table as(
select * from a1
union all
select * from a2
union all
select * from a3
union all
select * from a4
union all
select * from a5
union all
select * from a6
)
select * except(nested)
from table, unnest(nested) as unnested
Result generated by the Query
select num, count(unnested) as count, max(cost) as cost_incurred
from table, unnest(nested) as unnested
group by num
Result generated by the Query
The problem with this result: for num=6 the nested array is empty, so when I unnest the nested column the row for num=6 is removed completely. This creates a discrepancy when I want to calculate the total cost incurred.
Any help is appreciated!

Use left join instead of , (which is a shorthand for cross join):
select * except(nested)
from table left join unnest(nested) as unnested
select num, count(unnested) as count, max(cost) as cost_incurred
from table left join unnest(nested) as unnested
group by num
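The difference between the two joins can be sketched outside of SQL. Below is a small Python stand-in (list comprehensions playing the role of UNNEST; the data is the num=5/num=6 rows from the question) showing why the comma (cross join) drops rows with empty arrays while the left join keeps them with a NULL element:

```python
# Python sketch (not BigQuery) of the two join semantics against UNNEST.
rows = [
    {"num": 5, "nested": [19, 11], "cost": 2},
    {"num": 6, "nested": [], "cost": 2},  # empty array, as in the question
]

# Comma / CROSS JOIN UNNEST: a row with an empty array produces no output rows.
cross = [(r["num"], v) for r in rows for v in r["nested"]]

# LEFT JOIN UNNEST: an empty array still yields one row, with NULL (None here),
# so COUNT(unnested) sees 0 for num=6 while MAX(cost) still sees the row.
left = [(r["num"], v) for r in rows for v in (r["nested"] or [None])]
```

This also explains why COUNT(unnested) in the grouped query reports 0 for num=6 after switching to the left join: COUNT ignores NULLs, but the row itself survives for the cost aggregation.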

Related

Power BI SELECTCOLUMNS based on slicer

I have the following table:
Column A  Column B  ColumnC  ColumnD
Cell 1    Cell 2    C1       D1
A 2       B2        C2       D2
Based on a slicer, I want to create a new table where ColumnCOrD is either ColumnC or ColumnD. If the slicer is "ColumnC", the new table is:
Column A  Column B  ColumnCOrD
Cell 1    Cell 2    C1
A 2       B2        C2
If the slicer is "ColumnD", the new table is
Column A  Column B  ColumnCOrD
Cell 1    Cell 2    D1
A 2       B2        D2
Is there an easy way to do it with Power BI DAX?
I tried to use SELECTCOLUMNS but I do not find a solution for it.
You were already on the right track. SELECTCOLUMNS does exactly what you requested.
DynamicTable = SELECTCOLUMNS(StaticTable, "A", [A], "B", [B], "C or D", [C]&"Or"&[D])
Thank you for your answer, but it is not exactly what I wanted. I wrote the DAX below, but when I change the slicer selection, the data in the new table is not correct.
My DAX:
NewTable =
SELECTCOLUMNS(
    MyData,
    "ColA", MyData[ColA],
    "ColB", MyData[ColB],
    "ColCorD", SWITCH( TRUE(),
        SELECTEDVALUE(SlicerTable[Slicer]) = "ColumnC", MyData[ColC],
        SELECTEDVALUE(SlicerTable[Slicer]) = "ColumnD", MyData[ColD],
        "No selection")
)
When I click on Slicer C, I expect the values for ColCorD to be the values from ColC.
When I click on Slicer D, I expect the values for ColCorD to be the values from ColD.
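For reference, the intended behavior can be expressed outside DAX. This is a minimal Python stand-in for the desired table transformation (the row data comes from the question; the function name is illustrative, not any Power BI API):

```python
# Python sketch: pick ColumnC or ColumnD based on a "slicer" value
# to build the new three-column table.
rows = [
    {"A": "Cell 1", "B": "Cell 2", "ColC": "C1", "ColD": "D1"},
    {"A": "A 2", "B": "B2", "ColC": "C2", "ColD": "D2"},
]

def select_columns(rows, slicer):
    # slicer is "ColumnC" or "ColumnD"; the chosen source column
    # becomes the single ColumnCOrD output column.
    key = "ColC" if slicer == "ColumnC" else "ColD"
    return [{"A": r["A"], "B": r["B"], "ColCOrD": r[key]} for r in rows]
```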

SQL for nested WITH CLAUSE - RESULTS OFFSET in Oracle 19c

Please suggest a way to implement the nesting of (temp - results - select) shown below.
I see that Oracle 19c does not allow nesting of WITH clauses.
with temp2 as
(
with temp1 as
(
__
__
),
results(..fields..) as
(
select ..<calc part>.. from temp1, results where __
)
select ..<calc part>.. from temp1 join results where __
),
results(..fields..) as
(
select ..<calc part>.. from temp2, results where __
)
select ..<calc part>.. from temp2 join results where __
For instance:
DB Fiddle
I need to calculate CALC3 in a similar recursive way to CALC.
CREATE TABLE TEST ( DT DATE, NAME VARCHAR2(10), VALUE NUMBER(10,3));
insert into TEST values ( to_date( '01-jan-2021', 'dd-mon-yyyy' ), 'apple', 198.95 );
insert into TEST values ( to_date( '02-jan-2021', 'dd-mon-yyyy' ), 'apple', 6.15 );
insert into TEST values ( to_date( '03-jan-2021', 'dd-mon-yyyy' ), 'apple', 4.65 );
insert into TEST values ( to_date( '06-jan-2021', 'dd-mon-yyyy' ), 'apple', 20.85 );
insert into TEST values ( to_date( '01-jan-2021', 'dd-mon-yyyy' ), 'banana', 80.5 );
insert into TEST values ( to_date( '02-jan-2021', 'dd-mon-yyyy' ), 'banana', 9.5 );
insert into TEST values ( to_date( '03-jan-2021', 'dd-mon-yyyy' ), 'banana', 31.65 );
--Existing working code -
with t as
( select
test.*,
row_number() over ( partition by name order by dt ) as seq
from test
),
results(name, dt, value, calc ,seq) as
(
select name, dt, value, value/5 calc, seq
from t
where seq = 1
union all
select t.name, t.dt, t.value, ( 4 * results.calc + t.value ) / 5, t.seq
from t, results
where t.seq - 1 = results.seq
and t.name = results.name
)
select results.*, calc*3 as calc2 -- Some xyz complex logic as calc2
from results
order by name, seq;
Desired output:
CALC3 - grouped by name and dt -
((CALC3 of prev day record * 4) + CALC2 of current record )/ 5
i.e. for APPLE:
for 01-jan-21, CALC3 = ((0*4) + 119.37)/5 = 23.874 -------> since it is the 1st record, the previous day's CALC3 is taken as 0
for 02-jan-21, CALC3 = ((23.874*4) + 99.186)/5 = 38.9364 -----> previous CALC3 (23.874) comes from 01-jan-21; 99.186 is CALC2 of the current row
for 03-jan-21, CALC3 = ((38.9364*4) + 82.1388)/5 = 47.57688 and so on
For BANANA:
01-jan-21, CALC3 = ((0*4) + 48.3)/5 = 9.66
02-jan-21, CALC3 = ((9.66*4) + 44.34)/5 = 16.596
etc.
You do not need to nest; you can do it all at one level:
with temp1(...fields...) as
(
__
__
__
),
results1(...fields...) as
(
select ...<calc part>... from temp1 where __
),
temp2( ...fields...) as
(
select ...<calc part>... from temp1 join results1 where __
),
results2(...fields...) as
(
select ...<calc part>... from temp2 where __
)
select ...<calc part>... from temp2 join results2 where __
For your actual problem, you can use a MODEL clause:
SELECT dt,
name,
amount,
calc,
seq,
calc2,
calc3
FROM (
SELECT t.dt,
t.name,
t.value AS amount,
ROW_NUMBER() OVER (PARTITION BY name ORDER BY dt) AS seq
FROM test t
)
MODEL
PARTITION BY (name)
DIMENSION BY (seq)
MEASURES ( dt, amount, 0 AS calc, 0 AS calc2, 0 as calc3)
RULES (
calc[1] = amount[1]/5,
calc[seq>1] = (amount[cv(seq)] + 4*calc[cv(seq)-1])/5,
calc2[seq] = 3*calc[cv(seq)],
calc3[1] = calc2[1]/5,
calc3[seq>1] = (calc2[cv(seq)] + 4*calc3[cv(seq)-1])/5
)
Which outputs:
DT         NAME    AMOUNT   CALC      SEQ  CALC2     CALC3
01-JAN-21  banana  80.5     16.1      1    48.3      9.66
02-JAN-21  banana  9.5      14.78     2    44.34     16.596
03-JAN-21  banana  31.65    18.154    3    54.462    24.1692
01-JAN-21  apple   198.95   39.79     1    119.37    23.874
02-JAN-21  apple   6.15     33.062    2    99.186    38.9364
03-JAN-21  apple   4.65     27.3796   3    82.1388   47.57688
06-JAN-21  apple   20.85    26.07368  4    78.22104  53.705712
db<>fiddle here
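The RULES can be cross-checked outside Oracle. A short Python sketch of the same recursion (x[1] = v[1]/5 and x[n] = (v[n] + 4*x[n-1])/5, equivalent to seeding the previous value with 0) reproduces the apple rows of the table above:

```python
# Python sketch of the MODEL clause recursion:
# x[1] = v[1]/5, x[n] = (v[n] + 4*x[n-1])/5.
def recurse(values):
    out, prev = [], 0.0
    for v in values:
        prev = (v + 4 * prev) / 5
        out.append(prev)
    return out

apple_amounts = [198.95, 6.15, 4.65, 20.85]
calc = recurse(apple_amounts)   # matches CALC above (up to float rounding)
calc2 = [3 * c for c in calc]   # matches CALC2 above
calc3 = recurse(calc2)          # matches CALC3 above, e.g. 23.874 for seq 1
```

Note that CALC3 is just the CALC recursion applied again, with CALC2 as its input, which is exactly what the last two RULES express.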

Multiply by regrouping in dax?

I have a dataset grouped by two rows as below
GroupName Score Multiply
Group1 2
Group2 3
Group2 5
Group1 1
I have a slicer based on a parameter table storing the GroupName values above. When I select Group1 (I am using SELECTEDVALUE on a variable in my DAX), I want to multiply all Group1 scores by 4 and all Group2 scores by 6.
I tried this but it is updating all rows.
var a=4*score
var b=6*score
var mm=selectedvalue('Group Paramter'[Group Name])
Return if( mm ="Group1", a, b)
But it multiplies all rows. How can I multiply by group using GroupName? I appreciate any help.
You can try logic like the below (as a measure):
VAR mm = SELECTEDVALUE('Group Paramter'[Group Name])
RETURN IF(mm = "Group1", MIN(table_name[score]) * 4, MIN(table_name[score]) * 6)
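Stepping outside DAX for a moment, the row-level intent, each row multiplied by its own group's factor rather than one factor applied to every row, can be sketched in Python (data from the question; the factor mapping is the 4/6 rule described above):

```python
# Python sketch: the multiplier comes from each row's own GroupName.
# A single IF over the slicer value applies one factor to all rows,
# which is the behavior the asker is seeing.
rows = [("Group1", 2), ("Group2", 3), ("Group2", 5), ("Group1", 1)]
factors = {"Group1": 4, "Group2": 6}

multiplied = [(group, score * factors[group]) for group, score in rows]
```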

BigQuery Arrays - check if Array contains specific values

I'm trying to see if a certain set of items exists within a BigQuery array.
The query below works (checking whether a single item exists within an array):
WITH sequences AS
(
SELECT 1 AS id, [10,20,30,40] AS some_numbers
UNION ALL
SELECT 2 AS id, [20,30,40,50] AS some_numbers
UNION ALL
SELECT 3 AS id, [40,50,60,70] AS some_numbers
)
SELECT id, some_numbers
FROM sequences
WHERE 20 IN UNNEST(some_numbers)
What I'm not able to do is the below (checking whether more than one item exists within an array); this query errors:
WITH sequences AS
(
SELECT 1 AS id, [10,20,30,40] AS some_numbers
UNION ALL
SELECT 2 AS id, [20,30,40,50] AS some_numbers
UNION ALL
SELECT 3 AS id, [40,50,60,70] AS some_numbers
)
SELECT id, some_numbers
FROM sequences
WHERE (20,30) IN UNNEST(some_numbers)
I managed to find the workaround below, but I feel there's a better way to do this:
WITH sequences AS
(
SELECT 1 AS id, [10,20,30,40] AS some_numbers
UNION ALL
SELECT 2 AS id, [20,30,40,50] AS some_numbers
UNION ALL
SELECT 3 AS id, [40,50,60,70] AS some_numbers
)
SELECT id, some_numbers
FROM sequences
WHERE (
(
SELECT COUNT(1)
FROM UNNEST(some_numbers) s
WHERE s in (20,30)
) > 1
)
Any suggestions are appreciated.
Not much to suggest... the official docs suggest using EXISTS:
WHERE EXISTS (SELECT *
FROM UNNEST(some_numbers) AS s
WHERE s in (20,30));
Assuming you are looking for rows where ALL elements of the match array [20, 30] are found in the target array (some_numbers), and also assuming there are no duplicate numbers in either the match or the target array:
select id, some_numbers
from sequences a,
unnest([struct([20, 30] as match)]) b
where (
select count(1) = array_length(match)
from a.some_numbers num
join b.match num
using(num)
)
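Note that the two answers implement different semantics, which a Python sketch makes explicit. The match set {20, 50} is used here, instead of the question's (20, 30), purely so the two results differ on this data:

```python
# Python sketch (not BigQuery): "any element present" (EXISTS-style)
# vs "all elements present" (count = array_length-style) containment.
sequences = {1: [10, 20, 30, 40], 2: [20, 30, 40, 50], 3: [40, 50, 60, 70]}
match = {20, 50}

# Any match element appears in the array (non-empty intersection).
any_present = sorted(i for i, nums in sequences.items() if match & set(nums))

# Every match element appears in the array (subset test).
all_present = sorted(i for i, nums in sequences.items() if match <= set(nums))
```

The EXISTS answer returns a row as soon as one match element is found; the count/array_length answer requires all of them.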

How to filter by two related tables in the SUMX function in DAX

I have two tables A and B as shown below. The AccountID in A has a relationship with the AccountID in B.
A
AccountID CmpName AccFlag SysStartTime sysEndTime
A1 Test1 1 1/1/2020 12/31/9999
A2 Test2 0 1/2/2020 12/31/9999
A3 Test3 1 1/2/2020 12/31/9999
B
ContactId AccountID ConFlag SysStartTime SysEndTime
C1 A1 1 1/1/2020 12/31/9999
C2 A1 1 1/1/2020 12/31/9999
C3 A1 0 1/1/2020 12/31/9999
C4 A2 1 1/2/2020 12/31/9999
I want to get the count of records in A that have 3 or more related records in B, filtered by the AccFlag, ConFlag, SysStartTime and SysEndTime from both tables. The DAX below gives me the count of records in A that have 3 or more related records in B, filtered by the AccFlag, SysStartTime and SysEndTime of A, but I'm not sure how to add the filtering on ConFlag and on B's SysStartTime and SysEndTime as well. Please help.
SUMX ( A,
IF ( COUNTROWS ( RELATEDTABLE ( B ) ) >= 3 &&
A[Accflag]=1 &&
A[SysStartTime]>=TODAY() &&
A[SysEndTime]>= VALUE("12/31/9999"),1 )
)
I think the easiest way to do this would be to create a calculated column which indicates whether each row passes the check or not. Something like the below might work:
Ind =
VAR AccountID=A[AccountID]
VAR Count1 = CALCULATE(COUNTROWS(B),FILTER(B,B[AccountID]=AccountID))
RETURN IF(Count1>=3 && A[Accflag]=1 && A[SysStartTime]>=TODAY() && A[SysEndTime]>= VALUE("12/31/9999"),1,0)
Ind will give 0 or 1 for each row, and you can then simply sum the field to get the total number of rows that meet the criteria. This will be useful in case you need to add further conditions to the calculation. Hope this helps.
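The counting logic of that calculated column can be sketched in plain Python (data trimmed to the AccountIDs from the question; the flag and date filters are left out of the sketch):

```python
from collections import Counter

# Count related B rows per AccountID, then flag A rows with 3 or more.
a_account_ids = ["A1", "A2", "A3"]
b_account_ids = ["A1", "A1", "A1", "A2"]  # B's AccountID column

related = Counter(b_account_ids)
ind = {acc: 1 if related[acc] >= 3 else 0 for acc in a_account_ids}
total = sum(ind.values())  # number of A rows passing the >= 3 check
```

On the sample data only A1 has three related rows, so the total is 1; summing the Ind column in the model behaves the same way.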
You could do it like this:
Go to the Query Editor and add a blank query. Point this blank query (let's call it TableBGrouped) at your TableB:
= TableB
Now apply a Group By step on AccountID with a row count, which gives TableBGrouped its [Count] column (shown as a screenshot in the original answer).
The relationship will then look like this (screenshot omitted):
Add a measure to TableBGrouped:
3 or more Count = CALCULATE(COUNT(TableBGrouped[AccountID]); TableBGrouped[Count] > 2)
You can now apply the filter to get your result.