BigQuery analytic function not giving expected results - google-cloud-platform

I am trying to write a SQL query in BigQuery, and I need to filter records based on a group-by column and another column in the table.
What I mean is: if the group-by column (column name: mnt) has more than one row in a group, I have to check the value of col2 (column name: zel) and keep only the record where col2 = 'X'. Otherwise, if the group has only one row, I pass it through without filtering.
I wrote a query to do this using ROW_NUMBER as well as RANK and DENSE_RANK, but I noticed that these functions all return the same value within a group.
Please see the code below:
#standardSQL
WITH t1 AS (
  SELECT
    mnt,
    CASE
      WHEN RANK() OVER (PARTITION BY LTRIM(RTRIM(mnt)) ORDER BY LTRIM(RTRIM(mnt)) ASC) > 1 THEN 'Y'
      ELSE 'N'
    END AS flag,
    RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS rn,
    DENSE_RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS drn
  FROM projectname.datasetname.tablename1
),
t2 AS (
  SELECT mnt, rel, lif, lts, lokez
  FROM projectname.datasetname.tablename2
  WHERE lts <> "" AND _PARTITIONTIME = TIMESTAMP(CURRENT_DATE())
),
t3 AS (
  SELECT lif, lifn, lts, par
  FROM `projectname.datasetname.tablename3`
),
t4 AS (
  SELECT rcv
  FROM `projectname.datasetname.tablename4`
  WHERE mes = 'PRO'
)
SELECT * FROM (
  SELECT
    t1.mnt AS mnt,
    t1.flag,
    t1.rn,
    t1.drn,
    t2.rel AS zel,
    t2.lokez AS ZLOEKZ,
    t4.rcv AS Zrcv
  FROM t1
  LEFT JOIN t2
    ON REPLACE(t1.mnt, '00000000', '') = REPLACE(t2.mnt, '00000000', '')
    AND t1.lif = t2.lif
    AND t2.lts <> ""
    AND CASE
          WHEN t1.flag = 'Y' AND t2.rel = 'X' THEN 1
          WHEN (t1.flag = 'N' AND t2.rel = t2.rel) OR (t1.flag = 'N' AND t2.rel IS NULL) THEN 1
          WHEN t1.flag = 'Y' AND t2.rel <> 'X' THEN 2
          ELSE 3
        END = 1
  LEFT JOIN t3
    ON t1.lif = t3.lif AND t2.lts = t3.lts AND t3.par = 'BA'
  LEFT JOIN t4
    ON t4.rcv = t3.lifn AND t2.lokez IS NULL
)
WHERE ZLOEKZ IS NULL
ORDER BY mnt
As you can see, I am using a CASE expression, and even that does not seem to work correctly. Here is the CASE condition again:
CASE
  WHEN t1.flag = 'Y' AND t2.rel = 'X' THEN 1
  WHEN (t1.flag = 'N' AND t2.rel = t2.rel) OR (t1.flag = 'N' AND t2.rel IS NULL) THEN 1
  WHEN t1.flag = 'Y' AND t2.rel <> 'X' THEN 2
  ELSE 3
END = 1
The expected record count did not match, so I added the following lines to check whether my analytic functions were giving the result I wanted:
RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS rn,
DENSE_RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS drn
Strangely, for the same mnt number, RANK, DENSE_RANK and ROW_NUMBER all assign the same value. What am I doing wrong here?
mnt  flag  rn  drn  rel   lokez  rcv
100  N     1   1    X     abc    123
100  N     1   1    null  xyz    123
100  N     1   1    null  def    234
This is my output.
With my code, I see the flag set to N instead of Y for rows with the same mnt number, and RANK and DENSE_RANK both give 1 for all three rows instead of 1, 2, 3. (I understand this for RANK, but DENSE_RANK should not do that.)
I tried to convey the issue as clearly as I could; please let me know if I can provide any clarification.
Any help is appreciated.
Thanks

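SELECT * EXCEPT(ct) FROM (
  SELECT *, COUNT(*) OVER (PARTITION BY mnt) AS ct
  FROM joined_rows  -- placeholder name (not in the original) for your joined result with mnt and zel
)
WHERE ct = 1 OR zel = 'X'
This snippet addresses the filter you described: a group with only one row is kept as it is, and a group with more than one row keeps only the rows where zel = 'X'. Adapt it to your query as needed.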
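As a side note on why RANK and DENSE_RANK came out as 1 everywhere: when the ORDER BY expression is the same as the PARTITION BY expression, every row in a partition ties with the others, so both functions assign 1. The sketch below reproduces that, and the COUNT(*) OVER filter, with Python's built-in sqlite3 module rather than BigQuery. It assumes the bundled SQLite is 3.25 or newer (the release that added window functions); the table name joined_rows and the sample rows, including the single-row group 200, are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows (mnt, zel); group 100 mirrors the question's output,
# group 200 is an invented single-row group.
rows = [("100", "X"), ("100", None), ("100", None), ("200", None)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE joined_rows (mnt TEXT, zel TEXT)")
conn.executemany("INSERT INTO joined_rows VALUES (?, ?)", rows)

# Ordering by the partition key makes every row in a partition a tie,
# so RANK and DENSE_RANK are 1 everywhere; COUNT(*) OVER gives the group size.
ranked = conn.execute("""
    SELECT mnt,
           RANK()       OVER (PARTITION BY mnt ORDER BY mnt) AS rn,
           DENSE_RANK() OVER (PARTITION BY mnt ORDER BY mnt) AS drn,
           COUNT(*)     OVER (PARTITION BY mnt)              AS ct
    FROM joined_rows
    ORDER BY mnt
""").fetchall()

# Keep single-row groups untouched; in larger groups keep only zel = 'X'.
filtered = conn.execute("""
    SELECT mnt, zel FROM (
        SELECT mnt, zel, COUNT(*) OVER (PARTITION BY mnt) AS ct
        FROM joined_rows
    )
    WHERE ct = 1 OR zel = 'X'
    ORDER BY mnt
""").fetchall()

print(ranked)
print(filtered)
```

To number the rows inside a group, the window would need an ORDER BY on a column that actually differs between those rows.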

Related

Power BI: compare two tables and get values that do not match the criteria

I have two tables, and I would like to check whether the pair (Type_Sorting, CCSClassCode_Type) in Table 1 matches the pair (_Type Sorting, _CCS Class Type) in Table 2.
For example, you can see vi got the wrong value in Table 1 (CCSClassCode_Type):
the right value is XLB, as you can see in Table 2 (_CCS Class Type), not ULM.
The idea of Table 2 is to check whether people type the right values. Please note that Table 2 (_CCS Class Type) has duplicate values.
Thank you in advance :)
You can calculate it like this:
Table 2 =
VAR trt =
    SELECTCOLUMNS(
        Table_2,
        "XX", COMBINEVALUES(",", Table_2[_CCS Class Type], Table_2[_Type Sorting])
    )
RETURN
    SUMMARIZECOLUMNS(
        Table_1[Column1],
        Table_1[CCSClassCode_Type],
        Table_1[Type_Sorting],
        FILTER(
            ALL(Table_1[CCSClassCode_Type], Table_1[Type_Sorting]),
            NOT (COMBINEVALUES(",", Table_1[CCSClassCode_Type], Table_1[Type_Sorting]) IN trt)
        )
    )
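The DAX above is essentially an anti-join on the combined (class type, type sorting) pair. For readers who find that easier to follow outside DAX, here is a minimal Python sketch of the same idea. Only the column names come from the question; every sample row, including the "ul" sorting value, is invented for illustration.

```python
# Made-up sample rows; only the column names come from the question.
table_2 = [
    {"_CCS Class Type": "XLB", "_Type Sorting": "vi"},
    {"_CCS Class Type": "XLB", "_Type Sorting": "vi"},  # duplicates are harmless
    {"_CCS Class Type": "ULM", "_Type Sorting": "ul"},
]
table_1 = [
    {"Column1": "a", "CCSClassCode_Type": "ULM", "Type_Sorting": "vi"},  # wrong pair
    {"Column1": "b", "CCSClassCode_Type": "XLB", "Type_Sorting": "vi"},  # valid pair
]

# Collect the valid pairs once; a set removes the duplicates in Table 2.
valid_pairs = {(r["_CCS Class Type"], r["_Type Sorting"]) for r in table_2}

# Keep the Table 1 rows whose pair does not appear in Table 2.
mismatches = [
    r for r in table_1
    if (r["CCSClassCode_Type"], r["Type_Sorting"]) not in valid_pairs
]

print(mismatches)
```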

Sum of array elements in BigQuery

I have to calculate two totals from the table: the number of positive values, and the number of negative plus null or empty values. The query below lists the negative, null and positive counts separately, but I want the overall counts. Please assist.
SELECT
  ARRAY(
    SELECT COUNT(value)
    FROM UNNEST(event_data_results)
    WHERE REGEXP_CONTAINS(name, r'data.result.result')
      AND (REGEXP_CONTAINS(value, r'^-?\d+$') AND SAFE_CAST(value AS INT64) <= 0)
  ) AS negative_attributes,
  ARRAY(
    SELECT COUNT(value) AS neg_val
    FROM UNNEST(event_data_results)
    WHERE value = 'null' OR value = ''
  ) AS null_attributes,
  ARRAY(
    SELECT COUNT(value)
    FROM UNNEST(event_data_results)
    WHERE REGEXP_CONTAINS(name, r'data.result.result')
      AND (REGEXP_CONTAINS(value, r'^-?\d+$') AND SAFE_CAST(value AS INT64) > 0)
  ) AS positive_attributes
FROM `table`
WHERE EXISTS (
  SELECT 1
  FROM UNNEST(event_keys) AS keys, UNNEST(event_data_results) AS results
  WHERE keys.value = "attribute"
)
event_keys, event_data_results and data_metrics are all repeated structs.
The result should be positive: 4, negative + null: 4.
Below is for BigQuery Standard SQL
#standardSQL
SELECT
  COUNTIF(result.value > 0) positive_attributes,
  COUNTIF(result.value < 0) negative_attributes,
  COUNTIF(IFNULL(result.value, 0) = 0) null_or_zero_attributes
FROM `project.dataset.table`,
  UNNEST(event_data_results) AS result
WHERE EXISTS (
  SELECT 1
  FROM UNNEST(event_keys) AS key
  WHERE key.value = "attribute"
)
You can add whatever other conditions you need here.
Also, if result.value is a string, you can use SAFE_CAST(result.value AS INT64) as you already do, so I did not focus on that aspect of your case.
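To make the two requested buckets concrete, here is a small Python sketch of the classification, not BigQuery code. It assumes the values arrive as strings, treats 'null' and '' as null as the question's query does, counts zero with the negatives (following the question's `<= 0` test), and ignores anything non-numeric. The sample values are invented so that the totals come out as the 4 and 4 the question expects.

```python
import re

def classify(value):
    """Return 'positive', 'negative_or_null', or None for non-numeric text."""
    if value is None or value in ("", "null"):
        return "negative_or_null"
    if re.fullmatch(r"-?\d+", value):
        return "positive" if int(value) > 0 else "negative_or_null"
    return None  # not an integer string: counted in neither bucket

# Hypothetical values standing in for event_data_results[].value
values = ["5", "12", "1", "7", "-3", "0", "", "null", "abc"]

counts = {"positive": 0, "negative_or_null": 0}
for v in values:
    bucket = classify(v)
    if bucket is not None:
        counts[bucket] += 1

print(counts)
```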

regular expression to solve for the following

Example 1
asdk[wovkd'vk'psacxu5=205478499|205477661zamd;amd;a;d
Example 2
sadlmdlmdadsldu5=205478499|205477661|234567899amsd/samdamd
u5 can have multiple values separated by |
How can I capture all u5 values from a long string I have?
Below is for BigQuery Standard SQL
#standardSQL
WITH data AS (
SELECT 1 AS id, "asdk[wovkd'vk'psacxu5=205478499|205477661zamd;amd;a;d" AS junk UNION ALL
SELECT 2, "sadlmdlmdadsldu5=205478499|205477661|234567899amsd/samdamd"
)
SELECT id, SPLIT(REGEXP_EXTRACT(junk, r'(?i)u5=([\d|]*)'), '|') AS value
FROM data
with output as below:
id  value
1   205478499
    205477661
2   205478499
    205477661
    234567899
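The same pattern can be tried outside BigQuery. The sketch below uses Python's re module with the pattern from the answer on the two example strings from the question; the helper name extract_u5 is made up for illustration.

```python
import re

# Same pattern as in the answer: capture the run of digits and pipes after "u5=".
U5_PATTERN = re.compile(r"u5=([\d|]*)", re.IGNORECASE)

def extract_u5(text):
    """Return the list of u5 values found in text, or [] if there is no u5= part."""
    match = U5_PATTERN.search(text)
    if match is None:
        return []
    # Drop empty pieces in case the captured run starts or ends with a pipe.
    return [v for v in match.group(1).split("|") if v]

example_1 = "asdk[wovkd'vk'psacxu5=205478499|205477661zamd;amd;a;d"
example_2 = "sadlmdlmdadsldu5=205478499|205477661|234567899amsd/samdamd"

print(extract_u5(example_1))
print(extract_u5(example_2))
```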

SELECT Statement within IF statement

I would like my SELECT statement to return a different result when a parameter is 0, 1 or 2. I am not very skilled in PL/SQL, so I am not sure whether my code would give the expected result. If I run this code, I get "SQL statement ignored" on line 3.
BEGIN
  IF (:PARTYPE = 1) THEN
    SELECT * FROM x
    WHERE to_date(date) >= (SELECT to_date(sysdate) FROM DNV.dual)
  ELSE
    SELECT * FROM x
    WHERE to_date(date) <= (SELECT to_date(sysdate) FROM DNV.dual)
  END IF;
END;
This is just an example of my SELECT statement. Later the statement will become longer and more complex, but I think this shows the results I am trying to get.
Below is a copy of my entire code; because I am not allowed to show the real names, it has become rather unreadable:
BEGIN
IF (:PARTYPE = 1) THEN
Select table1.Column1
, table1.Column2
, table1.Column3
, table1.Column4
, table1.Column5
, table1.Column6
, table1.Column7
, table1.Column8
, table1.Column9
, table1.Column10
, table1.Column11
, table1.Column12
, (Select table2.ColumnX From x2 table2 Where somthing) as "something" From x1 table1
WHERE to_date(date) >= (Select to_date(sysdate)from DNV.dual)
Order by columnX
ELSE
Select table1.Column1
, table1.Column2
, table1.Column3
, table1.Column4
, table1.Column5
, table1.Column6
, table1.Column7
, table1.Column8
, table1.Column9
, table1.Column10
, table1.Column11
, table1.Column12
, (Select table2.ColumnX From x2 table2 Where somthing) as "something" From x1 table1
WHERE to_date(date) <= (Select to_date(sysdate)from DNV.dual)
Order by columnX
END IF;
END;
I have created some new code with which I am trying to learn how a CASE statement works. This might help me with the code above. Unfortunately this code does not work either, but I think it explains my situation better. In this example I use a separate table with data I made up. In some cases user2 is null, but user1 is always filled. I want to get all items where user2 equals the parameter, but if user2 is null and user1 equals the parameter, I still need that item to appear.
Select t1.user1,
t1.user2
From table t1
Where (Case
When t1.user2 IS NULL Then t1.user1 in (:PARUSER)
ELSE t1.user2 in (:PARUSER)
End Case)
Since the relational operator in the WHERE clause depends on PARTYPE, a traditional CASE expression will not do the trick here. I would resort to this instead:
SELECT * FROM x
WHERE (to_date(date) >= (SELECT to_date(sysdate) FROM DNV.dual) AND :PARTYPE = 1)
   OR (to_date(date) <= (SELECT to_date(sysdate) FROM DNV.dual) AND :PARTYPE != 1)
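The OR pattern can be tried quickly with Python's built-in sqlite3 module. This is only a sketch under assumptions, not Oracle code: the dates are ISO strings, "today" is passed in as a fixed parameter so the result is reproducible, and all rows are made up. The second query is one possible way to express the user1/user2 rule from the question with COALESCE, assuming a single-valued parameter compared with = instead of IN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for table x with a single date column d.
conn.execute("CREATE TABLE x (d TEXT)")
conn.executemany(
    "INSERT INTO x VALUES (?)",
    [("2024-01-01",), ("2024-06-01",), ("2024-12-01",)],
)

def rows_for(partype, today="2024-06-01"):
    """Pick the comparison through the parameter, as in the answer's OR pattern."""
    sql = """
        SELECT d FROM x
        WHERE (d >= :today AND :partype = 1)
           OR (d <= :today AND :partype != 1)
        ORDER BY d
    """
    return [r[0] for r in conn.execute(sql, {"today": today, "partype": partype})]

# Hypothetical stand-in for the user1/user2 table from the question.
conn.execute("CREATE TABLE t1 (user1 TEXT, user2 TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?)",
    [("ann", None), ("bob", "ann"), ("cat", "dan")],
)

def items_for(paruser):
    """Compare user2 when it is set, otherwise fall back to user1."""
    sql = """
        SELECT user1, user2 FROM t1
        WHERE COALESCE(user2, user1) = :paruser
        ORDER BY user1
    """
    return conn.execute(sql, {"paruser": paruser}).fetchall()

print(rows_for(1), rows_for(2))
print(items_for("ann"))
```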

Oracle - how to convert a string to row pairs without using a WITH clause

In one of the columns I have role and organization positions.
For example, the position is 1 and the organization is 310492 ...
1|310492|1|12319|1|562548|1|5202558
I need to convert this string to multiple rows
1,310492
1,12319
1,562548
1,5202558
I cannot use a WITH clause because I need to have it as a correlated subquery.
SELECT EXTRACT(VALUE(d), '//row/text()').getstringval()
FROM (
  SELECT XMLTYPE('<rows><row>' || REPLACE(USERPROF.FIELD1, '|', '</row><row>') || '</row></rows>') AS xmlval
  FROM USERPROF
  WHERE FIELD1 IS NOT NULL
) x,
TABLE(XMLSEQUENCE(EXTRACT(x.xmlval, '/rows/row'))) d
However, this converts the entire string into multiple rows, one per value rather than one per pair.
I also tried REGEXP_SUBSTR with CONNECT BY, which did not help: it fetches the content of the entire table, ignoring the WHERE condition.
select regexp_substr(FIELD1,'[^|]+', 1, LEVEL) from USERPROF WHERE USERS_ID = 23502
connect by regexp_substr(FIELD1, '[^|]+', 1, level ) is not null;
Thanks in advance.
The SQL below:
with data as
(select '1|310492|1|12319|1|562548|1|5202558' as x from dual)
select fin from(
select 1+level-1 as occurrence
, instr(x,'|',1,1+level-1) as pos
, nvl(lead(instr(x,'|',1,1+level-1),1) over (order by 1+level-1)
, length(x))
as xxxx
, case when
nvl(lead(instr(x,'|',1,1+level-1),1) over (order by 1+level-1)
, length(x)) = length(x)
then instr(x,'|',1,1+level-1)
else
nvl(lag(instr(x,'|',1,1+level-1),1) over (order by 1+level-1),1) end as yyyy
, substr(x
,case when
nvl(lead(instr(x,'|',1,1+level-1),1) over (order by 1+level-1)
, length(x)) = length(x)
then instr(x,'|',1,1+level-1)
else
nvl(lag(instr(x,'|',1,1+level-1),1) over (order by 1+level-1),1) end
,nvl(lead(instr(x,'|',1,1+level-1),1) over (order by 1+level-1)
, length(x))
- case when
nvl(lead(instr(x,'|',1,1+level-1),1) over (order by 1+level-1)
, length(x)) = length(x)
then instr(x,'|',1,1+level-1)
else
nvl(lag(instr(x,'|',1,1+level-1),1) over (order by 1+level-1),1) end
) as fin
, length(x) as lastrw
from data
connect by level <= length(x) - length(replace(x, '|')) - 1
order by 1) x
where mod(occurrence,2) = 1 or xxxx = lastrw
Results in:
FIN
1|310492
|1|12319
|1|562548
|1|520255
Note that I am using the WITH clause only to supply the sample data you gave.
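For comparison, the pairing that the question asks for (position, organization) can be stated very compactly outside SQL. The Python sketch below is not Oracle code; it only illustrates the target result for the sample string, and the helper name split_pairs is made up.

```python
def split_pairs(text, sep="|"):
    """Split text on sep and group the pieces two by two."""
    parts = text.split(sep)
    # Step through the pieces in pairs; a trailing unpaired piece is ignored.
    return [(parts[i], parts[i + 1]) for i in range(0, len(parts) - 1, 2)]

sample = "1|310492|1|12319|1|562548|1|5202558"

for position, organization in split_pairs(sample):
    print(f"{position},{organization}")
```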