IF + AND / OR logic inside of a query

IF + AND / OR logic inside of a query - if-statement

below is an example document I have shared:
https://docs.google.com/spreadsheets/d/1WuQIqn8DA12R0mNFGMdjJahQ0eNoxKODpSwopk7KoYU/edit#gid=0
My data is simple table:
I want to do the following:
For starting cell K7 on patient tab
I want to query the call log tab for
two main conditions.
Query select loqic: return rows D,E,F,A when certain conditions are met:
if text colC equals text in patient tab cell c7 AND col D says "No beds Available" And colI shows time left to calling greater than 0
OR If not than:
if col B=cell H3 in patient tab, and Col C= Cell C7 in patient tab
Thank you for your help

My example could help you.
Suppose you have a small data, like this, columns A:D:
Then you may use query state with two or more OR conditions, but insert them into parentheses. Sample formula:
=QUERY({A:D},"select Col1, Col2, Col3, Col4 where (Col1 < 7 and Col3 = 'c') or (Col2 = 'a' and Col4 > 0)")
To use Col1, Col2, Col3... notation inside query, data must be inside {}

Related

Count and filter on the basis of third column in informatica

Like I have a question
Col1 Col2 Col3
45321_320 A Y
45321_320 A N
76453-10 A Y
45638_80 A Y
So we need to count the no of rows that have same col1 for example the first two rows should be considered as count=2 and rest as count=1 and after that count=2 or more that records need to filtered out on the basis of Col3=Y, so how we can do that in informatica
https://i.stack.imgur.com/JkxnG.png

This is little tricky. Pls follow below steps.
Sort the data base on col1.
Use agg to aggregate. Create a new col called count_col1.
create another col, cnt_col3_y = count(*, col3=y)
Join agg output with sorter output based on col1.
Put a filter. Logic should be
iif( count_col1>1 and cnt_col3>0, false, true)
Link output of filter to target.
This will generate output like below.
Col1 Col2 Col3
76453-10 A Y
45638_80 A Y
If you want different output let me know.

Informatica - SQ transformation

What will be the expected result of the below.
I have table A with column1,
I'm trying to map column1 to SQ, which has 3 columns - col1, col2 and col3.
Link Column1 to col1,col2 and col3 in SQ. Now when I try to generate SQL query for SQ, what will be the result?

Since OP is waiting for answer and doesnt have informatica to test it out, let me answer to that.
if you connect one column to three columns in SQ and then connect all those three columns to next transformation, then your generated SQL will contain one column repeated thrice from source.
Here are some screenshot from a dummy map i created.
mapping screenshot -
Then here is generate SQL -
SELECT
ITEM.ITEM_NUM, ITEM.ITEM_NUM, ITEM.ITEM_NUM
FROM
ITEM

How to update multiple columns in same update statement with one column depends upon another new column new value in Redshift

I want to update multiple columns in same update statement with one column depends upon another new column new value.
Example:
Sample Data: col1 and col2 is the column names and test_update is the table name.
SELECT * FROM test_update;
col1 col2
col-1 col-2
col-1 col-2
col-1 col-2
update test_update set col1 = 'new', col2=col1||'-new';
SELECT * FROM test_update;
col1 col2
new col-1-new
new col-1-new
new col-1-new
What I need to achieve is col2 is updated as new-new as we updated value of col1 is new.
I think may be its not possible in one SQL statement. If possible How can we do that, If its not What is best way of handling this problem in Data Warehouse environment, like execute multiple update 1st on col1 and then on col2 or any other.
Hoping my question is clear.

You cannot update the second column based on the result of updating the first column. However this can be achieved in a single by "pre-calculating" the result you want and then updating based on that.
The following update using a join is based on the example provided in the Redshift documentation:
UPDATE test_update
SET col1 = precalc.col1
, col2 = precalc.col2
FROM (
SELECT catid
, 'new' AS col1
, col1 || '-new' AS col2
FROM test_update
) precalc
WHERE test_update.id = precalc.id;
;

How to create new column in power bi using given string match condition in first column and get value from another column, make new column?

My table is as follow
Col1 Col2
11_A 9
12_B 8
13_C 7
14_A 6
15_A 4
The table we need after the query
Col1 Col2 Col3
11_A 0 9
12_B 8 0
13_C 7 0
14_A 0 6
15_A 0 4
My query is
Col3 =
LEFT( 'Table'[Col2],
SEARCH("A", 'Table'[Col1], 0,
LEN('Table'[Col1])
)
)

Go to the query designer Add Column > Custom Column and use the following expression:
Update
You need two expressions (two new columns) for this:
One is:
'Your Column3
=if Text.Contains([Col1], "A") = true then [Col2] else 0
And the second:
'Your Column2
=if Text.Contains([Col1], "A") = false then [Col2] else 0

There are many ways to solve this,
Another easy way I like to do this with no-coding is to use Conditional Columns:
In PBI select Power Query Editor
Select your table on the edge of the screen
Select Add Column tab
Select Conditional Columns...
Name your column
Enter your condition as in the picture
You can add several conditions if you like
Don't forget to format your column to numeric if needed.
see picture
Adding columns using Conditional Column

Redshift. Convert comma delimited values into rows

I am wondering how to convert comma-delimited values into rows in Redshift. I am afraid that my own solution isn't optimal. Please advise. I have table with one of the columns with coma-separated values. For example:
I have:
user_id|user_name|user_action
-----------------------------
1 | Shone | start,stop,cancell...
I would like to see
user_id|user_name|parsed_action
-------------------------------
1 | Shone | start
1 | Shone | stop
1 | Shone | cancell
....

A slight improvement over the existing answer is to use a second "numbers" table that enumerates all of the possible list lengths and then use a cross join to make the query more compact.
Redshift does not have a straightforward method for creating a numbers table that I am aware of, but we can use a bit of a hack from https://www.periscope.io/blog/generate-series-in-redshift-and-mysql.html to create one using row numbers.
Specifically, if we assume the number of rows in cmd_logs is larger than the maximum number of commas in the user_action column, we can create a numbers table by counting rows. To start, let's assume there are at most 99 commas in the user_action column:
select
(row_number() over (order by true))::int as n
into numbers
from cmd_logs
limit 100;
If we want to get fancy, we can compute the number of commas from the cmd_logs table to create a more precise set of rows in numbers:
select
n::int
into numbers
from
(select
row_number() over (order by true) as n
from cmd_logs)
cross join
(select
max(regexp_count(user_action, '[,]')) as max_num
from cmd_logs)
where
n <= max_num + 1;
Once there is a numbers table, we can do:
select
user_id,
user_name,
split_part(user_action,',',n) as parsed_action
from
cmd_logs
cross join
numbers
where
split_part(user_action,',',n) is not null
and split_part(user_action,',',n) != '';

Another idea is to transform your CSV string into JSON first, followed by JSON extract, along the following lines:
... '["' || replace( user_action, '.', '", "' ) || '"]' AS replaced
... JSON_EXTRACT_ARRAY_ELEMENT_TEXT(replaced, numbers.i) AS parsed_action
Where "numbers" is the table from the first answer. The advantage of this approach is the ability to use built-in JSON functionality.

If you know that there are not many actions in your user_action column, you use recursive sub-querying with union all and therefore avoiding the aux numbers table.
But it requires you to know the number of actions for each user, either adjust initial table or make a view or a temporary table for it.
Data preparation
Assuming you have something like this as a table:
create temporary table actions
(
user_id varchar,
user_name varchar,
user_action varchar
);
I'll insert some values in it:
insert into actions
values (1, 'Shone', 'start,stop,cancel'),
(2, 'Gregory', 'find,diagnose,taunt'),
(3, 'Robot', 'kill,destroy');
Here's an additional table with temporary count
create temporary table actions_with_counts
(
id varchar,
name varchar,
num_actions integer,
actions varchar
);
insert into actions_with_counts (
select user_id,
user_name,
regexp_count(user_action, ',') + 1 as num_actions,
user_action
from actions
);
This would be our "input table" and it looks just as you expected
select * from actions_with_counts;
id
name
num_actions
actions
2
Gregory
3
find,diagnose,taunt
3
Robot
2
kill,destroy
1
Shone
3
start,stop,cancel
Again, you can adjust initial table and therefore skipping adding counts as a separate table.
Sub-query to flatten the actions
Here's the unnesting query:
with recursive tmp (user_id, user_name, idx, user_action) as
(
select id,
name,
1 as idx,
split_part(actions, ',', 1) as user_action
from actions_with_counts
union all
select user_id,
user_name,
idx + 1 as idx,
split_part(actions, ',', idx + 1)
from actions_with_counts
join tmp on actions_with_counts.id = tmp.user_id
where idx < num_actions
)
select user_id, user_name, user_action as parsed_action
from tmp
order by user_id;
This will create a new row for each action, and the output would look like this:
user_id
user_name
parsed_action
1
Shone
start
1
Shone
stop
1
Shone
cancel
2
Gregory
find
2
Gregory
diagnose
2
Gregory
taunt
3
Robot
kill
3
Robot
destroy

Here are two ways to achieve this.
In my example, I'm assuming that I am accepting a comma separated list of values. My values look like schema.table.column.
The first involves using a recursive CTE.
drop table if exists #dep_tbl;
create table #dep_tbl as
select 'schema.foobar.insert_ts,schema.baz.load_ts' as dep
;
with recursive tmp (level, dep_split, to_split) as
(
select 1 as level
, split_part(dep, ',', 1) as dep_split
, regexp_count(dep, ',') as to_split
from #dep_tbl
union all
select tmp.level + 1 as level
, split_part(a.dep, ',', tmp.level + 1) as dep_split_u
, tmp.to_split
from #dep_tbl a
inner join tmp on tmp.dep_split is not null
and tmp.level <= tmp.to_split
)
select dep_split from tmp;
the above yields:
|dep_split|
|schema.foobar.insert_ts|
|schema.baz.load_ts|
The second involves a stored procedure.
CREATE OR REPLACE PROCEDURE so_test(dependencies_csv varchar(max))
LANGUAGE plpgsql
AS $$
DECLARE
dependencies_csv_vals varchar(max);
BEGIN
drop table if exists #dep_holder;
create table #dep_holder
(
avoid varchar(60000)
);
IF dependencies_csv is not null THEN
dependencies_csv_vals:='('||replace(quote_literal(regexp_replace(dependencies_csv,'\\s','')),',', '\'),(\'') ||')';
execute 'insert into #dep_holder values '||dependencies_csv_vals||';';
END IF;
END;
$$
;
call so_test('schema.foobar.insert_ts,schema.baz.load_ts')
select
*
from
#dep_holder;
the above yields:
|dep_split|
|schema.foobar.insert_ts|
|schema.baz.load_ts|
in conclusion
If you only care about one single column in your input (the X delimited values), then I think the stored procedure is easier/faster.
However, if you have other columns you care about and want to keep those columns along with your comma separated value column now transformed to rows, OR, if you want to know the argument (original list of delimited values), I think the stored procedure is the way to go. In that case, you can just add those other columns to your columns selected in the recursive query.

You can get the expected result with the following query. I'm using "UNION ALL" to convert a column to row.
select user_id, user_name, split_part(user_action,',',1) as parsed_action from cmd_logs
union all
select user_id, user_name, split_part(user_action,',',2) as parsed_action from cmd_logs
union all
select user_id, user_name, split_part(user_action,',',3) as parsed_action from cmd_logs

Here's my equally-terrible answer.
I have a users table, and then an events table with a column that is just a comma-delimited string of users at said event. eg
event_id | user_ids
1 | 5,18,25,99,105
In this case, I used the LIKE and wildcard functions to build a new table that represents each event-user edge.
SELECT e.event_id, u.id as user_id
FROM events e
LEFT JOIN users u ON e.user_ids like '%' || u.id || '%'
It's not pretty, but I throw it in a WITH clause so that I don't have to run it more than once per query. I'll likely just build an ETL to create that table every night anyway.
Also, this only works if you have a second table that does have one row per unique possibility. If not, you could do LISTAGG to get a single cell with all your values, export that to a CSV and reupload that as a table to help.
Like I said: a terrible, no-good solution.

Late to the party but I got something working (albeit very slow though)
with nums as (select n::int n
from
(select
row_number() over (order by true) as n
from table_with_enough_rows_to_cover_range)
cross join
(select
max(json_array_length(json_column)) as max_num
from table_with_json_column )
where
n <= max_num + 1)
select *, json_extract_array_element_text(json_column,nums.n-1) parsed_json
from nums, table_with_json_column
where json_extract_array_element_text(json_column,nums.n-1) != ''
and nums.n <= json_array_length(json_column)
Thanks to answer by Bob Baxley for inspiration

Just improvement for the answer above https://stackoverflow.com/a/31998832/1265306
Is generating numbers table using the following SQL
https://discourse.looker.com/t/generating-a-numbers-table-in-mysql-and-redshift/482
SELECT
p0.n
+ p1.n*2
+ p2.n * POWER(2,2)
+ p3.n * POWER(2,3)
+ p4.n * POWER(2,4)
+ p5.n * POWER(2,5)
+ p6.n * POWER(2,6)
+ p7.n * POWER(2,7)
as number
INTO numbers
FROM
(SELECT 0 as n UNION SELECT 1) p0,
(SELECT 0 as n UNION SELECT 1) p1,
(SELECT 0 as n UNION SELECT 1) p2,
(SELECT 0 as n UNION SELECT 1) p3,
(SELECT 0 as n UNION SELECT 1) p4,
(SELECT 0 as n UNION SELECT 1) p5,
(SELECT 0 as n UNION SELECT 1) p6,
(SELECT 0 as n UNION SELECT 1) p7
ORDER BY 1
LIMIT 100
"ORDER BY" is there only in case you want paste it without the INTO clause and see the results

create a stored procedure that will parse string dynamically and populatetemp table, select from temp table.
here is the magic code:-
CREATE OR REPLACE PROCEDURE public.sp_string_split( "string" character varying )
AS $$
DECLARE
cnt INTEGER := 1;
no_of_parts INTEGER := (select REGEXP_COUNT ( string , ',' ));
sql VARCHAR(MAX) := '';
item character varying := '';
BEGIN
-- Create table
sql := 'CREATE TEMPORARY TABLE IF NOT EXISTS split_table (part VARCHAR(255)) ';
RAISE NOTICE 'executing sql %', sql ;
EXECUTE sql;
<<simple_loop_exit_continue>>
LOOP
item = (select split_part("string",',',cnt));
RAISE NOTICE 'item %', item ;
sql := 'INSERT INTO split_table SELECT '''||item||''' ';
EXECUTE sql;
cnt = cnt + 1;
EXIT simple_loop_exit_continue WHEN (cnt >= no_of_parts + 2);
END LOOP;
END ;
$$ LANGUAGE plpgsql;
Usage example:-
call public.sp_string_split('john,smith,jones');
select *
from split_table

You can try copy command to copy your file into redshift tables
copy table_name from 's3://mybucket/myfolder/my.csv' CREDENTIALS 'aws_access_key_id=my_aws_acc_key;aws_secret_access_key=my_aws_sec_key' delimiter ','
You can use delimiter ',' option.
For more details of copy command options you can visit this page
http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

IF + AND / OR logic inside of a query - if-statement

Related

Count and filter on the basis of third column in informatica

Informatica - SQ transformation

How to update multiple columns in same update statement with one column depends upon another new column new value in Redshift

How to create new column in power bi using given string match condition in first column and get value from another column, make new column?

Redshift. Convert comma delimited values into rows

Categories

Resources