Replace a part of a varchar2 column in Oracle - regex

I have a varchar2 column in a table which contains entries like the following:
TEMPORARY-2 TIME ECS BOUND -04-Insuficient Balance
I want to update these entries to TEMPORARY-2 X. How can I do that?

To accomplish this, you can use either character functions such as substr() and replace(), or a regular expression function such as regexp_replace().
SQL> with t1(col) as(
2 select 'TEMPORARY-2 TIME ECS BOUND -04-Insuficient Balance'
3 from dual
4 )
5 select concat(substr( col, 1, 11), ' X') as res_1
6 , regexp_replace(col, '^(\w+-\d+)(.*)', '\1 X') as res_2
7 from t1
8 ;
Result:
RES_1 RES_2
------------- -------------
TEMPORARY-2 X TEMPORARY-2 X
So your update statement may look like this:
update your_table t
set t.col_name = regexp_replace(col_name, '^(\w+-\d+)(.*)', '\1 X')
-- where clause if needed.
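If you'd rather stick to plain character functions, an equivalent update might look like this (a sketch, assuming every value has a space right after the TEMPORARY-n prefix, as in the sample row):
update your_table t
set t.col_name = substr(t.col_name, 1, instr(t.col_name, ' ')) || 'X'
-- where clause if needed.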

Related

Stored procedure for excluding data is not working as expected in Oracle

I have written a query where I want to exclude the rows whose values contain _900. Below is the query:
SELECT
TO_CHAR(TRIM(RJ_SPAN_ID)) AS SPAN_ID,TO_CHAR(RJ_MAINTENANCE_ZONE_CODE) AS MAINT_ZONE_CODE,RJ_INTRACITY_LINK_ID
FROM NE.MV_SPAN#DB_LINK_NE_VIEWER
--FROM APP_FTTX.SPAN_2#SAT
WHERE
LENGTH(TRIM(RJ_SPAN_ID)) = 21
--AND REGEXP_LIKE(TRIM(RJ_SPAN_ID), 'SP(N|Q|R|S)*.+_(BU|MP)$','i')
--AND (NOT REGEXP_LIKE(TRIM(RJ_SPAN_ID), '(_U)$|(/)$','i')
--AND REGEXP_LIKE(RJ_INTRACITY_LINK_ID, '(%*_9%)','i')--)
AND INVENTORY_STATUS_CODE = 'IPL'
AND RJ_MAINTENANCE_ZONE_CODE = 'INORBNPN01'
and RJ_SPAN_ID = 'ORKPRKORKONASPR001_BU';
I tried all of the commented REGEXP conditions, but none of them work.
Why not LIKE?
SQL> with test (col) as
2 (select '900' from dual union all --> doesn't contain _900
3 select '123_900AB' from dual union all --> contains _900
4 select '_900' from dual union all --> contains _900
5 select 'ab900cd' from dual --> doesn't contain _900
6 )
7 select *
8 from test
9 where col not like '%\_900%' escape '\';
COL
---------
900
ab900cd
SQL>
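Applied to your query, the same condition would be something along these lines (a sketch; point it at whichever column actually carries the _900 values):
AND RJ_INTRACITY_LINK_ID NOT LIKE '%\_900%' ESCAPE '\'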

Email validation for Apex application text item field

I have an Apex application text item where users enter email ids separated by commas.
Now I want to validate whether the email addresses entered in the application item are correct.
How can I achieve this?
I am using the code below, but it's not working.
declare
l_cnt varchar2(1000);
l_requestors_name varchar2(4000);
begin
select apex_item.text(1) Member
into l_requestors_name
from dual;
if not l_requestors_name not like '%@%' then
return true;
else
return false;
end if;
end;
I'd suggest you create a function which returns a Boolean or, as in my example, a character (Y - yes, it is valid; N - no, it isn't valid). Why a character? Because you can use such a function in SQL; a Boolean works in PL/SQL, but my example is pure SQL.
I guess it isn't perfect, but it should be way better than just testing whether a string someone entered contains a monkey (@).
SQL> create or replace
2 function f_email_valid (par_email in varchar2)
3 return varchar2
4 is
5 begin
6 return
7 case when regexp_substr (
8 par_email,
9 '[a-zA-Z0-9._%-]+@[a-zA-Z0-9._%-]+\.[a-zA-Z]{2,4}')
10 is not null
11 or par_email is null then 'Y'
12 else 'N'
13 end;
14 end f_email_valid;
15 /
Function created.
SQL>
As the user can enter several e-mail addresses separated by commas, you'll have to split them into rows and then check each of them. Have a look:
SQL> with test (text) as
2 -- sample data
3 (select 'littlefoot@gmail.com,bigfootyahoo.com,a@@hotmail.com,b123@freemail.hr' from dual),
4 split_emails as
5 -- split that long comma-separated values column into rows
6 (select regexp_substr(text, '[^,]+', 1, level) email
7 from test
8 connect by level <= regexp_count(text, ',') + 1
9 )
10 -- check every e-mail
11 select email, f_email_valid(email) is_valid
12 from split_emails;
EMAIL IS_VALID
------------------------------ --------------------
littlefoot@gmail.com Y
bigfootyahoo.com N
a@@hotmail.com N
b123@freemail.hr Y
SQL>
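To wire this into the Apex validation itself (validation type "PL/SQL Function Body Returning Boolean"), you could split the item value and fail on the first invalid address. A minimal sketch, assuming a hypothetical page item named P1_EMAILS:
declare
l_email varchar2(4000);
begin
for i in 1 .. regexp_count(:P1_EMAILS, ',') + 1 loop
l_email := trim(regexp_substr(:P1_EMAILS, '[^,]+', 1, i));
-- f_email_valid is the function created above
if f_email_valid(l_email) = 'N' then
return false;
end if;
end loop;
return true;
end;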

Redshift. Convert comma delimited values into rows

I am wondering how to convert comma-delimited values into rows in Redshift. I am afraid that my own solution isn't optimal. Please advise. I have a table in which one of the columns contains comma-separated values. For example:
I have:
user_id|user_name|user_action
-----------------------------
1 | Shone | start,stop,cancell...
I would like to see
user_id|user_name|parsed_action
-------------------------------
1 | Shone | start
1 | Shone | stop
1 | Shone | cancell
....
A slight improvement over the existing answer is to use a second "numbers" table that enumerates all of the possible list lengths and then use a cross join to make the query more compact.
Redshift does not have a straightforward method for creating a numbers table that I am aware of, but we can use a bit of a hack from https://www.periscope.io/blog/generate-series-in-redshift-and-mysql.html to create one using row numbers.
Specifically, if we assume the number of rows in cmd_logs is larger than the maximum number of commas in the user_action column, we can create a numbers table by counting rows. To start, let's assume there are at most 99 commas in the user_action column:
select
(row_number() over (order by true))::int as n
into numbers
from cmd_logs
limit 100;
If we want to get fancy, we can compute the number of commas from the cmd_logs table to create a more precise set of rows in numbers:
select
n::int
into numbers
from
(select
row_number() over (order by true) as n
from cmd_logs)
cross join
(select
max(regexp_count(user_action, '[,]')) as max_num
from cmd_logs)
where
n <= max_num + 1;
Once there is a numbers table, we can do:
select
user_id,
user_name,
split_part(user_action,',',n) as parsed_action
from
cmd_logs
cross join
numbers
where
split_part(user_action,',',n) is not null
and split_part(user_action,',',n) != '';
Another idea is to transform your CSV string into JSON first, followed by JSON extract, along the following lines:
... '["' || replace( user_action, ',', '", "' ) || '"]' AS replaced
... JSON_EXTRACT_ARRAY_ELEMENT_TEXT(replaced, numbers.n - 1) AS parsed_action
Where "numbers" is the table from the first answer (note the n - 1: the JSON array index is 0-based, while n starts at 1). The advantage of this approach is the ability to use built-in JSON functionality.
If you know that there are not many actions in your user_action column, you can use recursive sub-querying with union all and thereby avoid the aux numbers table.
It requires you to know the number of actions for each user, so either adjust the initial table or make a view or a temporary table for it.
Data preparation
Assuming you have something like this as a table:
create temporary table actions
(
user_id varchar,
user_name varchar,
user_action varchar
);
I'll insert some values in it:
insert into actions
values (1, 'Shone', 'start,stop,cancel'),
(2, 'Gregory', 'find,diagnose,taunt'),
(3, 'Robot', 'kill,destroy');
Here's an additional table with a temporary count:
create temporary table actions_with_counts
(
id varchar,
name varchar,
num_actions integer,
actions varchar
);
insert into actions_with_counts (
select user_id,
user_name,
regexp_count(user_action, ',') + 1 as num_actions,
user_action
from actions
);
This would be our "input table", and it looks just as you expected:
select * from actions_with_counts;

id | name    | num_actions | actions
---|---------|-------------|--------------------
2  | Gregory | 3           | find,diagnose,taunt
3  | Robot   | 2           | kill,destroy
1  | Shone   | 3           | start,stop,cancel
Again, you can adjust the initial table instead and skip adding the counts as a separate table.
Sub-query to flatten the actions
Here's the unnesting query:
with recursive tmp (user_id, user_name, idx, user_action) as
(
select id,
name,
1 as idx,
split_part(actions, ',', 1) as user_action
from actions_with_counts
union all
select user_id,
user_name,
idx + 1 as idx,
split_part(actions, ',', idx + 1)
from actions_with_counts
join tmp on actions_with_counts.id = tmp.user_id
where idx < num_actions
)
select user_id, user_name, user_action as parsed_action
from tmp
order by user_id;
This will create a new row for each action, and the output would look like this:

user_id | user_name | parsed_action
--------|-----------|--------------
1       | Shone     | start
1       | Shone     | stop
1       | Shone     | cancel
2       | Gregory   | find
2       | Gregory   | diagnose
2       | Gregory   | taunt
3       | Robot     | kill
3       | Robot     | destroy
Here are two ways to achieve this.
In my example, I'm assuming that I am accepting a comma-separated list of values. My values look like schema.table.column.
The first involves using a recursive CTE.
drop table if exists #dep_tbl;
create table #dep_tbl as
select 'schema.foobar.insert_ts,schema.baz.load_ts' as dep
;
with recursive tmp (level, dep_split, to_split) as
(
select 1 as level
, split_part(dep, ',', 1) as dep_split
, regexp_count(dep, ',') as to_split
from #dep_tbl
union all
select tmp.level + 1 as level
, split_part(a.dep, ',', tmp.level + 1) as dep_split_u
, tmp.to_split
from #dep_tbl a
inner join tmp on tmp.dep_split is not null
and tmp.level <= tmp.to_split
)
select dep_split from tmp;
the above yields:
|dep_split|
|schema.foobar.insert_ts|
|schema.baz.load_ts|
The second involves a stored procedure.
CREATE OR REPLACE PROCEDURE so_test(dependencies_csv varchar(max))
LANGUAGE plpgsql
AS $$
DECLARE
dependencies_csv_vals varchar(max);
BEGIN
drop table if exists #dep_holder;
create table #dep_holder
(
avoid varchar(60000)
);
IF dependencies_csv is not null THEN
dependencies_csv_vals:='('||replace(quote_literal(regexp_replace(dependencies_csv,'\\s','')),',', '\'),(\'') ||')';
execute 'insert into #dep_holder values '||dependencies_csv_vals||';';
END IF;
END;
$$
;
call so_test('schema.foobar.insert_ts,schema.baz.load_ts');
select
*
from
#dep_holder;
the above yields:
|dep_split|
|schema.foobar.insert_ts|
|schema.baz.load_ts|
In conclusion:
If you only care about one single column in your input (the X-delimited values), then I think the stored procedure is easier/faster.
However, if you have other columns you care about and want to keep those columns along with your comma-separated-value column now transformed to rows, or if you want to know the argument (the original list of delimited values), I think the recursive CTE is the way to go. In that case, you can just add those other columns to the columns selected in the recursive query.
You can get the expected result with the following query. I'm using UNION ALL to convert the column into rows.
select user_id, user_name, split_part(user_action,',',1) as parsed_action from cmd_logs
union all
select user_id, user_name, split_part(user_action,',',2) as parsed_action from cmd_logs
union all
select user_id, user_name, split_part(user_action,',',3) as parsed_action from cmd_logs
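Note that split_part returns an empty string when the index exceeds the number of parts, so if some rows have fewer than three actions you'll probably want to filter those out. A sketch:
select user_id, user_name, parsed_action
from (
select user_id, user_name, split_part(user_action, ',', 1) as parsed_action from cmd_logs
union all
select user_id, user_name, split_part(user_action, ',', 2) from cmd_logs
union all
select user_id, user_name, split_part(user_action, ',', 3) from cmd_logs
) t
where parsed_action <> '';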
Here's my equally terrible answer.
I have a users table, and an events table with a column that is just a comma-delimited string of users at said event, e.g.
event_id | user_ids
1 | 5,18,25,99,105
In this case, I used the LIKE and wildcard functions to build a new table that represents each event-user edge.
SELECT e.event_id, u.id as user_id
FROM events e
LEFT JOIN users u ON e.user_ids like '%' || u.id || '%'
It's not pretty, but I throw it in a WITH clause so that I don't have to run it more than once per query. I'll likely just build an ETL to create that table every night anyway.
Also, this only works if you have a second table that does have one row per unique possibility. If not, you could do LISTAGG to get a single cell with all your values, export that to a CSV and reupload that as a table to help.
Like I said: a terrible, no-good solution.
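For reference, the LISTAGG step mentioned above could look something like this (a sketch, assuming the users table has an integer id column):
select listagg(id::varchar, ',') within group (order by id) as all_user_ids
from users;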
Late to the party, but I got something working (albeit very slowly):
with nums as (select n::int n
from
(select
row_number() over (order by true) as n
from table_with_enough_rows_to_cover_range)
cross join
(select
max(json_array_length(json_column)) as max_num
from table_with_json_column )
where
n <= max_num + 1)
select *, json_extract_array_element_text(json_column,nums.n-1) parsed_json
from nums, table_with_json_column
where json_extract_array_element_text(json_column,nums.n-1) != ''
and nums.n <= json_array_length(json_column)
Thanks to the answer by Bob Baxley for the inspiration.
Just an improvement on the answer above (https://stackoverflow.com/a/31998832/1265306): generate the numbers table using the following SQL, taken from https://discourse.looker.com/t/generating-a-numbers-table-in-mysql-and-redshift/482
SELECT
p0.n
+ p1.n*2
+ p2.n * POWER(2,2)
+ p3.n * POWER(2,3)
+ p4.n * POWER(2,4)
+ p5.n * POWER(2,5)
+ p6.n * POWER(2,6)
+ p7.n * POWER(2,7)
as number
INTO numbers
FROM
(SELECT 0 as n UNION SELECT 1) p0,
(SELECT 0 as n UNION SELECT 1) p1,
(SELECT 0 as n UNION SELECT 1) p2,
(SELECT 0 as n UNION SELECT 1) p3,
(SELECT 0 as n UNION SELECT 1) p4,
(SELECT 0 as n UNION SELECT 1) p5,
(SELECT 0 as n UNION SELECT 1) p6,
(SELECT 0 as n UNION SELECT 1) p7
ORDER BY 1
LIMIT 100
"ORDER BY" is there only in case you want paste it without the INTO clause and see the results
Create a stored procedure that parses the string dynamically and populates a temp table, then select from the temp table.
Here is the magic code:
CREATE OR REPLACE PROCEDURE public.sp_string_split( "string" character varying )
AS $$
DECLARE
cnt INTEGER := 1;
no_of_parts INTEGER := (select REGEXP_COUNT ( string , ',' ));
sql VARCHAR(MAX) := '';
item character varying := '';
BEGIN
-- Create table
sql := 'CREATE TEMPORARY TABLE IF NOT EXISTS split_table (part VARCHAR(255)) ';
RAISE NOTICE 'executing sql %', sql ;
EXECUTE sql;
<<simple_loop_exit_continue>>
LOOP
item = (select split_part("string",',',cnt));
RAISE NOTICE 'item %', item ;
sql := 'INSERT INTO split_table SELECT '''||item||''' ';
EXECUTE sql;
cnt = cnt + 1;
EXIT simple_loop_exit_continue WHEN (cnt >= no_of_parts + 2);
END LOOP;
END ;
$$ LANGUAGE plpgsql;
Usage example:
call public.sp_string_split('john,smith,jones');
select *
from split_table
You can try the COPY command to copy your file into Redshift tables, using the delimiter ',' option:
copy table_name from 's3://mybucket/myfolder/my.csv' CREDENTIALS 'aws_access_key_id=my_aws_acc_key;aws_secret_access_key=my_aws_sec_key' delimiter ','
For more details on COPY command options, see http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html

Split multiple delimited string into unique rows - basically return unique words of sentences from table

There are a lot of different posts out there on this subject, but I really can't find one suitable for my project.
I have a table with 4 varchar2 columns of lengths 20, 60, 72 and 160, containing approximately 700,000 records with data on items/products.
Example of table:
Text                  Id  SHNAM
LEVI,GRADY Whitley    1   007C
Levi Grady;Whitley    2   0001
BEVIS,GRADY Leblanc   3   007D
Aladdin Grady;Green   4   0002
ULLA,GRADY Holman     5   0003
From this table I would like to populate a new table or a materialized view of every unique word. Delimiters used are either space, comma or semicolon (', ;').
Expected output:
OUTPUT
Levi
GRADY
Whitley
BEVIS
Leblanc
Aladdin
Green
ULLA
Holman
Note that the check is not case sensitive.
E.g. this blog post applies to your question: Splitting a comma delimited string the RegExp way, Part Two. My answer is derived directly from the blog:
with data_(id_, str) as (
select 1, 'LEVI,GRADY Whitley' from dual union all
select 2, 'Levi Grady;Whitley' from dual union all
select 3, 'BEVIS,GRADY Leblanc' from dual union all
select 4, 'aladdin grady;green' from dual union all
select 5, 'ULLA,GRADY Holman' from dual union all
select 6, '1aar,1bar;1car 1dar,1ear' from dual
)
select distinct lower(regexp_substr(str, '[^,;[:space:]]+', 1, rownum_)) as splitted
from data_
cross join (select rownum as rownum_
from (select max(regexp_count(str, '[,;[:space:]]')) + 1 as max_
from data_
)
connect by level <= max_
)
where regexp_substr(str, '[^,;[:space:]]+', 1, rownum_) is not null
order by splitted
;
Note that this query doesn't produce exactly the output you listed in the question for ids 1 to 5. You expected Levi (with initcap) and GRADY (all caps) even though both names appear in both variations; this is inconsistent, so I simply ignored it.
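To actually populate a new table, as the question asks, you can wrap the query in a CTAS. A sketch, assuming the source text sits in a column str of a hypothetical your_table (in practice you would repeat this for each of the four varchar2 columns, or concatenate them first):
create table unique_words as
select distinct lower(regexp_substr(str, '[^,;[:space:]]+', 1, rownum_)) as word
from your_table
cross join (select rownum as rownum_
from (select max(regexp_count(str, '[,;[:space:]]')) + 1 as max_
from your_table)
connect by level <= max_)
where regexp_substr(str, '[^,;[:space:]]+', 1, rownum_) is not null;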

How do I extract a pattern from a table in Oracle 11g?

I want to extract text from a column using regular expressions in Oracle 11g. I have 2 queries that do the job but I'm looking for a (cleaner/nicer) way to do it. Maybe combining the queries into one or a new equivalent query. Here they are:
Query 1: identify rows that match a pattern:
select column1 from table1 where regexp_like(column1, pattern);
Query 2: extract all matched text from a matching row.
select regexp_substr(matching_row, pattern, 1, level)
from dual
connect by level < regexp_count(matching_row, pattern);
I use PL/SQL to glue these 2 queries together, but it's messy and clumsy. How can I combine them into one query? Thank you.
UPDATE: sample data for pattern 'BC':
row 1: ABCD
row 2: BCFBC
row 3: HIJ
row 4: GBC
Expected result is a table of 4 rows of 'BC'.
You can also do it in one query, functions/procedures/packages not required:
WITH t1 AS (
SELECT 'ABCD' c1 FROM dual
UNION
SELECT 'BCFBC' FROM dual
UNION
SELECT 'HIJ' FROM dual
UNION
SELECT 'GBC' FROM dual
)
SELECT c1, regexp_substr(c1, 'BC', 1, d.l, 'i') thePattern, d.l occurrence
FROM t1 CROSS JOIN (SELECT LEVEL l FROM dual CONNECT BY LEVEL < 200) d
WHERE regexp_like(c1,'BC','i')
AND d.l <= regexp_count(c1,'BC');
C1 THEPATTERN OCCURRENCE
----- -------------------- ----------
ABCD BC 1
BCFBC BC 1
BCFBC BC 2
GBC BC 1
SQL>
I've arbitrarily limited the number of occurrences to search for to 200; YMMV.
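If you'd rather not hard-code the 200, you can derive the generator's row count from the data itself, using the same max-count trick as in the word-splitting answer earlier. A sketch over the same sample data:
WITH t1 AS (
SELECT 'ABCD' c1 FROM dual
UNION
SELECT 'BCFBC' FROM dual
UNION
SELECT 'HIJ' FROM dual
UNION
SELECT 'GBC' FROM dual
)
SELECT c1, regexp_substr(c1, 'BC', 1, d.l, 'i') thePattern, d.l occurrence
FROM t1 CROSS JOIN (SELECT LEVEL l
                    FROM (SELECT MAX(regexp_count(c1, 'BC', 1, 'i')) max_occ FROM t1)
                    CONNECT BY LEVEL <= max_occ) d
WHERE regexp_like(c1, 'BC', 'i')
AND d.l <= regexp_count(c1, 'BC');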
Actually there is an elegant way to do this in one query, if you do not mind going a few extra miles. Please note that this is just a sketch; I have not run it, so you may have to correct a few typos.
create or replace package yo_package is
type word_t is record (word varchar2(4000));
type words_t is table of word_t;
end;
/
create or replace package body yo_package is
function table_function(in_cur in sys_refcursor, pattern in varchar2)
return words_t
pipelined parallel_enable (partition in_cur by any)
is
next varchar2(4000);
match varchar2(4000);
occurrence pls_integer;
word_rec word_t;
begin
loop
fetch in_cur into next;
exit when in_cur%notfound;
-- inner loop: walk through the matches within next, one occurrence
-- at a time, and pipe each one out as a row
occurrence := 1;
loop
match := regexp_substr(next, pattern, 1, occurrence);
exit when match is null;
word_rec.word := match;
pipe row (word_rec);
occurrence := occurrence + 1;
end loop;
end loop;
return;
end table_function;
end;
/
select *
from table(
yo_package.table_function(
cursor(
--this is your first select
select column1 from table1 where regexp_like(column1, pattern)
)
)
);