How to write the pattern in regular expression matching in PL/SQL?

I have written a substring regular expression in Oracle, but I am having trouble getting the pattern to match correctly. The query first fetches the DDL of the trigger into a string and then tries to extract the table's column names from it.
Trigger DDL
CREATE OR REPLACE TRIGGER "SHIVAMG"."DVJ_CI_CURRENCY_CD_L_IU"
BEFORE INSERT OR UPDATE ON CI_CURRENCY_CD_L
FOR EACH ROW
BEGIN
IF INSERTING THEN
IF (UPPER(:NEW.CURRENCY_CD) NOT LIKE 'ZZ%') THEN
INSERT INTO JUNITUSR.CI_CURRENCY_CD_L
(CURRENCY_CD,
LANGUAGE_CD,
DESCR,
VERSION)
SELECT :NEW.CURRENCY_CD,
:NEW.LANGUAGE_CD,
:NEW.DESCR,
:NEW.VERSION
FROM DUAL
WHERE NOT EXISTS
(SELECT 1
FROM JUNITUSR.CI_CURRENCY_CD_L B
WHERE B.CURRENCY_CD =:NEW.CURRENCY_CD AND
B.LANGUAGE_CD = :NEW.LANGUAGE_CD);
END IF;
END IF;
IF UPDATING THEN
IF (UPPER(:NEW.CURRENCY_CD) NOT LIKE 'ZZ%') THEN
UPDATE JUNITUSR.CI_CURRENCY_CD_L A
SET CURRENCY_CD =:NEW.CURRENCY_CD,
LANGUAGE_CD =:NEW.LANGUAGE_CD,
DESCR =:NEW.DESCR ,
VERSION =:NEW.VERSION
WHERE A.CURRENCY_CD = :OLD.CURRENCY_CD AND
A.LANGUAGE_CD =:OLD.LANGUAGE_CD;
END IF;
END IF;
EXCEPTION
WHEN OTHERS THEN
RAISE_APPLICATION_ERROR(-20001,'ERROR: <DVJ_CI_CURRENCY_CD_L_IU> ' || SQLERRM);
END;
ALTER TRIGGER "SHIVAMG"."DVJ_CI_CURRENCY_CD_L_IU" ENABLE
Substring Query
SELECT REGEXP_SUBSTR((SELECT REGEXP_SUBSTR
(( select dbms_metadata.get_ddl('TRIGGER', 'DVJ_CI_CURRENCY_CD_L_IU' ) from dual), 'INSERT INTO(.*)+\)')FROM dual),'\((.*)\)') FROM DUAL;

I found the correct Substring query to gather the individual column names from the trigger code. It is as follows:
SELECT REGEXP_SUBSTR((SELECT REGEXP_SUBSTR((SELECT REGEXP_SUBSTR (( SELECT dbms_metadata.get_ddl( 'TRIGGER',trig_name,'CISADM') FROM dual),
'INSERT(\s|\n)+INTO[^\)]+\)',1,1,'n') FROM dual),'[\(](\s|\n|.)+[\)]')
FROM DUAL),'(\w)+',1,counter)INTO temp_col_name FROM dual;
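The three nested REGEXP_SUBSTR steps above can also be traced outside the database. The following is only a minimal Python sketch, assuming the DDL is already available as a string (here a shortened copy of the trigger). Python's `re` module is not Oracle's regex engine, and the helper name `extract_insert_columns` is hypothetical.

```python
import re

# Shortened copy of the trigger DDL, used only for illustration.
DDL = """CREATE OR REPLACE TRIGGER "SHIVAMG"."DVJ_CI_CURRENCY_CD_L_IU"
BEFORE INSERT OR UPDATE ON CI_CURRENCY_CD_L
FOR EACH ROW
BEGIN
IF INSERTING THEN
INSERT INTO JUNITUSR.CI_CURRENCY_CD_L
(CURRENCY_CD,
LANGUAGE_CD,
DESCR,
VERSION)
SELECT :NEW.CURRENCY_CD FROM DUAL;
END IF;
END;"""


def extract_insert_columns(ddl):
    """Return the column names listed in the first INSERT INTO ... (...) clause."""
    # Step 1: everything from INSERT INTO up to the first closing parenthesis.
    insert_clause = re.search(r"INSERT\s+INTO[^)]+\)", ddl)
    if insert_clause is None:
        return []
    # Step 2: only the parenthesised column list.
    column_list = re.search(r"\([^)]*\)", insert_clause.group(0))
    if column_list is None:
        return []
    # Step 3: every run of word characters is one column name.
    return re.findall(r"\w+", column_list.group(0))


columns = extract_insert_columns(DDL)
print(columns)
```

In the PL/SQL version, step 3 picks one word at a time through the occurrence argument (counter) instead of returning a list.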

Related

Postgres regular expression to find procedures with wrong Update syntax

My Postgres database was migrated from MySQL using a tool and the code base has lot of syntax issues.
One of the issues is with the UPDATE statements inside procedures where the column name contains alias name as below.
UPDATE table1 t1 SET t1.col1 = 'some_value';
Having alias name after SET keyword as in t1.col1 is a wrong syntax in Postgres.
As the number of procedures is huge, I'm trying to write a regular expression to find which procedures have this pattern.
select proname, prosrc from pg_proc
where regexp_replace(prosrc, E'[\\n\\r]+', ' ', 'g' ) ~* '[:UPDATE:]([[:space:]]+)[:set:]([[:space:]]+)^[a-z]([a-z]|[0-9])+\.^[a-z]([a-z]|[0-9])+([[:space:]]*)[:=:]';
The regexp_replace part on the left side of the condition is to remove line breaks which works fine. The main part on the right side is not returning the desired result.
I am trying to find the procedures that have the UPDATE keyword followed by one or more spaces, followed by the SET keyword, followed by one or more spaces, followed by one or more alphanumeric characters (starting with a letter), followed by a dot (.), followed by one or more alphanumeric characters (starting with a letter), followed by zero or more spaces, followed by an equals sign (=).
But the statement I formed seems to be wrong. Any help on this is much appreciated.
I think this may be more complex than you think... A procedure/function may have more than one update statement, and a simple regex will likely come up with many false positives.
I think you want a function to do a better job of eliminating false positives that result from:
Alias that occurs after the update, in a separate statement (after the semicolon) -- fix by splitting statements by semicolons
Aliases within the update that occur after a FROM or WHERE clause, which are valid and not syntax errors
Less frequent, aliases used in a CTE prior to the update - fix by ignoring everything prior to the update keyword
Here is a boilerplate for what I think will get you close and minimize false positives:
create or replace function find_bad_syntax()
returns setof text
language plpgsql as
$BODY$
DECLARE
r pg_proc%rowtype;
dml varchar[];
eval varchar;
alias varchar;
BEGIN
FOR r IN
SELECT * FROM pg_proc WHERE prosrc ilike '%update%'
LOOP
dml := string_to_array (r.prosrc, ';');
foreach eval in array dml
loop
alias := substring (lower (eval), 'update [\w.]+\s+(\w+)');
continue when alias is null or lower (alias) = 'set';
eval := regexp_replace (eval, 'from\s+.*', '', 'i');
eval := regexp_replace (eval, 'where\s.*', '', 'i');
eval := regexp_replace (eval, '^.*update', '', 'i');
if eval ~* (alias || '\.\w+\s*=') then
-- if eval ~* (alias || '\.\w+\s+=') then
return next format ('PROC: %s ALIAS: %s => ERROR: %s', r.proname, alias, eval);
end if;
end loop;
END LOOP;
END;
$BODY$
So to get the results simply:
select * from find_bad_syntax()
I did a test run, and your function did show up in the results.
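The statement-by-statement idea behind find_bad_syntax can be tried on a plain string before touching pg_proc. This is a rough Python sketch of the same steps, not a port of the function: the name `find_bad_updates` is hypothetical, and, like the function above, it is fooled by semicolons or the words FROM/WHERE inside string literals.

```python
import re


def find_bad_updates(source):
    """Return (alias, statement fragment) pairs where SET uses the table alias."""
    findings = []
    # Split the procedure body into statements on semicolons.
    for statement in source.split(";"):
        # Capture the word that follows "UPDATE <table>"; it is either an alias or SET.
        match = re.search(r"update\s+[\w.]+\s+(\w+)", statement, re.IGNORECASE)
        if match is None or match.group(1).lower() == "set":
            continue
        alias = match.group(1)
        flags = re.IGNORECASE | re.DOTALL
        # Drop everything from FROM or WHERE onwards, where aliases are legal.
        fragment = re.sub(r"\bfrom\s.*", "", statement, flags=flags)
        fragment = re.sub(r"\bwhere\s.*", "", fragment, flags=flags)
        # Drop everything up to and including the UPDATE keyword (e.g. a CTE).
        fragment = re.sub(r"^.*?update", "", fragment, count=1, flags=flags)
        # What remains is the SET list; flag "alias.column =".
        if re.search(re.escape(alias) + r"\.\w+\s*=", fragment, re.IGNORECASE):
            findings.append((alias, fragment.strip()))
    return findings


sample = (
    "UPDATE table1 t1 SET t1.col1 = 'x';"
    " UPDATE table2 SET col1 = 1 WHERE table2.id = 2;"
    " UPDATE table3 t3 SET col1 = 1 WHERE t3.id = 2;"
)
bad = find_bad_updates(sample)
print(bad)
```

Only the first statement is reported. The second has no alias, and the third uses its alias only in the WHERE clause, which is valid.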
The below query gives me the expected results. It checks for code where we have SET followed by one or more space, followed by one or more alphanumeric character and _, followed by a dot(.), followed by one or more alphanumeric character and _, followed by one or more spaces and followed by =.
This fetches all the procedures that have the issue that I posted in question.
select proname, prosrc from pg_proc
where regexp_replace(prosrc, E'[\\n\\r]+', ' ', 'g' )
~* '( SET)[[:space:]]+([a-z]|[0-9]|(_))+\.([a-z]|[0-9]|(_))+[[:space:]]+(=)';
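To see which statements this pattern accepts, it can be transcribed into Python. This is only a sketch under the assumption that Python's `re` behaves like the PostgreSQL operator for this simple pattern; `has_aliased_set` is a hypothetical helper name.

```python
import re

# Same shape as the pattern above: " SET", spaces, word, dot, word, spaces, "=".
ALIASED_SET = re.compile(r" SET\s+[a-z0-9_]+\.[a-z0-9_]+\s+=", re.IGNORECASE)


def has_aliased_set(source):
    # Collapse line breaks to spaces first, as the regexp_replace call does.
    flattened = re.sub(r"[\n\r]+", " ", source)
    return ALIASED_SET.search(flattened) is not None


with_alias = has_aliased_set("UPDATE table1 t1\nSET t1.col1 = 'some_value';")
without_alias = has_aliased_set("UPDATE table1 t1 SET col1 = 'some_value';")
no_space_before_equals = has_aliased_set("UPDATE table1 t1 SET t1.col1= 'x';")
print(with_alias, without_alias, no_space_before_equals)
```

The last case is not detected because the pattern requires at least one space before the equals sign. Allowing zero or more spaces there would probably catch it as well.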
Yes, in PostgreSQL this does not work:
UPDATE table1 t1 SET t1.col1 = 'some_value';
But this works correctly:
UPDATE table1 t1 SET col1 = 'some_value';
So we only need to remove the alias from the column being updated.
An example of how to do it:
with t1(txt) as (
select 'UPDATE table1 t1 SET t1.col1 = some_value'
)
select regexp_replace(t1.txt, 'SET (.*)\d\.', 'SET ', 'g') from t1
To find the affected statements:
with t1(txt) as (
select 'UPDATE table1 t1 SET t1.col1 = some_value'
)
select * from t1 where t1.txt ~ 'SET (.*)\d\.'
With a small change (\w instead of \d, so that aliases that do not end in a digit are matched too):
with t1(txt) as (
select 'UPDATE table1 t1 SET t1.col1 = some_value'
union all
select 'UPDATE table1 tbp3232 SET tbp3232.col1 = some_value'
union all
select 'select pp3.* from table1 pp3'
union all
select 'UPDATE table1 SET col1 = some_value'
union all
select 'UPDATE table1 t SET t.col1 = some_value'
)
select * from t1 where t1.txt ~ 'SET (.*)\w\.'
--Result:
'UPDATE table1 t1 SET t1.col1 = some_value'
'UPDATE table1 tbp3232 SET tbp3232.col1 = some_value'
'UPDATE table1 t SET t.col1 = some_value'
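One caveat with 'SET (.*)\w\.' is that the greedy .* accepts any dot that appears after SET, not just one in the column being assigned. The Python sketch below illustrates this with one extra, valid statement that is not in the original test data. The tighter pattern is only a suggestion, and it checks just the first assignment after SET.

```python
import re

LOOSE = re.compile(r"SET (.*)\w\.")           # pattern from the query above
TIGHT = re.compile(r"SET\s+\w+\.\w+\s*=")     # assumed alternative, first assignment only

statements = [
    "UPDATE table1 t1 SET t1.col1 = some_value",
    "UPDATE table1 SET col1 = some_value",
    # Valid in PostgreSQL: the alias appears only on the right-hand side.
    "UPDATE table1 SET col1 = t2.val FROM table2 t2",
]

loose_hits = [s for s in statements if LOOSE.search(s)]
tight_hits = [s for s in statements if TIGHT.search(s)]
print(loose_hits)
print(tight_hits)
```

The loose pattern reports the first and third statements, while the tighter one reports only the first.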

REGEX help needed in Oracle

How can I get all the table names from the SQL below? My query returns only the last table name.
with t as
(select 'select col1,
(select max(col3) from dd3) max_timestamp
from dd1,
dd2
where dd1.col1 = dd2.col1
and dd1.col1 in(select col1 from dd4)' sql_text from dual)
select regexp_substr(regexp_substr(upper(sql_text), '\sFROM\s*(\w|\.|_)*'), '(\w|_|\.)+', 1,2)
from t
Thanks,
DD.
This is more of a regex question than an Oracle question.
If you can first run the SQL through REPLACE(REPLACE(sql,CHR(13),' '),CHR(10),NULL) so that the query fits on a single line (this turns each CRLF line break into a space; for text with LF-only line breaks, replace CHR(10) with a space instead), here is a regex that returns the tables in group 1 (the table right after FROM) and group 3 (subsequent items in a comma-separated list):
/FROM ([A-Z0-9$#_]+)(,[\s]*([A-Z0-9$#_]+))*/gi
Having multiple groups is not ideal, so I would look at the full match instead, see https://regex101.com/r/OZUalH/1/ for an example (see full match on the right, where every match has from followed by one or more tables).
But let me warn you this is not going to be robust, as these valid FROM clause expressions are not handled:
"my_table"
MY_TABLE AS A
MY_TABLE AS "a"
etc...
If it were me, I would write a function to run the query through explain plan (execute immediate 'explain plan for ...') and extract the tables from the plan tables (or possibly using SYS.DBMS_XPLAN)
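The full-match approach with the FROM-list regex above can be sketched in Python on the sample query from the question. This is an illustration only: `table_names` is a hypothetical helper, and it shares the limitations listed above (quoted identifiers and AS aliases are not handled).

```python
import re

SQL_TEXT = """select col1,
(select max(col3) from dd3) max_timestamp
from dd1,
dd2
where dd1.col1 = dd2.col1
and dd1.col1 in(select col1 from dd4)"""

# The regex from the answer, applied case-insensitively.
FROM_LIST = re.compile(r"FROM ([A-Z0-9$#_]+)(,[\s]*([A-Z0-9$#_]+))*", re.IGNORECASE)


def table_names(sql):
    # Put the query on one line so that "FROM <table>" is separated by one space.
    one_line = re.sub(r"\s+", " ", sql)
    names = []
    for match in FROM_LIST.finditer(one_line):
        # Use the full match: strip the leading "FROM " and split the list on commas.
        table_list = match.group(0)[len("FROM "):]
        names.extend(re.split(r"\s*,\s*", table_list))
    return names


tables = table_names(SQL_TEXT)
print(tables)
```

Working from the full match avoids the problem that a repeated capture group keeps only its last repetition.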

PL/SQL regexp_like filters

I want to delete some tables and wrote this procedure:
set serveroutput on
declare
type namearray is table of varchar2(50);
total integer;
name namearray;
begin
--select statement here ..., please see below
total :=name.count;
dbms_output.put_line(total);
for i in 1 .. total loop
dbms_output.put_line(name(i));
-- execute immediate 'drop table ' || name(i) || ' purge';
End loop;
end;
/
The idea is to drop all tables with table name having pattern like this:
ERROR_REPORT[2 digit][3 Capital characters][10 digits]
example: ERROR_REPORT16MAY2014122748
However, I am not able to come up with the correct regexp. Below are my select statements and results:
select table_name bulk collect into name from user_tables where regexp_like(table_name, '^ERROR_REPORT[0-9{2}A-Z{3}0-9{10}]');
The results included all the table names I needed plus ERROR_REPORT311AUG20111111111. This should not be showing up in the result.
The following select statement showed the same result, which meant the A-Z{3} had no effect on the regexp.
select table_name bulk collect into name from user_tables where regexp_like(table_name, '^ERROR_REPORT[0-9{2}0-9{10}]');
My question is what would be the correct regexp, and what's wrong with mine?
Thanks,
Alex
The correct regex is
'^ERROR_REPORT[0-9]{2}[A-Z]{3}[0-9]{10}'
I think this regex should work:
^ERROR_REPORT[0-9]{2}[A-Z]{3}[0-9]{10}
However, please check the regex101 link. I've assumed that you need 2 digits after ERROR_REPORT but your example name shows 3.
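As for what is wrong with the original pattern: everything between [ and ] forms a single bracket expression, so [0-9{2}A-Z{3}0-9{10}] matches exactly one character that is a digit, an uppercase letter, or a brace. The {2}, {3} and {10} inside it are not quantifiers. The Python sketch below shows the difference. The trailing $ is an addition that is not in the answers above; it rejects names with extra characters after the ten digits.

```python
import re

# Original attempt: one bracket expression, i.e. one character after the prefix.
BROKEN = re.compile(r"^ERROR_REPORT[0-9{2}A-Z{3}0-9{10}]")
# Quantifiers moved outside the brackets; "$" added as an assumption.
FIXED = re.compile(r"^ERROR_REPORT[0-9]{2}[A-Z]{3}[0-9]{10}$")

names = [
    "ERROR_REPORT16MAY2014122748",
    "ERROR_REPORT311AUG20111111111",
    "ERROR_REPORTS",
]

broken_hits = [n for n in names if BROKEN.search(n)]
fixed_hits = [n for n in names if FIXED.search(n)]
print(broken_hits)
print(fixed_hits)
```

The broken pattern accepts all three names, because each has a digit or an uppercase letter right after ERROR_REPORT. The fixed pattern accepts only the first.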

Is it possible to combine REPLACE with LIKE to replace multiple values in oracle database column

This is similar to the question here but instead of replacing a single value I want to replace multiple values with matching pattern.
--create table
create table my_table (column1 varchar2(10));
--load table
insert into my_table values ('Test1');
insert into my_table values ('Test2');
insert into my_table values ('Test3');
insert into my_table values ('Test4');
insert into my_table values ('Test5');
insert into my_table values ('Lesson');
--this query replaces 'Test1' with blank
select replace(column1, 'Test1', ' ') from my_table;
--now i want to replace all matching values with blank but i get an error
select replace(column1, like '%Test%', ' ') from my_table; --this throws below error.
--ORA-00936: missing expression
--00936. 00000 - "missing expression"
--*Cause:
--*Action:
--Error at Line: 19 Column: 25
Running Oracle Database 11g Enterprise Edition Release 11.2.0.1.0
I would use regexp_replace.
select regexp_replace(column1,'Test[[:digit:]]', ' ') from my_table;
In the original post, you were indicating by %Test% that you want to replace the entire string with a space if it had the string "Test" anywhere in it:
with my_table(col1) as
( select 'Test1' from dual
union
select 'Test2' from dual
union
select 'thisisaTestofpatternmatching4' from dual
union
select 'thisisa Test ofpatternmatching5' from dual
union
select 'Test at the start' from dual
union
select 'Testat the start no following space' from dual
union
select 'Ending with Test' from dual
union
select 'Ending nospacebeforewithTest' from dual
union
select 'Testy' from dual
union
select 'Lesson' from dual
)
select regexp_replace(col1, '^.*Test.*$', ' ') from my_table;
I suspect you really only want to replace the word Test though? Can it occur more than once in a line?
select regexp_replace(col1, 'Test', ' ') from my_table;
The word test followed by a digit?
select regexp_replace(col1, 'Test\d', ' ') from my_table;
Tip: Make sure your test cases are set up to include all kinds of combinations of your test data, even where they may be unexpected. When testing your regular expressions, sometimes you may get unexpected results so make sure all possible conditions are tested.
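The three variants above behave quite differently on the same input. Here is a small Python sketch using a few of the test values; Python's `re.sub` stands in for Oracle's regexp_replace, which is an assumption that holds for these simple patterns.

```python
import re

values = ["Test1", "thisisaTestofpatternmatching4", "Testy", "Lesson"]

# Replace the whole value when it contains "Test" anywhere.
whole_value = [re.sub(r"^.*Test.*$", " ", v) for v in values]
# Replace only the word "Test" itself.
word_only = [re.sub(r"Test", " ", v) for v in values]
# Replace "Test" only when a digit follows it.
word_and_digit = [re.sub(r"Test\d", " ", v) for v in values]

for row in zip(values, whole_value, word_only, word_and_digit):
    print(row)
```

Only the last variant leaves 'Testy' and 'thisisaTestofpatternmatching4' untouched, and 'Lesson' is unchanged in all three.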

Compare column value against list of regex values stored in another table and update accordingly

I am new to Oracle programming.
I want to check the "msg" value of "Table1" against the "regex" values from "Table2".
If the regular expression matches as such, I want to update the respective "regex_id" in "Table1".
Usual query: SELECT 'match found' FROM DUAL WHERE REGEXP_LIKE('s 27', '^(s27|s 27)')
Table1
MSG     REG_EXID
Ss27    ?
s27     ?
s28     ?
s29     ?
Table2
REGEX          REG_EXID    RELEVANCE
^(s27|s 27)    1           10
^(s29|s 29)    2           2
^(m28|m 28)    3           2
^(s27|s 27)    4           100
Taking the newly added "relevance" column into account, with Oracle 11g you could try something along these lines:
UPDATE Table1 T1
SET T1.reg_exID =
(SELECT DISTINCT
MAX(reg_exID) KEEP (DENSE_RANK FIRST ORDER BY relevance DESC) OVER (PARTITION BY regex)
FROM Table2
WHERE REGEXP_LIKE(T1.msg, regex)
)
;
See SQL Fiddle.
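The selection rule in that UPDATE (among the matching rows of Table2, take the one with the highest relevance) can be checked by hand with a short Python sketch. The data is copied from the tables in the question, `best_rule_id` is a hypothetical helper, and ties are broken by the highest ID, as MAX(reg_exID) would do.

```python
import re

# (regex, reg_exid, relevance), copied from Table2 in the question.
RULES = [
    (r"^(s27|s 27)", 1, 10),
    (r"^(s29|s 29)", 2, 2),
    (r"^(m28|m 28)", 3, 2),
    (r"^(s27|s 27)", 4, 100),
]


def best_rule_id(msg):
    """Return the ID of the most relevant matching rule, or None if none match."""
    matching = [
        (relevance, rule_id)
        for pattern, rule_id, relevance in RULES
        if re.search(pattern, msg)
    ]
    if not matching:
        return None
    # Highest relevance wins; equal relevance falls back to the highest ID.
    return max(matching)[1]


result = {msg: best_rule_id(msg) for msg in ["Ss27", "s27", "s28", "s29"]}
print(result)
```

'Ss27' stays unmatched because the patterns are case sensitive and anchored at the start of the string.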
You could work along these lines:
UPDATE Table1
SET reg_exID = (SELECT reg_exID FROM Table2 WHERE REGEXP_LIKE(Table1.msg, regex));
Please keep in mind:
None of your current sample records will be updated, as regular expressions are case sensitive.
The above UPDATE will fail if more than one REGEX matches.
You could rewrite the current REGEX expressions along the lines of "^m ?28".
See it in action: SQL Fiddle (With some data added to actually show the effect.)
Please comment if and as clarification/adjustment is required.