Teradata REGEXP_SIMILAR matching exact string - regex

I have a list of databases and tables obtained like this:
SELECT TRIM(DatabaseName) || '.' || TRIM(TableName) AS DatabaseTable
FROM DBC.TablesV t1
WHERE TableKind IN ('T','O','V')
I now want to match them to dbc.TablesV.RequestText to build a hierarchy of my database views.
At first I did it with a simple join like the one below:
JOIN DBC.TablesV t2
ON t2.RequestText LIKE '%' || DatabaseTable || '%'
Unfortunately, we have tables like T1010_User and T1010_User_Hist, and databases like DB_STAGE and Q_DB_STAGE, so I added spaces to the LIKE pattern, making it LIKE '% ' || DatabaseTable || ' %'. That fails to return the proper results, because sometimes the table name is at the end of the RequestText, like this: (...) DB_STAGE.TableName; and sometimes it looks like this:
(...)
FROM
DB_STAGE.TableName t1
(...)
I decided to use REGEXP_SIMILAR to match them with WHEN REGEXP_SIMILAR() = 1, but my regex-fu is weak, so I cannot build a regex that does something like:
((anything other than a letter/number) or nothing) DatabaseTable ((anything other than a letter/number) or nothing)
This is to build a hierarchy of views to help with migrating data to a different database.
This is a very simplified case:
CREATE VOLATILE TABLE test1
(
c0 SMALLINT,
c1 varchar(100)
)ON COMMIT PRESERVE ROWS;
INSERT INTO test1 VALUES(1,'aaa
Q_abcdef.abcdef');
INSERT INTO test1 VALUES(2,' Q_abcdef.abcdef ');
INSERT INTO test1 VALUES(3,'aaa
DQ_abcdef.abcdef ');
INSERT INTO test1 VALUES(4,' S_abcdef.abcdef');
INSERT INTO test1 VALUES(5,'Q_abcdef.abcdefg');
INSERT INTO test1 VALUES(6,' sdfs
Q_abcdef.abcdefg');
INSERT INTO test1 VALUES(7,' DQ_abcdef.abcdefg');
INSERT INTO test1 VALUES(8,' S_abcdef.abcdefg');
INSERT INTO test1 VALUES(9,'Q_abcdef.abcdef;');
INSERT INTO test1 VALUES(10,' Q_abcdef.abcdef;');
INSERT INTO test1 VALUES(11,'DQ_abcdef.abcdef;');
INSERT INTO test1 VALUES(12,' S_abcdef.abcdef;');
I need to match 1, 2, 9 and 10: the ones that contain exactly the string Q_abcdef.abcdef.

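You can use \b to match word boundaries:
WHERE REGEXP_SIMILAR (c1, '.*\bQ_abcdef\.abcdef\b.*', 'i') = 1
This will not return 5 and 6, because the trailing g means there is no word boundary after abcdef. The DQ_ rows are excluded the same way, since there is no boundary between D and Q. The dot is escaped so that it matches only a literal period rather than any character. If the text can contain line breaks, the 'n' match argument (which lets . match a newline) may need to be added as well.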
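The boundary behaviour can be checked outside the database. The following is only a Python sketch of the same idea, not Teradata code: it assumes Python's `re` treats `\b` the way the answer describes, and it writes row 7 as the sample data presumably intended. The names `rows`, `TARGET`, and `matched` are made up for the illustration.

```python
import re

# Sample rows mirroring the test1 table (c0 -> c1).
rows = {
    1: "aaa\nQ_abcdef.abcdef",
    2: " Q_abcdef.abcdef ",
    3: "aaa\nDQ_abcdef.abcdef ",
    4: " S_abcdef.abcdef",
    5: "Q_abcdef.abcdefg",
    6: " sdfs\nQ_abcdef.abcdefg",
    7: " DQ_abcdef.abcdefg",
    8: " S_abcdef.abcdefg",
    9: "Q_abcdef.abcdef;",
    10: " Q_abcdef.abcdef;",
    11: "DQ_abcdef.abcdef;",
    12: " S_abcdef.abcdef;",
}

TARGET = "Q_abcdef.abcdef"

# re.escape makes the dot literal; \b on both sides rejects a letter,
# digit, or underscore directly before or after the name.
pattern = re.compile(r"\b" + re.escape(TARGET) + r"\b", re.IGNORECASE)

# search() looks anywhere in the text, so no leading/trailing .* is needed here.
matched = [c0 for c0, c1 in rows.items() if pattern.search(c1)]
print(matched)  # [1, 2, 9, 10]
```

Because the underscore counts as a word character, a name such as X_Q_abcdef.abcdef would also be rejected, which is usually what is wanted for database identifiers.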

Related

Oracle PLSQL regexp_substr separate comma separated string with double quotes

I've seen examples of how to separate comma-separated strings into rows like this:
select distinct id, trim(regexp_substr(value,'[^,]+', 1, level) ) value, level
from tbl1
connect by regexp_substr(value, '[^,]+', 1, level) is not null
order by id, level;
but, my question is, how do I do this on double quote and comma delimited strings?
Ex: the above works for strings like "1,2,3,4,5,6,7", but what about "1","2","3","4,5","6,7,8","9" so that the rows end up like:
1
2
3
4,5
6,7,8
9
edit: I'm on Oracle 11.2.0.4, 11gR2.
There is a hack: replace the separator "," (quote, comma, quote) with # and split on # in the regular expression. It works like a charm, as long as # never occurs in the data.
Input String : "1","2","3","4,5","6,7,8","9","Ant,B","Gradle","E,F","G"(Can be number/Character doesn't matter)
with temp as (
select replace(replace('"1","2","3","4,5","6,7,8","9","Ant,B","Gradle","E,F","G"','","','#'),'"') Colmn from dual
)
SELECT trim(regexp_substr(str, '[^#]+', 1, level)) str
FROM (SELECT Colmn str FROM temp) t
CONNECT BY instr(str, '#', 1, level - 1) > 0
Output :
STR
1
2
3
4,5
6,7,8
9
Ant,B
Gradle
E,F
G
10 rows
Refer DBFiddle link for demo.
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=d09c326f614d10f5d3c407fdfd3a44c5
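The replace-then-split trick is easy to trace step by step. Below is a small Python sketch of the same steps, not Oracle code; the function name `split_quoted_csv` and the `marker` parameter are hypothetical, and the sketch assumes the marker character never appears in the data.

```python
def split_quoted_csv(value, marker="#"):
    """Split a string like '"1","2","4,5"' into its quoted fields."""
    if marker in value:
        # The trick only works when the marker is absent from the data.
        raise ValueError("marker occurs in the input; pick another one")
    # Step 1: turn every quote-comma-quote separator into the marker.
    # Step 2: drop the remaining outer quotes.
    flattened = value.replace('","', marker).replace('"', "")
    # Step 3: split on the marker; commas inside fields are untouched.
    return flattened.split(marker)


fields = split_quoted_csv('"1","2","3","4,5","6,7,8","9"')
print(fields)  # ['1', '2', '3', '4,5', '6,7,8', '9']
```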
Here is another solution. It should work for any string that contains numbers.
I used the positions of the digits as the base.
with temp as (
select '"1","2","3",x"4,5","6,73,8","9"' Colmn from dual
)
SELECT regexp_substr(Colmn, '\d{1}', REGEXP_INSTR(Colmn, '\d{1}', REGEXP_INSTR(Colmn, '\d{1}') ,level),1 ) from temp
CONNECT BY REGEXP_COUNT (Colmn,'\d{1}')+1> level

PL SQL regular expression substring

I have a long string.
message := 'I loooove my pet animal';
This string is 23 chars long. If message is longer than 15 chars, I need to find the position where I can break it into 2 strings. For example, in this case,
message1 := 'I loooove my'
message2 := 'pet animal'
Essentially it should find the end of the last whole word that fits within the first 15 chars and then break the original string into 2 at that point.
Please give me ideas how I can do this.
Thank you.
Here is a general solution - with possibly more than one input string, and with inputs of any length. The only assumption is that no single word may be more than 15 characters, and that everything between two spaces is considered a word. If a "word" can be more than 15 characters, the solution can be adapted, but the requirement itself would need to state what the desired result is in such a case.
I make up two input strings in a CTE (at the top) - that is not part of the solution, it is just for testing and illustration. I also wrote this in plain SQL - there is no need for PL/SQL code for this type of problem. Set processing (instead of one row at a time) should result in much better execution.
The approach is to identify the location of all spaces (I append and prepend a space to each string, too, so I won't have to deal with exceptions for the first and last substring); then I decide, in a recursive subquery, where each "maximal" substring should begin and where it should end; and then outputting the substrings is trivial. I used a recursive query, which should work in Oracle 11.1 (or 11.2 with the syntax I used, with column names in CTE declarations - it can be changed easily to work in 11.1). In Oracle 12, it would be easier to rewrite the same idea using MATCH_RECOGNIZE.
with
inputs ( id, str ) as (
select 101, 'I loooove my pet animal' from dual union all
select 102, '1992 was a great year for - make something up here as needed' from dual
),
positions ( id, pos ) as (
select id, instr(' ' || str || ' ', ' ', 1, level)
from inputs
connect by level <= length(str) - length(replace(str, ' ')) + 2
and prior id = id
and prior sys_guid() is not null
),
r ( id, str, line_number, pos_from, pos_to ) as (
select id, ' ' || str || ' ', 0, null, 1
from inputs
union all
select r.id, r.str, r.line_number + 1, r.pos_to,
( select max(pos)
from positions
where id = r.id and pos - r.pos_to between 1 and 16
)
from r
where pos_to is not null
)
select id, line_number, substr(str, pos_from + 1, pos_to - pos_from - 1) as line_text
from r
where line_number > 0 and pos_to is not null
order by id, line_number
;
Output:
ID LINE_NUMBER LINE_TEXT
---- ----------- ---------------
101 1 I loooove my
101 2 pet animal
102 1 1992 was a
102 2 great year for
102 3 - make
102 4 something up
102 5 here as needed
7 rows selected.
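For comparison, the same greedy rule (keep adding whole words while the line stays within 15 characters) can be written procedurally. This is a Python sketch rather than a translation of the recursive query; the function name `wrap_words` is made up, and it keeps the answer's assumption that no single word is longer than the width.

```python
def wrap_words(text, width=15):
    """Greedily pack whole words into lines of at most `width` characters."""
    lines = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate) <= width:
            # The word still fits on the current line.
            current = candidate
        else:
            # Close the current line and start a new one with this word.
            if current:
                lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


print(wrap_words("I loooove my pet animal"))
# ['I loooove my', 'pet animal']
```

Run on the second test string, it produces the same five lines as the query output above.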
First, reverse the string.
SELECT REVERSE(strField) FROM DUAL;
Then calculate the length: i = length(strField).
Then find the first space at or after the middle of the reversed string:
j := INSTR(REVERSE(strField), ' ', i / 2)
Finally, split at i - j (this may be off by one, so test it).
DEMO
WITH parameter (id, strField) as (
select 101, 'I loooove my pet animal' from dual union all
select 102, '1992 was a great year for - make something up here as needed' from dual union all
select 103, 'You are Supercalifragilisticexpialidocious' from dual
), prepare (id, rev, len, middle) as (
SELECT id, reverse(strField), length(strField), length(strField) / 2
FROM parameter
)
SELECT p.*, l.*,
SUBSTR(strField, 1, len - INSTR(rev, ' ', middle)) as first,
SUBSTR(strField, len - INSTR(rev, ' ', middle) + 2, len) as second
FROM parameter p
JOIN prepare l
ON p.id = l.id
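To see where this reverse-and-search approach splits a string, here is a rough Python sketch of the arithmetic in the demo query. It is an illustration under assumptions, not Oracle code: the function name `split_near_middle` is hypothetical, and the index shifts account for INSTR being 1-based while Python is 0-based.

```python
def split_near_middle(text):
    """Split at the first space found at or after the middle of the reversed text."""
    n = len(text)
    rev = text[::-1]
    # INSTR(rev, ' ', n / 2) starts at 1-based position n/2 (truncated);
    # the 0-based equivalent is one less.
    start = max(n // 2 - 1, 0)
    found = rev.find(" ", start)
    if found == -1:
        return text, ""  # no space to split on
    j = found + 1  # convert back to a 1-based position, like INSTR
    first = text[: n - j]        # SUBSTR(str, 1, len - j)
    second = text[n - j + 1 :]   # SUBSTR(str, len - j + 2, len)
    return first, second


print(split_near_middle("I loooove my pet animal"))
# ('I loooove my', 'pet animal')
```

Note that this splits near the middle of the string rather than at a fixed 15-character limit, so it matches the question's example but is not a general wrap.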

Perl hash substitution with special characters in keys

My current script will take an expression, ex:
my $expression = '( a || b || c )';
and go through each boolean combination of inputs using sub/replace, like so:
my $keys = join '|', keys %stimhash;
$expression =~ s/($keys)\b/$stimhash{$1}/g;
So for example expression may hold,
( 0 || 1 || 0 )
This works great.
However, I would like to allow the variables (also in %stimhash) to contain a tag, *.
my $expression = '( a* || b* || c* )';
Also, printing the keys of the stimhash returns:
a*|b*|c*
It is not properly substituting/replacing with the extra special character, *.
It gives this warning:
Use of uninitialized value within %stimhash in substitution iterator
I tried using quotemeta() but did not have good results so far.
It will drop the values. An example after the substitution looks like:
( * || * || * )
Any suggestions are appreciated,
John
Problem 1
You use the pattern a* thinking it will match only a*, but a* means "0 or more a". You can use quotemeta to convert text into a regex pattern that matches that text.
Replace
my $keys = join '|', keys %stimhash;
with
my $keys = join '|', map quotemeta, keys %stimhash;
Problem 2
\b
is basically
(?<!\w)(?=\w)|(?<=\w)(?!\w)
But * (like the space) isn't a word character. The solution might be to replace
s/($keys)\b/$stimhash{$1}/g
with
s/($keys)(?![\w*])/$stimhash{$1}/g
though the following makes more sense to me
s/(?<![\w*])($keys)(?![\w*])/$stimhash{$1}/g
Personally, I'd use
s{([\w*]+)}{ $stimhash{$1} // $1 }eg
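The two fixes (escaping the keys, and replacing \b with explicit lookarounds) carry over to other regex engines. The following is a Python sketch of the same idea, not the asker's Perl script; the contents of `stimhash` are example values chosen for the illustration.

```python
import re

# Example stimulus values; keys carry the '*' tag from the question.
stimhash = {"a*": "0", "b*": "1", "c*": "0"}

# re.escape plays the role of quotemeta: 'a*' becomes a literal 'a\*'.
keys = "|".join(re.escape(key) for key in stimhash)

# \b is useless next to '*', so require that neither neighbour is a
# word character or another '*'.
pattern = re.compile(rf"(?<![\w*])({keys})(?![\w*])")

expression = "( a* || b* || c* )"
result = pattern.sub(lambda m: stimhash[m.group(1)], expression)
print(result)  # ( 0 || 1 || 0 )
```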

Query to replace a string with matching pattern

I have a string like below.
'comp' as "COMPUTER",'ms' as "MOUSE" ,'keybr' as "KEYBOARD",'MONT' as "MONITOR",
Is it possible to write a query so that I get this result?
'comp' ,'ms' ,'keybr' ,'MONT' ,
I can replace the string "as" with an empty string using REPLACE.
But how do I remove the strings inside double quotes?
Can anyone help me do this?
Thanks in advance.
select replace(
regexp_replace('''comp'' as "COMPUTER"'
, '(".*")'
,null)
,' as '
,null)
from dual
The regex '(".*")' matches the text enclosed in double quotes.
EDIT:
Because .* is greedy, the regex matches from the first double quote to the last one and replaces that whole span. So we may need to tokenise the string first, using the comma as a delimiter, apply the regex to each token, and then join the results again (LISTAGG).
WITH str_tab(str1, rn) AS
(SELECT regexp_substr(str, '[^,]+', 1, LEVEL), -- split on commas
LEVEL
FROM (SELECT '''comp'' as "computer",''comp'' as "computer"' str
FROM dual) tab
CONNECT BY LEVEL <= LENGTH(str) - LENGTH(REPLACE(str, ',')) + 1)
SELECT listagg(replace(
regexp_replace(str1
, '(".*")'
,null)
,' as '
,null), ',') WITHIN GROUP (ORDER BY rn) AS new_text
FROM str_tab;
EDIT2:
A cleaner approach from @EatAPeach
with x(y) as (
select q'<'comp' as "COMPUTER",'ms' as "MOUSE" ,'keybr' as "KEYBOARD",'MONT' as "MONITOR",>'
from dual
)
select y,
regexp_replace(y, 'as ".*?"' ,null)
from x;
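The difference between the greedy ".*" and the non-greedy ".*?" is easiest to see side by side. Here is a short Python sketch using the question's string and the same pattern text; it is an illustration with Python's `re`, not Oracle's regexp_replace, and the variable names are made up.

```python
import re

source = "'comp' as \"COMPUTER\",'ms' as \"MOUSE\" ,'keybr' as \"KEYBOARD\",'MONT' as \"MONITOR\","

# Non-greedy: each match stops at the nearest closing quote,
# so every alias is removed separately.
lazy = re.sub(r'as ".*?"', "", source)

# Greedy: one match runs from the first 'as "' to the very last quote,
# swallowing everything in between.
greedy = re.sub(r'as ".*"', "", source)

print(lazy)    # 'comp' ,'ms'  ,'keybr' ,'MONT' ,
print(greedy)  # 'comp' ,
```

The lazy result keeps whatever spacing surrounded each alias, which is why 'ms' ends up followed by two spaces.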

PL/SQL Reg Exp found last number and split

How do I use a regular expression to find the last number in a string, then put everything up to and including that number plus 1 more character into column c1, and everything to the right of that into column c2?
e.g 1
string = 1234john4345 this is a test.
Result
c1 = 1234john4345
c2 = this is a test.
e.g 2
string = 1234john4345a this is a test.
Result
c1 = 1234john4345a
c2 = this is a test.
select test
--Group 1: Match everything up to the last digit, and one other character
--Group 2: Everything after group 1
,regexp_replace(test, '(.*[[:digit:]].)(.*)', '\1') c1
,regexp_replace(test, '(.*[[:digit:]].)(.*)', '\2') c2
from
(
select '1234john4345 this is a test.' test from dual union all
select '1234john4345a this is a test' test from dual
);
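The two capture groups can be inspected directly to see exactly where the split falls. This is a Python sketch of the same pattern, with `[0-9]` standing in for `[[:digit:]]`; the function name `split_after_last_digit` is hypothetical, and unlike regexp_replace (which returns the input unchanged when nothing matches) it returns an empty second part in that case.

```python
import re

# Group 1: everything up to the last digit, plus one more character.
# Group 2: whatever is left.
LAST_DIGIT_SPLIT = re.compile(r"(.*[0-9].)(.*)", re.DOTALL)


def split_after_last_digit(text):
    match = LAST_DIGIT_SPLIT.fullmatch(text)
    if match is None:
        return text, ""  # no digit followed by another character
    return match.group(1), match.group(2)


print(split_after_last_digit("1234john4345 this is a test."))
# ('1234john4345 ', 'this is a test.')
print(split_after_last_digit("1234john4345a this is a test"))
# ('1234john4345a', ' this is a test')
```

The extra character after the last digit is a space in the first case and a letter in the second, so one part or the other carries a stray space; wrapping the results in a trim removes it.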