Url regex oracle - regex

I want to do something like this :
This is the link I want to replace. So I want only to keep the "textIwantToKeep" part :
http://mylink/aaa-bbb/textIwantToKeep
And I want this :
http://mySecondLink/ccc-ddd/textIwantToKeep
I want to use regular expression with Oracle SQL Developper. I think about to count the number of slash (4) and to split only the part before the 4th slash but it doesn't work..
Thank you for your help.

REGEXP_SUBSTR might be one option; \w+$ returns the last word (i.e. the one "anchored" to the end of the string):
SQL> with test (link) as
2 (select 'http://mylink/aaa-bbb/textIwantToKeep' from dual union all
3 select 'http://mySecondLink/ccc-ddd/textIwantToKeep' from dual
4 )
5 select link,
6 regexp_substr(link, '\w+$') result
7 from test;
LINK RESULT
------------------------------------------- --------------------
http://mylink/aaa-bbb/textIwantToKeep textIwantToKeep
http://mySecondLink/ccc-ddd/textIwantToKeep textIwantToKeep
SQL>

There could be other alternatives, but here is something that came to me quickly -
WITH main_table AS (
SELECT 'http://mylink/aaa-bbb/textIwantToKeep' AS original_string FROM dual
)
,
second_table AS (
SELECT 'http://mySecondLink/ccc-ddd/' AS my_second_link FROM dual
)
SELECT
second_table.my_second_link
|| regexp_substr(main_table.original_string, '[^/]+', 1, 4) AS final_string
FROM
main_table,
second_table;
Let me know if that works.

Related

Oracle REGEXP_SUBSTR - SEMICOLON STRING EXTRACTION

I have below input and need mentioned output. How I can get it. I tried different pattern but could not get through it.
so in brief, any value having all three 1#2#3 parts(if it is present) or first value should be returned
2#9#;2#37#65 -> 2#37#65
2#9#;2#37#65;2#37# -> 2#37#65
2#9#;2#37#65;2#37#;2#37#56 -> 2#37#65 or 2#37#56
2#37#65;2#99 -> 2#37#65
3#9#;3#37#65;3#37#36;2#37#56 -> 3#37#65 or 3#37#36 or 2#37#56
2#37#;2#99# -> 2#37 or 2#99# ( in this case any value)
I tried few patterns and other pattern but no help.
regexp_substr('2#9#;2#37#65;2#37#','#[^;]+',1)
SUBSTR(REGEXP_SUBSTR(SUBSTR(uo_filiere,1,INSTR(uo_filiere,';',1)-1), '#[^#]+$'),2)
You can use a REGEXP_REPLACE here:
REGEXP_REPLACE(uo_filiere, '^(.*;)?([0-9]+(#[0-9]+){2,}).*|^([^;]+).*', '\2\4')
See the regexp demo
Details:
^ - start of string
(.*;)? - an optional Group 1 capturing any text and then a ;
([0-9]+(#[0-9]+){2,}) - Group 2 (\2): one or more digits, and then two or more occurrences of # followed with one or more digits
.* - the rest of the string
| - or
^([^;]+).* - start of string, Group 4 capturing one or more chars other than ; and then any text till end of string.
The replacement is Group 2 + Group 4 values.
Here is a simple-minded way to solve this. It may prove more efficient than other approaches, given the particular nature of the problem.
First, use a regular expression to find the first token that has all three parts. This part of the solution should be the most efficient approach for those strings that do have a three-part token, and it performs work that must be performed on all input strings in any case.
In the second part, wrap within nvl - if no three-part token is found, select the first token regardless of how many parts are present. This part uses only substr and instr in a trivial manner, so it should be very fast too.
Here's the query, run on a few more sample inputs to test those cases too.
with
sample_data (uo_filiere) as (
select '2#9#;2#37#65' from dual union all
select '2#9#;2#37#65;2#37#' from dual union all
select '2#9#;2#37#65;2#37#;2#37#56' from dual union all
select '2#37#65;2#99' from dual union all
select '3#9#;3#37#65;3#37#36;2#37#56' from dual union all
select '2#37#;2#99#' from dual union all
select '1#22#333' from dual union all
select '33#444#' from dual
)
select uo_filiere,
nvl(regexp_substr(uo_filiere, '(;|^)(([^;#]+#){2}[^;]+)', 1, 1, null, 2)
, substr(uo_filiere, 1, instr(uo_filiere || ';', ';') - 1)
) as first_value
from sample_data
;
UO_FILIERE FIRST_VALUE
---------------------------- ----------------------------
2#9#;2#37#65 2#37#65
2#9#;2#37#65;2#37# 2#37#65
2#9#;2#37#65;2#37#;2#37#56 2#37#65
2#37#65;2#99 2#37#65
3#9#;3#37#65;3#37#36;2#37#56 3#37#65
2#37#;2#99# 2#37#
1#22#333 1#22#333
33#444# 33#444#

Regular expression to get last part of a string before a character

I have this string on a single column of a single row on an oracle table:
(test-1#gmail.com-1234567)
(testAAAcccc#gmail.com-7654321)
..
Above it's a single big string.
I need a regular expression to extract all the occurrences (could be 1 or more, 2 in above example) of the 7 numbers above, so the results should be:
1234567
7654321
I'm trying to to that with various regular expression or oracle functions, I'm not able to get both the occurrences.
Could you please help me?
If you need exactly regular expression:
select regexp_substr('(test-1#gmail.com-1234567)', '\d{7}' ) from dual
To find all occurences:
select *
from t,
lateral(select level occurence_number, regexp_substr(str, '(\d{7})',1,level ) digits7
from dual
connect by level<=regexp_count(str, '(\d{7})' )
);
or test query with test data (you can run it as-is to se how it works):
with t(str) as (
select '(test-1#gmail.com-1234567)' from dual union all
select '(testAAAcccc#gmail.com-7654321)' from dual union all
select '7654321 1234567 2345678' from dual
)
select *
from t,
lateral(select level occurence_number, regexp_substr(str, '(\d{7})',1,level ) digits7
from dual
connect by level<=regexp_count(str, '(\d{7})' )
);
If you are looking to extract only numbers, if you can do that by:
Select SUBSTR(column, INSTR(column,'-', -1) + 1)
from dual;
#It is fetching everything from column after dash(-).
If you have paranthesis, if you can replace it with a space and then TRIM:
Select TRIM(replace(SUBSTR(column, INSTR(column,'-', -1) + 1),')', ' '))
from dual;

RegEx in Oracle SQL, positive look ahead

I have a string and I'd like to capture everything before "\Close_Out":
string: \fileshare\R and G\123456\Close_Out\Warranty Letter.pdf
The only solution I have come up with uses positive lookahead, this works when I test it on https://regex101.com/
(.*)(?=\\Close_Out)
But now I need to use it in an Oracle SQL statement :
select REGEXP_SUBSTR('\\fileshare\R and G\123456\Close_Out\Warranty Letter.pdf', '(.*)(?=\\Close_Out)') from dual
and it does not work since (I think) look ahead is not supported. Can someone assist with an alternative expression that will work in sql
If regular expressions isn't a must, then substr + instr does the job:
SQL> with test (col) as
2 (select '\fileshare\R and G\123456\Close_Out\Warranty Letter.pdf' from dual)
3 select substr(col, 1, instr(col,'\Close_Out') - 1) result
4 from test;
RESULT
-------------------------
\fileshare\R and G\123456
SQL>
Just for completness this REGEXP provides the result inclusive the \Close_Out
select REGEXP_SUBSTR('\\fileshare\R and G\123456\Close_Out\Warranty Letter.pdf', '.*\\Close_Out') reg from dual;
REG
------------------------------------
\\fileshare\R and G\123456\Close_Out
To get the string before it use a subexpression - a part enclosed in paretheses and reference it with the subexpression parameter = 1 (last parameter - see details in documentation).
select REGEXP_SUBSTR('\\fileshare\R and G\123456\Close_Out\Warranty Letter.pdf', '(.*)\\Close_Out', 1, 1, null, 1) reg from dual;
REG
--------------------------
\\fileshare\R and G\123456

Last word in a sentence: In SQL (regular expressions possible?)

I need this to be done in Oracle SQL (10gR2). But I guess, I would rather put it plainly, any good, efficient algorithm is fine.
Given a line (or sentence, containing one or many words, English), how will you find the last word of the sentence?
Here is what I have tried in SQL. But, I would like to see an efficient way of doing this.
select reverse(substr(reverse(&p_word_in)
, 0
, instr(reverse(&p_word_in), ' ')
)
)
from dual;
The idea was to reverse the string, find the first occurring space, retrieve the substring and reverse the string. Is it quite efficient? Is a regular expression available? I am on Oracle 10g R2. But I dont mind seeing any attempt in other programming language, I wont mind writing a PL/SQL function if need be.
Update:
Jeffery Kemp has given a wonderful answer. This works perfectly.
Answer
SELECT SUBSTR(&sentence, INSTR(&sentence,' ',-1) + 1)
FROM dual
I reckon it's simpler with INSTR/SUBSTR:
WITH q AS (SELECT 'abc def ghi' AS sentence FROM DUAL)
SELECT SUBSTR(sentence, INSTR(sentence,' ',-1) + 1)
FROM q;
Not sure how it is performance wise, but this should do it:
select regexp_substr(&p_word_in, '\S+$') from dual;
I'm not sure if you can use a regex in oracle, but wouldn't
(\w+)\W*$
work?
This regex matches the last word on a line:
\w+$
And RegexBuddy gives this code for use in Oracle:
DECLARE
match VARCHAR2(255);
BEGIN
match := REGEXP_SUBSTR(subject, '[[:alnum:]]_+$', 1, 1, 'c');
END;
this leaves the punctuation but gets the final word
with datam as (
SELECT 'abc asdb.' A FROM DUAL UNION
select 'ipso factum' a from dual union
select 'ipso factum' a from dual union
SELECT 'ipso factum2' A FROM DUAL UNION
SELECT 'ipso factum!' A FROM DUAL UNION
SELECT 'ipso factum !' A FROM DUAL UNION
SELECT 'ipso factum/**//*/?.?' A FROM DUAL UNION
SELECT 'ipso factum ...??!?!**' A FROM DUAL UNION
select 'ipso factum ..d.../.>' a from dual
)
SELECT a,
--REGEXP_SUBSTR(A, '[[:alnum:]]_+$', 1, 1, 'c') , /** these are the other examples*/
--REGEXP_SUBSTR(A, '\S+$') , /** these are the other examples*/
regexp_substr(a, '[a-zA-Z]+[^a-zA-Z]*$')
from datam
This works too, even for non english words :
SELECT REGEXP_SUBSTR ('San maria Calle Cáceres Numéro 25 principal izquierda, España', '[^ .]+$') FROM DUAL;

PL/SQL Oracle regular expression doesn't work for occurencce of zero

I have a problem in matching regular expression in Oracle PL/SQL.
To be more specific, the problem is that regex doesn't want to match any zero occurrence.
For example, I have something like:
select * from dual where regexp_like('', '[[:alpha:]]*');
and this doesn't work. But if I put space in this statement:
select * from dual where regexp_like(' ', '[[:alpha:]]*');
it works.
I want to have the first example running, so that person doesn't have to put 'space' for it to work.
Any help is appreciated, and thank you for your time.
T
For better or worse, empty strings in Oracle are treated as NULL:
SQL> select * from dual where '' like '%';
DUMMY
-----
Take that into account when querying with Oracle:
SQL> SELECT *
2 FROM dual
3 WHERE regexp_like('', '[[:alpha:]]*')
4 OR '' IS NULL;
DUMMY
-----
X
Does Oracle still treat empty strings as NULLs? And if so, does regexp_like with a NULL input source string return UNKNOWN? Both of these would be semi-reasonable, and a reason why your test does not work as you expected.