Oracle removal of complete line - regex

select REGEXP_REPLACE (' if function_text() hi how do you do text_jon','(text\S.*)','',1,0,'m') from dual;
Output for this:
if function_
I don't want this "if function_" either, i.e. the whole line should be removed and the output should be null.
How can I achieve this?
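One possible fix, sketched under the assumption that the intended pattern is `text\S.*` (the sample string contains no `test`) and that every line containing a match should be dropped entirely: anchor the pattern to the whole line with `^.*` and `$`. With the `'m'` flag these anchors match at line boundaries, and because Oracle treats an empty string as NULL, a single-line input comes back as NULL.

```sql
-- Assumption: the pattern is text\S.* rather than test\S.*
-- ^.* consumes everything before the match, $ pins the end of the line.
SELECT REGEXP_REPLACE(' if function_text() hi how do you do text_jon',
                      '^.*text\S.*$', '', 1, 0, 'm') AS result
FROM dual;
-- result is NULL: the whole line matched and was replaced by ''
```

For multi-line input this removes the content of each matching line but leaves its line terminator in place, so an extra step may be needed if the empty lines should disappear as well.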

Related

REGEXP_REPLACE for alpha to NULL

While migrating from Oracle to PostgreSQL, I found that the following Oracle queries need to be converted.
Oracle query: find the pattern and replace it with null
select regexp_replace('1', '[^0-9]', null) from dual;
select regexp_replace('a', '[^0-9]', null) from dual;
select regexp_replace('1a1', '[^0-9]', null) from dual;
My try:
According to the PostgreSQL documentation, we need to use REGEXP_REPLACE with the [[:alpha:]] pattern.
But the statement replaces with an empty string when a match is found. I'm looking for null instead.
PostgreSQL Query:
select REGEXP_REPLACE('1','[[:alpha:]]','','g') --Correct
select REGEXP_REPLACE('a','[[:alpha:]]','','g') --Wrong: output should be NULL
select REGEXP_REPLACE('1a1','[[:alpha:]]','','g') --Correct
select REGEXP_REPLACE(' ','[[:alpha:]]','','g') --Wrong: output should be NULL
We can certainly use a CASE expression like the following, but I want a single-line solution without CASE.
SELECT CASE WHEN REGEXP_REPLACE('1a','[[:alpha:]]','','g') = ''
            THEN NULL
            ELSE REGEXP_REPLACE('1a','[[:alpha:]]','','g')
       END;
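A common single-expression alternative to CASE is to wrap the call in NULLIF, which returns NULL when its two arguments are equal. The sketch below also assumes the Oracle character class `[^0-9]` is carried over unchanged, since `[[:alpha:]]` leaves spaces and punctuation in place and so would not produce NULL for `' '`.

```sql
-- NULLIF(x, '') yields NULL when x is the empty string, and x otherwise.
SELECT NULLIF(REGEXP_REPLACE('1',   '[^0-9]', '', 'g'), '');  -- 1
SELECT NULLIF(REGEXP_REPLACE('a',   '[^0-9]', '', 'g'), '');  -- NULL
SELECT NULLIF(REGEXP_REPLACE('1a1', '[^0-9]', '', 'g'), '');  -- 11
SELECT NULLIF(REGEXP_REPLACE(' ',   '[^0-9]', '', 'g'), '');  -- NULL
```

The `'g'` flag is still needed because PostgreSQL replaces only the first match by default, whereas Oracle replaces all occurrences.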

BigQuery regexp replace character between quotes

I'm trying to use the BigQuery function regexp_replace for the following scenario:
Given a string field with comma as a delimiter, I need to only remove the commas within double quotes.
I found a regex that works on regex101 (link below), but it seems that the BigQuery function doesn't support lookahead groups. Could you please help me find an equivalent expression that is supported by the BigQuery function regexp_replace?
https://regex101.com/r/nxkqtb/3
BigQuery example code (not supported):
WITH tbl AS (
SELECT 'LINE_NR="1",TXT_FIELD="Some text",CID="0"' as text
UNION ALL
SELECT 'LINE_NR="2",TXT_FIELD=",,Some text",CID="0"' as text
UNION ALL
SELECT 'LINE_NR="3",TXT_FIELD="Some text ,",CID="0"' as text
UNION ALL
SELECT 'LINE_NR="4",TXT_FIELD=",Some ,text,",CID="0"' as text
)
SELECT
REGEXP_REPLACE(text, r'(?m),(?=[^"]*"(?:[^"\r\n]*"[^"]*")*[^"\r\n]*$)', "")
FROM tbl;
Thank you
Consider the approach below (assuming you know the keys within the text field in advance):
select text,
( select string_agg(replace(kv, ',', ''), ',' order by offset)
from unnest(regexp_extract_all(text, r'((?:LINE_NR|TXT_FIELD|CID)=".*?")')) kv with offset
) corrected_text
from tbl;
Applied to the sample data in your question, this removes the commas inside the quoted values while keeping the commas that separate the key-value pairs.
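If the keys are not known in advance, a variation on the same idea might match any `key="value"` pair generically. This is only a sketch: it assumes values never contain an embedded double quote and keys never contain a comma or an equals sign. The `tbl` CTE repeats two rows from the question so the query can run on its own.

```sql
WITH tbl AS (
  SELECT 'LINE_NR="2",TXT_FIELD=",,Some text",CID="0"' AS text
  UNION ALL
  SELECT 'LINE_NR="4",TXT_FIELD=",Some ,text,",CID="0"' AS text
)
SELECT text,
  -- The regex has no capture group, so each full key="value" match is returned.
  -- Commas are stripped inside each pair, then pairs are re-joined in order.
  ( SELECT STRING_AGG(REPLACE(kv, ',', ''), ',' ORDER BY offset)
    FROM UNNEST(REGEXP_EXTRACT_ALL(text, r'[^,=]+="[^"]*"')) kv WITH OFFSET
  ) AS corrected_text
FROM tbl;
```

Because `[^,=]+` cannot start on a comma, the delimiters between pairs are never part of a match, which is what lets this avoid the lookahead.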

PL/SQL: REGEXP REPLACE with dot and pattern

I am trying to replace part of a string using regexp_replace in PL/SQL and am not getting the desired output. I am new to this. Please advise where I am going wrong.
names := 'table_200_file1_record1.column1 table_200_file2_record2.column2';
SELECT REGEXP_REPLACE(names,'([table_200]*[.]*){1,}','') FROM DUAL;
Desired output (I want to remove everything that starts with table_200, up to and including the . operator):
column1 column2
You need to match table_200 followed by everything that is not a dot, up to and including the first dot you find, and replace that with nothing, i.e.:
SELECT REGEXP_REPLACE('table_200_file1_record1.column1 table_200_file2_record2.column2','table_200[^.]+\.','') FROM DUAL;

Oracle regex eliminate all duplicate words

I would like to eliminate all duplicate words in a comma separated list.
I've tried with:
SELECT
REGEXP_REPLACE(
'1234,234,1234,1234,928,1234,123,1234,Abcd,1234,1234',
'([^,\w]+)(,[ ]*[\1])+') AS r
FROM dual
It should return
1234,234,928,123,Abcd
But in fact it returns
1234,234,234,234
Also tried with ([^,\w]+)(,[ ]*\1)+ but with '1234,1234,1234' it returns (null)
Also tried with
SELECT
REGEXP_REPLACE(
'1234,234,1234,1234,928,1234,123,1234,Abcd,1234,1234',
'([^,\w]+)(,[ ]*[\1])+', '\1') AS r
FROM dual
and with other replacements, even '\1\2', but none of them gives the desired result.
Please, any ideas?
I know this isn't exactly the method you were asking for, but it yields the same set of distinct words (one row per word):
WITH DATA AS
( SELECT '1234,234,1234,1234,928,1234,123,1234,Abcd,1234,1234' str FROM dual)
SELECT DISTINCT trim(regexp_substr(str, '[^,]+', 1, LEVEL)) str
FROM DATA
CONNECT BY instr(str, ',', 1, LEVEL - 1) > 0
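To get back a single comma-separated string in first-occurrence order, one option is to keep each token's position and aggregate with LISTAGG. This is a sketch assuming Oracle 11gR2 or later (for LISTAGG and REGEXP_COUNT); the names `tokens`, `firsts`, `pos` and `first_pos` are illustrative only.

```sql
WITH data AS
( SELECT '1234,234,1234,1234,928,1234,123,1234,Abcd,1234,1234' str FROM dual),
tokens AS
( -- one row per list element, remembering where it appeared
  SELECT TRIM(REGEXP_SUBSTR(str, '[^,]+', 1, LEVEL)) AS token, LEVEL AS pos
  FROM data
  CONNECT BY LEVEL <= REGEXP_COUNT(str, ',') + 1),
firsts AS
( -- keep only the first position of each distinct word
  SELECT token, MIN(pos) AS first_pos FROM tokens GROUP BY token)
SELECT LISTAGG(token, ',') WITHIN GROUP (ORDER BY first_pos) AS r
FROM firsts;
-- r = 1234,234,928,123,Abcd
```

The `CONNECT BY LEVEL` split relies on `data` holding a single row; for a multi-row source the hierarchy would need extra conditions to keep rows from cross-joining.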

Using Oracle Regular Expression - Masking based on pattern

With Oracle 11g PL/SQL, for the query below, can I get the capture groups' positions (something like what Matcher.start() provides in Java)?
`select regexp_replace('1234bankzone1234', '^..(.*)bank(zone).(.*)..$', '\2') from dual`
The result should look like: "zone", 9 (the start position of the text "zone").
The bigger problem I was trying to solve is to mask data like account number using patterns like '^.....(.*)..$' (this pattern can vary depending on installation).
Will something like below work for you?
select regexp_replace('1234bankzone1234', '^..(.*)bank(zone).(.*)..$', '\2') expr
,instr('1234bankzone1234',regexp_replace('1234bankzone1234', '^..(.*)bank(zone).(.*)..$', '\2')) pos from dual
or, as a more readable subquery:
select a.*, instr(a.value,a.expr) from (
select '1234bankzone1234' value,
regexp_replace('1234bankzone1234', '^..(.*)bank(zone).(.*)..$', '\2') expr from dual
) a
I couldn't find any direct equivalent of the Matcher API, and I found no way to access the group position buffer in SQL.
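For a single known capture group, one thing that may help: from Oracle 11g onward, REGEXP_INSTR and REGEXP_SUBSTR accept an optional trailing subexpression argument that selects a capture group. The sketch below assumes an 11g or later database and asks for group 2 of the pattern from the question.

```sql
-- Assumes Oracle 11g+: the last argument (2) selects the second capture group.
-- 'c' is the explicit case-sensitive match parameter.
SELECT REGEXP_SUBSTR('1234bankzone1234', '^..(.*)bank(zone).(.*)..$', 1, 1, 'c', 2)    AS expr,
       REGEXP_INSTR ('1234bankzone1234', '^..(.*)bank(zone).(.*)..$', 1, 1, 0, 'c', 2) AS pos
FROM dual;
-- expr = zone, pos = 9
```

Unlike the `instr` approach, this reports where the group itself matched rather than the first occurrence of the captured text, which matters when the same text appears earlier in the string. It does not enumerate all groups at once, so the steps below may still be needed for patterns that vary per installation.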
1: Invert the pattern (swap the captured and uncaptured parts) using this:
regexp_replace( regexp_replace( regexp_replace( regexp_replace( regexp_replace( regexp_replace( regexp_replace( regexp_replace( regexp_replace(
pattern, '(\()', '\1#') , '(\))', '#\1') , '\(#', ')#') , '\^\)#', '^') , '#\)\$', '$') , '#\)', '(#') , '#', '') , '\^([^\(]+\))', '^(\1') , '\(([^\)]+)\$', '(\1)$');
So "^(.)..(.).$" becomes "^.(..).(.)$".
2: Use this to bulk collect the index and count of the capture groups within both patterns:
SELECT REGEXP_instr(pattern, '\(.*?\)+', 1, LEVEL) bulk collect into posCapture FROM v CONNECT BY LEVEL <= REGEXP_COUNT(pattern, '\(.*?\)');
3: Match both patterns against the text-to-be-masked. Merge them by the order found in step 2.
select regexp_replace(v_src, pattern, '\' || captureIndex) into tempStr from dual;