Regex Oracle not matching as expected

I need to match and replace a string like VA123 (two letters followed by three digits), but this expression is not working as intended. Any idea where I am going wrong?
SELECT REGEXP_REPLACE ('test VA123', '^\[A-Z]{2}[0-9]{3}$', 'test')
FROM dual;
I want the output in this case to be: test test

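The pattern in the question cannot match for two reasons: ^ and $ anchor it to the whole string, so the leading 'test ' prevents a match, and \[ is an escaped literal bracket rather than the start of a character class. If you want whitespace (or the start of the string) before the matched string, you can use: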
SELECT REGEXP_REPLACE ('test VA123', '(^|\s)[A-Z]{2}[0-9]{3}$', '\1test')
AS replaced_value
FROM dual;
| REPLACED_VALUE |
| :------------- |
| test test |
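To see the effect of the anchors outside the database, the two patterns can be tried in a small script. This is only a sketch: it uses Python's `re` module as a stand-in, which is an assumption (it is not Oracle's regex engine, although these particular constructs are read the same way there), and the helper name `replace_code` is made up for illustration.

```python
import re

# Pattern from the question: ^ pins the match to the start of the string,
# and \[ is a literal "[" rather than the opening of a character class.
ORIGINAL = r"^\[A-Z]{2}[0-9]{3}$"

# Pattern from the answer: the code may follow whitespace or the string start.
FIXED = r"(^|\s)[A-Z]{2}[0-9]{3}$"


def replace_code(text: str, pattern: str, replacement: str) -> str:
    """Hypothetical helper: substitute every match of pattern in text."""
    return re.sub(pattern, replacement, text)


if __name__ == "__main__":
    print(replace_code("test VA123", ORIGINAL, "test"))     # no match, unchanged
    print(replace_code("test VA123", FIXED, r"\1test"))     # space is kept via \1
    print(replace_code("VA123", FIXED, r"\1test"))          # start-of-string case
```

The `\1` in the replacement puts back whatever the first group consumed (a whitespace character or nothing), so the separator before the code is not lost.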

Related

Redshift Translate command to replace characters

I need to translate commas in a column to pipes with spaces on each side in Redshift ('a,b,c' becomes 'a | b | c') using Translate. Something in this statement is not giving me the desired result and I can't figure out why.
select 'a,b,c' as comma_string, translate(comma_string, ',', ' | ' ) as pipe_string
yields 'a b c' with no pipes. I am having trouble getting the space before and after the pipe, since
select 'a,b,c' as comma_string, translate(comma_string, ',', '|' ) as pipe_string
gives me 'a|b|c'

The REPLACE function works for this. Not sure why Translate doesn't.
select 'a,b,c' as comma_string, REPLACE(comma_string, ',' ,' | ') as pipe_string
yields the desired result 'a | b | c'

You would need to use REPLACE since TRANSLATE only maps single characters:
TRANSLATE is similar to the REPLACE function and the REGEXP_REPLACE function, except that REPLACE substitutes one entire string with another string and REGEXP_REPLACE lets you search a string for a regular expression pattern, while TRANSLATE makes multiple single-character substitutions.
https://docs.aws.amazon.com/redshift/latest/dg/r_TRANSLATE.html
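The difference between the two functions can be sketched in a few lines. The block below is a Python emulation, not Redshift code: `translate_like` is a hypothetical helper that pairs characters by position, which is the behaviour the quoted documentation describes, and the handling of unpaired characters (dropped) is an assumption on my part.

```python
def translate_like(text: str, chars_to_replace: str, chars_to_substitute: str) -> str:
    """Hypothetical emulation of a positional, character-by-character TRANSLATE."""
    mapping = {}
    for i, ch in enumerate(chars_to_replace):
        if ch in mapping:
            continue  # the first occurrence of a character wins
        # Assumption: a character with no partner is removed from the result.
        mapping[ch] = chars_to_substitute[i] if i < len(chars_to_substitute) else ""
    return "".join(mapping.get(ch, ch) for ch in text)


if __name__ == "__main__":
    # ',' is paired only with the FIRST character of ' | ', which is a space.
    print(translate_like("a,b,c", ",", " | "))
    print(translate_like("a,b,c", ",", "|"))
    # A whole-string replacement has no such limit.
    print("a,b,c".replace(",", " | "))
```

Because only the first character of ' | ' has a partner in ',', the pipe and the trailing space are never used, which matches the 'a b c' result reported in the question.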

Replacing multiple special characters in oracle

I have a requirement in Oracle to replace the special characters at the first and last positions of the column data.
Requirement: only [][.,$'*&!%^{}-?] and alphanumeric characters are allowed to stay in the address data, and all other characters have to be replaced with a space. I have tried several variations of the approach below, but it is not working as expected. Please help me resolve this.
SELECT emp_address,
REGEXP_REPLACE(
emp_address,
'^[^[[][.,$'\*&!%^{}-?\]]]|[^[[][.,$'\*&!%^{}-?\]]]$'
) AS simplified_emp_address
FROM table_name

As per the regular expression operators and metasymbols documentation:
Put ] as the first character of the (negated) character group;
Put - as the last character; and
Do not put . immediately after [, or it can be matched as the start of a collation element [..] if there is a second . later in the expression.
Also:
Double up the single quote (to escape it, so that it does not terminate the string literal); and
Include the non-special characters a-zA-Z0-9 in the character group too, otherwise they will be matched and removed.
Which gives you the regular expression:
SELECT emp_address,
REGEXP_REPLACE(
emp_address,
'^[^][,.$''\*&!%^{}?a-zA-Z0-9-]|[^][,.$''\*&!%^{}?a-zA-Z0-9-]$'
) AS simplified_emp_address
FROM table_name
Which, for the sample data:
CREATE TABLE table_name (emp_address) AS
SELECT '"test1"' FROM DUAL UNION ALL
SELECT '$test2$' FROM DUAL UNION ALL
SELECT '[test3]' FROM DUAL UNION ALL
SELECT 'test4' FROM DUAL UNION ALL
SELECT '|test5|' FROM DUAL;
Outputs:
| EMP_ADDRESS | SIMPLIFIED_EMP_ADDRESS |
| :---------- | :--------------------- |
| "test1" | test1 |
| $test2$ | $test2$ |
| [test3] | [test3] |
| test4 | test4 |
| \|test5\| | test5 |
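For comparison, the same idea can be sketched in Python's `re` syntax and run against the sample data above. Note the assumption: Python is not Oracle, and it treats a backslash as an escape inside a character class, so here the brackets and the backslash are escaped instead of being placed in a special position. The names `ALLOWED`, `EDGE_JUNK` and `simplify` are made up for this illustration.

```python
import re

# Allowed characters, written for Python's re syntax. Unlike Oracle's POSIX
# bracket expressions, Python treats "\" as an escape inside [...], so "]",
# "[" and "\" are escaped here instead of relying on their position.
ALLOWED = r"\]\[,.$'\\*&!%^{}?a-zA-Z0-9-"

# One disallowed character at the very start, or one at the very end.
EDGE_JUNK = re.compile("^[^" + ALLOWED + "]|[^" + ALLOWED + "]$")


def simplify(address: str) -> str:
    """Hypothetical helper: drop a disallowed first and/or last character."""
    return EDGE_JUNK.sub("", address)


if __name__ == "__main__":
    for sample in ['"test1"', "$test2$", "[test3]", "test4", "|test5|"]:
        print(sample, "->", simplify(sample))
```

As in the SQL above, the match is removed rather than replaced with a space; passing `" "` instead of `""` to `sub` would give the space-padded variant the question asks for.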

You do not need regular expressions here, because they require cumbersome escape sequences. Use the SUBSTR and TRANSLATE functions instead:
with a as (
select
'some [data ]' as val
from dual
union all
select '{test $' from dual
union all
select 'clean $%&* value' from dual
union all
select 's' from dual
)
select
translate(substr(val, 1, 1), q'{ [][.,$'*&!%^{}-?]}', ' ')
|| substr(val, 2, lengthc(val) - 2)
|| case
when lengthc(val) > 1
then translate(substr(val, -1), q'{ [][.,$'*&!%^{}-?]}', ' ')
end
as value_replaced
from a
| VALUE_REPLACED |
| :--------------- |
| some [data |
| test |
| clean $%&* value |
| s |

How to write fuzzy multiple substring matching when using RLIKE in Hive

For example:
df.select('category').show()
+---------------------------+
| category|
+---------------------------+
| money,insurance|
| life, housework|
| game,FPS,network|
| game,fight,jump|
| hotel|
| trip,hotel|
| null|
I want to use RLIKE with a regular expression to fuzzy-match any substring from the list ['money', 'life'].
-- This is an exact match
SELECT *
FROM tb_name
WHERE col_name RLIKE '(money|life)'
-- This is a fuzzy match
SELECT *
FROM tb_name
WHERE col_name RLIKE '*.(money|life)'
But the fuzzy-match snippet fails with an error in the AST tree:
06-11 16:59:17-fatal filter ast tree
(TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TAB tb_name))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR "hdfs://XXXX/XX")) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (RLIKE (TOK_TABLE_OR_COL col_name ) '*.(money|life)')) (TOK_LIMIT 2000)))
06-11 16:59:17-fatal Filter feature: .TOK_TAB \S tdw_inter_db.*|.TOK_(CUBE|ROLLUP) .
I can't see anything wrong with the fuzzy-match snippet.
Could anyone help me?
Thanks in advance.

The regexp '(?i)money|life' matches strings containing either money or life; the (?i) flag makes the match case-insensitive.
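A quick way to check both points is to try the patterns against the sample categories. The sketch below uses Python's `re` module as a stand-in for Hive's RLIKE, which is an assumption: Hive relies on Java regular expressions, and the post does not confirm what actually triggered the logged filter error. The helper names are hypothetical. What the sketch does show is that a search-style match needs no leading wildcard, and that a pattern starting with a bare `*` has nothing to repeat and is rejected when it is compiled.

```python
import re

CATEGORIES = ["money,insurance", "life, housework", "game,FPS,network",
              "game,fight,jump", "hotel", "trip,hotel", None]

PATTERN = re.compile(r"(?i)money|life")


def matches(value):
    """Hypothetical stand-in for `value RLIKE pattern`: NULL never matches."""
    return value is not None and PATTERN.search(value) is not None


def leading_star_is_valid() -> bool:
    """Check whether a pattern that starts with a bare * even compiles."""
    try:
        re.compile(r"*.(money|life)")
    except re.error:
        return False
    return True


if __name__ == "__main__":
    print([c for c in CATEGORIES if matches(c)])
    print(leading_star_is_valid())
```

If a leading wildcard is wanted anyway, the dot has to come first (`.*`), since `*` repeats the element before it.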

Remove special characters from string on insert?

I have a field of type character varying. On insert I'd like to strip out special characters. In this particular case I'd like to strip the hyphens from a column of hyphenated strings, e.g. take hyphen_field "123-456-789" from table_two and insert it as "123456789" into non_hyphen_field in table_one. I'm starting with a statement of the following form:
INSERT INTO schema.table_one(var_one,var_two,non_hyphen_field)
SELECT var_one, var_two, hyphen_field
FROM schema.table_two;
What is the cleanest way to accomplish this?

In Postgres you can use the replace function.
select replace('123-456-789', '-','');
| replace |
| :-------- |
| 123456789 |
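Applied to the statement from the question, the call goes inside the SELECT that feeds the INSERT. The sketch below runs that statement through Python's built-in `sqlite3` module so it can be tried without a server. That is an assumption made for convenience: SQLite stands in for Postgres (both provide a three-argument `replace`), the `schema.` prefix is dropped, and the column types and sample row are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_two (var_one TEXT, var_two TEXT, hyphen_field TEXT);
    CREATE TABLE table_one (var_one TEXT, var_two TEXT, non_hyphen_field TEXT);
    INSERT INTO table_two VALUES ('a', 'b', '123-456-789');
    """
)

# Strip the hyphens inside the SELECT that feeds the INSERT.
conn.execute(
    """
    INSERT INTO table_one (var_one, var_two, non_hyphen_field)
    SELECT var_one, var_two, replace(hyphen_field, '-', '')
    FROM table_two
    """
)

rows = conn.execute(
    "SELECT var_one, var_two, non_hyphen_field FROM table_one"
).fetchall()
print(rows)
```

The source table is left untouched; only the copied value is cleaned.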

Oracle SQL Regex not returning expected results

I am using a regex that works perfectly in Java/PHP/regex testers.
\d(?:[()\s#-]*\d){3,}
Examples: https://regex101.com/r/oH6jV0/1
However, trying to use the same regex in Oracle SQL is returning no results. Take for example:
select *
from
(select column_value str from table(sys.dbms_debug_vc2coll('123','1234','12345','12 135', '1', '12 3')))
where regexp_like(str, '\d(?:[()\s#-]*\d){3,}');
This returns no rows. Why does this act so differently? I even used a regex tester that does POSIX ERE, but that still works.

Oracle does not support non-capturing groups (?:). You will need to use a capturing group instead.
It also does not recognise the Perl-style whitespace metacharacter \s inside a bracket expression [] (there it matches the characters \ and s instead of whitespace). You will need to use the POSIX character class [:space:] instead.
Query (Oracle 11g R2):
select *
from (
select column_value str
from table(sys.dbms_debug_vc2coll('123','1234','12345','12 135', '1', '12 3'))
)
where regexp_like(str, '\d([()[:space:]#-]*\d){3,}')
Results:
| STR |
|--------|
| 1234 |
| 12345 |
| 12 135 |
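Swapping the non-capturing group for a capturing one does not change which strings pass the filter, since the group is only used for repetition. A small sketch in Python's `re` module (an assumption: Python accepts both group forms and `\s` inside a class, so it only illustrates the equivalence, not Oracle's behaviour) shows both variants selecting the same rows as the results above. `filter_like` is a hypothetical stand-in for the `WHERE regexp_like(...)` clause.

```python
import re

SAMPLES = ["123", "1234", "12345", "12 135", "1", "12 3"]

# Original pattern with a non-capturing group, and the same pattern with a
# capturing group as suggested for Oracle.
NON_CAPTURING = re.compile(r"\d(?:[()\s#-]*\d){3,}")
CAPTURING = re.compile(r"\d([()\s#-]*\d){3,}")


def filter_like(pattern, values):
    """Hypothetical stand-in for WHERE REGEXP_LIKE(str, pattern)."""
    return [v for v in values if pattern.search(v)]


if __name__ == "__main__":
    print(filter_like(NON_CAPTURING, SAMPLES))
    print(filter_like(CAPTURING, SAMPLES))
```

Both patterns require at least four digits, optionally separated by brackets, whitespace, # or -, which is why '123' and '12 3' are filtered out.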
