Converting to number from string with special characters - regex

I have a input column with below values.
1.2%
111.00$
$100.00
aa
ss
Output expected
1.2
111.00
100.00
null
null
I am trying to use REGEXP_REPLACE and tried replacing every character that is not a digit or "." so that 1.2% will become 1.2.
Here is the query I tried but this didn't work.
regexp_replace('%1.aa2', '[^[\d|\.]]', '')
Can anyone suggest how to do that? and what I am doing wrong? I am working on Oracle 11.2 database with pl/sql developer.

Use POSIX class for matching digit chars ie [:digit:]. So this negated class [^[:digit:].] should match any character but not of dot or digit.
regexp_replace('%1.aa2', '[^[:digit:].]', '')

You can use the [^0-9.] regex and replace with empty string:
with testdata(txt) as (
select 'ss' from dual
union
select 'aa' from dual
union
select '$100.00' from dual
union
select '111.00$' from dual
union
select '1.2%' from dual
)
select regexp_replace(txt, '[^0-9.]', '')
from testdata;
Result of an SQL fiddle:

Related

matching patterns for regexp_like in Oracle to include a group of character conditionally

I tried to look up for a good documentation for matching pattern for use of regexp_like in Oracle. I have found some and followed their instructions but looks like I have missed something or the instruction is not comprehensive.
Let's look at this example:
SELECT * FROM
(
SELECT 'ABC' T FROM DUAL
UNION
SELECT 'WZY' T FROM DUAL
UNION
SELECT 'WZY_' T FROM DUAL
UNION
SELECT 'WZYEFG' T FROM DUAL
UNION
SELECT 'WZY_EFG' T FROM DUAL
) C
WHERE regexp_like(T, '(^WZY)+[_]{0,1}+[A-Z]{0,6}')
What I expect to receive are WZY and WZY_EFG. But what I got was:
What I would like to have is the "_" could be present or not but if there are character after the first group, it is mandatory that it be present only once.
Is there a clean way to do this?
Use a subexpression grouping to make sure the _ character appears only with Capitalized Alphabetical Characters
Yes, your pattern does not address the conditional logic you need (only see the _ when capitalized alphabetical characters follow).
Placing the _ character in with a capitalized alphabetical character list into a subexpression grouping forces this logic.
Finally, placing the end of line anchor addresses the zero match scenarios.
SCOTT#DB>SELECT
2 *
3 FROM
4 (
5 SELECT 'ABC' t FROM dual
6 UNION ALL
7 SELECT 'WZY' t FROM dual
8 UNION ALL
9 SELECT 'WZY_' t FROM dual
10 UNION ALL
11 SELECT 'WZYEFG' t FROM dual
12 UNION ALL
13 SELECT 'WZY_EFG' t FROM dual
14 ) c
15 WHERE
16 REGEXP_LIKE ( t, '^(WZY)+([_][A-Z]{1,6}){0,1}$' );
T
__________
WZY
WZY_EFG

Regex extract positive or negative number from string oracle

I am struggling in finding the solution to the following problem:
Assuming the character values 512a, -1230b, -2 and 2.
Which can be obtained into a table from the following query:
with my_input_values as (
select '512a' my_val from dual union select '-1230b' my_val from dual union select '2' my_val from dual union select '-2' my_val from dual
)
select * from my_input_values;
I am trying to build the regular expression that extract the number keeping the positive or negative sign from each value.
The expected result are the following numeric values: 512, -1230, 2 and -2.
Which can be obtained into a table through the following query:
with result as (
select 512 my_val from dual union select -1230 my_val from dual union select 2 my_val from dual union select -2 my_val from dual
)
select * from result;
You could match an optional character (or use an asterix * instead of ? to match zero or more characters) followed by a quote. Then you could replace that match with an empty string.
[a-z]?'
For completeness of the answer, the Oracle code to accomplish this task is the following:
select regexp_replace('-asd51assaddasd2a', '[a-z]?') from dual;
which successfully results in a numberic field maintaining the polarity sign
select -512 from dual;

How to replace a single number from a comma separated string in oracle using regular expressions?

I have the following set of data where I need to replace the number 41 with another number.
column1
41,46
31,41,48,55,58,121,122
31,60,41
41
We can see four conditions here
41,
41
,41,
41,
I have written the following query
REGEXP_replace(column1,'^41$|^41,|,41,|,41$','xx')
where xx is the number to be replaced.
This query will replace the comma as well which is not expected.
Example : 41,46 is replaced as xx46. Here the expected output is xx,46. Please note that there are no spaced between the comma and numbers.
Can somebody help out how to use the regex?
Assuming the string is comma separated, You can use comma concatenation with replace and trim to do the replacement. No regex needed. You should avoid regex as the solution is likely to be slow.
with t (column1) as (
select '41,46' from dual union all
select '31,41,48,55,58,121,122' from dual union all
select '31,60,41' from dual union all
select '41' from dual
)
-- Test data setup. Actual solution is below: --
select
column1,
trim(',' from replace(','||column1||',', ',41,', ',17,')) as replaced
from t;
Output:
COLUMN1 REPLACED
41,46 17,46
31,41,48,55,58,121,122 31,17,48,55,58,121,122
31,60,41 31,60,17
41 17
4 rows selected.
Also, it's worth noting here that the comma separated strings is not the right way of storing data. Normalization is your friend.

Query for regular expression - oracle DB

I am trying to get all the name which has special char and numbers in it. I want to know if there is a better way of writing this query. Thanks
select Name from TABLE1
WHERE NAME LIKE '%-%'
OR NAME LIKE '%$%'
OR NAME LIKE '%4%' etc
Try this:
SELECT Name FROM Test WHERE REGEXP_LIKE(Name,'(-|$|4)');
The | (OR or Alternation) operator allows the expression to return true if the value contains an '-' OR an '$' OR an '4' etc.
Use square brackets to define a character class. Also a full regex to anchor the pattern to the beginning of the string, any number of any characters before a character class consisting of a dash or a 4 or a dollar sign then any number of any characters anchored to the end of the string:
SQL> with tbl(name) as (
2 select 'efs' from dual
3 union
4 select 'abc-' from dual
5 union
6 select 'a4bd' from dual
7 union
8 select 'gh$dll' from dual
9 union
10 select 'xy5zzy' from dual
11 )
12 select name from tbl
13 where regexp_like(name, '^.*[-|$|4].*$');
NAME
------
a4bd
abc-
gh$dll
SQL>

Oracle REGEXP_REPLACE replace space in middle with empty string

I'm trying to use the Oracle REGEXP_REPLACE function to replace a whitespace (which is in the middle of a string) with an empty string.
One of my columns contains strings like the following one.
[alphanumeric][space][digits][space][alpha] (eg. R4SX 315 GFX)
Now, I need to replace ONLY the second whitespace (the whitespace after the digits) with an empty string (i.e. R4SX 315 GFX --> R4SX 315GFX)
To achieve this, I tried the following code:
SELECT REGEXP_REPLACE(
'R4SX 315 GFX',
'([:alphanum:])\s(\d)\s([:alpha:])',
'\1 \2\3') "REPLACED"
FROM dual;
However, the result that I get is the same as my input (i.e. R4SX 315 GFX).
Can someone please tell me what I have done wrong and please point me in the right direction.
Thanks in advance.
[:alphanum:]
alphanum is incorrrect. The alphanumeric character class is [[:alnum:]].
You could use the following pattern in the REGEXP_REPLACE:
([[:alnum:]]{4})([[:space:]]{1})([[:digit:]]{3})([[:space:]]{1})([[:alpha:]]{3})
Using REGEXP
SQL> SELECT REGEXP_REPLACE('R4SX 315 GFX',
2 '([[:alnum:]]{4})([[:space:]]{1})([[:digit:]]{3})([[:space:]]{1})([[:alpha:]]{3})',
3 '\1\2\3\5')
4 FROM DUAL;
REGEXP_REPL
-----------
R4SX 315GFX
SQL>
If you are not sure about the number of characters in each expression of the pattern, then you could do:
SQL> SELECT REGEXP_REPLACE('R4SX 315 GFX',
2 '([[:alnum:]]+[[:blank:]]+[[:digit:]]+)[[:blank:]]+([[:alpha:]]+)',
3 '\1\2')
4 FROM dual;
REGEXP_REPL
-----------
R4SX 315GFX
SQL>
Using SUBSTR and INSTR
The same could be done with substr and instr which wouldbe less resource consuming than regexp.
SQL> WITH DATA AS
2 ( SELECT 'R4SX 315 GFX' str FROM DUAL
3 )
4 SELECT SUBSTR(str, 1, instr(str, ' ', 1, 2) -1)
5 ||SUBSTR(str, instr(str, ' ', 1, 2) +1, LENGTH(str)-instr(str, ' ', 1, 2)) new_str
6 FROM DATA;
NEW_STR
-----------
R4SX 315GFX
SQL>
Your regex contains an invalid class alphanum. Also, these classes must be used inside character classes [...]. Instead of \s, you need to use a supported [:blank:] class. More details on the regex syntax in MySQL can be found here.
I recommend using
SELECT REGEXP_REPLACE(
'R4SX 315 GFX',
'([[:alnum:]]+[[:blank:]]+[[:digit:]]+)[[:blank:]]+([[:alpha:]]+)'
, '\1\2') "REGEXP_REPLACE"
FROM dual;
This way you will use just 2 capturing groups. The less we have the better is for performance. Here you can see more details on REGEXP_REPLACE function.