To remove leading alpha characters from string - regex

I have to Ignore the leading zeros, leading/trailing spaces, leading alpha characters from a string. Please help what regexp can be used.
For example:
the string is abc123abc , then it needs to return 123abc.
Presently i used
REGEXP_SUBSTR('abc123abc','([1-9]+[0-9]*)( *)$')
but it returns null for me.

Something like this?
SQL> with test (col) as
2 (select 'abc123abc' from dual union all
3 select ' 1234ddc' from dual union all
4 select '0abcd' from dual union all
5 select '18858 ' from dual union all
6 select 'ab123ab45' from dual
7 )
8 select col, trim(regexp_replace(col, '^[[:alpha:]]+| |0')) result
9 from test;
COL RESULT
--------- ---------
abc123abc 123abc
1234ddc 1234ddc
0abcd abcd
18858 18858
ab123ab45 123ab45
SQL>

Vaish,
This is how the Regex should be.
What this will do, it will remove any leading spaces, leading zeros.
Example of query:
SELECT REGEXP_SUBSTR('abc123abc','[1-9]+.*') from dual;
You can see some examples I have tried here, plus you can test some more here too.
https://regex101.com/r/zfohRB/1
Regex: '[1-9]+.*'
Explanation:
[1-9] - This will look for the number to start. Excluding 0.
+ - Quantifier + denotes 1 or more.
. - Means anything after that.
* - Means 0 or more. (You can replace this with + if you think that you need at least something after numbers.)
Good Luck

Related

check if string has the format : 2 lettes au max following with numbers with regexp_like oracle

Could you please tell me the expression regexp_like to check if a string has the format which begins with one or two letters max followed by numbers. For example, like 'A25', 'AB567'.
Thanks
How about
SQL> with test (col) as
2 (select 'A25' from dual union all
3 select 'AB567' from dual union all
4 select 'A12A532A' from dual union all
5 select 'ABC123' from dual union all
6 select '12XYZ34' from dual
7 )
8 select col,
9 case when regexp_like(col, '^[[:alpha:]]{1,2}[[:digit:]]+$') then 'OK'
10 else 'Wrong'
11 end result
12 from test;
COL RESULT
---------- ----------
A25 OK
AB567 OK
A12A532A Wrong
ABC123 Wrong
12XYZ34 Wrong
SQL>
The regular expression would be ^[A-Z]{1,2}[0-9]+$.
Working demo on db<>fiddle here
This match a string that contain only numbers and letters, that start with a letter, when there is never more than 2 letters in a row.
^(([A-Za-z]){1,2}([0-9])+)*\2{0,2}$

matching patterns for regexp_like in Oracle to include a group of character conditionally

I tried to look up for a good documentation for matching pattern for use of regexp_like in Oracle. I have found some and followed their instructions but looks like I have missed something or the instruction is not comprehensive.
Let's look at this example:
SELECT * FROM
(
SELECT 'ABC' T FROM DUAL
UNION
SELECT 'WZY' T FROM DUAL
UNION
SELECT 'WZY_' T FROM DUAL
UNION
SELECT 'WZYEFG' T FROM DUAL
UNION
SELECT 'WZY_EFG' T FROM DUAL
) C
WHERE regexp_like(T, '(^WZY)+[_]{0,1}+[A-Z]{0,6}')
What I expect to receive are WZY and WZY_EFG. But what I got was:
What I would like to have is the "_" could be present or not but if there are character after the first group, it is mandatory that it be present only once.
Is there a clean way to do this?
Use a subexpression grouping to make sure the _ character appears only with Capitalized Alphabetical Characters
Yes, your pattern does not address the conditional logic you need (only see the _ when capitalized alphabetical characters follow).
Placing the _ character in with a capitalized alphabetical character list into a subexpression grouping forces this logic.
Finally, placing the end of line anchor addresses the zero match scenarios.
SCOTT#DB>SELECT
2 *
3 FROM
4 (
5 SELECT 'ABC' t FROM dual
6 UNION ALL
7 SELECT 'WZY' t FROM dual
8 UNION ALL
9 SELECT 'WZY_' t FROM dual
10 UNION ALL
11 SELECT 'WZYEFG' t FROM dual
12 UNION ALL
13 SELECT 'WZY_EFG' t FROM dual
14 ) c
15 WHERE
16 REGEXP_LIKE ( t, '^(WZY)+([_][A-Z]{1,6}){0,1}$' );
T
__________
WZY
WZY_EFG

Regular Expression - Not matching certain characters and position of characters

SOLUTION:
Finally solved it using the regex provided by Gary_W below and a simple PowerShell command that uses the discussed replacement function. So there was no need to use the built in regex activity in the software we use. Here´s the PS:
"100,000.00" -replace "([,.]\d{2}$)|[,.]",""
Regular Expressions are freaking me out. I cannot get used to that logic. However, I think my current RE problem is a quite simple one bur I cannot make it work :(
So here´s what I want to achieve:
I want the RE to match only the digits before the last two decimal places.
Thus, the RE must ignore any "." and "," AND always the last two digits.
> Examples:
> 1.000.000,00 --> 1000000
> 123,456.00 --> 123456
> 100.000,00 --> 100000
> 10.000,00 --> 10000
> 10,000.00 --> 10000
> 1.000,00 --> 1000
> 100,00 --> 100
> 99.88 --> 99
> 99,88 --> 99
> 1,23 --> 1
> ...
Any ideas how to get this working?
Here's how I would do it in Oracle, for what it's worth. Maybe the regex used here will give you an idea. Read the regex as "Look for a match of a comma or a decimal followed by 2 digits at the end of the line, OR a comma or a decimal and replace with nothing.
Note the match for the optional decimal places at the end needs to be first in the regex, otherwise the single characters are matched first, making the 2 decimal places non-existent and thus not matched.
SQL> with tbl(str) as (
select '1.000.000,00' from dual union all
select '123,456.00' from dual union all
select '100.000,00' from dual union all
select '10.000,00' from dual union all
select '10,000.00' from dual union all
select '1.000,00' from dual union all
select '100,00' from dual union all
select '99.88' from dual union all
select '99,88' from dual union all
select '1,23' from dual union all
select '3' from dual
)
select str,
regexp_replace(str, '([,.]\d{2}$)|[,.]') fixed
from tbl;
STR FIXED
------------ ------------
1.000.000,00 1000000
123,456.00 123456
100.000,00 100000
10.000,00 10000
10,000.00 10000
1.000,00 1000
100,00 100
99.88 99
99,88 99
1,23 1
3 3
11 rows selected.
SQL>
Just saw the regexr link, plugging in my regex looks like it works with the global flag. The characters you wish to remove are highlighted.
In which language/with which tool? With sed, you can do:
sed 's/\(.*\)[\.,]../\1/;s/[\.,]//g'
In perl it's similar, just without the initial backslashes:
perl -pe 's/(.*)[\.,]../\1/;s/[\.,]//g'
This is done with two regexes, by the way. The first one reads "save all that you can, up to a dot or a comma followed by two chars, and then replace the whole match with that". The second one reads "replace all dots and commas with nothing", that is, "remove all dots and commas".
In regexr.com you can use "Replace" in Tools to replace the match with the first capture group. Just put (.*)[\.,].. in Expression, and $1 in Replace, to see the first regex working. Then you can do something similar with the second one, as regexr doesn't support chaining of expressions, as far as I can see.

Query for regular expression - oracle DB

I am trying to get all the name which has special char and numbers in it. I want to know if there is a better way of writing this query. Thanks
select Name from TABLE1
WHERE NAME LIKE '%-%'
OR NAME LIKE '%$%'
OR NAME LIKE '%4%' etc
Try this:
SELECT Name FROM Test WHERE REGEXP_LIKE(Name,'(-|$|4)');
The | (OR or Alternation) operator allows the expression to return true if the value contains an '-' OR an '$' OR an '4' etc.
Use square brackets to define a character class. Also a full regex to anchor the pattern to the beginning of the string, any number of any characters before a character class consisting of a dash or a 4 or a dollar sign then any number of any characters anchored to the end of the string:
SQL> with tbl(name) as (
2 select 'efs' from dual
3 union
4 select 'abc-' from dual
5 union
6 select 'a4bd' from dual
7 union
8 select 'gh$dll' from dual
9 union
10 select 'xy5zzy' from dual
11 )
12 select name from tbl
13 where regexp_like(name, '^.*[-|$|4].*$');
NAME
------
a4bd
abc-
gh$dll
SQL>

Oracle REGEXP_REPLACE replace space in middle with empty string

I'm trying to use the Oracle REGEXP_REPLACE function to replace a whitespace (which is in the middle of a string) with an empty string.
One of my columns contains strings like the following one.
[alphanumeric][space][digits][space][alpha] (eg. R4SX 315 GFX)
Now, I need to replace ONLY the second whitespace (the whitespace after the digits) with an empty string (i.e. R4SX 315 GFX --> R4SX 315GFX)
To achieve this, I tried the following code:
SELECT REGEXP_REPLACE(
'R4SX 315 GFX',
'([:alphanum:])\s(\d)\s([:alpha:])',
'\1 \2\3') "REPLACED"
FROM dual;
However, the result that I get is the same as my input (i.e. R4SX 315 GFX).
Can someone please tell me what I have done wrong and please point me in the right direction.
Thanks in advance.
[:alphanum:]
alphanum is incorrrect. The alphanumeric character class is [[:alnum:]].
You could use the following pattern in the REGEXP_REPLACE:
([[:alnum:]]{4})([[:space:]]{1})([[:digit:]]{3})([[:space:]]{1})([[:alpha:]]{3})
Using REGEXP
SQL> SELECT REGEXP_REPLACE('R4SX 315 GFX',
2 '([[:alnum:]]{4})([[:space:]]{1})([[:digit:]]{3})([[:space:]]{1})([[:alpha:]]{3})',
3 '\1\2\3\5')
4 FROM DUAL;
REGEXP_REPL
-----------
R4SX 315GFX
SQL>
If you are not sure about the number of characters in each expression of the pattern, then you could do:
SQL> SELECT REGEXP_REPLACE('R4SX 315 GFX',
2 '([[:alnum:]]+[[:blank:]]+[[:digit:]]+)[[:blank:]]+([[:alpha:]]+)',
3 '\1\2')
4 FROM dual;
REGEXP_REPL
-----------
R4SX 315GFX
SQL>
Using SUBSTR and INSTR
The same could be done with substr and instr which wouldbe less resource consuming than regexp.
SQL> WITH DATA AS
2 ( SELECT 'R4SX 315 GFX' str FROM DUAL
3 )
4 SELECT SUBSTR(str, 1, instr(str, ' ', 1, 2) -1)
5 ||SUBSTR(str, instr(str, ' ', 1, 2) +1, LENGTH(str)-instr(str, ' ', 1, 2)) new_str
6 FROM DATA;
NEW_STR
-----------
R4SX 315GFX
SQL>
Your regex contains an invalid class alphanum. Also, these classes must be used inside character classes [...]. Instead of \s, you need to use a supported [:blank:] class. More details on the regex syntax in MySQL can be found here.
I recommend using
SELECT REGEXP_REPLACE(
'R4SX 315 GFX',
'([[:alnum:]]+[[:blank:]]+[[:digit:]]+)[[:blank:]]+([[:alpha:]]+)'
, '\1\2') "REGEXP_REPLACE"
FROM dual;
This way you will use just 2 capturing groups. The less we have the better is for performance. Here you can see more details on REGEXP_REPLACE function.