Regular Expressions _# at end of string

I am using the REGEXP_LIKE function in Oracle 10g to find values in a column with a suffix of _# (like _1, _2, etc.). I can find _# in any part of the value with the query below, but can I return only values with _# at the end?
SELECT * FROM Table WHERE REGEXP_LIKE (COLUMN,'_[[:digit:]]')

Sure. Use...
SELECT * FROM Table WHERE REGEXP_LIKE (COLUMN,'_[[:digit:]]$')
The $ character matches "the end of the string."
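The effect of the end anchor can be checked outside the database. The following is a minimal sketch in Python, whose `re` module only stands in for Oracle's regex engine here (an assumption made for illustration). `[0-9]` replaces `[[:digit:]]`, which `re` does not support, and `\Z` is used as the end anchor because Python's `$` also matches just before a trailing newline. The sample values are made up.

```python
import re

# Unanchored pattern: "_" followed by a digit anywhere in the value.
ANYWHERE = re.compile(r"_[0-9]")
# Anchored pattern: the same pair must be the last two characters.
AT_END = re.compile(r"_[0-9]\Z")

# Hypothetical sample values, not taken from any real table.
values = ["item_1", "item_2_x", "a_3b_4", "plain"]

matches_anywhere = [v for v in values if ANYWHERE.search(v)]
matches_at_end = [v for v in values if AT_END.search(v)]

print(matches_anywhere)
print(matches_at_end)
```

Only the values whose final two characters are an underscore and a digit survive the anchored pattern.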

No need to use regular expressions.
select * from table where substr(column,-2) between '_0' and '_9';
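The comparison works because every two-character string that sorts between '_0' and '_9' must start with an underscore and end with a digit. A rough Python sketch of the same idea follows; the function name is hypothetical, and Python compares strings by code point, which may differ from the sort rules configured in a given Oracle session.

```python
def has_digit_suffix(value: str) -> bool:
    """Mimic substr(column, -2) BETWEEN '_0' AND '_9' with a plain string comparison."""
    tail = value[-2:]  # last two characters (the whole value if it is shorter)
    return "_0" <= tail <= "_9"


# Hypothetical sample values.
for v in ["item_1", "item_2_x", "x_", "_10"]:
    print(v, has_digit_suffix(v))
```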

Related

Oracle regex and replace

I have a varchar field in the database that contains text. I need to replace every occurrence of any string of 2 letters + 8 digits with a link, so that VA12345678 becomes /cs/page.asp?id=VA12345678
I have a regex that replaces the string but how can I replace it with a string where part of it is the string itself?
SELECT REGEXP_REPLACE ('test PI20099742', '[A-Z]{2}[0-9]{8}$', 'link to replace with')
FROM dual;
I can have more than one of these strings in one varchar field and ideally I would like to have them replaced in one statement instead of a loop.
As mathguy had said, you can use backreferences for your use case. Try a query like this one.
SELECT REGEXP_REPLACE ('test PI20099742', '([A-Z]{2}[0-9]{8})', '/cs/page.asp?id=\1')
FROM DUAL;
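To see what the backreference does, here is a small sketch in Python. Its `re` module is used only as a stand-in for Oracle's engine (an assumption), but `\1` plays the same role in the replacement string: it re-inserts whatever the first parenthesised group matched. The names `LINK_PREFIX` and `add_links` are made up for this example.

```python
import re

LINK_PREFIX = "/cs/page.asp?id="
# Group 1 captures two capital letters followed by eight digits.
CODE = re.compile(r"([A-Z]{2}[0-9]{8})")


def add_links(text: str) -> str:
    # Every match is replaced by the prefix plus the matched text itself.
    return CODE.sub(LINK_PREFIX + r"\1", text)


print(add_links("test PI20099742"))
print(add_links("test VA12345678 and PI20099742"))
```

All occurrences are replaced in one call, so no loop over the matches is needed.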
For such cases, you may want to keep the "text to add" somewhere at the top of the query, so that if you ever need to change it, you don't have to hunt for it.
You can do that with a with clause, as shown below. I also put some input data for testing in the with clause, but you should remove that and reference your actual table in your query.
I used the [:alpha:] character class, to match all letters - upper or lower case, accented or not, etc. [A-Z] will work until it doesn't.
with
text_to_add (link) as (
select '/cs/page.asp?id=' from dual
)
, sample_strings (str) as (
select 'test VA12398403 and PI83048203 to PT3904' from dual
)
select regexp_replace(str, '([[:alpha:]]{2}\d{8})', link || '\1')
as str_with_links
from sample_strings cross join text_to_add
;
STR_WITH_LINKS
------------------------------------------------------------------------
test /cs/page.asp?id=VA12398403 and /cs/page.asp?id=PI83048203 to PT3904

Oracle Regex remove duplicates

I have a requirement to remove duplicate values from a comma separated string.
Input String: a,a,a,b,c,a,b
Expected output: a,b,c
What I have tried:
with ct(str) as(
select 'a,a,a,b,c,a,b' from dual
)
select REGEXP_REPLACE(str,'([^,]*)(,\1)+($|,)','\1\3') col from ct
Output: a,b,c,a,b
The above query can remove repetitive characters which are consecutive.
I know that the above requirement can be solved by creating a table out of the comma separated values and do a listagg on the distinct values.
Is it possible to achieve the above requirement using a single regex statement?.
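This should give you the required result. It is not a single regex: it splits the string with APEX_STRING.SPLIT (which requires Oracle APEX to be installed) and aggregates the distinct values. Ordering by the value itself makes the output order deterministic:
with tokens as (
select distinct column_value as str
from table(apex_string.split('a,a,a,b,c,a,b', ','))
)
select listagg(str, ',') within group (order by str) as col from tokens;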
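The difference between collapsing adjacent repeats and removing all duplicates can be illustrated with a short Python sketch. Python's `re` is only a stand-in for Oracle's engine here (an assumption), and both function names are hypothetical.

```python
import re


def collapse_adjacent(csv: str) -> str:
    # Same pattern as the REGEXP_REPLACE attempt: only back-to-back repeats collapse.
    return re.sub(r"([^,]*)(,\1)+($|,)", r"\1\3", csv)


def dedupe(csv: str) -> str:
    # Split, keep the first occurrence of each token in input order, re-join.
    return ",".join(dict.fromkeys(csv.split(",")))


print(collapse_adjacent("a,a,a,b,c,a,b"))
print(dedupe("a,a,a,b,c,a,b"))
```

The backreference can only compare a token with its immediate neighbours, which is why the split-and-aggregate route is the more natural fit for this requirement.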

Filter of records in oracle

I have a set of records in a table like
xyz_t
abc_y
pqr_12-11-2013
psq_1
App_tq2
xyzq_12-10-2014
lpqs_14-09-2012
llyt_23-09-2011
bytx_2
prdtc
I want output
pqr_12-11-2013
xyzq_12-10-2014
lpqs_14-09-2012
llyt_23-09-2011
I mean only those records which have a date as a suffix.
Thanks in advance.
select s from t
where regexp_like(s, '_[[:digit:]]{1,2}-[[:digit:]]{1,2}-[[:digit:]]{4}$');
[:digit:] - any digit (you can also use \d)
{4} - four times
{1,2} - one or two times
$ - end of the string (by default the whole value is treated as a single line; the 'm' match parameter makes $ match at the end of each line instead)
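The pieces listed above can be tried against the sample records from the question. This is a sketch in Python, whose `re` module stands in for Oracle's engine (an assumption): `[0-9]` replaces `[[:digit:]]`, and `\Z` is used as the end-of-string anchor.

```python
import re

# "_" + 1-2 digits + "-" + 1-2 digits + "-" + 4 digits, then end of string.
DATE_SUFFIX = re.compile(r"_[0-9]{1,2}-[0-9]{1,2}-[0-9]{4}\Z")

records = [
    "xyz_t", "abc_y", "pqr_12-11-2013", "psq_1", "App_tq2",
    "xyzq_12-10-2014", "lpqs_14-09-2012", "llyt_23-09-2011", "bytx_2", "prdtc",
]

with_date = [r for r in records if DATE_SUFFIX.search(r)]
print(with_date)
```

Note that the pattern checks only the shape of the suffix: a value ending in _99-99-0000 would also match, so it does not validate that the date exists.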
Use a regular expression:
select your_column_name
from your_table
where REGEXP_LIKE(your_column_name, '.*\d{2}-\d{2}-\d{4}$')

Extracting strings using Oracle REGEXP_SUBSTR

I am using REGEXP_SUBSTR in Oracle 11g and I am having difficulty trying to extract the following strings.
My query is:
SELECT regexp_substr('CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,', '[^CN=]*\,', 1, rownum) line
FROM dual
CONNECT BY LEVEL <= length('CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,') -
length(REPLACE('CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,', ',', ''))
With this query, I am having trouble matching the exact string 'CN='. I need the output to appear as follows:
CN=aTIGERAdmin-Admin,
CN=D0902498,
CN=ea90045052,
CN=aTIGERCall-Admin,
And in this format, with the comma at the end.
The way I am doing it at the moment is chopping off the "CN=" but I actually require this part.
I think this will return the resultset you are looking for:
SELECT REGEXP_SUBSTR(d.s,'CN=.*?,', 1, ROWNUM) line
FROM (SELECT 'CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,'
AS s FROM dual) d
CONNECT BY LEVEL <= LENGTH(d.s) - LENGTH(REPLACE(d.s,',',''))
The regular expression trick used here is to specify the ? modifier (following the .*) to make the match "non-greedy". The default match (without the ? modifier) is "greedy" in that it will match as much of the string as possible. In your case, you want the match to end at the first comma found. The intent here is to match literal string 'CN=' followed by any number of characters (zero, one or more) up to the first comma encountered.
This will work in Oracle 10g as well as 11g.
In 11g, the REGEXP_COUNT function can replace your "count of commas" calculation of the number of occurrences:
CONNECT BY LEVEL <= REGEXP_COUNT(d.s,'CN=.*?,')
(BTW... by using a subquery to return the literal string, the literal string only has to be specified once. That makes it much easier to change the string for testing, rather than having to change it in multiple places.)
Addendum:
I can confirm that the comma is included in the returned value. Sample output:
LINE
-----------------------
CN=aTIGERAdmin-Admin,
CN=D0902498,
CN=ea90045052,
CN=aTIGERCall-Admin,
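The greedy versus non-greedy behaviour described in this answer can be reproduced with a short Python sketch. Python's `re` module is only a stand-in for Oracle's engine (an assumption), but the `?` modifier after `.*` has the same meaning in both.

```python
import re

s = "CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,"

# Non-greedy: each match stops at the first comma after "CN=".
lazy = re.findall(r"CN=.*?,", s)
# Greedy: the single match runs on to the last comma in the string.
greedy = re.findall(r"CN=.*,", s)

for line in lazy:
    print(line)
print(len(greedy))
```

The non-greedy pattern yields four separate entries, each with its trailing comma, while the greedy one swallows the whole string in a single match.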
I'm not an LDAP master, but will the regular expression CN=[^,]+ (C, then N, then equals sign, greedily followed by one or more non-comma characters) work for you?
Also, do you know about REGEXP_COUNT, new in 11g?
SELECT REGEXP_SUBSTR('CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,', 'CN=[^,]+', 1, ROWNUM) line
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT('CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,', 'CN=[^,]+');
LINE
----------------------------------------------------------------------------------------------------
CN=aTIGERAdmin-Admin
CN=D0902498
CN=ea90045052
CN=aTIGERCall-Admin
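A negated character class stops at the comma by construction, so no non-greedy modifier is needed. The sketch below uses Python's `re` module as a stand-in for Oracle's engine (an assumption) and shows that appending a literal comma to the pattern restores the trailing comma the question asked for.

```python
import re

s = "CN=aTIGERAdmin-Admin, CN=D0902498, CN=ea90045052, CN=aTIGERCall-Admin,"

# [^,]+ consumes everything up to, but not including, the next comma.
without_comma = re.findall(r"CN=[^,]+", s)
# Adding a literal comma keeps the delimiter in each match.
with_comma = re.findall(r"CN=[^,]+,", s)

print(len(without_comma))  # plays the role of REGEXP_COUNT
for line in with_comma:
    print(line)
```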

In Oracle, how do I select rows which contain a character within a certain numeric range?

I have a table in Oracle with a VARCHAR column called DESCRIPTION. Some of the rows contain non-printable characters such as the character with numeric value 150 (which is not in Latin-1 and is "Start of Protected Area" in Unicode).
I want to select all the rows whose DESCRIPTION columns contain a character whose numeric value is between 128 and 160. Is there a way to do this without a long list of LIKE clauses OR'ed together? I suppose it can be done with regular expressions, but I haven't found a way to do it.
I had to do something very like this recently and used some SQL like this:
with codes as (select rownum code from dual connect by level <= 160)
select distinct t.id, t.description
from mytable t, codes c
where t.description like '%' || chr(c.code) || '%'
and c.code >= 128;
Vincent's post helped me a lot with this problem! I wanted to find all rows that had any extended ASCII: 128-255, so I shortened the statement to this:
SELECT description
FROM your_table
WHERE regexp_like (description, '['||chr(128)||'-'||chr(255)||']');
Short way to grab a range.
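The range trick builds a bracket expression from two boundary characters. Here is a rough Python sketch of the same construction; it treats the numeric values as Unicode code points, which is an assumption, since what Oracle's chr() returns depends on the database character set. The helper name `range_class` and the sample strings are made up.

```python
import re


def range_class(lo: int, hi: int):
    # Builds "[<lo>-<hi>]", the counterpart of '['||chr(lo)||'-'||chr(hi)||']'.
    return re.compile("[" + re.escape(chr(lo)) + "-" + re.escape(chr(hi)) + "]")


IN_RANGE = range_class(128, 160)

# Hypothetical descriptions: U+0096 and U+00A0 fall inside 128-160, U+00E9 does not.
descriptions = ["plain text", "dash\u0096here", "caf\u00e9", "nbsp\u00a0end"]
flagged = [d for d in descriptions if IN_RANGE.search(d)]

print(len(flagged))
```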
You could use a regular expression. It may perform better than 30+ separate LIKE conditions, but it won't be much prettier:
SELECT *
FROM your_table
WHERE regexp_like(description, '['||chr(128)||chr(129)||...||chr(160)||']')