Oracle SQL regexp date formatting - regex

I'm new to Oracle and trying to select a badly formatted date in cleaned-up form.
For example,
my field is: 12.05.2010 dfsafs()F(Gf, 12:45
Can I select it as 12.05.2010 12:45 with regexp or something else?
Thanks

Use the below regex to match date and time formats.
[0-9]{2}\.[0-9]{2}\.[0-9]{4}|[0-9]{2}:[0-9]{2}
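In Oracle, the curly braces do not need to be escaped. Oracle regular expressions follow POSIX extended syntax, where {2} is a quantifier and \{ would match a literal brace, so the pattern above can be used as is.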

Something like this should work:
select regexp_substr(dat,'.*(\d{2}\.\d{2}\.\d{4}).*',1,1,'i',1) ||' '||
regexp_substr(dat,'.*(\d{2}:\d{2}).*',1,1,'i',1) datetime
from
(select '12.05.2010 dfsafs()F(Gf, 12:45' dat from dual);
Note that I extract the date and the time separately using regexp_substr and then concatenate both values.
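To check the two patterns outside the database, here is a minimal sketch in Python. It is not Oracle syntax; the function name clean_datetime is made up for illustration, and Python's re module is only standing in for regexp_substr.

```python
import re

def clean_datetime(raw):
    """Return 'DD.MM.YYYY HH:MI' pulled out of a noisy string, or None."""
    # Same two patterns as in the answers above: one for the date, one for the time.
    date = re.search(r"\d{2}\.\d{2}\.\d{4}", raw)
    time = re.search(r"\d{2}:\d{2}", raw)
    if date is None or time is None:
        return None
    # Concatenate both parts with a single space, like the || ' ' || in the query.
    return f"{date.group(0)} {time.group(0)}"

cleaned = clean_datetime("12.05.2010 dfsafs()F(Gf, 12:45")
print(cleaned)
```

One difference to keep in mind: re.search returns the first match, while the query above uses a greedy leading .* and so picks the last one. With a single date and a single time in the field, both give the same result.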

Related

Is there an alternative to regexp_replace() and regexp_extract() in sas?

I have
select
regexp_replace(city_name,',','') as city_name
, regexp_extract(regexp_replace(postal_cd,',','') ,'^(.*?)(?:-)(.*)$',1) as zip5
This works in Hue, but I want to get the same output in SAS. What can I use in place of regexp_replace and regexp_extract in SAS?
I tried using replace, but that does not work in SAS.
Use the SAS function PRXCHANGE to replace, and PRXPOSN (after a match with PRXMATCH) or the CALL PRXSUBSTR routine to extract.
Replacing a matching character with nothing can also be done with COMPRESS.
Extracting words from a delimited string can also be done with SCAN.
The non-regular-expression ways (COMPRESS, SCAN) are generally faster because they are very specific in their implementation.
Example:
Use COMPRESS and SCAN
data have;
city = 'Spring,field';
zip = '1,2,3,4,5-6,7,8,9';
run;
proc sql;
create table want as
select
compress(city,',') as city
, scan(compress(zip,','),1,'-') as zip5
from
have
;
quit;
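As a quick sanity check that the regex route and the plain-string route give the same result on the sample values, here is a small sketch in Python. It is not SAS code; str.replace stands in for COMPRESS and split for SCAN, and the variable names are made up for illustration.

```python
import re

city = "Spring,field"
zip_code = "1,2,3,4,5-6,7,8,9"

# Regex route, mirroring regexp_replace / regexp_extract from the question.
city_regex = re.sub(r",", "", city)
zip5_regex = re.match(r"^(.*?)(?:-)(.*)$", re.sub(r",", "", zip_code)).group(1)

# Plain string route, mirroring COMPRESS (drop commas) and SCAN (first '-' field).
city_plain = city.replace(",", "")
zip5_plain = zip_code.replace(",", "").split("-")[0]

print(city_regex, zip5_regex)
print(city_plain, zip5_plain)
```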

BigQuery regexp replace character between quotes

I'm trying to use the BigQuery function regexp_replace for the following scenario:
Given a string field with comma as a delimiter, I need to only remove the commas within double quotes.
I found a regex that works on the website below, but it seems that the BigQuery function doesn't support lookahead groups. Could you please help me find an equivalent expression that is supported by the BigQuery function regexp_replace?
https://regex101.com/r/nxkqtb/3
BigQuery example code (not supported):
WITH tbl AS (
SELECT 'LINE_NR="1",TXT_FIELD="Some text",CID="0"' as text
UNION ALL
SELECT 'LINE_NR="2",TXT_FIELD=",,Some text",CID="0"' as text
UNION ALL
SELECT 'LINE_NR="3",TXT_FIELD="Some text ,",CID="0"' as text
UNION ALL
SELECT 'LINE_NR="4",TXT_FIELD=",Some ,text,",CID="0"' as text
)
SELECT
REGEXP_REPLACE(text, r'(?m),(?=[^"]*"(?:[^"\r\n]*"[^"]*")*[^"\r\n]*$)', "")
FROM tbl;
Thank you
Consider the approach below (assuming you know the keys within the text field in advance):
select text,
( select string_agg(replace(kv, ',', ''), ',' order by offset)
from unnest(regexp_extract_all(text, r'((?:LINE_NR|TXT_FIELD|CID)=".*?")')) kv with offset
) corrected_text
from tbl;
Applied to the sample data in your question, this removes the commas inside the quoted values and keeps the commas that delimit the key/value pairs.
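The same idea (extract each known key="value" pair, strip the commas inside it, join the pairs back with commas) can be tried outside BigQuery. This is only a sketch in Python under the same assumption that the keys are known in advance; remove_quoted_commas is a made-up helper name, and re.findall stands in for regexp_extract_all.

```python
import re

# Same pattern as in the query above: one capture group per key="value" pair.
KEYS = r'((?:LINE_NR|TXT_FIELD|CID)=".*?")'

def remove_quoted_commas(text):
    # Each extracted pair contains only commas that sit inside the quotes,
    # so they can all be dropped before the pairs are joined again.
    pairs = re.findall(KEYS, text)
    return ",".join(pair.replace(",", "") for pair in pairs)

rows = [
    'LINE_NR="1",TXT_FIELD="Some text",CID="0"',
    'LINE_NR="2",TXT_FIELD=",,Some text",CID="0"',
    'LINE_NR="3",TXT_FIELD="Some text ,",CID="0"',
    'LINE_NR="4",TXT_FIELD=",Some ,text,",CID="0"',
]
corrected = [remove_quoted_commas(row) for row in rows]
for line in corrected:
    print(line)
```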

Oracle regex and replace

I have a varchar field in the database that contains text. I need to replace every occurrence of any string of 2 letters + 8 digits with a link, so that VA12345678 becomes /cs/page.asp?id=VA12345678
I have a regex that replaces the string, but how can I replace it with a string where part of it is the string itself?
SELECT REGEXP_REPLACE ('test PI20099742', '[A-Z]{2}[0-9]{8}$', 'link to replace with')
FROM dual;
I can have more than one of these strings in one varchar field and ideally I would like to have them replaced in one statement instead of a loop.
As mathguy said, you can use backreferences for your use case. Try a query like this one.
SELECT REGEXP_REPLACE ('test PI20099742', '([A-Z]{2}[0-9]{8})', '/cs/page.asp?id=\1')
FROM DUAL;
For such cases, you may want to keep the "text to add" somewhere at the top of the query, so that if you ever need to change it, you don't have to hunt for it.
You can do that with a with clause, as shown below. I also put some input data for testing in the with clause, but you should remove that and reference your actual table in your query.
I used the [:alpha:] character class, to match all letters - upper or lower case, accented or not, etc. [A-Z] will work until it doesn't.
with
text_to_add (link) as (
select '/cs/page.asp?id=' from dual
)
, sample_strings (str) as (
select 'test VA12398403 and PI83048203 to PT3904' from dual
)
select regexp_replace(str, '([[:alpha:]]{2}\d{8})', link || '\1')
as str_with_links
from sample_strings cross join text_to_add
;
STR_WITH_LINKS
------------------------------------------------------------------------
test /cs/page.asp?id=VA12398403 and /cs/page.asp?id=PI83048203 to PT3904
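For anyone who wants to test the backreference idea outside Oracle, here is a rough Python equivalent. It is a sketch, not the Oracle behavior itself: Python's re has no [[:alpha:]], so [^\W\d_] is used as an assumed stand-in for "any letter", and add_links and LINK_PREFIX are made-up names.

```python
import re

# Kept at the top so the link text is easy to change, as suggested above.
LINK_PREFIX = "/cs/page.asp?id="

# Two letters followed by eight digits, captured as group 1.
# [^\W\d_] matches a word character that is neither a digit nor an underscore.
CODE = re.compile(r"([^\W\d_]{2}\d{8})")

def add_links(text):
    # Every match is replaced by the prefix plus the matched code itself.
    return CODE.sub(lambda m: LINK_PREFIX + m.group(1), text)

linked = add_links("test VA12398403 and PI83048203 to PT3904")
print(linked)
```

As in the query, all occurrences are replaced in one call, and PT3904 is left alone because it has fewer than eight digits.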

Big Query Regex for Date ETL

I have data with date info imported into BigQuery in a format like 2/13/2016, 3/4/2012, etc.
I want to convert it into a date format like 02-13-2016 and 03-04-2012.
I want to use a query to create a new column and use regex for this.
I know the regex to match the first part (2) of 2/4/2012 will be something like
^(\d{1})(/|-)
The regex to match the 2nd part with / would be
(/)(\d{1})(/)
I am wondering how to use these 2 regexes along with REGEXP_EXTRACT and REGEXP_REPLACE to create a new column with these dates in the correct format.
It might be easiest just to convert to a DATE type column. For example:
#standardSQL
SELECT
PARSE_DATE('%m/%d/%Y', date_string) AS date
FROM (
SELECT '2/13/2016' AS date_string UNION ALL
SELECT '3/4/2012' AS date_string
);
Another option, if you want to keep the dates as strings, is to use REPLACE. Note that this only swaps the separator and does not zero-pad the month or day:
#standardSQL
SELECT
REPLACE(date_string, '/', '-') AS date
FROM (
SELECT '2/13/2016' AS date_string UNION ALL
SELECT '3/4/2012' AS date_string
);
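The parse-then-format route is what produces the zero-padded 02-13-2016 style from the question. Here is a minimal sketch of that logic in Python rather than BigQuery; reformat_date is a made-up name, and strptime/strftime stand in for parsing the string as a date and printing it back in the target layout.

```python
from datetime import datetime

def reformat_date(date_string):
    # %m and %d accept one- or two-digit input when parsing...
    parsed = datetime.strptime(date_string, "%m/%d/%Y")
    # ...and always produce two digits when formatting.
    return parsed.strftime("%m-%d-%Y")

formatted = [reformat_date(s) for s in ("2/13/2016", "3/4/2012")]
print(formatted)
```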

regular expression clob field

I have a question related to a regular expression in Oracle 10.
Assuming I have a value like 123456;12345;454545 stored in a clob field, is there a way via a regular expression to filter only on the second pattern (12345), knowing that the value can be more than 5 digits but always occurs after the first semicolon and always has a trailing semicolon at the end?
Thanks a lot for your support in that matter,
Have a nice day,
This query should give you your desired output.
SELECT REGEXP_REPLACE(REGEXP_SUBSTR('123456;12345;454545;45634',';[0-9]+;'),';')
FROM dual;
You can extract any element using this query; just change 2 to another position, which should be less than or equal to the number of elements in the string.
with tab(value) as
(select '123456;12345;454545' from dual)
select regexp_substr(value, '[^;]+', 1, 2) from tab;
It can be done easily with one call:
select regexp_replace('123456;12345;454545','^[0-9]+;([0-9]+);.*$','\1')
from dual;
The regular expression can perhaps be made better-looking or adjusted to your business logic, but the idea, I think, is clear.
select regexp_replace(regexp_substr(Col_name,';\d+;'),';','') from your_table;
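To compare the approaches above on the sample value, here is a small Python sketch. It is not Oracle code: re.findall with an index stands in for regexp_substr with an occurrence argument, re.sub for regexp_replace, and the variable names are made up.

```python
import re

value = "123456;12345;454545"

# "Nth run of non-semicolons", like regexp_substr(value, '[^;]+', 1, 2).
second_by_split = re.findall(r"[^;]+", value)[1]

# "Capture the second number and replace the whole string with it".
second_by_capture = re.sub(r"^[0-9]+;([0-9]+);.*$", r"\1", value)

# Caveat: [^;]+ cannot match an empty field, so with an empty second field
# the "second" hit is really the third element.
with_gap = re.findall(r"[^;]+", "123456;;454545")[1]

print(second_by_split, second_by_capture, with_gap)
```

If empty elements can occur in your data, the counting approach with [^;]+ may silently return the wrong element, so it is worth checking that case before relying on it.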