Regular expression on Dates in Oracle - regex

I have dates stored in every possible permutation of the format: MM/DD/YYYY, M/D/YYYY, MM/D/YYYY, M/DD/YYYY.
Now I need to write a regular expression in Oracle DB to fetch the values in these different date formats from one column, as is.

Try this one:
with t(date_col) as (
select '01/01/2014' from dual
union all
select '1/2/2014' from dual
union all
select '01/3/2014' from dual
union all
select '1/04/2014' from dual
union all
select '11/1/14' from dual)
select date_col,
case
when regexp_instr(date_col, '^\d/\d/\d{4}$') = 1 then
'm/d/yyyy'
when regexp_instr(date_col, '^\d{2}/\d/\d{4}$') = 1 then
'mm/d/yyyy'
when regexp_instr(date_col, '^\d/\d{2}/\d{4}$') = 1 then
'm/dd/yyyy'
when regexp_instr(date_col, '^\d{2}/\d{2}/\d{4}$') = 1 then
'mm/dd/yyyy'
else
'Unknown format'
end date_format
from t;
DATE_COL   DATE_FORMAT
---------- --------------
01/01/2014 mm/dd/yyyy
1/2/2014   m/d/yyyy
01/3/2014  mm/d/yyyy
1/04/2014  m/dd/yyyy
11/1/14    Unknown format

I am not sure what your goal is, but since the month always comes first, followed by the day, you can use the following expression to get a date regardless of the input format (without the FX modifier, to_date accepts months and days with or without a leading zero):
select to_date( column, 'mm/dd/yyyy') from ...

To find the records that are not already in the zero-padded MM/DD/YYYY form, select those for which the following is true:
where [column_value] != to_char(to_date([column_value],'MM/DD/YYYY'),'MM/DD/YYYY')
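As a rough illustration of the check above, the following sketch runs it against a few made-up values. The CTE t and its column date_col are hypothetical sample data, not the asker's table. It relies on to_date accepting months and days without leading zeros for this mask, so the rows returned should be the ones whose text differs from the zero-padded MM/DD/YYYY form.

```sql
-- Hypothetical sample data; replace t/date_col with the real table and column.
with t(date_col) as (
  select '01/01/2014' from dual union all
  select '1/2/2014'   from dual union all
  select '01/3/2014'  from dual
)
select date_col,
       -- the same value rewritten in zero-padded MM/DD/YYYY form
       to_char(to_date(date_col, 'MM/DD/YYYY'), 'MM/DD/YYYY') as normalized
from t
-- keep only the values that are not already zero-padded
where date_col != to_char(to_date(date_col, 'MM/DD/YYYY'), 'MM/DD/YYYY');
```

Note that a value that is not a valid date at all makes to_date raise an error rather than being filtered out, so this check assumes the column only contains parseable dates.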

Convert two datetimes to timestamps and keep the larger one

I work with Oracle Database 19c and I would like to convert two datetimes (DD/MM/YY HH24:MI:SS) into timestamps and keep only the larger one.
I tried several scripts for the conversion, like this one:
SELECT
CODE_ACT_PROD,
LIB,
CAST (DAT_CRE AS TIMESTAMP) AS DATE_CRE_TIMESTAMP,
CAST (DAT_MOD AS TIMESTAMP) AS DATE_MOD_TIMESTAMP
FROM ACTI
WHERE CODE_ACT_PROD
IN (
SELECT CODE_ACT_PROD
FROM ART_COM
WHERE ETAT = 0
)
but the result is not what I want: the datetimes are not converted, and I don't know how to keep the larger one.
Use GREATEST:
SELECT CODE_ACT_PROD,
LIB,
CAST (DAT_CRE AS TIMESTAMP) AS DATE_CRE_TIMESTAMP,
CAST (DAT_MOD AS TIMESTAMP) AS DATE_MOD_TIMESTAMP,
CAST(GREATEST(DAT_CRE, DAT_MOD) AS TIMESTAMP) AS greatest_timestamp
FROM ACTI
WHERE CODE_ACT_PROD IN (
SELECT CODE_ACT_PROD
FROM ART_COM
WHERE ETAT = 0
)
Which, for the sample data:
CREATE TABLE acti (
code_act_prod INT,
lib INT,
dat_cre DATE,
dat_mod DATE
);
CREATE TABLE art_com (
code_act_prod INT,
etat INT
);
INSERT INTO acti (code_act_prod, lib, dat_cre, dat_mod)
SELECT 1, 2, SYSDATE - 1, SYSDATE FROM DUAL UNION ALL
SELECT 3, 4, TRUNC(SYSDATE), SYSDATE - 2 FROM DUAL;
INSERT INTO art_com (code_act_prod, etat)
SELECT 1, 0 FROM DUAL UNION ALL
SELECT 3, 0 FROM DUAL;
Outputs:
CODE_ACT_PROD LIB DATE_CRE_TIMESTAMP         DATE_MOD_TIMESTAMP         GREATEST_TIMESTAMP
------------- --- -------------------------- -------------------------- --------------------------
            1   2 2021-09-01 08:38:21.000000 2021-09-02 08:38:21.000000 2021-09-02 08:38:21.000000
            3   4 2021-09-02 00:00:00.000000 2021-08-31 08:38:21.000000 2021-09-02 00:00:00.000000
Oracle does not have a datetime data type. It has date, which holds a day and a time to the second, and it has timestamp, which also holds a day and a time to the second, with optional fractional seconds and time zone. Converting a date to a timestamp just adds fractional seconds that are always 0. Neither the date nor the timestamp data type has a format; a varchar2 would have a format.
If the columns are date data types, your code is syntactically valid. I'm not sure how the results you are getting differ from the results you want, since you're not showing us your sample data or expected results and you're not telling us what you mean when you say that something isn't converted.
Assuming the two columns are actually of type date, your code appears to be fine and you just want to use the greatest function to get the latest date:
with cte as (
select sysdate dat_cr, sysdate + 1 dat_mod
from dual
)
select cast(dat_cr as timestamp) ts_cr,
cast(dat_mod as timestamp) ts_mod,
cast( greatest( dat_cr, dat_mod ) as timestamp ) ts_greatest
from cte;
TS_CR                        TS_MOD                       TS_GREATEST
02-SEP-21 08.25.38.000000 AM 03-SEP-21 08.25.38.000000 AM 03-SEP-21 08.25.38.000000 AM
Note that the conversion of the three timestamps to strings to be displayed to humans is controlled by your session's nls_timestamp_format.
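To make the displayed text independent of the session settings, one option is to format the value explicitly with to_char; another is to set nls_timestamp_format for the current session. A minimal sketch, where the format masks are only examples:

```sql
-- Option 1: format explicitly, regardless of the session's NLS settings
select to_char(cast(sysdate as timestamp), 'YYYY-MM-DD HH24:MI:SS.FF6') as ts_text
from dual;

-- Option 2: change the default display format for the current session only
alter session set nls_timestamp_format = 'YYYY-MM-DD HH24:MI:SS.FF6';
select cast(sysdate as timestamp) as ts from dual;
```

Either way, only the text shown to the user changes; the stored date and timestamp values are unaffected.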
If you want to handle null dates by returning whichever date is not null, you can use coalesce and a case expression:
with cte as (
select sysdate dat_cr, sysdate + 1 dat_mod
from dual
union all
select null, sysdate from dual
union all
select sysdate, null from dual
)
select cast(dat_cr as timestamp) ts_cr,
cast(dat_mod as timestamp) ts_mod,
cast( case when dat_cr is null or dat_mod is null
then coalesce( dat_mod, dat_cr )
else greatest( dat_cr, dat_mod )
end
as timestamp ) ts_greatest
from cte;

Stored procedure for data excluding is not working as expected in oracle

I have written a query in which I want to exclude the data whose value contains _900. Below is the query:
SELECT
TO_CHAR(TRIM(RJ_SPAN_ID)) AS SPAN_ID,TO_CHAR(RJ_MAINTENANCE_ZONE_CODE) AS MAINT_ZONE_CODE,RJ_INTRACITY_LINK_ID
FROM NE.MV_SPAN@DB_LINK_NE_VIEWER
--FROM APP_FTTX.SPAN_2@SAT
WHERE
LENGTH(TRIM(RJ_SPAN_ID)) = 21
--AND REGEXP_LIKE(TRIM(RJ_SPAN_ID), 'SP(N|Q|R|S)*.+_(BU|MP)$','i')
--AND (NOT REGEXP_LIKE(TRIM(RJ_SPAN_ID), '(_U)$|(/)$','i')
--AND REGEXP_LIKE(RJ_INTRACITY_LINK_ID, '(%*_9%)','i')--)
AND INVENTORY_STATUS_CODE = 'IPL'
AND RJ_MAINTENANCE_ZONE_CODE = 'INORBNPN01'
and RJ_SPAN_ID = 'ORKPRKORKONASPR001_BU';
I tried all the commented-out REGEXP conditions, but none of them works.
Why not LIKE?
with test (col) as
(select '900' from dual union all --> doesn't contain _900
select '123_900AB' from dual union all --> contains _900
select '_900' from dual union all --> contains _900
select 'ab900cd' from dual --> doesn't contain _900
)
select *
from test
where col not like '%\_900%' escape '\';
COL
---------
900
ab900cd
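If a regular expression is still preferred, note that % is a LIKE wildcard with no special meaning in a regex, and _ is an ordinary character there, so it needs no escaping. A minimal sketch on the same made-up test data, which should return the same two rows as the LIKE version:

```sql
with test (col) as
  (select '900'       from dual union all
   select '123_900AB' from dual union all
   select '_900'      from dual union all
   select 'ab900cd'   from dual
  )
select *
from test
-- an unanchored pattern matches _900 anywhere in the value
where not regexp_like(col, '_900');
```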

How to find missing dates in BigQuery table using sql

How can I get a list of missing dates from a BigQuery table? For example, a table (test_table) is populated every day by some job, but on a few days the job fails and data isn't written to the table.
Use case:
We have a table (test_table) which is populated every day by some job (a scheduled query or a cloud function). Sometimes the job fails, and data isn't available for those particular dates in the table.
How can I find those dates without scrolling through thousands of rows?
The query below returns a list of dates and ad_ids for which data wasn't uploaded (null).
Note: I used MAX(Date) because I knew the missing dates fell between my boundary dates. To be safe, you can also specify starting_date and ending_date explicitly, in case data hasn't been populated at all in the last few days.
WITH Date_Range AS
-- anchor for date range
(
SELECT MIN(DATE) as starting_date,
MAX(DATE) AS ending_date
FROM `project_name.dataset_name.test_table`
),
day_series AS
-- anchor to get all the dates within the range
(
SELECT *
FROM Date_Range
,UNNEST(GENERATE_TIMESTAMP_ARRAY(starting_date, ending_date, INTERVAL 1 DAY)) AS days
-- other options depending on your date type ( mine was timestamp)
-- GENERATE_DATETIME_ARRAY or GENERATE_DATE_ARRAY
)
SELECT
day_series.days,
original_table.ad_id
FROM day_series
-- do a left join on the source table
LEFT JOIN `project_name.dataset_name.test_table` AS original_table ON (original_table.date)= day_series.days
-- I only want the records where data is not available or in other words empty/missing
WHERE original_table.ad_id IS NULL
GROUP BY 1,2
ORDER BY 1
As an alternative solution, you can try the following query to get the desired output:
with t as (select 1 as id, cast ('2020-12-25' as timestamp) Days union all
select 1 as id, cast ('2020-12-26' as timestamp) Days union all
select 1 as id, cast ('2020-12-27' as timestamp) Days union all
select 1 as id, cast ('2020-12-31' as timestamp) Days union all
select 1 as id, cast ('2021-01-01' as timestamp) Days union all
select 1 as id, cast ('2021-01-04' as timestamp) Days)
SELECT *
FROM (
select TIMESTAMP_ADD(Days, INTERVAL 1 DAY) AS Days, TIMESTAMP_SUB(next_days, INTERVAL 1 DAY) AS next_days from (
select t.Days,
(case when lag(Days) over (partition by id order by Days) = Days
then NULL
when lag(Days) over (partition by id order by Days) is null
then Null
else Lead(Days) over (partition by id order by Days)
end) as next_days
from t) where next_days is not null
and Days <> TIMESTAMP_SUB(next_days, INTERVAL 1 DAY)),
UNNEST(GENERATE_TIMESTAMP_ARRAY(Days, next_days, INTERVAL 1 DAY)) AS days
I used the code above but had to restructure it for BigQuery:
-- anchor for date range - this will select dates from the source table (i.e. the table your query runs off of)
WITH day_series AS(
SELECT *
FROM (
SELECT MIN(DATE) as starting_date,
MAX(DATE) AS ending_date
FROM --enter source table here--
---OPTIONAL: filter for a specific date range
WHERE DATE BETWEEN 'YYYY-MM-DD' AND 'YYYY-MM-DD'
),UNNEST(GENERATE_DATE_ARRAY(starting_date, ending_date, INTERVAL 1 DAY)) as days
-- other options depending on your date type ( mine was timestamp)
-- GENERATE_DATETIME_ARRAY or GENERATE_DATE_ARRAY
)
SELECT
day_series.days,
output_table.date
FROM day_series
-- do a left join on the output table (i.e. the table you are searching the missing dates for)
LEFT JOIN `project_name.dataset_name.test_table` AS output_table
ON (output_table.date)= day_series.days
-- I only want the records where data is not available or in other words empty/missing
WHERE output_table.date IS NULL
GROUP BY 1,2
ORDER BY 1

RegEx in BigQuery

I need to split the following field: LP1234354_CD12346
into 2 separate columns with the following values: 1234354 and 12346.
I tried regex and right/left, but without success. Thank you in advance!
Dummy data:
SELECT 'LP1234354_CD12346' AS word UNION ALL
SELECT 'LP1234456_CD12345'
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 AS id, 'LP1234354_CD12346' AS word UNION ALL
SELECT 2, 'LP1234456_CD12345'
)
SELECT id,
REGEXP_EXTRACT_ALL(word, r'(\d+)')[SAFE_OFFSET(0)] AS val1,
REGEXP_EXTRACT_ALL(word, r'(\d+)')[SAFE_OFFSET(1)] AS val2
FROM `project.dataset.table`
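If the prefixes are fixed, a variant that anchors on them may be easier to read than indexing into the array of all digit runs. This is a sketch that assumes every value has the shape LP<digits>_CD<digits>. With a single capturing group, REGEXP_EXTRACT returns the text of that group, or NULL when there is no match:

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 AS id, 'LP1234354_CD12346' AS word UNION ALL
  SELECT 2, 'LP1234456_CD12345'
)
SELECT id,
  -- digits between the leading LP and the underscore
  REGEXP_EXTRACT(word, r'^LP(\d+)_') AS val1,
  -- digits after _CD up to the end of the string
  REGEXP_EXTRACT(word, r'_CD(\d+)$') AS val2
FROM `project.dataset.table`
```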

How do I extract a pattern from a table in Oracle 11g?

I want to extract text from a column using regular expressions in Oracle 11g. I have 2 queries that do the job but I'm looking for a (cleaner/nicer) way to do it. Maybe combining the queries into one or a new equivalent query. Here they are:
Query 1: identify rows that match a pattern:
select column1 from table1 where regexp_like(column1, pattern);
Query 2: extract all matched text from a matching row.
select regexp_substr(matching_row, pattern, 1, level)
from dual
connect by level <= regexp_count(matching_row, pattern);
I use PL/SQL to glue these 2 queries together, but it's messy and clumsy. How can I combine them into 1 query? Thank you.
UPDATE: sample data for pattern 'BC':
row 1: ABCD
row 2: BCFBC
row 3: HIJ
row 4: GBC
Expected result is a table of 4 rows of 'BC'.
You can also do it in one query, functions/procedures/packages not required:
WITH t1 AS (
SELECT 'ABCD' c1 FROM dual
UNION
SELECT 'BCFBC' FROM dual
UNION
SELECT 'HIJ' FROM dual
UNION
SELECT 'GBC' FROM dual
)
SELECT c1, regexp_substr(c1, 'BC', 1, d.l, 'i') thePattern, d.l occurrence
FROM t1 CROSS JOIN (SELECT LEVEL l FROM dual CONNECT BY LEVEL < 200) d
WHERE regexp_like(c1,'BC','i')
AND d.l <= regexp_count(c1, 'BC', 1, 'i');
C1 THEPATTERN OCCURRENCE
----- -------------------- ----------
ABCD BC 1
BCFBC BC 1
BCFBC BC 2
GBC BC 1
SQL>
I've arbitrarily limited the number of occurrences to search for at 200, YMMV.
Actually, there is an elegant way to do this in one query, if you do not mind going the extra mile. Please note that this is just a sketch; I have not run it, so you'll probably have to correct a few typos in it.
create or replace package yo_package is
type word_t is record (word varchar2(4000));
type words_t is table of word_t;
-- the function must be declared in the spec to be callable from SQL
function table_function(in_cur in sys_refcursor, pattern in varchar2)
return words_t
pipelined parallel_enable (partition in_cur by any);
end;
/
create or replace package body yo_package is
function table_function(in_cur in sys_refcursor, pattern in varchar2)
return words_t
pipelined parallel_enable (partition in_cur by any)
is
next_val varchar2(4000);
word_rec word_t;
begin
loop
fetch in_cur into next_val;
exit when in_cur%notfound;
-- inner loop: pipe one row per match of pattern within next_val
for i in 1 .. nvl(regexp_count(next_val, pattern), 0) loop
word_rec.word := regexp_substr(next_val, pattern, 1, i);
pipe row (word_rec);
end loop;
end loop;
close in_cur;
return;
end table_function;
end;
/
select *
from table(
yo_package.table_function(
cursor(
--this is your first select
select column1 from table1 where regexp_like(column1, pattern)
)
)