Convert two datetimes to timestamps and keep the larger one - Oracle 19c

I work with Oracle Database 19c and I would like to convert two datetimes (DD/MM/YY HH24:MI:SS) into timestamps and keep only the larger one.
I have tried several scripts for the conversion, like this one:
SELECT
CODE_ACT_PROD,
LIB,
CAST (DAT_CRE AS TIMESTAMP) AS DATE_CRE_TIMESTAMP,
CAST (DAT_MOD AS TIMESTAMP) AS DATE_MOD_TIMESTAMP
FROM ACTI
WHERE CODE_ACT_PROD
IN (
SELECT CODE_ACT_PROD
FROM ART_COM
WHERE ETAT = 0
)
but the result is not what I want: the datetimes are not converted, and I don't know how to keep the larger one.

Use GREATEST:
SELECT CODE_ACT_PROD,
LIB,
CAST (DAT_CRE AS TIMESTAMP) AS DATE_CRE_TIMESTAMP,
CAST (DAT_MOD AS TIMESTAMP) AS DATE_MOD_TIMESTAMP,
CAST(GREATEST(DAT_CRE, DAT_MOD) AS TIMESTAMP) AS greatest_timestamp
FROM ACTI
WHERE CODE_ACT_PROD IN (
SELECT CODE_ACT_PROD
FROM ART_COM
WHERE ETAT = 0
)
Which, for the sample data:
CREATE TABLE acti (
code_act_prod INT,
lib INT,
dat_cre DATE,
dat_mod DATE
);
CREATE TABLE art_com (
code_act_prod INT,
etat INT
);
INSERT INTO acti (code_act_prod, lib, dat_cre, dat_mod)
SELECT 1, 2, SYSDATE - 1, SYSDATE FROM DUAL UNION ALL
SELECT 3, 4, TRUNC(SYSDATE), SYSDATE - 2 FROM DUAL;
INSERT INTO art_com (code_act_prod, etat)
SELECT 1, 0 FROM DUAL UNION ALL
SELECT 3, 0 FROM DUAL;
Outputs:

CODE_ACT_PROD LIB DATE_CRE_TIMESTAMP         DATE_MOD_TIMESTAMP         GREATEST_TIMESTAMP
------------- --- -------------------------- -------------------------- --------------------------
            1   2 2021-09-01 08:38:21.000000 2021-09-02 08:38:21.000000 2021-09-02 08:38:21.000000
            3   4 2021-09-02 00:00:00.000000 2021-08-31 08:38:21.000000 2021-09-02 00:00:00.000000

Oracle does not have a datetime data type. It has date, which stores a day and a time to the second. And it has timestamp, which also stores a day and a time to the second, with optional fractional seconds and time zone. Converting a date to a timestamp just adds fractional seconds that are always 0. Neither the date nor the timestamp data type has a format; a varchar2 would have a format. If the columns are date data types, your code is syntactically valid. I'm not sure how the results you are getting differ from the results you want, since you're not showing us your sample data or expected results, and you're not telling us what you mean when you say that something isn't converted.
Assuming the two columns are actually of type date, your code appears to be fine and you just want to use the greatest function to get the latest date.
with cte as (
select sysdate dat_cr, sysdate + 1 dat_mod
from dual
)
select cast(dat_cr as timestamp) ts_cr,
cast(dat_mod as timestamp) ts_mod,
cast( greatest( dat_cr, dat_mod ) as timestamp ) ts_greatest
from cte;
TS_CR                        TS_MOD                       TS_GREATEST
---------------------------- ---------------------------- ----------------------------
02-SEP-21 08.25.38.000000 AM 03-SEP-21 08.25.38.000000 AM 03-SEP-21 08.25.38.000000 AM
Note that the conversion of the three timestamps to strings to be displayed to humans is controlled by your session's nls_timestamp_format.
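For example (a sketch; the format mask chosen here is just an illustration):
alter session set nls_timestamp_format = 'YYYY-MM-DD HH24:MI:SS.FF6';

select cast(sysdate as timestamp) ts from dual;
-- e.g. 2021-09-02 08:25:38.000000 (the fractional seconds are always zero)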
Note that greatest returns NULL if any of its arguments is NULL. So if you want to handle null dates by returning whichever date is not null, you can use a coalesce and a case expression:
with cte as (
select sysdate dat_cr, sysdate + 1 dat_mod
from dual
union all
select null, sysdate from dual
union all
select sysdate, null from dual
)
select cast(dat_cr as timestamp) ts_cr,
cast(dat_mod as timestamp) ts_mod,
cast( case when dat_cr is null or dat_mod is null
then coalesce( dat_mod, dat_cr )
else greatest( dat_cr, dat_mod )
end
as timestamp ) ts_greatest
from cte;
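One equivalent, more compact formulation (an untested sketch, using the same cte as above) substitutes the other column for a NULL via nvl before comparing:
select cast(dat_cr as timestamp) ts_cr,
cast(dat_mod as timestamp) ts_mod,
cast( greatest( nvl(dat_cr, dat_mod), nvl(dat_mod, dat_cr) ) as timestamp ) ts_greatest
from cte;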


How to find missing dates in BigQuery table using sql

How do I get a list of missing dates from a BigQuery table? For example, a table (test_table) is populated every day by some job, but on a few days the job fails and data isn't written into the table.
Use case:
We have a table (test_table) which is populated every day by some job (a scheduled query or cloud function). Sometimes those jobs fail and data isn't available for those particular dates in my table.
How do I find those dates without scrolling through thousands of rows?
The query below will return a list of dates and ad_ids where data wasn't uploaded (null).
Note: I have used MAX(DATE) because I knew dates were missing between my boundary dates. To be safe, you can also specify starting_date and ending_date explicitly, in case data hasn't been populated in the last few days at all (see the variant after the query below).
WITH Date_Range AS
-- anchor for date range
(
SELECT MIN(DATE) as starting_date,
MAX(DATE) AS ending_date
FROM `project_name.dataset_name.test_table`
),
day_series AS
-- anchor to get all the dates within the range
(
SELECT *
FROM Date_Range
,UNNEST(GENERATE_TIMESTAMP_ARRAY(starting_date, ending_date, INTERVAL 1 DAY)) AS days
-- other options depending on your date type ( mine was timestamp)
-- GENERATE_DATETIME_ARRAY or GENERATE_DATE_ARRAY
)
SELECT
day_series.days,
original_table.ad_id
FROM day_series
-- do a left join on the source table
LEFT JOIN `project_name.dataset_name.test_table` AS original_table ON (original_table.date)= day_series.days
-- I only want the records where data is not available or in other words empty/missing
WHERE original_table.ad_id IS NULL
GROUP BY 1,2
ORDER BY 1
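If the most recent days could be missing entirely, a variant with explicit boundaries (the dates below are placeholders) would look like this:
WITH day_series AS (
SELECT days
FROM UNNEST(GENERATE_TIMESTAMP_ARRAY(TIMESTAMP '2021-01-01', TIMESTAMP '2021-03-31', INTERVAL 1 DAY)) AS days
)
SELECT
day_series.days,
original_table.ad_id
FROM day_series
LEFT JOIN `project_name.dataset_name.test_table` AS original_table ON (original_table.date) = day_series.days
WHERE original_table.ad_id IS NULL
GROUP BY 1,2
ORDER BY 1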
As an alternate solution, you can try the following query to get the desired output:
with t as (select 1 as id, cast ('2020-12-25' as timestamp) Days union all
select 1 as id, cast ('2020-12-26' as timestamp) Days union all
select 1 as id, cast ('2020-12-27' as timestamp) Days union all
select 1 as id, cast ('2020-12-31' as timestamp) Days union all
select 1 as id, cast ('2021-01-01' as timestamp) Days union all
select 1 as id, cast ('2021-01-04' as timestamp) Days)
SELECT *
FROM (
select TIMESTAMP_ADD(Days, INTERVAL 1 DAY) AS Days, TIMESTAMP_SUB(next_days, INTERVAL 1 DAY) AS next_days from (
select t.Days,
(case when lag(Days) over (partition by id order by Days) = Days
then NULL
when lag(Days) over (partition by id order by Days) is null
then Null
else Lead(Days) over (partition by id order by Days)
end) as next_days
from t) where next_days is not null
and Days <> TIMESTAMP_SUB(next_days, INTERVAL 1 DAY)),
UNNEST(GENERATE_TIMESTAMP_ARRAY(Days, next_days, INTERVAL 1 DAY)) AS days
The output lists the missing days: 2020-12-28 through 2020-12-30 and 2021-01-02 through 2021-01-03.
I used the code above but had to restructure it for BigQuery:
-- anchor for date range - this will select dates from the source table (i.e. the table your query runs off of)
WITH day_series AS(
SELECT *
FROM (
SELECT MIN(DATE) as starting_date,
MAX(DATE) AS ending_date
FROM --enter source table here--
---OPTIONAL: filter for a specific date range
WHERE DATE BETWEEN 'YYYY-MM-DD' AND 'YYYY-MM-DD'
),UNNEST(GENERATE_DATE_ARRAY(starting_date, ending_date, INTERVAL 1 DAY)) as days
-- other options depending on your date type ( mine was timestamp)
-- GENERATE_DATETIME_ARRAY or GENERATE_DATE_ARRAY
)
SELECT
day_series.days,
output_table.date
FROM day_series
-- do a left join on the output table (i.e. the table you are searching the missing dates for)
LEFT JOIN `project_name.dataset_name.test_table` AS output_table
ON (output_table.date)= day_series.days
-- I only want the records where data is not available or in other words empty/missing
WHERE output_table.date IS NULL
GROUP BY 1,2
ORDER BY 1

Pivot with dynamic DATE columns

I have a query that I created from a table.
example:
select
pkey,
trunc(createdformat) business_date,
regexp_substr(statistics, 'business_\w*') business_statistics
from business_data
where statistics like '%business_%'
group by regexp_substr(statistics, 'business_\w*'), trunc(createdformat)
This works great thanks to your help.
Now I want to show that in a crosstab/pivot.
That means the first column contains the business_statistics values, and the column headings are the dynamic days from business_date.
I've tried the following, but it doesn't quite work yet:
SELECT *
FROM (
select
pkey,
trunc(createdformat) business_date,
regexp_substr(statistics, 'business_\w*') business_statistics
from business_data
where statistics like '%business_%'
)
PIVOT(
count(pkey)
FOR business_date
IN ('17.06.2020','18.06.2020')
)
ORDER BY business_statistics
If I specify the dates, as here with 17.06.2020 and 18.06.2020, it works: 3 columns (Business_Statistics, 17.06.2020, 18.06.2020). But from column 2 onward it should be dynamic. That means it should show me the days (dates) that are actually present in the query/table, so the result would have X columns (Business_Statistics, Date1, Date2, Date3, Date4, ...), dynamic based on the table data.
For example, this does not work:
...
IN (SELECT DISTINCT trunc(createdformat) FROM BUSINESS_DATA WHERE statistics like '%business_%' order by trunc(createdformat))
...
The pivot clause doesn't work with dynamic values.
But there are some workarounds discussed here: How to Convert Rows to Columns and Back Again with SQL (Aka PIVOT and UNPIVOT)
You may find one workaround that suits your requirements.
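One common workaround (a sketch only, reusing the BUSINESS_DATA table from the question; the variable names are illustrative) is to build the IN list dynamically with LISTAGG and then open a ref cursor over the dynamic SQL:
DECLARE
  l_in_list VARCHAR2(4000);
  l_sql     VARCHAR2(32767);
  l_cursor  SYS_REFCURSOR;
BEGIN
  -- collect the distinct business dates as a quoted, comma-separated list
  SELECT LISTAGG('''' || TO_CHAR(TRUNC(createdformat), 'DD.MM.YYYY') || '''', ',')
           WITHIN GROUP (ORDER BY TRUNC(createdformat))
    INTO l_in_list
    FROM (SELECT DISTINCT TRUNC(createdformat) AS createdformat
            FROM business_data
           WHERE statistics LIKE '%business_%');

  -- build the pivot query with the dates baked into the IN clause
  l_sql := 'SELECT *
            FROM (
              SELECT pkey,
                     TO_CHAR(TRUNC(createdformat), ''DD.MM.YYYY'') AS business_date,
                     REGEXP_SUBSTR(statistics, ''business_\w*'') AS business_statistics
              FROM business_data
              WHERE statistics LIKE ''%business_%''
            )
            PIVOT (COUNT(pkey) FOR business_date IN (' || l_in_list || '))
            ORDER BY business_statistics';

  -- hand the cursor to the caller (for example an APEX region or a client fetch loop)
  OPEN l_cursor FOR l_sql;
END;
Note that the inner query converts business_date to a string with TO_CHAR so that the generated IN ('17.06.2020', ...) list matches on strings rather than relying on implicit date conversion.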
Unfortunately, I am not very familiar with PL/SQL. But could I still use the user's start date and end date in the query?
For example, the user enters StartDate: June 17, 2020 and EndDate: June 20, 2020 in the APEX environment.
Then the day-by-day difference is calculated in the PL/SQL query, and a variable is filled with the values of the entered period using a loop.
Example (just an idea, I'm not that fit in PL/SQL yet):
DECLARE
startdate DATE := :P9999_StartDate; -- example: 17.06.2020
enddate DATE := :P9999_EndDate; -- example: 20.06.2020
businessdate VARCHAR2(4000);
BEGIN
LOOP -- from the start date to the end date, day by day
businessdate := businessdate || ...; -- example: 17.06.2020,18.06.2020,19.06.2020, ...
END LOOP;
SELECT *
FROM (
select
pkey,
trunc(createdformat) business_date,
regexp_substr(statistics, 'business_\w*') business_statistics
from business_data
where statistics like '%business_%'
)
PIVOT(
count(pkey)
FOR business_date
IN (businessdate)
)
ORDER BY business_statistics;
END;
That would be my idea, but I'm failing to implement it. Is that possible? I hope you understand what I mean.

How to add current session time to each event in BigQuery?

I've got some event data (see the sample query at the end of this question). I want to add a column that contains the start time of the session that each event occurred in.
The session_start_time column is based on the session_start event.
I've tried using partitions in analytic functions, but to do so I need values that are the same in each row to start with, and if I had those I would already have solved my problem.
I've also tried FIRST_VALUE with a window function, but I haven't managed to pull only the events where the event_name is "session_start", because I can't see a way to filter inside window functions.
How can I achieve this using Standard SQL on BigQuery?
Below is a sample query that includes the sample data:
WITH user_events AS (
SELECT
1 AS user_id,
'session_start' AS event_name,
0 AS event_time
UNION ALL SELECT 1, 'video_play', 2
UNION ALL SELECT 1, 'ecommerce_purchase', 3
UNION ALL SELECT 1, 'session_start', 100
UNION ALL SELECT 1, 'video_play', 105
)
SELECT
user_id,
event_name,
event_time
FROM
user_events
ORDER BY
event_time
The trick is to assign each event a session number with a running COUNTIF over the session_start events, then take the MIN(event_time) within each session:
#standardSQL
WITH user_events AS (
SELECT 1 AS user_id, 'session_start' AS event_name, 0 AS event_time UNION ALL
SELECT 1, 'video_play', 2 UNION ALL
SELECT 1, 'ecommerce_purchase', 3 UNION ALL
SELECT 1, 'session_start', 100 UNION ALL
SELECT 1, 'video_play', 105
)
SELECT
user_id,
event_name,
event_time,
MIN(event_time) OVER(PARTITION BY user_id, session) AS session_start_time
FROM (
SELECT
user_id,
event_name,
event_time,
COUNTIF(event_name='session_start') OVER(PARTITION BY user_id ORDER BY event_time) AS session
FROM user_events
)
ORDER BY event_time
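Alternatively, since the question mentions FIRST_VALUE: BigQuery's navigation functions accept an IGNORE NULLS modifier, which provides exactly the filtering the question was looking for. A sketch over the same sample data:
#standardSQL
WITH user_events AS (
SELECT 1 AS user_id, 'session_start' AS event_name, 0 AS event_time UNION ALL
SELECT 1, 'video_play', 2 UNION ALL
SELECT 1, 'ecommerce_purchase', 3 UNION ALL
SELECT 1, 'session_start', 100 UNION ALL
SELECT 1, 'video_play', 105
)
SELECT
user_id,
event_name,
event_time,
-- IF() maps non-start events to NULL; IGNORE NULLS skips them, so each row
-- sees the event_time of the most recent preceding session_start
LAST_VALUE(IF(event_name='session_start', event_time, NULL) IGNORE NULLS)
OVER(PARTITION BY user_id ORDER BY event_time ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS session_start_time
FROM user_events
ORDER BY event_time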

Why can't BigQuery cast this number as an integer?

In my query, I have a value formatted as a dollar amount, like this:
Coverage_Amount
$10,000
$15,000
null
$2,000
So I remove the extra characters and map the null to 0. I get a column back like this:
Coverage_Amount
10000
15000
0
2000
However, these values are stored as strings, and when I try something like this:
CASE
WHEN Coverage_Amount IS NOT NULL THEN INTEGER(REGEXP_REPLACE(query.Coverage_Amount, r'\$|,', ''))
ELSE 0
END AS Coverage_Amount
I get back
Coverage_Amount
null
null
0
null
The documentation for the INTEGER() function says
Casts expr to a 64-bit integer. Returns NULL if expr is a string that doesn't correspond to an integer value.
Is there anything I can do to make BigQuery recognize that these are in fact integers?
Both versions below for BigQuery (Legacy SQL and Standard SQL, respectively) work and return the result below:
Coverage_Amount  val
10000            10000
15000            15000
2000             2000
Legacy SQL
#legacySQL
SELECT
Coverage_Amount,
IFNULL(INTEGER(REGEXP_REPLACE(Coverage_Amount, r'\$|,', '')), 0) AS val
FROM
(SELECT '10000' Coverage_Amount),
(SELECT '15000' Coverage_Amount),
(SELECT '2000' Coverage_Amount)
Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
SELECT '10000' Coverage_Amount UNION ALL
SELECT '15000' UNION ALL
SELECT '2000'
)
SELECT
Coverage_Amount,
IFNULL(CAST(REGEXP_REPLACE(Coverage_Amount, r'\$|,', '') AS INT64), 0) AS val
FROM `project.dataset.table`
Obviously, the same works for '$15,000', '$10,000', '$2,000', etc.
It could be because you have trailing spaces at the end of the string, e.g. '$10,000 '. So you can try to use RTRIM(value, ' ')
SELECT
Coverage_Amount,
IFNULL(INTEGER(REGEXP_REPLACE(RTRIM(Coverage_Amount, ' '), r'\$|,', '')),0) AS val
FROM
(SELECT '$10,000 ' Coverage_Amount)
to delete all spaces from the end of the string. Then the output will be:
Row  Coverage_Amount  val
1    $10,000          10000
Are you using Standard SQL? This worked for me (notice I use the CAST operator):
WITH data as(
select "$10,000" d UNION ALL
select "$15,000" UNION ALL
select "$2,000")
SELECT
d,
CAST(REGEXP_REPLACE(d, r'\$|,', '') AS INT64) AS Coverage_Amount
FROM data
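If some values might still fail to parse after the cleanup, SAFE_CAST returns NULL instead of raising an error; a minimal variant (the sample data here is invented for illustration):
#standardSQL
WITH data AS (
SELECT '$10,000' AS d UNION ALL
SELECT 'N/A' -- deliberately unparseable
)
SELECT
d,
IFNULL(SAFE_CAST(REGEXP_REPLACE(d, r'\$|,', '') AS INT64), 0) AS Coverage_Amount
FROM data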

Regular expression on Dates in Oracle

I have dates in all the possible permutations of formats: MM/DD/YYYY, M/D/YYYY, MM/D/YYYY, M/DD/YYYY.
Now I need to write a regular expression in Oracle DB to identify the different date formats in one column as-is.
Try this one:
with t(date_col) as (
select '01/01/2014' from dual
union all
select '1/2/2014' from dual
union all
select '01/3/2014' from dual
union all
select '1/04/2014' from dual
union all
select '11/1/14' from dual)
select date_col,
case
when regexp_instr(date_col, '^\d/\d/\d{4}$') = 1 then
'm/d/yyyy'
when regexp_instr(date_col, '^\d{2}/\d/\d{4}$') = 1 then
'mm/d/yyyy'
when regexp_instr(date_col, '^\d/\d{2}/\d{4}$') = 1 then
'm/dd/yyyy'
when regexp_instr(date_col, '^\d{2}/\d{2}/\d{4}$') = 1 then
'mm/dd/yyyy'
else
'Unknown format'
end date_format
from t;
DATE_COL   DATE_FORMAT
---------- --------------
01/01/2014 mm/dd/yyyy
1/2/2014   m/d/yyyy
01/3/2014  mm/d/yyyy
1/04/2014  m/dd/yyyy
11/1/14    Unknown format
I am not sure what your goal is, but since the month always comes first, followed by the day, you can use the following expression to get a date regardless of the input format (Oracle's to_date does not require leading zeros, so the MM/DD/YYYY mask also parses '1/2/2014'):
select to_date( column, 'mm/dd/yyyy') from ...
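A quick illustration of that leniency (a sketch on made-up values), showing the single mask covering all four permutations:
with t(date_col) as (
select '01/01/2014' from dual union all
select '1/2/2014' from dual union all
select '01/3/2014' from dual union all
select '1/04/2014' from dual)
select date_col, to_date(date_col, 'MM/DD/YYYY') parsed
from t;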
You can select all records for which the following is true:
where [column_value] != to_char(to_date([column_value],'MM/DD/YYYY'),'MM/DD/YYYY')
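As an illustrative sketch of why this works: the round trip pads single-digit months and days, so any value that changes was not already in canonical MM/DD/YYYY form:
with t(date_col) as (
select '01/01/2014' from dual union all
select '1/2/2014' from dual)
select date_col
from t
where date_col != to_char(to_date(date_col, 'MM/DD/YYYY'), 'MM/DD/YYYY');
-- returns only 1/2/2014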