Calculate date and week-ending date on Presto

Given day, month and year as integer columns in a table, I want to calculate the date and the week-ending date from these values.
I tried the following:
select date_parse(cast (2020 as varchar)||cast (03 as varchar)||cast (02 as varchar),'%Y%m%d')
It returns the error "INVALID_FUNCTION_ARGUMENT: Invalid format: "202032" is too short".

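The concatenation fails because casting the integers 03 and 02 to varchar drops the leading zeros, so the string becomes "202032" instead of "20200302". The simplest way is to use format() + cast to date: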
presto> SELECT CAST(format('%d-%d-%d', 2020, 3, 31) AS date);
_col0
------------
2020-03-31
Since Athena is still based on Presto 0.172, it doesn't have this function yet, so you can do the same without format:
presto> SELECT CAST(CAST(2020 AS varchar) || '-' || CAST(3 AS varchar) || '-' || CAST(31 AS varchar) AS date);
_col0
------------
2020-03-31
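For anyone who wants to check the logic outside Presto, here is a minimal Python sketch of both steps: building the date from integer parts and deriving the week-ending date. It assumes the week ends on Sunday (the question does not say which day), and the helper names make_date and week_ending are made up for illustration.

```python
from datetime import date, timedelta


def make_date(year: int, month: int, day: int) -> date:
    """Build a date directly from integer parts (no string parsing needed)."""
    return date(year, month, day)


def week_ending(d: date, week_end_weekday: int = 6) -> date:
    """Return the first date on or after d that falls on week_end_weekday.

    Uses Python's weekday() numbering: Monday=0 ... Sunday=6.
    The default of Sunday is an assumption, not something the question states.
    """
    days_ahead = (week_end_weekday - d.weekday()) % 7
    return d + timedelta(days=days_ahead)


# Why the original concatenation failed: integers carry no leading zeros.
unpadded = f"{2020}{3}{2}"          # "202032", too short for %Y%m%d
padded = f"{2020}{3:02d}{2:02d}"    # "20200302"

d = make_date(2020, 3, 2)
print(unpadded, padded)
print(d, week_ending(d))
```

Pass week_end_weekday=5 if your weeks end on Saturday instead.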

Power Query - Filter column in Julian Format by Today

I have a table that contains the Date in the JDE Julian Date format:
CYYDDD.
For instance:
01.01.2021 = 121001
Now I would like to filter this column by today. In the past I used this SQL Statement to filter the data:
DB.JDate> ( FLOOR(( EXTRACT(YEAR FROM SYSDATE) - 1900 ) / 100)
|| TO_CHAR(SYSDATE, 'RRDDD') )
How would you do this within Power Query?
I did not expect that getting a date would need such a long calculation, but I managed to produce the output with an M query. Please accept the answer if it helps :)
To convert the Julian date to a normal date, you can use the following formula, although it is a bit long:
Date.AddDays(#date((1900 +
Number.FromText(Text.Range(Number.ToText([Julian Date]),0,1)) *100 +
Number.FromText(Text.Range(Number.ToText([Julian Date]),1,2))),1,1),
Number.FromText(Text.Range(Number.ToText([Julian Date]),3))-1)
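Getting today's date for the filter is easy. DateTime.LocalNow() returns a datetime, so wrap it in Date.From when comparing against a date:
Date.From(DateTime.LocalNow())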
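To sanity-check the CYYDDD arithmetic independently of Power Query, here is a small Python sketch of the same conversion in both directions. The function names jde_to_date and date_to_jde are hypothetical, and the sketch assumes the value is stored as an integer, so it uses integer division rather than text slicing.

```python
from datetime import date, timedelta


def jde_to_date(jde: int) -> date:
    """Convert a JDE Julian date (CYYDDD) to a calendar date."""
    century, rest = divmod(jde, 100000)   # C: centuries since 1900
    yy, ddd = divmod(rest, 1000)          # YY: year in century, DDD: day of year
    year = 1900 + century * 100 + yy
    return date(year, 1, 1) + timedelta(days=ddd - 1)


def date_to_jde(d: date) -> int:
    """Convert a calendar date to a JDE Julian date (CYYDDD)."""
    return (d.year - 1900) * 1000 + d.timetuple().tm_yday


print(jde_to_date(121001))            # the example from the question: 01.01.2021
print(date_to_jde(date(2021, 1, 1)))
print(date_to_jde(date.today()))      # value to compare the column against
```

Working on the integer also handles dates before 2000, where the century digit is 0 and the text form has only five characters.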

GCP Bigquery WORKDAY function

How do I go about calculating number of workdays in a MONTH based on a date in another column?
Example:
Column 1 - 2020-06-30
Column 2 (Calculated) - 22 (i.e. the number of workdays, Monday to Friday, in June)
Does BQ have a WORKDAY function?
You can use the approach below:
create temp function workdays(input date) as ((
select count(*)
from unnest(generate_date_array(date_trunc(input, month), last_day(input, month ))) day
where not extract(dayofweek from day) in (1, 7)
));
select column1,
workdays(column1) as column2
from your_table
Applied to the sample data in your question, the output for 2020-06-30 is 22.
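If you want to cross-check the count outside BigQuery, the same idea (enumerate every day of the month, drop Saturdays and Sundays) can be sketched in Python. The function name workdays_in_month is made up for illustration, and, like the SQL above, it ignores public holidays.

```python
import calendar
from datetime import date


def workdays_in_month(d: date) -> int:
    """Count Monday-to-Friday days in the month that contains d."""
    _, days_in_month = calendar.monthrange(d.year, d.month)
    return sum(
        1
        for day in range(1, days_in_month + 1)
        if date(d.year, d.month, day).weekday() < 5  # Monday=0 ... Friday=4
    )


print(workdays_in_month(date(2020, 6, 30)))  # 22, as in the question
```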

How to find missing dates in BigQuery table using sql

How can I get a list of missing dates from a BigQuery table? For example, a table (test_table) is populated every day by some job, but on a few days the job fails and data isn't written to the table.
Use Case:
We have a table (test_table) which is populated every day by some job (a scheduled query or cloud function). Sometimes the job fails and data isn't available for those particular dates in my table.
How can I find those dates without scrolling through thousands of rows?
The query below returns a list of dates and ad_ids where data wasn't uploaded (null).
Note: I used MIN(Date) and MAX(Date) because I knew the missing dates were between my boundary dates. To be safe, you can also specify starting_date and ending_date explicitly, in case data hasn't been populated at all in the last few days.
WITH Date_Range AS
-- anchor for date range
(
SELECT MIN(DATE) as starting_date,
MAX(DATE) AS ending_date
FROM `project_name.dataset_name.test_table`
),
day_series AS
-- anchor to get all the dates within the range
(
SELECT *
FROM Date_Range
,UNNEST(GENERATE_TIMESTAMP_ARRAY(starting_date, ending_date, INTERVAL 1 DAY)) AS days
-- other options depending on your date type ( mine was timestamp)
-- GENERATE_DATETIME_ARRAY or GENERATE_DATE_ARRAY
)
SELECT
day_series.days,
original_table.ad_id
FROM day_series
-- do a left join on the source table
LEFT JOIN `project_name.dataset_name.test_table` AS original_table ON (original_table.date)= day_series.days
-- I only want the records where data is not available or in other words empty/missing
WHERE original_table.ad_id IS NULL
GROUP BY 1,2
ORDER BY 1
As an alternative, you can try the following query to get the desired output:
with t as (select 1 as id, cast ('2020-12-25' as timestamp) Days union all
select 1 as id, cast ('2020-12-26' as timestamp) Days union all
select 1 as id, cast ('2020-12-27' as timestamp) Days union all
select 1 as id, cast ('2020-12-31' as timestamp) Days union all
select 1 as id, cast ('2021-01-01' as timestamp) Days union all
select 1 as id, cast ('2021-01-04' as timestamp) Days)
SELECT *
FROM (
select TIMESTAMP_ADD(Days, INTERVAL 1 DAY) AS Days, TIMESTAMP_SUB(next_days, INTERVAL 1 DAY) AS next_days from (
select t.Days,
(case when lag(Days) over (partition by id order by Days) = Days
then NULL
when lag(Days) over (partition by id order by Days) is null
then Null
else Lead(Days) over (partition by id order by Days)
end) as next_days
from t) where next_days is not null
and Days <> TIMESTAMP_SUB(next_days, INTERVAL 1 DAY)),
UNNEST(GENERATE_TIMESTAMP_ARRAY(Days, next_days, INTERVAL 1 DAY)) AS days
I used the code above but had to restructure it for BigQuery:
-- anchor for date range - this will select dates from the source table (i.e. the table your query runs off of)
WITH day_series AS(
SELECT *
FROM (
SELECT MIN(DATE) as starting_date,
MAX(DATE) AS ending_date
FROM --enter source table here--
---OPTIONAL: filter for a specific date range
WHERE DATE BETWEEN 'YYYY-MM-DD' AND 'YYYY-MM-DD'
),UNNEST(GENERATE_DATE_ARRAY(starting_date, ending_date, INTERVAL 1 DAY)) as days
-- other options depending on your date type (mine was date)
-- GENERATE_DATETIME_ARRAY or GENERATE_TIMESTAMP_ARRAY
)
SELECT
day_series.days,
output_table.date
FROM day_series
-- do a left join on the output table (i.e. the table you are searching the missing dates for)
LEFT JOIN `project_name.dataset_name.test_table` AS output_table
ON (output_table.date)= day_series.days
-- I only want the records where data is not available or in other words empty/missing
WHERE output_table.date IS NULL
GROUP BY 1,2
ORDER BY 1
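The idea behind these queries (generate every date between the first and last date present, then keep the ones with no matching row) can be sketched in plain Python. This is only an illustration: missing_dates is a hypothetical helper, and the sample dates are the ones from the alternative query above.

```python
from datetime import date, timedelta


def missing_dates(present):
    """Return the dates between min(present) and max(present) that are absent."""
    present = set(present)
    if not present:
        return []
    start, end = min(present), max(present)
    span = (end - start).days
    all_days = (start + timedelta(days=i) for i in range(span + 1))
    return [d for d in all_days if d not in present]


loaded = [
    date(2020, 12, 25), date(2020, 12, 26), date(2020, 12, 27),
    date(2020, 12, 31), date(2021, 1, 1), date(2021, 1, 4),
]
for d in missing_dates(loaded):
    print(d)
```

As with the MIN/MAX version of the SQL, gaps before the first or after the last loaded date are not reported; pass explicit boundaries if you need those.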

In a SAS application, set a date parameter and use it against concatenated columns to retrieve data in a certain period

The date in the table is not stored in one column: days are in the day column, months in the month column and years in the year column.
I concatenated the columns, used the concatenation in the where clause and compared it with the parameter I created, but I got no result.
I assume you are querying a date dimension table, and you want to extract the record that matches a certain date.
Solution:
I created a dates table to match with,
data dates;
input key day month year ;
datalines;
1 19 2 2018
2 20 2 2018
3 21 2 2018
4 22 2 2018
;;;
run;
In the where clause I break the date '20feb2018'd apart with the day, month and year functions. In SAS, a date literal is written as a quoted string followed by d ('...'d).
proc sql;
select * from dates
/*if you want to match todays' date: replace '20feb2018'd with today()*/
where day('20feb2018'd)=day and month('20feb2018'd)=month and year('20feb2018'd)=year;
quit;
If you want to compare a date built from day, month and year, use the mdy function in the where clause as shown below. It is not totally clear what you are looking for.
proc sql;
select * from dates
where mdy(month,day, year) between '19feb2018'd and '21feb2018'd ;
quit;
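The point of mdy is to turn the three columns into one real date before comparing, instead of comparing concatenated text. A rough Python equivalent, using the same four sample rows, might look like this; rows_between is a made-up helper name and the tuple layout (key, day, month, year) mirrors the input statement above.

```python
from datetime import date

# (key, day, month, year), same values as the dates table above
rows = [
    (1, 19, 2, 2018),
    (2, 20, 2, 2018),
    (3, 21, 2, 2018),
    (4, 22, 2, 2018),
]


def rows_between(rows, start: date, end: date):
    """Keep rows whose (year, month, day) columns form a date in [start, end]."""
    return [r for r in rows if start <= date(r[3], r[2], r[1]) <= end]


matches = rows_between(rows, date(2018, 2, 19), date(2018, 2, 21))
print([r[0] for r in matches])  # keys of the matching rows
```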

SAS Date Functions: Quarterly Equivalent of Day Function

I often use the day function in order to control date parameters within queries.
With the data step below I set &bom (beginning of month) depending on the day of the month: if it is less than or equal to the 8th, &bom is set to the 1st of the previous month; otherwise it is set to the 1st of the current month.
data _null_;
call symput('current'," '" || put(intnx('day',today(),0),yymmdd10.) || "'");
call symput('bom'," '" || put(intnx('month',today(),0,'b'),yymmdd10.) || "'");
call symput('end'," '" || put(intnx('day',today(),-8,'e'),yymmdd10.) || "'");
if day(today()) le 8 then do;
call symput('bom'," '" || put(intnx('month',today(),-1,'b'),yymmdd10.) || "'");
end;
run;
%put &bom &end &current;
'2015-12-01' '2015-12-30' '2016-01-07'
It would seem simple to apply this logic to a quarterly condition: if today falls within the first 8 days of a quarter, &boq (beginning of quarter) should be the first day of the previous quarter, '2015-10-01'. However, the qtr function returns the quarter number (1-4), not the sequential day within the quarter.
Is there a function that works on the day number within a quarter, much like the day function works within a month?
My initial attempt was to nest functions, without success:
qtr_day = day(qtr(today()));
The trick is to use SAS dates. Since SAS dates are numbers, you can find the date boundary and add or subtract 8 as needed. Alternatively, you can nest intnx calls on a date. To display the date as a quarter, apply a quarter format to it.
date_qtr_boundary = intnx('quarter', today(), 0, 'e') + 8;
Then you can compare your dates to the boundary value rather than the number 8. I'm having a hard time following exactly what you want to determine, but if you post some sample data and expected output, I (or someone else) can provide more details.
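To make the "dates are just numbers" idea concrete, here is a small Python sketch of one reading of the question: compute the day number within the quarter by subtracting the quarter's first day, and fall back to the previous quarter when that number is 8 or less. The names quarter_start, day_of_quarter and boq are hypothetical, and the 8-day grace period is taken from the monthly logic in the question, so treat this as an interpretation rather than a confirmed requirement.

```python
from datetime import date, timedelta


def quarter_start(d: date) -> date:
    """First day of the calendar quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, first_month, 1)


def day_of_quarter(d: date) -> int:
    """Sequential day number within the quarter (1 on the quarter's first day)."""
    return (d - quarter_start(d)).days + 1


def boq(today: date, grace_days: int = 8) -> date:
    """Beginning of quarter, or of the previous quarter during the grace period."""
    start = quarter_start(today)
    if day_of_quarter(today) <= grace_days:
        # Step back one day into the previous quarter and take its start.
        return quarter_start(start - timedelta(days=1))
    return start


print(day_of_quarter(date(2016, 1, 7)), boq(date(2016, 1, 7)))
print(day_of_quarter(date(2016, 1, 9)), boq(date(2016, 1, 9)))
```

In SAS terms the same subtraction would be today() minus intnx('quarter', today(), 0, 'b'), plus 1.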