presto/athena get next minute from current timestamp

How can I get the next minute from current_timestamp in Presto/Athena?
E.g.:
2021-07-27 12:29:52.951 UTC -> 2021-07-27 12:30:00.000 UTC
2021-07-27 12:29:25.951 UTC -> 2021-07-27 12:30:00.000 UTC

There's no built-in function for that, but you can do it by adding 1 minute to the timestamp and then using date_trunc to round down to the nearest minute:
WITH data(ts) AS (
VALUES
TIMESTAMP '2021-07-27 12:29:52.951 UTC',
TIMESTAMP '2021-07-27 12:29:25.951 UTC'
)
SELECT ts, date_trunc('minute', ts + INTERVAL '1' MINUTE)
FROM data
=>
ts | _col1
-----------------------------+-----------------------------
2021-07-27 12:29:52.951 UTC | 2021-07-27 12:30:00.000 UTC
2021-07-27 12:29:25.951 UTC | 2021-07-27 12:30:00.000 UTC
(2 rows)
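The same add-then-truncate logic can be cross-checked outside the database. The following is a minimal Python sketch, not Presto code; the function name next_minute is made up for illustration. It also shows one consequence of this approach that is worth knowing: a timestamp that already sits exactly on a minute boundary moves to the following minute rather than staying put.

```python
from datetime import datetime, timedelta, timezone


def next_minute(ts: datetime) -> datetime:
    """Add one minute, then drop seconds and sub-seconds (hypothetical helper)."""
    return (ts + timedelta(minutes=1)).replace(second=0, microsecond=0)


# The two sample timestamps from the question.
a = datetime(2021, 7, 27, 12, 29, 52, 951000, tzinfo=timezone.utc)
b = datetime(2021, 7, 27, 12, 29, 25, 951000, tzinfo=timezone.utc)

# A timestamp exactly on a minute boundary: it advances to the next minute.
on_boundary = datetime(2021, 7, 27, 12, 30, 0, tzinfo=timezone.utc)

print(next_minute(a))            # 2021-07-27 12:30:00+00:00
print(next_minute(b))            # 2021-07-27 12:30:00+00:00
print(next_minute(on_boundary))  # 2021-07-27 12:31:00+00:00
```

If boundary values should stay unchanged instead, the rounding would have to be a ceiling rather than add-then-truncate.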

Related

Define and convert datetime in AWS Athena

I have a process where I need to match a UTC datetime against an EDT datetime.
As you know, the Eastern offset from UTC changes between 4 and 5 hours.
How can I define one datetime to be in UTC and another to be in EDT and match the two?
Something like (datetime_A is my EDT timestamp, and datetime_B is my UTC):
Where CAST((datetime_A as EDT) to UTC)=datetime_B
Thanks!

How to generate_series of every hour of every day of 1 year from the current timestamp

I have a query that generates every day of the year (shown below). What if I want a series of every hour of every day of the year, counting back from the current timestamp? Example: today is July 23, 2019 10:30:00 AM, and the result I am hoping to get is below:
2019-07-23 20:30:00
2019-07-23 20:00:00
2019-07-23 19:00:00
2019-07-23 18:00:00
.
.
.
2018-07-23 20:00:00
This is a Redshift (PostgreSQL 8.0.2) query for Eclipse BIRT. I am hoping to create a parameter for both date and time, but that seems difficult to achieve with 2 separate ranges.
select cast(convert_timezone('UTC','AEST',cast(now() as timestamp without time zone)) as date) - generate_series(0, 365) date,
to_char(cast(convert_timezone('UTC','AEST',cast(now() as timestamp without time zone)) as date) - generate_series(0, 365), 'dd/mm/yyyy') date_disp;

This is too similar to your previous question.
Use:
SELECT date_trunc('hour', now()::timestamp) - generate_series(0, 24 * 365) * interval '1 hour'
This outputs:
2019-07-23 05:00:00
2019-07-23 04:00:00
etc
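To make the arithmetic behind that query concrete, here is a small Python sketch of the same idea: truncate the current time to the hour, then subtract 0, 1, 2, ... hours. The function name hourly_series and the fixed sample time are assumptions for illustration only. Like generate_series(0, 24 * 365), the range is inclusive at both ends, so it yields 24 * 365 + 1 values.

```python
from datetime import datetime, timedelta


def hourly_series(now: datetime, days: int = 365) -> list:
    """Hourly timestamps from the current hour back `days` days, newest first."""
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    # range(... + 1) mirrors generate_series(0, 24 * days), which includes both ends.
    return [top_of_hour - timedelta(hours=h) for h in range(24 * days + 1)]


# Sample "now" taken from the question's example; any datetime works.
series = hourly_series(datetime(2019, 7, 23, 10, 30, 0))

print(series[0])    # 2019-07-23 10:00:00
print(series[1])    # 2019-07-23 09:00:00
print(series[-1])   # 2018-07-23 10:00:00
print(len(series))  # 8761
```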

You can use the DATEADD Redshift function, using "h", "hr" or "hrs" as your first parameter. Documentation for this function can be found here and here.

Redshift - Adding timezone offset (Varchar) to timestamp column

As part of an ETL to Redshift, one of the source tables has 2 columns:
original_timestamp - TIMESTAMP: which is the local time when the record was inserted in whichever region
original_timezone_offset - Varchar: which is the offset to UTC
The data looks something like this:
original_timestamp          original_timezone_offset
2011-06-22 11:00:00.000000  -0700
2014-11-29 17:00:00.000000  -0800
2014-12-02 22:00:00.000000  +0900
2011-06-03 09:23:00.000000  -0700
2011-07-28 03:00:00.000000  -0700
2011-05-01 01:30:00.000000  -0700
In my target table, I need to convert this to UTC (using the offset). How do I do it?
So far I have tried multiple things but dateadd() seems to be the closest solution. But the problem with dateadd() is, when I say:
SELECT original_timestamp, original_timezone_offset
,dateadd(H, original_timezone_offset, original_timestamp) as original_utc_time
it is adding/subtracting '700'/'800' hours instead of 7/8 hrs to the original timestamp because the offset is a VARCHAR and the values are like: -0700 etc.
Did anyone see this issue before? Appreciate any help/inputs. Thanks.

Just take the 'hours' part of the offset:
WITH t as (
SELECT '2011-06-22 11:00:00.000000'::timestamp as original_timestamp, '-0700' as original_timezone_offset
UNION ALL
SELECT '2014-11-29 17:00:00.000000'::timestamp,'-0800'
UNION ALL
SELECT '2014-12-02 22:00:00.000000'::timestamp,'+0900'
)
SELECT
original_timestamp,
original_timezone_offset,
DATEADD(hour, SUBSTRING(original_timezone_offset, 1, 3)::INT, original_timestamp)
FROM t
2011-06-22 11:00:00 -0700 2011-06-22 04:00:00
2014-11-29 17:00:00 -0800 2014-11-29 09:00:00
2014-12-02 22:00:00 +0900 2014-12-03 07:00:00
You'll need some additional fancy code if you have non-full-hour offsets (eg +0730).

First, recognize that if your timestamps are already in local time of the given offset, then you need to subtract that offset to convert back to UTC. In that first example you gave, 2011-06-22 11:00:00 -0700 is equivalent to 2011-06-22 18:00:00 UTC.
However, rather than try to add or subtract these values yourself, you should let the AT TIME ZONE function do the work for you. It will create a timestamptz that is in your supplied offset, then you can use it again to convert to UTC.
(Note that you could use the CONVERT_TIMEZONE function instead, but that one is only understood by Redshift, where AT TIME ZONE works on regular PostgreSQL also.)
However, the problem you have is that your time zone offsets aren't in a format understood by these functions. See time zone usage notes. So, before we try to convert, let's translate your offset strings to an understood format.
We will want -0700 to become +07:00. The colon is required, and the sign must be flipped because it will be interpreted with the POSIX-style time zone format. In that format, positive values lie west of GMT instead of the usual conventions specified in ISO 8601.
concat(translate(substring(original_timezone_offset, 1, 3), '-+', '+-'),':',substring(original_timezone_offset, 4, 2))
Then we will use that with AT TIME ZONE to do the conversion:
(original_timestamp AT TIME ZONE <the above mess>) AT TIME ZONE 'UTC' AS utc_timestamp
Putting it all together...
WITH t as (
SELECT '2011-06-22 11:00:00.000000'::timestamp as original_timestamp, '-0700' as original_timezone_offset
UNION ALL
SELECT '2014-11-29 17:00:00.000000'::timestamp,'-0800'
UNION ALL
SELECT '2014-12-02 22:00:00.000000'::timestamp,'+0900'
)
SELECT
original_timestamp,
original_timezone_offset,
concat(translate(substring(original_timezone_offset, 1, 3), '-+', '+-'),':',substring(original_timezone_offset, 4, 2)) as modified_timezone_offset,
(original_timestamp AT TIME ZONE concat(translate(substring(original_timezone_offset, 1, 3), '-+', '+-'),':',substring(original_timezone_offset, 4, 2))) AT TIME ZONE 'UTC' AS utc_timestamptz
FROM t
Output:
2011-06-22 11:00:00 -0700 +07:00 2011-06-22 18:00:00
2014-11-29 17:00:00 -0800 +08:00 2014-11-30 01:00:00
2014-12-02 22:00:00 +0900 -09:00 2014-12-02 13:00:00
SQL Fiddle here.
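As a cross-check of the expected UTC values, the conversion can be sketched in plain Python: parse the "+HHMM" / "-HHMM" string into a signed duration, then subtract it from the local timestamp. The helper names offset_to_timedelta and local_to_utc are hypothetical, and the sketch assumes every offset is exactly five characters long. Because the minutes are parsed as well, non-full-hour offsets such as +0730 are handled too.

```python
from datetime import datetime, timedelta


def offset_to_timedelta(offset: str) -> timedelta:
    """Turn a string like '-0700' or '+0730' into a signed timedelta."""
    sign = -1 if offset[0] == "-" else 1
    hours = int(offset[1:3])
    minutes = int(offset[3:5])
    return sign * timedelta(hours=hours, minutes=minutes)


def local_to_utc(local: datetime, offset: str) -> datetime:
    """Local time = UTC + offset, so UTC = local time - offset."""
    return local - offset_to_timedelta(offset)


print(local_to_utc(datetime(2011, 6, 22, 11, 0), "-0700"))   # 2011-06-22 18:00:00
print(local_to_utc(datetime(2014, 11, 29, 17, 0), "-0800"))  # 2014-11-30 01:00:00
print(local_to_utc(datetime(2014, 12, 2, 22, 0), "+0900"))   # 2014-12-02 13:00:00
print(local_to_utc(datetime(2020, 1, 1, 7, 30), "+0730"))    # 2020-01-01 00:00:00
```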

How to append current and previous sessions side by side filtered by two independent slicers

Objective: I would like to obtain the difference between current and previous sessions based on date slicers.
I want the output to be 4 columns as such:
Date
Current Sessions (see measure below)
Previous Sessions (see measure below)
Difference (no measure calculated yet).
Situation:
I currently have two measures
Current Sessions: SUM(Sales[Sessions])
Previous Sessions (thanks to @Alexis Olson):
VAR datediffs = DATEDIFF(
CALCULATE (MAX ( 'Date'[Date] ) ),
CALCULATE (MAX ('Previous Date'[Date])),
DAY
)
RETURN
CALCULATE(SUM(Sales[Sessions]),
USERELATIONSHIP('Previous Date'[Date],'Date'[Date]),
DATEADD('Date'[Date],datediffs,DAY)
)
I have three tables.
Sales
Date
Previous Date (carbon copy of Date table)
My Previous Date table has a 1:1 inactive relationship with the Date table. The Date table has a one-to-many active relationship with my Sales table.
I have two slicers at all times, comparing the same number of days from different time periods (e.g. Jan 1st to Jan 7th 2019 vs Dec 25th to Dec 31st 2019).
If I put current sessions, previous sessions and a date column from any of the three tables into a table, this is the kind of output I am after:
+----------+------------------+-------------------+------------+
| date | current sessions | previous sessions | difference |
+----------+------------------+-------------------+------------+
| Jan 8th | 10000 | 7000 | 3000 |
| Jan 9th | 20000 | 10000 | 10000 |
| Jan 10th | 15000 | 16000 | -1000 |
| Jan 11th | 14000 | 12000 | 2000 |
| Jan 12th | 12000 | 14000 | -2000 |
| Jan 13th | 11000 | 16000 | -5000 |
| Jan 14th | 15000 | 18000 | -3000 |
+----------+------------------+-------------------+------------+
When I put the Sessions date on the table along with sessions and previous sessions, I get the right session amounts for each day, but the previous session amounts don't calculate correctly, I assume because they are being filtered by the date rows.
How can I override that table filter and force it to get the exact previous session amounts? Basically, I want both results appended to each other. My problem is that the previous sessions value is the same on each day and is basically the amount of Dec 31st Jan 2018, because the max date is different for each row, but I want it to be based on the slicer.

The mistake was in the first argument of the datediffs variable in the Previous Sessions formula. It should be:
CALCULATE(LASTDATE('Date'[Date]),ALLSELECTED('Date'))
This forces the last date to be calculated over the whole slicer selection for every row, overriding the date value in each row.

Pandas time series: groupby and sum from noon to noon

My pandas dataframe is structured like this (with 'date' as index):
            starttime  duration_seconds
date
2012-12-24   11:52:00             31800
2012-12-23    0:28:00             35940
2012-12-22    2:00:00             26820
2012-12-21    1:57:00             23520
2012-12-20    1:32:00             23100
2012-12-19    0:50:00             25080
2012-12-18    1:17:00             24780
2012-12-17    0:38:00             25440
2012-12-15   10:38:00             32760
2012-12-14    0:35:00             23160
2012-12-12   22:54:00              3960
2012-12-12    0:21:00             24060
2012-12-10   23:45:00               900
2012-12-11   11:00:00             24840
2012-12-10    0:27:00             25980
2012-12-09   19:29:00              4320
2012-12-09    3:00:00             29880
2012-12-08    2:07:00             34380
I use the following to groupby date and sum the total seconds each day:
df_sum = df.groupby(df.index.date).sum()
What I'd like to do is sum duration_seconds from noon on one day to noon on the following day. Is there an elegant (pandas) way of doing this? Thanks in advance!

pd.TimeGrouper is a custom groupby class for time-interval grouping of NDFrames with a DatetimeIndex, TimedeltaIndex or PeriodIndex. (If your dataframe index is using date-strings, you'll need to convert it to a DatetimeIndex first by using df.index = pd.DatetimeIndex(df.index).)
df.groupby(pd.TimeGrouper('24H')).sum() groups df using 24-hour intervals starting at time 00:00:00.
df.groupby(pd.TimeGrouper('24H', base=12)).sum() groups df using 24-hour intervals starting at time 12:00:00:
In [90]: df.groupby(pd.TimeGrouper('24H', base=12)).sum()
Out[90]:
duration_seconds
2012-12-07 12:00:00 34380.0
2012-12-08 12:00:00 34200.0
2012-12-09 12:00:00 26880.0
2012-12-10 12:00:00 24840.0
2012-12-11 12:00:00 28020.0
2012-12-12 12:00:00 NaN
2012-12-13 12:00:00 23160.0
2012-12-14 12:00:00 32760.0
2012-12-15 12:00:00 NaN
2012-12-16 12:00:00 25440.0
2012-12-17 12:00:00 24780.0
2012-12-18 12:00:00 25080.0
2012-12-19 12:00:00 23100.0
2012-12-20 12:00:00 23520.0
2012-12-21 12:00:00 26820.0
2012-12-22 12:00:00 35940.0
2012-12-23 12:00:00 31800.0
Documentation on pd.TimeGrouper is a little sparse. It is a subclass of pd.Grouper, and thus many of its parameters have the same meaning as those documented for pd.Grouper. You can find more examples of pd.TimeGrouper usage in the Cookbook. I found the base parameter by inspecting the source code. The base parameter in pd.TimeGrouper has the same meaning as the base parameter in df.resample, which is not surprising since df.resample is implemented using pd.TimeGrouper.
In fact, come to think of it, another way to compute the desired result is
df.resample('24H', base=12).sum()
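Two caveats may be worth keeping in mind. First, as far as I know, pd.TimeGrouper and the base argument have been removed from recent pandas releases, where pd.Grouper(freq=...) and the offset argument of resample are the usual replacements. Second, the grouping only reflects the actual start times if the date index and the starttime column are combined into one full timestamp. The underlying idea does not depend on pandas: shift each timestamp back 12 hours and group by the resulting date. Below is a minimal plain-Python sketch of that idea, using a few rows from the question; the names noon_bucket and sum_noon_to_noon are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta


def noon_bucket(ts: datetime) -> datetime:
    """Return the noon that starts the 24-hour window containing ts."""
    shifted = ts - timedelta(hours=12)
    return datetime.combine(shifted.date(), time(12, 0))


def sum_noon_to_noon(rows):
    """Sum durations per noon-to-noon window; rows are (timestamp, seconds)."""
    totals = defaultdict(int)
    for ts, seconds in rows:
        totals[noon_bucket(ts)] += seconds
    return dict(sorted(totals.items()))


# A few rows from the question, with date and starttime combined.
rows = [
    (datetime(2012, 12, 8, 2, 7), 34380),
    (datetime(2012, 12, 9, 3, 0), 29880),
    (datetime(2012, 12, 9, 19, 29), 4320),
    (datetime(2012, 12, 10, 0, 27), 25980),
    (datetime(2012, 12, 10, 23, 45), 900),
]

totals = sum_noon_to_noon(rows)
for start, seconds in totals.items():
    print(start, seconds)
```

With these rows, the 19:29 start on 2012-12-09 and the 0:27 start on 2012-12-10 land in the same window (noon on the 9th to noon on the 10th), which is the noon-to-noon behaviour the question asks for.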
