How to calculate the gap between 2 timestamps (edited for AWS Athena) - amazon-athena

I have many IoT devices that send data to my Amazon Athena server. I created a table to store the data, and the table contains 2 columns: LocalTime indicates the time the IoT device captured its status; ServerTime indicates the time the data arrived at the server (sometimes the IoT device doesn't have a network connection).
I would like to count the "gaps" in blocks of hours (let's say 1 hour) in order to know the deviation of the data arrival, for example:
The result that I would like to get is:
In order to calculate the result I want to calculate how many hours passed between ServerTime and LocalTime,
so the first entry (1.1.2019 12:15 - 1.1.2019 10:25) falls in the 1-2 hours bucket.
Thanks

If MSSQL Server is your database, you can try the script below to get your desired output:
SELECT
    CAST(DATEDIFF(HH, localTime, serverTime) - 1 AS VARCHAR) + '-' +
    CAST(DATEDIFF(HH, localTime, serverTime) AS VARCHAR) AS [Hours],
    COUNT(*) AS [Count]
FROM your_table
GROUP BY CAST(DATEDIFF(HH, localTime, serverTime) - 1 AS VARCHAR) + '-' +
         CAST(DATEDIFF(HH, localTime, serverTime) AS VARCHAR)
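Since the original question targets Athena (Presto SQL), a roughly equivalent sketch (assuming localtime and servertime are timestamp columns in a table named your_table, both hypothetical names) could use date_diff:
-- Athena/Presto sketch, hypothetical table and column names
SELECT CAST(diff_hours AS varchar) || '-' || CAST(diff_hours + 1 AS varchar) AS hours,
       COUNT(*) AS cnt
FROM (
    SELECT date_diff('hour', localtime, servertime) AS diff_hours
    FROM your_table
) AS t
GROUP BY diff_hours
ORDER BY diff_hours;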

Oracle
If you are using an Oracle database, you can use this statement (ServerTime minus LocalTime, floored to whole hours):
select CONCAT(CONCAT(diff_hours, '-'), diff_hours + 1) as Hours, count(diff_hours) as Count
from (select FLOOR(24 * (to_date(ServerTime, 'YYYY-MM-DD hh24:mi') - to_date(LocalTime, 'YYYY-MM-DD hh24:mi'))) diff_hours
      from T_TIMETABLE)
group by diff_hours
order by diff_hours;
Note: This will not display the empty intervals.
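If the empty intervals are needed as well, one option (a sketch only, assuming the lag never exceeds 24 hours) is to left join the measured lags onto a generated list of hour buckets:
-- Sketch: generate buckets 0..23 with CONNECT BY and count matching rows per bucket
SELECT b.h || '-' || (b.h + 1) AS Hours, COUNT(t.diff_hours) AS cnt
FROM (SELECT LEVEL - 1 AS h FROM dual CONNECT BY LEVEL <= 24) b
LEFT JOIN (SELECT FLOOR(24 * (to_date(ServerTime, 'YYYY-MM-DD hh24:mi')
                            - to_date(LocalTime, 'YYYY-MM-DD hh24:mi'))) AS diff_hours
           FROM T_TIMETABLE) t
  ON t.diff_hours = b.h
GROUP BY b.h
ORDER BY b.h;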

Related

How to query the time in unix epoch timestamp in aws athena

I have a simple table that contains node, message, starttime, and endtime, where starttime and endtime are unix timestamps. The query I am running is:
select node, message, (select from_unixtime(starttime)), (select from_unixtime(endtime)) from table1 WHERE try(select from_unixtime(starttime)) > to_iso8601(current_timestamp - interval '24' hour) limit 100
The query is not working and throws a syntax error.
I am trying to fetch the following information from the table:
query the table using start time and end time for past 'n' hours or 'n' days and get the output of starttime and endtime in human readable format
query the table using a specific date and time in human readable format
You don't need "extra" selects and you don't need to_iso8601 in the where clause:
WITH dataset AS (
    SELECT * FROM (VALUES
        (1627409073, 1627409074),
        (1627225824, 1627225826)
    ) AS t (starttime, endtime)
)
SELECT from_unixtime(starttime), from_unixtime(endtime)
FROM dataset
WHERE from_unixtime(starttime) > (current_timestamp - interval '24' hour)
LIMIT 100
Output:
_col0                      _col1
2021-07-27 18:04:33.000    2021-07-27 18:04:34.000
To search the last week you can use:
WHERE your_date >= to_unixtime(CAST(now() - interval '7' day AS timestamp))
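For the question's second requirement (querying by a specific date and time in human-readable form), a sketch along the same lines, assuming the same table1 with epoch-second columns, compares against a timestamp literal:
-- hypothetical date values; adjust the range as needed
SELECT node, message, from_unixtime(starttime) AS start_ts, from_unixtime(endtime) AS end_ts
FROM table1
WHERE from_unixtime(starttime) >= timestamp '2021-07-27 00:00:00'
  AND from_unixtime(starttime) < timestamp '2021-07-28 00:00:00'
LIMIT 100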

Subtracting current date from column to show in oracle apex classic report

I have a very simple question, but since I am not familiar with SQL or PL/SQL, I have no idea how to do it.
In my Oracle APEX Application, I am loading data from a table into a CLASSIC REPORT through setting Local Database/SQL Query as source.
I have to make 4 columns from the data of 2 columns stored in a table. I can load 3 without any issue using the simple statement below:
Select TaskName, DueDate, DueDate - 3 as ReminderDate
from table_name
The fourth column should be "RemainingDays", which equals DueDate minus the current date. I have tried writing DueDate - Sys_date and DueDate - current_date in the above statement to get the fourth column, but that is probably not the correct way, as I get an error instead of all 4 columns. (I am doing it in a basic Excel/DAX way.) Any help here?
When you subtract a date from another date, Oracle returns a number which is the number of days between the two dates.
One thing to note when using SYSDATE or CURRENT_DATE is that you may get different results if your user is not in the same timezone as the database. SYSDATE returns the current time of the database. CURRENT_DATE returns the current time of the user's session, whatever timezone they may be in.
If possible, try building the query in a tool such as SQL Developer, get it working there, then build your Classic Report in APEX. If you are still receiving an error, please share the error you are receiving as well as the query you are using.
Example
--Start of sample data
WITH
    t (task_name, due_date)
    AS
        (SELECT 'task1', DATE '2020-9-30' FROM DUAL
         UNION ALL
         SELECT 'task2', DATE '2020-9-28' FROM DUAL)
--End of sample data
SELECT task_name,
       due_date,
       due_date - 3 AS reminder_date,
       ROUND(due_date - SYSDATE, 2) AS days_remaining
FROM t;
Result
TASK_NAME    DUE_DATE     REMINDER_DATE    DAYS_REMAINING
_________    _________    _____________    ______________
task1        30-SEP-20    27-SEP-20                 13.66
task2        28-SEP-20    25-SEP-20                 11.66
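If whole days are preferred, or if the report's users sit in a different timezone than the database, a variant of the final SELECT against the same sample data (a sketch, not part of the original answer) could use TRUNC with CURRENT_DATE:
SELECT task_name,
       due_date,
       due_date - 3 AS reminder_date,
       TRUNC(due_date - CURRENT_DATE) AS days_remaining -- whole days, session timezone
FROM t;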

How to calculate accumulated time for a defined frequency?

I have rows containing descriptions of services that have been ordered by our customers.
Table:
OrderedServices
Columns:
Id (key)
CustomerId
ServiceId
StartDate
EndDate
AmountOfTimeOrdered (hours)
IntervalType (month, week or day)
Interval (integer)
An example:
1;24343;98;2020-01-20;2020-06-05;1.5;day;3
The above is read as "Customer w/ id 24343 has ordered service #98 to be executed for 1.5 hrs every 3rd day during the period 2020-01-20 up until 2020-06-05".
The first day of execution is always StartDate, so, in the given example, the service is first executed 2020-01-20, followed by 2020-01-23 (20+3), 2020-01-26, 2020-01-29 and so on.
Now I want to calculate the total amount of time executed for a given ServiceType for a given time period.
E.g. for 2020-01-01 - 2020-01-31: 4 x 1.5 = 6 hrs of total executed time for the above.
What I can’t figure out is how to create a measure, or a calculated table to achieve this.
Does anyone have an idea?
Kind regards,
Peter
Go to the query editor and use the following steps:
If your column looks like the one in your example, use Split Column by Delimiter as the first step.
After this, just add the following custom column:
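As a rough illustration of the counting arithmetic only (not the Power Query custom column itself), here is a sketch in the Athena/Presto SQL dialect used elsewhere on this page; it assumes day-based intervals and a reporting window that contains StartDate, so executions are counted from StartDate onward:
-- Illustration only: executions = elapsed days / interval + 1 (the StartDate run)
-- "interval" is quoted because INTERVAL is a reserved word in Presto
WITH params AS (
    SELECT date '2020-01-01' AS win_start, date '2020-01-31' AS win_end
)
SELECT s.ServiceId,
       SUM((date_diff('day', s.StartDate, least(s.EndDate, p.win_end)) / s."interval" + 1)
           * s.AmountOfTimeOrdered) AS executed_hours -- sample row: (11 / 3 + 1) * 1.5 = 6
FROM OrderedServices s
CROSS JOIN params p
WHERE s.IntervalType = 'day'
  AND s.StartDate BETWEEN p.win_start AND p.win_end
GROUP BY s.ServiceId;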

Query to calculate cost by month using AWS Athena querying

I have a table like below.
item_id    bill_start_date            bill_end_date            usage_amount    user_project
635212     2019-02-01 00:00:00.000    3/1/2019 00:00:00.000    13.345          IBM
I am trying to find usage_amount by each month and each project. The Amazon Athena query engine is based on Presto 0.172. Due to the limitations in Athena, it does not recognize queries like select sysdate from dual;.
I tried to convert bill_start_date and bill_end_date from timestamp to date but failed. Even current_date() didn't work in my case. I am able to calculate the total cost by hard-coding the values, but my end goal is to perform the calculation on the columns.
SELECT (FLOOR(SUM(usage_amount)*100)/100) AS total,
user_project
FROM test_table
WHERE bill_start_date
BETWEEN date '2019-02-01'
AND date '2019-03-01'
GROUP BY user_project;
In Presto, current_timestamp is a SQL standard function which does not use parentheses.
To group by month, I'd use date_trunc('month', bill_start_date).
All of these functions are documented here
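Putting the two suggestions together, a sketch of the monthly roll-up (assuming bill_start_date is a timestamp column, as the sample row suggests) might look like:
SELECT date_trunc('month', bill_start_date) AS bill_month,
       user_project,
       floor(sum(usage_amount) * 100) / 100 AS total
FROM test_table
GROUP BY date_trunc('month', bill_start_date), user_project
ORDER BY bill_month, user_project;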

SHOW PARTITIONS with order by in Amazon Athena

I have this query:
SHOW PARTITIONS tablename;
Result is:
dt=2018-01-12
dt=2018-01-20
dt=2018-05-21
dt=2018-04-07
dt=2018-01-03
This gives the list of partitions per table. The partition field for this table is dt, which is a date column. I want to see the partitions ordered.
The documentation doesn't explain how to do it:
https://docs.aws.amazon.com/athena/latest/ug/show-partitions.html
I tried to add order by:
SHOW PARTITIONS tablename order by dt;
But it gives:
AmazonAthena; Status Code: 400; Error Code: InvalidRequestException;
AWS currently (as of Nov 2020) supports two versions of the Athena engines. How one selects and orders partitions depends upon which version is used.
Version 1:
Use the information_schema table. Assuming you have year, month as partitions (with one partition key, this is of course simpler):
WITH
a AS (
    SELECT partition_number AS pn, partition_key AS key, partition_value AS val
    FROM information_schema.__internal_partitions__
    WHERE table_schema = 'my_database'
      AND table_name = 'my_table'
)
SELECT
    year, month
FROM (
    SELECT val AS year, pn FROM a WHERE key = 'year'
) y
JOIN (
    SELECT val AS month, pn FROM a WHERE key = 'month'
) m ON m.pn = y.pn
ORDER BY year, month
which outputs:
year    month
2018    10
2018    11
2018    12
2019    01
...
Version 2:
Use the built-in $partitions functionality, where the partitions are explicitly available as columns and the syntax is much simpler:
SELECT year, month FROM my_database."my_table$partitions" ORDER BY year, month
year    month
2018    10
2018    11
2018    12
2019    01
...
For more information, see:
https://docs.aws.amazon.com/athena/latest/ug/querying-glue-catalog.html#querying-glue-catalog-listing-partitions
From your comment it sounds like you're looking to sort the partitions as a way to figure out whether or not a specific partition exists. For this purpose I suggest you use the Glue API instead of querying Athena. Run aws glue get-partition help or check your preferred SDK's documentation for how it works.
There is also a variant to list all partitions of a table, run aws glue get-partitions help to read more about that. I don't think it returns the partitions in alphabetical order, but it has operators for filtering.
The SHOW PARTITIONS command will not allow you to order the result, since this command does not produce a resultset to sort. This command only produces a string output.
You can on the other hand query the partition column and then order the result by value.
select distinct dt from tablename order by dt asc;