Copy timestamp from AWS IoT rule to Amazon Redshift table column - amazon-web-services

My current IoT design is: IoT > rule > Kinesis Firehose > Redshift.
My IoT rule is:
SELECT *, timestamp() AS timestamp FROM 'topic/#'
I get a JSON message like the one below:
{
"deviceID": "device6",
"timestamp": 1480926222159
}
In my Redshift table I have a column eventtime of type TIMESTAMP.
Now I want to store the JSON timestamp value in the eventtime column, but it gives me an error because it needs
TIMEFORMAT AS 'MM.DD.YYYY HH:MI:SS'
for the timestamp. So how do I convert the IoT rule's timestamp to a Redshift timestamp?

I could not find a direct way to convert an epoch value while inserting it into a Redshift column of the TIMESTAMP data type.
Instead, I created a column with the BIGINT data type and insert the epoch value directly into that column.
After that I use QuickSight for analytics, so I can edit my dataset, create a new calculated field for this column, and use the QuickSight function below:
epochDate(epoch_date)
which converts the epoch value to a timestamp field.
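In SQL, one can use a similar expression. The interval below assumes the column holds seconds; a 13-digit value such as the 1480926222159 in the question is in milliseconds, so divide it by 1000 first:
SELECT
(TIMESTAMP 'epoch' + myunixtimeclm * INTERVAL '1 second')
AS mytimestamp
FROM
example_table;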
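The same epoch arithmetic can be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function name `epoch_millis_to_timestamp` is hypothetical, and it assumes the value is milliseconds since 1970-01-01 UTC, as the 13-digit sample in the question suggests.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_timestamp(epoch_millis: int) -> str:
    """Format epoch milliseconds as 'YYYY-MM-DD HH:MM:SS.mmm' in UTC.

    Integer milliseconds are added with timedelta to avoid float rounding.
    """
    moment = EPOCH + timedelta(milliseconds=epoch_millis)
    # %f yields microseconds (6 digits); drop the last 3 to keep milliseconds.
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


if __name__ == "__main__":
    # Sample value taken from the JSON message in the question.
    print(epoch_millis_to_timestamp(1480926222159))
```

A string in this shape could be written to the stream instead of the raw number if the conversion is done before the data reaches Redshift; whether that fits the pipeline is a design choice the original answer does not cover.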

Related

AWS Athena: Partition projection using date-hour with mixed ranges

I am trying to create an Athena table using partition projection. I am delivering records to S3 using Kinesis Firehose, grouped using a dynamic partitioning key. For example, the records look like the following:
period        | item_id
==============================
2022/05       | monthly_item_1
2022/05/04    | daily_item_1
2022/05/04/02 | hourly_item_1
2022/06       | monthly_item_2
I want to partition the data in S3 by period, which can be monthly, daily or hourly. It is guaranteed that period would be in a supported Java date format. Therefore, I am writing these records to S3 in the below format:
s3://bucket/prefix/2022/05/monthly_items.gz
s3://bucket/prefix/2022/05/04/daily_items.gz
s3://bucket/prefix/2022/05/04/02/hourly_items.gz
s3://bucket/prefix/2022/06/monthly_items.gz
I want to run Athena queries for every partition scope, i.e. if my query is for a specific day, I want to fetch its daily_items and hourly_items. If I am running a query for a month, I want to fetch its monthly, daily, and hourly items.
I've created an Athena table using the query below:
create external table `my_table`(
`period` string COMMENT 'from deserializer',
`item_id` string COMMENT 'from deserializer')
PARTITIONED BY (
`year` string,
`month` string,
`day` string,
`hour` string)
ROW FORMAT SERDE
'org.openx.data.jsonserde.JsonSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION
's3://bucket/prefix/'
TBLPROPERTIES (
'projection.enabled'='true',
'projection.day.type'='integer',
'projection.day.digits' = '2',
'projection.day.range'='01,31',
'projection.hour.type'='integer',
'projection.hour.digits' = '2',
'projection.hour.range'='00,23',
'projection.month.type'='integer',
'projection.month.digits'='02',
'projection.month.range'='01,12',
'projection.year.format'='yyyy',
'projection.year.range'='2022,NOW',
'projection.year.type'='date',
'storage.location.template'='s3://bucket/prefix/${year}/${month}/${day}/${hour}')
However, running the query below against this table returns zero results:
select * from my_table where year = '2022' and month = '06';
I believe the reason is Athena expects all files to be present under the same prefix as defined by storage.location.template. Therefore, any records present under a month or day prefix are not projected.
I was wondering if it was possible to support such querying functionality in a single table with partition projection enabled, when data in S3 is in a folder type structure similar to the examples above.
Would be great if anyone can help me out!
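The suspicion above can be illustrated with a simplified model of how a location template expands. This is only a sketch under stated assumptions: it is not Athena's implementation, the helper names are hypothetical, and it assumes an object is visible to a partition only when its key sits under that partition's fully expanded location.

```python
from string import Template

# Same template as 'storage.location.template' in the table definition.
LOCATION_TEMPLATE = Template("s3://bucket/prefix/${year}/${month}/${day}/${hour}")


def projected_locations(year: str, month: str) -> list[str]:
    """Expand the template for every day (01-31) and hour (00-23) of one month."""
    return [
        LOCATION_TEMPLATE.substitute(
            year=year, month=month, day=f"{day:02d}", hour=f"{hour:02d}"
        )
        for day in range(1, 32)
        for hour in range(0, 24)
    ]


def is_under_projected_location(object_url: str, locations: list[str]) -> bool:
    """True if the object lies below at least one expanded partition location."""
    return any(object_url.startswith(location + "/") for location in locations)


if __name__ == "__main__":
    june = projected_locations("2022", "06")
    may = projected_locations("2022", "05")
    # Month-level and day-level files are not below any year/month/day/hour location.
    print(is_under_projected_location("s3://bucket/prefix/2022/06/monthly_items.gz", june))
    print(is_under_projected_location("s3://bucket/prefix/2022/05/04/daily_items.gz", may))
    # Only the hour-level file matches an expanded location.
    print(is_under_projected_location("s3://bucket/prefix/2022/05/04/02/hourly_items.gz", may))
```

Under this model every projected location ends in an hour segment, so files written directly under a month or day prefix fall outside all of them, which is consistent with the empty result for month 06.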

How to avoid error "Cannot insert rows out of order" in QuestDB?

I'm trying to migrate data to QuestDB and insert historical records. I created the table as
create table records(
type INT,
interval INT,
timestamp TIMESTAMP,
name STRING) timestamp(timestamp)
and insert data from a CSV file by uploading it with curl.
I get back the error "Cannot insert rows out of order". I read that out-of-order inserts are supported in QuestDB, but somehow I cannot make them work.

You can insert rows out of order on partitioned tables only. Create a new partitioned table and copy the data into it:
create table records2(
type INT,
interval INT,
timestamp TIMESTAMP,
name STRING
)
timestamp(timestamp) partition by DAY;
insert into records2
select * from records;
drop table records;
rename table records2 to records;
After this you'll be able to insert out of order into the table records.
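Before uploading, it can help to confirm that the CSV really contains rows that arrive out of order. The sketch below is an illustration only and is not taken from the QuestDB documentation; the function name is hypothetical, and it assumes a header row with a `timestamp` column holding ISO 8601 values without a timezone suffix.

```python
import csv
import io
from datetime import datetime


def count_out_of_order(csv_text: str, column: str = "timestamp") -> int:
    """Count rows whose timestamp is earlier than the latest one seen so far."""
    latest = None
    out_of_order = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        current = datetime.fromisoformat(row[column])
        if latest is not None and current < latest:
            out_of_order += 1
        else:
            latest = current
    return out_of_order


if __name__ == "__main__":
    sample = (
        "type,interval,timestamp,name\n"
        "1,10,2021-01-01T00:00:00,a\n"
        "1,10,2021-01-03T00:00:00,b\n"
        "1,10,2021-01-02T00:00:00,c\n"
    )
    print(count_out_of_order(sample))
```

A non-zero count means the file depends on out-of-order support, so the partitioned table described above would be needed.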

aws athena query result in json format

I created an AWS Athena table that contains some rows.
example of data:
first_name | age
=================
a 20
b 30
c 35
When I query the data, the results are saved in CSV format in S3.
SELECT * FROM table1
I would like to query the data and get the result in JSON format.
The reason is that I should transfer that JSON data to another application for another process.
Is there a way to get query result in JSON format?
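One possible workaround is to post-process the CSV result rather than change what Athena writes. The sketch below only shows that conversion step; it is not an Athena feature, the function name is hypothetical, and it assumes the CSV text has already been downloaded from S3 and has a header row.

```python
import csv
import io
import json


def csv_result_to_json(csv_text: str) -> str:
    """Turn CSV text with a header row into a JSON array of objects.

    All values stay strings, because CSV carries no type information.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows)


if __name__ == "__main__":
    # Sample rows mirroring the table in the question.
    sample = '"first_name","age"\n"a","20"\n"b","30"\n"c","35"\n'
    print(csv_result_to_json(sample))
```

If numeric fields such as age must be numbers in the JSON, they would need an explicit conversion before calling `json.dumps`.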

Snowflake date column have incorrect date from AVRO file

I have a Hive external table created in Avro format, with the data stored in an S3 location. Now I am creating a Snowflake table on the same Avro data files stored in S3, but I have an issue with the date column. The date does not come through correctly, although string and int data come through correctly in the Snowflake table.
Data in the Hive table (col1 has the timestamp data type in the Hive table):
col1
2021-02-04 10:02:31
Data in the Snowflake table:
col1
53066-07-15 12:56:40.000
SQL to create the Snowflake table:
CREATE OR REPLACE EXTERNAL TABLE test1
(
col1 timestamp as (value:col1::timestamp)
)
WITH LOCATION = #S3_location/folder/
AUTO_REFRESH = TRUE
FILE_FORMAT = 'AVRO';
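A year near 53066 is what you would expect if a millisecond epoch value were read as seconds. The source does not confirm this cause, so the sketch below is only an arithmetic check of that hypothesis. The function name is hypothetical, the Hive value is assumed to be UTC, and the year is approximated with an average year length. A whole-hour timezone offset would shift the approximate date but not the minutes and seconds.

```python
from datetime import datetime, timezone

# Average Gregorian year in seconds, used only for a rough year estimate.
SECONDS_PER_YEAR = 365.2425 * 86400


def misread_millis_as_seconds(timestamp_text: str):
    """Show what an epoch-milliseconds value looks like when read as seconds."""
    moment = datetime.strptime(timestamp_text, "%Y-%m-%d %H:%M:%S")
    moment = moment.replace(tzinfo=timezone.utc)  # assumption: value is UTC
    epoch_millis = int(moment.timestamp()) * 1000
    # Treat the millisecond count as if it were a count of seconds.
    wrong_seconds = epoch_millis
    approx_year = 1970 + wrong_seconds / SECONDS_PER_YEAR
    minute, second = divmod(wrong_seconds % 3600, 60)
    return epoch_millis, approx_year, minute, second


if __name__ == "__main__":
    millis, year, minute, second = misread_millis_as_seconds("2021-02-04 10:02:31")
    print(millis, round(year), f"{minute:02d}:{second:02d}")
```

The minutes and seconds come out as 56:40, matching the 12:56:40 seen in Snowflake, and the estimated year lands around 53066. If this is indeed the cause, scaling the value down to seconds before casting it to a timestamp would be the thing to try.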

How to calculate gap between 2 timestamps (edited for AWS Athena )

I have many IoT devices that send data to my Amazon Athena server. I created a table to store the data, and the table contains two columns: LocalTime, the time at which the IoT device captured its status, and ServerTime, the time at which the data arrived at the server (sometimes the IoT device doesn't have a network connection).
I would like to count the "gaps" in blocks of hours (let's say 1 hour) in order to know the deviation in data arrival, for example:
the result that I would like to get is:
To get the result, I want to calculate how many hours passed between ServerTime and LocalTime.
So the first entry (1.1.2019 12:15 - 1.1.2019 10:25) = 1-2 hours.
Thanks

If your database is MSSQL Server, you can try the script below to get the desired output:
SELECT
CAST(DATEDIFF(HH,localTime,serverTime)-1 AS VARCHAR) +'-'+
CAST(DATEDIFF(HH,localTime,serverTime) AS VARCHAR) [Hours],
COUNT(*) [Count]
FROM your_table
GROUP BY CAST(DATEDIFF(HH,localTime,serverTime)-1 AS VARCHAR) +'-'+
CAST(DATEDIFF(HH,localTime,serverTime) AS VARCHAR)

Oracle
If you are using an Oracle database, you can use this statement:
select CONCAT(CONCAT(diff_hours, '-'), diff_hours + 1) as Hours, count(diff_hours) as Count
from (select floor(24 * (to_date(ServerTime, 'YYYY-MM-DD hh24:mi') - to_date(LocalTime, 'YYYY-MM-DD hh24:mi'))) diff_hours from T_TIMETABLE)
group by diff_hours
order by diff_hours;
Note: This will not display the empty intervals.
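The bucketing described in the question can be written down independently of any SQL dialect. The following is a minimal Python sketch of that logic, not an Athena query; the function names are hypothetical, and the `day.month.year hour:minute` format is an assumption based on the example 1.1.2019 12:15.

```python
from collections import Counter
from datetime import datetime

# Assumed input format, taken from the example in the question.
TIME_FORMAT = "%d.%m.%Y %H:%M"


def gap_bucket(local_time: str, server_time: str) -> str:
    """Return the one-hour bucket ('1-2', '0-1', ...) for the arrival delay."""
    delay = datetime.strptime(server_time, TIME_FORMAT) - datetime.strptime(
        local_time, TIME_FORMAT
    )
    # Whole hours elapsed, rounded down.
    hours = int(delay.total_seconds() // 3600)
    return f"{hours}-{hours + 1}"


def count_gaps(rows) -> Counter:
    """Count how many (local_time, server_time) pairs fall into each bucket."""
    return Counter(gap_bucket(local, server) for local, server in rows)


if __name__ == "__main__":
    rows = [
        ("1.1.2019 10:25", "1.1.2019 12:15"),
        ("1.1.2019 10:00", "1.1.2019 10:30"),
        ("1.1.2019 09:00", "1.1.2019 10:45"),
    ]
    print(count_gaps(rows))
```

Like the SQL statements above, this counts only buckets that actually occur, so empty intervals do not appear in the result.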