How to increase your Quicksight SPICE data refresh frequency - amazon-web-services

Quicksight only supports 24 scheduled refreshes per 24 hours for a FULL REFRESH.
I want to refresh the data every 30 minutes.

Answer:
Scenario:
Let us say I want to fetch data from the source (Jira), push it to SPICE, and render it in Quicksight dashboards.
Requirement:
Push the data once every 30 minutes.
Quicksight supports the following:
Full refresh
Incremental refresh
Full refresh:
Process - Old data is replaced with new data.
Frequency - Once every hour
Refresh count - 24 per day
Incremental refresh:
Process - New data gets appended to the dataset.
Frequency - Once every 15 minutes
Refresh count - 96 per day
Issue:
We need to push the data once every 30 minutes.
It is going to be a FULL_REFRESH.
For a full refresh, Quicksight's schedule only supports an hourly frequency.
Solution:
We can leverage API support from AWS.
Package - Boto3 (AWS SDK for Python)
Class - QuickSight.Client
Method - create_ingestion
Process - You can manually refresh a dataset by starting a new SPICE ingestion.
Refresh cycle: Each 24-hour period is measured starting 24 hours before the current date and time.
Limitations:
Enterprise edition accounts: 32 times in a 24-hour period.
Standard edition accounts: 8 times in a 24-hour period.
Sample code:
Python - Boto3 for AWS:
import boto3

client = boto3.client('quicksight')

# Start a new SPICE ingestion (a manual refresh) for the dataset.
response = client.create_ingestion(
    AwsAccountId='string',         # your AWS account ID
    DataSetId='string',            # ID of the SPICE dataset to refresh
    IngestionId='string',          # a unique ID for this ingestion
    IngestionType='FULL_REFRESH'   # or 'INCREMENTAL_REFRESH'
)
awswrangler:
import awswrangler as wr

# awswrangler's quicksight module wraps the ingestion APIs as well;
# this call cancels an in-flight SPICE ingestion for the "jira_db" dataset.
wr.quicksight.cancel_ingestion(ingestion_id="jira_data_sample_refresh", dataset_name="jira_db")
CLI:
aws quicksight create-ingestion --data-set-id dataSetId --ingestion-id jira_data_sample_ingestion --aws-account-id AwsAccountId --region us-east-1
API:
PUT /accounts/AwsAccountId/data-sets/DataSetId/ingestions/IngestionId HTTP/1.1
Content-type: application/json
{
"IngestionType": "string"
}
Conclusion:
Using this approach we can achieve up to 56 full refreshes per day for our dataset (24 scheduled refreshes plus 32 API-triggered ingestions on Enterprise edition). We can also go one step further, identify the peak hours of our source tool (Jira), and concentrate the extra refreshes there; during those hours we can even achieve a refresh frequency of once every 10 minutes.
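To put this into practice, the script below is a minimal sketch of what could run on a 30-minute schedule (for example via cron or an EventBridge-triggered Lambda). The account ID, dataset ID, and ingestion ID prefix are placeholder values, not taken from any real setup; only create_ingestion and the FULL_REFRESH type come from the API described above.
import boto3
from datetime import datetime, timezone

ACCOUNT_ID = "123456789012"      # placeholder AWS account ID
DATASET_ID = "jira_dataset_id"   # placeholder SPICE dataset ID

def refresh_dataset():
    client = boto3.client("quicksight")
    # Each ingestion needs a unique ID; derive one from the current UTC timestamp.
    ingestion_id = "jira-full-refresh-" + datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    response = client.create_ingestion(
        AwsAccountId=ACCOUNT_ID,
        DataSetId=DATASET_ID,
        IngestionId=ingestion_id,
        IngestionType="FULL_REFRESH",
    )
    return response["IngestionStatus"]

if __name__ == "__main__":
    print(refresh_dataset())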
Ref:
Quicksight
Quicksight Gallery
SPICE
Boto - Python
Boto - Create Ingestion
AWS Wrangler
CLI
API

Related

GCP Datastore times out on large download

I'm using Objectify to access my GCP Datastore set of Entities. I have a full list of around 22000 items that I need to load into the frontend:
List<Record> recs = ofy().load().type(Record.class).order("-sync").list();
The number of records has recently increased and I get an error from the backend:
com.google.apphosting.runtime.HardDeadlineExceededError: This request (00000185caff7b0c) started at 2023/01/19 17:06:58.956 UTC and was still executing at 2023/01/19 17:08:02.545 UTC.
I thought that the move to Cloud Firestore in Datastore mode last year would have fixed this problem.
My only solution is to break down the load() into batches using 2 or 3 calls to my Ofy Service.
Is there a better way to grab all these Entities in one go?
Thanks
Tim

Export RDS query to S3 gives error after some time

I am trying to export data from a table in my PostgreSQL database to S3. When I execute the query, everything goes well and the data is exported correctly to S3, until suddenly, after about 16 hours, the query gives an error:
ERROR: could not upload to Amazon S3
DETAIL: Amazon S3 client returned 'Unable to parse ExceptionName: ExpiredToken Message: The provided token has expired.'.
CONTEXT: SQL function "query_export_to_s3" statement 1
What could be the problem? I thought that the token was renewed 5 minutes before its expiration.
UPDATE: The role we use to execute the query has a session duration of 12h
More updates: The query I am running migrates a large amount of data to S3, probably around 500 GB. I ran a separate query to verify the number of records; the total is 500 million, and that query took 4 hours to complete. I then ran the query that exports those 500 million records to S3, and after about 16 hours I got the message you see above.
In S3 the result was saved in parts of 6 GB.
We repeated the query that exports to S3 about 3 times, and the result is always the same: after about 16 hours I get the expired token error.
I'm running the query from an EC2 instance.
Please check the AWS authentication documentation:
The minimum session duration is 1 hour, and it can be set to a maximum of 12 hours.
Since your export runs for roughly 16 hours, it outlives even the maximum 12-hour role session, so the credentials expire before the export finishes.
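If you want to confirm what the role is currently configured with, here is a minimal sketch in Python/boto3; the role name is a placeholder, not taken from the question.
import boto3

iam = boto3.client("iam")

# Inspect the maximum session duration configured on the role used for the export.
# "export-to-s3-role" is a placeholder role name.
role = iam.get_role(RoleName="export-to-s3-role")["Role"]
print(role["MaxSessionDuration"])   # in seconds; the hard ceiling is 43200 (12 hours)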

Information of Informatica jobs for last 30 days

I want to get the date, time and sequence of all the Informatica jobs running on dev server for the last 30 days. How can we get it?
The easiest approach would be to query REP_SESS_LOG in the Informatica repository database (see the Repository Guide).
SELECT
    SUBJECT_AREA, WORKFLOW_NAME, MAPPING_NAME,
    SUCCESSFUL_SOURCE_ROWS, FAILED_SOURCE_ROWS,
    ACTUAL_START, SESSION_TIMESTAMP END_TIME
FROM REP_SESS_LOG
WHERE
    ACTUAL_START BETWEEN TO_DATE('01/JAN/2021','dd/mon/yyyy')
                     AND TO_DATE('31/JAN/2021','dd/mon/yyyy');

Remove a custom metric from AWS CloudWatch

I've successfully created a custom metric via the SDK, but I'm not able to remove it.
I can't find an option in the web console to remove it (nor can I find a method in the SDK to remove/cancel it).
// The code is not important; I've pasted it just to show that it works
IAmazonCloudWatch client = new AmazonCloudWatchClient(RegionEndpoint.EUWest1);

List<MetricDatum> data = new List<MetricDatum>();
data.Add(new MetricDatum()
{
    MetricName = "PagingFilePctUsage",
    Timestamp = DateTime.Now,
    Unit = StandardUnit.Percent,
    Value = percentPageFile.NextValue()
});
data.Add(new MetricDatum()
{
    MetricName = "PagingFilePctUsagePeak",
    Timestamp = DateTime.Now,
    Unit = StandardUnit.Percent,
    Value = peakPageFile.NextValue()
});

client.PutMetricData(new PutMetricDataRequest()
{
    MetricData = data,
    Namespace = "mycompany/myresources"
});
This created metrics in the "mycompany/myresources" namespace, but I can't remove them.
Amazon CloudWatch retains metrics for 15 months.
From Amazon CloudWatch FAQs - Amazon Web Services (AWS):
CloudWatch retains metric data as follows:
Data points with a period of less than 60 seconds are available for 3 hours. These data points are high-resolution custom metrics.
Data points with a period of 60 seconds (1 minute) are available for 15 days.
Data points with a period of 300 seconds (5 minutes) are available for 63 days.
Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months).
So, just pretend that your old metrics don't exist. Most graphs and alarms only look back 24 hours, so old metrics typically won't be noticed, aside from appearing as a name in the list of metrics.
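If you want to see what is still visible before it ages out, a minimal sketch in Python/boto3 (rather than the .NET SDK used above) that lists the metrics remaining in the custom namespace might look like this; the namespace is the one from the question.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="eu-west-1")

# List whatever metrics are still retained in the custom namespace.
# They cannot be deleted explicitly; they simply disappear from this
# listing once their retention window has passed.
paginator = cloudwatch.get_paginator("list_metrics")
for page in paginator.paginate(Namespace="mycompany/myresources"):
    for metric in page["Metrics"]:
        print(metric["MetricName"], metric["Dimensions"])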

AWS LastModified S3 Bucket different

I'm developing a node.js function that lists the objects in an S3 bucket via the listObjectsV2 call. In the returned JSON results, the date is not the same as the date shown in the S3 console, nor the one in an aws cli s3 listing. In fact, they are different days. I'm not sure how this is happening.
Any thoughts?
aws cli ls
aws s3 ls s3://mybucket
2018-11-08 19:38:55 24294 Thought1.mp3
S3 Page on AWS
JSON results
They are the same times, but in different timezones.
The listObjectsV2 response is giving you Zulu times (UTC or Greenwich Mean Time), which appears to be 6 hours ahead of you.
In the JSON picture you have 2018-11-09T01:38:55.000Z which is ZULU time (the Z at the very end). It means UTC/GMT time.
In the S3 console picture you have Nov 8, 2018 7:38:55 PM GMT-0600 - this time is GMT minus 6 hours (see GMT-0600 at the end), which is likely US Central Time or similar. The difference between the two is exactly 6 hours.
The output from aws CLI is probably on your local computer and shows local time in the 24H format without the timezone, so it is harder to see the reason, but it matches the S3 console time.
In general, AWS returns times in the UTC time zone. This is usually quite helpful once you start deploying across multiple time zones. On the other hand, it can become tricky if, for example, you run your code on an EC2 instance configured with a different timezone. So be careful when converting between local time and UTC - I would suggest using a library such as https://momentjs.com/, or you may create more problems for yourself.
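As a quick check, here is a minimal sketch (in Python rather than node.js) using the two timestamps from the question to show they are the same instant:
from datetime import datetime, timedelta, timezone

# The listObjectsV2 value is UTC ("Z"); the S3 console showed GMT-0600.
api_time = datetime.fromisoformat("2018-11-09T01:38:55+00:00")   # from the JSON results
console_tz = timezone(timedelta(hours=-6))                        # the GMT-0600 offset from the console
print(api_time.astimezone(console_tz))                            # -> 2018-11-08 19:38:55-06:00, i.e. Nov 8, 7:38:55 PM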