My ultimate goal is to build a workflow whereby we can collect QuickSight-related events and then visualize them in QuickSight itself (essentially to see dashboard/user usage). This is all helpfully described on the AWS Blog (https://aws.amazon.com/blogs/big-data/building-an-administrative-console-in-amazon-quicksight-to-analyze-usage-metrics/)
My question is about how I can create a trail that streams only QuickSight-related events to S3. If I create a trail and select data events, I get all kinds of events that I don't care about. Is there a way to create a trail for just the QuickSight service, or any alternative route that doesn't clog up S3 with logs I don't need?
You can probably configure the CloudTrail data event source to be QuickSight.
See "Logging data events with basic event selectors in the AWS Management Console" in this guide:
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/logging-data-events-with-cloudtrail.html
Either create a trail with a custom event selector using the CLI; try the put-event-selectors command to specify the event source as QuickSight.
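A sketch of that CLI route, assuming a trail named quicksight-audit already exists (the trail name and selector name are placeholders). Note that CloudTrail may reject eventSource filters on management events, in which case the filtering down to QuickSight has to happen downstream instead:

```shell
# Hypothetical advanced event selector keeping only QuickSight events.
cat > selectors.json <<'EOF'
[
  {
    "Name": "QuickSight events only",
    "FieldSelectors": [
      { "Field": "eventCategory", "Equals": ["Management"] },
      { "Field": "eventSource", "Equals": ["quicksight.amazonaws.com"] }
    ]
  }
]
EOF

# Sanity-check the JSON before handing it to CloudTrail.
python3 -m json.tool selectors.json > /dev/null && echo "selectors.json OK"

# Apply it (requires credentials, so commented out here):
# aws cloudtrail put-event-selectors \
#   --trail-name quicksight-audit \
#   --advanced-event-selectors file://selectors.json
```

If CloudTrail rejects the eventSource filter for management events, the Athena approach below still works against an unfiltered trail.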
Or you can try querying AWS CloudTrail logs via Athena: https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html
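For the Athena route, a query along these lines filters the trail down to QuickSight after the fact. The table name cloudtrail_logs is a placeholder; you would create it over your trail's S3 bucket per the Athena guide above:

```shell
# Query text only; the table must be created in Athena first.
QUERY="SELECT eventtime, eventname, useridentity.username
FROM cloudtrail_logs
WHERE eventsource = 'quicksight.amazonaws.com'
ORDER BY eventtime DESC"
echo "$QUERY"
# Run with e.g.: aws athena start-query-execution --query-string "$QUERY" ...
```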
Related
I have a CloudWatch Log Group, this log group continuously receives logging information from my AWS services.
I want to extract some of the logging information from this log-group and want to store that data into S3 in some format (CSV, PARQUET).
I will then use Athena to query this logging data.
I want some sort of automatic mechanism to send these logs continuously to S3.
Can anyone suggest a solution for this?
It looks like Athena is able to communicate directly with CloudWatch, as shown here. I'm not sure how performant this is or how costly it turns out to be.
The other option is to configure CloudWatch Logs to send data to Firehose via subscription filters, which then delivers it to S3.
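A sketch of that wiring, with all names and ARNs as placeholders; the Firehose delivery stream (writing to your S3 bucket) and the IAM role that lets CloudWatch Logs put records into it must already exist:

```shell
# Placeholders throughout.
LOG_GROUP="/my/app/logs"
FIREHOSE_ARN="arn:aws:firehose:us-east-1:123456789012:deliverystream/logs-to-s3"
ROLE_ARN="arn:aws:iam::123456789012:role/CWLtoFirehoseRole"

# An empty filter pattern forwards every log event in the group.
CMD="aws logs put-subscription-filter \
  --log-group-name $LOG_GROUP \
  --filter-name all-to-s3 \
  --filter-pattern '' \
  --destination-arn $FIREHOSE_ARN \
  --role-arn $ROLE_ARN"
echo "$CMD"   # printed rather than executed; requires credentials
```

You can then point Athena at the Firehose destination prefix in S3.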
I have a DynamoDB table, let's say sampleTable. I want to find out how many times this table has been accessed from the CLI. How do I check this?
PS: I have checked the metrics but couldn't find any particular metric which gives this information.
There is no CloudWatch metric to monitor API calls to DynamoDB.
However, there is CloudTrail (CT). Thus you can go to CT's event history and look for API calls to DynamoDB from the last 90 days. You can export the history to a CSV file and investigate offline as well.
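The event history can also be queried from the CLI. EventSource is a real lookup attribute key; everything else here is illustrative:

```shell
# Look up recent DynamoDB API calls in CloudTrail event history.
# Printed rather than executed, since it needs AWS credentials.
CMD="aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventSource,AttributeValue=dynamodb.amazonaws.com \
  --max-results 50"
echo "$CMD"
```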
For ongoing monitoring of the API calls, you can enable a CT trail which will store event log details in S3 for as long as you require:
Logging DynamoDB Operations by Using AWS CloudTrail
If you have the trail created, you can use Amazon Athena to query the log data for the statistics of interest, such as the number of specific API calls to DynamoDB:
Querying AWS CloudTrail Logs
Also, you could create a custom metric based on the trail's log data (once you configure CloudWatch Logs for the trail):
Analyzing AWS CloudTrail in Amazon CloudWatch
However, I don't think you can reliably differentiate between API calls made using the CLI, the SDK, or other means, although the userAgent field in each CloudTrail event (which typically starts with aws-cli/ for CLI calls) can serve as an approximation.
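As a sketch of what such an Athena query could look like, counting calls against the table: the table name cloudtrail_logs is a placeholder, and the userAgent match is only an approximation of "from the CLI":

```shell
# Query text only; create the Athena table over the trail's bucket first.
QUERY="SELECT eventname, count(*) AS calls
FROM cloudtrail_logs
WHERE eventsource = 'dynamodb.amazonaws.com'
  AND requestparameters LIKE '%sampleTable%'
  AND useragent LIKE 'aws-cli%'
GROUP BY eventname
ORDER BY calls DESC"
echo "$QUERY"
```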
I have a use case where I want to sync two AWS Glue Data Catalogs residing in different accounts.
Does Glue emit notifications which can be published when a database/table/partition is created/deleted? Or is there some other way of knowing what is happening in the other Glue Data Catalog?
One way is to listen to CloudWatch notifications from that Glue account, but according to the docs, CloudWatch notifications are not reliable:
https://docs.aws.amazon.com/glue/latest/dg/automating-awsglue-with-cloudwatch-events.html
AWS provides open-source scripts for that purpose; see here.
I'm not sure how reliable and fast they are, but they're worth trying.
We have a team-shared AWS account, where things are sometimes hard to debug. In particular, throttling happens regularly for the EMR APIs, so it would be nice to have CloudTrail logs tell us who is not being nice when using EMR. I think our CloudTrail logging is enabled, since I can see these API events with EMR as the event source:
AddJobFlowSteps
RunJobFlow
TerminateJobFlows
I'm pretty sure that I'm calling DescribeCluster plenty of times and have caused some throttling, but I'm not sure why those calls are not showing up in my CloudTrail logs...
Can someone help me understand:
Is there an additional setting needed for the DescribeCluster EMR API in order to log events to CloudTrail?
And what about other EMR APIs? Can they be configured to log events to CloudTrail without explicitly writing to CloudTrail via the SDK?
I have read these articles, and it feels like much can be done in CloudTrail...
https://docs.aws.amazon.com/emr/latest/ManagementGuide/logging_emr_api_calls.html
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/logging-management-and-data-events-with-cloudtrail.html#logging-management-events-with-the-cloudtrail-console
https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-supported-services.html
Appreciate any help!
A quick summary of AWS CloudTrail:
The events recorded by AWS CloudTrail are of two types: management events and data events.
Management events include actions like stopping an instance, deleting a bucket, etc.
Data events are only available for two services (S3 and Lambda) and include actions like reading object 'abc.txt' from an S3 bucket.
Under management events, we again have 4 types:
Write-only
Read-only
All (both reads and writes)
None
The DescribeCluster event that you are looking for falls under the 'Read-only' management event type.
Please ensure that you have selected the "All" or "ReadOnly" management event type in your CloudTrail trail.
Selecting "WriteOnly" as the management event type in your CloudTrail trail will not record DescribeCluster.
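You can check what your trail currently records by inspecting its event selectors (the trail name here is a placeholder):

```shell
# Printed rather than executed; requires credentials.
CMD="aws cloudtrail get-event-selectors --trail-name my-trail"
echo "$CMD"
# In the output, look for "ReadWriteType": "All" or "ReadOnly";
# "WriteOnly" means DescribeCluster will not be recorded.
```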
There is no other AWS-service-specific setting that you can enable in CloudTrail.
Also note that the 'Event history' tab in the AWS CloudTrail console records all types of events (including read-only) for a period of 90 days. You can see the DescribeCluster event there too.
As per AWS docs, there's no Redshift-Lambda integration yet.
What we would like to do is monitor Redshift activity in order to do something when a Redshift table is created, a COPY from S3 is made, or a bulk insert is performed.
Is there a way to register this kind of activity, and then do something like run a Lambda function in order to run a small script or so?
Redshift provides an event notification mechanism. You can find a full list of the event categories and messages here. If that covers the kind of information you are interested in you can simply have your Lambda function add the SNS topic used by Redshift for event notification as an event source and your Lambda function will get called every time an event is sent by Redshift.
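A sketch of that wiring, with placeholder ARNs; you also need to grant SNS permission to invoke the function (via lambda add-permission) before the subscription delivers anything:

```shell
# Placeholders; the topic must be the one Redshift publishes events to.
TOPIC_ARN="arn:aws:sns:us-east-1:123456789012:redshift-events"
FUNCTION_ARN="arn:aws:lambda:us-east-1:123456789012:function:on-redshift-event"

CMD="aws sns subscribe \
  --topic-arn $TOPIC_ARN \
  --protocol lambda \
  --notification-endpoint $FUNCTION_ARN"
echo "$CMD"   # printed rather than executed; requires credentials
```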
You can enable audit logs that end up in S3.
All the info you want is also available in various system tables with prefixes like stl_, stv_, and pg_. For example, COPY commands from S3 are recorded in stl_load_commits, and stl_utilitytext has info on non-SELECT queries like CREATE.
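For example, recent S3 loads can be pulled from stl_load_commits; the columns selected here are standard, but the exact shape of the query is up to you:

```shell
# Query text only; run it against Redshift with your SQL client of choice.
QUERY="SELECT query, filename, lines_scanned, curtime
FROM stl_load_commits
ORDER BY curtime DESC
LIMIT 20"
echo "$QUERY"
```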
As for triggering events, you could have S3 trigger a Lambda when one of the log files lands, or run occasional jobs (with something like cron or Airflow) that query the system tables and take action.