Export DynamoDB metrics logs to S3 or CloudWatch

I'm trying to use DynamoDB metrics logs in an external observability tool.
To do that, I'll need to get this log data from S3 or CloudWatch log groups (not from Insights or CloudTrail).
For this reason, if there isn't a way to use CloudWatch, I'll need to export these metric logs from DynamoDB to S3, and from there either export them to CloudWatch or read the data directly from S3.
Do you know if this is possible?

You could try using Logstash; it has input plugins for CloudWatch and S3:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-cloudwatch.html
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-s3.html

AWS publishes DynamoDB metrics (at the table-operation, table, and account level) to CloudWatch metrics. You can also create as many custom metrics as you need. If you use Python, you can read them with boto3. The CloudWatch client has this method:
get_metric_data
Try this with your metrics:
import boto3
from datetime import date, datetime, timedelta

cloudwatch_client = boto3.client('cloudwatch')

yesterday = date.today() - timedelta(days=1)
today = date.today()

response = cloudwatch_client.get_metric_data(
    MetricDataQueries=[
        {
            'Id': 'some_request',
            'MetricStat': {
                'Metric': {
                    # DynamoDB metrics are published in the AWS/DynamoDB namespace
                    'Namespace': 'AWS/DynamoDB',
                    # replace with a real metric, e.g. ConsumedReadCapacityUnits
                    'MetricName': 'metric_name',
                    # e.g. [{'Name': 'TableName', 'Value': 'my_table'}]
                    'Dimensions': []
                },
                'Period': 3600,
                'Stat': 'Sum',
            }
        },
    ],
    StartTime=datetime(yesterday.year, yesterday.month, yesterday.day),
    EndTime=datetime(today.year, today.month, today.day),
)
print(response)
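Since the question asks about landing this data in S3, here is a minimal sketch of flattening a get_metric_data response into JSON lines that could then be uploaded. It assumes the documented response shape (MetricDataResults entries with parallel Timestamps and Values lists). The function names metric_results_to_jsonl and upload_to_s3, the sample values, and the bucket and key arguments are hypothetical and not part of the original answer.

```python
import json
from datetime import datetime, timezone


def metric_results_to_jsonl(response):
    """Flatten a get_metric_data response into one JSON document per datapoint."""
    lines = []
    for result in response.get("MetricDataResults", []):
        # Timestamps and Values are parallel lists in the response.
        for ts, value in zip(result["Timestamps"], result["Values"]):
            lines.append(json.dumps({
                "id": result["Id"],
                "label": result.get("Label"),
                "timestamp": ts.isoformat(),
                "value": value,
            }))
    return "\n".join(lines)


def upload_to_s3(body, bucket, key):
    """Upload the JSON-lines text to S3 (bucket and key are placeholders)."""
    import boto3  # imported lazily so the formatter above works without boto3
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))


# Made-up sample in the shape returned by get_metric_data.
sample = {
    "MetricDataResults": [
        {
            "Id": "some_request",
            "Label": "ConsumedReadCapacityUnits",
            "Timestamps": [
                datetime(2024, 1, 1, 0, tzinfo=timezone.utc),
                datetime(2024, 1, 1, 1, tzinfo=timezone.utc),
            ],
            "Values": [5.0, 7.0],
        }
    ]
}
print(metric_results_to_jsonl(sample))
```

An external tool (or the Logstash S3 input mentioned above) could then read the uploaded object; whether JSON lines is the right format depends on the tool.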

Related

Creating a CloudWatch Metrics from the Athena Query results

My Requirement
I want to create a CloudWatch-Metric from Athena query results.
Example
I want to create a metric like user_count of each day.
In Athena, I will write an SQL query like this
select date,count(distinct user) as count from users_table group by 1
In the Athena editor I can see the result, but I want to see these results as a metric in Cloudwatch.
CloudWatch-Metric-Name ==> user_count
Dimensions ==> Date,count
If I have this CloudWatch metric and these dimensions, I can easily create a monitoring dashboard and send alerts.
Can anyone suggest a way to do this?
You can use CloudWatch custom widgets, see "Run Amazon Athena queries" in Samples.
It's somewhat involved, but you can use a Lambda for this. In a nutshell:
1. Set up your query in Athena and make sure it works using the Athena console.
2. Create a Lambda that:
   - Runs your Athena query
   - Pulls the query results from S3
   - Parses the query results
   - Sends the query results to CloudWatch as a metric
3. Use EventBridge to run your Lambda on a recurring basis
Here's an example Lambda function in Python that does step #2. Note that the Lambda function will need IAM permissions to run queries in Athena, read the results from S3, and then put a metric into CloudWatch.
import time
import boto3

query = 'select count(*) from mytable'
DATABASE = 'default'
bucket = 'BUCKET_NAME'
path = 'yourpath'

def lambda_handler(event, context):
    # Run query in Athena
    client = boto3.client('athena')
    output = "s3://{}/{}".format(bucket, path)
    # Execution
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': DATABASE
        },
        ResultConfiguration={
            'OutputLocation': output,
        }
    )
    # S3 file name uses the QueryExecutionId so
    # grab it here so we can pull the S3 file.
    qeid = response["QueryExecutionId"]
    # Occasionally Athena hasn't written the file
    # before the Lambda tries to pull it out of S3, so pause a few seconds.
    # Note: You are charged for time the Lambda is running.
    # A more elegant but more complicated solution would try to get the
    # file first then sleep.
    time.sleep(3)
    ###### Get query result from S3.
    s3 = boto3.client('s3')
    objectkey = path + "/" + qeid + ".csv"
    # load object as file
    file_content = s3.get_object(
        Bucket=bucket,
        Key=objectkey)["Body"].read()
    # split file into lines
    lines = file_content.decode().splitlines()
    # get the second line in file (the first line is the CSV header)
    count = lines[1]
    # remove double quotes
    count = count.replace("\"", "")
    # convert string to int since CloudWatch wants a numeric value
    count = int(count)
    # post query results as a CloudWatch metric
    cloudwatch = boto3.client('cloudwatch')
    response = cloudwatch.put_metric_data(
        MetricData=[
            {
                'MetricName': 'MyMetric',
                'Dimensions': [
                    {
                        'Name': 'DIM1',
                        'Value': 'dim1'
                    },
                ],
                'Unit': 'None',
                'Value': count
            },
        ],
        Namespace='MyMetricNS'
    )
    return response
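The code comments above note that a fixed sleep is a shortcut. One possible alternative, sketched here under assumptions, is to poll the query status with Athena's get_query_execution until it reaches a terminal state before reading the result file. The helper name wait_for_query and its delay and max_attempts parameters are hypothetical; the client is passed in as an argument so the logic can be tried without AWS access.

```python
import time


def wait_for_query(athena, query_execution_id, delay=1.0, max_attempts=30):
    """Poll Athena until the query reaches a terminal state and return that state.

    `athena` is expected to be a boto3 Athena client (or any object with a
    compatible get_query_execution method).
    """
    for _ in range(max_attempts):
        status = athena.get_query_execution(QueryExecutionId=query_execution_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(delay)  # query is still QUEUED or RUNNING
    raise TimeoutError(f"Query {query_execution_id} did not finish in time")
```

In the Lambda above, a call such as wait_for_query(client, qeid) could stand in for time.sleep(3), with the S3 read performed only when the returned state is SUCCEEDED.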

AWS Lambda - Copy monthly snapshots to another region

I am trying to run a lambda that will kick off on a schedule to copy all snapshots taken the day prior to another region for DR purposes. I have a bit of code but it seems to not work as intended.
Symptoms:
It's grabbing the same snapshots multiple times and copying them
It always errors out on 2 particular snapshots, I don't know enough coding to write a log to figure out why. These snapshots work if I manually copy them though.
import boto3
from datetime import date, timedelta

SOURCE_REGION = 'us-east-1'
DEST_REGION = 'us-west-2'

ec2_source = boto3.client('ec2', region_name=SOURCE_REGION)
ec2_destination = boto3.client('ec2', region_name=DEST_REGION)

snaps = ec2_source.describe_snapshots(OwnerIds=['self'])['Snapshots']
yesterday = date.today() - timedelta(days=1)
yesterday_snaps = [s for s in snaps if s['StartTime'].date() == yesterday]

for yester_snap in yesterday_snaps:
    DestinationSnapshot = ec2_destination.copy_snapshot(
        SourceSnapshotId=yester_snap['SnapshotId'],
        SourceRegion=SOURCE_REGION,
        Encrypted=True,
        KmsKeyId='REMOVED FOR SECURITY',
        DryRun=False
    )
    DestinationSnapshotID = DestinationSnapshot['SnapshotId']
    ec2_destination.create_tags(
        Resources=[DestinationSnapshotID],
        Tags=yester_snap['Tags']
    )
    waiter = ec2_destination.get_waiter('snapshot_completed')
    waiter.wait(
        SnapshotIds=[DestinationSnapshotID],
        DryRun=False,
        WaiterConfig={'Delay': 10, 'MaxAttempts': 123}
    )
Debugging
You can debug by simply putting print() statements in your code.
For example:
for yester_snap in yesterday_snaps:
    print('Copying:', yester_snap['SnapshotId'])
    DestinationSnapshot = ec2_destination.copy_snapshot(...)
The logs will appear in CloudWatch Logs. You can access the logs via the Monitoring tab in the Lambda function. Make sure the Lambda function has AWSLambdaBasicExecutionRole permissions so that it can write to CloudWatch Logs.
Today/Yesterday
Be careful about your definition of yesterday. AWS Lambda functions run in the UTC timezone, and snapshot StartTime values are also reported in UTC, so your concept of today and yesterday might not match what is happening.
It might be better to add a tag to snapshots after they are copied (eg 'copied') rather than relying on dates to figure out which ones to copy.
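The tagging idea can be sketched as follows. This is a minimal illustration, assuming the marker tag is placed on the source snapshot and that the snapshot dictionaries have the shape returned by describe_snapshots (where the Tags key is absent for untagged snapshots). The function names and the tag key copied are hypothetical.

```python
def snapshots_to_copy(snapshots, marker_key="copied"):
    """Return the snapshots that do not yet carry the marker tag."""
    pending = []
    for snap in snapshots:
        # 'Tags' is missing from the response for snapshots without tags.
        tag_keys = {tag["Key"] for tag in snap.get("Tags", [])}
        if marker_key not in tag_keys:
            pending.append(snap)
    return pending


def mark_copied(ec2_source, snapshot_id, marker_key="copied"):
    """Tag the source snapshot once its copy has been started."""
    ec2_source.create_tags(
        Resources=[snapshot_id],
        Tags=[{"Key": marker_key, "Value": "true"}],
    )
```

Selecting snapshots this way does not depend on dates, so a rerun of the function should skip anything already marked. Note also that the question's code reads yester_snap['Tags'] directly, which would raise a KeyError for an untagged snapshot; using snap.get("Tags", []) as above avoids that.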
CloudWatch Events rule
Rather than running this program once per day, an alternative method would be:
Create an Amazon CloudWatch Events rule that triggers on Snapshot creation:
{
  "source": [
    "aws.ec2"
  ],
  "detail-type": [
    "EBS Snapshot Notification"
  ],
  "detail": {
    "event": [
      "createSnapshot"
    ]
  }
}
Configure the rule to trigger an AWS Lambda function
In the Lambda function, copy the Snapshot that was just created
This way, the Snapshots are copied as soon as they are created, and there is no need to search for them or figure out which Snapshots to copy.
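A minimal sketch of the Lambda side of this approach follows. It assumes the EBS Snapshot Notification event carries the snapshot ARN in detail.snapshot_id (for example arn:aws:ec2::us-east-1:snapshot/snap-0123...) and a result field in detail.result; check a real event before relying on these field names. DEST_REGION and the helper name parse_snapshot_event are placeholders, and encryption settings from the question are omitted for brevity.

```python
DEST_REGION = "us-west-2"  # placeholder destination region


def parse_snapshot_event(event):
    """Return (source_region, snapshot_id) for a successful createSnapshot event, else None."""
    detail = event.get("detail", {})
    if detail.get("event") != "createSnapshot" or detail.get("result") != "succeeded":
        return None
    # Assumed format: arn:aws:ec2::<region>:snapshot/<snapshot-id>
    snapshot_id = detail["snapshot_id"].split("/")[-1]
    return event["region"], snapshot_id


def lambda_handler(event, context):
    parsed = parse_snapshot_event(event)
    if parsed is None:
        print("Ignoring event:", event.get("detail"))
        return None
    source_region, snapshot_id = parsed
    import boto3  # imported lazily so the parsing helper can be tested without boto3
    ec2_destination = boto3.client("ec2", region_name=DEST_REGION)
    print("Copying:", snapshot_id)
    response = ec2_destination.copy_snapshot(
        SourceSnapshotId=snapshot_id,
        SourceRegion=source_region,
    )
    return response["SnapshotId"]
```

Because each invocation handles exactly one snapshot, the same snapshot is less likely to be copied twice, and any failure is logged against a specific snapshot ID.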

How can I create a custom metric watching EFS metered size in AWS Cloudwatch?

Title pretty much says it all. Since the EFS metered size (usage) is not a metric that I can use in CloudWatch, I need to create a custom metric that tracks the last metered size in EFS.
Is there any possibility to do so? Or is there maybe an even better way to monitor the size of my EFS?
I would recommend using a Lambda, running every hour or so and sending the data into CloudWatch.
This code gathers all the EFS file systems and sends their size (in KB) to CloudWatch along with the file system name. Modify it to suit your needs:
import boto3

region = "us-east-1"

def push_efs_size_metric(region):
    efs_name = []
    efs = boto3.client('efs', region_name=region)
    cw = boto3.client('cloudwatch', region_name=region)
    efs_file_systems = efs.describe_file_systems()['FileSystems']
    for fs in efs_file_systems:
        # 'Name' is only returned when the file system has a Name tag,
        # so fall back to the file system ID
        name = fs.get('Name', fs['FileSystemId'])
        efs_name.append(name)
        cw.put_metric_data(
            Namespace="EFS Metrics",
            MetricData=[
                {
                    'MetricName': 'EFS Size',
                    'Dimensions': [
                        {
                            'Name': 'EFS_Name',
                            'Value': name
                        }
                    ],
                    # SizeInBytes is reported in bytes; convert to kilobytes
                    'Value': fs['SizeInBytes']['Value'] / 1024,
                    'Unit': 'Kilobytes'
                }
            ]
        )
    return efs_name

def cloudtrail_handler(event, context):
    response = push_efs_size_metric(region)
    print({
        'EFS Names': response
    })
I'd also suggest reading up on the reference below for more details on creating custom metrics.
References
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/publishingMetrics.html

Tagging EMR cluster via an AWS Lambda triggered by a CloudWatch event rule

I need to catch the RunJobFlow event in my CloudWatch Events rule in order to tag AWS EMR clusters as they start.
I'm looking for this event because I need the username and account information.
Any idea?
Thanks
Calls to the ListClusters, DescribeCluster, and RunJobFlow actions generate entries in CloudTrail log files.
Every log entry contains information about who generated the request. For example, if a request is made to create and run a new job flow (RunJobFlow), CloudTrail logs the user identity of the person or service that made the request
https://docs.aws.amazon.com/emr/latest/ManagementGuide/logging_emr_api_calls.html#understanding_emr_log_file_entries
Here is a sample snippet to get the username using Python Boto3.
import boto3

cloudtrail = boto3.client("cloudtrail")
response = cloudtrail.lookup_events(
    LookupAttributes=[
        {
            'AttributeKey': 'EventName',
            'AttributeValue': 'RunJobFlow'
        }
    ],
)
for event in response.get("Events"):
    print(event.get("Username"))
The username and cluster details can be retrieved from the RunJobFlow event itself. An easier solution would be to use a CloudWatch Events rule with a Lambda function as its target to fetch this information, after which further action can be taken as required. Example below:
Event pattern to be used with the CloudWatch Events rule
{
  "source": ["aws.elasticmapreduce"],
  "detail": {
    "eventName": ["RunJobFlow"]
  }
}
Lambda code snippet
def lambda_handler(event, context):
    #print("Received event: " + json.dumps(event, indent=2))
    user = event['detail']['userIdentity']['userName']
    cluster_id = event['detail']['responseElements']['jobFlowId']
    region = event['region']
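The snippet above stops after reading the fields. A minimal sketch of the remaining tagging step, under assumptions, is given here. In CloudTrail records, userIdentity.userName is typically present for IAM users, while assumed-role callers usually carry the role name under sessionContext.sessionIssuer, so the helper falls back through those fields. The helper name extract_user and the tag key CreatedBy are hypothetical; add_tags is the EMR client call for tagging a cluster.

```python
def extract_user(user_identity):
    """Best-effort caller name from a CloudTrail userIdentity block."""
    if "userName" in user_identity:
        return user_identity["userName"]
    issuer = user_identity.get("sessionContext", {}).get("sessionIssuer", {})
    if "userName" in issuer:
        return issuer["userName"]
    # Fall back to the ARN when no name field is available.
    return user_identity.get("arn", "unknown")


def lambda_handler(event, context):
    detail = event["detail"]
    user = extract_user(detail["userIdentity"])
    cluster_id = detail["responseElements"]["jobFlowId"]
    import boto3  # imported lazily so extract_user can be tested without boto3
    emr = boto3.client("emr", region_name=event["region"])
    # "CreatedBy" is an illustrative tag key, not an AWS convention.
    emr.add_tags(ResourceId=cluster_id, Tags=[{"Key": "CreatedBy", "Value": user}])
    return {"cluster_id": cluster_id, "user": user}
```

The Lambda's execution role would need permission to add tags on EMR clusters for this to work.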

Boto3 - Create S3 'object created' notification to trigger a lambda function

How do I use boto3 to simulate the Add Event Source action on the AWS GUI Console in the Event Sources tab?
I want to programmatically create a trigger such that if an object is created in MyBucket, it will call the MyLambda function (qualified with an alias).
The relevant API call that I see in the Boto3 documentation is create_event_source_mapping, but it states explicitly that it is only for the AWS pull model, while I think that S3 belongs to the push model. Anyway, I tried using it but it didn't work.
Scenarios:
Passing a prefix filter would be nice too.
I was looking at the wrong side. This is configured on S3:
import boto3

s3 = boto3.resource('s3')
bucket_name = 'mybucket'
bucket_notification = s3.BucketNotification(bucket_name)
response = bucket_notification.put(
    NotificationConfiguration={'LambdaFunctionConfigurations': [
        {
            'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:033333333:function:mylambda:staging',
            'Events': [
                's3:ObjectCreated:*'
            ],
        },
    ]})
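The question also asks about a prefix filter. A minimal sketch of how that might look is below: S3 notification configurations accept a Filter with prefix and suffix rules on the object key. S3 also needs permission to invoke the function, which can be granted with the Lambda client's add_permission call. The helper names, the statement ID, and the images/ prefix are hypothetical.

```python
def build_notification_config(lambda_arn, prefix=None, events=("s3:ObjectCreated:*",)):
    """Build a NotificationConfiguration dict, optionally limited to a key prefix."""
    config = {"LambdaFunctionArn": lambda_arn, "Events": list(events)}
    if prefix:
        config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"LambdaFunctionConfigurations": [config]}


def allow_s3_invoke(function_name, alias, bucket_name):
    """Grant S3 permission to invoke the aliased function (statement ID is a placeholder)."""
    import boto3  # imported lazily so the builder above can be tested without boto3
    boto3.client("lambda").add_permission(
        FunctionName=function_name,
        Qualifier=alias,
        StatementId="s3-invoke-" + bucket_name,
        Action="lambda:InvokeFunction",
        Principal="s3.amazonaws.com",
        SourceArn="arn:aws:s3:::" + bucket_name,
    )
```

For example, passing build_notification_config(lambda_arn, prefix="images/") as the NotificationConfiguration argument of bucket_notification.put should restrict the trigger to keys under images/. If the permission is missing, S3 is likely to reject the notification configuration when it validates the destination.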