How does Amazon CloudWatch batch logs when streaming to AWS Lambda? - amazon-web-services

The AWS documentation indicates that multiple log event records are provided to Lambda when streaming logs from CloudWatch.
logEvents
The actual log data, represented as an array of log event records. The "id" property is a unique identifier for every log event.
How does CloudWatch group these logs?
By time? By count? From my perspective it looks random.

Currently you get one Lambda invocation for every PutLogEvents batch that CloudWatch Logs receives against that log group. However, you should probably not rely on that, because AWS could always change the behavior (for example, by batching more events together).
You can observe this behavior by running the CWL -> Lambda example in the AWS docs.
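For reference, here is a minimal sketch of what each invocation receives, assuming the standard CloudWatch Logs-to-Lambda subscription and a Python runtime: the payload arrives base64-encoded and gzipped, and the logEvents array is the batch described above.

```python
import base64
import gzip
import json

def handler(event, context):
    # CloudWatch Logs delivers the subscription data as base64-encoded, gzipped JSON.
    payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))

    # One invocation carries one batch; logEvents is the array described in the docs.
    print("log group:", payload["logGroup"])
    print("events in this batch:", len(payload["logEvents"]))
    for log_event in payload["logEvents"]:
        print(log_event["id"], log_event["timestamp"], log_event["message"])
```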

Some AWS services let you configure the log interval; Elastic Load Balancing, for example, offers a choice between five-minute and sixty-minute intervals. You may not see a specific increment or parameter in the docs because the behavior is configured per service.
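As a hedged illustration of that ELB setting: for a Classic Load Balancer the interval is the EmitInterval access-log attribute (5 or 60 minutes). The load balancer and bucket names below are placeholders.

```python
import boto3

elb = boto3.client("elb")  # Classic Load Balancer API

# Enable access logs with a 60-minute emit interval (5 is the other allowed value).
# "my-load-balancer" and "my-log-bucket" are placeholder names.
elb.modify_load_balancer_attributes(
    LoadBalancerName="my-load-balancer",
    LoadBalancerAttributes={
        "AccessLog": {
            "Enabled": True,
            "S3BucketName": "my-log-bucket",
            "EmitInterval": 60,
            "S3BucketPrefix": "elb-logs",
        }
    },
)
```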

Related

Process Daily Logs from Cloudwatch using lambda

I don't have much code to show.
I want to process all CloudWatch logs generated in the last day using Lambda.
I want to execute the Lambda at 6 AM to extract some information from the CloudWatch logs generated on the previous day and put it into a table.
Instead of your 6 AM idea, you could also use a CloudWatch Logs subscription filter and trigger a Lambda function to process and store the log entries, as described in the step-by-step example Example 2: Subscription filters with AWS Lambda.
Or, even easier and without duplicating the data into a database: Analyzing log data with CloudWatch Logs Insights.
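If you go the Logs Insights route, a rough sketch of querying yesterday's events with boto3 might look like this; the log group name and query string are placeholders to adapt.

```python
import time
from datetime import datetime, timedelta, timezone

import boto3

logs = boto3.client("logs")

# Query yesterday's events (midnight to midnight UTC).
end = datetime.now(timezone.utc).replace(hour=0, minute=0, second=0, microsecond=0)
start = end - timedelta(days=1)

query_id = logs.start_query(
    logGroupName="/aws/lambda/my-function",  # placeholder log group
    startTime=int(start.timestamp()),
    endTime=int(end.timestamp()),
    queryString="fields @timestamp, @message | sort @timestamp desc | limit 100",
)["queryId"]

# Poll until the query finishes, then process the results however you like.
while True:
    result = logs.get_query_results(queryId=query_id)
    if result["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

for row in result.get("results", []):
    print({field["field"]: field["value"] for field in row})
```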
I assume that you have a lot of Lambda functions and each has a CloudWatch Log Group.
You can try a CloudWatch Log Group subscription filter; with this feature you can stream your logs to any supported destination, such as Lambda.
Before that, you should prepare a Lambda function that can put the extracted data into a DynamoDB table.
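A minimal sketch of such a function, assuming a subscription-filter trigger and a hypothetical DynamoDB table named extracted-log-data (adapt the extraction logic to whatever you actually need):

```python
import base64
import gzip
import json

import boto3

# Placeholder table name; the table and its key schema are assumptions for this sketch.
table = boto3.resource("dynamodb").Table("extracted-log-data")

def handler(event, context):
    # Decode the CloudWatch Logs subscription payload (base64 + gzip + JSON).
    payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))

    for log_event in payload["logEvents"]:
        # Store one item per log event; "id" is unique per event, so it works as a key.
        table.put_item(
            Item={
                "id": log_event["id"],
                "logGroup": payload["logGroup"],
                "timestamp": log_event["timestamp"],
                "message": log_event["message"],
            }
        )
```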
References:
https://docs.aws.amazon.com/lambda/latest/dg/with-ddb-example.html
https://www.geeksforgeeks.org/aws-dynamodb-insert-data-using-aws-lambda/

Workaround for 2 subscription-filter limit in AWS Cloudwatch Logs

I have several Lambda functions deployed on AWS whose errors I want to monitor directly in order to update a PostgreSQL table.
I have created a Lambda to parse the streamed log data and update the database. I want to set up subscription filters between this Lambda and my other functions' logs.
There are 6 log streams I want to monitor and the AWS Console limits the subscription filters to 2 per log group.
Is there a workaround or a better way to implement this kind of monitoring?
Thanks

How can I subscribe to cloudwatch metric data?

I am using Elasticsearch to store logs from a CloudWatch log group by subscribing a Lambda to the log group. So whenever a log event is pushed to the log group, my Lambda is triggered and saves the log to Elasticsearch. Then I can search the logs via the Kibana dashboard.
I'd like to put the metrics data into Elasticsearch as well, but I couldn't find a way to subscribe to metric data.
You can use the AWS module in Metricbeat, from the Elastic Beats family. Note that pulling metrics from CloudWatch results in chargeable API calls, so you should carefully consider the scraping frequency.
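For a sense of what that polling involves, here is a rough boto3 sketch of the kind of GetMetricData call a metrics shipper makes under the hood; the namespace, metric, and dimension below are placeholders, and each such call is billable, which is why the scraping frequency matters.

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

# Each GetMetricData request is a chargeable API call, so scrape sparingly.
response = cloudwatch.get_metric_data(
    MetricDataQueries=[
        {
            "Id": "lambda_invocations",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Lambda",
                    "MetricName": "Invocations",
                    "Dimensions": [
                        {"Name": "FunctionName", "Value": "my-function"}  # placeholder
                    ],
                },
                "Period": 300,
                "Stat": "Sum",
            },
        }
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
)

for result in response["MetricDataResults"]:
    print(result["Label"], list(zip(result["Timestamps"], result["Values"])))
```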

how to check number of times a dynamoDB table has been accessed

I have a DynamoDB table, let's say sampleTable. I want to find out how many times this table has been accessed from the CLI. How do I check this?
PS: I have checked the metrics but couldn't find any particular metric that gives this information.
There is no CloudWatch metric to monitor API calls to DynamoDB.
However, there is CloudTrail (CT). You can go to CT's event history and look for API calls to DynamoDB from the last 90 days. You can also export the history to a CSV file and investigate offline.
For ongoing monitoring of the API calls, you can enable a CT trail, which will store event log details in S3 for as long as you require:
Logging DynamoDB Operations by Using AWS CloudTrail
If you have the trail created, you can use Amazon Athena to query the log data for the statistics of interest, such as the number of specific API calls to DynamoDB:
Querying AWS CloudTrail Logs
Also, you could create a custom metric based on the trail's log data (once you configure CloudWatch Logs for the trail):
Analyzing AWS CloudTrail in Amazon CloudWatch
However, I don't think you can differentiate between API calls made using the CLI, the SDK, or other means.
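For the event-history route, here is a rough boto3 sketch that counts DynamoDB API calls recorded in the 90-day event history; it tallies calls per API name without distinguishing how they were made.

```python
from collections import Counter

import boto3

cloudtrail = boto3.client("cloudtrail")

# Count DynamoDB API calls present in the CloudTrail event history.
counts = Counter()
paginator = cloudtrail.get_paginator("lookup_events")
for page in paginator.paginate(
    LookupAttributes=[
        {"AttributeKey": "EventSource", "AttributeValue": "dynamodb.amazonaws.com"}
    ]
):
    for event in page["Events"]:
        counts[event["EventName"]] += 1

print(counts)
```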

Cloudwatch Metric showing wrong value

I have an application publishing a custom cloudwatch metric using boto's put_metric_data. The metric shows the number of tasks waiting in a redis queue.
The 1-minute max shows '3', 1-minute min shows '0' and 1-minute average shows '1.5'.
It seems that the application is correctly setting the value to zero, but some other process is publishing 3 at the same time, and I can't find it in order to stop it.
Is it possible to see logs for PutMetricData to diagnose where this value might be coming from?
Normally, Amazon CloudTrail would be the ideal way to discover information about API calls being made to your AWS account. Unfortunately, PutMetricData is not captured in Amazon CloudTrail.
From Logging Amazon CloudWatch API Calls in AWS CloudTrail:
The CloudWatch GetMetricStatistics, ListMetrics, and PutMetricData API actions are not supported.
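As a side note on the statistics themselves: they are consistent with two publishers writing to the same metric in the same minute. A minimal sketch (placeholder namespace and metric name) of how that would produce exactly those numbers:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def publish_queue_depth(value):
    # Placeholder namespace/metric name; both writers land in the same 1-minute bucket.
    cloudwatch.put_metric_data(
        Namespace="MyApp",
        MetricData=[{"MetricName": "RedisQueueDepth", "Value": value, "Unit": "Count"}],
    )

# If your application publishes 0 and some other process publishes 3 in the same minute:
publish_queue_depth(0)
publish_queue_depth(3)
# CloudWatch then reports: 1-minute min = 0, max = 3, average = 1.5, sample count = 2.
```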