AWS: download all CloudWatch logs within a specified time period - amazon-web-services

I am trying to process Kinesis messages from an AWS Lambda function and push the results to Power BI.
In the Lambda function I have logged application-specific messages.
What is the problem now?
I am able to see my messages in CloudWatch Logs, but not all of them. When I apply a filter I get messages for the specified period, which is good, but not all of that period's messages are shown at once. Within the selected range I have to keep scrolling to load more messages, and I can only download up to that point.
For example: in the UI, for the specified time range I initially get 100 messages and can download only up to that point. There is a "load more" link; when I click it I get around 150 messages. **This is very tedious**: scrolling each time to load more messages and downloading again.
Is there any automated way to download all messages for the specified period in a single shot?
Any help would save a lot of effort.

You can export the logs to an S3 bucket using the CloudWatch Logs console or the AWS CLI, specifying the start time, end time, and optionally a stream prefix.
First, you need to set the appropriate permissions.
Export logs using AWS Console
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/S3ExportTasksConsole.html
Export logs using AWS CLI
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/S3ExportTasks.html
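For reference, a minimal boto3 sketch of the same export, assuming a hypothetical log group `/aws/lambda/my-function` and a destination bucket `my-log-export-bucket` that already grants CloudWatch Logs write access (both names are placeholders):

```python
import time
from datetime import datetime, timezone

import boto3

logs = boto3.client("logs")

# Placeholder names: replace with your own log group and S3 bucket.
LOG_GROUP = "/aws/lambda/my-function"
DESTINATION_BUCKET = "my-log-export-bucket"


def to_millis(dt):
    """CloudWatch Logs expects timestamps in milliseconds since the epoch."""
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)


# Export everything logged between the two timestamps below.
task = logs.create_export_task(
    taskName="export-my-function-logs",
    logGroupName=LOG_GROUP,
    fromTime=to_millis(datetime(2022, 5, 5, 0, 0)),
    to=to_millis(datetime(2022, 5, 6, 0, 0)),
    destination=DESTINATION_BUCKET,
    destinationPrefix="lambda-logs",
)

# Poll until the export task finishes; an account can only run one
# active export task at a time.
while True:
    status = logs.describe_export_tasks(taskId=task["taskId"])["exportTasks"][0]["status"]["code"]
    if status in ("COMPLETED", "CANCELLED", "FAILED"):
        print("Export finished with status:", status)
        break
    time.sleep(10)
```

Once the task completes, the logs for the whole time range sit in the bucket under the given prefix and can be downloaded in one go.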

Related

AWS cloudwatch: logs are getting created in different log streams for the single API hit

We are making use of AWS Lambda and have configured CloudWatch for logging. There is a cron job running every 5 minutes which triggers the Lambda function. The logs generated for a single hit are created in different log streams.
So, let's say there is an API hit at 11:45. To check the logs I have to go through the log streams with last event times 2022-05-05 11:43:10 (UTC+05:30), 2022-05-05 11:43:00 (UTC+05:30), 2022-05-05 11:38:11 (UTC+05:30), 2022-05-05 11:38:02 (UTC+05:30), and so on, because the logs for a single hit are spread across different log streams: some of the logs are in the first stream, some in the second, a few in the third. Previously, all the logs for a single hit were created in a single log stream. Is there anything that can be done to avoid this? It makes debugging a time-consuming process.
This is how Lambda works: each Lambda execution environment gets its own log stream. If you need to look at logs across log streams, then the best "built-in" solution is CloudWatch Logs Insights, which works at the log-group level.
Update: this document describes the Lambda execution environment, and the conditions that cause creation/destruction of an environment.
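As an illustration of the Logs Insights approach, here is a hedged boto3 sketch that queries one log group across all of its streams for a ten-minute window; the log group name and the time range are placeholders:

```python
import time
from datetime import datetime, timezone

import boto3

logs = boto3.client("logs")

# Placeholder log group; replace with your Lambda's log group.
LOG_GROUP = "/aws/lambda/my-cron-function"

# Logs Insights works at the log-group level, so this query sees every
# log stream, regardless of which execution environment wrote the line.
query_id = logs.start_query(
    logGroupName=LOG_GROUP,
    startTime=int(datetime(2022, 5, 5, 6, 10, tzinfo=timezone.utc).timestamp()),
    endTime=int(datetime(2022, 5, 5, 6, 20, tzinfo=timezone.utc).timestamp()),
    queryString="fields @timestamp, @logStream, @message | sort @timestamp asc",
    limit=1000,
)["queryId"]

# Poll until the query completes, then print every matching line.
while True:
    response = logs.get_query_results(queryId=query_id)
    if response["status"] in ("Complete", "Failed", "Cancelled"):
        break
    time.sleep(1)

for row in response["results"]:
    print({field["field"]: field["value"] for field in row})
```

Including `@logStream` in the query shows which execution environment produced each line, which is usually enough to reconstruct a single hit.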

AWS Lambda: Monitoring lambda timeout that was triggered by SNS.

I have an AWS Lambda function that is triggered by SNS messages. Many times it has reached the maximum duration allowed by AWS, and AWS killed it immediately.
I have to dig into either the Lambda logs or the Lambda duration chart to find out about the error.
Is there a better way to report this kind of error?
Yes, there are some 3rd-party tools that help you monitor your environment and provide exactly that: filter on specific errors and drill down into what happened (the input event, the outgoing HTTP requests, etc.).
Moreover, you can configure alerts on specific errors that you will receive via Slack or email.
Disclosure: I work for Lumigo, a company that does exactly that.

AWS WAF - Auto Save Web Application Firewall logs in S3

How do you route AWS Web Application Firewall (WAF) logs to an S3 bucket? Is this something I can quickly do through the AWS Console? Or, would I have to use a lambda function (invoked by a CloudWatch timer event) to query the WAF logs every n minutes?
UPDATE:
I'm interested in the ACL logs (Source IP, URI, Matches rule, Request Headers, Action, Time, etc).
UPDATE (05/15/2017)
AWS doesn't provide an easy way to view or parse these logs. You can get a "random sample" via the get-sampled-requests command, which isn't acceptable...
Gets detailed information about a specified number of requests--a sample--that AWS WAF randomly selects from among the first 5,000 requests that your AWS resource received during a time range that you choose. You can specify a sample size of up to 500 requests, and you can specify any time range in the previous three hours.
http://docs.aws.amazon.com/cli/latest/reference/waf/get-sampled-requests.html
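For completeness, a minimal boto3 equivalent of that CLI call (the web ACL and rule IDs below are placeholders), which again only returns a random sample rather than the full request log:

```python
from datetime import datetime, timedelta, timezone

import boto3

# Classic WAF client; use "waf-regional" for regional (ALB) web ACLs.
waf = boto3.client("waf")

end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)

# Placeholder IDs: look them up with list_web_acls() / list_rules().
response = waf.get_sampled_requests(
    WebAclId="example-web-acl-id",
    RuleId="example-rule-id",
    TimeWindow={"StartTime": start, "EndTime": end},
    MaxItems=500,  # hard limit: at most 500 sampled requests
)

for sample in response["SampledRequests"]:
    request = sample["Request"]
    print(sample["Timestamp"], sample["Action"], request["ClientIP"], request["URI"])
```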
I'm not the only one experiencing this issue either:
https://forums.aws.amazon.com/thread.jspa?threadID=220202
I was looking for this functionality today and stumbled across the referenced thread. It was, coincidentally, updated today:
Hello,
Thanks for your input. I have submitted a feature request on your behalf to export WAF events to S3 for long term analysis.
Best Regards, albertpataws
The lack of this feature strikes me as being almost as odd as the fact that I can't change timezones for graphs.

How can I email error logs from AWS Spark

I have a process that uses AWS EMR to run a PySpark cluster.
I have an S3 location where all the process logs get stored.
I want to understand whether there is a way I can filter out ERROR logs and have them mailed to my inbox. I do not want to save any log files on my system.
Is there any Python library that can help me monitor logs in real time? I have looked at boto3 and the EMR libraries, but I could not find an answer to my problem there.
The EMR logs will likely be buffered into chunks of a few minutes or some size before being written to S3 (but full disclosure, that's based on experience with other AWS S3 logging systems, not EMR itself).
If I were attempting to solve this problem, I'd use an AWS Lambda function to execute python that would read the S3 logs line by line and filter for the lines matching ERROR, and then use SNS to send the logs to your email address. You can use S3 events to automatically trigger the Lambda when objects are written to the S3 logging location for EMR, so this is as close to realtime as you're gonna get.
The architecture I am suggesting looks something like this:
EMR -> S3 -> Lambda -> SNS -> email inbox
The write of each EMR log to S3 triggers a Lambda which uses boto3 to filter the log for error messages, sending alerts to an SNS topic for distribution to users.
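A hedged sketch of the Lambda piece, assuming the function is subscribed to S3 ObjectCreated events on the EMR log prefix and that the topic ARN below is a placeholder you create separately:

```python
import gzip

import boto3

s3 = boto3.client("s3")
sns = boto3.client("sns")

# Placeholder topic ARN; your email address is subscribed to this topic.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:emr-error-alerts"


def handler(event, context):
    """Triggered by S3 ObjectCreated events on the EMR log prefix."""
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Read the newly written log object; EMR typically gzips the
        # logs it ships to S3, so decompress when the key ends in .gz.
        raw = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        if key.endswith(".gz"):
            raw = gzip.decompress(raw)
        text = raw.decode("utf-8", errors="replace")

        # Keep only the lines that mention ERROR.
        errors = [line for line in text.splitlines() if "ERROR" in line]

        if errors:
            # SNS messages are capped at 256 KB, so truncate long reports.
            message = "\n".join(errors)[:200_000]
            sns.publish(
                TopicArn=TOPIC_ARN,
                Subject=f"EMR errors in {key}"[:100],  # SNS subject limit is 100 chars
                Message=message,
            )
```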
It may seem like a lot of moving parts but it won't require much to maintain it and should cost you only a few cents a month more than the S3 storage is already costing you. And the effort for the whole thing is actually pretty small.
Furthermore, you won't need:
a place to execute your code, servers to manage, etc
nontrivial deployment model for your project
any parts not shown above, for that matter
And you'll get for free:
Monitoring in the form of
cloudwatch metrics for lambda,
s3 logs (should you enable them)
cloudwatch logs that store your function's execution windows and stdout.
Easy integration into alerting through cloudwatch Alarms ( these typically integrate well with Pager Duty and the like )
dead-simple extensibility, such as
SNS can send SMS messages to your phone
add more parsing options in the lambda and redeploy
expose cloudwatch metrics and add alarms for thresholds
write the summary to S3 for pre-signed email or SMS links, or further processing now or later
You could send the email yourself through SES or just manually with python, but I would rather use SNS so that the subscriptions to the topic can vary independently from the python code.
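To illustrate that separation, subscribing an inbox to the (placeholder) topic is a one-off boto3 call that never touches the Lambda code:

```python
import boto3

sns = boto3.client("sns")

# Creating a topic is idempotent: it returns the existing ARN if the name exists.
topic_arn = sns.create_topic(Name="emr-error-alerts")["TopicArn"]

# The recipient gets a confirmation email and must click the link before
# notifications are delivered; add or remove subscribers without redeploying.
sns.subscribe(
    TopicArn=topic_arn,
    Protocol="email",
    Endpoint="you@example.com",  # placeholder address
)
```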
Lambdas are a little intimidating to start with, but they include the boto3 SDK by default (which should obviate the need for a zipfile with pip dependencies altogether), which will simplify creation.
For that matter, you can set all this stuff up in the AWS console if you like doing things by dragging mouse pointers around, or intend to do it only a few times, or you can express all of it in CloudFormation if you need something repeatable.
http://docs.aws.amazon.com/lambda/latest/dg/with-s3.html
http://docs.aws.amazon.com/lambda/latest/dg/python-programming-model-handler-types.html
http://docs.aws.amazon.com/sns/latest/dg/welcome.html

AWS Lambda function was triggered twice by CloudWatch event

I deployed a service written in Python 2.7 using AWS Lambda; it extracts data from some pages and sends the results to a web app. The service is triggered by an AWS CloudWatch event (fixed rate of 5 minutes).
However, I found that sometimes the service was triggered twice at a time. I noticed this because two log streams printed the same data and result but with different request IDs, and the database had duplicate data, which showed that both invocations completed successfully. It looked like the service was triggered twice almost at the same time for no reason.
Has anyone experienced the same thing, and how did you fix it? Or, is there a way to ensure that only one function instance executes at a time?
Yes. Some AWS services guarantee at-least-once delivery; I have experienced this with CloudWatch and CloudTrail. I do not know of a way to limit it to exactly once, so you have to check whether the data has already been processed. I overcame this by making boto3 calls in my Python code before processing the data. Without knowing your situation, it is difficult to suggest a specific solution.
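The answer doesn't show the check itself, but one common way to implement it (an assumption here, not necessarily what the author used) is a conditional write against a small DynamoDB table keyed by a deduplication id, for example the CloudWatch event's `id` field:

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")

# Hypothetical table with a string partition key named "dedup_id".
TABLE_NAME = "processed-events"


def already_processed(dedup_id):
    """Atomically record the event id; return True if it was seen before."""
    try:
        dynamodb.put_item(
            TableName=TABLE_NAME,
            Item={"dedup_id": {"S": dedup_id}},
            # The write fails if an item with this id already exists.
            ConditionExpression="attribute_not_exists(dedup_id)",
        )
        return False
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return True
        raise


def handler(event, context):
    # CloudWatch scheduled events carry a unique "id" field.
    if already_processed(event["id"]):
        return  # duplicate delivery: skip the work
    # ... extract the data and send it to the web app as usual ...
```

Because the conditional put is atomic, only one of two near-simultaneous invocations with the same event id will proceed past the check.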