How to set up email notifications for AWS operational issues - amazon-web-services

Yesterday our infrastructure started throwing lots of connection errors. We started debugging, and the more we looked, the more perplexing the issue appeared, until someone noticed that the bell icon (Alerts) on the AWS page had an orange dot on it.
Lo and behold, there were lots of AWS operational issues in our region that AWS was fixing.
To avoid this situation in the future, I wanted to subscribe to these alerts so that we get an email notification.
Does anyone know how to set up an email alert for AWS operational issues in the specified region?
Much to my astonishment, there was no obvious way to set this up.

The easiest way is to subscribe to the RSS feed on the AWS Service Health Dashboard.
If you want something more customized, check out the AWS Personal Health Dashboard. It shows the AWS services you use and whether they are experiencing issues.
The AWS documentation provides a comprehensive guide on how to set up alerts. Check out the aws-health-tools GitHub repository for fully functional examples.
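As a rough sketch of the customized route: AWS Health publishes its events to Amazon EventBridge, so one option is a rule whose event pattern matches Health issues in your region and whose target is an SNS topic with an email subscription. The helper name `build_health_event_pattern` and the choice to filter on the `issue` category are my own assumptions for illustration, not something taken from the linked guide.

```python
import json


def build_health_event_pattern(regions, services=None):
    """Build an EventBridge event pattern for AWS Health operational issues.

    regions:  list of region codes to match, e.g. ["us-east-1"]
    services: optional list of service codes (e.g. ["EC2"]) to narrow the match
    """
    detail = {"eventTypeCategory": ["issue"]}  # assumption: only operational issues
    if services:
        detail["service"] = list(services)
    return {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
        "region": list(regions),
        "detail": detail,
    }


# Print the pattern as JSON, ready to paste into an EventBridge rule.
print(json.dumps(build_health_event_pattern(["us-east-1"], ["EC2"]), indent=2))
```

The printed JSON is what you would give the rule as its event pattern. Creating the rule, the SNS topic and the email subscription is left out here, and which Health events your account actually receives may depend on your setup.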

Related

I can't find and disable AWS resources

My AWS free tier is going to expire in 8 days. I removed every EC2 resource and Elastic IP associated with it, because that is what I recall initializing and experimenting with. I also deleted all the roles I created because, as I understand it, roles permit AWS to perform actions for AWS services. And yet, when I go to the billing page, it shows these three services as currently in use.
[1]: https://i.stack.imgur.com/RvKZc.png
I used the script as recommended by AWS documentation to check for all instances and it shows "no resources found".
Link for script: https://docs.aws.amazon.com/systems-manager-automation-runbooks/latest/userguide/automation-awssupport-listec2resources.html
I tried searching for each service using the dashboard and didn't get anywhere. I found an S3 bucket that I don't remember creating and deleted it anyway, but I still get the same output.
Any help is much appreciated.
OK, I was able to get in touch with AWS support via live chat, and they informed me that the services on my bill reflected usage incurred before the resources were terminated. AWS support was much faster than I expected.

AWS CloudFormation error: "Route did not stabilize in expected time"

I am trying to deploy a CloudFormation template from an AWS workshop - https://emr-developer-experience.workshop.aws/how-to-start/self-paced/cloudformation.html.
The CF template is located at https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/new?stackName=EMR-Dev-Exp-Workshop&templateURL=https://aws-data-analytics-workshops.s3.amazonaws.com/emr-dev-exp-workshop/cfn/emr-dev-exp.template
This CF template creates a new VPC with all the required networking components as well as various services such as EMR, EMR Studio, Service Catalog, etc.
I am from a data background and I am having a hard time debugging this CF template.
Basically, it fails when creating the logical ID "VPCGatewayAttachment" with the error message "Route did not stabilize in expected time". A KB article from AWS (https://aws.amazon.com/premiumsupport/knowledge-center/cloudformation-route-did-not-stabilize/) has some information, but I don't think I really understand the outlined solution.
Appreciate any help.
thanks.
I found this article for you in the AWS Knowledge Center: https://aws.amazon.com/premiumsupport/knowledge-center/cloudformation-route-did-not-stabilize/. As the article mentions, I would also look at the AWS CloudTrail Event History and investigate any potential errors and root causes. When browsing the Event History, I personally like enabling the error code column (you can do that by pressing the gear icon), which lets me quickly spot events that have failed.
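If you would rather script that check than click through the console, here is a minimal sketch of the same idea. It assumes you already have a list of events in the shape CloudTrail's LookupEvents API returns, where each item carries a `CloudTrailEvent` JSON string (for example from boto3's `lookup_events`, which is not called here). The helper name `failed_events` is made up for illustration.

```python
import json


def failed_events(events):
    """Return (event name, error code, error message) for events that failed.

    `events` is a list of dicts, each with a "CloudTrailEvent" key holding the
    raw event record as a JSON string. Failed API calls carry an "errorCode".
    """
    failures = []
    for event in events:
        record = json.loads(event["CloudTrailEvent"])
        if "errorCode" in record:
            failures.append(
                (
                    record.get("eventName"),
                    record["errorCode"],
                    record.get("errorMessage", ""),
                )
            )
    return failures
```

Running it over the events from around the time the stack failed should show which call on the route or gateway was rejected, and why.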

Logs are not sent to Logentries from AWS ECS

We are using the logentries log driver on AWS ECS to send logs to our Logentries account. We have configured the ECS service with the required parameters, such as logentries-token, but we have observed that after a certain amount of time some containers are no longer able to send logs to Logentries.
I appreciate your help in advance. I am unable to find proper documentation for this from either Logentries or AWS.
Thanks,
We had the same issue, so I started digging deeper than usual.
The actual driver implementation is quite simple.
The trouble lies in a dependency that handles the socket and TLS.
There is an open issue and a PR that address a very similar problem.
The PR is stale and I don't see much chance of it landing, so I moved away from Logentries and recommend doing the same. CloudWatch will probably be a better fit.
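For anyone making the same move, here is a minimal sketch of what the container's log configuration might look like with the `awslogs` driver. The helper name and the group, region and prefix values are placeholders I made up. The three option keys are the ones the `awslogs` driver uses.

```python
import json


def awslogs_configuration(group, region, stream_prefix):
    """Build the logConfiguration block of an ECS container definition
    that sends container logs to CloudWatch Logs via the awslogs driver."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


# Placeholder values; replace them with your own log group, region and prefix.
print(json.dumps(awslogs_configuration("/ecs/my-service", "us-east-1", "app"), indent=2))
```

The resulting block goes under `logConfiguration` in the container definition, in place of the logentries one. The task's execution role also needs permission to write to CloudWatch Logs.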

AWS limits monitoring with Nagios

I tried searching for this topic on Google, and after many failed attempts I decided to post it as a question here.
What I want to achieve: monitoring my AWS limits using Nagios.
As I understand it, the AWS CLI can report limits for only a few AWS services; for more in-depth cost management and service limit management, one has to opt for Trusted Advisor. Unfortunately, it's quite expensive.
So I was wondering whether there is a simpler way, using Nagios, to get notified when any AWS service in my account is approaching a limit.
What kind of service limit notification strategy do organizations that use AWS but can't afford a Trusted Advisor subscription rely on?
You're right: only a few services expose their limits (and current usage) through the CLI or API. I don't like it either :) We have three options here:
Create a parser that grabs the information from the AWS Console (there is example code here: https://forrestbrazeal.com/2015/07/20/adventures-in-aws-automating-service-limit-checks/).
Buy Trusted Advisor (by the way, you can get a Trusted Advisor report with an API call).
Try awslimitchecker, since someone has already tried to solve this problem:
https://awslimitchecker.readthedocs.io/en/latest/
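Whichever option supplies the numbers, the Nagios side comes down to mapping usage against limits onto the standard plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). Below is a minimal sketch of that mapping. The `evaluate` helper, the 80%/95% thresholds and the shape of the `usage` dict are assumptions for illustration, and it does not call awslimitchecker itself.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # standard Nagios plugin exit codes


def evaluate(usage, warn_pct=80, crit_pct=95):
    """Map usage-vs-limit data to a Nagios exit code and status line.

    usage: dict of limit name -> (current usage, limit), e.g.
           {"EC2/Running On-Demand instances": (19, 20)}
    """
    worst = OK
    problems = []
    for name, (current, limit) in sorted(usage.items()):
        if limit <= 0:
            continue  # skip unknown or unlimited values
        pct = 100.0 * current / limit
        if pct >= crit_pct:
            level = CRITICAL
        elif pct >= warn_pct:
            level = WARNING
        else:
            continue
        worst = max(worst, level)
        problems.append(f"{name} {current}/{limit} ({pct:.0f}%)")
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[worst]
    detail = "; ".join(problems) if problems else "all limits below thresholds"
    return worst, f"LIMITS {label} - {detail}"
```

A plugin script would print the returned message and pass the code to `sys.exit()`, exiting with `UNKNOWN` if collecting the usage data fails. Nagios then handles the notifications as for any other check.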

Using CloudWatch API to get statistics

I have deployed a LAMP stack application on AWS. I need to monitor that using CloudWatch.
Can someone guide me on how to use the CloudWatch API (GetMetricStatistics) to get CPU utilization? The AWS documentation is very sparse.
I see that the PutMetricData call will let me create my own metrics.
My requirement is that I need to display those metric results in a mobile app.
My app monitors a project deployed on AWS. The alerts and metrics that come in must stream into the app.
I don't want the metrics data only in the AWS console; I want it viewable in my mobile app. The app is developed on the MEAN stack.
I must also add that the app is deployed on AWS, and the application being monitored is also there (it's a LAMP stack). I have set up two endpoints (HTTP and DB) and written simple scripts in JavaScript to monitor them. But ideally this should happen via CloudWatch.
Providing a piece of code that reproduces the issue you are seeing usually lets people who read the question help you better than they can by guessing what you're doing.
Are you using an SDK to do this? Which language and version?
Here are links to the API docs:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricStatistics.html
http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_ListMetrics.html
The pattern is to list the metrics first and then feed the result into GetMetricStatistics.
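Since you haven't said which SDK you use, here is a rough sketch of that list-then-query pattern in Python. It assumes a boto3 CloudWatch client is passed in (`boto3.client("cloudwatch")`). The function name `cpu_utilization` and its default window are made up for illustration, and for brevity it reads only the first page of `list_metrics` results.

```python
from datetime import datetime, timedelta, timezone


def cpu_utilization(cloudwatch, minutes=60, period=300):
    """List EC2 CPUUtilization metrics, then fetch average datapoints for each.

    cloudwatch: a client exposing list_metrics and get_metric_statistics
                (assumed to be a boto3 CloudWatch client).
    Returns a dict of "Name=Value" dimension strings -> datapoints sorted by time.
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    results = {}

    # Step 1: ListMetrics tells us which metrics (and dimensions) exist.
    listing = cloudwatch.list_metrics(Namespace="AWS/EC2", MetricName="CPUUtilization")

    # Step 2: feed each listed metric into GetMetricStatistics.
    for metric in listing["Metrics"]:
        stats = cloudwatch.get_metric_statistics(
            Namespace=metric["Namespace"],
            MetricName=metric["MetricName"],
            Dimensions=metric["Dimensions"],
            StartTime=start,
            EndTime=end,
            Period=period,
            Statistics=["Average"],
        )
        key = ",".join(f'{d["Name"]}={d["Value"]}' for d in metric["Dimensions"])
        results[key] = sorted(stats["Datapoints"], key=lambda p: p["Timestamp"])
    return results
```

Your backend could expose the returned dict as JSON for the mobile app to poll. The AWS SDK for JavaScript offers the same two calls if you want to stay in Node.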
In your specific case, googling the issue a bit before might answer the question before you ask it on SO. For example:
https://forums.aws.amazon.com/thread.jspa?messageID=295740
This can happen when you are hitting the wrong endpoint. Check that you are hitting the endpoint of the right AWS service.
For example, trying to hit DynamoDB's endpoint when you want to access CloudWatch APIs.