Monitoring EC2 instances and alerting with PagerDuty - amazon-web-services

I already know how to integrate NewRelic and PagerDuty for a server. Recently I have added an EC2 instance and I am able to see the plots and all the information in the NewRelic. I am not able to integrate that EC2 with PagerDuty for alerting. Can you please let me know if NewRelic supports it for EC2?

New Relic's alerting integrations should work the same for any monitored app, regardless of the platform. If you're having trouble integrating a particular app with PagerDuty, please file a support request with support#newrelic.com

You can find the information on how to do it here:
http://www.pagerduty.com/docs/guides/new-relic-integration-guide/

Related

Find what is making EC2 IMDSv1 calls on Windows Servers

I'm trying to get all our instances (all Windows based) upgraded to IMDSv2 and have been following the advice found here https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html#instance-metadata-transition-to-version-2 and using CloudWatch to find instances making MetadataNoToken calls (i.e. using IMDSv1).
I've found several instances using IMDSv1 this way, but I can't work out how to find out what is making the calls from with the OS.
According to CloudWatch each server is making one call per minute to the IMDSv1 service.
The support article mentions upgrading any AWS SDKs or CLI tools, but the servers in question don't have seem to have any SDKs or CLI tools installed.
Each instance has the following AWS published tools installed on them:
Amazon SSM Agent
Amazon CloudWatch Agent
AWS Tools for Windows
EC2ConfigService
AWS PV Drivers
aws-cfn-bootstrap
I've updated the Amazon SSM Agent and the Amazon CloudWatch Agent to the latest versions. But I can't find any information about how to update the AWS Tools for Windows package.
I've also run TCPView from Sysinternals on the servers and tried to find what process is making calls to the 169.254.169.254 endpoint, but it doesn't seem to pick up any traffic to that address.
I'm reluctant to just disable IMDSv1 and do a scream test as they are production servers.
If anyone has any advice or guidance on how to find what is making the IMDSv1 calls it would be appreciated.
I figured it out in the end, using the £Windows Resource Monitor Network monitor" tool, I found the exectucable that was making the calls.
I've written up the proceess in this blog post:
https://www.greystone.co.uk/2022/03/24/how-greystone-upgraded-its-aws-ec2-instances-to-use-instance-meta-data-service-version-2-imdsv2/

Unable to autoscale GCP instances using custom memory metrics

I am trying to autoscale gcp instances based on memory metrics but I am unable to find the way how this can be done. I have tried to setup this through "stackdriver monitoring metrics" but no luck. Can someone help here how this can be done.
This is similar problem like posted on google forum but no proper answer here as well.
https://groups.google.com/forum/#!topic/gce-discussion/X6LA0-8mFak
It's required to install the Stackdriver Monitoring Agent by following this documentation.
Once installed, you will get more options to configure your autoscaler from your instance group page

How to visualize AWS Elastic Beanstalk application logs

We are using AWS Elastic Beanstalk for deploying application. Currently we have two Elastic Beanstalk applications and two worker processes (that pick message from AWS SQS Queue and process it).
What can be the best tools to view the combine logs from the Elastic Beanstalk application and worker and a few more on-premise applications in future?
Throw the logs in AWS ElasticSearch and the use Kibana, which comes with ElasticSearch, to visualize them.
I used the suggestion and configured Cloud watch logs, Elastic Search, and Kibana; but i am not getting all logs and all insights. I can see httpd access & error logs, ebs access & error logs. It also seems lot of AWS services and configuration. Since I am very new to AWS; therefore, so I am facing trouble in setting things up
Alternatively as suggested by my boss, I tried "New relic" - It was very simple to configure and I can see lot of insights of my EBS application in "New Relic" console. I can also configure my Browser, iOS app, Android app, AWS infrastructure (AWS Services) in one New Relic console. Some details are missing in New Relic console such error stack trace, request params in POST request, and so on; But I also don't want to share such details with New Relic, so, that is ok.
I will use "New Relic" and Cloudwatch logs (for real time investigation into failing HTTP REST services) right now; but I will explore more options inside AWS: Elastic Search and Kibana
Many Thanks

How to integrate on premise logs with GCP stackdriver

I am evaluating stackdriver from GCP for logging across multiple micro services.
Some of these services are deployed on premise and some of them are on AWS/GCP.
Our services are either .NET or nodejs based apps and we are invested in winston for nodejs and nlog in .net.
I was looking # integrating our on-premise nodejs application with stackdriver logging. Looking # https://cloud.google.com/logging/docs/setup/nodejs the documentation it seems that there we need to install the agent for any machine other than the google compute instances. Is this correct?
if we need to install the agent then is there any way where I can test the logging during my development? The development environment is either a windows 10/mac.
There's a new option for ingesting logs (and metrics) with Stackdriver as most of the non-google environment agents look like they are being deprecated. https://cloud.google.com/stackdriver/docs/deprecations/third-party-apps
A Google post on logging on-prem resources with stackdriver and Blue Medora
https://cloud.google.com/solutions/logging-on-premises-resources-with-stackdriver-and-blue-medora
for logs you still need to install an agent on each box to collect the logs, it's a BindPlane agent not a Google agent.
For node.js, you can use the #google-cloud/logging-winston and #google-cloud/logging-bunyan modules from anywhere (on-prem, AWS, GCP, etc.). You will need to provide projectId and auth credentials manually if not running on GCP. Instructions on how to set these up is available in the linked pages.
When running on GCP we figure out the exact environment (App Engine, Compute Engine, etc.) automatically and the logs should up under those resources in the Logging UI. If you are going to use the modules from your development machines, we will report the logs against the 'global' resource by default. You can customize this by passing a specific resource descriptor yourself.
Let us know if you run into any trouble.
I tried setting this up on my local k8s cluster. By following this: https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/
But i couldnt get it to work, the fluentd-gcp-v2.0-qhqzt keeps crashing.
Also, the page mentions that there are multiple issues with stackdriver logging if you DONT use it on google GKE. See the screenshot.
I think google is trying to lock you in into GKE.

Using CloudWatch API to get statistics

I have deployed a LAMP stack application on AWS. I need to monitor that using CloudWatch.
Can someone guide me on how to use the CloudWatch API for GetMetrics for CPU utilization? The AWS documentation is very scarce.
I see that the putmetrics call will let me create my own metrics.
My requirement is that I need to display those metric results in a mobile app.
My app monitors a project deployed on AWS. The alerts and metrics that come in must stream into the app.
I don't want just the metrics data in the AWS console,
I want it viewable in my mobile app. The app is developed in MEAN stack.
I must also add that the app is deployed on AWS and the application that is
being monitored is also in there(its a LAMP stack). I have managed to set 2 endpoints(HTTP and DB) and I have written
simple scripts in Javascript to monitor them. But ideally they should happen via Cloudwatch.
Providing a piece of code that replicates the issue that you are seeing normally allows who sees the question to help you better than guessing what you're doing.
Are you using an SDK to do this? What language/version?
here are links to the API docs:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricStatistics.html
http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_ListMetrics.html
The pattern is to list the metrics and after that use the result and feed it into getmetricsstatistics.
In your specific case, googling the issue a bit before might answer the question before you ask it on SO. For example:
https://forums.aws.amazon.com/thread.jspa?messageID=295740
This can happen when you are hitting the wrong endpoint. Check if you are hitting endpoint of the right AWS service.
For example, trying to hit DynamoDB's endpoint when you want to access CloudWatch APIs.