How can we monitor the system status checks of all EC2 instances at once, rather than setting up a CloudWatch alarm for each EC2 instance individually?
If it's not possible via the CloudWatch service, can it be done using boto3?
If you don't want to set up alarms individually, you can automate it. Create a CloudWatch Events rule that fires when a new instance changes state (from pending to running) and triggers a Lambda function; in the Lambda function, create the CloudWatch alarm for that instance. For already existing instances, you can create the alarms with a small modification of the same script.
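A minimal sketch of such a Lambda function, assuming the rule matches "EC2 Instance State-change Notification" events. The alarm name prefix, period, and evaluation settings are illustrative choices, and the extra `cloudwatch` parameter is only there so the handler can be exercised without AWS credentials:

```python
def build_status_alarm(instance_id, sns_topic_arn=None):
    """Build put_metric_alarm arguments for the system status check."""
    params = {
        "AlarmName": f"system-status-check-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,              # assumed: evaluate every minute
        "EvaluationPeriods": 2,    # assumed: two failing periods in a row
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    }
    if sns_topic_arn:
        # Optional notification target; the ARN is supplied by the caller.
        params["AlarmActions"] = [sns_topic_arn]
    return params


def lambda_handler(event, context, cloudwatch=None):
    """Create the alarm when an instance reaches the 'running' state."""
    detail = event.get("detail", {})
    if detail.get("state") != "running":
        return None
    if cloudwatch is None:
        import boto3  # imported lazily so the sketch loads without boto3
        cloudwatch = boto3.client("cloudwatch")
    params = build_status_alarm(detail["instance-id"])
    cloudwatch.put_metric_alarm(**params)
    return params["AlarmName"]
```

For existing instances, the same `build_status_alarm` helper could be called in a loop over the instance IDs returned by `describe_instances`.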
We have an EC2 server that runs cronjobs. Currently there is a crontab on that server that holds the cronjob settings. Everything runs perfectly fine on this server.
Would it be overkill to use AWS CloudWatch Events to trigger the cron jobs instead? That is, create a CloudWatch event that calls a Lambda function to run a shell command on the EC2 instance.
My thinking is that these would be possible benefits:
no need to manage a crontab file on the EC2 server
easier to activate/deactivate specific cronjobs
It looks like there are indeed benefits, according to this AWS blog post:
https://aws.amazon.com/blogs/compute/scheduling-ssh-jobs-using-aws-lambda/
Decouple job schedule and AMI: If your cron jobs are part of an AMI, each schedule change requires you to create a new AMI version, and update existing instances running with that AMI. This is both cumbersome and time-consuming. Using scheduled Lambda functions, you can keep the job schedule outside of your AMI and change the schedule on the fly.
Flexible targeting of EC2 instances: By abstracting the job schedule from AMI and EC2 instances, you can flexibly target a subset of your EC2 instance fleet based on tags or other conditions. In this example, we are targeting EC2 instances with the “Environment=Dev” tag.
Intelligent scheduling: With scheduled Lambda functions, you can add custom logic to your abstracted job scheduler.
In my experience it's not overkill at all. I have used the same kind of setup with great success, running around 50 different jobs with heavy workloads.
My setup was slightly different.
The CloudWatch scheduled event called a Lambda function, which in turn put a message on an SQS queue, and an application running on the EC2 instance(s) grabbed messages from the queue and processed them.
The SQS queue was simply added for robustness.
But this may or may not make sense in your use case.
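A rough sketch of that flow, assuming boto3 and an existing queue. The queue URL, the job name, and the `run_job` callback are hypothetical placeholders, and the `sqs` parameter is injected only so the code can be tried without AWS access:

```python
import json


def build_job_message(job_name, event):
    """Serialize the job request; 'time' comes from the scheduled event."""
    return json.dumps({"job": job_name, "scheduled_at": event.get("time")})


def lambda_handler(event, context, sqs=None,
                   queue_url="https://sqs.us-east-1.amazonaws.com/123456789012/jobs",  # placeholder
                   job_name="nightly-report"):  # placeholder job name
    """Scheduled Lambda: put one message on the queue per invocation."""
    if sqs is None:
        import boto3  # imported lazily so the sketch loads without boto3
        sqs = boto3.client("sqs")
    body = build_job_message(job_name, event)
    sqs.send_message(QueueUrl=queue_url, MessageBody=body)
    return body


def poll_once(sqs, queue_url, run_job):
    """Worker on the EC2 instance: fetch a batch, run each job, then delete it."""
    resp = sqs.receive_message(
        QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
    )
    handled = 0
    for msg in resp.get("Messages", []):
        run_job(json.loads(msg["Body"]))
        # Delete only after the job succeeded, so failures are retried.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
        handled += 1
    return handled
```

Deleting the message only after the job has run is what gives the robustness: if the worker crashes mid-job, the message becomes visible again and another instance can pick it up.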
We are looking into adding autoscaling to a portion of our application using Amazon SQS, which would launch EC2 on-demand or spot instances based on the queue backlog.
One question I have is how to deal with collecting logs from autoscaled instances. New instances are spun up from an image, but they are shut down when their work is complete. Currently, if there is an issue with one of our services that causes it to crash, we have a system that automatically restarts the service, and the logs and core dump files remain on the server to review. If we switch to an autoscaling system where new instances are spun up, how do you get the logs and core dump files when there is a failure, particularly if the instance has been spun down?
Good practice is to ship these logs and aggregate them somewhere else, and there are many services such as DataDog and Rapid7 which will do this for you at a cost.
AWS however provides CloudWatch logs, which gives you a central place to store and view logs. It also allows you then to give users access to logs on the AWS console without them having to ssh onto a server.
Shipping your logs to CloudWatch logs requires the installation of the CloudWatch agent on your server and specifying in the config which logs to ship.
You could install the CloudWatch agent once and create an AMI of that server to use in your autoscaling group, or install and configure the CloudWatch agent in user data every time a server is spun up.
All the information you need to get started can be found here:
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Install-CloudWatch-Agent.html
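As a rough illustration of the "which logs to ship" part, here is a small sketch that generates the `logs` section of the agent's JSON configuration. The file paths and log group names are made-up examples; `{instance_id}` is a placeholder the agent fills in itself, so each autoscaled instance gets its own log stream:

```python
import json


def build_agent_config(log_groups_by_path):
    """Build the 'logs' section of a CloudWatch agent config.

    log_groups_by_path maps a file path on the server to the
    CloudWatch Logs group it should be shipped to.
    """
    collect_list = [
        {
            "file_path": path,
            "log_group_name": group,
            # Literal placeholder, resolved by the agent on each instance.
            "log_stream_name": "{instance_id}",
        }
        for path, group in sorted(log_groups_by_path.items())
    ]
    return {"logs": {"logs_collected": {"files": {"collect_list": collect_list}}}}


def render_agent_config(log_groups_by_path):
    """Return the config as JSON text, ready to write to the agent's config file."""
    return json.dumps(build_agent_config(log_groups_by_path), indent=2)


if __name__ == "__main__":
    # Hypothetical application log and log group names.
    print(render_agent_config({"/var/log/myapp/service.log": "myapp/service"}))
```

The generated file can be baked into the AMI or written out from user data, matching the two options above.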
I'm trying to automate turning a Redis cluster on and off in AWS. I saw the following link for reference (https://forums.aws.amazon.com/thread.jspa?threadID=149772). Is there a way to do it via CloudWatch?
I am very new to the AWS platform.
1. Check the documentation on scaling in/out: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/redis-cluster-resharding-online.html It also has commands to reshard a cluster manually.
2. Check the CloudWatch metrics for the Redis cluster: https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.HostLevel.html and https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/CacheMetrics.Redis.html Choose the metrics that will trigger autoscaling.
3. You can trigger an AWS Lambda function on an event for a metric: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/RunLambdaSchedule.html
4. From the Lambda function you can call the AWS CLI to reshard the cluster as described in 1. Example: https://alestic.com/2016/11/aws-lambda-awscli/
5. If you need to turn off the cluster completely, use https://docs.aws.amazon.com/cli/latest/reference/elasticache/delete-cache-cluster.html instead of the resharding commands.
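As an alternative to shelling out to the AWS CLI from Lambda, the resharding call can also be made through boto3. The sketch below is an assumption-laden example, not the method from the links above: the replication group ID and the target shard count are placeholders, and it only covers scaling out (scaling in additionally requires telling ElastiCache which node groups to remove or retain):

```python
def build_reshard_request(replication_group_id, target_shards):
    """Build arguments for modify_replication_group_shard_configuration."""
    if target_shards < 1:
        raise ValueError("a cluster needs at least one shard")
    return {
        "ReplicationGroupId": replication_group_id,
        "NodeGroupCount": target_shards,
        "ApplyImmediately": True,  # online resharding is applied immediately
    }


def scale_out(replication_group_id, target_shards, elasticache=None):
    """Ask ElastiCache to reshard the cluster to target_shards shards."""
    request = build_reshard_request(replication_group_id, target_shards)
    if elasticache is None:
        import boto3  # imported lazily so the sketch loads without boto3
        elasticache = boto3.client("elasticache")
    elasticache.modify_replication_group_shard_configuration(**request)
    return request


def lambda_handler(event, context):
    # Hypothetical: the triggering event carries the cluster ID and target size.
    return scale_out(event["replication_group_id"], int(event["target_shards"]))
```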
I had a couple of questions on AWS:
1. Is there a way to recreate/write AWS CloudWatch metrics to DynamoDB?
2. If an Amazon EC2 instance is deleted or if I change a VPC, I need to recreate all CloudWatch metrics manually every time. Is there a way to automate CloudWatch metric creation for every new VPC instance? Through Terraform, I can only create CloudWatch metric alarms, events and logs, but not CloudWatch metrics (e.g., EC2 and RDS metrics).
I could achieve #1 via the AWS CLI and a Python script, writing the metrics to DynamoDB as well. #2 is still open.
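For #1, a sketch of what such a Python script might look like with boto3 (this is not the poster's actual script). The table key names `metric_id` and `timestamp` are assumptions about the table schema; `cloudwatch` is a boto3 CloudWatch client and `table` a boto3 DynamoDB `Table` resource:

```python
from decimal import Decimal


def datapoints_to_items(metric_name, instance_id, datapoints):
    """Convert CloudWatch datapoints into DynamoDB items, oldest first."""
    items = []
    for dp in sorted(datapoints, key=lambda d: d["Timestamp"]):
        items.append({
            "metric_id": f"{instance_id}#{metric_name}",   # assumed partition key
            "timestamp": dp["Timestamp"].isoformat(),      # assumed sort key
            # The DynamoDB resource API rejects floats, so store a Decimal.
            "average": Decimal(str(dp["Average"])),
            "unit": dp["Unit"],
        })
    return items


def copy_metric(cloudwatch, table, instance_id, metric_name, start, end, period=300):
    """Read one EC2 metric from CloudWatch and write it to a DynamoDB table."""
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName=metric_name,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=start,
        EndTime=end,
        Period=period,
        Statistics=["Average"],
    )
    items = datapoints_to_items(metric_name, instance_id, resp["Datapoints"])
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
    return len(items)
```

It would be called with something like `copy_metric(boto3.client("cloudwatch"), boto3.resource("dynamodb").Table("metrics"), "i-0abc", "CPUUtilization", start, end)`, where the table name is again a placeholder.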
Say I set up AWS CloudWatch logging on an EC2 instance to centralize logs from various files. If I have auto-scaling and a new machine gets started up due to high traffic, will the newly copied machine start sending logs too? Does logging work with auto-scaling?
As long as the CloudWatch Logs agent is installed and configured on the AMI that is used for auto-scaling, the logs for the new instance(s) will be sent to CloudWatch. You can use the Instance ID when configuring the CloudWatch Logs agent to be able to identify which instance originated the event in the logs.
Also, make sure the instances have the necessary IAM role policy to publish the logs to CloudWatch.
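For reference, a sketch of what such a policy document could contain, built as a Python dict so it can be dumped to JSON. The broad `arn:aws:logs:*:*:*` resource is an assumption for brevity and could be narrowed to specific log groups:

```python
import json


def build_logs_policy(resource="arn:aws:logs:*:*:*"):
    """Build an IAM policy document that lets an instance publish logs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                    "logs:DescribeLogStreams",
                ],
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON to attach to the instance role used by the auto-scaling group.
    print(json.dumps(build_logs_policy(), indent=2))
```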