using CloudWatch Events to trigger cron jobs on an EC2 instance overkill? - amazon-web-services

We have an EC2 server that runs cronjobs. Currently there is a crontab on that server that holds the cronjob settings. Everything runs perfectly fine on this server.
Would it be overkill to use AWS Cloudwatch Events to trigger the crons instead? ie create a cloudwatch event that calls a lambda to run a shell command on the EC2 instance.
My thinking is that these would be possible benefits:
no need to manage a crontab file on the EC2 server
easier to activate/deactivate specific cronjobs

looks like there are indeed benefits according to the AWS Docs:
https://aws.amazon.com/blogs/compute/scheduling-ssh-jobs-using-aws-lambda/
Decouple job schedule and AMI: If your cron jobs are part of an AMI, each schedule change requires you to create a new AMI version, and update existing instances running with that AMI. This is both cumbersome and time-consuming. Using scheduled Lambda functions, you can keep the job schedule outside of your AMI and change the schedule on the fly.
Flexible targeting of EC2 instances: By abstracting the job schedule from AMI and EC2 instances, you can flexibly target a subset of your EC2 instance fleet based on tags or other conditions. In this example, we are targeting EC2 instances with the “Environment=Dev” tag.
Intelligent scheduling: With scheduled Lambda functions, you can add custom logic to you abstracted job scheduler.

In my experience it's not an over kill at all. I have used same setup with great success running job(s) (around 50 different jobs) with heavy workload.
My setup was slightly different
The cloudwatch scheduled event was calling a lambda which in turn was putting a messages on a sqs and application in running on ec2 instance(s) was grabbing messages from the sqs and processing them.
The sqs was simply added for robustness.
But this may or may not make sense in your use case.

Related

AWS architecture for create EC2 windows (using AMI), run a my_command.bat, process and terminate EC2 monthly

I am not sure if this is the right platform to ask this type of question but this is my last hope as I am new to AWS.
I have requirement:
Given: AMI ID (has my_software installed, my_command.bat created to run my_software using CLI).
ToDo (Repeats every month):
Create an EC2 instance (Windows) from the given AMI-ID (Windows).
Run my_command.bat file in the instance which runs my_software which generates report.csv and log.csv files.
Send report.csv to my_s3_bucket and log to CloudWatch.
Terminate the instance. (Stop is not enough).
Please suggest the architecture for the same.
I do similar and it works as follows:
Create docker image which runs your command (it should log and write to s3 as required).
Create a batch environment which can pull the image and run the container.
Create Cloudwatch event to run the job at your desired cron schedule
The event target is an SQS queue
Create a lambda function which responds to events on the SQS queue
The lambda function submits the batch job
Note in theory you can call the Lambda function from the Cloudwatch event rule, but I have to use a queue due to a private subnet.
Batch Fargate is a good fit for this, because it will only use resources whilst your job is running. If you need to use EC2, ensure that MinvCpus is zero, so that the instance terminates after it has run. This can take a few minutes, which you will be charged for, so perhaps Fargate is a better fit.

How to run cron job only on single instance in AWS AutoScaling?

I have scheduled 2 cronjobs for my application.
My Application server is in an autoscaling group and I kept a minimum of 2 instances because of High availability. Everything working is fine but cron job is running multiple times because of 2 instances in autoscaling.
I could not limit the instance size to 1 because already my application in the production environment I prefer to have HA.
How should I have to limit execute cron job on a single instance? or should i have to use other services like AWS Lamda or AWS ELasticBeanstalk
Firstly you should consider whether running the crons on these instances is suitable. If you're trying to keep this highly available and it is directly interacted via customers what will the impact of the crons performance be?
Perhaps consider using a separate autoscaling group or instance with a total of 1 instances to run these crons? You could launch the instance or update the autoscaling group just before the cron needs to run and then automate the shutdown after it has completed.
Otherwise you would need to consider using a locking mechanism for your script. By using this your script write a lock to confirm that it is in process, at the beginning of the script run it would check whether there was any script lock in progress. To further prevent the chance of a collision between multiple servers consider adding jitter (random seconds of sleep) to the start of your script.
Suitable technologies for writing a lock are below:
DynamoDB using strongly consistent reads.
EFS for a Linux application, or FSX for a Windows application.
S3 using strong consistency.
Solutions suggested by Chris Williams sound reasonable if using lambda function is not an option.
One way to simulate cron job is by using CloudWatch Events (now known as EventBridge) in conjunction with AWS Lambda.
First you need to write a Lambda function with the code that needs to be executed on a schedule. Lambda supports cron expressions.
You can then use Schedule Expressions with EventBridge/CloudWatch Event in the same way as a cron tab and mention the Lambda function as target.
you can enable termination protection on of the instance. Attach necessary role & permission for system manager. once the instance is available under managed instance under system manager you can create a schedule event in cloudwatch to run ssm documents. if you are running a bash script convert that to ssm document and set this doc as targate. or you can use shellscript document for running commands

AWS "Lambda" Needing More CPU/RAM - Trigger EC2 "Job?"

I have a daily process that needs to digest a tremendous amount of data from two external sources. It normally requires around 28GB or RAM, and a decent amount of processing power. Due to this, an AWS Lambda won't work.
In the meantime, I've been running the process on an EC2 instance. In order to save resources, I've attempted to start the instance using a CloudWatch event. Since no event exists for "StartEC2," I'm kicking off a AWS Lambda instead, which in turn starts the EC2 isntance using Amazon support libraries.
All of this is extremely cumbersome, and I've been looking for a library or pattern that can do what I want. Essentially, I need to start an EC2 instance on a cron/event, deliver a unit of work to it (Shell Script, Java App, whatever), have it run it, then shutdown.
I'd love any suggestions for accomplishing this.
Look into AWS Systems Manager (SSM), you can create an Automation document that will launch the instance, run any custom scripts or tasks, and shut it down again when you're done. You can trigger the SSM Automation with a cron schedule via CloudWatch Events.
You may also want to consider AWS Batch for this type of workload.

Scheduling the stopping/starting of an EC2 instance when not in use by a Beanstalk Deployment or an ECS task?

I have a Docker image containing Python code and third-party binary executables. There are only outbound network requests. The image must run hourly and each execution lasts ~3 minutes.
I can:
Use an EC2 instance and schedule hourly execution via cron
Create a CloudWatch Event/Rule to run an ECS Task Defintion hourly
Setup an Elastic Beanstalk environment and schedule hourly deployment of the image
In all of these scenarios, an EC2 instance is running 24/7 and I am being charged for extended periods of no usage.
How do I accomplish scheduling the starting of an existing EC2 instance hourly and the stopping of said instance after the completion of my docker image?
Here's one approach I can think of. It's very high-level, and omits some details, but conceptually it would work just fine. You'll also need to consider the Identity & Access Management (IAM) Roles used:
CloudWatch Event Rule to trigger the Step Function
AWS Step Function to trigger the Lambda function
AWS Lambda function to start up EC2 instances
EC2 instance polling the Step Functions service for Activity Tasks
Create a CloudWatch Event Rule to schedule a periodic task, using a cron expression
The Target of the CloudWatch Event Rule is an AWS Step Function
The AWS Step Function State Machine starts by triggering an AWS Lambda function, which starts the EC2 instance
The next step in the Step Functions State Machine invokes an Activity Task, representing the Docker container that needs to execute
The EC2 instance has a script running on it, which polls the Activity Task for work
The EC2 instance executes the Docker container, waits for it to finish, and sends a completion message to the Step Functions Activity Task
The script running on the EC2 instance shuts itself down
The AWS Step Function ends
Keep in mind that a potentially better option would be to spin up a new EC2 instance every hour, instead of simply starting and stopping the same instance. Although you might get better startup performance by starting an existing instance vs. launching a new instance, you'll also have to spend time to maintain the EC2 instance like a pet: fix issues if they crop up, or patch the operating system periodically. In today's world, it's a commonly accepted practice that infrastructure should be disposable. After all, you've already packaged up your application into a Docker container, so you most likely don't have overly specific expectations around which host that container is actually being executed on.
Another option would be to use AWS Fargate, which is designed to run Docker containers, without worrying about spinning up and managing container infrastructure.
AWS Step Functions
AWS Fargate
Blog: AWS Fargate: An Overview
Creating a CloudWatch Event Rule that triggers on a schedule

PHP AWS Elastic Beanstalk background workers

I have deployed my application using Elastic Beanstalk, since this gives me a very easy deployment flow, to multiple instances at once using the "git aws.push".
I like to add background processing support to my application. The background worker will use the same codebase, and simply start up a long lived php script that continuously looks for tasks to execute. What AWS should i use to create such a worker instance?
Should i use the EB for this aswell or should i try to setup a standard EC2 instance (since i dont need it to be public available) ? I guess thats the right way of doing it and then create a deployment flow that make it easy to deploy to both my EC2 worker instances and to Elastic beanstalk app? or is there a better way of doing this?
AWS EB now adds support for Worker Instances. They're just a different kind of environment which those two differences:
They don't have a cnamePrefix (whatever.elasticbeanstalk.com)
Instead, they've got a SQS queue bound
On each instance, they run a daemon called sqsd which basically poll their environments' sqs queue and forward it to the local http server.
I believe it is worth a try.
If the worker is just polling a queue for jobs and does not require an ELB, then all you need to do is work with EC2, SQS, and probably S3. You can start EC2 instances as part of an Auto-scaling group that, for example, is configured to scale as a function of depth of the SQS queue. When there is no work to do you can have minimum # of EC2, but if the queue gets deep, auto-scaling will spin up more.