Automate AWS instance start and stop - amazon-web-services

I'm running a instance in amazon AWS and it runs non-stop everyday. I'm using ubuntu ec2 instance which is running Apache, Mirthconnect tool and LAMP server. I want to run this instance only on particular time duration of a day. I prefer not use any additional AWS services such as cloud-watch . Is there a way we could acheive this?.
The major purpose is for using Mirthconnect fetching data from mysql database

There are 3 solutions.
AWS Data Pipeline - You can schedule the instance start/stop just like cron. It will cost you one hour of t1.micro instance for every start/stop
AWS Lambda - Define a lambda function that gets triggered at a pre defined time. Your lambda function can start/stop instances. Your cost will be very minimal or $0
Write a shell script and run it as a cron job or run it on demand. The script will have AWS CLI command to start and stop the instance.
I used Data Pipeline for a long time before moving to Lambda. Data Pipeline is very trivial. Just paste the AWS CLI commands to stop and start instances. Lambda is more involved.

I guess for that you'll need another machine which is on 24x7. On which you can write cron job in python using boto or any other language like bash.
I don't see how you start a instance in stopped state without using any other machine.
Or you can have a simple raspberry pi on at your home which does the ON-OFF work for you using AWS CLI or simple Python. How about that? ;)

Related

aws autoscaling group ec2 instance to run script every minute

I want to have my ec2 instances to start running some automation script within the server once they are launched in the auto scaling group. This script will need to run every minute so userdata is probably not a good option. I wonder if there is a way to do this with ssm document. Any ideas would be much appreciated!
Do you need to store the logs of the script on either S3 or Cloudwatch? Then use an EventBridge rule and a SSM Run Command document.
Do you not care about such logs? Then use your operating system native tools: cron on Linux or task scheduler on Windows.

How to run cron job only on single instance in AWS AutoScaling?

I have scheduled 2 cronjobs for my application.
My Application server is in an autoscaling group and I kept a minimum of 2 instances because of High availability. Everything working is fine but cron job is running multiple times because of 2 instances in autoscaling.
I could not limit the instance size to 1 because already my application in the production environment I prefer to have HA.
How should I have to limit execute cron job on a single instance? or should i have to use other services like AWS Lamda or AWS ELasticBeanstalk
Firstly you should consider whether running the crons on these instances is suitable. If you're trying to keep this highly available and it is directly interacted via customers what will the impact of the crons performance be?
Perhaps consider using a separate autoscaling group or instance with a total of 1 instances to run these crons? You could launch the instance or update the autoscaling group just before the cron needs to run and then automate the shutdown after it has completed.
Otherwise you would need to consider using a locking mechanism for your script. By using this your script write a lock to confirm that it is in process, at the beginning of the script run it would check whether there was any script lock in progress. To further prevent the chance of a collision between multiple servers consider adding jitter (random seconds of sleep) to the start of your script.
Suitable technologies for writing a lock are below:
DynamoDB using strongly consistent reads.
EFS for a Linux application, or FSX for a Windows application.
S3 using strong consistency.
Solutions suggested by Chris Williams sound reasonable if using lambda function is not an option.
One way to simulate cron job is by using CloudWatch Events (now known as EventBridge) in conjunction with AWS Lambda.
First you need to write a Lambda function with the code that needs to be executed on a schedule. Lambda supports cron expressions.
You can then use Schedule Expressions with EventBridge/CloudWatch Event in the same way as a cron tab and mention the Lambda function as target.
you can enable termination protection on of the instance. Attach necessary role & permission for system manager. once the instance is available under managed instance under system manager you can create a schedule event in cloudwatch to run ssm documents. if you are running a bash script convert that to ssm document and set this doc as targate. or you can use shellscript document for running commands

Python pipeline on AWS Cloud

I have few python scripts which need to be executed in sequence on AWS Cloud so what are the best and simplest options? These script files are proof of concept so little bit dirty also but need to run overnight. Most of the script finishes within 10 mins but couple of them can take up to 1 hour running on a single core.
We do not have any servers like Jenkins, airflow etc...we are planning to use existing aws services.
Please let me know, Thanks.
1) EC2 Instance (Manually controlled)
Upload your scripts to an S3 bucket Use default VPC
launch EC2 Instance
Use SSM Remote session to log in
Run AWS CLI (AWS S3 Sync to download from S3)
Run them Manually
stop instance when done.
To be clean, make a SH file (or master .py file) to do the work. If you want it to stop charging you money afterwards, add command to stop instance when complete.
Least amount of work
2) If you want to run scripts daily
- Script out the work above (include modifying the Autoscale group at end to go to one box)
- Create an EC2 Auto Scale Group and launch it on a CRON job schedule.
It will start up, do the work, and then shut down and stop charging you.
3) Lambda
Pretty much like option 2, but AWS will do most of the work for you.
Either put all your scripts into one lambda..or put each script into its own lambda and have a master that does sync invoke of each script in the order you want.
You have a cloudwatch alarm trigger daily and does the work
I would say that if you are in POC mode, option 1 is best decision. It is likely closest to what you already do where you are currently executing. This is what #jarmod recommended already.
You didn't mention anything about which AWS resources your python scripts need to access or at least the purpose of the scripts, so it is difficult to provide a solution.
However a good option is to use AWS Batch.

Run a batch file on EC2 from a (python) lambda

I can see a generic way of starting an EC2 from lambda in Start and Stop Instances at Scheduled Intervals Using Lambda and CloudWatch.
Suppose I use that method to start an EC2, and suppose the AMI is a windows server 2019 customised to have a .bat file on the desktop, and also suppose I'm using a python lambda.
How can I execute this batch file from the lambda? (i.e. just as though someone had RDP'd into the instance and double-clicked on it)
Note: To be very clear, basically I want to start the EC2 using the method given in the AWS docs (above), and right after the instance has started, to run the batch file that will be sitting on the instance's desktop
I think you have a few concepts mixed together.
AWS Lambda functions run on the Lambda service, without having to use Amazon EC2 instances. This is what makes them "serverless".
If you have a batch file on an Amazon EC2 instance, you would presumably want to run that batch file on the EC2 instance itself, without involving Lambda (since you have got a server).
If you wish to run a script on an EC2 instance when it launches for the first time, you can provide a PowerShell or Command-Line script via the User Data field. Software on the AMI will automatically execute this script the first time that the instance starts.
This script could do all the work itself, or it could simply call another script that is stored on the disk. Some people use the script to download another script from a repository (eg Amazon S3 or GitHub) and then execute the downloaded script.
For more information, see: Running Commands on Your Windows Instance at Launch - Amazon Elastic Compute Cloud
If the Amazon EC2 instance is already running and you wish to trigger a script to execute, you can use the AWS Systems Manager Run Command. This works by having an agent on the instance which can be remotely triggered, thereby running scripts without having to login to the instance.

AWS: Automating queries in redshift

I want to automate a redshift insert query to be run every day.
We actually use Aws environment. I was told using lambda is not the right approach. Which is the best ETL process to automate a query in Redshift.
For automating SQL on Redshift you have 3 options (at least)
Simple - cron
Use a EC2 instance and set up a cron job on that to run your SQL code.
psql -U youruser -p 5439 -h hostname_of_redshift -f your_sql_file
Feature rich - Airflow (Recommended)
If you have a complex schedule to run then it is worth investing time learning and using apache airflow. This also needs to run on a server(ec2) but offers a lot of functionality.
https://airflow.apache.org/
AWS serverless - AWS data pipeline (NOT Recommended)
https://aws.amazon.com/datapipeline/
Cloudwatch->Lambda->EC2 method described below by John Rotenstein
This is a good method when you want to be AWS centric, it will be cheaper than having a dedicated EC2 instance.
One option:
Use Amazon CloudWatch Events on a schedule to trigger an AWS Lambda function
The Lambda function launches an EC2 instance with a User Data script. Configure Shutdown Behavior as Terminate.
The EC2 instance executes the User Data script
When the script is complete, it should call sudo shutdown now -h to shutdown and terminate the instance
The EC2 instance will only be billed per-second.
Redshift now supports scheduled queries natively: https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-schedule-query.html
You can use boto3 and psycopg2 to run the queries by creating a python script
and scheduling it in cron to be executed daily.
You can also try to convert your queries into Spark jobs and schedule those jobs to run in AWS Glue daily. If you find it difficult, you can also look into Spark SQL and give it a shot. If you are going with Spark SQL, keep in mind the memory usage as Spark SQL is pretty memory intensive.