Docker run in AWS ECS

I have a command that I am currently running from my OS to run a docker container that takes in a file as an argument and returns some output.
docker run --rm -v ${pwd}:/dir IMAGE [COMMAND] [ARGS]
This allows me to run this container each time I get a new file, obtain the output, and spin down the container. I would like to move this to AWS, but I'm a bit unsure how I'd replicate the ad-hoc nature of this command. Does AWS support docker run?

So, the short answer is no, ECS does not give you an ad-hoc docker run the way you use it locally. However, you can build it so you spin up tasks dynamically when a file is uploaded to, say, S3.
What you want to do is:
S3 Event Notification => EventBridge => StepFunction => ECS RunTask API
With a setup like this, you will be able to run a task to process each file as soon as it is uploaded.
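For illustration, the RunTask call at the end of that chain looks roughly like the following AWS CLI command (the cluster, task definition, subnet, security group, and container names here are placeholders); Step Functions makes the same API call for you and can pass the bucket/key from the S3 event into the container override:
aws ecs run-task \
  --cluster my-cluster \
  --launch-type FARGATE \
  --task-definition file-processor:1 \
  --network-configuration 'awsvpcConfiguration={subnets=[subnet-0123456789abcdef0],securityGroups=[sg-0123456789abcdef0],assignPublicIp=ENABLED}' \
  --overrides '{"containerOverrides":[{"name":"processor","command":["COMMAND","s3://my-bucket/incoming/file.txt"]}]}'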
Useful links:
https://aws.amazon.com/blogs/aws/new-use-amazon-s3-event-notifications-with-amazon-eventbridge/
https://aws.amazon.com/blogs/compute/introducing-the-amazon-eventbridge-service-integration-for-aws-step-functions/
https://docs.aws.amazon.com/step-functions/latest/dg/connect-ecs.html

There are a gazillion ways you can refactor your application to run on AWS, as others have described. If you do not want to go down that path, perhaps the closest architecture to what you are already doing on your laptop would be AWS ECS plus the EFS integration (as described in this blog). You can define a task (with a container) that mounts an EFS share. By populating that EFS share with the file you need, the container can access that file and work on it just like it does locally on your laptop.
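A rough sketch of what such a task definition could look like, registered with the AWS CLI (the file system ID, image, role, and names are placeholders; for Fargate, EFS volumes also require platform version 1.4.0 or later when the task is run):
cat > taskdef.json <<'EOF'
{
  "family": "file-processor",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "processor",
      "image": "123456789012.dkr.ecr.eu-west-1.amazonaws.com/image:latest",
      "essential": true,
      "command": ["COMMAND", "/dir/input-file"],
      "mountPoints": [
        { "sourceVolume": "shared-files", "containerPath": "/dir" }
      ]
    }
  ],
  "volumes": [
    {
      "name": "shared-files",
      "efsVolumeConfiguration": { "fileSystemId": "fs-12345678", "rootDirectory": "/" }
    }
  ]
}
EOF
aws ecs register-task-definition --cli-input-json file://taskdef.json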

Related

Can I improve my setup in AWS for running (machine learning) python scripts in a container when a file is uploaded to S3?

I have a working setup in AWS that looks something like this: whenever a file is uploaded to S3, it triggers a Lambda, which in turn triggers a CodeBuild project. The CodeBuild project is based on a Docker image (stored in ECR) and runs a few bash commands, mainly executing Python files inside the Docker image. That works really well, actually.
The files in S3 are updated approximately once a day and each CodeBuild execution takes around 4 minutes.
I was asked why I am not using Fargate/SageMaker (the scripts are basically machine learning retraining and predictions). I was just wondering whether there would be any advantages in using Fargate and/or SageMaker for this. Is it possible, for example, to use Fargate and execute bash commands inside the container when triggered?
IIUC, you're wondering about the difference between CodeBuild and Fargate/SageMaker.
Price
Calculate the price of these three products using the links below:
Pricing Fargate
Pricing SageMaker
Pricing CodeBuild
As you said, you're using the Docker image as the main training tool, so maybe Fargate is more suitable for your scenario.

Python pipeline on AWS Cloud

I have a few Python scripts which need to be executed in sequence on AWS, so what are the best and simplest options? These scripts are proof of concept, so a little bit dirty, but they need to run overnight. Most of the scripts finish within 10 minutes, but a couple of them can take up to 1 hour running on a single core.
We do not have any servers running Jenkins, Airflow, etc.; we are planning to use existing AWS services.
Please let me know. Thanks.
1) EC2 Instance (Manually controlled)
- Upload your scripts to an S3 bucket
- Use the default VPC and launch an EC2 instance
- Use an SSM remote session to log in
- Run the AWS CLI (aws s3 sync) to download the scripts from S3
- Run them manually
- Stop the instance when done
To keep it clean, make a shell script (or a master .py file) to do the work. If you want it to stop charging you money afterwards, add a command to stop the instance when it completes (see the sketch below).
Least amount of work
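A minimal sketch of such a master script, assuming the instance profile allows reading the bucket and stopping the instance (the bucket, paths, and script names are made up):
#!/bin/bash
set -euo pipefail
# Pull the scripts down from S3
aws s3 sync s3://my-script-bucket/scripts/ /home/ssm-user/scripts/
cd /home/ssm-user/scripts
# Run the POC scripts in sequence
python3 step1.py
python3 step2.py
python3 step3.py
# Stop this instance so it stops charging you
# (if IMDSv2 is enforced, fetch a session token first)
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
aws ec2 stop-instances --instance-ids "$INSTANCE_ID"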
2) If you want to run scripts daily
- Script out the work above (including modifying the Auto Scaling group at the end so it scales back down)
- Create an EC2 Auto Scaling group and launch it on a cron schedule (see the sketch below)
It will start up, do the work, and then shut down and stop charging you.
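For example, the scheduling part can be two scheduled scaling actions (the group name and times are placeholders): one to bring a box up at night and one as a safety net to make sure the group is back at zero:
# Scale the group to one instance at 02:00 UTC every night
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name python-pipeline-asg \
  --scheduled-action-name nightly-start \
  --recurrence "0 2 * * *" \
  --desired-capacity 1 --min-size 0 --max-size 1
# Force the group back to zero a few hours later
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name python-pipeline-asg \
  --scheduled-action-name nightly-stop \
  --recurrence "0 6 * * *" \
  --desired-capacity 0 --min-size 0 --max-size 1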
3) Lambda
Pretty much like option 2, but AWS will do most of the work for you.
Either put all your scripts into one Lambda, or put each script into its own Lambda and have a master Lambda that synchronously invokes each one in the order you want.
Have a CloudWatch Events (EventBridge) rule trigger it daily to do the work (see the sketch below).
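A sketch of wiring up the daily trigger with the CLI (the function name, ARNs, region, and schedule are placeholders):
# Daily schedule rule
aws events put-rule \
  --name daily-python-pipeline \
  --schedule-expression "cron(0 2 * * ? *)"
# Point the rule at the master Lambda
aws events put-targets \
  --rule daily-python-pipeline \
  --targets 'Id=1,Arn=arn:aws:lambda:eu-west-1:123456789012:function:pipeline-master'
# Allow EventBridge to invoke the function
aws lambda add-permission \
  --function-name pipeline-master \
  --statement-id daily-python-pipeline \
  --action lambda:InvokeFunction \
  --principal events.amazonaws.com \
  --source-arn arn:aws:events:eu-west-1:123456789012:rule/daily-python-pipeline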
I would say that if you are in POC mode, option 1 is the best decision. It is likely closest to how you are currently executing things. This is what #jarmod recommended already.
You didn't mention which AWS resources your Python scripts need to access, or at least the purpose of the scripts, so it is difficult to provide a solution.
However, a good option is to use AWS Batch.
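For reference, once a job queue and job definition exist, each nightly run is a single call (the names are placeholders):
aws batch submit-job \
  --job-name nightly-pipeline-run \
  --job-queue python-pipeline-queue \
  --job-definition python-pipeline-jobdef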

Container is not able to call S3 in Fargate

I'm not able to synchronize a log folder to S3 from inside a container.
I'm trying to get the following setup working:
- a Docker container with the AWS CLI installed
- log files and other files generated inside the container
- a cron job which calls the "aws s3 sync" command through a shell script
The synchronisation is not working properly and I'm not sure why.
I tried the following, which worked just fine:
- provided an access key/secret access key inside the Docker container
  - this worked locally, with plain ECS, and with Fargate
  - but it's not recommended to use access keys
- plain ECS without any keys (just the IAM role)
  - this worked too
I played a little with the configuration and read through the documentation.
The only hints I have are:
- Does it have something to do with the network mode "awsvpc" (which Fargate has to use)?
- Does it have something to do with the "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI" environment variable? I found a few hits on the web, but I'm not sure whether it's set or not, and I'm not able to look inside the container in Fargate.
An ECS task definition has two parameters related to IAM roles:
executionRoleArn - grants ECS the permissions it needs to start the task or container, such as pulling images from ECR and writing logs to CloudWatch.
taskRoleArn - allows the task itself to make AWS API calls to interact with AWS resources such as S3, etc.
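A sketch of attaching both roles when registering the task definition (the ARNs, image, and names are placeholders):
aws ecs register-task-definition \
  --family log-sync \
  --requires-compatibilities FARGATE \
  --network-mode awsvpc \
  --cpu 256 --memory 512 \
  --execution-role-arn arn:aws:iam::123456789012:role/ecsTaskExecutionRole \
  --task-role-arn arn:aws:iam::123456789012:role/s3-sync-task-role \
  --container-definitions '[{"name":"app","image":"123456789012.dkr.ecr.eu-west-1.amazonaws.com/app:latest","essential":true}]'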
In my case I had a shell script which I called via the entrypoint in the task definition. I had correctly set the task role with access to S3, however it did not work. So, using the information provided here https://forums.aws.amazon.com/thread.jspa?threadID=273767#898645,
I added the following as the first line of my shell script:
export AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
It still did not work. Then I upgraded the AWS CLI in the Docker container to version 2 and it worked. So for me the real problem was that the Docker image had an old CLI version.
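Since the question runs the sync from cron, it is also worth noting that cron jobs do not inherit the container's environment, so AWS_CONTAINER_CREDENTIALS_RELATIVE_URI may simply not be visible to them. A rough sketch of one way to persist it at container start (the script names and paths here are made up):
#!/bin/sh
# entrypoint.sh: save the credentials URI set by the ECS agent where cron scripts can source it
echo "export AWS_CONTAINER_CREDENTIALS_RELATIVE_URI=$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI" > /etc/profile.d/ecs-creds.sh
crond   # start the cron daemon (crond on Alpine; cron on Debian-based images)
exec "$@"
#!/bin/sh
# sync-logs.sh, called from the crontab (bucket name is a placeholder)
. /etc/profile.d/ecs-creds.sh
aws s3 sync /var/log/app s3://my-log-bucket/app/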

Continuous Deployment of Docker Compose App to AWS/EC2

I've been trying to find an efficient way to handle continuous deployment with a Docker compose setup and AWS hosting.
So far I've looked into CodeDeploy, S3 buckets, and ECS. My application is relatively small, with only 3 Docker services: a Django app, NGINX, and PostgreSQL. I was unable to find any reliable information on using CodeDeploy with Docker Compose, and because of the small scale ECS seems impractical. I've considered an S3 bucket, but that seems no better than just deploying my application with something like git or scp.
What is a standard way of deploying a Docker Compose setup on AWS? If possible I would like to use Bitbucket Pipelines or CircleCI to perform the deployment in a manually triggered step after running tests. But I've been unable to find a solution that would easily let me copy over the code (which is in a git repo on a production branch and is how I get the code onto the production server at the moment).
I would like to add some possibilities to #gasc's answer.
It would be better if you made a CloudFormation template for deploying your EC2 resources with all the required security groups, auto scaling, and other pieces.
Then create the AMI with Docker Compose installed, plus anything else you require in your EC2 environment.
Then you can use a CodeDeploy pipeline; AWS also provides a private container registry (ECR) that you may want to use.
The rest of the steps are the same: just SCP the compose file onto the EC2 instance, run
docker-compose up
and you are done.
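A minimal sketch of that last deploy step from a pipeline (the host, user, and paths are placeholders):
scp docker-compose.yml ec2-user@your-ec2-host:/opt/app/docker-compose.yml
ssh ec2-user@your-ec2-host 'cd /opt/app && docker-compose pull && docker-compose up -d'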
Let me know if you want more help; I'm open for discussion.
What I would do in your case is:
1 - If needed, update your docker-compose.yml file (or whatever you called it) to version 3 or higher, to use swarm.
2 - During your pipeline, build all the images needed and push them to a registry.
3 - In your pipeline, scp your compose file to a manager node.
4 - Deploy your application using swarm (docker stack deploy -c <your-docker-compose-file> your_app_name). This way you can handle rolling updates and scale easily.
Note that if you want to use multiple nodes, you need to open a few ports on them (for swarm: 2377/tcp, 7946/tcp and udp, and 4789/udp). A condensed sketch of the pipeline steps follows.
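Here is such a sketch; the registry, image, and host names are placeholders:
# Build and push the images referenced by the compose file
docker build -t registry.example.com/myapp/web:latest .
docker push registry.example.com/myapp/web:latest
# Ship the compose file to a manager node and deploy the stack
scp docker-compose.yml deploy@swarm-manager:/home/deploy/docker-compose.yml
ssh deploy@swarm-manager 'docker stack deploy -c /home/deploy/docker-compose.yml myapp'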
I see you mentioned that ECS might seem impractical at such a small scale - in my opinion, not necessarily. It would require you to rewrite your docker-compose.yml into task and service definitions, but since there aren't a lot of services, that shouldn't take you much time.

build and push docker image to AWS ECR using lambda

Is it possible to automate building a Docker image from code committed to GitHub (no tests involved) and then pushing it to AWS ECR using a Lambda function?
You cannot do it with Lambda alone, as Lambda is not really a suitable execution environment for the Docker daemon (which is necessary to build the images). However, you can use Lambda + SNS to trigger an endpoint that points to a service you develop, hosted on EC2, which runs the docker build command after a git clone (you can use something similar to Python's Fabric, fabfile.org, or any framework that allows you to execute server commands).
You can certainly extend this idea, perhaps by bringing the EC2 build machine up from an AMI that automates all of this, etc.
The big point here is that you don't really have control over what's provisioned in Lambda, so you need EC2.
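For illustration, the build script such an EC2 service would run could look roughly like this (the repository URL, region, and account ID are placeholders; it assumes Docker, git, and AWS CLI v2 are installed and the instance role allows ECR pushes):
#!/bin/bash
set -euo pipefail
REGION=eu-west-1
ACCOUNT_ID=123456789012
REPO=myapp
REGISTRY="$ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com"
# Fresh checkout of the code that was just pushed to GitHub
rm -rf "/tmp/$REPO" && git clone --depth 1 https://github.com/example/myapp.git "/tmp/$REPO"
cd "/tmp/$REPO"
# Authenticate Docker against ECR, then build and push the image
aws ecr get-login-password --region "$REGION" | docker login --username AWS --password-stdin "$REGISTRY"
docker build -t "$REGISTRY/$REPO:latest" .
docker push "$REGISTRY/$REPO:latest"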