AWS ECS leader commands (django migrate) - django

We are currently deploying our Django app on AWS Elastic Beanstalk. There we execute the Django DB migrations using container commands, and we ensure migrations run on only one instance by using the "leader_only" restriction.
We are considering moving our deployment to AWS EC2 Container Service (ECS). However, we cannot figure out a way to ensure the migration is run on only one container when a new image is deployed.
Is it possible to configure leader_only commands in AWS EC2 Container Service?

There is a way to use ECS built-in functionality to handle deployments that involve migrations. Basically, the idea is the following:
Make containers fail their health checks if they are running against an unmigrated database, e.g. by adding a custom view that checks whether a migration plan still exists (a sketch of such a view follows below):
plan = executor.migration_plan(executor.loader.graph.leaf_nodes())
status = 503 if plan else 200
Make a task definition that does nothing more than migrate the database, and make sure it is scheduled for execution with the rest of the deployment process.
The result is that the deployment process will try to bring up one new container. This new container will fail its health checks as long as the database is not migrated, and will thus block the rest of the deployment (so you will still have the old instances running to serve requests). Once the migration is done, the health check succeeds, so the deployment unblocks and proceeds.
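A minimal sketch of such a health-check view (the view name and URL wiring are my assumptions, not part of the original answer):
from django.db import DEFAULT_DB_ALIAS, connections
from django.db.migrations.executor import MigrationExecutor
from django.http import HttpResponse

def migration_health_check(request):
    # Build the plan of migrations that would still need to be applied.
    executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
    plan = executor.migration_plan(executor.loader.graph.leaf_nodes())
    # 503 while unapplied migrations exist, 200 once the database is fully migrated.
    return HttpResponse(status=503 if plan else 200)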
This is by far the most elegant solution I was able to find in terms of running Django migrations on Amazon ECS.
Source: https://engineering.instawork.com/elegant-database-migrations-on-ecs-74f3487da99f
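If you take this approach, one possible way (not prescribed by the answer above) to kick off the one-off migration task from a deploy script is boto3; the cluster and task definition names below are hypothetical:
import boto3

ecs = boto3.client("ecs")

# Start a one-off task whose container runs only "python manage.py migrate".
response = ecs.run_task(
    cluster="my-cluster",            # hypothetical cluster name
    taskDefinition="django-migrate", # hypothetical migration task definition family
    count=1,
)
task_arn = response["tasks"][0]["taskArn"]

# Wait for the migration task to stop, then check how it exited.
ecs.get_waiter("tasks_stopped").wait(cluster="my-cluster", tasks=[task_arn])
described = ecs.describe_tasks(cluster="my-cluster", tasks=[task_arn])
exit_code = described["tasks"][0]["containers"][0].get("exitCode")
if exit_code != 0:
    raise SystemExit(f"Migration task failed with exit code {exit_code}")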

Look at using Container Dependency in your task definition to make your application container wait for a migration container to complete successfully. Here's a brief example of the containerDefinitions component of a task definition:
{
"name": "migration",
"image": "my-django-image",
"essential": false,
"command": ["python3", "mange.py migrate"]
},
{
"name": "django",
"image": "my-django-image",
"essential": true,
"dependsOn": [
{
"containerName": "migration",
"condition": "SUCCESS"
}
]
}
The migration container starts, runs the migrate command, and exits. If successful, then the django container is launched. Of course, if your service is running multiple tasks, each task will run in this fashion, but once migrations have been run once, additional migrate commands will be a no-op, so there's no harm.

For those using task definition JSON, all we need to do is flag a container as not essential in containerDefinitions:
{
"name": "migrations",
"image": "your-image-name",
"essential": false,
"cpu": 24,
"memory": 200,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "your-logs-group",
"awslogs-region": "your-region",
"awslogs-stream-prefix": "your-log-prefix"
}
},
"command": [
"python3", "manage.py", "migrate"
],
"environment": [
{
"name": "ENVIRON_NAME",
"value": "${ENVIRON_NAME}"
}
]
}
I flagged this container as "essential": false.
"If the essential parameter of a container is marked as true, and that container fails or stops for any reason, all other containers that are part of the task are stopped. If the essential parameter of a container is marked as false, then its failure does not affect the rest of the containers in a task. If this parameter is omitted, a container is assumed to be essential."
source: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task_definition_parameters.html

Honestly, I have not figured this out. I have encountered exactly the same limitation on ECS (as well as others, which made me abandon it, but that is off topic).
Potential workarounds:
1) Run migrations inside your init script. This has the flaw that it runs on every node at the time of deployment. (I assume you have multiple replicas.)
2) Add this as a step of your CI flow.
Hope I helped a bit; if I come up with another idea, I'll report back here.

It's not optimal, but you can simply run it as a command in the task definition:
"command": ["/bin/sh", "-c", "python manage.py migrate && gunicorn -w 3 -b :80 app.wsgi:application"],

Related

How can I set up Continuous Integration of a Dockerized application to Elastic Beanstalk?

I'm new to Docker, and my previous experience is with deploying Java web applications (running in Tomcat containers) to Elastic Beanstalk. The pipeline I'm used to goes something like this: a commit is checked into git, which triggers a Jenkins job, which builds the application JAR (or WAR) file, publishes it to Artifactory, and then deploys that same JAR to an application in Elastic Beanstalk using eb deploy. (Apologies if "pipeline" is a reserved term; I'm using it conceptually.)
Incidentally, I'm also going to be using Gitlab for CI/CD instead of Jenkins (due to organizational reasons out of my control), but the jump from Jenkins to Gitlab seems straightforward to me -- certainly more so than the jump from deploying WARs directly to deploying Dockerized containers.
Moving over into the Docker world, I imagine the pipeline will go something like this: a commit is checked into git, which triggers the Gitlab CI, which will then build the JAR or WAR file, publish it to Artifactory, then use the Dockerfile to build the Docker image, publish that Docker image into Amazon ECR (maybe?)... and then I'm honestly not sure how the Elastic Beanstalk integration would proceed from there. I know it has something to do with the Dockerrun.aws.json file, and presumably needs to call the AWS CLI.
I just got done watching a webinar from Amazon called Running Microservices and Docker on AWS Elastic Beanstalk, which stated that in the root of my repo there should be a Dockerrun.aws.json file which essentially defines the integration to EB. However, it seems that JSON file contains a link to the individual Docker image in ECR, which is throwing me off. Wouldn't that link change every time a new image is built? I'm imagining that the CI would need to dynamically update the JSON file in the repo... which almost feels like an anti-pattern to me.
In the webinar I linked above, the host created his Docker image and pushed it ECR manually, with the CLI. Then he manually uploaded the Dockerrun.aws.json file to EB. He didn't need to upload the application however, since it was already contained within the Docker image. This all seems odd to me and I question whether I'm understanding things correctly. Will the Dockerrun.aws.json file need to change on every build? Or am I thinking about this the wrong way?
In the 8 months since I posted this question, I've learned a lot and we've already moved on to different and better technology. But I will post what I learned in answer to my original question.
The Dockerrun.aws.json file is almost exactly the same as an ECS task definition. It's important to use the Multi-Docker container deployment version of Beanstalk (as opposed to the single container), even if you're only deploying a single container. IMO they should just get rid of the single-container platform for Beanstalk as it's pretty useless. But assuming you have Beanstalk set to the Multi-Container Docker platform, then the Dockerrun.aws.json file looks something like this:
{
"AWSEBDockerrunVersion": 2,
"containerDefinitions": [
{
"name": "my-container-name-this-can-be-whatever-you-want",
"image": "my.artifactory.com/docker/my-image:latest",
"environment": [],
"essential": true,
"cpu": 10,
"memory": 2048,
"mountPoints": [],
"volumesFrom": [],
"portMappings": [
{
"hostPort": 80,
"containerPort": 80
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/aws/elasticbeanstalk/my-image/var/log/stdouterr.log",
"awslogs-region": "us-east-1",
"awslogs-datetime-format": "%Y-%m-%d %H:%M:%S.%L"
}
}
}
]
}
If you decide, down the road, to convert the whole thing to an ECS service instead of using Beanstalk, that becomes really easy, as the sample JSON above is converted directly to an ECS task definition by extracting the "containerDefinitions" part. So the equivalent ECS task definition might look something like this:
[
{
"name": "my-container-name-this-can-be-whatever-you-want",
"image": "my.artifactory.com/docker/my-image:latest",
"environment": [
{
"name": "VARIABLE1",
"value": "value1"
}
],
"essential": true,
"cpu": 10,
"memory": 2048,
"mountPoints": [],
"volumesFrom": [],
"portMappings": [
{
"hostPort": 0,
"containerPort": 80
}
],
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/aws/ecs/my-image/var/log/stdouterr.log",
"awslogs-region": "us-east-1",
"awslogs-datetime-format": "%Y-%m-%d %H:%M:%S.%L"
}
}
}
]
Key differences here are that with the Beanstalk version, you need to map port 80 to port 80 because a limitation of running Docker on Beanstalk is that you cannot replicate containers on the same instance, whereas in ECS you can. This means that in ECS you can map your container port to host port "zero," which really just tells ECS to pick a random port in the ephemeral range which allows you to stack multiple replicas of your container on a single instance. Secondly with ECS if you want to pass in environment variables, you need to inject them directly into the Task Definition JSON. In Beanstalk world, you don't need to put the environment variables in the Dockerrun.aws.json file, because Beanstalk has a separate facility for managing environment variables in the console.
In fact, the Dockerrun.aws.json file should really just be thought of as a template. Because Docker on Beanstalk uses ECS under-the-hood, it simply takes your Dockerrun.aws.json as a template and uses it to generate its own Task Definition JSON, which injects the managed environment variables into the "environment" property in the final JSON.
One of the big questions I had at the time when I first asked this question was whether you had to update this Dockerrun.aws.json file every time you deployed. What I discovered is that it comes down to a choice of how you want to deploy things. You can, but you don't have to. If you write your Dockerrun.aws.json file so that the "image" property references the :latest Docker image, then there's no need to ever update that file. All you need to do is bounce the Beanstalk instance (i.e. restart the environment), and it will pull whatever :latest Docker image is available from Artifactory (or ECR, or wherever else you publish your images). Thus, all a build pipeline would need to do is publish the :latest Docker image to your Docker repository, and then trigger a restart of the Beanstalk environment using the awscli, with a command like this:
$ aws elasticbeanstalk restart-app-server --region=us-east-1 --environment-name=myapp
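If you'd rather trigger that restart from code than shell out to the awscli, a roughly equivalent boto3 call (using the same placeholder environment name) would look like this:
import boto3

eb = boto3.client("elasticbeanstalk", region_name="us-east-1")
# Equivalent of the awscli call above: bounce the app server so it pulls :latest again.
eb.restart_app_server(EnvironmentName="myapp")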
However, there are a lot of drawbacks to that approach. If you have a dev/unstable branch that publishes a :latest image to the same repository, you are at risk of deploying that unstable branch if the environment happens to restart on its own. Thus, I would recommend versioning your Docker tags and only deploying the version tags. So instead of pointing to my-image:latest, you would point to something like my-image:1.2.3. This does mean that your build process would have to update the Dockerrun.aws.json file on each build. And then you also need to do more than just a simple restart-app-server.
In this case, I wrote some bash scripts that made use of the jq utility to programmatically update the "image" property in the JSON, replacing the string "latest" with whatever the current build version was. Then I would have to make a call to the awsebcli tool (note that this is a different package than the normal awscli tool) to update the environment, like this:
$ eb deploy myapp --label 1.2.3 --timeout 1 || true
Here I'm doing something hacky: the eb deploy command unfortunately takes FOREVER. (This was another reason we switched to pure ECS; Beanstalk is unbelievably slow.) That command hangs for the entire deployment time, which in our case could take up to 30 minutes or more. That's completely unreasonable for a build process, so I force the process to timeout after 1 minute (it actually continues the deployment; it just disconnects my CLI client and returns a failure code to me even though it may subsequently succeed). The || true is a hack that effectively tells Gitlab to ignore the failure exit code, and pretend that it succeeded. This is obviously problematic because there's no way to tell if the Elastic Beanstalk deployment really did fail; we're assuming it never does.
One more thing on using eb deploy: by default this tool will automatically try to ZIP up everything in your build directory and upload that entire ZIP to Beanstalk. You don't need that; all you need is to update the Dockerrun.aws.json. In order to do this, my build steps were something like this:
Use jq to update the Dockerrun.aws.json file with the latest version tag (a Python equivalent of this step and the next is sketched after this list)
Use zip to create a new ZIP file called deploy.zip and put Dockerrun.aws.json inside it
Make sure a file called .elasticbeanstalk/config.yml is in place (described below)
Run the eb deploy ... command
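As a rough Python equivalent of steps 1 and 2 (the answer itself used jq and zip; the version string here is an assumed input from your pipeline):
import json
import zipfile

version = "1.2.3"  # assumed: the build version produced by your pipeline

# Step 1: point every "image" property at the versioned tag instead of :latest.
with open("Dockerrun.aws.json") as f:
    dockerrun = json.load(f)
for container in dockerrun["containerDefinitions"]:
    repository = container["image"].rsplit(":", 1)[0]
    container["image"] = f"{repository}:{version}"
with open("Dockerrun.aws.json", "w") as f:
    json.dump(dockerrun, f, indent=2)

# Step 2: package only Dockerrun.aws.json into deploy.zip for eb deploy.
with zipfile.ZipFile("deploy.zip", "w") as zf:
    zf.write("Dockerrun.aws.json")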
Then you need a file in the build directory at .elasticbeanstalk/config.yml which looks like this:
deploy:
  artifact: deploy.zip
global:
  application_name: myapp
  default_region: us-east-1
  workspace_type: Application
The awsebcli knows to automatically look for this file when you call eb deploy. And what this particular file says is to look for a file called deploy.zip instead of trying to ZIP up the whole directory itself.
So the :latest method of deployment is problematic because you risk deploying something unstable; the versioned method of deployment is problematic because the deployment scripts are more complicated, and because unless you want your build pipelines to take 30+ minutes, there's a chance that the deployment won't be successful and there's really no way to tell (outside of monitoring each deployment yourself).
Anyways, it's a bit more work to set up, but I would recommend migrating to ECS whenever you can. (Better still to migrate to EKS, though that's a lot more work.) Beanstalk has a lot of problems.

how to provide environment variables to AWS ECS task definition?

In the task definition on ECS, I have provided environment variable as following:
Key as HOST_NAME and the value as something.cloud.com
On my local machine I use this docker run command and I'm able to pass in my env variables, but through the task definition the variables are not being passed to the container.
The docker run command below works locally, but how do I set it up in the task definition in AWS ECS?
docker run -e HOST_NAME=something.cloud.com sid:latest
You should call it name and not key; see the example below:
{
"name": "nginx",
"image": "",
"portMappings": [
{
"containerPort": 80,
"hostPort": 80
}
],
"environment": [
{
"name": "HOST_NAME",
"value": "something.cloud.com"
}
]
}
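If you register task definitions from code rather than the console, the same name/value pairs apply. Here is a minimal boto3 sketch; the family, image, and memory values are made up for illustration:
import boto3

ecs = boto3.client("ecs")

ecs.register_task_definition(
    family="my-web-task",  # hypothetical task definition family
    containerDefinitions=[
        {
            "name": "nginx",
            "image": "nginx:latest",  # hypothetical image
            "memory": 128,
            "portMappings": [{"containerPort": 80, "hostPort": 80}],
            # Environment variables are "name"/"value" pairs, not "key"/"value".
            "environment": [
                {"name": "HOST_NAME", "value": "something.cloud.com"},
            ],
        }
    ],
)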
If you used the new Docker Compose integration with ECS, then you will need to update the stack.
It is smart enough to update only the parts that changed. In my case, the task definition was not picking up new environment variables set in a .env file and mounted into the Docker container.
Run the same command you used to create the stack, only that this time round it'll update it (only the parts that changed):
docker compose --context your-ecs-context -f your.docker-compose.yml up
For more: https://docs.docker.com/engine/context/ecs-integration/#rolling-update
You can set the hostname variable in the task definition JSON file:
hostname
Type: string
Required: no
The hostname to use for your container. This parameter maps to Hostname in the Create a container section of the Docker Remote API and the --hostname option to docker run.

AWS ElasticBeanstalk Multidocker is not creating an ECS task with a correct Cmd

I am deploying my node app on AWS ElasticBeanstalk using the multidocker option. The app appears to deploy successfully, however, I notice that the app (inside my docker) is not actually running.
When I docker inspect <container_id> the running container, I see that "Cmd": null. If I inspect the ECS task definition created by Beanstalk, I also see "command": null.
However, if I run the container manually (via docker run -d myimage:latest), I see "Cmd": ["node", "server.js"] and the application serves correctly. This is the correct CMD that is included inside my Dockerfile.
How come my ECS task definition does not read the CMD from my Docker image correctly? Am I supposed to add a command to my Dockerrun.aws.json? I couldn't find any documentation for this.
Dockerrun.aws.json:
{
"AWSEBDockerrunVersion": 2,
"volumes": [
{
"name": "node-app",
"host": {
"sourcePath": "/var/app/current/node-app"
}
}
],
"containerDefinitions": [
{
"name": "node-app",
"image": "1234565.dkr.ecr.us-east-1.amazonaws.com/my-node-app:testing",
"essential": true,
"memory": 128,
"portMappings": [
{
"hostPort": 3000,
"containerPort": 3000
}
]
}
]
}
I have the same issue. For me, it turned out that the entrypoint took care of running the command. The underlying issue remains, but it might be interesting to see what your entrypoint looks like when you inspect the image versus the container.
See also:
What is the difference between CMD and ENTRYPOINT in a Dockerfile?

How to write files from Docker image to EFS?

Composition
Jenkins server on EC2 instance, uses EFS
Docker image for above Jenkins server
Need
Write templates to directory on EFS each time ECS starts the task which builds the Jenkins server
Where is the appropriate place to put a step to do the write?
Tried
If I do it in the Dockerfile, it writes to the Docker image, but never propagates the changes to EFS so that the templates are available as projects on the Jenkins server.
I've tried putting the write command in jenkins.sh, but I can't figure out how that is run; in any case, it doesn't place the templates where I need them.
The original question included:
Write templates to directory on EFS each time ECS starts the task
In addition to @luke-peterson's answer, you can use a shell script as the entry point in your Dockerfile, in order to copy files between the mounted EFS folder and the container.
Instead of ENTRYPOINT, use the following directive in your Dockerfile:
CMD ["sh", "/app/startup.sh"]
And inside startup.sh you can copy files freely and run the app (.net core app in my example):
cp -R /app/wwwroot/. /var/jenkins-home
dotnet /app/app.dll
Of course, you can also do it programmatically inside the app itself.
You need to start the task with a volume, then mount that volume into the container. This way you have persistent storage across multiple Jenkins start/stop cycles.
Your task definition would look something like the below (I've removed the non-relevant parts). The important components are mountPoints and volumes. Note that this is not the same as volumesFrom, as you aren't mounting volumes from another container, but rather running them in a single task.
This also assumes you're running Jenkins in the default JENKINS_HOME directory as well as having mounted your EFS drive to /mnt/efs/jenkins-home on the EC2 instance.
{
"requiresAttributes": ...
"taskDefinitionArn": ... your ARN ...,
"containerDefinitions": [
{
"portMappings": ...
.... more config here .....
"mountPoints": [
{
"containerPath": "/var/jenkins_home",
"sourceVolume": "jenkins-home",
}
]
}
],
"volumes": [
{
"host": {
"sourcePath": "/mnt/efs/jenkins-home"
},
"name": "jenkins-home"
}
],
"family": "jenkins"
}
Task definition within ECS (screenshot from the ECS console not reproduced here).

Deploy Docker on AWS Beanstalk with Docker Compose

I'm trying to deploy multiple node.js microservices on AWS Beanstalk, and I want them to be deployed on the same instance. It's my first time deploying multiple services, so there are some failures I need someone to help me out with. So, I tried to package them in Docker containers first, using Docker Compose to manage the structure. It's up and running locally in my virtual machine, but when I deployed it to Beanstalk, I ran into a few problems.
What I know:
I know I have to choose to deploy as multi-container docker.
The best practice to manage multiple node.js services is using Docker Compose.
I need a dockerrun.aws.json for node.js app.
I need to create task definition for that ecs instance.
Where I have problems:
I can only find dockerrun.aws.json and task_definition.json templates for PHP, so I can't verify whether my configuration for node.js in those two JSON files is in the correct shape.
It seems like docker-compose.yml, dockerrun.aws.json and task_definition.json are doing similar jobs. I must keep the task_definition, but do I still need dockerrun.aws.json?
I tried to run the task in ecs, but it stopped right away. How can I check the log for the task?
I got:
No ecs task definition (or empty definition file) found in environment
because my task will always stop immediately. If I can check the log, it will be much easier for me to do troubleshooting.
Here is my task_definition.json:
{
"requiresAttributes": [],
"taskDefinitionArn": "arn:aws:ecs:us-east-1:231440562752:task-definition/ComposerExample:1",
"status": "ACTIVE",
"revision": 1,
"containerDefinitions": [
{
"volumesFrom": [],
"memory": 100,
"extraHosts": null,
"dnsServers": null,
"disableNetworking": null,
"dnsSearchDomains": null,
"portMappings": [
{
"hostPort": 80,
"containerPort": 80,
"protocol": "tcp"
}
],
"hostname": null,
"essential": true,
"entryPoint": null,
"mountPoints": [
{
"containerPath": "/usr/share/nginx/html",
"sourceVolume": "webdata",
"readOnly": true
}
],
"name": "nginxexpressredisnodemon_nginx_1",
"ulimits": null,
"dockerSecurityOptions": null,
"environment": [],
"links": null,
"workingDirectory": null,
"readonlyRootFilesystem": null,
"image": "nginxexpressredisnodemon_nginx",
"command": null,
"user": null,
"dockerLabels": null,
"logConfiguration": null,
"cpu": 99,
"privileged": null
}
],
"volumes": [
{
"host": {
"sourcePath": "/ecs/webdata"
},
"name": "webdata"
}
],
"family": "ComposerExample"
}
I had a similar problem and it turned out that I archived the containing folder directly in my Archive.zip file, thus giving me this structure in the Archive.zip file:
RootFolder
- Dockerrun.aws.json
- Other files...
It turned out that by archiving only the RootFolder's content (and not the folder itself), Amazon Beanstalk recognized the ECS Task Definition file.
Hope this helps.
For me, it was simply a case of ensuring the name of the file matched the exact casing as described in the AWS documentation.
dockerfile.aws.json had to be exactly Dockerfile.aws.json
Similar problem. What fixed it for me was using the CLI tools instead of zipping myself; just running eb deploy worked.
For me, the file was missing from CodeCommit. After adding the Dockerrun.aws.json to git, it worked.
I got here due to the same error. My issue was that I was deploying with a label using:
eb deploy --label MY_LABEL
What you need to do is quote the label:
eb deploy --label 'MY_LABEL'
I've had this issue as well. For me the problem was that Dockerrun.aws.json wasn't added in git. eb deploy detects the presence of git.
I ran eb deploy --verbose to figure this out:
INFO: Getting version label from git with git-describe
INFO: creating zip using git archive HEAD
It further lists all the files that'll go into the zip, and Dockerrun.aws.json isn't there.
git status reports this:
On branch master
Your branch is up to date with 'origin/master'.
Untracked files:
(use "git add <file>..." to include in what will be committed)
Dockerrun.aws.json
nothing added to commit but untracked files present (use "git add" to track)
Adding the file to git and committing helped.
In my specific case I could just remove the .git directory in a scripted deploy.
In my case, I had not committed the Dockerrun.aws.json file after creating it, so using eb deploy failed with the same error.