I have a Dockerfile based on public.ecr.aws/lambda/provided:al2. I am trying to run this script:
#!/bin/sh
cd /tmp
echo 'before nohup'
nohup xvfb-run chromedriver --allowed-ips=127.0.0.1 &
echo 'after nohup'
disown
Locally I tested with docker run --mount type=tmpfs,destination=/tmp,tmpfs-size=536870912 --read-only --rm serverless-chromedriver-dev:appimage and it worked. On AWS Lambda, my logs show ["before nohup","after nohup"] as the output of the script. However, /tmp/nohup.out does not exist and ps aux does not show chromedriver.
I also checked the value of $PATH, and /usr/bin/ is on it, which is where nohup and su both are. I verified that /tmp is writable and that there is space: /dev/vdd 537560 1016 524708 1% /tmp
Overall I am flabbergasted that there is any difference at all between local Docker and an AWS Lambda function. Wasn't the point of Docker to avoid exactly this?
The Docker awslogs documentation states:
the default AWS shared credentials file (~/.aws/credentials of the root user)
Yet if I copy my AWS credentials file there:
sudo bash -c 'mkdir -p $HOME/.aws; cp .aws/credentials $HOME/.aws/credentials'
... and then try to use the driver:
docker run --log-driver=awslogs --log-opt awslogs-group=neiltest-deleteme --rm hello-world
The result is still the dreaded error:
docker: Error response from daemon: failed to initialize logging driver: failed to create Cloudwatch log stream: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors.
Where does this file really need to go? Is it because the Docker daemon isn't running as root but rather some other user and, if so, how do I determine that user?
NOTE: I can work around this on systems using systemd by setting environment variables. But this doesn't work on Google CloudShell where the Docker daemon has been started by some other method.
Ah ha! I figured it out and tested this on Debian Linux (on my Chromebook w/ Linux VM and Google CloudShell):
The .aws folder must be in the root of the filesystem (/.aws), not in the root user's $HOME folder!
Based on that I was able to successfully run the following:
pushd $HOME; sudo bash -c 'mkdir -p /.aws; cp .aws/* /.aws/'; popd
docker run --log-driver=awslogs --log-opt awslogs-region=us-east-1 --log-opt awslogs-group=neiltest-deleteme --rm hello-world
I initially figured this all out by looking at the Docker daemon's process information:
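# pgrep -x matches the process name exactly, so unrelated matches cannot slip in
DOCKERD_PID=$(pgrep -x dockerd | head -n 1)
# environ entries are NUL-separated; print one per line
sudo cat /proc/$DOCKERD_PID/environ | tr '\0' '\n'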
The confusing bit is that Docker's documentation here is wrong:
the default AWS shared credentials file (~/.aws/credentials of the root user)
The true location is /.aws/credentials. I believe this is because the daemon starts before $HOME is actually defined since it's not running as a user process. So starting a shell as root will tell you a different story for tilde or $HOME:
sudo sh -c 'cd ~/; echo $PWD'
That outputs /root but using /root/.aws/credentials does not work!
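The check described above can be narrowed to a single variable. This is a minimal sketch, assuming a Linux host with /proc; environ_value is a hypothetical helper name, not part of Docker. It splits the NUL-separated environ data into lines and prints the value of one variable, for example HOME, so you can see which home directory, if any, the daemon was started with.

```shell
#!/bin/sh
# environ_value NAME: read NUL-separated "KEY=value" entries on stdin
# (the format of /proc/<pid>/environ) and print the value of NAME.
# Hypothetical helper for illustration; values containing newlines are not handled.
environ_value() {
    tr '\0' '\n' | sed -n "s/^$1=//p"
}

# Usage (root is needed to read another user's process):
#   sudo cat "/proc/$DOCKERD_PID/environ" | environ_value HOME
```

If nothing is printed, the variable was not set in the daemon's environment at all.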
I am trying to run a Docker image on an EC2 instance using GitLab CI/CD.
I want to expose port 5000 for the application.
I am aware that this job will work the first time but fail on subsequent runs, because Docker does not allow a second container to bind the same port. So I am trying to implement a fail-safe: before running, check for an existing container; if one exists, stop and remove it, and then run the image on port 5000.
The problem is that when this job runs for the first time there is nothing to stop, and docker stop needs at least one argument in the current command.
Is there a way to run this command conditionally, so that it runs only if a container exists and is skipped otherwise?
deploy:
  stage: deploy
  before_script:
    - chmod 400 $SSH_KEY
  script: ssh -o StrictHostKeyChecking=no -i $SSH_KEY ec2-user@ecxxxxx-xxxx.ap-southeast-1.compute.amazonaws.com "
    docker login -u $REGISTRY_USER -p $REGISTRY_PASS &&
    docker ps -aq | xargs docker stop | xargs docker rm &&
    docker run -d -p 5000:5000 $IMAGE_NAME:$IMAGE_TAG"
Error in the pipeline:
"docker stop" requires at least 1 argument.
See 'docker stop --help'.
Usage: docker stop [OPTIONS] CONTAINER [CONTAINER...]
Stop one or more running containers
"docker rm" requires at least 1 argument.
See 'docker rm --help'.
Usage: docker rm [OPTIONS] CONTAINER [CONTAINER...]
Remove one or more containers
The problem is with the xargs docker stop | xargs docker rm part of the command. Is there a way to solve this kind of problem?
Edit: This doesn't exactly answer my question. What if a junior engineer who doesn't know the name of the image is assigned the task of setting up a pipeline? This solution requires us to know the name in advance, so it won't work in that case.
As I understand it, you are not stopping an image; you are stopping and removing a container and then creating a new container that publishes port 5000.
So give the container a fixed name through a variable, so that it is the same every time it is created. The "|| true" makes the step succeed even when no container with that name exists yet, so the first run does not fail.
variables:
  CONTAINER_NAME: <your-container-name> # please give a name for the container to be created for this image
deploy:
  stage: deploy
  before_script:
    - chmod 400 $SSH_KEY
  script: ssh -o StrictHostKeyChecking=no -i $SSH_KEY ec2-user@ecxxxxx-xxxx.ap-southeast-1.compute.amazonaws.com "
    docker login -u $REGISTRY_USER -p $REGISTRY_PASS &&
    (docker stop $CONTAINER_NAME || true) &&
    (docker rm $CONTAINER_NAME || true) &&
    docker run -d -p 5000:5000 --name $CONTAINER_NAME $IMAGE_NAME:$IMAGE_TAG"
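If the container name is not known in advance, which is the case the question's edit raises, another option is to keep the docker ps -aq pipeline and make the empty case harmless. The sketch below assumes GNU xargs, whose -r (--no-run-if-empty) flag skips the command when there is no input. remove_containers and the DOCKER override are hypothetical names, added so the function can be dry-run without a daemon. It uses docker rm -f, which stops and removes in one step.

```shell
#!/bin/sh
# remove_containers: read container IDs on stdin and force-remove them.
# With GNU xargs, -r means "do not run the command if stdin is empty",
# so the first deployment (no containers yet) is a no-op instead of an error.
# DOCKER is a hypothetical override used only for dry runs.
remove_containers() {
    xargs -r "${DOCKER:-docker}" rm -f
}

# Usage on the EC2 host:
#   docker ps -aq | remove_containers
```

To limit the cleanup to whatever holds port 5000 rather than every container, docker ps also accepts a publish filter, for example docker ps -aq --filter publish=5000.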
This is my third day of tearing my hair out since the weekend, and I just cannot get ENTRYPOINT to work via gitlab runner 13.3.1. This is for something that previously worked with a simple ENTRYPOINT ["/bin/bash"], but that was using local Docker Desktop, with docker run followed by docker exec commands, which worked like a cinch. Essentially, at the end of it all I previously got a WAR file built.
Currently I build my container in gitlab runner 13.3.1, push it to an s3 bucket, then use IMAGE:localhost:500/my-recently-builtcontainer and try to do whatever I want with the container. But I cannot even get ENTRYPOINT to work, in its exec form or in its shell form; at least in the shell form I get to see something. In the exec form it just gave opaque "OCI runtime create failed" errors, so I shifted to the shell form just to see how far I could get.
I keep getting
sh: 1: sh: echo HOME=/home/nonroot-user params=#$ pwd=/ whoami=nonroot-user script=sh ENTRYPOINT reached which_sh=/bin/sh which_bash=/bin/bash PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin; ls -alrth /bin/bash; ls -alrth /bin/sh; /usr/local/bin/entrypoint.sh ;: not found
In my Dockerfile I distinctly have
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN bash -c "ls -larth /usr/local/bin/entrypoint.sh"
ENTRYPOINT "echo HOME=${HOME} params=#$ pwd=`pwd` whoami=`whoami` script=${0} ENTRYPOINT reached which_sh=`which sh` which_bash=`which bash` PATH=${PATH}; ls -alrth `which bash`; ls -alrth `which sh`; /usr/local/bin/lse-entrypoint.sh ;"
The output after I build the container in gitlab is below. I made sure anyone has rights to see this file and use it, just so that I can proceed with my work:
-rwxrwxrwx 1 root root 512 Apr 11 17:40 /usr/local/bin/entrypoint.sh
So I know it is there, and the permission flags indicate anybody can read and execute it, so I am perplexed why it says NOT FOUND:
/usr/local/bin/entrypoint.sh ;: not found
entrypoint.sh is ...
#!/bin/sh
export PATH=$PATH:/usr/local/bin/
clear
echo Script is $0
echo numOfArgs is $#
echo paramtrsPassd is $@
echo whoami is `whoami`
bash --version
echo "About to exec ....."
exec "$@"
It does not even reach inside this entrypoint.sh file.
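One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: in shell form Docker passes the text after ENTRYPOINT to /bin/sh -c as written, so the surrounding double quotes turn the whole line into a single word, and sh then looks for a command with that entire name. The sketch below reproduces the effect with plain sh and no Docker; the function names are hypothetical.

```shell
#!/bin/sh
# A double-quoted string is one word, so sh treats all of it,
# spaces and semicolon included, as the name of a command.
quoted_status() {
    sh -c '"echo hello; true"' >/dev/null 2>&1
    echo "$?"   # 127 is the shell's "command not found" status
}

# Without the outer quotes the same text is parsed as two commands.
unquoted_output() {
    sh -c 'echo hello; true'
}

quoted_status
unquoted_output
```

If that is what is happening here, dropping the outer double quotes from the shell-form ENTRYPOINT line would be the first thing to try.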
I am trying to use ECS for deployment with travis.
At one point everything was working but now it stopped.
I am following this tutorial https://testdriven.io/part-five-ec2-container-service/
There are 2 tasks that keep stopping and starting.
These are the messages I see in tasks:
STOPPED (CannotStartContainerError: API error (500): oci ru)
STOPPED (Essential container in task exited)
These are the messages I see in the logs:
FATAL: could not write to file "pg_wal/xlogtemp.28": No space left on device
container_linux.go:262: starting container process caused "exec: \"./entrypoint.sh\": permission denied"
Why is ECS stopping and starting so many new tasks? This was not happening before.
This is my docker_deploy.sh from my main microservice which I am calling via travis.
#!/bin/sh
if [ -z "$TRAVIS_PULL_REQUEST" ] || [ "$TRAVIS_PULL_REQUEST" == "false" ];
then
  if [ "$TRAVIS_BRANCH" == "staging" ];
  then
    JQ="jq --raw-output --exit-status"
    configure_aws_cli() {
      aws --version
      aws configure set default.region us-east-1
      aws configure set default.output json
      echo "AWS Configured!"
    }
    make_task_def() {
      task_template=$(cat ecs_taskdefinition.json)
      task_def=$(printf "$task_template" $AWS_ACCOUNT_ID $AWS_ACCOUNT_ID)
      echo "$task_def"
    }
    register_definition() {
      if revision=$(aws ecs register-task-definition --cli-input-json "$task_def" --family $family | $JQ '.taskDefinition.taskDefinitionArn');
      then
        echo "Revision: $revision"
      else
        echo "Failed to register task definition"
        return 1
      fi
    }
    deploy_cluster() {
      family="testdriven-staging"
      cluster="ezasdf-staging"
      service="ezasdf-staging"
      make_task_def
      register_definition
      if [[ $(aws ecs update-service --cluster $cluster --service $service --task-definition $revision | $JQ '.service.taskDefinition') != $revision ]];
      then
        echo "Error updating service."
        return 1
      fi
    }
    configure_aws_cli
    deploy_cluster
  fi
fi
This is my Dockerfile from my users microservice:
FROM python:3.6.2
# install environment dependencies
RUN apt-get update -yqq \
&& apt-get install -yqq --no-install-recommends \
netcat \
&& apt-get -q clean
# set working directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# add requirements (to leverage Docker cache)
ADD ./requirements.txt /usr/src/app/requirements.txt
# install requirements
RUN pip install -r requirements.txt
# add entrypoint.sh
ADD ./entrypoint.sh /usr/src/app/entrypoint.sh
RUN chmod +x /usr/src/app/entrypoint.sh
# add app
ADD . /usr/src/app
# run server
CMD ["./entrypoint.sh"]
entrypoint.sh:
#!/bin/sh
echo "Waiting for postgres..."
while ! nc -z users-db 5432;
do
sleep 0.1
done
echo "PostgreSQL started"
python manage.py recreate_db
python manage.py seed_db
gunicorn -b 0.0.0.0:5000 manage:app
I tried deleting my cluster and deregistering my tasks and restarting but ECS still continuously stops and starts new tasks now.
When it was working fine: the difference was that instead of the CMD ["./entrypoint.sh"] in my Dockerfile, I had
RUN python manage.py recreate_db
RUN python manage.py seed_db
CMD gunicorn -b 0.0.0.0:5000 manage:app
travis is passing.
The errors are right there.
You don't have enough space on your host, and permission to execute the entrypoint.sh file is being denied.
Make sure your host has enough disk space: shell in and run df -h to check, then expand the volume or just bring up a new instance with more space. For entrypoint.sh, make sure it is executable (chmod +x) in the built image and readable by the user the container runs as.
Test your containers locally first; the second error should have been caught in development instantly.
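A detail worth checking in the Dockerfile from the question, stated here as a likely cause rather than a certainty: chmod +x runs before ADD . /usr/src/app, and that later ADD copies entrypoint.sh from the build context again, with whatever mode it has there. A quick pre-build check can catch this. The sketch assumes a POSIX shell; require_executable is a hypothetical helper name.

```shell
#!/bin/sh
# require_executable FILE: succeed only if FILE exists and has an execute bit
# usable by the current user; otherwise report it on stderr and fail.
require_executable() {
    if [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "not executable: $1" >&2
        return 1
    fi
}

# Usage before docker build, from the directory holding the Dockerfile:
#   require_executable ./entrypoint.sh || chmod +x ./entrypoint.sh
```

Moving the chmod +x line after the final ADD in the Dockerfile would be another way to make the mode independent of the build context.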
I realize this answer isn't 100% relevant to the question asked, but some googling brought me here due to the title and I figure my solution might help someone later down the line.
I also had this issue, but the reason why my containers kept restarting wasn't a lack of space or other resources, it was because I had enabled dynamic host port mapping and forgotten to update my security group as needed. What happened then is that the health checks my load balancer sent to my containers inevitably failed and ECS restarted the containers (whoops).
Dynamic Port Mapping in AWS Documentation:
https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_PortMapping.html Contents --> hostPort
tl;dr - Make sure your load balancer can health check ports 32768 - 65535.
If too many tasks have run and their containers have consumed the space, you will need to shell in to the host and do the following. Don't use -f with docker rm, as that would also remove the running ECS agent container.
docker rm $(docker ps -aq)
Run docker ps -a.
It lists all containers, including stopped ones in the Exited state, which also consume disk space. Use the command below to remove those zombies:
docker rm $(docker ps -aq --filter status=exited)
Also remove old or unused images; these take up more disk space than containers:
docker rmi -f image_name
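Before and after the cleanup, it can help to measure how full the filesystem is. A minimal sketch, assuming a POSIX df and awk; disk_usage_percent is a hypothetical helper name.

```shell
#!/bin/sh
# disk_usage_percent PATH: print how full the filesystem holding PATH is,
# as a bare integer. df -P forces the portable one-line-per-filesystem
# format, so the capacity is the fifth field of the second line.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage on the container host (/var/lib/docker is Docker's default data
# directory; adjust if yours differs):
#   disk_usage_percent /var/lib/docker
```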
I have a django app running inside a single docker container on AWS Elastic Beanstalk. I cannot get it to run migrations properly, it always sees the old docker image and tries to run migrations from that (but it doesn’t have the latest files).
I package an .ebextensions directory with my EBS source bundle (a zip containing a Dockerrun.aws.json file and the .ebextensions dir). And it has a setup.config file that looks like this:
container_commands:
  01_migrate:
    command: "CONTAINER=`docker ps -a --no-trunc | grep aws_beanstalk | cut -d' ' -f1 | head -1` && docker exec $CONTAINER python3 manage.py migrate"
    leader_only: true
Which is partially modeled after the comments on this SO question.
I have verified that it can work if I simply re-deploy the app a second time, since this time the previous running image will have the updated migrations file.
Does anyone know how to access the latest docker image or latest running container in an .ebextensions script?
Based on AWS Documentation on Customizing Software on Linux Servers, container_commands will be executed before your app is deployed.
You can use the container_commands key to execute commands for your container. The commands in container_commands are processed in alphabetical order by name. They run after the application and web server have been set up and the application version file has been extracted, but before the application version is deployed. They also have access to environment variables such as your AWS security credentials. Additionally, you can use leader_only. One instance is chosen to be the leader in an Auto Scaling group. If the leader_only value is set to true, the command runs only on the instance that is marked as the leader.
Take a look also at my answer here. It runs some commands at different app deployment stages and shows the results.
So a solution to your problem might be to create a post-deployment hook for the app.
.ebextensions/00_post_migrate.config
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/10_post_migrate.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      if [ -f /tmp/leader_only ]
      then
        rm /tmp/leader_only
        docker exec `docker ps --no-trunc -q | head -n 1` python3 manage.py migrate
      fi
container_commands:
  01_migrate:
    command: "touch /tmp/leader_only"
    leader_only: true
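The hook relies on a marker-file handshake: container_commands creates /tmp/leader_only on the leader only, and the post-deploy script runs the migration when the marker exists, removing it first so the next deployment starts clean. The same logic, pulled out into a function that can be tried without Elastic Beanstalk, might look like this (run_if_marker is a hypothetical name, and $CONTAINER in the usage line is a placeholder):

```shell
#!/bin/sh
# run_if_marker MARKER CMD [ARGS...]: run CMD only when the MARKER file
# exists, and consume the marker so the command runs at most once per marker.
run_if_marker() {
    marker=$1
    shift
    [ -f "$marker" ] || return 0   # not the leader: nothing to do
    rm -f "$marker"
    "$@"
}

# Usage inside the post-deploy hook:
#   run_if_marker /tmp/leader_only docker exec "$CONTAINER" python3 manage.py migrate
```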
I am using another approach. I run a container based on the newly built image, pass in the environment variables from Elastic Beanstalk, and run the custom command in that container. When the command is done, the container is removed and the deployment proceeds.
So this is the script I have put inside .ebextensions/scripts/container_command.sh (make sure you replace everything that is within <>):
#!/bin/bash
COMMAND=$1
EB_CONFIG_DOCKER_IMAGE_STAGING=$(/opt/elasticbeanstalk/bin/get-config container -k <environment_name>_image)
EB_SUPPORT_FILES=$(/opt/elasticbeanstalk/bin/get-config container -k support_files_dir)
# build --env arguments for docker from env var settings
EB_CONFIG_DOCKER_ENV_ARGS=()
while read -r ENV_VAR; do
    EB_CONFIG_DOCKER_ENV_ARGS+=(--env "${ENV_VAR}")
done < <($EB_SUPPORT_FILES/generate_env)
docker run --name=shopblender_pre_deploy -d \
    "${EB_CONFIG_DOCKER_ENV_ARGS[@]}" \
    "${EB_CONFIG_DOCKER_IMAGE_STAGING}"
docker exec shopblender_pre_deploy ${COMMAND}
# clean up
docker stop shopblender_pre_deploy
docker rm shopblender_pre_deploy
Now, you can use this script to execute any custom command to the container that will be deployed later.
Something like this .ebextensions/container_commands.config:
container_commands:
  01-command:
    command: bash .ebextensions/scripts/container_command.sh "php app/console doctrine:schema:update --force --no-interaction" &>> /var/log/database.log
    leader_only: true
  02-command:
    command: bash .ebextensions/scripts/container_command.sh "php app/console fos:elastica:reset --no-interaction" &>> /var/log/database.log
    leader_only: true
  03-command:
    command: bash .ebextensions/scripts/container_command.sh "php app/console doctrine:fixtures:load --no-interaction" &>> /var/log/database.log
    leader_only: true
This way you also do not need to worry about what your latest started container is, which is a problem with the solution described above.