Where do I put `.aws/credentials` for Docker awslogs log-driver (and avoid NoCredentialProviders)?

The Docker awslogs documentation states:
the default AWS shared credentials file (~/.aws/credentials of the root user)
Yet if I copy my AWS credentials file there:
sudo bash -c 'mkdir -p $HOME/.aws; cp .aws/credentials $HOME/.aws/credentials'
... and then try to use the driver:
docker run --log-driver=awslogs --log-opt awslogs-group=neiltest-deleteme --rm hello-world
The result is still the dreaded error:
docker: Error response from daemon: failed to initialize logging driver: failed to create Cloudwatch log stream: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors.
Where does this file really need to go? Is it because the Docker daemon isn't running as root but rather some other user and, if so, how do I determine that user?
NOTE: I can work around this on systems using systemd by setting environment variables. But this doesn't work on Google CloudShell where the Docker daemon has been started by some other method.

Ah ha! I figured it out and tested this on Debian Linux (on my Chromebook w/ Linux VM and Google CloudShell):
The .aws folder must be in the filesystem root (/), not in the root user's $HOME folder!
Based on that I was able to successfully run the following:
pushd $HOME; sudo bash -c 'mkdir -p /.aws; cp .aws/* /.aws/'; popd
docker run --log-driver=awslogs --log-opt awslogs-region=us-east-1 --log-opt awslogs-group=neiltest-deleteme --rm hello-world
I initially figured this all out by looking at the Docker daemon's process information:
DOCKERD_PID=$(ps -A | grep dockerd | grep -Eo '[0-9]+' | head -n 1)
sudo cat /proc/$DOCKERD_PID/environ
The confusing bit is that Docker's documentation here is wrong:
the default AWS shared credentials file (~/.aws/credentials of the root user)
The true location is /.aws/credentials. I believe this is because the daemon starts before $HOME is actually defined since it's not running as a user process. So starting a shell as root will tell you a different story for tilde or $HOME:
sudo sh -c 'cd ~/; echo $PWD'
That outputs /root but using /root/.aws/credentials does not work!
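To check the "$HOME is not defined for the daemon" theory, it helps to know that /proc/<pid>/environ is NUL-separated, so a plain cat is hard to read. Here is a minimal sketch of a helper that pulls the HOME entry out of such a dump. The function name home_from_environ and the temp file path are my own inventions for illustration; if it prints nothing, HOME is not set for that process, which would be consistent with the credentials being looked up under /.aws.

```shell
#!/bin/sh
# Print the HOME entry from a NUL-separated environment dump
# (the format used by /proc/<pid>/environ on Linux).
# Prints nothing when HOME is not set in that environment.
home_from_environ() {
    tr '\0' '\n' < "$1" | sed -n 's/^HOME=//p'
}

# Usage on a Linux host (hypothetical temp file; root is needed to read
# another process's environ):
#   sudo cat "/proc/$(pgrep -o dockerd)/environ" > /tmp/dockerd.environ
#   home_from_environ /tmp/dockerd.environ
```

An empty result only tells you that HOME is missing from the daemon's environment; it does not by itself prove which path the AWS SDK falls back to.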

Related

nohup does not run in AWS Lambda

I have a Dockerfile based on public.ecr.aws/lambda/provided:al2. I am trying to run this script:
#!/bin/sh
cd /tmp
echo 'before nohup'
nohup xvfb-run chromedriver --allowed-ips=127.0.0.1 &
echo 'after nohup'
disown
Locally I tested with docker run --mount type=tmpfs,destination=/tmp,tmpfs-size=536870912 --read-only --rm serverless-chromedriver-dev:appimage and it worked. On AWS Lambda, my logs show ["before nohup","after nohup"] as the output of the script. However, /tmp/nohup.out does not exist and ps aux does not show chromedriver.
I also checked $PATH: /usr/bin/, where both nohup and su live, is on it. I verified that /tmp is writable and that there is space: /dev/vdd 537560 1016 524708 1% /tmp
Overall I am flabbergasted why there is a difference at all between local docker and AWS Lambda function. Wasn't the point of docker to avoid exactly this?

ENTRYPOINT just refuses to exec or even shell run

This is my third day of tearing my hair out since the weekend: I just cannot get ENTRYPOINT to work via gitlab runner 13.3.1. This previously worked with a simple ENTRYPOINT ["/bin/bash"], but that was on local Docker Desktop, using docker run followed by docker exec commands, which worked like a cinch. At the end of it all I got a WAR file built.
Currently I build my container in gitlab runner 13.3.1, push it to an s3 bucket, and then use IMAGE:localhost:500/my-recently-builtcontainer to do whatever I want with the container. But I cannot even get ENTRYPOINT to work, in its exec form or in shell form; at least in the shell form I get to see something. The exec form just gave opaque "OCI runtime create failed" errors, so I shifted to the shell form to see how far I could get.
I keep getting
sh: 1: sh: echo HOME=/home/nonroot-user params=#$ pwd=/ whoami=nonroot-user script=sh ENTRYPOINT reached which_sh=/bin/sh which_bash=/bin/bash PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin; ls -alrth /bin/bash; ls -alrth /bin/sh; /usr/local/bin/entrypoint.sh ;: not found
In my Dockerfile I distinctly have
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN bash -c "ls -larth /usr/local/bin/entrypoint.sh"
ENTRYPOINT "echo HOME=${HOME} params=#$ pwd=`pwd` whoami=`whoami` script=${0} ENTRYPOINT reached which_sh=`which sh` which_bash=`which bash` PATH=${PATH}; ls -alrth `which bash`; ls -alrth `which sh`; /usr/local/bin/lse-entrypoint.sh ;"
The output after I build the container in gitlab is - and I made sure anyone has rights to see this file and use it - just so that I can proceed with my work
-rwxrwxrwx 1 root root 512 Apr 11 17:40 /usr/local/bin/entrypoint.sh
So, I know it is there and all the chmod flags indicate anybody can look at it - so I am so perplexed why it is saying NOT FOUND
/usr/local/bin/entrypoint.sh ;: not found
entrypoint.sh is ...
#!/bin/sh
export PATH=$PATH:/usr/local/bin/
clear
echo Script is $0
echo numOfArgs is $#
echo paramtrsPassd is $@
echo whoami is `whoami`
bash --version
echo "About to exec ....."
exec "$@"
It does not even reach inside this entrypoint.sh file.
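The error text itself hints at what may be going on: sh reports the entire string, semicolons and all, as one command that is "not found". That is what happens when the surrounding double quotes reach sh -c as part of the string, which I am assuming is how the shell-form ENTRYPOINT above is passed along. A minimal sketch that reproduces the effect without Docker (run_like_shell_form is a name made up for this example):

```shell
#!/bin/sh
# Run a command string through `sh -c`, roughly the way a shell-form
# ENTRYPOINT is assumed to be run, and print the exit status.
# Status 127 is the shell's "command not found".
run_like_shell_form() {
    sh -c "$1" >/dev/null 2>&1
    echo $?
}

# Quotes kept inside the string: the whole text is ONE word, so sh looks
# for a command literally named "echo reached; true".
run_like_shell_form '"echo reached; true"'   # prints 127

# No surrounding quotes: sh parses two separate commands, both succeed.
run_like_shell_form 'echo reached; true'     # prints 0
```

If that is the cause here, dropping the outer quotes from the shell form, or switching to the exec form with each argument as its own JSON string, would be the things to try.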

EFS symlink fails while deploying

So I use AWS Elastic Beanstalk to serve my PHP application. I want to mount EFS to have permanent storage for the images uploaded via my application.
I have created .ebextensions folder and created one file called mount.config with the below code
packages:
  yum:
    nfs-utils: []
    jq: []
files:
  "/tmp/mount-efs.sh" :
    mode: "000755"
    content: |
      #!/usr/bin/env bash
      mkdir -p /mnt/efs
      EFS_NAME=$(/opt/elasticbeanstalk/bin/get-config environment | jq -r '.EFS_NAME')
      mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 $EFS_NAME:/ /mnt/efs || true
      mkdir -p /mnt/efs/questions
      chown webapp:webapp /mnt/efs/questions
commands:
  01_mount:
    command: "/tmp/mount-efs.sh"
container_commands:
  01-symlink-uploads:
    command: ln -s /mnt/efs/questions /var/app/ondeck/images/
Everything is working fine until the last line where it fails to create a symlink.
What I have tried so far:
Running the command directly on the machine while changing ondeck -> current. This works fine.
Removing the EC2 instance and adding a new one. Still failing
In the logs I see
ln: failed to create symbolic link '/var/app/current/images/questions': No such file or directory
Any suggestion what could be the reason?
OK, I fixed it by replacing ondeck with staging and adding this entry under container_commands:
  01-change-permission:
    command: chmod -R 777 /var/app/staging/images
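For anyone hitting the same "No such file or directory" from ln: one plausible cause (an assumption on my part, not something confirmed above) is that the parent directory of the link does not exist at the path used during deployment. A minimal sketch of a helper that creates the parent first and replaces an existing link, so re-deploys do not fail; link_into is a made-up name:

```shell
#!/bin/sh
# Create (or replace) a symlink at $2 pointing to $1, creating the
# link's parent directory first if it is missing.
link_into() {
    target=$1
    link=$2
    mkdir -p "$(dirname "$link")" || return 1
    # -f replaces an existing link; -n avoids descending into a link
    # that already points at a directory.
    ln -sfn "$target" "$link"
}

# Usage (paths taken from the thread):
#   link_into /mnt/efs/questions /var/app/staging/images/questions
```

The function could be dropped into the same mount script and called from container_commands instead of the bare ln -s.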

AWS ECS tasks keep starting and stopping

I am trying to use ECS for deployment with travis.
At one point everything was working but now it stopped.
I am following this tutorial https://testdriven.io/part-five-ec2-container-service/
There are 2 tasks that keep stopping and starting.
These are the messages I see in tasks:
STOPPED (CannotStartContainerError: API error (500): oci ru)
STOPPED (Essential container in task exited)
These are the messages I see in the logs:
FATAL: could not write to file "pg_wal/xlogtemp.28": No space left on device
container_linux.go:262: starting container process caused "exec: \"./entrypoint.sh\": permission denied"
Why is ECS stopping and starting so many new tasks? This was not happening before.
This is my docker_deploy.sh from my main microservice which I am calling via travis.
#!/bin/sh
if [ -z "$TRAVIS_PULL_REQUEST" ] || [ "$TRAVIS_PULL_REQUEST" == "false" ];
then
  if [ "$TRAVIS_BRANCH" == "staging" ];
  then
    JQ="jq --raw-output --exit-status"
    configure_aws_cli() {
      aws --version
      aws configure set default.region us-east-1
      aws configure set default.output json
      echo "AWS Configured!"
    }
    make_task_def() {
      task_template=$(cat ecs_taskdefinition.json)
      task_def=$(printf "$task_template" $AWS_ACCOUNT_ID $AWS_ACCOUNT_ID)
      echo "$task_def"
    }
    register_definition() {
      if revision=$(aws ecs register-task-definition --cli-input-json "$task_def" --family $family | $JQ '.taskDefinition.taskDefinitionArn');
      then
        echo "Revision: $revision"
      else
        echo "Failed to register task definition"
        return 1
      fi
    }
    deploy_cluster() {
      family="testdriven-staging"
      cluster="ezasdf-staging"
      service="ezasdf-staging"
      make_task_def
      register_definition
      if [[ $(aws ecs update-service --cluster $cluster --service $service --task-definition $revision | $JQ '.service.taskDefinition') != $revision ]];
      then
        echo "Error updating service."
        return 1
      fi
    }
    configure_aws_cli
    deploy_cluster
  fi
fi
This is my Dockerfile from my users microservice:
FROM python:3.6.2
# install environment dependencies
RUN apt-get update -yqq \
  && apt-get install -yqq --no-install-recommends \
    netcat \
  && apt-get -q clean
# set working directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# add requirements (to leverage Docker cache)
ADD ./requirements.txt /usr/src/app/requirements.txt
# install requirements
RUN pip install -r requirements.txt
# add entrypoint.sh
ADD ./entrypoint.sh /usr/src/app/entrypoint.sh
RUN chmod +x /usr/src/app/entrypoint.sh
# add app
ADD . /usr/src/app
# run server
CMD ["./entrypoint.sh"]
entrypoint.sh:
#!/bin/sh
echo "Waiting for postgres..."
while ! nc -z users-db 5432;
do
  sleep 0.1
done
echo "PostgreSQL started"
python manage.py recreate_db
python manage.py seed_db
gunicorn -b 0.0.0.0:5000 manage:app
I tried deleting my cluster and deregistering my tasks and restarting but ECS still continuously stops and starts new tasks now.
When it was working fine: the difference was that instead of the CMD ["./entrypoint.sh"] in my Dockerfile, I had
RUN python manage.py recreate_db
RUN python manage.py seed_db
CMD gunicorn -b 0.0.0.0:5000 manage:app
travis is passing.
The errors are right there: your host does not have enough disk space, and permission to execute entrypoint.sh is being denied.
Make sure the host has enough disk space: shell in and run df -h to check, then either expand the volume or bring up a new instance with more space. For entrypoint.sh, make sure it is made executable (chmod +x) when you build the image and that it is readable by the user the container runs as.
Test your containers locally first; the second error should have been caught in development instantly.
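A quick way to catch the permission error before it reaches ECS is to check the execute bit on the script, both in the build context and inside the built image. This is only a sketch; check_executable is a made-up helper name. One thing worth verifying (my assumption from reading the Dockerfile, not confirmed in the thread): the later ADD . /usr/src/app copies entrypoint.sh again from the build context, which may overwrite the copy that chmod +x was applied to.

```shell
#!/bin/sh
# Report whether a file exists and is executable by the current user.
check_executable() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ -x "$1" ]; then
        echo "executable"
    else
        echo "not executable"
    fi
}

# Usage in the build context:
#   check_executable ./entrypoint.sh
# Usage against a built image (<image> is a placeholder):
#   docker run --rm --entrypoint sh <image> -c 'ls -l /usr/src/app/entrypoint.sh'
```

If the file in the repository is not executable, running chmod +x on it locally and committing the mode change, or moving the RUN chmod line after the final ADD, are the obvious things to try.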
I realize this answer isn't 100% relevant to the question asked, but some googling brought me here due to the title and I figure my solution might help someone later down the line.
I also had this issue, but the reason why my containers kept restarting wasn't a lack of space or other resources, it was because I had enabled dynamic host port mapping and forgotten to update my security group as needed. What happened then is that the health checks my load balancer sent to my containers inevitably failed and ECS restarted the containers (whoops).
Dynamic Port Mapping in AWS Documentation:
https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/
https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_PortMapping.html Contents --> hostPort
tl;dr - Make sure your load balancer can health check ports 32768 - 65535.
If too many tasks have run and consumed the disk space, you will need to shell in to the host and do the following. Don't use -f with docker rm, as that would also remove the running ECS agent container.
docker rm $(docker ps -aq)
Run docker ps -a to list all containers, including the stopped ones in the Exited state; these also consume disk space. Use the command below to remove those zombies:
docker rm $(docker ps -a | grep Exited | awk '{print $1}')
Also remove older or unused images; these take up more disk space than containers:
docker rmi -f image_name
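The grep/awk pipeline above can be wrapped in a small filter so you can inspect the IDs before deleting anything. This is a sketch under the assumption that the input looks like ordinary docker ps -a output with a header row; exited_ids is a name made up for this example.

```shell
#!/bin/sh
# Read `docker ps -a` style output on stdin and print the container ID
# (first column) of every row that mentions "Exited".
# The header row is skipped.
exited_ids() {
    awk 'NR > 1 && /Exited/ { print $1 }'
}

# Usage on the host:
#   docker ps -a | exited_ids                  # review the list first
#   docker rm $(docker ps -a | exited_ids)     # then remove them
```

Like the original grep, this matches "Exited" anywhere on the row. docker ps -aq --filter status=exited asks Docker for the same set directly and avoids text matching.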

Django migrations with Docker on AWS Elastic Beanstalk

I have a django app running inside a single docker container on AWS Elastic Beanstalk. I cannot get it to run migrations properly, it always sees the old docker image and tries to run migrations from that (but it doesn’t have the latest files).
I package an .ebextensions directory with my EBS source bundle (a zip containing a Dockerrun.aws.json file and the .ebextensions dir). And it has a setup.config file that looks like this:
container_commands:
  01_migrate:
    command: "CONTAINER=`docker ps -a --no-trunc | grep aws_beanstalk | cut -d' ' -f1 | head -1` && docker exec $CONTAINER python3 manage.py migrate"
    leader_only: true
Which is partially modeled after the comments on this SO question.
I have verified that it can work if I simply re-deploy the app a second time, since this time the previous running image will have the updated migrations file.
Does anyone know how to access the latest docker image or latest running container in an .ebextensions script?
Based on AWS Documentation on Customizing Software on Linux Servers, container_commands will be executed before your app is deployed.
You can use the container_commands key to execute commands for your container. The commands in container_commands are processed in alphabetical order by name. They run after the application and web server have been set up and the application version file has been extracted, but before the application version is deployed. They also have access to environment variables such as your AWS security credentials. Additionally, you can use leader_only. One instance is chosen to be the leader in an Auto Scaling group. If the leader_only value is set to true, the command runs only on the instance that is marked as the leader.
Also take a look at my answer here. It runs a command at different app deployment stages and shows the result of each.
So a solution to your problem might be to create a post-app-deployment hook.
.ebextensions/00_post_migrate.config
files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/10_post_migrate.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash
      if [ -f /tmp/leader_only ]
      then
        rm /tmp/leader_only
        docker exec `docker ps --no-trunc -q | head -n 1` python3 manage.py migrate
      fi
container_commands:
  01_migrate:
    command: "touch /tmp/leader_only"
    leader_only: true
I am using another approach. What I did is run a container based on the newly built image, pass in the environment variables from Elastic Beanstalk, and run the custom command in that container. When that command is done, the container is removed and the deployment proceeds.
So this is the script I have put inside .ebextensions/scripts/container_command.sh (make sure you replace everything that is within <>):
#!/bin/bash
COMMAND=$1
EB_CONFIG_DOCKER_IMAGE_STAGING=$(/opt/elasticbeanstalk/bin/get-config container -k <environment_name>_image)
EB_SUPPORT_FILES=$(/opt/elasticbeanstalk/bin/get-config container -k support_files_dir)
# build --env arguments for docker from env var settings
EB_CONFIG_DOCKER_ENV_ARGS=()
while read -r ENV_VAR; do
  EB_CONFIG_DOCKER_ENV_ARGS+=(--env "${ENV_VAR}")
done < <($EB_SUPPORT_FILES/generate_env)
docker run --name=shopblender_pre_deploy -d \
  "${EB_CONFIG_DOCKER_ENV_ARGS[@]}" \
  "${EB_CONFIG_DOCKER_IMAGE_STAGING}"
docker exec shopblender_pre_deploy ${COMMAND}
# clean up
docker stop shopblender_pre_deploy
docker rm shopblender_pre_deploy
Now, you can use this script to execute any custom command to the container that will be deployed later.
Something like this .ebextensions/container_commands.config:
container_commands:
  01-command:
    command: bash .ebextensions/scripts/container_command.sh "php app/console doctrine:schema:update --force --no-interaction" &>> /var/log/database.log
    leader_only: true
  02-command:
    command: bash .ebextensions/scripts/container_command.sh "php app/console fos:elastica:reset --no-interaction" &>> /var/log/database.log
    leader_only: true
  03-command:
    command: bash .ebextensions/scripts/container_command.sh "php app/console doctrine:fixtures:load --no-interaction" &>> /var/log/database.log
    leader_only: true
This way you also do not need to worry about what your latest started container is, which is a problem with the solution described above.