Connecting to an S3 bucket from a Docker container - amazon-web-services

I am building a Docker container which transforms some data in a specific folder. I would like to place those files in an S3 bucket, within a specific folder. I have been going through the AWS CLI documentation but I am not sure how to approach this.
I have installed it without errors by using:
# AWS cli installation
RUN apk add --no-cache \
python3 \
py3-pip \
&& pip3 install --upgrade pip \
&& pip3 install \
awscli \
&& rm -rf /var/cache/apk/*
RUN aws --version
I have read about adding a YAML configuration to point to the bucket, but I am not sure how the process should be done. Can someone with a similar project in mind point me to some documentation or explain how to approach this? I am very much a layman in Docker.
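For what it's worth, a minimal sketch of one common approach (the image name my-transform-image, the output folder /app/output and the destination s3://my-bucket/my-folder/ are placeholders): pass the AWS credentials to the container as environment variables at runtime instead of baking them into the image, and copy the folder with aws s3 cp --recursive when the transformation finishes.
# Hypothetical example: credentials are injected at runtime, not stored in the image
docker run --rm \
  -e AWS_ACCESS_KEY_ID=... \
  -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_DEFAULT_REGION=eu-west-1 \
  my-transform-image
# Inside the container (e.g. at the end of the entrypoint script), upload the folder:
aws s3 cp /app/output s3://my-bucket/my-folder/ --recursive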

Related

cannot access s3 bucket using vertica

I am running vertica-ce in Docker and I created a table. Now I want to export that table as Parquet to an S3 bucket. When I tried to export using EXPORT TO PARQUET(directory = 's3://s3-bucket-name/data') I got the error ERROR 8198: Unable to verify if directory [s3://eucloid-vertica-migration/data/] exists due to 'Access Denied'. I know why I'm getting this error: I need to provide the access_key and secret_key, but I'm unable to use the AWS CLI inside my Docker container. I tried docker exec -it vertica-ce bash -l to access my container and install the awscli, but apt-get, yum, apk, nothing is working.
If anyone has a solution for this, please let me know!
You have a couple of options.
Set AWS parameters in the session.
There are a bunch of S3 settings that you can set. For instance, if you need to set the access key and secret key, you can do this:
=> ALTER SESSION SET AWSAuth='access_key:secret_key';
=> EXPORT TO PARQUET(directory = 's3://s3-bucket-name/data');
Depending on your setup, you may need to set additional config options (e.g. region, endpoint url, etc). All of the settings are documented here: https://www.vertica.com/docs/12.0.x/HTML/Content/Authoring/AdministratorsGuide/ConfiguringTheDB/S3Parameters.htm
Create a new version of the image that has aws cli in it.
The vertica-ce image is currently based on CentOS, so it uses the yum package manager. You can create a new image using this sample Dockerfile:
FROM vertica/vertica-ce:latest
USER root
RUN set -x \
&& yum -q -y makecache \
&& yum install -y unzip \
&& yum clean all \
&& rm -rf /var/cache/yum \
&& cd /tmp \
&& curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscli.zip \
&& unzip awscli.zip \
&& /tmp/aws/install -i /usr/bin/aws-cli -b /usr/bin
USER dbadmin
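A rough sketch of how that image could then be used (the tag vertica-ce-aws and the bucket path are placeholders); once built, the aws command is available inside the container:
# Build the image from the Dockerfile above and start a container from it
docker build -t vertica-ce-aws .
docker run -d -p 5433:5433 --name vertica_ce_aws vertica-ce-aws
# Configure credentials inside the container, then the CLI can reach the bucket
docker exec -it vertica_ce_aws aws configure
docker exec -it vertica_ce_aws aws s3 ls s3://s3-bucket-name/data/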
Mount a volume so that you can access it from your host PC.
With this approach you can use the AWS CLI that you have installed locally to copy the exported files to an S3 bucket.
Run this docker command to start the ce container persisting everything in /data to a local vertica-data directory.
docker run -p 5433:5433 \
--mount type=bind,source=$(pwd)/vertica-data,target=/data \
--name vertica_ce \
vertica/vertica-ce:latest
Run EXPORT TO PARQUET using the in-container path /data.
Access the parquet files from your PC in the vertica-data directory.
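For example (assuming the export wrote its files under /data and using the same placeholder bucket path as above), something like this from the host should work:
# Copy the exported parquet files from the mounted directory up to S3
aws s3 cp ./vertica-data s3://s3-bucket-name/data --recursive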

'pip' is not recognized as an internal or external command, operable program or batch file on docker web application

I am working on a web project using Django and Docker. The tutorial references how to set up an email service. I registered with AWS and followed a guide on how to link it to Docker. The first step is to run "pip install --upgrade boto3". This is followed by the error in the title. How do I install boto3 through Docker?
You can use the shinofara/docker-boto3 Docker image instead of building and maintaining an image yourself.
docker run --rm -t \
-v $HOME/.aws:/home/worker/.aws:ro \
-v $(pwd)/example:/work \
shinofara/docker-boto3 python example.py
or you can create your own docker image
FROM alpine:latest
RUN apk add --update python3 py3-pip \
&& pip3 install --upgrade pip \
&& pip3 install boto3 requests PyYAML pg8000 -U \
&& ln -sv /usr/bin/python3 /usr/bin/python
ENTRYPOINT [ "python3" ]
Boto3 Dockerfile
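A hedged sketch of building and using that image (the tag my-boto3 and the script example.py are placeholders; local credentials are mounted read-only the same way as in the docker-boto3 example above):
# Build the image from the Dockerfile above
docker build -t my-boto3 .
# Run a script from the current directory; the ENTRYPOINT is python3,
# so the script path is passed as the command
docker run --rm \
  -v $HOME/.aws:/root/.aws:ro \
  -v $(pwd):/work \
  my-boto3 /work/example.py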

How to transfer deployment package from S3 to EC2 instance to run python script?

AWS beginner here
I have a repo in GitLab which has a Python script and a requirements.txt file, and the Python script has to be deployed to an EC2 Ubuntu instance (and the script has to be triggered only once a day) via GitLab CI. I am creating a deployment package of the repo using CI and, through this, I am deploying the zipped package to the S3 bucket. My .gitlab-ci.yml file:
image: ubuntu:18.04
variables:
  AWS_DEFAULT_REGION: eu-central-1
  GIT_SUBMODULE_STRATEGY: recursive
  S3_TEST_BUCKET: $BUCKET_UNPACK
stages:
  - deploy
TestJob:
  stage: deploy
  script:
    - apt-get -y update
    - apt-get -y install python3-pip python3.7 zip
    - python3.7 -m pip install --upgrade pip
    - python3.7 -V
    - pip3.7 install virtualenv
    - mv iso_forest_ad.py ~ # This is the python script
    - mv requirements.txt ~
    # Setup virtual environment
    - mkdir ~/forEC2
    - cd ~/forEC2
    - virtualenv -p python3 venv
    - source venv/bin/activate
    - pip3.7 install -r ~/requirements.txt -t ~/forEC2/venv/lib/python3.7/site-packages/
    # Package environment and dependencies
    - cd ~/forEC2/venv/lib/python3.7/site-packages/
    - zip -r9 ~/forEC2/archive.zip .
    - cd ~
    - zip -g ~/forEC2/archive.zip iso_forest_ad.py
    - pip install awscli --upgrade
    - export PATH=$PATH:~/.local/bin
    - aws configure set aws_access_key_id $AWS_TEST_ACCESS_KEY_ID
    - aws configure set aws_secret_access_key $AWS_TEST_SECRET_ACCESS_KEY
    - aws configure set default.region $AWS_DEFAULT_REGION
    - aws s3 cp ~/forEC2/archive.zip $BUCKET_UNPACK/anomaly-detection-deployment.zip
Contents of requirements.txt
-i https://pypi.org/simple
joblib==0.16.0; python_version >= '3.6'
numpy==1.19.0
pandas==1.0.5
psycopg2-binary==2.8.5
python-dateutil==2.8.1; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
pytz==2020.1
scikit-learn==0.23.1
scipy==1.5.1; python_version >= '3.6'
six==1.15.0; python_version >= '2.7' and python_version not in '3.0, 3.1, 3.2, 3.3'
sqlalchemy==1.3.18
threadpoolctl==2.1.0; python_version >= '3.5'
Now, I would like to transfer the script and install the dependencies in the ubuntu EC2 instance and run the script.
I know one way would be to connect to the EC2 instance and do
aws s3 sync s3://s3-bucket-name/folder /home/ubuntu
as suggested in the post: Moving files from s3 to EC2 instance. But doing this, I was not able to install the dependencies from the requirements.txt file.
I would like to know if there is an alternate way (perhaps by using a shell script or some other way?) of achieving this. Since I am using Ubuntu locally too, using PuTTY is not an option for me.
The link you've posted already shows one way of doing this, namely by using UserData.
Therefore, you would have to develop a bash script which would not only download the zip file as shown in the link, but also unpack it and install the dependencies from requirements.txt, along with any other dependencies or configuration setup you require.
So the UserData for your instance would be something like this (pseudo-code, this is only a rough example):
#!/bin/bash
apt update
apt install -y unzip awscli python3-pip # awscli is not normally on ubuntu
aws s3 cp s3://optimal-aws-nz-play-config/package.zip .
unzip package.zip
cd package
pip3 install -r ./requirements.txt
If this is something you do often, you could create a launch template with the instance settings and the UserData to automatically execute these steps for each instance launched from the template.
There are also other possibilities, involving CodeDeploy, CodePipeline, but plain old UserData would be a good start.
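As a rough sketch (the AMI ID, instance type and instance profile name are placeholders; the profile needs read access to the bucket for the aws s3 cp in the script to work), the UserData script could be attached when launching an instance like this:
# userdata.sh contains the bash script above
aws ec2 run-instances \
  --image-id ami-xxxxxxxx \
  --instance-type t3.micro \
  --iam-instance-profile Name=my-s3-read-profile \
  --user-data file://userdata.sh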
An alternative would be to use SSM Run Command. The execution of the command would be triggered from GitLab following the upload of the new S3 package.
An example of how to invoke the run-command is in the docs:
aws ssm send-command \
--document-name "AWS-RunPowerShellScript" \
--parameters commands=["echo helloWorld"] \
--targets Key=tag:Env,Values=Dev,Test
Instead of echo helloWorld you would have to write your own bash commands to be executed.
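Since the target is an Ubuntu instance, the AWS-RunShellScript document would be the one to use. A rough sketch (bucket, file names and tag values are placeholders, and the instance needs the SSM agent plus an instance profile that allows SSM and S3 access):
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --parameters 'commands=["aws s3 cp s3://s3-bucket-name/anomaly-detection-deployment.zip /home/ubuntu/","cd /home/ubuntu && unzip -o anomaly-detection-deployment.zip","python3 /home/ubuntu/iso_forest_ad.py"]' \
  --targets Key=tag:Env,Values=Dev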

How to deploy to AWS Beanstalk with GitLab CI

How To Deploy a Node App on AWS Elastic Beanstalk, Docker, and Gitlab ci.
I've created a simple node application. Dockerized the node application.
What I'm trying to do is deploy my application using gitlab ci.
This is what I have so far:
image: docker:git
services:
  - docker:dind
stages:
  - build
  - release
  - release-prod
variables:
  CI_REGISTRY: registry.gitlab.com
  CONTAINER_TEST_IMAGE: registry.gitlab.com/testapp/routing:$CI_COMMIT_REF_NAME
  CONTAINER_RELEASE_IMAGE: registry.gitlab.com/testapp/routing:latest
before_script:
  - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
build:
  stage: build
  script:
    - docker build -t $CONTAINER_TEST_IMAGE -f Dockerfile.prod .
    - docker push $CONTAINER_TEST_IMAGE
release-image:
  stage: release
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker tag $CONTAINER_TEST_IMAGE $CONTAINER_RELEASE_IMAGE
    - docker push $CONTAINER_RELEASE_IMAGE
  only:
    - master
release-prod:
  stage: release-prod
  script:
  when: manual
I'm stuck on the release-prod stage. I'm just not sure how I can deploy the app to AWS Beanstalk.
The Docker images have been created and stored in the GitLab registry. All I want to do is instruct AWS Beanstalk to download the Docker images from the GitLab registry and start the application.
I also have a Dockerrun.aws.json which defines the services.
Your Dockerrun.aws.json file is what Beanstalk uses as the final say in what is deployed.
The option I found to work for us was to make a custom docker image with the eb cli installed so we can run eb deploy... from the gitlab-ci.yml file.
This requires AWS permissions for the runner to be able to access the AWS service, so a user or permissions come into play. But they would be needed regardless of how it's set up.
In the GitLab project's CI/CD settings, set the AWS user keys. (Ideally it's set up to use an IAM role instead, but user keys will work. I'm not too familiar with getting temporary access, which might be the best thing for this, but again, I'm not sure how that works.)
We use a custom EC2 instance as our runner to run the pipeline, so I'm not sure about shared runners; we had a concern about passing AWS user creds to a shared runner pipeline...
Build stage:
build and push the Docker image to our ECR repository (or wherever fits your use case).
Deploy stage:
use a custom image stored in GitLab that has the eb cli pre-installed, then run eb deploy env-name.
This is the Dockerfile we use for our PHP project. Some of the installs aren't necessary for your case... This could also be improved by adding a USER and package versions, but it will create a Docker image that has the eb cli installed.
FROM node:12
RUN apt-get update && apt-get -y --allow-unauthenticated install apt-transport-https ca-certificates curl gnupg2 software-properties-common ruby-full \
&& add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
RUN apt-get update && apt-get -y --allow-unauthenticated install docker-ce \
&& apt-get -y install build-essential zlib1g-dev libssl-dev libncurses-dev libffi-dev libsqlite3-dev libreadline-dev libbz2-dev python-pip python3-pip
RUN git clone https://github.com/aws/aws-elastic-beanstalk-cli-setup.git \
&& ./aws-elastic-beanstalk-cli-setup/scripts/bundled_installer
RUN python3 --version && apt-get update && apt-get -y install python3-pip \
&& pip3 install awscli boto3 botocore && pip3 install boto3 botocore --upgrade
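A quick sketch of getting that image into the GitLab registry so the job below can use it (the registry path matches the placeholder used in the example; it assumes you are already logged in to registry.gitlab.com):
# Build the eb cli image and push it to the project's container registry
docker build -t registry.gitlab.com/your-acct/project/custom-image .
docker push registry.gitlab.com/your-acct/project/custom-image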
Example gitlab-ci.yml setup
release-prod:
  image: registry.gitlab.com/your-acct/project/custom-image
  stage: release-prod
  script:
    - service docker start
    - echo 'export PATH="/root/.ebcli-virtual-env/executables:$PATH"' >> ~/.bash_profile && source ~/.bash_profile
    - echo 'export PATH=/root/.pyenv/versions/3.7.2/bin:$PATH' >> /root/.bash_profile && source /root/.bash_profile
    - eb deploy your-environment
  when: manual
You could also add the echo commands to the custom GitLab image so that all you need to run is eb deploy...
Hope this helps a little.
Although there are a couple of different ways to achieve this, I finally found the proper solution for my use case. I have documented it here: https://medium.com/voices-of-plusdental/gitlab-ci-deployment-for-php-applications-to-aws-elastic-beanstalk-automated-qa-test-environments-253ca4932d5b Using eb deploy was the easiest and shortest version. It also allows me to customize the instances in any way I want.

AWS Cloudwatch Agent in Docker

I am trying to package the AWS CloudWatch agent into a Docker container. The docker build runs into the following error -
Failed to connect to bus: No such file or directory
unknown init system
Here is the snippet from Dockerfile -
FROM ubuntu:16.04
RUN \
apt-get -y update && \
apt-get -y install wget && \
apt-get -y install unzip
RUN \
wget https://s3.amazonaws.com/amazoncloudwatch-agent/linux/amd64/latest/AmazonCloudWatchAgent.zip && \
unzip AmazonCloudWatchAgent.zip && \
./install.sh
What is missing or wrong here?
I notice the documentation has different ways of installing, and I wonder if they are both correct. I found another method in the EC2 guide for installing on Ubuntu:
RUN \
curl https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py -O && \
python ./awslogs-agent-setup.py --region us-east-1
I noticed that AWS has launched an official Docker image for the CloudWatch agent on Docker Hub, and they are updating it frequently. I am late, but it may help someone.
https://hub.docker.com/r/amazon/cloudwatch-agent
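A rough sketch of running it straight with Docker (the credentials could also come from an instance role instead of environment variables; the config file mount path is an assumption to verify against the image documentation):
# Hypothetical example: run the official agent image with credentials from the
# environment and a local agent config file
docker run -d --name cloudwatch-agent \
  -e AWS_ACCESS_KEY_ID=... \
  -e AWS_SECRET_ACCESS_KEY=... \
  -e AWS_REGION=us-east-1 \
  -v $(pwd)/cw-agent-config.json:/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.json \
  amazon/cloudwatch-agent:latest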