I am trying to get awscliv2 installed in a Docker image for Airflow. However, when I run the DAG I get the error below, and the alias is not being created, so I have to set it manually in the container. I am still pretty new to Docker.
I have no name!@f3d6d31933d8:/$ awscliv2 configure
18:51:03 - awscliv2 - ERROR - Command failed with code 127
Dockerfile:
# set up some variables
ARG IMAGE=airflow
ARG TAG=2.3.4
ARG STAGEPATH=/etc/airflow/builddeps
# builder stage
FROM bitnami/$IMAGE:$TAG as builder
# refresh the arg
ARG STAGEPATH
# user root is required for installing packages
USER root
# install build essentials
RUN install_packages build-essential unixodbc-dev curl gnupg2
# make paths, including apt archives or the download-only fails trying to cleanup
RUN mkdir -p $STAGEPATH/deb; mkdir -p /var/cache/apt/archives
# download & build pip wheels to directory
RUN mkdir -p $STAGEPATH/pip-wheels
RUN pip install wheel
RUN python -m pip wheel --wheel-dir=$STAGEPATH/pip-wheels \
numpy \
requests \
pythonnet==3.0.0rc5 \
pymssql \
awscliv2 \
apache-airflow-providers-odbc \
apache-airflow-providers-microsoft-mssql \
apache-airflow-providers-ssh \
apache-airflow-providers-sftp \
statsd
# next stage
FROM bitnami/$IMAGE:$TAG as airflow
# refresh the arg within this stage
ARG STAGEPATH
# user root is required for installing packages
USER root
# copy pre-built pip packages from first stage
RUN mkdir -p $STAGEPATH
COPY --from=builder $STAGEPATH $STAGEPATH
# install updated and required pip packages
RUN . /opt/bitnami/airflow/venv/bin/activate && python -m pip install --upgrade --no-index --find-links=$STAGEPATH/pip-wheels \
numpy \
requests \
pythonnet==3.0.0rc5 \
pymssql \
awscliv2 \
apache-airflow-providers-odbc \
apache-airflow-providers-microsoft-mssql \
apache-airflow-providers-ssh \
apache-airflow-providers-sftp \
statsd
# create awscliv2 alias
RUN alias aws='awsv2' /bin/bash
# return to airflow user
USER 1000
I expect awscliv2 to install with pip and the alias to be configured.
I have tried running this from the container command line, and the DAG still gives the error "command not found", exit code 128.
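For what it's worth: an alias created in a RUN step only exists for the shell running that single step, so it never reaches the final image; a symlink on PATH does persist into later layers. A minimal sketch of that alternative, assuming the awsv2 entry point lands in the venv's bin directory (path inferred from the venv activated earlier in the Dockerfile, so verify it in your image):
# replace the alias line with a symlink that survives the build
RUN ln -s /opt/bitnami/airflow/venv/bin/awsv2 /usr/local/bin/aws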
I've tried a lot of different things found online, but I'm still unable to solve the below timeout error:
2021-11-27T14:51:21.844520452Z Timeout in polling result file: gs://...
when submitting a Dataflow flex-template job. It goes into the Queued state and after 14 mins {x} secs moves to the Failed state with the above log message. My Dockerfile is as follows:
FROM gcr.io/dataflow-templates-base/python3-template-launcher-base
ARG WORKDIR=/dataflow/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}
COPY requirements.txt .
COPY test-beam.py .
# Do not include `apache-beam` in requirements.txt
ENV FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE="${WORKDIR}/requirements.txt"
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/test-beam.py"
# Setting Proxy
ENV http_proxy=http://proxy-web.{company_name}.com:80 \
https_proxy=http://proxy-web.{company_name}.com:80 \
no_proxy=127.0.0.1,localhost,.{company_name}.com,{company_name}.com,.googleapis.com
# Company Cert
RUN apt-get update && apt-get install -y curl \
&& curl http://{company_name}.com/pki/{company_name}%20Issuing%20CA.pem -o - | tr -d '\r' > /usr/local/share/ca-certificates/{company_name}.crt \
&& curl http://{company_name}.com/pki/{company_name}%20Root%20CA.pem -o - | tr -d '\r' > /usr/local/share/ca-certificates/{company_name}-root.crt \
&& update-ca-certificates \
&& apt-get remove -y --purge curl \
&& apt-get autoremove -y \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Set pip config to point to Company Cert
RUN pip config set global.cert /etc/ssl/certs/ca-certificates.crt
# Install apache-beam and other dependencies to launch the pipeline
RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir apache-beam[gcp]==2.32.0 \
&& pip install --no-cache-dir -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE \
# Download the requirements to speed up launching the Dataflow job.
&& pip download --no-cache-dir --dest /tmp/dataflow-requirements-cache -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE
# Since we already downloaded all the dependencies, there's no need to rebuild everything.
ENV PIP_NO_DEPS=True
ENV http_proxy= \
https_proxy= \
no_proxy=
ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"]
And requirements.txt:
numpy
setuptools
scipy
wavefile
I know my Python script test-beam.py used above works, as it executes successfully locally using the DirectRunner.
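For context, the job is submitted roughly like this (bucket, template path, and region are placeholders, not my real values):
gcloud dataflow flex-template run "test-beam-job" \
  --template-file-gcs-location gs://MY_BUCKET/templates/test-beam.json \
  --region us-central1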
I have gone through many SO posts and GCP's own troubleshooting guide aimed at this error, but with no success. As you can see from my Dockerfile, I have done the following:
Installing apache-beam[gcp] separately and not including it in my requirements.txt file.
Pre-downloading all dependencies using pip download --no-cache-dir --dest /tmp/dataflow-requirements-cache -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE.
Setting ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"] explicitly, as it seems this is not set in the base image gcr.io/dataflow-templates-base/python3-template-launcher-base, as found by running docker inspect on it (am I correct about this? see the sketch below).
Unsetting company proxy at the end as it seems to be the cause of timeout issues seen in job logs from previous runs.
What am I missing? How can I fix this issue?
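For anyone reproducing the docker inspect check mentioned above, this is a sketch of it (the --format template prints only the configured entrypoint):
docker inspect --format '{{json .Config.Entrypoint}}' \
  gcr.io/dataflow-templates-base/python3-template-launcher-base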
Has anyone ever installed the AWS CloudWatch agent in an Alpine Docker image? It seems to me that Alpine is not supported by any of the installation packages AWS provides.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/download-cloudwatch-agent-commandline.html
We can install it this way:
RUN apk update && apk add ca-certificates curl rpm
RUN wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
RUN rpm -ihv --nodeps ./amazon-cloudwatch-agent.rpm
But it is not functioning correctly. If I check its status:
~/test # /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status
I get this error:
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl: line 469: systemctl: not found
It doesn't seem to be compatible with Alpine (I am using alpine:3.14). Does anyone have an idea about this?
Thanks,
# To install the aws-cloudwatch-agent
RUN apk update && apk add ca-certificates curl rpm
RUN wget https://s3.amazonaws.com/amazoncloudwatch-agent/amazon_linux/amd64/latest/amazon-cloudwatch-agent.rpm
RUN rpm -ihv --nodeps ./amazon-cloudwatch-agent.rpm
# To set up the repo for k6 and install k6
ENV RUN_IN_CONTAINER="True"
RUN wget https://dl.cloudsmith.io/public/cloudposse-dev/packages/alpine/any-version/main/x86_64/k6-0.34.1-r0.apk
RUN apk add --allow-untrusted k6-0.34.1-r0.apk
Thanks in advance!
RD
It seems you just need to use Docker multi-stage builds to put together what you want.
If we want to integrate k6 and the AWS CloudWatch agent (combining them in one Dockerfile based on Alpine):
Check out git@github.com:grafana/k6.git
Update the Dockerfile to:
FROM golang:1.17-alpine as builder_k6
WORKDIR $GOPATH/src/go.k6.io/k6
ADD . .
RUN apk --no-cache add git
RUN CGO_ENABLED=0 go install -a -trimpath -ldflags "-s -w -X go.k6.io/k6/lib/consts.VersionDetails=$(date -u +"%FT%T%z")/$(git describe --always --long --dirty)"
FROM debian:latest as builder_cw
RUN apt-get update && \
apt-get install -y ca-certificates curl && \
rm -rf /var/lib/apt/lists/*
RUN curl -O https://s3.amazonaws.com/amazoncloudwatch-agent/debian/amd64/latest/amazon-cloudwatch-agent.deb && \
dpkg -i -E amazon-cloudwatch-agent.deb && \
rm -rf /tmp/* && \
rm -rf /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard && \
rm -rf /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl && \
rm -rf /opt/aws/amazon-cloudwatch-agent/bin/config-downloader
FROM alpine:latest
COPY --from=builder_cw /tmp /tmp
COPY --from=builder_cw /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
COPY --from=builder_cw /opt/aws/amazon-cloudwatch-agent /opt/aws/amazon-cloudwatch-agent
COPY --from=builder_k6 /go/bin/k6 /usr/bin/k6
ADD statsd.json /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json
ADD statsd.json /opt/aws/amazon-cloudwatch-agent/etc/statsd.json
ADD credentials /root/.aws/credentials
ADD config /root/.aws/config
# start up the agent
ENV RUN_IN_CONTAINER="True"
ENTRYPOINT ["/opt/aws/amazon-cloudwatch-agent/bin/start-amazon-cloudwatch-agent"]
I'm trying to migrate a dockerized PHP-Apache2 server from Debian to Alpine.
The Debian dockerfile:
FROM php:7.3.24-apache-buster
COPY conf/php.ini /usr/local/etc/php
RUN apt-get update && apt-get upgrade -y && apt-get install -y \
curl git \
libfreetype6-dev \
libjpeg62-turbo-dev \
libmcrypt-dev \
libxml2-dev \
libpng-dev \
python-setuptools python-dev build-essential python-pip \
libzip-dev
RUN pecl install mcrypt-1.0.2 \
&& docker-php-ext-enable mcrypt \
&& docker-php-ext-configure gd --with-freetype-dir=/usr/include/ --with-jpeg-dir=/usr/include/ \
&& docker-php-ext-install -j$(nproc) gd \
&& docker-php-ext-install mysqli \
&& docker-php-ext-install zip \
&& docker-php-ext-install soap
RUN pip install --upgrade virtualenv && pip install xhtml2pdf
WORKDIR /var/www/app
COPY ./ /var/www/app
RUN cd /var/www/app && \
php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');" && \
php composer-setup.php && \
php -r "unlink('composer-setup.php');" && \
php composer.phar install --ignore-platform-reqs --prefer-dist
COPY ./conf/default.conf /etc/apache2/sites-enabled/000-default.conf
COPY ./conf/cert /etc/apache2/cert
RUN mkdir /var/log/gts && chmod 777 -R /var/log/gts
RUN mv /etc/apache2/mods-available/rewrite.load /etc/apache2/mods-enabled/rewrite.load
RUN mv /etc/apache2/mods-available/ssl.load /etc/apache2/mods-enabled/ssl.load
COPY conf/apache2.conf /etc/apache2/apache2.conf
RUN mv /etc/apache2/mods-available/remoteip.load /etc/apache2/mods-enabled/remoteip.load
EXPOSE 443
The Alpine dockerfile:
FROM webdevops/php-apache:7.3-alpine
COPY conf/php.ini /usr/local/etc/php
RUN apk update && apk upgrade && apk add \
git \
curl \
autoconf \
freetype-dev \
libjpeg-turbo-dev \
libmcrypt-dev \
libxml2-dev \
libpng-dev \
libzip-dev \
py-setuptools \
python3-dev \
build-base \
py-pip \
libzip-dev \
apache2-dev
RUN PHP_AUTOCONF="/usr/bin/autoconf" \
&& pecl install mcrypt-1.0.2 \
&& docker-php-ext-enable mcrypt \
&& docker-php-ext-configure gd --with-freetype-dir=/usr/include/ --with-jpeg-dir=/usr/include/
RUN pip install --upgrade --ignore-installed virtualenv && pip install xhtml2pdf
WORKDIR /var/www/app
COPY ./ /var/www/app
RUN cd /var/www/app && \
php -r "copy('https://getcomposer.org/installer', 'composer-setup.php');" && \
php composer-setup.php && \
php -r "unlink('composer-setup.php');" && \
php composer.phar install --ignore-platform-reqs --prefer-dist
COPY ./conf/default.conf /etc/apache2/sites-enabled/000-default.conf
COPY ./conf/cert /etc/apache2/cert
RUN mkdir /var/log/gts && chmod 777 -R /var/log/gts
RUN mv /etc/apache2/mods-available/rewrite.load /etc/apache2/mods-enabled/rewrite.load \
&& mv /etc/apache2/mods-available/ssl.load /etc/apache2/mods-enabled/ssl.load
COPY conf/apache2.conf /etc/apache2/apache2.conf
RUN mv /etc/apache2/mods-available/remoteip.load /etc/apache2/mods-enabled/remoteip.load
EXPOSE 443
The Alpine build failed:
mv: can't rename '/etc/apache2/mods-available/rewrite.load': No such file or directory
Turns out that the Alpine container:
Has no /etc/apache2/mods-* directories and no *.load files
Has only *.so files in /usr/lib/apache2 (similar to the list of *.so files in /usr/lib/apache2/modules on Debian).
Questions:
Why don't I have *.load files in the Alpine container?
Why don't I have /etc/apache2/mods-* directories in the Alpine container?
Are *.so files equivalent to *.load files?
If the previous is true, then how do I use the *.so files?
PS: I prefer not to change httpd.conf if possible.
I'll try to answer your questions one by one:
Why don't I have *.load files in the Alpine container?
The .load files normally just contain one line to load the specific module, and can be seen as a convenience mechanism for toggling modules with a shell script (a2enmod).
Why don't I have /etc/apache2/mods-* directories in the Alpine container?
Alpine tries to be minimal. This also means you should not install modules you don't need, so having modules installed that are not in use (e.g. present only in mods-available) is considered bad practice.
Are *.so files equivalent to *.load files?
The .so files are also present in Debian; they are the modules themselves. The .load files are configuration fragments that load these .so files.
If the previous is true, then how do I use the *.so files?
Just take a look inside the .load files. For example, you can load the rewrite module with this configuration line:
LoadModule rewrite_module modules/mod_rewrite.so
You're right to prefer not changing httpd.conf. Instead, you can put custom configuration into /etc/apache2/conf.d and load all your modules there. Just make sure your config files there end with .conf.
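For example, a minimal sketch of such a fragment, assuming the modules live in /usr/lib/apache2 as described in the question (the filename is arbitrary; it just has to end in .conf):
# /etc/apache2/conf.d/custom-modules.conf
LoadModule rewrite_module /usr/lib/apache2/mod_rewrite.so
LoadModule ssl_module /usr/lib/apache2/mod_ssl.so
LoadModule remoteip_module /usr/lib/apache2/mod_remoteip.so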
How can I build a Docker container with Google's Cloud Command Line Tool/SDK?
The script at the URL https://sdk.cloud.google.com appears to require user input, so it doesn't work in a Dockerfile.
Adding the following to my Dockerfile appears to work:
# Downloading gcloud package
RUN curl https://dl.google.com/dl/cloudsdk/release/google-cloud-sdk.tar.gz > /tmp/google-cloud-sdk.tar.gz
# Installing the package
RUN mkdir -p /usr/local/gcloud \
&& tar -C /usr/local/gcloud -xvf /tmp/google-cloud-sdk.tar.gz \
&& /usr/local/gcloud/google-cloud-sdk/install.sh
# Adding the package path to local
ENV PATH $PATH:/usr/local/gcloud/google-cloud-sdk/bin
Use this one-liner in your Dockerfile:
RUN curl -sSL https://sdk.cloud.google.com | bash
source:
https://docs.docker.com/v1.8/installation/google/
Doing it with alpine:
FROM alpine:3.6
RUN apk add --update \
python \
curl \
which \
bash
RUN curl -sSL https://sdk.cloud.google.com | bash
ENV PATH $PATH:/root/google-cloud-sdk/bin
RUN curl -sSL https://sdk.cloud.google.com > /tmp/gcl && bash /tmp/gcl --install-dir=~/gcloud --disable-prompts
This will download the Google Cloud SDK installer into /tmp/gcl and run it with the following parameters:
--install-dir=~/gcloud: extract the binaries into the gcloud folder in the home directory. Change this to wherever you want, for example /usr/local/bin.
--disable-prompts: Don't show any prompts while installing (headless)
To install gcloud inside a Docker container, please follow the instructions here.
Basically you need to run
RUN apt-get update && \
apt-get install -y curl gnupg && \
echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - && \
apt-get update -y && \
apt-get install google-cloud-sdk -y
inside your Dockerfile. It's important that you are the root user when you run this command, so it may be necessary to add USER root before the previous command.
As an alternative, you could use the Docker image provided by Google, namely google/cloud-sdk: https://hub.docker.com/r/google/cloud-sdk/
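A sketch of using that image directly, without building anything yourself:
docker run --rm google/cloud-sdk gcloud version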
Dockerfile:
FROM centos:7
RUN yum update -y && yum install -y \
curl \
which && \
yum clean all
RUN curl -sSL https://sdk.cloud.google.com | bash
ENV PATH $PATH:/root/google-cloud-sdk/bin
Build:
docker build . -t google-cloud-sdk
Then run gcloud:
docker run --rm \
--volume $(pwd)/assets/root/.config:/root/.config \
google-cloud-sdk gcloud
...or run gsutil:
docker run --rm \
--volume $(pwd)/assets/root/.config:/root/.config \
google-cloud-sdk gsutil
The local assets folder will contain the configuration.
apk upgrade --update-cache --available && \
apk add openssl && \
apk add curl python3 py-crcmod bash libc6-compat && \
rm -rf /var/cache/apk/*
curl https://sdk.cloud.google.com | bash > /dev/null
export PATH=$PATH:/root/google-cloud-sdk/bin
gcloud components update kubectl
I was using Python Alpine image python:3.8.6-alpine3.12 as base and this worked for me:
RUN apk add --no-cache bash
RUN wget https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-327.0.0-linux-x86_64.tar.gz \
-O /tmp/google-cloud-sdk.tar.gz
RUN mkdir -p /usr/local/gcloud \
&& tar -C /usr/local/gcloud -xvzf /tmp/google-cloud-sdk.tar.gz \
&& /usr/local/gcloud/google-cloud-sdk/install.sh -q
ENV PATH $PATH:/usr/local/gcloud/google-cloud-sdk/bin
After building and running the image, you can check if google-cloud-sdk is installed by running docker exec -i -t <container_id> /bin/bash and running this:
bash-5.0# gcloud --version
Google Cloud SDK 327.0.0
bq 2.0.64
core 2021.02.05
gsutil 4.58
bash-5.0# gsutil --version
gsutil version: 4.58
If you want a specific version of google-cloud-sdk, you can visit https://storage.cloud.google.com/cloud-sdk-release
curl https://sdk.cloud.google.com | bash -s -- --disable-prompts
and export the PATH environment variable. Works for me.
I got this working with Ubuntu 18.04 using:
RUN apt-get install -y curl && curl -sSL https://sdk.cloud.google.com | bash
ENV PATH="$PATH:/root/google-cloud-sdk/bin"
You can use multi-stage builds to make this simpler and more efficient than solutions using curl.
FROM bitnami/google-cloud-sdk:0.392.0 as gcloud
FROM base-image-for-production:tag
# Do what you need to configure your production image
COPY --from=gcloud /opt/bitnami/google-cloud-sdk/ /google-cloud-sdk
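One caveat: the copied directory is not on PATH in the production image, so you will likely also want something like this (adjust if you copied to a different destination):
ENV PATH="/google-cloud-sdk/bin:${PATH}"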
This works for me:
FROM php:7.2-fpm
RUN apt-get update -y
RUN apt-get install -y python && \
curl -sSL https://sdk.cloud.google.com | bash
ENV PATH $PATH:/root/google-cloud-sdk/bin
An example using debian as the base image:
FROM debian:stretch
RUN apt-get update && apt-get install -y apt-transport-https gnupg curl lsb-release
RUN export CLOUD_SDK_REPO="cloud-sdk-$(lsb_release -c -s)" && \
echo "cloud SDK repo: $CLOUD_SDK_REPO" && \
echo "deb http://packages.cloud.google.com/apt $CLOUD_SDK_REPO main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && \
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add - && \
apt-get update -y && apt-get install google-cloud-sdk -y
I used most of these examples in some form (thanks @KJoe), but I had to do several other things to set everything up so gcloud would work in the environment. Note that it is preferable to limit the number of lines (fewer lines means fewer layers to pull).
Here's a more complete example of Dockerfile with gcloud setup and extending a CircleCI image:
FROM circleci/ruby:2.4.1-jessie-node-browsers
# user is circleci in the FROM image, switch to root for system lib installation
USER root
ENV CCI /home/circleci
ENV GTMP /tmp/gcloud-install
ENV GSDK $CCI/google-cloud-sdk
ENV PATH="${GSDK}/bin:${PATH}"
# do all system lib installation in one-line to optimize layers
RUN curl -sSL https://sdk.cloud.google.com > $GTMP && bash $GTMP --install-dir=$CCI --disable-prompts \
&& rm -rf $GTMP \
&& chmod +x $GSDK/bin/* \
\
&& chown -Rf circleci:circleci $CCI
# change back to the user in the FROM image
USER circleci
# setup gcloud specifics to your liking
RUN gcloud config set core/disable_usage_reporting true \
&& gcloud config set component_manager/disable_update_check true \
&& gcloud components install alpha beta kubectl --quiet
My use case was to generate a Google bearer token using a service account, so I wanted gcloud installed in the Docker container. This is what my Dockerfile looks like:
FROM google/cloud-sdk
# Setting the default directory in container
WORKDIR /usr/src/app
# copies the app source code to the directory in container
COPY . /usr/src/app
CMD ["/bin/bash","/usr/src/app/token.sh"]
If you need to examine a container after it is built but it isn't running, use docker run --rm -it <container-build-id> bash -il and type gcloud --version to see whether it installed correctly or not.
In the Google documentation you can see the best practice:
https://cloud.google.com/sdk/docs/install-sdk
Search the page for "Docker Tip".
E.g. for Debian, use:
RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] http://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - && apt-get update -y && apt-get install google-cloud-cli -y
If you're just interested in getting the gcloud CLI available, add this to your Dockerfile:
# Downloading gcloud package
RUN curl https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-409.0.0-linux-x86_64.tar.gz > /tmp/google-cloud-cli.tar.gz
# Installing the gcloud cli
RUN mkdir -p /usr/local/gcloud \
&& tar -C /usr/local/gcloud -xf /tmp/google-cloud-cli.tar.gz \
&& /usr/local/gcloud/google-cloud-sdk/install.sh --quiet
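Assuming you also want gcloud on PATH for later layers and at runtime, follow it with the same pattern the other answers use:
ENV PATH=$PATH:/usr/local/gcloud/google-cloud-sdk/bin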