Getting Permission Denied error while accessing a file in Docker - amazon-web-services

I am trying to deploy a model on AWS SageMaker using the following Dockerfile:
FROM ubuntu:16.04
#MAINTAINER Amazon AI <sage-learner@amazon.com>
RUN apt-get -y update && apt-get install -y --no-install-recommends \
wget \
python3.5-dev \
gcc \
nginx \
ca-certificates \
libgcc-5-dev \
&& rm -rf /var/lib/apt/lists/*
# Here we get all python packages.
# There's substantial overlap between scipy and numpy that we eliminate by
# linking them together. Likewise, pip leaves the install caches populated which uses
# a significant amount of space. These optimizations save a fair amount of space in the
# image, which reduces start up time.
RUN wget https://bootstrap.pypa.io/3.3/get-pip.py && python3.5 get-pip.py && \
pip3 install numpy==1.14.3 scipy lightfm scikit-optimize pandas==0.22.0 flask gevent gunicorn && \
rm -rf /root/.cache
# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"
# Set up the program in the image
COPY lightfm /opt/program
WORKDIR /opt/program
The Docker image builds successfully, but when I run the following command:
docker run XYZ train
locally or even on SageMaker, I get the following error:
standard_init_linux.go:207: exec user process caused "permission denied"
In the Dockerfile I am copying a folder called lightfm, and there is a file called "train" in it.
Can anyone help?
OUTPUT OF MY DOCKER BUILD:
$ docker build -t lightfm .
Sending build context to Docker daemon 41.47kB
Step 1/9 : FROM ubuntu:16.04
---> 5e13f8dd4c1a
Step 2/9 : RUN apt-get -y update && apt-get install -y --no-install-recommends wget python3.5-dev gcc nginx ca-certificates libgcc-5-dev && rm -rf /var/lib/apt/lists/*
---> Using cache
---> 14ae3a1eb780
Step 3/9 : RUN wget https://bootstrap.pypa.io/3.3/get-pip.py && python3.5 get-pip.py && pip3 install numpy==1.14.3 scipy lightfm scikit-optimize pandas==0.22.0 flask gevent gunicorn && rm -rf /root/.cache
---> Using cache
---> 5a2727e27385
Step 4/9 : ENV PYTHONUNBUFFERED=TRUE
---> Using cache
---> 43bf8c5e8414
Step 5/9 : ENV PYTHONDONTWRITEBYTECODE=TRUE
---> Using cache
---> 7d2c45d61cec
Step 6/9 : ENV PATH="/opt/program:${PATH}"
---> Using cache
---> f3cc6313c0d9
Step 7/9 : COPY lightfm /opt/program
---> ad929ba84692
Step 8/9 : WORKDIR /opt/program
---> Running in a040dd0bab03
Removing intermediate container a040dd0bab03
---> 8f53c5a3ba63
Step 9/9 : RUN chmod 755 serve
---> Running in 5666abb27cd0
Removing intermediate container 5666abb27cd0
---> e80aca934840
Successfully built e80aca934840
Successfully tagged lightfm:latest
SECURITY WARNING: You are building a Docker image from Windows against a non-Windows Docker host. All files and directories added to build context will have '-rwxr-xr-x' permissions. It is recommended to double check and reset permissions for sensitive files and directories.

Assuming train is the executable you want to run, give it execute permission: after the COPY lightfm /opt/program line, add RUN chmod +x /opt/program/train.
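A minimal sketch of how the tail of the Dockerfile might look with that fix applied (the chmod on serve mirrors the Step 9/9 shown in the build output, assuming serve also needs to be executable):
COPY lightfm /opt/program
RUN chmod +x /opt/program/train /opt/program/serve
WORKDIR /opt/program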

Related

Docker build is giving me this error " /bin/sh: -c requires an argument , The command '/bin/sh -c' returned a non-zero code: 2"

I am creating a Dockerfile for my messaging app. Here is my Dockerfile:
FROM python:3.9-alpine3.15
LABEL maintainer="Noor Ibrahim"
ENV PYTHONUNBUFFERED 1
COPY ./message /message
COPY ./requirements.txt /requirements.txt
WORKDIR /message
EXPOSE 8000
RUN
RUN python -m venv /py && \
/py/bin/pip install --upgrade pip && \
/py/bin/pip install -r /requirements.txt && \
adduser --disabled-password --no-create-home user
ENV PATH="/py/bin:$PATH"
USER user
I am running this command:
docker build .
I am getting this error while building the image. I have tried different things, but the error doesn't go away.
Sending build context to Docker daemon 67.07kB
Step 1/11 : FROM python:3.9-alpine3.15
---> e49e2f1d4108
Step 2/11 : LABEL maintainer="Noor Ibrahim"
---> Using cache
---> 517b54d522ef
Step 3/11 : ENV PYTHONUNBUFFERED 1
---> Running in d8eb9d900584
---> 29e281514398
Step 4/11 : COPY ./message /message
---> f84edc508eec
Step 5/11 : COPY ./requirements.txt /requirements.txt
---> 608149cc5c42
Step 6/11 : WORKDIR /message
---> Running in b26ec4b33053
Removing intermediate container b26ec4b33053
---> c608a04f1993
Step 7/11 : EXPOSE 8000
---> Running in 8b4f7f49a3b8
Removing intermediate container 8b4f7f49a3b8
---> a18f155d9320
Step 8/11 : RUN
---> Running in 6fa80a39c7d0
/bin/sh: -c requires an argument
The command '/bin/sh -c' returned a non-zero code: 2
Almost certainly it's that extra RUN line with no command after it:
EXPOSE 8000
RUN <==== This one!!
RUN python -m venv /py && \
/py/bin/pip install --upgrade pip && \
/py/bin/pip install -r /requirements.txt && \
adduser --disabled-password --no-create-home user
Get rid of that and it should be fine.
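For reference, with that stray RUN deleted, the relevant part of the Dockerfile simply reads:
EXPOSE 8000
RUN python -m venv /py && \
    /py/bin/pip install --upgrade pip && \
    /py/bin/pip install -r /requirements.txt && \
    adduser --disabled-password --no-create-home user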

"Timeout in polling result file" error when executing a Dataflow flex-template job

I've tried a lot of different things found online, but I'm still unable to solve the below timeout error:
2021-11-27T14:51:21.844520452Z Timeout in polling result file: gs://...
when submitting a Dataflow flex-template job. It goes into Queued state and after 14 mins {x} secs goes to Failed state with the above log message. My Dockerfile is as follows:
FROM gcr.io/dataflow-templates-base/python3-template-launcher-base
ARG WORKDIR=/dataflow/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}
COPY requirements.txt .
COPY test-beam.py .
# Do not include `apache-beam` in requirements.txt
ENV FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE="${WORKDIR}/requirements.txt"
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/test-beam.py"
# Setting Proxy
ENV http_proxy=http://proxy-web.{company_name}.com:80 \
https_proxy=http://proxy-web.{company_name}.com:80 \
no_proxy=127.0.0.1,localhost,.{company_name}.com,{company_name}.com,.googleapis.com
# Company Cert
RUN apt-get update && apt-get install -y curl \
&& curl http://{company_name}.com/pki/{company_name}%20Issuing%20CA.pem -o - | tr -d '\r' > /usr/local/share/ca-certificates/{company_name}.crt \
&& curl http://{company_name}.com/pki/{company_name}%20Root%20CA.pem -o - | tr -d '\r' > /usr/local/share/ca-certificates/{company_name}-root.crt \
&& update-ca-certificates \
&& apt-get remove -y --purge curl \
&& apt-get autoremove -y \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Set pip config to point to Company Cert
RUN pip config set global.cert /etc/ssl/certs/ca-certificates.crt
# Install apache-beam and other dependencies to launch the pipeline
RUN pip install --no-cache-dir --upgrade pip \
&& pip install --no-cache-dir apache-beam[gcp]==2.32.0 \
&& pip install --no-cache-dir -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE \
# Download the requirements to speed up launching the Dataflow job.
&& pip download --no-cache-dir --dest /tmp/dataflow-requirements-cache -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE
# Since we already downloaded all the dependencies, there's no need to rebuild everything.
ENV PIP_NO_DEPS=True
ENV http_proxy= \
https_proxy= \
no_proxy=
ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"]
And requirements.txt:
numpy
setuptools
scipy
wavefile
I know that the Python script used above, test-beam.py, works, as it executes successfully locally using a DirectRunner.
I have gone through many SO posts and GCP's own troubleshooting guide aimed at this error, but with no success. As you can see from my Dockerfile, I have done the following:
Installing apache-beam[gcp] separately and not including it in my requirements.txt file.
Pre-downloading all dependencies using pip download --no-cache-dir --dest /tmp/dataflow-requirements-cache -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE.
Setting ENTRYPOINT ["/opt/google/dataflow/python_template_launcher"] explicitly as it seems this is not set in the base image gcr.io/dataflow-templates-base/python3-template-launcher-base as found by executing docker inspect on it (am I correct about this?).
Unsetting company proxy at the end as it seems to be the cause of timeout issues seen in job logs from previous runs.
What am I missing? How can I fix this issue?

How to pre-install pre-commit hooks into a Docker image

As I understand the documentation, whenever I add these lines to the config:
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v2.1.0
    hooks:
      - id: trailing-whitespace
it makes pre-commit download the hook code from this repo and execute it. Is it possible to somehow pre-install all the hooks into a Docker image, so that when I call pre-commit run no network is used?
I found this section of the documentation describing how pre-commit caches all the repositories. They are stored in ~/.cache/pre-commit, and this can be configured by setting the PRE_COMMIT_HOME env variable.
However, the caching only happens when I do pre-commit run, and I want to pre-install everything without running the checks. Is that possible?
you're looking for the pre-commit install-hooks command
at the least you need something like this to cache the pre-commit environments:
COPY .pre-commit-config.yaml .
RUN git init . && pre-commit install-hooks
disclaimer: I created pre-commit
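For context, a minimal sketch of how this could look in a standalone Dockerfile (assumptions: pre-commit is installed with pip, and git is installed in the image since pre-commit clones the hook repositories):
FROM python:3.9-slim
# git is required because pre-commit clones hook repositories into its cache
RUN apt-get update && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*
RUN pip install --no-cache-dir pre-commit
WORKDIR /app
# copy only the config first so the hook environments land in a cached Docker layer
COPY .pre-commit-config.yaml .
RUN git init . && pre-commit install-hooks
COPY . .
Built this way, pre-commit run inside the container needs no network, as long as the config file has not changed.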
The snippet provided by @anthony-sottile works like a charm and helps utilize the Docker cache. Here is a working variation of it from the Django world.
ARG PYTHON_VERSION=3.9-buster
# define an alias for the specific python version used in this file.
FROM python:${PYTHON_VERSION} as python
# Python build stage
FROM python as python-build-stage
ARG BUILD_ENVIRONMENT=test
# Install apt packages
RUN apt-get update && apt-get install --no-install-recommends -y \
# dependencies for building Python packages
build-essential \
# psycopg2 dependencies
libpq-dev
# Requirements are installed here to ensure they will be cached.
COPY ./requirements .
# Create Python Dependency and Sub-Dependency Wheels.
RUN pip wheel --wheel-dir /usr/src/app/wheels \
-r ${BUILD_ENVIRONMENT}.txt
# Python 'run' stage
FROM python as python-run-stage
ARG BUILD_ENVIRONMENT=test
ARG APP_HOME=/app
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
ENV BUILD_ENV ${BUILD_ENVIRONMENT}
WORKDIR ${APP_HOME}
# Install required system dependencies
RUN apt-get update && apt-get install --no-install-recommends -y \
# psycopg2 dependencies
libpq-dev \
# Translations dependencies
gettext \
# cleaning up unused files
&& apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
&& rm -rf /var/lib/apt/lists/*
# All absolute dir copies ignore the WORKDIR instruction. All relative dir copies are relative to the WORKDIR instruction
# copy python dependency wheels from python-build-stage
COPY --from=python-build-stage /usr/src/app/wheels /wheels/
# use wheels to install python dependencies
RUN pip install --no-cache-dir --no-index --find-links=/wheels/ /wheels/* \
&& rm -rf /wheels/
COPY ./compose/test/django/entrypoint /entrypoint
RUN chmod +x /entrypoint
COPY .pre-commit-config.yaml .
RUN git init . && pre-commit install-hooks
# copy application code to WORKDIR
COPY . ${APP_HOME}
ENTRYPOINT ["/entrypoint"]
then you can fire pre-commit checks in a similar fashion:
docker-compose -p project_name -f test.yml run --rm django pre-commit run --all-files

Docker error when containerizing app in Google Cloud Run

I am trying to run transformers from Hugging Face in Google Cloud Run.
My first idea was to use one of the Dockerfiles provided by Hugging Face, but it seems that is not possible.
Any ideas on how to get around this error?
Step 6/9 : WORKDIR /workspace
---> Running in xxx
Removing intermediate container xxx
---> xxx
Step 7/9 : COPY . transformers/
---> xxx
Step 8/9 : RUN cd transformers/ && python3 -m pip install --no-cache-dir .
---> Running in xxx
ERROR: Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
The command '/bin/sh -c cd transformers/ && python3 -m pip install --no-cache-dir .' returned a non-zero code: 1
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
ERROR: (gcloud.builds.submit) build xxx completed with status "FAILURE"
Dockerfile from huggingface:
FROM nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
LABEL maintainer="Hugging Face"
LABEL repository="transformers"
RUN apt update && \
apt install -y bash \
build-essential \
git \
curl \
ca-certificates \
python3 \
python3-pip && \
rm -rf /var/lib/apt/lists
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
mkl \
tensorflow
WORKDIR /workspace
COPY . transformers/
RUN cd transformers/ && \
python3 -m pip install --no-cache-dir .
CMD ["/bin/bash"]
.dockerignore file from Google Cloud Run documentation:
Dockerfile
README.md
*.pyc
*.pyo
*.pyd
__pycache__
.pytest_cache
---- Edit:
I managed to get it working based on the answer from Dustin. I basically:
left the Dockerfile in the root folder, together with the transformers folder.
updated the COPY line in the Dockerfile to:
COPY . ./
The error is:
Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
This is due to these two lines in your Dockerfile:
COPY . transformers/
RUN cd transformers/ && \
python3 -m pip install --no-cache-dir .
This attempts to copy the local directory containing the Dockerfile into the container, and then install it as a Python project.
It looks like the Dockerfile expects to be run at the repository root of https://github.com/huggingface/transformers. You should clone the repo, move the Dockerfile you want to build into the root, and then build again.
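A rough sketch of that workflow (the path to the GPU Dockerfile inside the repo is an assumption and may differ between transformers versions; PROJECT_ID is a placeholder for your GCP project):
git clone https://github.com/huggingface/transformers.git
cd transformers
# copy the Dockerfile you want to build into the repository root (assumed path, adjust as needed)
cp docker/transformers-gpu/Dockerfile .
gcloud builds submit --tag gcr.io/PROJECT_ID/transformers .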

Dockerfile issue with RPM

I would like to install auditserver on a Node.js server. The auditserver ships as an RPM, and it works fine when installed manually.
I wrote a Dockerfile like the one below:
FROM centos:centos6
# Enable EPEL for Node.js
RUN rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
# Install Node.js and npm
RUN yum install -y npm
# ADD rpm into container
ADD auditserver-1-1.x86_64.rpm /opt/
RUN mkdir -p /opt/auditserver
RUN cd /opt
RUN rpm -Uvh auditserver-1-1.x86_64.rpm
# cd to auditserver
RUN cd /opt/auditserver
# Install app dependencies
RUN npm install
# start auditserver
RUN node server
EXPOSE 8080
While building the Dockerfile, I see the issue below:
root@CloudieBase:/tmp/sky-test# docker build -t sky-test .
Sending build context to Docker daemon 38.4 kB
Step 1 : FROM centos:centos6
---> 9c95139afb21
Step 2 : RUN rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
---> Using cache
---> fd5b1bb647fc
Step 3 : RUN yum install -y npm
---> Using cache
---> b7c2908fc583
Step 4 : ADD auditserver-1-1.x86_64.rpm /opt/
---> 26ace798f98c
Removing intermediate container 5ea6221797f5
Step 5 : RUN mkdir -p /opt/auditserver
---> Running in 8f7292364245
---> 9b340033f6b7
Removing intermediate container 8f7292364245
Step 6 : RUN cd /opt
---> Running in c7d20fd251f3
---> 0cdf90b6cb2e
Removing intermediate container c7d20fd251f3
Step 7 : RUN rpm -Uvh auditserver-1-1.x86_64.rpm
---> Running in 4473241e5077
error: open of auditserver-1-1.x86_64.rpm failed: No such file or directory
The command '/bin/sh -c rpm -Uvh auditserver-1-1.x86_64.rpm' returned a non-zero code: 1
root@CloudieBase:/tmp/sky-test#
Can anyone help me turn this into a working Dockerfile? Thanks.
The problem is that you are not in the /opt directory when executing the rpm command (step 7). See this answer to find out why it happens. Quote:
Each time you RUN, you spawn a new container and therefore the pwd is '/'.
For how to fix it, see this question. To summarize: you can use the WORKDIR Dockerfile instruction, or change this part:
RUN cd /opt
RUN rpm -Uvh auditserver-1-1.x86_64.rpm
to this:
RUN cd /opt && rpm -Uvh auditserver-1-1.x86_64.rpm
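Alternatively, a sketch of the WORKDIR variant mentioned above (unlike RUN cd, the directory change persists for every following instruction):
WORKDIR /opt
RUN rpm -Uvh auditserver-1-1.x86_64.rpm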