Dockerfile issue with RPM - dockerfile

I would like install auditserver on nodejs server , So my auditserver with rpm . it is working fine as a manual steps.
I write a Dockerfile like below.
FROM centos:centos6
# Enable EPEL for Node.js
RUN rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
# Install Node.js and npm
RUN yum install -y npm
# ADD rpm into container
ADD auditserver-1-1.x86_64.rpm /opt/
RUN mkdir -p /opt/auditserver
RUN cd /opt
RUN rpm -Uvh auditserver-1-1.x86_64.rpm
# cd to auditserver
RUN cd /opt/auditserver
# Install app dependencies
RUN npm install
# start auditserver
RUN node server
EXPOSE 8080
while building the docker file I see below issue.
root#CloudieBase:/tmp/sky-test# docker build -t sky-test .
Sending build context to Docker daemon 38.4 kB
Step 1 : FROM centos:centos6
---> 9c95139afb21
Step 2 : RUN rpm -Uvh http://download.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm
---> Using cache
---> fd5b1bb647fc
Step 3 : RUN yum install -y npm
---> Using cache
---> b7c2908fc583
Step 4 : ADD auditserver-1-1.x86_64.rpm /opt/
---> 26ace798f98c
Removing intermediate container 5ea6221797f5
Step 5 : RUN mkdir -p /opt/auditserver
---> Running in 8f7292364245
---> 9b340033f6b7
Removing intermediate container 8f7292364245
Step 6 : RUN cd /opt
---> Running in c7d20fd251f3
---> 0cdf90b6cb2e
Removing intermediate container c7d20fd251f3
Step 7 : RUN rpm -Uvh auditserver-1-1.x86_64.rpm
---> Running in 4473241e5077
error: open of auditserver-1-1.x86_64.rpm failed: No such file or directory
The command '/bin/sh -c rpm -Uvh auditserver-1-1.x86_64.rpm' returned a non-zero code: 1
root#CloudieBase:/tmp/sky-test#
Can any help on this to made perfect Dockerfile. thanks.

The problem is that you are not in the /opt directory when executing the rpm command (step 7). See this answer to find out why it happens. Quote:
Each time you RUN, you spawn a new container and therefore the pwd is '/'.
For how to fix it see this question. To summarize: you can use the WORKDIR dockerfile command or change this part:
RUN cd /opt
RUN rpm -Uvh auditserver-1-1.x86_64.rpm
to this:
RUN cd /opt && rpm -Uvh auditserver-1-1.x86_64.rpm

Related

Unable to install Tesseract 5.0 version on AWS Lambda

I want to run Tesseract 4.0 or Tesseract 5.0 on my AWS Lambda function. So I have my docker file like so-
FROM public.ecr.aws/lambda/python:3.8
RUN mkdir app
# Copy function code
COPY / ${LAMBDA_TASK_ROOT}/app
# Install the function's dependencies using file requirements.txt
# from your project folder.
COPY requirements.txt .
RUN pip3 install -r requirements.txt --target ${LAMBDA_TASK_ROOT}
RUN rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
RUN yum -y update
RUN yum -y install tesseract
RUN yum install -y poppler-utils
# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.com.emlAndMsgParser.mail_parser_test.getEmail_from_msg" ]
but when i do DockerBuild-"docker build -t qa-lambda ." on my terminal, it says Tesseract 3.0 version is getting installed. When i deploy this built Docker image to AWS Lambda,it also says Tesseract 3.0 is installed.
But I want Tesseract 4.0 or preferably Tesserct 5.0.
I tried changing the "RUN yum -y install tesseract" in my Dockerfile to "RUN yum -y install tesseract 5.0.0-alpha-320-g8dc3" and "RUN yum -y install tesseract -y" or "RUN yum -y install tesseract*".
But all of them are installing Tesseract 3.0.
Please can anyone tell me where I am going wrong?
I am a bit new to Tesseract, so any help is appreciated..thanks!
Having the same problem, I finally created a Dockerfile myself:
FROM public.ecr.aws/lambda/java:11 q
# Prepare dev tools
RUN yum -y update
RUN yum -y install wget libstdc++ autoconf automake libtool autoconf-archive pkg-config gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
RUN yum group install -y "Development Tools"
# Build leptonica
WORKDIR /opt
RUN wget http://www.leptonica.org/source/leptonica-1.82.0.tar.gz
RUN ls -la
RUN tar -zxvf leptonica-1.82.0.tar.gz
WORKDIR ./leptonica-1.82.0
RUN ./configure
RUN make -j
RUN make install
RUN cd .. && rm leptonica-1.82.0.tar.gz
# Build tesseract
RUN wget https://github.com/tesseract-ocr/tesseract/archive/refs/tags/5.2.0.tar.gz
RUN tar -zxvf 5.2.0.tar.gz
WORKDIR ./tesseract-5.2.0
RUN ./autogen.sh
RUN PKG_CONFIG_PATH=/usr/local/lib/pkgconfig LIBLEPT_HEADERSDIR=/usr/local/include ./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
RUN LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
RUN make install
RUN /sbin/ldconfig
RUN cd .. && rm 5.2.0.tar.gz
# Optional: install language packs
RUN wget https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata
RUN wget https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata
RUN mv *.traineddata /usr/local/share/tessdata
WORKDIR /root
ENTRYPOINT [ "tesseract", "--version" ]
Hope this helps!

Docker error when containerizing app in Google Cloud Run

I am trying to run transformers from huggingface in Google Cloud Run.
My first idea was to run one of the dockerfiles provided by huggingface, but it seems that is not possible.
Any ideas on how to get around this error?
Step 6/9 : WORKDIR /workspace
---> Running in xxx
Removing intermediate container xxx
---> xxx
Step 7/9 : COPY . transformers/
---> xxx
Step 8/9 : RUN cd transformers/ && python3 -m pip install --no-cache-dir .
---> Running in xxx
←[91mERROR: Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
The command '/bin/sh -c cd transformers/ && python3 -m pip install --no-cache-dir .' returned a non-zero code: 1
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1
←[0m
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
ERROR: (gcloud.builds.submit) build xxx completed with status "FAILURE"
Dockerfile from huggingface:
FROM nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
LABEL maintainer="Hugging Face"
LABEL repository="transformers"
RUN apt update && \
apt install -y bash \
build-essential \
git \
curl \
ca-certificates \
python3 \
python3-pip && \
rm -rf /var/lib/apt/lists
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
mkl \
tensorflow
WORKDIR /workspace
COPY . transformers/
RUN cd transformers/ && \
python3 -m pip install --no-cache-dir .
CMD ["/bin/bash"]
.dockerignore file from Google Cloud Run documentation:
Dockerfile
README.md
*.pyc
*.pyo
*.pyd
__pycache__
.pytest_cache
---- Edit:
Managed to get working based on the answer from Dustin. I basically:
left the Dockerfile in the root folder, together with the transformers folder.
updated the COPY line from the dockerfile to:
COPY . ./
The error is:
Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
This is due to these two lines in your Dockerfile:
COPY . transformers/
RUN cd transformers/ && \
python3 -m pip install --no-cache-dir .
This attempts to copy the local directory containing the Dockerfile into the container, and then install it as a Python project.
It looks like the Dockerfile expects to be run at the repository root of https://github.com/huggingface/transformers. You should cloning the repo and move the Dockerfile you want to build into the root, and then build again.

Getting Permission Denied error while accessing a file in Docker

I am trying to deploy a model on AWS Sagemaker and using the following docker file:
FROM ubuntu:16.04
#MAINTAINER Amazon AI <sage-learner#amazon.com>
RUN apt-get -y update && apt-get install -y --no-install-recommends \
wget \
python3.5-dev \
gcc \
nginx \
ca-certificates \
libgcc-5-dev \
&& rm -rf /var/lib/apt/lists/*
# Here we get all python packages.
# There's substantial overlap between scipy and numpy that we eliminate by
# linking them together. Likewise, pip leaves the install caches populated which uses
# a significant amount of space. These optimizations save a fair amount of space in the
# image, which reduces start up time.
RUN wget https://bootstrap.pypa.io/3.3/get-pip.py && python3.5 get-pip.py && \
pip3 install numpy==1.14.3 scipy lightfm scikit-optimize pandas==0.22.0 flask gevent gunicorn && \
rm -rf /root/.cache
# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"
# Set up the program in the image
COPY lightfm /opt/program
WORKDIR /opt/program
The docker container is built successfully, but when I write the following command:
docker run XYZ train
on my local or even on Sagemaker, I am getting the following error:
standard_init_linux.go:207: exec user process caused "permission denied"
In the docker file I am copying a folder called Lightfm and there is a file called "train" in it.
Can anyone help?
OUTPUT OF MY DOCKER BUILD:
$ docker build -t lightfm .
Sending build context to Docker daemon 41.47kB
Step 1/9 : FROM ubuntu:16.04
---> 5e13f8dd4c1a
Step 2/9 : RUN apt-get -y update && apt-get install -y --no-install-recommends wget python3.5-dev gcc nginx ca-certificates libgcc-5-dev && rm -rf /var/lib/apt/lists/*
---> Using cache
---> 14ae3a1eb780
Step 3/9 : RUN wget https://bootstrap.pypa.io/3.3/get-pip.py && python3.5 get-pip.py && pip3 install numpy==1.14.3 scipy lightfm scikit-optimize pandas==0.22.0 flask gevent gunicorn && rm -rf /root/.cache
---> Using cache
---> 5a2727e27385
Step 4/9 : ENV PYTHONUNBUFFERED=TRUE
---> Using cache
---> 43bf8c5e8414
Step 5/9 : ENV PYTHONDONTWRITEBYTECODE=TRUE
---> Using cache
---> 7d2c45d61cec
Step 6/9 : ENV PATH="/opt/program:${PATH}"
---> Using cache
---> f3cc6313c0d9
Step 7/9 : COPY lightfm /opt/program
---> ad929ba84692
Step 8/9 : WORKDIR /opt/program
---> Running in a040dd0bab03
Removing intermediate container a040dd0bab03
---> 8f53c5a3ba63
Step 9/9 : RUN chmod 755 serve
---> Running in 5666abb27cd0
Removing intermediate container 5666abb27cd0
---> e80aca934840
Successfully built e80aca934840
Successfully tagged lightfm:latest
SECURITY WARNING: You are building a Docker image from Windows against a non-Windows Docker host. All files and directories added to build context will have '-rwxr-xr-x' permissions. It is recommended to double check and reset permissions for sensitive files and directories.
Assuming train is the executable you want to run, give it exec permission. After COPY lightfm /opt/program line, add RUN chmod +x /opt/program/train.

Fastest way to install Tesseract on Elastic Beanstalk

I am currently using Tika to extract text from files uploaded to my Rails app running on AWS Elastic Beanstalk (64bit Amazon Linux 2016.03 v2.1.2 running Ruby 2.2). I'd like to index scanned images as well, so I need to install Tesseract.
I was able to get it to work by installing it from source like so, but it added 10 minutes to my deploys to a fresh instance. Is there a faster way to do this?
.ebextensions/02-tesseract.config
packages:
yum:
autoconf: []
automake: []
libtool: []
libpng-devel: []
libtiff-devel: []
zlib-devel: []
container_commands:
01-command:
command: mkdir -p install
cwd: /home/ec2-user
02-command:
command: cp .ebextensions/scripts/install_tesseract.sh /home/ec2-user/install/
03-command:
command: bash install/install_tesseract.sh
cwd: /home/ec2-user
.ebextensions/scripts/install_tesseract.sh
#!/usr/bin/env bash
cd_to_install () {
cd /home/ec2-user/install
}
cd_to () {
cd /home/ec2-user/install/$1
}
if ! [ -x "$(command -v tesseract)" ]; then
# Add `usr/local/bin` to PATH
echo 'pathmunge /usr/local/bin' > /etc/profile.d/usr_local.sh
chmod +x /etc/profile.d/usr_local.sh
# Install leptonica
cd_to_install
wget http://www.leptonica.org/source/leptonica-1.73.tar.gz
tar -zxvf leptonica-1.73.tar.gz
cd_to leptonica-1.73
./configure
make
make install
rm -rf /home/ec2-user/install/leptonica-1.73.tar.gz
rm -rf /home/ec2-user/install/leptonica-1.73
# Install tesseract ~ the jewel of Odin's treasure room
cd_to_install
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.01.tar.gz
tar -zxvf 3.04.01.tar.gz
cd_to tesseract-3.04.01
./autogen.sh
./configure
make
make install
ldconfig
rm -rf /home/ec2-user/install/3.04.01.tar.gz
rm -rf /home/ec2-user/install/tesseract-3.04.01
# Install tessdata
cd_to_install
wget https://github.com/tesseract-ocr/tessdata/archive/3.04.00.tar.gz
tar -zxvf 3.04.00.tar.gz
cp /home/ec2-user/install/tessdata-3.04.00/eng.* /usr/local/share/tessdata/
rm -rf /home/ec2-user/install/3.04.00.tar.gz
rm -rf /home/ec2-user/install/tessdata-3.04.00
fi
Short answer
.ebextensions/02-tesseract.config
commands:
01-libwebp:
command: "yum --enablerepo=epel --disablerepo=amzn-main -y install libwebp"
02-tesseract:
command: "yum --enablerepo=epel -y install tesseract"
Long answer
I'm not familiar with non-Ubuntu package managers or ebextensions, so after some digging, I found that there are precompiled binaries that can be installed on Amazon Linux in the stable EPEL repo.
The first obstacle was figuring out how to use the EPEL repo. The easiest way is to use the enablerepo option on the yum command.
That gets us here:
yum --enablerepo=epel install tesseract
Next, I had to resolve this dependency error:
[root#ip-10-0-1-193 ec2-user]# yum install --enablerepo=epel tesseract
Loaded plugins: priorities, update-motd, upgrade-helper
951 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package tesseract.x86_64 0:3.04.00-3.el6 will be installed
--> Processing Dependency: liblept.so.4()(64bit) for package: tesseract-3.04.00-3.el6.x86_64
--> Running transaction check
---> Package leptonica.x86_64 0:1.72-2.el6 will be installed
--> Processing Dependency: libwebp.so.5()(64bit) for package: leptonica-1.72-2.el6.x86_64
--> Finished Dependency Resolution
Error: Package: leptonica-1.72-2.el6.x86_64 (epel)
Requires: libwebp.so.5()(64bit)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
I found the solution here
Just adding the epel repo doesn't solve it, as the packages in the
amzn-main repository seem to overrule those in the epel repository. If
the libwebp package in the amzn-main repo are excluded it should work
The Tesseract install has some dependencies found in the amzn-main repo. This is why I first install libwebp with --disablerepo=amzn-main.
yum --enablerepo=epel --disablerepo=amzn-main install libwebp
yum --enablerepo=epel install tesseract
Finally, here's how you can install yum packages on Elastic Beanstalk with options:
.ebextensions/02-tesseract.config
commands:
01-libwebp:
command: "yum --enablerepo=epel --disablerepo=amzn-main -y install libwebp"
02-tesseract:
command: "yum --enablerepo=epel -y install tesseract"
Fortunately, this is also the easiest way to install Tesseract on Elastic Beanstalk!

Compiling libsass in Docker container

So I have the following Dockerfile:
############################################################
# Based on Ubuntu
############################################################
FROM ubuntu:trusty
MAINTAINER OTIS WRIGHT
# Add extra software sources
RUN apt-get update -qq
# Libsass requirements.
RUN apt-get install -y curl git build-essential automake libtool
# Fetch sources
RUN git clone https://github.com/sass/libsass.git
RUN git clone https://github.com/sass/sassc.git libsass/sassc
# Create configure script
RUN cd libsass
RUN autoreconf --force --install
RUN cd ..
# Create custom makefiles for **shared library**, for more info read:
# 'Difference between static and shared libraries?' before installing libsass http://stackoverflow.com/q/2649334/802365
RUN cd libsass
RUN autoreconf --force --install
RUN ./configure --disable-tests --enable-shared --prefix=/usr
RUN cd ..
# Build the library
RUN make -C libsass -j5
# Install the library
RUN make -C libsass -j5 install
I am trying to build libsass based on this gist:
https://gist.github.com/edouard-lopez/503d40a5c1a49cf8ae87
However when I try to build with docker I get the following error:
Step 11 : RUN git clone https://github.com/sass/libsass.git
---> Using cache
---> d1a4eef78fa5
Step 12 : RUN git clone https://github.com/sass/sassc.git libsass/sassc
---> Using cache
---> 435410579641
Step 13 : RUN cd libsass
---> Using cache
---> f0a4df503d85
Step 14 : RUN autoreconf --force --install
---> Running in a9b0d51d6ee3
autoreconf: 'configure.ac' or 'configure.in' is required
The command '/bin/sh -c autoreconf --force --install' returned a non-zero code: 1
I don't know a lot about compiling from source so any solutions please walk me through.
Docker won't change dirs for you RUN cd libsass so you need to path your commands or use the WORKDIR directive