Deploying a Geodjango Application on AWS Elastic Beanstalk - django

I'm trying to deploy a geodjango application on AWS Elastic Beanstalk. The configuration is 64bit Amazon Linux 2017.09 v2.6.6 running Python 3.6. I am getting this error when trying to deploy:
Requires: libpoppler.so.5()(64bit) Error: Package: gdal-java-1.9.2-8.rhel6.x86_64 (pgdg93) Requires: libpoppler.so.5()(64bit)
How do I install the required package? I read through Setting up Django with GeoDjango Support in AWS Beanstalk or EC2 Instance but I am still getting problems. My ebextensions currently looks like:
commands:
01_yum_update:
command: sudo yum -y update
02_epel_repo:
command: sudo yum-config-manager -y --enable epel
03_install_gdal_packages:
command: sudo yum -y install gdal gdal-devel
packages:
yum:
git: []
postgresql95-devel: []
gettext: []
libjpeg-turbo-devel: []
libffi-devel: []

I'm going to answer my own question for the sake my future projects and anyone else trying to get started with geodjango. Updating this answer as of July 2020
Create an ebextensions file to install GDAL on the EC2 instance at deployment:
01_gdal.config
commands:
01_install_gdal:
test: "[ ! -d /usr/local/gdal ]"
command: "/tmp/gdal_install.sh"
files:
"/tmp/gdal_install.sh":
mode: "000755"
owner: root
group: root
content: |
#!/usr/bin/env bash
sudo yum-config-manager --enable epel
sudo yum -y install make automake gcc gcc-c++ libcurl-devel proj-devel geos-devel
# Geos
cd /
sudo mkdir -p /usr/local/geos
cd usr/local/geos/geos-3.7.2
sudo wget geos-3.7.2.tar.bz2 http://download.osgeo.org/geos/geos-3.7.2.tar.bz2
sudo tar -xvf geos-3.7.2.tar.bz2
cd geos-3.7.2
sudo ./configure
sudo make
sudo make install
sudo ldconfig
# Proj4
cd /
sudo mkdir -p /usr/local/proj
cd usr/local/proj
sudo wget -O proj-5.2.0.tar.gz http://download.osgeo.org/proj/proj-5.2.0.tar.gz
sudo wget -O proj-datumgrid-1.8.tar.gz http://download.osgeo.org/proj/proj-datumgrid-1.8.tar.gz
sudo tar xvf proj-5.2.0.tar.gz
sudo tar xvf proj-datumgrid-1.8.tar.gz
cd proj-5.2.0
sudo ./configure
sudo make
sudo make install
sudo ldconfig
# GDAL
cd /
sudo mkdir -p /usr/local/gdal
cd usr/local/gdal
sudo wget -O gdal-2.4.4.tar.gz http://download.osgeo.org/gdal/2.4.4/gdal-2.4.4.tar.gz
sudo tar xvf gdal-2.4.4.tar.gz
cd gdal-2.4.4
sudo ./configure
sudo make
sudo make install
sudo ldconfig
As shown, the script checks whether gdal already exists using the test function. It then downloads the Geos, Proj, and GDAL libraries and installs them in the usr/local directory. At the time of writing this, geodjango (Django 3.0) supports up to Geos 3.7, Proj 5.2 (which also requires projdatum. Current releases do not require it), and GDAL 2.4 Warning: this installation process can take a long time. Also I am not a Linux professional so some of those commands may be redundant, but it works.
Lastly I add the following two environment variables to my Elastic Beanstalk configuration:
LD_LIBRARY_PATH: /usr/local/lib:$LD_LIBRARY_PATH
PROJ_LIB: usr/local/proj
If you still have troubles I recommend checking the logs and ssh-ing in the EC2 instance to check that installation took place. Original credit to this post

Related

Install gdal geodjango on elastic beanstalk

what are the correct steps to install geodjango on elastic beanstalk?
got eb instance, installed env and made it two instances now I want to use geodjango on it, I'm already using it on a separate ec2 instance for testing
that's my django.config file and it fails
option_settings:
aws:elasticbeanstalk:container:python:
WSGIPath: hike.project.wsgi:application
commands:
01_gdal:
command: "wget http://download.osgeo.org/gdal/2.1.3/gdal-2.1.3.tar.gz && tar -xzf gdal-2.1.3.tar.gz && cd gdal-2.1.3 && ./configure && make && make install"
then tried this instead and also failed from 100% cpu and time limit
option_settings:
aws:elasticbeanstalk:container:python:
WSGIPath: hike.project.wsgi:application
commands:
01_install_gdal:
test: "[ ! -d /usr/local/gdal ]"
command: "/tmp/gdal_install.sh"
files:
"/tmp/gdal_install.sh":
mode: "000755"
owner: root
group: root
content: |
#!/usr/bin/env bash
sudo yum-config-manager --enable epel
sudo yum -y install make automake gcc gcc-c++ libcurl-devel proj-devel geos-devel
# Geos
cd /
sudo mkdir -p /usr/local/geos
cd usr/local/geos/geos-3.7.2
sudo wget geos-3.7.2.tar.bz2 http://download.osgeo.org/geos/geos-3.7.2.tar.bz2
sudo tar -xvf geos-3.7.2.tar.bz2
cd geos-3.7.2
sudo ./configure
sudo make
sudo make install
sudo ldconfig
# Proj4
cd /
sudo mkdir -p /usr/local/proj
cd usr/local/proj
sudo wget -O proj-5.2.0.tar.gz http://download.osgeo.org/proj/proj-5.2.0.tar.gz
sudo wget -O proj-datumgrid-1.8.tar.gz http://download.osgeo.org/proj/proj-datumgrid-1.8.tar.gz
sudo tar xvf proj-5.2.0.tar.gz
sudo tar xvf proj-datumgrid-1.8.tar.gz
cd proj-5.2.0
sudo ./configure
sudo make
sudo make install
sudo ldconfig
# GDAL
cd /
sudo mkdir -p /usr/local/gdal
cd usr/local/gdal
sudo wget -O gdal-2.4.4.tar.gz http://download.osgeo.org/gdal/2.4.4/gdal-2.4.4.tar.gz
sudo tar xvf gdal-2.4.4.tar.gz
cd gdal-2.4.4
sudo ./configure
sudo make
sudo make install
sudo ldconfig
and I've no idea what to do,

Unable to install Tesseract 5.0 version on AWS Lambda

I want to run Tesseract 4.0 or Tesseract 5.0 on my AWS Lambda function. So I have my docker file like so-
FROM public.ecr.aws/lambda/python:3.8
RUN mkdir app
# Copy function code
COPY / ${LAMBDA_TASK_ROOT}/app
# Install the function's dependencies using file requirements.txt
# from your project folder.
COPY requirements.txt .
RUN pip3 install -r requirements.txt --target ${LAMBDA_TASK_ROOT}
RUN rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
RUN yum -y update
RUN yum -y install tesseract
RUN yum install -y poppler-utils
# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.com.emlAndMsgParser.mail_parser_test.getEmail_from_msg" ]
but when i do DockerBuild-"docker build -t qa-lambda ." on my terminal, it says Tesseract 3.0 version is getting installed. When i deploy this built Docker image to AWS Lambda,it also says Tesseract 3.0 is installed.
But I want Tesseract 4.0 or preferably Tesserct 5.0.
I tried changing the "RUN yum -y install tesseract" in my Dockerfile to "RUN yum -y install tesseract 5.0.0-alpha-320-g8dc3" and "RUN yum -y install tesseract -y" or "RUN yum -y install tesseract*".
But all of them are installing Tesseract 3.0.
Please can anyone tell me where I am going wrong?
I am a bit new to Tesseract, so any help is appreciated..thanks!
Having the same problem, I finally created a Dockerfile myself:
FROM public.ecr.aws/lambda/java:11 q
# Prepare dev tools
RUN yum -y update
RUN yum -y install wget libstdc++ autoconf automake libtool autoconf-archive pkg-config gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
RUN yum group install -y "Development Tools"
# Build leptonica
WORKDIR /opt
RUN wget http://www.leptonica.org/source/leptonica-1.82.0.tar.gz
RUN ls -la
RUN tar -zxvf leptonica-1.82.0.tar.gz
WORKDIR ./leptonica-1.82.0
RUN ./configure
RUN make -j
RUN make install
RUN cd .. && rm leptonica-1.82.0.tar.gz
# Build tesseract
RUN wget https://github.com/tesseract-ocr/tesseract/archive/refs/tags/5.2.0.tar.gz
RUN tar -zxvf 5.2.0.tar.gz
WORKDIR ./tesseract-5.2.0
RUN ./autogen.sh
RUN PKG_CONFIG_PATH=/usr/local/lib/pkgconfig LIBLEPT_HEADERSDIR=/usr/local/include ./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
RUN LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
RUN make install
RUN /sbin/ldconfig
RUN cd .. && rm 5.2.0.tar.gz
# Optional: install language packs
RUN wget https://github.com/tesseract-ocr/tessdata/raw/main/deu.traineddata
RUN wget https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata
RUN mv *.traineddata /usr/local/share/tessdata
WORKDIR /root
ENTRYPOINT [ "tesseract", "--version" ]
Hope this helps!

sdkman does not install java in a dockerfile

I have this docker file:
# We are going to star from the jhipster image
FROM jhipster/jhipster
# install as root
USER root
### Setup docker cli (don't need docker daemon) ###
# Install some packages
RUN apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common -y
# Add Dockers official GPG key:
RUN ["/bin/bash", "-c", "set -o pipefail && curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -"]
# Add a stable repository
RUN add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Setup aws credentials as environment variables
ENV AWS_ACCESS_KEY_ID "change it!"
ENV AWS_SECRET_ACCESS_KEY "change it!"
# noninteractive install for tzdata
ARG DEBIAN_FRONTEND=noninteractive
# set timezone for tzdata
ENV TZ=America/Sao_Paulo
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
# Install the latest version of Docker Engine - Community and also aws cli
RUN apt-get update && apt-get install docker-ce docker-ce-cli containerd.io awscli -y
# change back to default user
USER jhipster
# install skd and java version 1.8
RUN curl -s "https://get.sdkman.io" | bash
RUN bash $HOME/.sdkman/bin/sdkman-init.sh
RUN bash -c "sdk install java 8.0.222.j9-adpt"
When I run a command to build an image from this dockerfile it fails on the last step with a message:
/bin/sh: 1: sdk: not found
When I install it on my local machine it runs sdkman (sdk) on bash. But on this script it calls it from sh not bash. How can I make it calls skdman (sdk) from sh? What I actually want to do is install a specific java version through sdkman (sdk). Is there another way to do it?
For sdk command to be available you need to run source sdkman-init.sh.
Here is a working sample with java 11 on centos.
FROM centos:latest
ARG CANDIDATE=java
ARG CANDIDATE_VERSION=11.0.6-open
ENV SDKMAN_DIR=/root/.sdkman
# update the image
RUN yum -y upgrade
# install requirements, install and configure sdkman
# see https://sdkman.io/usage for configuration options
RUN yum -y install curl ca-certificates zip unzip openssl which findutils && \
update-ca-trust && \
curl -s "https://get.sdkman.io" | bash && \
echo "sdkman_auto_answer=true" > $SDKMAN_DIR/etc/config && \
echo "sdkman_auto_selfupdate=false" >> $SDKMAN_DIR/etc/config
# Source sdkman to make the sdk command available and install candidate
RUN bash -c "source $SDKMAN_DIR/bin/sdkman-init.sh && sdk install $CANDIDATE $CANDIDATE_VERSION"
# Add candidate path to $PATH environment variable
ENV JAVA_HOME="$SDKMAN_DIR/candidates/java/current"
ENV PATH="$JAVA_HOME/bin:$PATH"
ENTRYPOINT ["/bin/bash", "-c", "source $SDKMAN_DIR/bin/sdkman-init.sh && \"$#\"", "-s"]
CMD ["sdk", "help"]
The problem is every RUN command in Dockerfile is executed within a new bash environment, so you need to put both of your last two commands under the same line to look like this:
RUN bash $HOME/.sdkman/bin/sdkman-init.sh && bash -c "sdk install java 8.0.222.j9-adpt"

AWSEBCLI not reading env vars

I am attempting to run AWSEBCLI inside a docker container. I am passing the access key and security token as env vars as described in the docs under "Configuration Settings and Precedence"
ERROR: CredentialsError - Operation Denied. You appear to have no credentials
Here is my docker file
FROM circleci/golang
ADD . /go/src
WORKDIR /go/src
RUN sudo apt-get -y -qq update --assume-yes
RUN sudo apt-get install python-pip python-dev build-essential --assume-yes
RUN sudo pip install awscli=="1.16.9"
RUN sudo pip install awsebcli=="3.14.4"
RUN echo ${AWS_ACCESS_KEY_ID}
RUN echo ${AWS_SECRET_ACCESS_KEY}
CMD sudo eb deploy Circledocker
The environment defined in your user session and the sudo session are not the same.
RUN echo ${AWS_ACCESS_KEY_ID} -> Works
RUN sudo echo ${AWS_ACCESS_KEY_ID} -> Will not provide you the value.
Take a look at man sudo, the -E flag :
-E, --preserve-env
Indicates to the security policy that the user wishes to preserve their
existing environment variables. The security policy may return an error
if the user does not have permission to preserve the environment.
So this normally works :
sudo -E bash -c 'echo $AWS_ACCESS_KEY_ID'
Try your eb deploy command like this :
sudo -E bash -c 'eb deploy Circledocker'
Hope it helps !

Fastest way to install Tesseract on Elastic Beanstalk

I am currently using Tika to extract text from files uploaded to my Rails app running on AWS Elastic Beanstalk (64bit Amazon Linux 2016.03 v2.1.2 running Ruby 2.2). I'd like to index scanned images as well, so I need to install Tesseract.
I was able to get it to work by installing it from source like so, but it added 10 minutes to my deploys to a fresh instance. Is there a faster way to do this?
.ebextensions/02-tesseract.config
packages:
yum:
autoconf: []
automake: []
libtool: []
libpng-devel: []
libtiff-devel: []
zlib-devel: []
container_commands:
01-command:
command: mkdir -p install
cwd: /home/ec2-user
02-command:
command: cp .ebextensions/scripts/install_tesseract.sh /home/ec2-user/install/
03-command:
command: bash install/install_tesseract.sh
cwd: /home/ec2-user
.ebextensions/scripts/install_tesseract.sh
#!/usr/bin/env bash
cd_to_install () {
cd /home/ec2-user/install
}
cd_to () {
cd /home/ec2-user/install/$1
}
if ! [ -x "$(command -v tesseract)" ]; then
# Add `usr/local/bin` to PATH
echo 'pathmunge /usr/local/bin' > /etc/profile.d/usr_local.sh
chmod +x /etc/profile.d/usr_local.sh
# Install leptonica
cd_to_install
wget http://www.leptonica.org/source/leptonica-1.73.tar.gz
tar -zxvf leptonica-1.73.tar.gz
cd_to leptonica-1.73
./configure
make
make install
rm -rf /home/ec2-user/install/leptonica-1.73.tar.gz
rm -rf /home/ec2-user/install/leptonica-1.73
# Install tesseract ~ the jewel of Odin's treasure room
cd_to_install
wget https://github.com/tesseract-ocr/tesseract/archive/3.04.01.tar.gz
tar -zxvf 3.04.01.tar.gz
cd_to tesseract-3.04.01
./autogen.sh
./configure
make
make install
ldconfig
rm -rf /home/ec2-user/install/3.04.01.tar.gz
rm -rf /home/ec2-user/install/tesseract-3.04.01
# Install tessdata
cd_to_install
wget https://github.com/tesseract-ocr/tessdata/archive/3.04.00.tar.gz
tar -zxvf 3.04.00.tar.gz
cp /home/ec2-user/install/tessdata-3.04.00/eng.* /usr/local/share/tessdata/
rm -rf /home/ec2-user/install/3.04.00.tar.gz
rm -rf /home/ec2-user/install/tessdata-3.04.00
fi
Short answer
.ebextensions/02-tesseract.config
commands:
01-libwebp:
command: "yum --enablerepo=epel --disablerepo=amzn-main -y install libwebp"
02-tesseract:
command: "yum --enablerepo=epel -y install tesseract"
Long answer
I'm not familiar with non-Ubuntu package managers or ebextensions, so after some digging, I found that there are precompiled binaries that can be installed on Amazon Linux in the stable EPEL repo.
The first obstacle was figuring out how to use the EPEL repo. The easiest way is to use the enablerepo option on the yum command.
That gets us here:
yum --enablerepo=epel install tesseract
Next, I had to resolve this dependency error:
[root#ip-10-0-1-193 ec2-user]# yum install --enablerepo=epel tesseract
Loaded plugins: priorities, update-motd, upgrade-helper
951 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package tesseract.x86_64 0:3.04.00-3.el6 will be installed
--> Processing Dependency: liblept.so.4()(64bit) for package: tesseract-3.04.00-3.el6.x86_64
--> Running transaction check
---> Package leptonica.x86_64 0:1.72-2.el6 will be installed
--> Processing Dependency: libwebp.so.5()(64bit) for package: leptonica-1.72-2.el6.x86_64
--> Finished Dependency Resolution
Error: Package: leptonica-1.72-2.el6.x86_64 (epel)
Requires: libwebp.so.5()(64bit)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
I found the solution here
Just adding the epel repo doesn't solve it, as the packages in the
amzn-main repository seem to overrule those in the epel repository. If
the libwebp package in the amzn-main repo are excluded it should work
The Tesseract install has some dependencies found in the amzn-main repo. This is why I first install libwebp with --disablerepo=amzn-main.
yum --enablerepo=epel --disablerepo=amzn-main install libwebp
yum --enablerepo=epel install tesseract
Finally, here's how you can install yum packages on Elastic Beanstalk with options:
.ebextensions/02-tesseract.config
commands:
01-libwebp:
command: "yum --enablerepo=epel --disablerepo=amzn-main -y install libwebp"
02-tesseract:
command: "yum --enablerepo=epel -y install tesseract"
Fortunately, this is also the easiest way to install Tesseract on Elastic Beanstalk!