Run conda inside Singularity - Dockerfile

I would like to run a conda command with singularity.
The command is:
singularity exec ~/dockerimage.sif conda
It yields an error:
/.singularity.d/actions/exec: 9: exec: conda: Permission denied
Here is my dockerfile:
FROM ubuntu:20.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y apt-utils wget=1.20.3-1ubuntu1 python3.8=3.8.2-1ubuntu1.2 python3-pip=20.0.2-5ubuntu1 python3-yaml=5.3.1-1 git=1:2.25.1-1ubuntu3
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.8.3-Linux-x86_64.sh && chmod +x Miniconda3-py38_4.8.3-Linux-x86_64.sh && ./Miniconda3-py38_4.8.3-Linux-x86_64.sh -b && cp /root/miniconda3/bin/conda /usr/bin/conda
RUN wget https://data.qiime2.org/distro/core/qiime2-2020.8-py36-linux-conda.yml && conda env create -n qiime2-2020.8 --file qiime2-2020.8-py36-linux-conda.yml && conda install -y -n qiime2-2020.8 -c conda-forge -c bioconda -c qiime2 -c defaults q2cli q2template q2-types q2-feature-table q2-metadata vsearch snakemake
What should I add to the Dockerfile? How would it work?

You're installing Miniconda with its default settings, which place it in the home directory of the current user. During the Docker build that user is root, so conda ends up under /root. Singularity runs the container as your own user, so unless you're running as root the conda files will not be accessible. To fix it (see the sketch after this list):
- Modify your conda install command to set an explicit install prefix: -p /opt/conda (or some other neutral location).
- Make sure any user will be able to access the files installed by conda: chmod -R o+rX /opt/conda.
- Update PATH to include conda: export PATH="$PATH:/opt/conda/bin".
- When running your image, make sure your host environment variables are not overriding those in the container: singularity exec --cleanenv ~/dockerimage.sif conda
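Applied to the Dockerfile in the question, the relevant lines might look like the following. This is a minimal sketch of the idea, reusing the installer version from the question; /opt/conda is just the example prefix from above:
FROM ubuntu:20.04
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y wget
# Install Miniconda to a fixed, world-readable prefix instead of /root
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.8.3-Linux-x86_64.sh && \
    chmod +x Miniconda3-py38_4.8.3-Linux-x86_64.sh && \
    ./Miniconda3-py38_4.8.3-Linux-x86_64.sh -b -p /opt/conda && \
    chmod -R o+rX /opt/conda
# Put conda on PATH for every user, not just root
ENV PATH="$PATH:/opt/conda/bin"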

Related

Pyarrow fs.HadoopFileSystem reports unable to load libhdfs.so

I'm trying to use the pyarrow Filesystem interface with HDFS. I receive a libhdfs.so not found error when calling the fs.HadoopFileSystem constructor even though libhdfs.so is apparently at the indicated location.
from pyarrow import fs
hfs = fs.HadoopFileSystem(host="10.10.0.167", port=9870)
OSError: Unable to load libhdfs: /hadoop-3.3.1/lib/native/libhdfs.so: cannot open shared object file: No such file or directory
I've tried different python and pyarrow versions and setting ARROW_LIBHDFS_DIR. For testing, I'm using the following dockerfile on linuxmint.
FROM openjdk:11
RUN apt-get update &&\
apt-get install wget -y
RUN wget -nv https://dlcdn.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1-aarch64.tar.gz &&\
tar -xf hadoop-3.3.1-aarch64.tar.gz
ENV PATH=/miniconda/bin:${PATH}
RUN wget -nv https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh &&\
bash miniconda.sh -b -p /miniconda &&\
conda init
RUN conda install -c conda-forge python=3.9.6
RUN conda install -c conda-forge pyarrow=4.0.1
ENV JAVA_HOME=/usr/local/openjdk-11
ENV HADOOP_HOME=/hadoop-3.3.1
RUN printf 'from pyarrow import fs\nhfs = fs.HadoopFileSystem(host="10.10.0.167", port=9870)\n' > test_arrow.py
# 'python test_arrow.py' fails with ...
# OSError: Unable to load libhdfs: /hadoop-3.3.1/lib/native/libhdfs.so: cannot open shared object file: No such file or directory
RUN python test_arrow.py || true
CMD ["/bin/bash"]
I have created a working Dockerfile for the pyarrow fs.HadoopFileSystem client.
Hadoop itself needs to be installed in the image, because pyarrow loads the native libhdfs.so that ships with it.
# Base image assumed here for completeness; the original snippet did not show
# one. Any Debian-based image with Python 3 and pip should work.
FROM python:3.9-slim
WORKDIR /app
RUN mkdir -p /data/hadoop
RUN apt-get -q update
RUN apt-get install software-properties-common wget -y
RUN add-apt-repository "deb http://deb.debian.org/debian/ sid main"
RUN apt-get -q update
RUN apt-get install openjdk-8-jdk -y
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*
RUN wget "https://dlcdn.apache.org/hadoop/common/hadoop-3.3.2/hadoop-3.3.2.tar.gz" -O hadoop-3.3.2.tar.gz
RUN tar xzf hadoop-3.3.2.tar.gz
# pyarrow itself (assumed step; the original snippet did not show it)
RUN pip install pyarrow
ENV HADOOP_HOME=/app/hadoop-3.3.2
ENV HADOOP_INSTALL=$HADOOP_HOME
ENV HADOOP_MAPRED_HOME=$HADOOP_HOME
ENV HADOOP_COMMON_HOME=$HADOOP_HOME
ENV HADOOP_HDFS_HOME=$HADOOP_HOME
ENV YARN_HOME=$HADOOP_HOME
ENV HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
ENV PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
ENV HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
ENV JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
# NOTE: ENV cannot execute a command, so this stores a literal string rather
# than the expanded classpath; see the runtime workaround after this file.
ENV CLASSPATH="$HADOOP_HOME/bin/hadoop classpath --glob"
ENV ARROW_LIBHDFS_DIR=$HADOOP_HOME/lib/native
ADD pyarrow-app.py /app/
CMD ["python3", "-u", "/app/pyarrow-app.py"]

sdkman does not install java in a dockerfile

I have this docker file:
# We are going to start from the jhipster image
FROM jhipster/jhipster
# install as root
USER root
### Setup docker cli (don't need docker daemon) ###
# Install some packages
RUN apt-get install apt-transport-https ca-certificates curl gnupg-agent software-properties-common -y
# Add Docker's official GPG key:
RUN ["/bin/bash", "-c", "set -o pipefail && curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -"]
# Add a stable repository
RUN add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
# Setup aws credentials as environment variables
ENV AWS_ACCESS_KEY_ID "change it!"
ENV AWS_SECRET_ACCESS_KEY "change it!"
# noninteractive install for tzdata
ARG DEBIAN_FRONTEND=noninteractive
# set timezone for tzdata
ENV TZ=America/Sao_Paulo
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
# Install the latest version of Docker Engine - Community and also aws cli
RUN apt-get update && apt-get install docker-ce docker-ce-cli containerd.io awscli -y
# change back to default user
USER jhipster
# install sdkman and java version 1.8
RUN curl -s "https://get.sdkman.io" | bash
RUN bash $HOME/.sdkman/bin/sdkman-init.sh
RUN bash -c "sdk install java 8.0.222.j9-adpt"
When I run a command to build an image from this dockerfile it fails on the last step with a message:
/bin/sh: 1: sdk: not found
When I install it on my local machine, sdkman (sdk) runs under bash. But this script calls it from sh, not bash. How can I make it call sdkman (sdk) from sh? What I actually want to do is install a specific Java version through sdkman (sdk). Is there another way to do it?
For the sdk command to be available, you need to run source sdkman-init.sh in the same shell that will use it.
Here is a working sample with Java 11 on CentOS.
FROM centos:latest
ARG CANDIDATE=java
ARG CANDIDATE_VERSION=11.0.6-open
ENV SDKMAN_DIR=/root/.sdkman
# update the image
RUN yum -y upgrade
# install requirements, install and configure sdkman
# see https://sdkman.io/usage for configuration options
RUN yum -y install curl ca-certificates zip unzip openssl which findutils && \
update-ca-trust && \
curl -s "https://get.sdkman.io" | bash && \
echo "sdkman_auto_answer=true" > $SDKMAN_DIR/etc/config && \
echo "sdkman_auto_selfupdate=false" >> $SDKMAN_DIR/etc/config
# Source sdkman to make the sdk command available and install candidate
RUN bash -c "source $SDKMAN_DIR/bin/sdkman-init.sh && sdk install $CANDIDATE $CANDIDATE_VERSION"
# Add candidate path to $PATH environment variable
ENV JAVA_HOME="$SDKMAN_DIR/candidates/java/current"
ENV PATH="$JAVA_HOME/bin:$PATH"
ENTRYPOINT ["/bin/bash", "-c", "source $SDKMAN_DIR/bin/sdkman-init.sh && \"$#\"", "-s"]
CMD ["sdk", "help"]
The problem is that every RUN command in a Dockerfile is executed in a new shell, so whatever sdkman-init.sh sets up is gone by the next line. You need to source the init script and call sdk in the same shell, in a single RUN:
RUN bash -c "source $HOME/.sdkman/bin/sdkman-init.sh && sdk install java 8.0.222.j9-adpt"

AWSEBCLI not reading env vars

I am attempting to run AWSEBCLI inside a docker container. I am passing the access key and security token as env vars, as described in the docs under "Configuration Settings and Precedence", but I get:
ERROR: CredentialsError - Operation Denied. You appear to have no credentials
Here is my docker file
FROM circleci/golang
ADD . /go/src
WORKDIR /go/src
RUN sudo apt-get -y -qq update --assume-yes
RUN sudo apt-get install python-pip python-dev build-essential --assume-yes
RUN sudo pip install awscli=="1.16.9"
RUN sudo pip install awsebcli=="3.14.4"
RUN echo ${AWS_ACCESS_KEY_ID}
RUN echo ${AWS_SECRET_ACCESS_KEY}
CMD sudo eb deploy Circledocker
The environment defined in your user session and the environment of the sudo session are not the same:
RUN echo ${AWS_ACCESS_KEY_ID} -> works
RUN sudo bash -c 'echo $AWS_ACCESS_KEY_ID' -> will not print the value, because sudo resets the environment by default
Take a look at man sudo, specifically the -E flag:
-E, --preserve-env
Indicates to the security policy that the user wishes to preserve their
existing environment variables. The security policy may return an error
if the user does not have permission to preserve the environment.
So this normally works:
sudo -E bash -c 'echo $AWS_ACCESS_KEY_ID'
Try your eb deploy command like this:
sudo -E bash -c 'eb deploy Circledocker'
Hope it helps!
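One related caveat, beyond the original answer's scope: the variables must also reach the container in the first place. If they live on the host rather than being baked in with ENV, forward them at run time (the image name here is a placeholder):
# Forward the host's AWS credentials into the container, then preserve them through sudo
docker run \
  -e AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY \
  my-eb-image sudo -E bash -c 'eb deploy Circledocker'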

macOS, Dockerfile mounting a folder cannot change locale

I'm trying to mount a folder into the container at run time instead of copying it at build time. We use git for development and I don't want to rebuild the image every time I make a change for testing.
My Dockerfile is now as such:
#set base image
FROM centos:centos7.2.1511
MAINTAINER Alex <alex@app.com>
#install yum dependencies
RUN yum -y update \
&& yum -y install yum-plugin-ovl \
&& yum -y install epel-release \
&& yum -y install net-tools \
&& yum -y install gcc \
&& yum -y install python-devel \
&& yum -y install git \
&& yum -y install python-pip \
&& yum -y install openldap-devel \
&& yum -y install gcc gcc-c++ kernel-devel \
&& yum -y install libxslt-devel libffi-devel openssl-devel \
&& yum -y install libevent-devel \
&& yum -y install openldap-devel \
&& yum -y install net-snmp-devel \
&& yum -y install mysql-devel \
&& yum -y install python-dateutil \
&& yum -y install python-pip \
&& pip install --upgrade pip
# Create the DIR
#RUN mkdir -p /var/www/itapp
# Set the working directory
#WORKDIR /var/www/itapp
# Copy the app directory contents into the container
#ADD . /var/www/itapp
# Install any needed packages specified in requirements.txt
#RUN pip install -r requirements.txt
# Make port available to the world outside this container
EXPOSE 8000
# Define environment variable
ENV NAME itapp
# Run server when the container launches
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
I've commented out the creation and copy of the itapp Django files, since I want to mount them instead (do I need to rebuild the image first?).
then my command for mounting is
docker run -it -v /Users/alex/itapp:/var/www/itapp itapp bash
I now get an error:
bash: warning: setlocale: LC_CTYPE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_COLLATE: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_MESSAGES: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_NUMERIC: cannot change locale (en_US.UTF-8): No such file or directory
bash: warning: setlocale: LC_TIME: cannot change locale (en_US.UTF-8): No such file or directory
and the dev instance does not run.
How would I also set the working directory to the volume that I'm mounting at runtime?
Try this command. The -w option in docker run sets the working directory inside the container.
docker run -d -v /Users/alex/itapp:/var/www/itapp -w /var/www/itapp itapp
Also, you'll need to map your container port to a host port to be able to reach your app, for example from a browser.
To do this, use the following command:
docker run -d -p 8000:8000 -v /Users/alex/itapp:/var/www/itapp -w /var/www/itapp itapp
After this, your app should be reachable at localhost:8000.

Why does CMD never work in my Dockerfiles?

I have a few Dockerfiles where CMD doesn't seem to run. Here is an example (all the way at the bottom).
##########################################################
# Set the base image to Ansible
FROM ubuntu:16.10
# Install Ansible, Python and Related Deps #
RUN apt-get -y update && \
apt-get install -y python-yaml python-jinja2 python-httplib2 python-keyczar python-paramiko python-setuptools python-pkg-resources git python-pip
RUN mkdir /etc/ansible/
RUN echo '[local]\nlocalhost\n' > /etc/ansible/hosts
RUN mkdir /opt/ansible/
RUN git clone http://github.com/ansible/ansible.git /opt/ansible/ansible
WORKDIR /opt/ansible/ansible
RUN git submodule update --init
ENV PATH /opt/ansible/ansible/bin:/bin:/usr/bin:/usr/local/bin:/sbin:/usr/sbin
ENV PYTHONPATH /opt/ansible/ansible/lib
ENV ANSIBLE_LIBRARY /opt/ansible/ansible/library
RUN apt-get update -y
RUN apt-get install python -y
RUN apt-get install python-dev -y
RUN apt-get install python-setuptools -y
RUN apt-get install python-pip
RUN mkdir /ansible/
WORKDIR /ansible
COPY ./ansible ./
WORKDIR /
RUN ansible-playbook -c local ansible/playbooks/installdjango.yml
ENV PROJECTNAME testwebsite
################## SETUP DIRECTORY STRUCTURE ######################
WORKDIR /home
CMD ["django-admin" "startproject" "$PROJECTNAME"]
EXPOSE 8000
If I build and run the container, I can manually run
django-admin startproject $PROJECTNAME and it will create a new project as expected, but the CMD in my Dockerfile does not seem to be doing anything. This is happening with all my other Dockerfiles too, so there's something I must not be getting.
ENTRYPOINT and CMD define the default command that Docker runs when it starts your container, not while the image is being built. When ENTRYPOINT isn't defined, Docker simply runs the value of CMD; otherwise, CMD becomes the arguments to the ENTRYPOINT. When you run your image, you can override the value of CMD by passing arguments after the image name.
So, in your example above, CMD may be defined as anything, but when you run your container with docker run -it <imagename> /bin/bash, you override any value of CMD and replace it with /bin/bash. To run the defined value of CMD, you would need to run the container with docker run <imagename>.
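To make the build-time versus run-time split concrete, the difference looks like this at the command line (the image name is a placeholder):
docker build -t myimage .         # RUN lines execute here; CMD is only recorded
docker run myimage                # now the recorded CMD runs
docker run -it myimage /bin/bash  # overrides CMD, so the default command never runs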