tesseract command not working and giving file error - c++

I have installed Tesseract 4.0 on Ubuntu.
I can perform all the usual Tesseract actions from the CLI, such as simple OCR text generation.
I want to train the LSTM.
I read this article and tried to run the following commands directly in the terminal after installing Tesseract from a source build.
mkdir -p ~/tesstutorial/engoutput
training/lstmtraining --debug_interval 100 \
--traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
--net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
--model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
--train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
--eval_listfile ~/tesstutorial/engeval/eng.training_files.txt \
--max_iterations 5000 &>~/tesstutorial/engoutput/basetrain.log
The engoutput directory was created, and the current directory was the src directory of Tesseract.
However, I get the following error:
bash: training/lstmtraining: No such file or directory
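bash reports this when the relative path training/lstmtraining does not resolve to an existing file from the current directory. A minimal sketch for checking where the binary is before starting a long run; the helper name find_lstmtraining is hypothetical, and it only looks at the relative path used above and at PATH:

```shell
# Hypothetical helper: print the path of a usable lstmtraining binary,
# or return non-zero if none is found.
find_lstmtraining() {
    if [ -x "training/lstmtraining" ]; then
        # Relative path used in the command above; it only resolves
        # when the current directory contains training/lstmtraining.
        echo "training/lstmtraining"
    elif command -v lstmtraining >/dev/null 2>&1; then
        # Fall back to a binary found on PATH.
        command -v lstmtraining
    else
        echo "lstmtraining not found from $(pwd)" >&2
        return 1
    fi
}
```

Calling find_lstmtraining from the directory where the training command will run shows whether the relative path is valid there.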

I fixed it with the following commands.
Create the training data first:
cd ~/tesseract-ocr/src
training/tesstrain.sh \
--fonts_dir /usr/share/fonts/ \
--lang eng \
--linedata_only \
--noextract_font_properties \
--exposures "0" \
--langdata_dir /home/shan/langdata_lstm \
--output_dir /home/shan/tesstutorial/engtrain \
--tessdata_dir /home/shan/tesseract-ocr/tessdata \
--fontlist "Arial"
sudo chmod -R 777 /home/shan/tesstutorial/engtrain
Then the LSTM model:
sudo chmod -R 777 /home/shan/tesstutorial/
cd ~/tesseract-ocr/src/
training/lstmtraining --stop_training \
--continue_from ~/tesstutorial/engoutput/base_checkpoint \
--traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
--model_output ~/tesstutorial/engoutput/eng.traineddata
sudo chmod -R 777 ~/tesstutorial
cd ~/tesseract-ocr/src/
training/lstmtraining --debug_interval 100 \
--traineddata ~/tesstutorial/engtrain/eng/eng.traineddata \
--net_spec '[1,36,0,1 Ct3,3,16 Mp3,3 Lfys48 Lfx96 Lrx96 Lfx256 O1c111]' \
--model_output ~/tesstutorial/engoutput/base --learning_rate 20e-4 \
--train_listfile ~/tesstutorial/engtrain/eng.training_files.txt \
--max_iterations 5000 &>~/tesstutorial/engoutput/basetrain.log

ERROR: No matching distribution found for jax[gpu]>=0.3.4 (from -r vit_jax/requirements.txt (line 8))

I am trying to build my first Dockerfile for Vision Transformer and ran into:
ERROR: Could not find a version that satisfies the requirement
jax[gpu]>=0.3.4 (from -r vit_jax/requirements.txt (line 8)) (from
versions: 0.0, 0.1, 0.1.1, 0.1.2, 0.1.3, 0.1.4, 0.1.5, 0.1.6, 0.1.7,
0.1.8, 0.1.9, 0.1.10, 0.1.11, 0.1.12, 0.1.13, 0.1.14, 0.1.15, 0.1.16, 0.1.18, 0.1.19, 0.1.20, 0.1.21, 0.1.22, 0.1.23, 0.1.24, 0.1.25, 0.1.26, 0.1.27, 0.1.28, 0.1.29, 0.1.30, 0.1.31, 0.1.32, 0.1.33, 0.1.34, 0.1.35, 0.1.36, 0.1.37, 0.1.38, 0.1.39, 0.1.40, 0.1.41, 0.1.42, 0.1.43, 0.1.44, 0.1.45, 0.1.46, 0.1.47, 0.1.48, 0.1.49, 0.1.50, 0.1.51, 0.1.52, 0.1.53, 0.1.54, 0.1.55, 0.1.56, 0.1.57, 0.1.58, 0.1.59, 0.1.60, 0.1.61, 0.1.62, 0.1.63, 0.1.64, 0.1.65, 0.1.66, 0.1.67, 0.1.68, 0.1.69, 0.1.70, 0.1.71, 0.1.72, 0.1.73, 0.1.74, 0.1.75, 0.1.76, 0.1.77, 0.2.0, 0.2.1, 0.2.2, 0.2.3, 0.2.4, 0.2.5, 0.2.6, 0.2.7, 0.2.8, 0.2.9, 0.2.10, 0.2.11, 0.2.12, 0.2.13, 0.2.14, 0.2.15, 0.2.16, 0.2.17)
ERROR: No matching distribution found for jax[gpu]>=0.3.4 (from -r vit_jax/requirements.txt (line 8))
I didn't find anyone else running ViT who ran into this problem, so I assume the flaw is in my Dockerfile rather than in requirements.txt. Below is my Dockerfile:
FROM pytorch/pytorch:1.2-cuda10.0-cudnn7-runtime
ENV DEBIAN_FRONTEND=noninteractive
ARG USERNAME=user
WORKDIR /dockertest
ARG WORKDIR=/dockertest
RUN apt-get update && apt-get install -y \
automake autoconf libpng-dev nano python3-pip \
sudo curl zip unzip libtool swig zlib1g-dev pkg-config \
python3-mock libpython3-dev libpython3-all-dev \
g++ gcc cmake make pciutils cpio gosu wget \
libgtk-3-dev libxtst-dev sudo apt-transport-https \
build-essential gnupg git xz-utils vim libgtk2.0-0 libcanberra-gtk-module \
libva-dev libdrm-dev xorg xorg-dev protobuf-compiler \
openbox libx11-dev libgl1-mesa-glx libgl1-mesa-dev \
libtbb2 libtbb-dev libopenblas-dev libopenmpi-dev \
&& sed -i 's/# set linenumbers/set linenumbers/g' /etc/nanorc \
&& apt clean \
&& rm -rf /var/lib/apt/lists/*
RUN git clone https://github.com/google-research/vision_transformer.git \
&& cd vision_transformer \
&& pip3 install pip --upgrade \
&& pip install -r vit_jax/requirements.txt \
&& python -m vit_jax.main --workdir=/tmp/vit-$(date +%s) \
--config=$(pwd)/vit_jax/configs/vit.py:b16,cifar10 \
--config.pretrained_dir='gs://vit_models/imagenet21k' \
&& pip cache purge
RUN echo "root:root" | chpasswd \
&& adduser --disabled-password --gecos "" "${USERNAME}" \
&& echo "${USERNAME}:${USERNAME}" | chpasswd \
&& echo "%${USERNAME} ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers.d/${USERNAME} \
&& chmod 0440 /etc/sudoers.d/${USERNAME}
USER ${USERNAME}
RUN sudo chown -R ${USERNAME}:${USERNAME} ${WORKDIR}
WORKDIR ${WORKDIR}
The issue is that you're using Python 3.6 (as specified in the Dockerfile), which is not supported by JAX 0.2.18 and newer (see the JAX changelog).
To fix the issue, you should upgrade Python to version 3.7 or newer. Python 3.6 has reached its end of life and is no longer receiving security updates.
Alternatively, if for some reason you must continue using Python 3.6, you should install jax version 0.2.17 and jaxlib version 0.1.69, which were the last releases to be compatible with Python 3.6.
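As a rough illustration of the version rule above, here is a minimal sketch that checks whether a given Python version meets the 3.7 minimum. The helper name python_ok_for_new_jax is hypothetical, and the usage line assumes python3 is on PATH inside the image:

```shell
# Hypothetical helper: succeed if a "major.minor[.patch]" Python version
# is at least 3.7, the minimum stated above for JAX 0.2.18 and newer.
python_ok_for_new_jax() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # minor version, with any patch level dropped
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 7 ]; }
}

# Possible usage inside the image (assumes python3 is on PATH):
# python_ok_for_new_jax "$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')" \
#     || echo "Python too old for jax>=0.2.18"
```

Running such a check early in the Dockerfile, before pip install, would make the build fail with a clearer message than pip's version listing.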

Purge parachain issue

I want to purge my parachain collator node, but I get this error:
Input("Error parsing spec file: missing field `relay_chain` at line 143 column 1")(cannot purge parachain)
This is the command I used to purge my parachain
./target/release/parachain-collator purge-chain --base-path /tmp/parachain/alice --chain rococo-custom.json
This is the command I used to run this parachain-collator
./target/release/parachain-collator \
--alice \
--collator \
--force-authoring \
--parachain-id 2000 \
--base-path /tmp/parachain/alice \
--port 40333 \
--ws-port 8844 \
-- \
--execution wasm \
--chain rococo-custom.json \
--port 30343 \
--ws-port 9977
Thank you so much for your help!
Run purge-chain with only the base path:
./XXX/parachain-collator purge-chain --base-path <your collator DB path set above>
There is no need to pass a chain spec argument.

Does anyone know where I can find the Dockerfile for aws/codebuild/nodejs:10.1.0?

I am looking for the Dockerfile for aws/codebuild/nodejs:10.1.0, which used to be available on GitHub but no longer is.
You can find the Dockerfile for aws/codebuild/nodejs:10.1.0 here, in the release "updated standard 2.0 image" (released by minethai on Jun 26, 15 commits to master since this release).
You can download the zip folder from here.
Changes:
Updated minor versions for node8, node10, powershell, and gradle 5
# Copyright 2017-2017 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Amazon Software License (the "License"). You may not use this file except in compliance with the License.
# A copy of the License is located at
#
# http://aws.amazon.com/asl/
#
# or in the "license" file accompanying this file.
# This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied.
# See the License for the specific language governing permissions and limitations under the License.
#
FROM ubuntu:14.04.5
ENV DOCKER_BUCKET="download.docker.com" \
DOCKER_VERSION="17.09.0-ce" \
DOCKER_CHANNEL="stable" \
DOCKER_SHA256="a9e90a73c3cdfbf238f148e1ec0eaff5eb181f92f35bdd938fd7dab18e1c4647" \
DIND_COMMIT="3b5fac462d21ca164b3778647420016315289034" \
DOCKER_COMPOSE_VERSION="1.21.2" \
GITVERSION_VERSION="3.6.5"
# Install git, SSH, and other utilities
RUN set -ex \
&& echo 'Acquire::CompressionTypes::Order:: "gz";' > /etc/apt/apt.conf.d/99use-gzip-compression \
&& apt-get update \
&& apt install -y apt-transport-https \
&& apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 3FA7E0328081BFF6A14DA29AA6A19B38D3D831EF \
&& echo "deb https://download.mono-project.com/repo/ubuntu stable-trusty main" | tee /etc/apt/sources.list.d/mono-official-stable.list \
&& apt-get update \
&& apt-get install software-properties-common -y --no-install-recommends \
&& apt-add-repository ppa:git-core/ppa \
&& apt-get update \
&& apt-get install git=1:2.* -y --no-install-recommends \
&& git version \
&& apt-get install -y --no-install-recommends openssh-client=1:6.6* \
&& mkdir ~/.ssh \
&& touch ~/.ssh/known_hosts \
&& ssh-keyscan -t rsa,dsa -H github.com >> ~/.ssh/known_hosts \
&& ssh-keyscan -t rsa,dsa -H bitbucket.org >> ~/.ssh/known_hosts \
&& chmod 600 ~/.ssh/known_hosts \
&& apt-get install -y --no-install-recommends \
wget=1.15-* python=2.7.* python2.7-dev=2.7.* fakeroot=1.20-* ca-certificates \
tar=1.27.* gzip=1.6-* zip=3.0-* autoconf=2.69-* automake=1:1.14.* \
bzip2=1.0.* file=1:5.14-* g++=4:4.8.* gcc=4:4.8.* imagemagick=8:6.7.* \
libbz2-dev=1.0.* libc6-dev=2.19-* libcurl4-openssl-dev=7.35.* libdb-dev=1:5.3.* \
libevent-dev=2.0.* libffi-dev=3.1~* libgeoip-dev=1.6.* libglib2.0-dev=2.40.* \
libjpeg-dev=8c-* libkrb5-dev=1.12+* liblzma-dev=5.1.* \
libmagickcore-dev=8:6.7.* libmagickwand-dev=8:6.7.* libmysqlclient-dev=5.5.* \
libncurses5-dev=5.9+* libpng12-dev=1.2.* libpq-dev=9.3.* libreadline-dev=6.3-* \
libsqlite3-dev=3.8.* libssl-dev=1.0.* libtool=2.4.* libwebp-dev=0.4.* \
libxml2-dev=2.9.* libxslt1-dev=1.1.* libyaml-dev=0.1.* make=3.81-* \
patch=2.7.* xz-utils=5.1.* zlib1g-dev=1:1.2.* unzip=6.0-* curl=7.35.* \
e2fsprogs=1.42.* iptables=1.4.* xfsprogs=3.1.* xz-utils=5.1.* \
mono-devel less=458-* groff=1.22.* liberror-perl=0.17-* \
asciidoc=8.6.* build-essential=11.* bzr=2.6.* cvs=2:1.12.* cvsps=2.1-* docbook-xml=4.5-* docbook-xsl=1.78.* dpkg-dev=1.17.* \
libdbd-sqlite3-perl=1.40-* libdbi-perl=1.630-* libdpkg-perl=1.17.* libhttp-date-perl=6.02-* \
libio-pty-perl=1:1.08-* libserf-1-1=1.3.* libsvn-perl=1.8.* libsvn1=1.8.* libtcl8.6=8.6.* libtimedate-perl=2.3000-* \
libunistring0=0.9.* libxml2-utils=2.9.* libyaml-perl=0.84-* python-bzrlib=2.6.* python-configobj=4.7.* \
sgml-base=1.26+* sgml-data=2.0.* subversion=1.8.* tcl=8.6.* tcl8.6=8.6.* xml-core=0.13+* xmlto=0.0.* xsltproc=1.1.* \
&& rm -rf /var/lib/apt/lists/* \
&& apt-get clean
# Download and set up GitVersion
RUN set -ex \
&& wget "https://github.com/GitTools/GitVersion/releases/download/v${GITVERSION_VERSION}/GitVersion_${GITVERSION_VERSION}.zip" -O /tmp/GitVersion_${GITVERSION_VERSION}.zip \
&& mkdir -p /usr/local/GitVersion_${GITVERSION_VERSION} \
&& unzip /tmp/GitVersion_${GITVERSION_VERSION}.zip -d /usr/local/GitVersion_${GITVERSION_VERSION} \
&& rm /tmp/GitVersion_${GITVERSION_VERSION}.zip \
&& echo "mono /usr/local/GitVersion_${GITVERSION_VERSION}/GitVersion.exe \$#" >> /usr/local/bin/gitversion \
&& chmod +x /usr/local/bin/gitversion
# Install Docker
RUN set -ex \
&& curl -fSL "https://${DOCKER_BUCKET}/linux/static/${DOCKER_CHANNEL}/x86_64/docker-${DOCKER_VERSION}.tgz" -o docker.tgz \
&& echo "${DOCKER_SHA256} *docker.tgz" | sha256sum -c - \
&& tar --extract --file docker.tgz --strip-components 1 --directory /usr/local/bin/ \
&& rm docker.tgz \
&& docker -v \
# set up subuid/subgid so that "--userns-remap=default" works out-of-the-box
&& addgroup dockremap \
&& useradd -g dockremap dockremap \
&& echo 'dockremap:165536:65536' >> /etc/subuid \
&& echo 'dockremap:165536:65536' >> /etc/subgid \
&& wget "https://raw.githubusercontent.com/docker/docker/${DIND_COMMIT}/hack/dind" -O /usr/local/bin/dind \
&& curl -L https://github.com/docker/compose/releases/download/${DOCKER_COMPOSE_VERSION}/docker-compose-Linux-x86_64 > /usr/local/bin/docker-compose \
&& chmod +x /usr/local/bin/dind /usr/local/bin/docker-compose \
# Ensure docker-compose works
&& docker-compose version
# Install dependencies by all python images equivalent to buildpack-deps:jessie
# on the public repos.
RUN set -ex \
&& wget "https://bootstrap.pypa.io/2.6/get-pip.py" -O /tmp/get-pip.py \
&& python /tmp/get-pip.py \
&& pip install awscli==1.* \
&& rm -fr /var/lib/apt/lists/* /tmp/* /var/tmp/*
VOLUME /var/lib/docker
COPY dockerd-entrypoint.sh /usr/local/bin/
ENV NODE_VERSION="10.1.0"
# gpg keys listed at https://github.com/nodejs/node#release-team
RUN set -ex \
&& for key in \
94AE36675C464D64BAFA68DD7434390BDBE9B9C5 \
B9AE9905FFD7803F25714661B63B535A4C206CA9 \
77984A986EBC2AA786BC0F66B01FBB92821C587A \
56730D5401028683275BD23C23EFEFE93C4CFFFE \
71DCFD284A79C3B38668286BC97EC7A07EDE3FC1 \
FD3A5288F042B6850C66B31F09FE44734EB7990E \
8FCCA13FEF1D0C2E91008E09770F7A9A5AE15600 \
C4F0DFFF4E8C1A8236409D08E73BC641CC11F4C8 \
DD8F2338BAE7501E3DD5AC78C273792F7D83545D \
9554F04D7259F04124DE6B476D5A82AC7E37093B \
93C7E9E91B49E432C2F75674B0A78B0A6C481CF6 \
114F43EE0176B71C7BC219DD50A3051F888C628D \
7937DFD2AB06298B2293C3187D33FF9D0246406D \
; do \
gpg --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys "$key" || \
gpg --keyserver hkp://ipv4.pool.sks-keyservers.net --recv-keys "$key" || \
gpg --keyserver hkp://pgp.mit.edu:80 --recv-keys "$key" ; \
done
RUN set -ex \
&& wget "https://nodejs.org/download/release/v$NODE_VERSION/node-v$NODE_VERSION-linux-x64.tar.gz" -O node-v$NODE_VERSION-linux-x64.tar.gz \
&& wget "https://nodejs.org/download/release/v$NODE_VERSION/SHASUMS256.txt.asc" -O SHASUMS256.txt.asc \
&& gpg --batch --decrypt --output SHASUMS256.txt SHASUMS256.txt.asc \
&& grep " node-v$NODE_VERSION-linux-x64.tar.gz\$" SHASUMS256.txt | sha256sum -c - \
&& tar -xzf "node-v$NODE_VERSION-linux-x64.tar.gz" -C /usr/local --strip-components=1 \
&& rm "node-v$NODE_VERSION-linux-x64.tar.gz" SHASUMS256.txt.asc SHASUMS256.txt \
&& ln -s /usr/local/bin/node /usr/local/bin/nodejs \
&& rm -fr /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN npm set unsafe-perm true
CMD [ "node" ]

CloudML job + verbosity == Error

I am running the dataeng-machine-learning codelab, step 9 (4. Feature Engineering).
The notebook step for running a training job is:
%%bash
OUTDIR=gs://${BUCKET}/taxifare/ch4/taxi_trained
JOBNAME=lab4a_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
--region=$REGION \
--module-name=trainer.task \
--package-path=${REPO}/courses/machine_learning/feateng/taxifare/trainer \
--job-dir=$OUTDIR \
--staging-bucket=gs://$BUCKET \
--scale-tier=BASIC \
--runtime-version=1.0 \
-- \
--train_data_paths="gs://$BUCKET/taxifare/ch4/taxi_preproc/train*" \
--eval_data_paths="gs://${BUCKET}/taxifare/ch4/taxi_preproc/valid*" \
--output_dir=$OUTDIR \
--num_epochs=100
That works great no matter how many times I run it.
However, if I run:
%%bash
OUTDIR=gs://${BUCKET}/taxifare/ch4/taxi_trained
JOBNAME=lab4a_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
--region=$REGION \
--module-name=trainer.task \
--package-path=${REPO}/courses/machine_learning/feateng/taxifare/trainer \
--job-dir=$OUTDIR \
--staging-bucket=gs://$BUCKET \
--scale-tier=BASIC \
--runtime-version=1.0 \
-- \
--train_data_paths="gs://$BUCKET/taxifare/ch4/taxi_preproc/train*" \
--eval_data_paths="gs://${BUCKET}/taxifare/ch4/taxi_preproc/valid*" \
--output_dir=$OUTDIR \
--num_epochs=100 \
--verbosity DEBUG
the job fails after about 40 seconds with this in the logs:
The replica master 0 exited with a non-zero status of 2. Termination reason: Error.
I found this usage here:
https://cloud.google.com/ml-engine/docs/how-tos/getting-started-training-prediction#cloud-train-single
So I guess it's OK to use.
What am I doing wrong?
Note that every argument after the "-- \" line is passed through to the TensorFlow code and therefore depends on the individual sample code.
In this case, the "--verbosity" flag isn't supported by the sample you are running. Looking at the samples repo, the only sample that has that flag appears to be the census estimator sample.
The taxifare example is currently hardcoded to INFO, and its code doesn't parse the --verbosity flag.
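To make the pass-through boundary concrete, here is a minimal sketch that prints only the arguments following the first bare "--". The helper name passthrough_args is hypothetical, and the region value is a placeholder rather than something taken from the question:

```shell
# Hypothetical helper: print, one per line, the arguments that follow the
# first bare "--". In the gcloud commands above, these are the ones handed
# to the trainer package rather than interpreted by gcloud itself.
passthrough_args() {
    seen_separator=0
    for arg in "$@"; do
        if [ "$seen_separator" -eq 1 ]; then
            printf '%s\n' "$arg"
        elif [ "$arg" = "--" ]; then
            seen_separator=1
        fi
    done
}

# Example with a shortened version of the failing command's arguments.
passthrough_args --region=us-central1 --runtime-version=1.0 -- \
    --num_epochs=100 --verbosity DEBUG
```

The example prints --num_epochs=100, --verbosity and DEBUG on separate lines, which shows that --verbosity DEBUG reaches the trainer code and has to be understood there.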

adding user to docker container and give them access to host folder

I am only able to give the first user read, write, and execute access to the export folder with the Dockerfile below:
...
VOLUME ["/export/"]
RUN groupadd galaxy \
&& chgrp -R galaxy /export \
&& chmod -R 770 /export
RUN useradd dudleyk \
&& mkdir /home/dudleyk \
&& chown dudleyk:dudleyk /home/dudleyk \
&& addgroup dudleyk galaxy \
&& ln -s /export/ /home/dudleyk/ \
&& echo "dudleyk:dudleyk" | chpasswd
RUN useradd lorencm \
&& mkdir /home/lorencm \
&& chown lorencm:lorencm /home/lorencm \
&& addgroup lorencm galaxy \
&& ln -s /export/ /home/lorencm/ \
&& echo "lorencm:lorencm" | chpasswd
EXPOSE 8787
CMD ["/init"]
I logged in to the Docker container with docker run -it -v /home/galaxy:/export rstudio bash, and it showed me the following:
ls -ahl
drwxr-xr-x 43 dudleyk galaxy 4.0K Apr 8 00:09 export
How do I give the second user read, write, and execute access to the export folder?
Thank you in advance.
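Reading the listing: drwxr-xr-x gives the owner (dudleyk) rwx but the galaxy group only r-x, so the directory seen inside the container does not carry the 770 mode set in the Dockerfile. A small sketch for checking the group write bit of an ls-style mode string; the helper name group_can_write is hypothetical:

```shell
# Hypothetical helper: succeed if an ls-style mode string grants write
# permission to the group, i.e. its sixth character is "w".
# Layout: 1 = file type, 2-4 = owner, 5-7 = group, 8-10 = others.
group_can_write() {
    case "$1" in
        ?????w*) return 0 ;;
        *) return 1 ;;
    esac
}

# Mode taken from the listing in the question.
if group_can_write "drwxr-xr-x"; then
    echo "group can write"
else
    echo "group cannot write"
fi
```

With the mode from the listing, the check reports that the group cannot write, which is consistent with only the owner having full access to the export folder.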