Does Tesseract OCR use neural networks as its default training mechanism? - c++

Sorry, this is probably a dumb question, but I am fairly new to machine learning and Tesseract OCR. I have heard that Tesseract OCR can be trained.
What I need to know is: does Tesseract OCR use neural networks as its default training mechanism, or do we have to program it explicitly to use neural networks?
Sorry if I'm thinking about this "training" concept the wrong way, but what I need to know exactly is whether Tesseract is already using NNs, and if not, how I can approach using NNs with Tesseract OCR to improve recognition accuracy.
If someone could suggest some good resources or approaches to get started, that would be a great help too.
What I currently know: basic supervised machine learning concepts and how to perform basic image OCR operations in Tesseract OCR.

It appears that Tesseract uses an Adaptive Classifier by default. Check this out for a good read:
https://github.com/tesseract-ocr/docs/blob/master/tesseracticdar2007.pdf
There appears to be an option called "Cube mode" where it will switch to using NNs for the learning system instead of the adaptive classifier (https://code.google.com/p/tesseract-ocr-extradocs/wiki/Cube). More info about adaptive classifiers:
http://www.cs.indiana.edu/~rawlins/website/adaptivity/information-helper.html
Also, related very closely is a Learning Classifier System:
http://en.wikipedia.org/wiki/Learning_classifier_system
Also, your terminology of "training" is very close. Training is how you teach the pattern recognition or learning system what responses it should give to certain input sets; it then uses similarities to classify unknown data when it encounters it. Machine learning is one of the coolest fields in existence, in my (probably biased) opinion. Keep up the learning! You are the meta-learner: learning how to teach a machine to learn. Cool stuff!

Yes. Starting with Tesseract 4.0, there is a new LSTM-based OCR engine: https://tesseract-ocr.github.io/tessdoc/NeuralNetsInTesseract4.00
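If you have Tesseract 4.0+ installed, you can select the engine explicitly with the --oem flag (1 = LSTM only, 0 = legacy only, 3 = default). A minimal sketch using the pytesseract wrapper, assuming pytesseract, Pillow, and an English traineddata file are installed, and with a hypothetical sample.png:

```python
import pytesseract
from PIL import Image

# --oem 1 selects the LSTM-only engine; --psm 6 assumes a single block of text.
config = "--oem 1 --psm 6"
text = pytesseract.image_to_string(Image.open("sample.png"), lang="eng", config=config)
print(text)
```

The same --oem and --psm flags can be passed to the tesseract command-line tool directly.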

Related

Specific topics on Tensorflow for CNN

I have a mini project for my new TensorFlow course this semester, with random topics. Since I have some background in Convolutional Neural Networks, I intend to use one for my project. My computer can only run the CPU version of TensorFlow.
However, as a newbie, I realize that there are a lot of topics, such as MNIST, CIFAR-10, etc., so I don't know which topic to pick. I only have two weeks left. It would be great if the topic were not too complicated, but also not too easy, since it should match my intermediate level.
In your experience, could you give me some advice about the specific topic I should choose for my project?
Moreover, it would be better if the topic let me provide my own data to test my training, because my professor said that is a plus point for getting an A grade on the project.
Thanks in advance,
I think that to answer this question you need to properly evaluate the marking criteria for your project. However, I can give you a brief overview of what you've just mentioned.
MNIST: MNIST is an Optical Character Recognition task for individual digits 0-9 in 28px-square images. It is considered the "Hello World" of CNNs. It's pretty basic and might be too simplistic for your requirements; that's hard to gauge without more information. Nonetheless, it runs quickly with CPU TensorFlow and the online tutorial is pretty good (a minimal sketch appears after this answer).
CIFAR-10: CIFAR is a much bigger dataset of objects and vehicles. The images are 32px square, so individual image processing isn't too bad, but the dataset is large and your CPU might struggle with it; it takes a long time to train. You could try training on a reduced dataset, but I don't know how that would go. Again, it depends on your course requirements.
Flowers/Poets: There is the TensorFlow for Poets re-training example. Even if that example itself is not suitable for your course, you could use the flowers dataset to build your own model.
Build your own model: You could use tf.layers to build your own network and experiment with it; tf.layers is pretty easy to use. Alternatively, you could look at the newer Estimators API, which automates a lot of the training process for you. There are a number of tutorials (of varying quality) on the TensorFlow website.
I hope that helps give you a run-down of what's out there. Other datasets to look at are PASCAL VOC and ImageNet (however, they are huge!). Models you might experiment with include VGG-16 and AlexNet.
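For reference, here is a minimal CNN on MNIST that trains in a few minutes on CPU. It uses the tf.keras API rather than the lower-level tf.layers mentioned above; the layer sizes and epoch count are arbitrary choices, not tuned values:

```python
import tensorflow as tf

# Load MNIST (28x28 grayscale digits) and scale pixels to [0, 1].
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

# A small CNN that is light enough for CPU-only training.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, batch_size=128,
          validation_data=(x_test, y_test))
```

Swapping MNIST for your own images mostly means changing the data-loading step and the input shape, which is one way to satisfy the "provide my own data" requirement.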

Machine learning for any cancer diagnosis on an image dataset with Python

I am working on this project assigned by my university as a final project, but the issue is that I am not getting any help from the internet, so I thought asking here might solve the issue. I have read many articles, but they had no code or guidance, and I am confused about what to do. Basically it is image-processing work with machine learning. The dataset can be found easily, but the issue is the Python learning algorithm and code.
I presume that since it's your final project you have to create the program yourself rather than ripping it straight from the internet. If you want a good starting point that you can customise, TensorFlow from Google is very good. You'll want to understand how it works (i.e. how machine learning works), but as a first step there's a good example of image processing on the website in the form of digit recognition (which is also the "Hello World" of machine learning).
https://www.tensorflow.org/get_started/mnist/beginners
This also provides a good intro to machine learning with neural nets: https://www.youtube.com/watch?v=uXt8qF2Zzfo
One note on TensorFlow: you'll probably have to use Python 3.5+, as in my experience it can be difficult to get it working on 2.7.
First of all, I need to know what type of data you are using, because depending on whether it is MRI, PET, or CT, there could be different suggestions for using machine learning in Python for detection.
However, assuming your main dataset consists of MR images, I am attaching a report which I found to be a great overview of different methods:
This project compares four different machine learning algorithms: Decision Tree, Majority, Nearest Neighbors, and Best Z-Score (an algorithm of my own design that is a slight variant of the Naïve Bayes algorithm)
https://users.soe.ucsc.edu/~karplus/abe/Science_Fair_2012_report.pdf
Here, breast cancer and colorectal cancer have been considered and the algorithms that performed best (Best Z-Score and Nearest Neighbors) used all features in classifying a sample. Decision Tree used only 13 features for classifying a sample and gave mediocre results. Majority did not look at any features and did worst. All algorithms except Decision Tree were fast to train and test. Decision Tree was slow, because it had to look at each feature in turn, calculating the information gain of every possible choice of cutpoint.
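If you want to try that kind of classifier comparison yourself, a minimal scikit-learn sketch on a tabular cancer dataset might look like the following. The dataset and classifier choices here are illustrative assumptions, and plain Gaussian Naive Bayes stands in for the report's custom Best Z-Score variant:

```python
from sklearn.datasets import load_breast_cancer   # stand-in tabular cancer dataset
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.dummy import DummyClassifier         # "Majority" baseline

X, y = load_breast_cancer(return_X_y=True)

classifiers = {
    "Majority baseline": DummyClassifier(strategy="most_frequent"),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
    "Gaussian Naive Bayes": GaussianNB(),  # stand-in for the Best Z-Score variant
}

# 5-fold cross-validated accuracy for each classifier.
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name:22s} accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```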
My solution:
The Lung Image Database Consortium provides an open-access dataset of lung cancer images.
Download it, then apply any machine learning algorithm to classify the images as containing tumour cells or not.
I attached a link to a reference paper in which they applied a neural network to classify the images.
For the coding part, use Python with OpenCV for image pre-processing and segmentation.
For the classification part, use whichever machine learning library you are comfortable working with (TensorFlow, Keras, Torch, scikit-learn, and many more) and apply whichever well-performing algorithm you prefer.
That's it.
Link for Reference Journal
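Putting those steps together, here is a minimal sketch of the idea: OpenCV for pre-processing and crude segmentation, and a scikit-learn SVM for classification. The directory layout (tumour/ and normal/ folders of PNG slices extracted from the dataset), the Otsu thresholding, and the SVM are all illustrative assumptions, not the method from the referenced paper:

```python
import glob
import cv2                       # OpenCV for pre-processing / segmentation
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def preprocess(path, size=(64, 64)):
    """Read a scan slice, smooth it, and apply a crude Otsu threshold segmentation."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, size)
    img = cv2.GaussianBlur(img, (5, 5), 0)
    _, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return (img * (mask // 255)).flatten() / 255.0   # masked pixels as a feature vector

# Hypothetical layout: tumour/*.png and normal/*.png extracted from the dataset.
tumour_paths = glob.glob("tumour/*.png")
normal_paths = glob.glob("normal/*.png")
X = np.array([preprocess(p) for p in tumour_paths + normal_paths])
y = np.array([1] * len(tumour_paths) + [0] * len(normal_paths))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf").fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```

From there you could swap the SVM for a CNN in Keras or TensorFlow if you want to follow the neural-network approach of the paper.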

OpenCV training output

So I am creating my own classifiers using the OpenCV Machine Learning module for age estimation. I can train my classifiers, but the training takes a long time, so I would like to see some output (classifier status, iterations done, etc.). Is this possible? I'm using ml::Boost, ml::LogisticRegression and ml::RTrees, all inheriting from cv::StatModel. Just to be clear, I'm not using the provided applications for recognizing objects in images (opencv_createsamples and opencv_traincascade). The documentation is very limited, so it's hard to find anything in it.
Thanks
Looks like there's an open feature request for a "progress bar" to provide some rudimentary feedback; see https://github.com/Itseez/opencv/issues/4881. Personally, I gave up on using OpenCV ML a while back. There are several high-quality tools available for building machine learning models; I've personally used Google's TensorFlow, but I've heard good things about Theano and Caffe as well.
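In the meantime, one coarse workaround is to retrain with an increasing iteration cap and report the error after each pass. This is wasteful, since each train() call starts from scratch, but it gives some feedback on how accuracy evolves with model size. A minimal sketch using the Python bindings of cv2.ml.RTrees with hypothetical toy data (the C++ cv::ml API offers the same calls):

```python
import numpy as np
import cv2

# Hypothetical toy data: 500 samples, 10 features, binary labels.
samples = np.random.rand(500, 10).astype(np.float32)
responses = np.random.randint(0, 2, (500, 1)).astype(np.int32)
data = cv2.ml.TrainData_create(samples, cv2.ml.ROW_SAMPLE, responses)

rtrees = cv2.ml.RTrees_create()
for max_trees in (10, 20, 50, 100):
    # Retrain from scratch with a growing tree count and report the training error.
    rtrees.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER, max_trees, 1e-6))
    rtrees.train(data)
    err, _ = rtrees.calcError(data, False)   # error on the training samples, in %
    print(f"{max_trees} trees -> training error {err:.2f}%")
```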

Open Source implementation of oriented Basic Image Features for Computer Vision?

Does anyone know if there is an open-source implementation of oriented Basic Image Features (oBIF)? There's talk about it at http://www.cs.ucl.ac.uk/staff/m.lillholm/or.html and http://blog.kaggle.com/2011/05/04/andrew-newell-and-lewis-griffin-on-winning-the-icdar-2011-competition/. I looked through OpenCV a bit, but I didn't see anything that looked like this. But maybe I just don't understand all the terminology?
If not, I'd like to try to understand how hard it would be to implement. But I'm new to machine vision, and I'm having trouble understanding the image-specific jargon at the links above ('saddle-like', '2nd-order Gaussian filter bank', etc. I do know what a Gaussian is, just not a 2nd-order filter bank of Gaussians). What's a good reference for learning these machine vision concepts?
Full disclosure: I would like to use them to compete in this year's Kaggle competition (http://www.kaggle.com/c/awic2012) for Arabic Handwriting Recognition. I have more of a machine learning background, but I'm interested in using the features that won last year's competition.

Getting the amplitude (or RMS voltage) of an audio signal captured in C++ with the waveIn lib?

I am working on a very basic robotics project and wish to implement voice recognition in it.
I know it's a complex thing, but I wish to do it for only 3 or 4 commands (or words).
I know that using waveIn I can record audio, but I wish to do real-time amplitude analysis on the audio signal. How can that be done? The wave will be input as 8-bit, mono.
I have thought of dividing the signal into sets of some specific duration, further dividing each into smaller subsets, getting the average RMS value over each subset, summing them up, and then seeing how different they are from the stored signal. If the error is below an accepted value for all (or most) of the sets, then print the word.
How can this be implemented?
If you can provide any other suggestion, that would also be great.
Thanks in advance.
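To illustrate the windowed RMS idea described in the question, here is a minimal sketch in Python/NumPy (the C++ version over the waveIn buffer is analogous); the 8-bit unsigned sample format, the frame length, and the tolerance are illustrative assumptions:

```python
import numpy as np

def window_rms(samples_u8, frame_len=160):
    """Per-frame RMS of an 8-bit unsigned mono signal (160 samples ~ 10 ms at 16 kHz)."""
    x = samples_u8.astype(np.float32) - 128.0            # centre 8-bit samples around 0
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt((frames ** 2).mean(axis=1))

def matches_template(template_rms, signal_rms, tolerance=5.0):
    """Crude comparison of two RMS envelopes, as proposed in the question."""
    n = min(len(template_rms), len(signal_rms))
    return np.abs(template_rms[:n] - signal_rms[:n]).mean() < tolerance

# Example with synthetic data standing in for captured audio.
rng = np.random.default_rng(0)
recording = rng.integers(96, 160, size=16000, dtype=np.uint8)   # 1 s of fake 8-bit audio
print(window_rms(recording)[:5])
```

As the answer below explains, a pure RMS-envelope comparison is usually too crude for word recognition, which is why MFCC features are preferred.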
There is no simple way to recognize words, because they are basically a sequence of phonemes which can vary in time and frequency.
Classical isolated-word recognition systems use MFCCs (mel-frequency cepstral coefficients) of the signal as input data, and try to recognize patterns using HMMs (hidden Markov models) or DTW (dynamic time warping) algorithms.
You will also need a silence detection module if you don't want a record button.
For instance, the Edinburgh University toolkit provides some of these tools (with good documentation).
If you don't want to build it "from scratch", or want a source of inspiration, here is an (old but free) implementation of such a system (which uses its own toolkit) with a full explanation and practical examples of how it works.
That system is an LVCSR (Large-Vocabulary Continuous Speech Recognition) system, and you only need a subset of it. If anyone knows of an open-source reduced-vocabulary system (like a simple IVR), it would be welcome.
If you want to make a basic system of your own, I recommend using MFCCs and DTW (a minimal sketch follows this recipe):
For each target word to model:
record some instances of the word
compute some delta-MFCCs (e.g. every 10 ms) across the word to build a model
When you want to recognize a signal:
compute the delta-MFCCs of this signal
use DTW to compare these delta-MFCCs to each modelled word's delta-MFCCs
output the word that fits best (use a threshold to drop garbage)
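Here is a minimal sketch of that recipe in Python, using librosa for the MFCC/delta computation and a plain DTW implementation. The file names, the rejection threshold, and the frame parameters are illustrative assumptions:

```python
import numpy as np
import librosa  # assumption: librosa is available for MFCC extraction

def mfcc_features(path, sr=16000, n_mfcc=13):
    """Load a recording and compute MFCC + delta-MFCC frames (~10 ms per frame)."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, hop_length=160)
    delta = librosa.feature.delta(mfcc)
    return np.vstack([mfcc, delta]).T          # shape: (frames, 2 * n_mfcc)

def dtw_distance(a, b):
    """Plain dynamic-time-warping distance between two feature sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m] / (n + m)                # length-normalised

# Hypothetical templates: one or more recordings per command word.
templates = {
    "start": [mfcc_features("start_1.wav"), mfcc_features("start_2.wav")],
    "stop":  [mfcc_features("stop_1.wav")],
}

def recognise(path, reject_threshold=50.0):
    query = mfcc_features(path)
    scores = {w: min(dtw_distance(query, t) for t in temps)
              for w, temps in templates.items()}
    best = min(scores, key=scores.get)
    return best if scores[best] < reject_threshold else None   # drop garbage

print(recognise("unknown.wav"))
```

For 3 or 4 commands this template-matching approach is usually enough; the rejection threshold has to be tuned on your own recordings.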
If you just want to recognize a few commands, there are many commercial and free products you can use. See "Need text to speech and speech recognition tools for Linux", "What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?", or "Speech Recognition on iPhone". The answers to those questions link to many available products and tools. Speech recognition and understanding of a list of commands is a very common problem solved commercially; many of the voice-automated phone systems you call use this type of technology, and the same technology is available to developers.
From watching these questions for a few months, I've seen most developer choices break down like this:
Windows folks - use the System.Speech features of .NET, or Microsoft.Speech, and install the free recognizers Microsoft provides. Windows 7 includes a full speech engine; others are downloadable for free. There is a C++ API to the same engines, known as SAPI. See http://msdn.microsoft.com/en-us/magazine/cc163663.aspx or http://msdn.microsoft.com/en-us/library/ms723627(v=vs.85).aspx
Linux folks - Sphinx seems to have a good following. See http://cmusphinx.sourceforge.net/ and http://cmusphinx.sourceforge.net/wiki/
Commercial products - Nuance, Loquendo, AT&T, others
Online services - Nuance, Yapme, others
Of course this may also be helpful - http://en.wikipedia.org/wiki/List_of_speech_recognition_software