Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I have already built a deep neural network image classifier in Matlab (it gives one output per example, e.g. whether the image is a car or not), using gradient descent and backpropagation. It is a simple feed-forward network with 1 or 2 hidden layers. I use the trained weights in nvcc-compiled C++ code for real-time object detection.
Training gives quite good accuracy (more than 99.9%, which is still not enough), and the code can process more than 100,000 image files of size 32x32. The only problem with the Matlab code is that every training run ends up in a local minimum, so it needs many separate runs, and each run is quite slow.
Besides my slow Matlab training code, I have tried:
1) OpenCV 3.0.0: it "probably" has a bug in the virtual float cv::ml::StatModel::predict function at the moment, so I wasn't able to use it properly.
2) OpenNN with its GUI, but it gets stuck even during loading and training. I'm still working on fixing that.
3) FANN, but I could find only "one" tutorial written in C++. It may take me quite a while to master it without examples.
4) Theano in Python, a few months ago. It was quite customizable and has many tutorials, but I never tried training on image files with it.
5) I could also port my Matlab code to nvcc C++ and try the conjugate gradient method to speed it up further. I haven't tried this yet; it is my last choice.
Mastering any of these paths may take a long time, and I have a lot of other work to do as well. Which path should I take, or do you have another suggestion? Thank you.
If you have experience with Matlab, the easiest path is to go through the "VGG Convolutional Neural Networks Practical" and use their open source MatConvNet toolbox for Matlab: http://www.vlfeat.org/matconvnet/.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have some questions about the Tesseract OCR confidence value, which can be obtained by calling the AllWordConfidences() function in the C++ API.
What is the confidence value (returned by the Tesseract API), and how does Tesseract calculate it (i.e. based on what factors)?
Is there any way I can change the accuracy levels of Tesseract?
Can anyone help me with these questions? Thank you.
I've used similar metrics in other OCR software (specifically in ANPR software). If I recall correctly there are two confidence factors overall; one is a 0->100% confidence factor and the other is a 0->X value that is used as an aggregation of the various cascading confidence factors.
This value is arbitrary and so I'd recommend using the 0->100% value. Also note that each character should have a confidence factor.
These metrics are calculated by evaluating how clear the contour lines/edges are, how close the shapes detected in characters are to the expected shapes, and how close the decision between one character and another is. I.e. the OCR has an easier time choosing between 'p' and 'b' than between 'Q' and 'O'.
The only way to 'improve' these metrics is to train the detector! So prepare to have lots of valid data. You will also need patience using the Tesseract training tools - I found them to be 75% nightmarish.
Good luck!
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I want to implement a "simple" video editor and since I'm new to the topic, I'm not sure how to start.
The editor should have the following features / components
A timeline for multiple recordings
A video player that plays the edited video in real-time (it should render all added effects and assets)
Assets that can be placed on the timeline such as text elements, arrows and so on
I'd like to start with the video player and then build the other components around it.
Which frameworks would you recommend?
For the player, I'm not sure if DirectShow is the right choice or MediaFoundation would be better. Are there other libraries to consider? FFmpeg?
My recommendation given your interests is to start with Blender
http://www.blender.org
It's written in a combination of C, C++, and Python, has a substantial user community, and has the advantage of open source code so you can see how a real large project looks.
You might end up just contributing to it, or you might lift bits of it to bootstrap your own project, etc. But if you don't know about it, it's worth a look, if only to help you refine what you want to work on.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 1 year ago.
For my current project in C++ / Qt I need a library (LGPL is preferred) which can calculate a spectrogram from a signal ( basically an array of doubles ). I already use Qwt for the GUI part.
Any suggestions?
Thanks.
It would be fairly easy to put together your own spectrogram. The steps are:
1) window function (fairly trivial, e.g. Hanning)
2) FFT (FFTW would be a good choice, but if licensing is an issue then go for Kiss FFT or similar)
3) calculate log magnitude of frequency domain components (trivial: log(sqrt(re * re + im * im)))
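The three steps above can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: to stay self-contained it uses a naive O(N^2) DFT in place of a real FFT library such as FFTW or Kiss FFT, and the function names (hann, dft, spectrogram), the frame size and the hop size are hypothetical choices, not part of any library. A small constant is added before taking the logarithm to avoid log(0).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Step 1: Hann window coefficient for sample n of an N-point frame.
inline double hann(std::size_t n, std::size_t N) {
    const double pi = std::acos(-1.0);
    return 0.5 * (1.0 - std::cos(2.0 * pi * double(n) / double(N - 1)));
}

// Step 2: naive DFT of one real frame, returning bins 0..N/2.
// Assumption: this stands in for an FFT library call; it is slow but simple.
std::vector<std::complex<double>> dft(const std::vector<double>& frame) {
    const double pi = std::acos(-1.0);
    const std::size_t N = frame.size();
    std::vector<std::complex<double>> bins(N / 2 + 1);
    for (std::size_t k = 0; k < bins.size(); ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n) {
            const double angle = -2.0 * pi * double(k) * double(n) / double(N);
            sum += frame[n] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        bins[k] = sum;
    }
    return bins;
}

// Steps 1-3 for every frame: one row of log magnitudes per frame.
std::vector<std::vector<double>> spectrogram(const std::vector<double>& signal,
                                             std::size_t frameSize,
                                             std::size_t hop) {
    assert(frameSize > 1 && hop > 0);
    std::vector<std::vector<double>> rows;
    for (std::size_t start = 0; start + frameSize <= signal.size(); start += hop) {
        std::vector<double> frame(frameSize);
        for (std::size_t n = 0; n < frameSize; ++n)
            frame[n] = signal[start + n] * hann(n, frameSize);

        const std::vector<std::complex<double>> bins = dft(frame);

        std::vector<double> row(bins.size());
        for (std::size_t k = 0; k < bins.size(); ++k) {
            const double re = bins[k].real();
            const double im = bins[k].imag();
            // Step 3: log magnitude; the epsilon guards against log(0).
            row[k] = std::log(std::sqrt(re * re + im * im) + 1e-12);
        }
        rows.push_back(row);
    }
    return rows;
}
```

Each returned row corresponds to one time slice, so plotting the rows side by side (for example with Qwt) gives the time-versus-frequency image.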
"How do I create a frequency vs time plot?" lists several libraries, each of which can calculate a spectrogram from a signal.
Copied and pasted from my own answer:
Some source code to generate spectrograms / waterfall plots from audio data:
SoX - Sound eXchange includes spectrogram source code
Audacity includes spectrogram source code
glfer includes waterfall spectrum display spectrum source code
source code that uses fftw to compute the spectrogram of an audio stream
more source code that uses OpenAL and fftw to compute the spectrogram for an audio stream
"Sound Activated Recorder with Spectrogram in C#" by Jeff Morton
Topographica seems to include spectrogram source code
SpectroGraph for iTunes
Image to Spectrogram goes in the reverse direction from the above utilities.
You could use FFTW (fftw.org) to calculate the spectrogram. You would still need to plot the data, but that should not be a problem.
You can use FFT code from here. It uses C++ template metaprogramming for efficiency. The full source is provided by the author here.
It was suggested to include this code into Eigen for its use of templated (type friendly) code.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
I am trying to do some Face Recognition (not detection) stuff using OpenCV. I found this article with some code:
http://www.cognotics.com/opencv/servo_2007_series/index.html
However, this code is written using the older C-style OpenCV API. Does someone have a C++ API version of this using a more recent version like OpenCV 2.3.1?
Update: OpenCV 2.4.2 now comes with the very new cv::FaceRecognizer. Please see the very detailed documentation at:
http://docs.opencv.org/trunk/modules/contrib/doc/facerec/index.html
I have released libfacerec, a modern face recognition library for the OpenCV C++ API (BSD license). libfacerec has no additional dependencies and implements the Eigenfaces method, Fisherfaces method and Local Binary Patterns Histograms. Parts of the library are going to be included in OpenCV 2.4.
The latest revision of the libfacerec is available at:
https://github.com/bytefish/libfacerec
The library was written for OpenCV 2.3.1 with the upcoming OpenCV 2.4 in mind, so I don't support OpenCV versions earlier than 2.3.1. It comes as a CMake project with a well-documented API, and there's also a tutorial on gender classification. You can see an HTML version of the documentation at:
http://www.bytefish.de/dev/libfacerec/
If you want to understand how those algorithms work, you might want to read my Guide To Face Recognition (includes Python and GNU Octave/MATLAB examples):
http://www.bytefish.de/blog/face_recognition_with_opencv2
The relevant publications are:
Turk, M., and Pentland, A. Eigenfaces for recognition. Journal of Cognitive Neuroscience 3 (1991), 71–86.
Belhumeur, P. N., Hespanha, J., and Kriegman, D. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence 19, 7 (1997), 711–720.
Ahonen, T., Hadid, A., and Pietikainen, M. Face Recognition with Local Binary Patterns. Computer Vision - ECCV 2004 (2004), 469–481.
I'm doing a face recognition project for my engineering degree, using the C++ API. I think that everything regarding face recognition in C++ is fairly straightforward, even simpler than in C (fewer pointers). To use PCA, there is a class named PCA, described here. Just use the proper methods and read the documentation carefully. To build the input data matrix, I created a matrix of the proper size and then pasted the pictures into it as rows (using the reshape method; cv::Mat has a method that lets you easily get a row of a matrix). You just need to make sure that the base data and the tested data have the same parameters (channels, size, etc.).
EDIT:
using namespace cv; // somewhere near the top
Inserting data into the data matrix:
Mat reshaped = img.reshape(1, 1);
Mat dataRow = _data.row(y++);
resize(reshaped, dataRow, dataRow.size(), 0, 0, CV_INTER_LINEAR);
Computing PCA:
_pca(_data, Mat(), CV_PCA_DATA_AS_ROW); // compute PCA
_pca.project(_data, _vectors); // project original data to new coordinates
As OpenCV's documentation isn't the best out there, it doesn't hurt to spend some time reading it. Most of the C API functions have equivalents in the C++ API; you only need to do some "type into the search box and hit enter" searching. There are also C++ tutorials to help you get a grip on the C++ API.
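The "one picture per row" layout described in this answer can be illustrated without OpenCV. The sketch below is plain C++ standing in for cv::Mat; the Image alias and the functions flatten, buildDataMatrix and centerRows are hypothetical names introduced only for illustration. It covers only the data preparation (flattening, stacking, and subtracting the mean image), not the eigen-decomposition that the PCA class performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a single-channel image: rows of pixel values.
using Image = std::vector<std::vector<double>>;

// Turn a 2-D image into a single row vector (what reshape(1, 1) does for a Mat).
std::vector<double> flatten(const Image& img) {
    std::vector<double> row;
    for (const std::vector<double>& line : img)
        row.insert(row.end(), line.begin(), line.end());
    return row;
}

// Stack every flattened image as one row of the data matrix.
std::vector<std::vector<double>> buildDataMatrix(const std::vector<Image>& images) {
    std::vector<std::vector<double>> data;
    for (const Image& img : images) {
        std::vector<double> row = flatten(img);
        // All images must have the same number of pixels.
        assert(data.empty() || row.size() == data.front().size());
        data.push_back(row);
    }
    return data;
}

// Subtract the mean row from every row in place and return the mean.
std::vector<double> centerRows(std::vector<std::vector<double>>& data) {
    assert(!data.empty());
    std::vector<double> mean(data.front().size(), 0.0);
    for (const std::vector<double>& row : data)
        for (std::size_t j = 0; j < mean.size(); ++j)
            mean[j] += row[j];
    for (double& m : mean)
        m /= double(data.size());
    for (std::vector<double>& row : data)
        for (std::size_t j = 0; j < mean.size(); ++j)
            row[j] -= mean[j];
    return mean;
}
```

The assertion in buildDataMatrix reflects the point made above: training and test images need the same size and channel count, otherwise the rows do not line up.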
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 9 years ago.
I'm an Electronics and Telecommunications student, close to graduation. I'm going to work on a project that involves my knowledge of DSP, music and audio in general. I already know all the basic mathematical tools I need for it, such as the FFT, circular convolution, etc.
I want to learn C++ programming basically for one reason: it's very important in the professional world! I also think it's one of the most used languages for writing audio applications, especially when it comes to real-time processing.
OK, after this short introduction, I would first like to know which libraries are most used for audio processing in C++. I have been searching the web for a long time but couldn't find a lot of working material. (I work on Linux with the Eclipse CDT environment.)
Then I would like to know if there are good sources for learning how to write working code, for example a simple low-pass filter. For now I will not write real-time applications; I would like to start by processing a WAV file, or even better an MP3 file, so basically vectors of samples.
Let's say that for now I would like to extract the waveform from an audio file and save it as a thumbnail or a PNG image.
OK, I think that's all I need for now.
Any ideas, advice, libraries, books, or interesting sources on this?
Thanks a lot in advance for any kind of answer.
Giovanni.
I would suggest that you write your own WAVE file reader and writer in C++, without relying on external libraries. The WAVE format is fairly straightforward, at least if you only intend to support the most common wave files.
Then you'll have access to the audio data, which you can easily manipulate in C++. I would recommend starting with tasks ranging from modifying the volume or the number of channels to calculating statistics on the audio. Creating a PNG of the audio waveform requires some more advanced C++ skills...
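As a rough sketch of what such a reader and writer might look like, the code below handles only the most common case: uncompressed 16-bit PCM in a RIFF/WAVE container. It works on in-memory byte buffers so it stays self-contained; loading the bytes from disk (for example with std::ifstream) is left out. The names WavData, encodeWav, decodeWav and scaleVolume are hypothetical, and other formats (8-bit, float, extensible headers) are deliberately rejected or ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

struct WavData {
    std::uint16_t channels = 0;
    std::uint32_t sampleRate = 0;
    std::vector<std::int16_t> samples;  // interleaved 16-bit PCM
};

// WAVE stores all numbers in little-endian byte order.
static std::uint32_t readLE(const std::vector<std::uint8_t>& b, std::size_t pos, int bytes) {
    std::uint32_t v = 0;
    for (int i = 0; i < bytes; ++i)
        v |= std::uint32_t(b.at(pos + i)) << (8 * i);  // at() throws if truncated
    return v;
}

static void writeLE(std::vector<std::uint8_t>& b, std::uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i)
        b.push_back(std::uint8_t((v >> (8 * i)) & 0xFF));
}

static void writeTag(std::vector<std::uint8_t>& b, const char* tag) {
    for (int i = 0; i < 4; ++i) b.push_back(std::uint8_t(tag[i]));
}

// Writer: canonical 44-byte header followed by the sample data.
std::vector<std::uint8_t> encodeWav(const WavData& w) {
    assert(w.channels > 0);
    std::vector<std::uint8_t> b;
    const std::uint32_t dataBytes = std::uint32_t(w.samples.size() * 2);
    writeTag(b, "RIFF"); writeLE(b, 36 + dataBytes, 4); writeTag(b, "WAVE");
    writeTag(b, "fmt "); writeLE(b, 16, 4);
    writeLE(b, 1, 2);                               // format 1 = PCM
    writeLE(b, w.channels, 2);
    writeLE(b, w.sampleRate, 4);
    writeLE(b, w.sampleRate * w.channels * 2u, 4);  // byte rate
    writeLE(b, w.channels * 2u, 2);                 // block align
    writeLE(b, 16, 2);                              // bits per sample
    writeTag(b, "data"); writeLE(b, dataBytes, 4);
    for (std::int16_t s : w.samples) writeLE(b, std::uint16_t(s), 2);
    return b;
}

// Reader: walks the chunk list, skipping chunks it does not know.
WavData decodeWav(const std::vector<std::uint8_t>& b) {
    auto tagAt = [&](std::size_t pos, const char* tag) {
        return b.size() >= pos + 4 && std::string(b.begin() + pos, b.begin() + pos + 4) == tag;
    };
    if (!tagAt(0, "RIFF") || !tagAt(8, "WAVE"))
        throw std::runtime_error("not a RIFF/WAVE file");
    WavData w;
    bool haveFmt = false;
    std::size_t pos = 12;
    while (pos + 8 <= b.size()) {
        const std::uint32_t size = readLE(b, pos + 4, 4);
        const std::size_t body = pos + 8;
        if (tagAt(pos, "fmt ")) {
            if (readLE(b, body, 2) != 1 || readLE(b, body + 14, 2) != 16)
                throw std::runtime_error("only 16-bit PCM is supported");
            w.channels = static_cast<std::uint16_t>(readLE(b, body + 2, 2));
            w.sampleRate = readLE(b, body + 4, 4);
            haveFmt = true;
        } else if (tagAt(pos, "data")) {
            if (!haveFmt) throw std::runtime_error("data chunk before fmt chunk");
            for (std::size_t i = 0; i + 1 < size && body + i + 1 < b.size(); i += 2)
                w.samples.push_back(std::int16_t(std::uint16_t(readLE(b, body + i, 2))));
            return w;
        }
        pos = body + size + (size & 1);  // chunks are padded to an even size
    }
    throw std::runtime_error("no data chunk found");
}

// Example manipulation: change the volume, clipping to the 16-bit range.
void scaleVolume(WavData& w, double gain) {
    for (std::int16_t& s : w.samples) {
        double v = s * gain;
        if (v > 32767.0) v = 32767.0;
        if (v < -32768.0) v = -32768.0;
        s = static_cast<std::int16_t>(v);
    }
}
```

Once the samples are in a std::vector, the other exercises mentioned above (changing the channel count, computing statistics) become ordinary loops over that vector.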
Check out this link, which gives you some information on the available (commercial and open source) audio editing software.
Some interesting open source audio editing tools written in C++:
Audacity
LMMS
Qtractor
Ardour
Rosegarden
C++ libraries for audio processing:
SndObj
The Synthesis ToolKit in C++
C++ code and links related to filters and audio processing:
C++ code for Filter,Audio Processing
Code Guru,Low pass filter
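For a first idea of what a simple low-pass filter looks like in code, here is a minimal sketch of a one-pole (RC-style) filter applied to a vector of samples. It is not taken from the linked pages; the function name lowPass and its parameters are assumptions made for illustration, and real audio work would normally use higher-order designs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]),
// with alpha derived from the cutoff frequency and the sample rate.
std::vector<double> lowPass(const std::vector<double>& x,
                            double cutoffHz,
                            double sampleRateHz) {
    assert(cutoffHz > 0.0 && sampleRateHz > 0.0);
    const double pi = std::acos(-1.0);
    const double dt = 1.0 / sampleRateHz;           // time between samples
    const double rc = 1.0 / (2.0 * pi * cutoffHz);  // equivalent RC time constant
    const double alpha = dt / (rc + dt);            // smoothing factor in (0, 1)

    std::vector<double> y(x.size());
    double prev = 0.0;  // filter state, starts at silence
    for (std::size_t n = 0; n < x.size(); ++n) {
        prev += alpha * (x[n] - prev);
        y[n] = prev;
    }
    return y;
}
```

A constant input passes through almost unchanged once the filter settles, while a signal alternating between +1 and -1 at every sample (the highest representable frequency) is strongly attenuated.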
I've used BASS with good results (there's a C/C++ API you can use).