OpenCV Neural Network train one iteration at a time - c++

The only way I know to train a multilayer neural network in OpenCV is:
CvANN_MLP network;
....
network.train(input, output, Mat(), Mat(), params, flags);
But this does not print any meaningful debug output (e.g. iteration count, current error, ...). The program just sits there until training finishes, which is very troublesome when the dataset is gigabytes in size: there is no way to see the progress.
How do I train the network one iteration at a time, or print out some debug while training?

Problem not solved, but question solved. Answer: It's impossible as far as the current OpenCV versions are concerned.

Are you setting the UPDATE_WEIGHTS flag?
You can test the error yourself by having the ANN predict the result vector for each sample in the training set.

According to http://opencv.willowgarage.com/documentation/cpp/ml_neural_networks.html#cvann-mlp-train
the params parameter is of type CvANN_MLP_TrainParams. This class contains a termination criteria member (term_crit) which controls when the training function terminates. The termination criteria class (http://opencv.willowgarage.com/documentation/cpp/basic_structures.html) can be set to terminate after a given number of iterations, when a given epsilon condition is fulfilled, or on some combination of both. I have not used the training function myself, so I can't be sure of the exact code you'd use to make this work, but something like this should limit the number of training cycles:
CvANN_MLP_TrainParams params;
params.term_crit.type = CV_TERMCRIT_ITER; // terminate on the number of iterations (CV_TERMCRIT_ITER == 1)
params.term_crit.max_iter = 1; // terminate after one iteration (CvTermCriteria names this field max_iter, not maxCount)
network.train(input, output, Mat(), Mat(), params, flags);
Like I said, I haven't worked with OpenCV, but having read the documentation, something like this should work.
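The control flow suggested in these answers can be sketched without OpenCV. The following is a minimal illustration in Python, not the CvANN_MLP API: `TinyLinearModel`, its `train(..., max_iter, update_weights)` signature, the learning rate and the data are all hypothetical stand-ins. The `update_weights` argument plays the role of the UPDATE_WEIGHTS flag (continue from the current weights instead of starting over), `max_iter=1` plays the role of the one-iteration termination criterion, and the error is measured between calls by predicting on the training set.

```python
class TinyLinearModel:
    """Hypothetical stand-in for a network with an iteration-limited train()."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def predict(self, x):
        return self.w * x + self.b

    def train(self, xs, ys, max_iter, update_weights, lr=0.1):
        # Without update_weights, training starts from scratch on every call.
        if not update_weights:
            self.w, self.b = 0.0, 0.0
        n = len(xs)
        for _ in range(max_iter):
            grad_w = sum((self.predict(x) - y) * x for x, y in zip(xs, ys)) / n
            grad_b = sum((self.predict(x) - y) for x, y in zip(xs, ys)) / n
            self.w -= lr * grad_w
            self.b -= lr * grad_b


def mean_squared_error(model, xs, ys):
    # "Test the error yourself": predict every training sample and compare.
    return sum((model.predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1

model = TinyLinearModel()
errors = []
for iteration in range(200):
    # One iteration per call; keep the weights on every call after the first.
    model.train(xs, ys, max_iter=1, update_weights=(iteration > 0))
    errors.append(mean_squared_error(model, xs, ys))
    if iteration % 50 == 0:
        print(f"iteration {iteration}: mse = {errors[-1]:.4f}")
```

Whether repeated one-iteration calls behave exactly like one long call in OpenCV depends on the training method's internal state, so treat this only as the shape of the loop.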

Your answer lies in the source code. If you want to get some output after every x epochs, put something in the source code, in this loop:
https://github.com/opencv/opencv/blob/9787ab598b6609a6ca6652a12441d741cb15f695/modules/ml/src/ann_mlp.cpp#L941
When they made OpenCV they had to find a balance between user customizability and how easy it is to use/read. Ultimately you have the power to do whatever you want when editing the source code.

Related

userWarning pymc3 : What does reparameterize mean?

I built a pymc3 model using the DensityDist distribution. I have four parameters, three of which use Metropolis and one of which uses NUTS (this is chosen automatically by pymc3). However, I get two different UserWarnings:
1. Chain 0 contains number of diverging samples after tuning. If increasing target_accept does not help try to reparameterize.
May I know what reparameterize means here?
2. The acceptance probability in chain 0 does not match the target. It is , but should be close to 0.8. Try to increase the number of tuning steps.
Digging through a few examples, I used 'random_seed', 'discard_tuned_samples', 'step = pm.NUTS(target_accept=0.95)' and so on, and got rid of these user warnings. But I couldn't find details of how these parameter values should be chosen. I am sure this has been discussed in various contexts, but I am unable to find solid documentation for it. I was using trial and error, as below.
with patten_study:
    # SEED = 61290425  # 51290425
    step = pm.NUTS(target_accept=0.95)
    trace = sample(step=step)  # 4000, tune=10000, step=step, discard_tuned_samples=False)  # , random_seed=SEED)
I need to run this on different datasets, so I am struggling to fix these parameter values for each dataset I am using. Is there a way to set these values, check the outcome (whether there are any user warnings), and then try other values, all in a loop?
Pardon me if I am asking something stupid!
In this context, reparameterization basically means finding a different but equivalent model that is easier to compute. There are many things you can do, depending on the details of your model:
Instead of using a Uniform distribution, you can use a Normal distribution with a large variance.
Changing from a centered hierarchical model to a non-centered one.
Replacing a Gaussian with a Student-T.
Modeling a discrete variable as a continuous one.
Marginalizing variables, like in this example.
Whether these changes make sense or not is something that you should decide, based on your knowledge of the model and problem.
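To make the centered versus non-centered item concrete, here is a minimal sketch using only the Python standard library rather than pymc3. The values of `mu`, `sigma`, the seed and the sample size are arbitrary assumptions. It shows that drawing x directly from Normal(mu, sigma) and drawing a standard normal z and then computing mu + sigma * z describe the same distribution. In the second form the sampler only has to explore z, whose scale no longer depends on sigma.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
mu, sigma = 3.0, 2.0     # assumed values, for illustration only
n = 20000

# Centered form: sample x directly from Normal(mu, sigma).
centered = [rng.gauss(mu, sigma) for _ in range(n)]

# Non-centered form: sample z from Normal(0, 1), then shift and scale it.
z = [rng.gauss(0.0, 1.0) for _ in range(n)]
non_centered = [mu + sigma * zi for zi in z]

for name, draws in (("centered", centered), ("non-centered", non_centered)):
    print(name, round(statistics.mean(draws), 2), round(statistics.stdev(draws), 2))
```

Both lines should report a mean near 3 and a standard deviation near 2. In a hierarchical model the same substitution would be applied to the group-level parameters.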

When training a single batch, is iteration of examples necessary (optimal) in python code?

Say I have one batch that I want to train my model on. Do I simply run tf.Session()'s sess.run(batch) once, or do I have to iterate through all of the batch's examples with a loop in the session? I'm looking for the optimal way to iterate/update the training ops, such as loss. I thought tensorflow would handle it itself, especially in the cases where tf.nn.dynamic_rnn() takes in a batch dimension for listing the examples. I thought, perhaps naively, that a for loop in the python code would be the inefficient method of updating the loss. I am using tf.losses.mean_squared_error(batch) for a regression problem.
My regression problem takes two lists of word vectors (300d each) and determines the similarity between the two lists on a continuous scale from [0, 5]. My supervised model is Deepmind's Differential Neural Computer (DNC). The problem is that I do not believe it is learning anything, because all of the output from the model is centered around 0 and some of it is even negative. I do not know how it could possibly be negative, given that no negative labels are provided. I only call sess.run(loss) for the single batch; I do not create a python loop to iterate through it.
So, what is the most efficient way to iterate the training of a model and how do people go about it? Do they really use python loops to do multiple calls to sess.run(loss) (this was done in the training file example for DNC, and I have seen it in other examples as well). I am certain I get the final loss from the below process, but I am uncertain if the model has actually been trained entirely just because the loss was processed in one go. I also do not understand the point of update_ops returned by some functions, and am uncertain if they are necessary to ensure the model has been trained.
Example of what I mean by processing a batch's loss once:
# assume the model has been defined prior through batch_output_logits
train_loss = tf.losses.mean_squared_error(labels=target,
                                          predictions=batch_output_logits)
with tf.Session() as sess:
    sess.run(init_op)  # pseudo code, unnecessary for question
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    # is this the entire batch's loss && model has been trained for that batch?
    # fetches go in one list; a second positional argument would be taken as feed_dict
    _, loss_np = sess.run([train_step, train_loss])
    coord.request_stop()
    coord.join(threads)
Any input on why I am receiving negative values when the labels are in the range [0, 5] is welcome as well (general abstract answers for this are fine, because it's not the main focus). I am thinking of attempting to create a piece-wise function, if possible, for my loss, so that any values out of bounds face a rapidly growing exponential loss. I am uncertain how to implement this, or whether it would even work.
Code is currently private. Once allowed, I will make the repo public.
To run DNC model, go to the project/ directory and run python -m src.main. If there are errors you encounter feel free to let me know.
This model depends upon Tensorflow r1.2, most recent Sonnet, and NLTK's punkt for Tokenizing sentences in sts_handler.py and tests/*.
In a regression model, the network calculates the model output based on the randomly initialized values for your model parameters. That's why you're seeing negative values here; you haven't trained your model enough for it to learn that your values are only between 0 and 5.
Unless I'm missing something, you are only calculating the loss, but you aren't actually training the model. You should probably be calling sess.run(optimizer) on an optimizer, not on your loss function.
You probably need to train your model for multiple epochs (training your model for one epoch = training your model once on the entire dataset).
Batches are used because it is more computationally efficient to train your model on a batch than it is to train it on a single example. However, your data seems to be small enough that you won't have that problem. As such, I would recommend reducing your batch size to as low as possible. As a general rule, you get better training from a smaller batch size, at the cost of added computation.
If you post all of your code, I can take a look.
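The difference between evaluating the loss and running a training step can be shown with a minimal sketch in plain Python (no TensorFlow). The one-weight model, the data, the learning rate and the epoch count are all assumptions made for illustration. Evaluating the loss leaves the weight untouched, while repeated training steps over the same batch move it toward the labels. The starting weight is negative on purpose, to show that an untrained model can output negative values even when every label is positive.

```python
def predict(w, x):
    return w * x


def batch_loss(w, batch):
    # Mean squared error over the whole batch in one call (no per-example loop needed outside).
    return sum((predict(w, x) - y) ** 2 for x, y in batch) / len(batch)


def train_step(w, batch, lr=0.01):
    # One gradient-descent update computed from the whole batch.
    grad = sum(2 * (predict(w, x) - y) * x for x, y in batch) / len(batch)
    return w - lr * grad


batch = [(1.0, 2.5), (2.0, 5.0), (3.0, 7.5)]  # labels follow y = 2.5 * x
w = -0.3                                      # untrained weight: predictions start out negative

loss_before = batch_loss(w, batch)
loss_again = batch_loss(w, batch)             # evaluating the loss does not train anything

for epoch in range(100):                      # training needs repeated update steps
    w = train_step(w, batch)

loss_after = batch_loss(w, batch)
print(loss_before, loss_again, loss_after, w)
```

In TensorFlow 1.x terms, `batch_loss` corresponds to fetching only the loss tensor, and `train_step` corresponds to fetching the op returned by an optimizer's `minimize` call.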

SegNet results of train set (test via test_segmentation.py)

I ran SegNet on my own dataset (following the SegNet tutorial). I see great results via test_segmentation.py.
My problem is that I want to see the real net results and not test_segmentation's own colorisation (via classes).
For example, if I have trained the net with 2 classes, then after training I want to see not only 2 colors (as we see with the classes), but the real net segmentation values ([0.22, 0.19, 0.3, ...]), lighter and darker as the net sees them.
I hope that I explained myself well. Thanks for helping.
You could use a python script to achieve what you want. Take a look at this script.
The command out = out['argmax'], extracts the raw output, so you can get a segmentation map with 'lighter and darker' values as you wanted.
When you say the 'real' net color segmentation I will assume that you mean the probability maps. Effectively the last layer will have one map for every class; and if you check the function predict in inference.py, they take the argmax; that is the channel (which represents the class) with the highest probability. If you want to get these maps, you just have to get the data without computing the argmax; something like:
predicted = net.blobs['prob'].data
I solved it. The solution is to fix the range with cmin and cmax from 0 to 1 in the scipy saving method, for example: scipy.misc.toimage(output, cmin=0.0, cmax=1.0).save('/path/.../image.png')
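The two ideas in these answers (keep the probability map instead of the argmax, and save it with a fixed 0 to 1 range) can be illustrated with a small sketch in plain Python. The 2x2 probability values and the `to_gray` helper are made up for illustration. `to_gray` only mimics the idea of linear scaling between cmin and cmax and is not the scipy implementation.

```python
# Per-pixel probabilities for a 2-class net on a 2x2 image (made-up values).
prob_class0 = [[0.8, 0.4], [0.2, 0.6]]
prob_class1 = [[0.2, 0.6], [0.8, 0.4]]

# Argmax over classes: this is the hard label map, with only two distinct values.
labels = [
    [0 if p0 >= p1 else 1 for p0, p1 in zip(row0, row1)]
    for row0, row1 in zip(prob_class0, prob_class1)
]


def to_gray(values, cmin, cmax):
    """Linearly map [cmin, cmax] to gray levels 0..255 (hypothetical helper)."""
    scale = 255.0 / (cmax - cmin)
    return [[int(round((v - cmin) * scale)) for v in row] for row in values]


# Fixed range 0..1: the gray level is directly proportional to the probability.
fixed = to_gray(prob_class1, cmin=0.0, cmax=1.0)

# Range taken from the data: the smallest value becomes black and the largest
# white, so the brightness no longer reflects the absolute probability.
flat = [v for row in prob_class1 for v in row]
stretched = to_gray(prob_class1, cmin=min(flat), cmax=max(flat))

print(labels, fixed, stretched)
```

With a fixed range, images saved from different inputs are comparable, which is presumably why setting cmin and cmax solved the problem.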

How to process multiple inputs in c++ for artificial neural network

How does C++ process multiple inputs for an artificial neural network in real-time?
I'm assuming this is without using a spiking neural network, but a more traditional one (i.e. just a basic neural network as described here)
http://www.ai-junkie.com/ann/evolved/nnt1.html
Is this possible in a real-time world? I was thinking one would have to process either each input individually (which will always result in the same output, hence the dilemma), or accrue a certain number of inputs per time threshold and then process them at once...
Then again, what does someone do with multiple instances of the same input? Process it twice?
I ask this because I'm looking at neuralbot, which I believe uses a normal neural network, but I'm trying to understand ANNs first before I delve into it, and I am not sure how an ANN processes multiple inputs before producing target output(s).
Your question is not really clear but I'll try to answer. :)
You can see an ANN (Artificial Neural Network) as a particular case of an Adaptive Filter.
There are three main elements:
A sequence of inputs x(n).
A Parametric Variable Filter. In this case the filter is the ANN and the parameters are the neuron weights.
An Update Algorithm that updates the filter parameters according to the error between the desired and actual output. In ANNs the most used update algorithm is the Backpropagation Algorithm.
In an ANN there are two steps:
The Training Step. This is the hard part. You start with random neuron weights. You have a sequence of inputs and their desired outputs, and you run the ANN with the update algorithm on. When the error is under a certain threshold, you can say that your ANN is trained. This step is usually done off-line (not in real time).
The Execution Step. You have the trained ANN. Now just feed the inputs into it in sequence and use the output. This is usually a fast operation and can be done in real time (if that is what you mean).
Now, what do you mean by "multiple inputs at once"? First of all, standard computers can do only a very small number of operations at once: a standard PC has 4 to 8 cores, so it can do roughly 4 to 8 operations at once. This number is too low for any real-world ANN application.
You said:
what does someone do with multiple instances of the same input? Process it twice?
The answer is yes. The "Execution Step" is so fast that there is no reason not to do it. In the "Training Step", duplicated inputs can be removed before training starts (because the training inputs are known a priori). So there is no problem with this. :)
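A minimal sketch of the "Execution Step" in Python, assuming a single sigmoid neuron with two inputs. The weights, bias and input stream are hypothetical and stand in for an already-trained network. Inputs are fed in one after another, and a repeated input is simply processed again, giving the same output.

```python
import math


def neuron_output(inputs, weights, bias):
    # Weighted sum of all inputs followed by a sigmoid activation.
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-activation))


weights = [0.8, -0.5]  # hypothetical weights, as if training had already happened
bias = 0.1

# Samples arrive in sequence; the third one repeats the first.
stream = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
outputs = [neuron_output(sample, weights, bias) for sample in stream]

for sample, out in zip(stream, outputs):
    print(sample, round(out, 4))
```

Each sample here holds several input values that are combined in one forward pass, which is the usual meaning of "multiple inputs" for a feed-forward network.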

Autocorrelation returns random results with mic input (using a high pass filter)

Sorry to ask a similar question to the one I asked before (FFT Problem (Returns random results)), but I've looked up pitch detection and autocorrelation and have found some code for pitch detection using autocorrelation.
I'm trying to do pitch detection of a user's singing. The problem is that it keeps returning random results. I've got some code from http://code.google.com/p/yaalp/ which I've converted to C++ and modified (below). My sample rate is 2048, and the data size is 1024. I'm detecting the pitch of both a sine wave and mic input. The frequency of the sine wave is 726.0, and it's detecting it to be 722.950820 (which I'm OK with), but it's detecting the pitch of the mic as a random number from around 100 to around 1050.
I'm now using a high-pass filter to remove the DC offset, but it's not working. Am I doing it right, and if so, what else can I do to fix it? Any help would be greatly appreciated!
(Fixed)
Thanks,
Niall.
Edit: Changed the code to implement a high-pass filter with a cutoff of 30 Hz (from What Are High-Pass and Low-Pass Filters?; can anyone tell me how to convert the low-pass filter using convolution to a high-pass one?), but it's still returning random results. Plugging it into a VST host and using VST plugins to compare spectrums isn't an option for me, unfortunately.
Edit: Fixed. Thanks for everyone's help, but I never got it to work and am now using new code.
I am no sound expert, but if you are sampling at 44100 (I guess samples per second) and use 1024 data points, you are working with about 1/40th of a second worth of data. It doesn't surprise me that the current pitch varies a lot, depending on which piece you pick. If you want to find the average or main pitch of a voice, I'd expect to need about 1 second worth of data.
At 44.1 kHz sampling frequency, 1024 samples is only a little bit over 23 ms worth of data. Isn't it possible that this is simply insufficient data to compute the pitch of a human singer?
I mean, the sound I can make that lasts for 23 ms is probably not something I have a lot of pitch control over; I would expect this kind of measurement to be done over slightly longer periods of time.
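The arithmetic behind these two answers, as a short Python sketch. The 44100 Hz rate is the one the answers assume. The 100 Hz pitch is an assumed example taken from the low end of the values reported in the question.

```python
def window_seconds(num_samples, sample_rate_hz):
    # Duration covered by a block of samples.
    return num_samples / sample_rate_hz


def periods_in_window(num_samples, sample_rate_hz, pitch_hz):
    # How many full cycles of a tone at pitch_hz fit into the block.
    return window_seconds(num_samples, sample_rate_hz) * pitch_hz


duration = window_seconds(1024, 44100)                 # a little over 23 ms
low_voice_periods = periods_in_window(1024, 44100, 100.0)

print(f"{duration * 1000:.1f} ms, {low_voice_periods:.2f} periods of a 100 Hz tone")
```

At 100 Hz only a little over two periods fit in the window, which leaves an autocorrelation very few repetitions to lock onto.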
The problem is in your findBestCandidates() function:
Inside this function you access the 'inputs' array from 0 up to 'length - 1'.
When you call this function inside detectPitchCalculation() function 'inputs' is 'results' and 'length' is 'nHiPeriodInSamples'.
But 'results' is only allocated and filled up to 'nHiPeriodInSamples - nLowPeriodInSamples - 1'.
So if 'nLowPeriodInSamples' is greater than 0, you access unallocated and random memory inside the findBestCandidates() function!
EDIT:
Another bug is that you fill only every 'nResolution'-th entry of the 'results' array in the detectPitchCalculation() function but access every entry in the findBestCandidates() function (via the 'inputs' argument). But since you call detectPitchCalculation() with 'nResolution=1', this does not explain your specific problem... so I will look a little bit more. It would definitely be a problem if you called it with higher resolutions.
I don't see the problem in your code, but I'm no good at C. I'd try the following to find the problem:
Run it with data where the result is known, e.g. with sin(x) as input.
Run it with a small data size (e.g. 2).
Compare the results with known correct ones. You should be able to find those on the internet, or do them by hand.
If random means "same input, different output", you most probably have a bug in the initialisation of variables. Use a debugger and known input to check that all variables, especially all elements of arrays, are properly initialized.
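A minimal sketch of the "known input" test in plain Python, using a straightforward autocorrelation rather than the yaalp code from the question. The 8000 Hz sample rate, the 200 Hz test tone and the 50 to 1000 Hz search range are assumptions, chosen so that one period is exactly 40 samples.

```python
import math

SAMPLE_RATE = 8000   # assumed for the test, not the asker's setting
TONE_HZ = 200.0      # known input: one period is exactly 40 samples
NUM_SAMPLES = 1024

signal = [math.sin(2.0 * math.pi * TONE_HZ * i / SAMPLE_RATE)
          for i in range(NUM_SAMPLES)]


def autocorrelation(samples, lag):
    # Sum of products of the signal with a copy of itself shifted by `lag`.
    return sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))


def detect_pitch(samples, sample_rate, min_hz=50.0, max_hz=1000.0):
    # Only search lags that correspond to the plausible pitch range.
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag = max(range(min_lag, max_lag + 1),
                   key=lambda lag: autocorrelation(samples, lag))
    return best_lag, sample_rate / best_lag


best_lag, estimated_hz = detect_pitch(signal, SAMPLE_RATE)
print(best_lag, estimated_hz)
```

Because the result is computed from an integer lag, a tone whose period is not a whole number of samples will be reported slightly off, much like the 722.95 reported for the 726 Hz sine in the question.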