Why does dlib's neural net xml export contain different parameters for layers than specified by the trainer? - c++

In dlib it is possible to export a neural net to XML via the dlib::net_to_xml(some_net, some_filename) function. It works fine, but it also writes out information such as the net type and learning_rate_mult. In my case, for example, it exports the following line for one of the layers (the rest of the exported XML is omitted for clarity):
<fc num_outputs='42' learning_rate_mult='1' weight_decay_mult='1' bias_learning_rate_mult='500' bias_weight_decay_mult='0'>
Those values are correct, except for learning_rate_mult and weight_decay_mult, which always show 1. I tried setting them to different values with the trainer class, like 2 or 0.0001, but they keep showing 1. I verified that the values 2 and 0.0001 were indeed used by the net.
Might this be a bug in dlib's dlib::net_to_xml function?

Those values are per-layer multipliers and are independent of the values you set on the trainer: each layer's learning_rate_mult and weight_decay_mult scale the trainer's global learning rate and weight decay for that particular layer, and they default to 1. These layer parameters are relevant for optimizers like the Adam optimization algorithm:
https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
You can change them by specifying them on each layer (e.g. via the layer's set_learning_rate_multiplier and set_weight_decay_multiplier), not through the trainer, which is why the export keeps showing the default of 1.
So no, it is not a bug.

Related

Mxnet with symbol API: batch normalization update

I am currently training a convolutional neural network using MXNet, with the C++ Symbol API. This network contains some batch normalization layers, each of which holds four parameter NDArrays. Two of them, the moving_mean and moving_variance parameters, are supposed to be updated at every batch during training.
I assumed that, since the training flag for the executor's forward pass is set to true, these parameters would be updated automatically. However, for some reason these two NDArrays remain unchanged. How so? Besides, since no gradients are computed for these two NDArrays (they are not "learnable" parameters), I have no way to update their values through the regular optimizer update function. How do I tell MXNet, using the Symbol API, to update the moving_mean and moving_variance NDArrays?
moving_mean and moving_variance are updated during the backward pass of training, rather than during the optimization step like other parameters. One other reason these parameters could remain fixed during training is if you've set use_global_stats=True on the BatchNorm layer.
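For reference, here is a minimal sketch of the relevant flag. It uses the Python Symbol API for brevity, but the C++ Symbol API's BatchNorm operator takes the same use_global_stats argument; the layer names and shapes below are made up for illustration.

import mxnet as mx

# Minimal sketch (Python Symbol API; the C++ Symbol API exposes the same flag).
# With use_global_stats=False (the default), moving_mean and moving_var are
# refreshed from each training batch, as described above; with True they are
# frozen and only the stored global statistics are used.
data = mx.sym.Variable('data')
net = mx.sym.Convolution(data=data, num_filter=32, kernel=(3, 3), name='conv1')
net = mx.sym.BatchNorm(data=net, fix_gamma=False, use_global_stats=False, name='bn1')
net = mx.sym.Activation(data=net, act_type='relu', name='relu1')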

UserWarning pymc3: What does reparameterize mean?

I built a pymc3 model using the DensityDist distribution. I have four parameters, of which 3 use Metropolis and one uses NUTS (this is chosen automatically by pymc3). However, I get two different UserWarnings:
1. Chain 0 contains a number of diverging samples after tuning. If increasing target_accept does not help try to reparameterize.
May I know what reparameterize means here?
2. The acceptance probability in chain 0 does not match the target. It is , but should be close to 0.8. Try to increase the number of tuning steps.
Digging through a few examples I used 'random_seed', 'discard_tuned_samples', 'step = pm.NUTS(target_accept=0.95)' and so on, and got rid of these user warnings. But I couldn't find details of how these parameter values are decided. I am sure this might have been discussed in various contexts, but I am unable to find solid documentation for it. I was using a trial-and-error approach, as below.
with patten_study:
    # SEED = 61290425  # 51290425
    step = pm.NUTS(target_accept=0.95)
    trace = pm.sample(step=step)  # 4000, tune=10000, step=step, discard_tuned_samples=False, random_seed=SEED
I need to run this on different datasets, so I am struggling to fix these parameter values for each dataset I am using. Is there any way to set these values, check the outcome (whether there are any user warnings), try other values if needed, and run this in a loop?
Pardon me if I am asking something stupid!
In this context, re-parametrization basically means finding a different but equivalent model that is easier to compute. There are many things you can do, depending on the details of your model:
Instead of using a Uniform distribution, use a Normal distribution with a large variance.
Change from a centered hierarchical model to a non-centered one (see the sketch after this list).
Replace a Gaussian with a Student-T.
Model a discrete variable as a continuous one.
Marginalize variables, like in this example.
Whether these changes make sense or not is something that you should decide based on your knowledge of the model and the problem.
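To make the non-centered suggestion concrete, here is a minimal pymc3 sketch. It uses a generic hierarchical model with made-up priors and shapes, not the DensityDist model from the question, and the likelihood on observed data is omitted for brevity.

import pymc3 as pm

# Centered parameterization: sampling theta directly often produces the
# divergence warnings when sigma is small (the "funnel" geometry).
with pm.Model() as centered:
    mu = pm.Normal('mu', 0.0, 5.0)
    sigma = pm.HalfNormal('sigma', 5.0)
    theta = pm.Normal('theta', mu=mu, sd=sigma, shape=8)

# Non-centered parameterization: mathematically the same model, but theta is
# built from a standard-normal offset, which is usually much easier for NUTS.
with pm.Model() as non_centered:
    mu = pm.Normal('mu', 0.0, 5.0)
    sigma = pm.HalfNormal('sigma', 5.0)
    theta_offset = pm.Normal('theta_offset', 0.0, 1.0, shape=8)
    theta = pm.Deterministic('theta', mu + sigma * theta_offset)
    # trace = pm.sample(tune=2000, target_accept=0.95)

The point is that NUTS now samples theta_offset, whose scale does not depend on sigma, so the posterior geometry is much better behaved and the divergence warnings usually disappear.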

SegNet results of train set (test via test_segmentation.py)

I ran SegNet on my own dataset (following the SegNet tutorial) and I see great results via test_segmentation.py.
My problem is that I want to see the real net output and not test_segmentation.py's own colourisation (one colour per class).
For example, if I have trained the net with 2 classes, then after training I want to see not only 2 colours (as we see with the classes), but the net's real-valued segmentation ([0.22, 0.19, 0.3, ...]), lighter and darker as the net sees it.
I hope that I explained myself well. Thanks for helping.
You could use a Python script to achieve what you want. Take a look at this script.
The command out = out['argmax'] extracts the raw output, so you can get a segmentation map with 'lighter and darker' values as you wanted.
When you say the 'real' net colour segmentation, I will assume that you mean the probability maps. Effectively the last layer will have one map for every class, and if you check the function predict in inference.py, they take the argmax, that is, the channel (which represents the class) with the highest probability. If you want to get these maps, you just have to take the data without computing the argmax; something like:
predicted = net.blobs['prob'].data
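For context, here is a slightly fuller pycaffe sketch of the same idea; the prototxt/caffemodel paths are placeholders, and the 'prob' blob name should be taken from your own SegNet inference setup.

import numpy as np
import caffe

# Load the deployed net (paths are placeholders).
net = caffe.Net('segnet_deploy.prototxt', 'segnet_weights.caffemodel', caffe.TEST)
# ... fill net.blobs['data'].data[...] with your preprocessed image here ...
net.forward()

# Shape (batch, num_classes, height, width): one probability map per class.
prob = net.blobs['prob'].data[0]

# Keep the per-class maps instead of taking the argmax; each map holds values
# in [0, 1] that can be saved as a grayscale ("lighter and darker") image.
for c in range(prob.shape[0]):
    class_map = (prob[c] * 255).astype(np.uint8)
    # e.g. save class_map with imageio.imwrite('class_%d.png' % c, class_map)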
I solved it. The solution is to set cmin and cmax to range from 0 to 1 in the scipy saving method, for example: scipy.misc.toimage(output, cmin=0.0, cmax=1.0).save('/path/.../image.png')

Caffe GoogleNet classification.cpp gives random outputs

I used the Caffe GoogLeNet model to train on my own data (10k images, 2 classes). I stopped it at the 400,000th iteration with an accuracy of ~80%.
If I run the below command:
./build/examples/cpp_classification/classification.bin
models/bvlc_googlenet/deploy.prototxt
models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
data/ilsvrc12/imagenet_mean.binaryproto
data/ilsvrc12/synset_words.txt
1.png
it gives me a different, apparently random, result each time (i.e. if I run it n times, then I get n different results). Why? Did my training fail? Does it still use the old weights from the reference model?
I don't think it is a problem with the training. Even if the training had gone wrong, the net should give the same (possibly wrong) output every time. If you are getting random results, it indicates that the weights are not being loaded properly.
When you load a .caffemodel against a .prototxt, Caffe will load the weights of all the layers in the prototxt whose names match the ones in the caffemodel. For the other layers it will do a random initialisation (gaussian, xavier, etc., according to the specification in the prototxt).
So the best thing for you to do now is to check whether the model was trained using the same prototxt you are using now.
I see that you are using the GoogLeNet prototxt with the reference_caffenet caffemodel. Is this intentional?
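One quick way to check this from Python (a sketch, not part of the original answer): load the deploy prototxt with pycaffe and compare its layer names against those stored in the caffemodel. The paths below are simply the ones from the question.

import caffe
from caffe.proto import caffe_pb2

deploy = 'models/bvlc_googlenet/deploy.prototxt'
weights = 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'

# Layer names in the deploy prototxt that carry learnable parameters.
net = caffe.Net(deploy, caffe.TEST)
deploy_layers = set(net.params.keys())

# Layer names stored in the caffemodel (both new- and old-style fields).
model = caffe_pb2.NetParameter()
with open(weights, 'rb') as f:
    model.ParseFromString(f.read())
model_layers = {l.name for l in model.layer} | {l.name for l in model.layers}

# Anything listed here will be randomly initialised instead of loaded.
print('layers without matching weights:', sorted(deploy_layers - model_layers))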
When you want to deploy the fine-tuned model, you should check two main things:
Inputs:
Does the input image use BGR channel order instead of RGB (as e.g. OpenCV does), consistent with how the model was trained? (See the input-handling sketch after this answer.)
Mean file: is it the same mean file that was used during training?
Prototxt:
When fine-tuning the model you usually change some layer names in the original prototxt; check whether the deploy prototxt uses those same layer names.
And there are some Fine-tune tricks and the CS231n_transfer_learning notes, which are very useful for fine-tuning.
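As a concrete illustration of the two input checks above, here is a minimal pycaffe preprocessing sketch; the file names and the 'data' blob name are placeholders to be replaced with the ones from your own training setup.

import caffe
from caffe.proto import caffe_pb2

net = caffe.Net('deploy.prototxt', 'your_finetuned.caffemodel', caffe.TEST)

# Load the *same* mean that was used for training, reduced to per-channel values.
blob = caffe_pb2.BlobProto()
with open('your_mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean = caffe.io.blobproto_to_array(blob)[0].mean(1).mean(1)

transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))      # HxWxC -> CxHxW
transformer.set_mean('data', mean)
transformer.set_raw_scale('data', 255)            # caffe.io.load_image returns [0, 1]
transformer.set_channel_swap('data', (2, 1, 0))   # RGB -> BGR, as during training

img = caffe.io.load_image('1.png')                # RGB float image in [0, 1]
net.blobs['data'].data[...] = transformer.preprocess('data', img)
out = net.forward()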

Face Recognition Using Backpropagation Neural Network?

I'm very new to image processing and my first assignment is to make a working program which can recognize faces and their names.
So far, I have successfully managed to detect a face, crop the detected image, apply a Sobel filter to it and translate it into an array of floats.
But I'm very confused about how to implement a backpropagation MLP to learn from the images so it can recognize the correct name for the detected face.
It would be a great honour if the experts on Stack Overflow could give me some examples of how to feed the image array into backpropagation learning.
This is a standard machine learning task. You have a number of arrays of floats (instances in ML terms, or observations in statistics terms) and corresponding names (labels, class tags), one per array. This is enough for use in most ML algorithms. Specifically in an ANN, the elements of your array (i.e. features) are the inputs of the network and the labels (names) are its outputs.
If you are looking for a theoretical description of backpropagation, take a look at Stanford's ml-class lectures (ANN section). If you need a ready implementation, read this question.
You haven't specified what the elements of your arrays are. If you use just the pixels of the original image, this will work, but not very well. If you need a production-level system (though still using an ANN), try to extract more high-level features (e.g. Haar-like features, which OpenCV itself uses).
Have you tried writing your feature vectors to an ARFF file and feeding them to Weka, just to see whether your approach might work at all?
Weka has a lot of classifiers built in, including an MLP.
From what I understand so far, I suspect that the features and the classifier you have chosen will not work.
To your original question: have you made any attempts to implement a neural network on your own? If so, where did you get stuck? Note that this is not the place to request a complete working implementation from the audience.
To provide a general answer to a general question:
An MLP consists of nodes: input nodes, output nodes and hidden nodes. These nodes are strictly organized into layers, with the input layer at the bottom, the output layer at the top and hidden layers in between. The nodes are connected in a simple feed-forward fashion (output connections are allowed only to the next higher layer).
Then you connect each of your floats to a single input node and feed the feature vectors to your network. For backpropagation you need to supply an error signal, which you specify at the output nodes. So if you have n names to distinguish, you may use n output nodes (i.e. one for each name) and make them return, for example, 1 in case of a match and 0 otherwise. You could also use one output node and let it return n different values for the names. It might even be best to use n completely separate networks, i.e. one per name, to avoid certain side effects (catastrophic interference).
Note that the output of each node is a number, not a name. Therefore you need some sort of threshold (or simply the output node with the largest value) to get a number-to-name relation.
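To make that concrete, here is a minimal numpy sketch of exactly this setup (one input node per float, one hidden layer, one output node per name, trained with plain backpropagation). The sizes, learning rate and random toy data are made-up illustration values, not part of the original answer.

import numpy as np

rng = np.random.default_rng(0)

n_features = 64 * 64   # e.g. a flattened 64x64 Sobel image (assumed size)
n_hidden = 50
n_names = 5            # number of people to distinguish

# Toy data: X holds the float arrays, y holds integer name ids 0..n_names-1.
X = rng.random((200, n_features))
y = rng.integers(0, n_names, 200)
T = np.eye(n_names)[y]          # targets: 1 for the matching name, 0 otherwise

# One hidden layer; weights start small and random.
W1 = rng.normal(0.0, 0.01, (n_features, n_hidden))
W2 = rng.normal(0.0, 0.01, (n_hidden, n_names))
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(100):
    # forward pass
    H = sigmoid(X @ W1)                 # hidden activations
    O = sigmoid(H @ W2)                 # output activations, one per name
    # backward pass: squared-error gradient pushed back through the sigmoids
    dO = (O - T) * O * (1 - O)
    dH = (dO @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dO / len(X)
    W1 -= lr * X.T @ dH / len(X)

# Thresholding: pick the name whose output node fires strongest.
predicted_ids = np.argmax(sigmoid(sigmoid(X @ W1) @ W2), axis=1)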
Also note that you need a lot of training data to train a large network (because of the curse of dimensionality). It would be interesting to know the size of your float array.
Indeed, for a complex decision you may need a larger number of hidden nodes or even hidden layers.
Further note that you may need to do a lot of evaluation (i.e. cross-validation) to find the optimal configuration (number of layers, number of nodes per layer), or even to find any working configuration at all.
Good luck anyway!