MLP: Relation between number of iterations and accuracy - C++

I would like to ask somebody for advice. I wrote a program in C++ using the OpenCV library (v2.4.11), specifically its MLP classifier.
I get about 92% accuracy on 2,000 test screens, but only when I set the number of iterations to 1. With larger numbers like 100 or 1000 it gets worse (78% at 100, 77% at 1000).
Is it possible that the problem is in the data model and the programming part is correct? Or must it be my fault?
Thank you very much.

Is it possible that the problem is in the data model and the programming part is correct?
Yes. The number of iterations, like the number of neurons and the number of layers, is one of the parameters that have a great impact on the overall performance of the MLP classifier. The more iterations you apply during MLP training, the more the MLP adapts/fits to its training data. This leads to high performance on the training data but can eventually lead to poor performance on the test data. In that case you have over-trained/over-fitted your MLP.
There are, however, methods (e.g., grid search) for optimizing the parameters of a classifier.
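To illustrate the idea (not part of the original answer), here is a minimal Python sketch of a grid search over the iteration count and a couple of other MLP hyper-parameters. It uses scikit-learn and synthetic data rather than the asker's OpenCV C++ setup, so treat it purely as a sketch:

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the asker's screen features and labels.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

param_grid = {
    "max_iter": [1, 10, 100, 1000],                # number of training iterations
    "hidden_layer_sizes": [(10,), (50,), (100,)],  # network size
    "alpha": [1e-4, 1e-3, 1e-2],                   # L2 regularisation strength
}

# Cross-validation picks the combination that generalises best,
# instead of judging each setting on the training data alone.
search = GridSearchCV(MLPClassifier(random_state=0), param_grid, cv=5, n_jobs=-1)
search.fit(X, y)
print(search.best_params_, search.best_score_)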

Related

YOLO: high mAP but fairly low confidence

I am currently working on a project which makes use of YOLOv5. I have trained my model for 100 epochs on a dataset of 5000+ images and got a good mAP of 0.95. My question came up when I tried to detect objects using detect.py with the trained weights: I got fairly low confidence in detection, about 0.30 to 0.70, on certain objects. Should I train my model for more epochs to improve the confidence? And does a high mAP not result in high confidence?
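For reference, a minimal sketch (paths below are placeholders) of loading the trained weights via torch.hub and adjusting the confidence threshold that filters detections:

import torch

# Load the custom-trained weights (path is a placeholder).
model = torch.hub.load("ultralytics/yolov5", "custom", path="runs/train/exp/weights/best.pt")
model.conf = 0.25                  # confidence threshold used to filter detections

results = model("test_image.jpg")  # placeholder image
results.print()                    # per-image summary, including confidence scores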

Hyperparameter tuning for Random Cut Forest

I have used the hyperparameters below to train the model.
rcf.set_hyperparameters(
    num_samples_per_tree=200,
    num_trees=250,
    feature_dim=1,
    eval_metrics=["accuracy", "precision_recall_fscore"])
Is there any good way to choose the num_samples_per_tree and num_trees parameters?
What are the best values for both num_samples_per_tree and num_trees?
There are natural interpretations for these two hyper-parameters that can help you determine good starting approximations for HPO:
num_samples_per_tree -- the reciprocal of this value approximates the density of anomalies in your data set/stream. For example, if you set this to 200 then the assumption is that approximately 0.5% of the data is anomalous. Try exploring your dataset to make an educated estimate.
num_trees -- the more trees in your RCF model the less noise in scores. That is, if more trees are reporting that the input inference point is an anomaly then the point is much more likely to be an anomaly than if few trees suggest so.
The total number of points sampled from the input dataset is equal to num_samples_per_tree * num_trees. You should make sure that the input training set is at least this size.
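As a rough illustration (the anomaly-rate estimate and tree count below are made-up placeholders), the sizing logic can be written out like this:

# Placeholder estimate: roughly 0.5% of the stream is expected to be anomalous.
estimated_anomaly_rate = 0.005
num_samples_per_tree = round(1 / estimated_anomaly_rate)  # -> 200
num_trees = 100                                           # more trees -> less noisy scores

# The training set should contain at least num_samples_per_tree * num_trees points.
min_training_points = num_samples_per_tree * num_trees
print(num_samples_per_tree, num_trees, min_training_points)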
(Disclosure - I helped create SageMaker Random Cut Forest)

Comparison of HoG with CNN

I am working on a comparison of the Histogram of Oriented Gradients (HoG) and a Convolutional Neural Network (CNN) for weed detection. I have two datasets of two different weeds.
The CNN architecture is a 3-layer network.
1) The first dataset contains two classes and has 18 images.
The dataset is enlarged using data augmentation (rotation, adding noise, illumination changes).
Using the CNN I get a test accuracy of 77%, and for HoG with SVM 78%.
2) The second dataset contains leaves of two different plants.
Each class contains 2,500 images, without data augmentation.
For this dataset, using the CNN I get a test accuracy of 94%, and for HoG with SVM 80%.
My question is: why am I getting higher accuracy for HoG on the first dataset? The CNN should be much better than HoG.
The only reason that comes to my mind is that the first dataset has only 18 images and is less diverse compared to the second dataset. Is that correct?
Yes, your intuition is right: such a small dataset (just 18 images before data augmentation) can cause the worse performance. In general, CNNs usually need at least thousands of images. The SVM does not perform as badly because of the regularization (which you most probably use) and because the model probably has a much lower number of parameters. There are ways to regularize deep nets as well; with your first dataset you might want to give dropout a try (see the sketch below), but I would rather try to acquire a substantially larger dataset.
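As a rough sketch of the dropout suggestion (layer sizes, input size, and the dropout rate are arbitrary choices, not a recipe), a small 3-layer CNN in PyTorch might look like:

import torch.nn as nn

# Assumes 3-channel 64x64 inputs and 2 weed classes.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Dropout(p=0.5),         # randomly zeroes 50% of the features during training
    nn.Linear(64 * 8 * 8, 2),  # a 64x64 input shrinks to 8x8 after three 2x2 poolings
)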

Using class weight to balance data set lowers accuracy in RBF SVM

I have been using sklearn to learn on some data. This is a binary classification task and I am using an RBF kernel. My dataset is quite unbalanced (80:20) and I'm using only 120 samples, with roughly 10 features (I've been experimenting with a few less). Since I set class_weight="auto", the accuracy I've calculated from a cross-validated (10-fold) grid search has dropped dramatically. Why?
I will include a couple of validation-accuracy heatmaps to demonstrate the difference.
NOTE: the top heatmap is from before class_weight was changed to "auto".
Accuracy is not the best metric to use when dealing with an unbalanced dataset. Say you have 99 positive examples and 1 negative example: if you predict every output to be positive, you still get 99% accuracy even though you have misclassified the only negative example. You might have gotten high accuracy in the first case because your predictions fall on the side with the larger number of samples.
When you set class_weight="auto", it takes the imbalance into consideration, and hence your predictions may have moved towards the centre; you can cross-check this by plotting histograms of the predictions.
My suggestion is: don't use accuracy as the performance metric, use something like the F1 score or AUC.
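As a quick sketch (with synthetic 80:20 data standing in for the real set), scikit-learn can report F1 and ROC AUC alongside accuracy for the same cross-validated SVM:

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic 80:20 data standing in for the asker's 120 samples with ~10 features.
X, y = make_classification(n_samples=120, n_features=10, weights=[0.8, 0.2], random_state=0)

# class_weight="auto" from older scikit-learn is called "balanced" in current versions.
clf = SVC(kernel="rbf", class_weight="balanced")
for metric in ("accuracy", "f1", "roc_auc"):
    scores = cross_val_score(clf, X, y, cv=10, scoring=metric)
    print(metric, round(scores.mean(), 3))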

Parameters to improve a music frequency analyzer

I'm using an FFT on audio data to drive an analyzer, like you'd see in Winamp or Windows Media Player. However, the output doesn't look that great. I'm plotting on a logarithmic scale and I average the linear results from the FFT into the corresponding logarithmic bins. As an example, I'm using bins like:
16k, 8k, 4k, 2k, 1k, 500, 250, 125, 62, 31, 15 [Hz]
Then I plot the magnitude (dB) against frequency [Hz]. The graph definitely 'reacts' to the music, and I can see the response to a drum sample or a high-pitched voice. But the graph is very 'saturated' near the lower frequencies and overall doesn't look much like what you see in real applications, which tend to be more evenly distributed. I suspect that apps with visual output do different things to the data to make it look better.
What things could I do to the data to make it look more like the typical music player app?
Some useful information:
I downsample to a single channel at 32 kHz and use a time window of 35 ms, which means the FFT gets about 1100 points. I vary these values to experiment (e.g., I tried 16 kHz, and increasing/decreasing the window length) but I get similar results.
With an FFT of about 1100 points, you probably aren't able to capture the low frequencies with much frequency resolution.
Think about it: 30 Hz corresponds to a period of about 33 ms, which at 32 kHz is roughly 1000 samples. So you'll only be able to capture about one period in that window.
Thus, you'll need a longer FFT window to capture those low frequencies with sharp frequency resolution.
You'll likely need a time window of 4000 samples or more to start getting noticeably better frequency resolution at the low frequencies. That will still be fine, since you'll get about 8-10 spectrum updates per second.
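To put numbers on that (a quick back-of-the-envelope check, not from the original answer):

sample_rate = 32000  # Hz

for n_fft in (1100, 4096):
    resolution = sample_rate / n_fft        # spacing between FFT bins, in Hz
    window_ms = 1000 * n_fft / sample_rate  # how much audio each FFT covers
    print(n_fft, "points:", round(resolution, 1), "Hz per bin,", round(window_ms), "ms window")

# ~29 Hz per bin at 1100 points, so the 15 Hz and 31 Hz bands share a single bin;
# 4096 points gives ~7.8 Hz per bin at the cost of a ~128 ms window.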
Another option, if you want very fast updates for the high-frequency bins but good frequency resolution at the low frequencies, is to update the high-frequency bins more often (such as with the windows you're currently using) but compute the low-frequency bins less often (and with the larger windows necessary for good frequency resolution).
I think a lot of these applications have variable FFT bins.
What you could do is start with very wide, evenly spaced bins like you have and then keep track of the number of FFT elements that land in each bin. If some of the bins are hardly used at all (usually the higher frequencies), widen those bins so that they are larger (and thus contain more frequency entries) and shrink the low-frequency bins, as sketched below.
I have worked on projects where we just spent a lot of time tuning bins for specific input sources, but it is much nicer to have the software adjust in real time.
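A rough sketch of that book-keeping (the FFT length and band edges are taken from the question; the adaptation policy itself is left out):

import numpy as np

sample_rate, n_fft = 32000, 1100
fft_bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)

# Count how many raw FFT bins land in each display band.
band_edges = [15, 31, 62, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]
counts, _ = np.histogram(fft_bin_freqs, bins=band_edges)
for lo, hi, c in zip(band_edges[:-1], band_edges[1:], counts):
    print(lo, "-", hi, "Hz:", c, "FFT bins")  # sparse low bands suggest widening or merging them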
A typical visualizer would use constant-Q bandpass filters, not a single FFT.
You could emulate a set of constant-Q bandpass filters by multiplying the FFT results by a set of constant-Q filter responses in the frequency domain, then sum. For low frequencies, you should use an FFT longer than the significant impulse response of the lowest frequency filter. For high frequencies, you can use shorter FFTs for better responsiveness. You can slide any length FFTs along at any desired update rate by overlapping (re-using) data, or you might consider interpolation. You might also want to pre-window each FFT to reduce "spectral leakage" between frequency bands.
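A rough sketch of that emulation in Python (the Gaussian band shapes and the Q value are arbitrary choices for illustration):

import numpy as np

sample_rate, n_fft = 32000, 4096
samples = np.random.randn(n_fft)        # placeholder for a frame of real audio
windowed = samples * np.hanning(n_fft)  # pre-window to reduce spectral leakage
spectrum = np.abs(np.fft.rfft(windowed))
freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)

centres = [31, 62, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]
q = 4.0                                 # constant Q: bandwidth = centre frequency / Q
levels_db = []
for fc in centres:
    response = np.exp(-0.5 * ((freqs - fc) / (fc / q)) ** 2)  # Gaussian band response
    levels_db.append(20 * np.log10(np.sum(response * spectrum) + 1e-12))
print(dict(zip(centres, [round(v, 1) for v in levels_db])))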