I am training a neural network (in C++, without any additional library), to learn a random wiggly function:
f(x) = 0.2 + 0.4x^2 + 0.3x sin(15x) + 0.05 cos(50x)
Plotted in Python as:
import math
import matplotlib.pyplot as plt

x, y = [], []
lim = 500
for i in range(lim):
    x.append(i)
    p = 2 * math.pi * i / lim
    y.append(0.2 + 0.4*(p*p) + 0.3*p*math.sin(15*p) + 0.05*math.cos(50*p))
plt.plot(x, y)
plt.show()
which corresponds to this curve:
The same neural network approximated the sine function quite well with a single hidden layer (5 neurons) and tanh activation, but I am unable to understand what's going wrong with the wiggly function, even though the mean squared error seems to dip. (The error has been scaled up by 100 for visibility.)
And this is the expected (GREEN) vs predicted (RED) graph.
I suspect the normalization is at fault. This is how I did it:
Generated training data as:
int numTrainingSets = 100;
double MAXX = -9999999999999999;
for (int i = 0; i < numTrainingSets; i++)
{
double p = (2*PI*(double)i/numTrainingSets);
training_inputs[i][0] = p; //INSERTING DATA INTO i'th EXAMPLE, 0th INPUT (Single input)
training_outputs[i][0] = 0.2+0.4*pow(p, 2)+0.3*p*sin(15*p)+0.05*cos(50*p); //Single output
///FINDING NORMALIZING FACTOR (IN INPUT AND OUTPUT DATA)
for(int m=0; m<numInputs; ++m)
if(MAXX < training_inputs[i][m])
MAXX = training_inputs[i][m]; //FINDING MAXIMUM VALUE IN INPUT DATA
for(int m=0; m<numOutputs; ++m)
if(MAXX < training_outputs[i][m])
MAXX = training_outputs[i][m]; //FINDING MAXIMUM VALUE IN OUTPUT DATA
///NORMALIZE BOTH INPUT & OUTPUT DATA USING THIS MAXIMUM VALUE
////DO THIS FOR INPUT TRAINING DATA
for(int m=0; m<numInputs; ++m)
training_inputs[i][m] /= MAXX;
////DO THIS FOR OUTPUT TRAINING DATA
for(int m=0; m<numOutputs; ++m)
training_outputs[i][m] /= MAXX;
}
This is what the model trains on. The validation/test data is generated as follows:
int numTestSets = 500;
for (int i = 0; i < numTestSets; i++)
{
//NORMALIZING TEST DATA USING THE SAME "MAXX" VALUE
double p = (2*PI*i/numTestSets)/MAXX;
x.push_back(p); //FORMS THE X-AXIS FOR PLOTTING
///Actual Result
double res = 0.2+0.4*pow(p, 2)+0.3*p*sin(15*p)+0.05*cos(50*p);
y1.push_back(res); //FORMS THE GREEN CURVE FOR PLOTTING
///Predicted Value
double temp[1];
temp[0] = p;
y2.push_back(MAXX*predict(temp)); //FORMS THE RED CURVE FOR PLOTTING, scaled up to de-normalize
}
Is this normalizing right? If yes, what could probably go wrong? If no, what should be done?
There's nothing wrong with using that normalization, unless you use a fancy weight initialization for the neural network. It rather seems that something goes wrong during training but without further details on that side, it's hard to pinpoint the problem.
I ran a quick crosscheck using tensorflow (MSE loss; Adam optimizer) and it does converge in that case:
Here's the code for reference:
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

x = np.linspace(0, 2*np.pi, 500)
y = 0.2 + 0.4*x**2 + 0.3*x*np.sin(15*x) + 0.05*np.cos(50*x)

class Model(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.h1 = tf.keras.layers.Dense(5, activation='tanh')
        self.out = tf.keras.layers.Dense(1, activation=None)

    def call(self, x):
        return self.out(self.h1(x))

model = Model()
loss_object = tf.keras.losses.MeanSquaredError()
train_loss = tf.keras.metrics.Mean(name='train_loss')
optimizer = tf.keras.optimizers.Adam()

@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = loss_object(y, model(x))
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss(loss)

# Normalize data.
x /= y.max()
y /= y.max()

data_set = tf.data.Dataset.from_tensor_slices((x[:, None], y[:, None]))
train_ds = data_set.shuffle(len(x)).batch(64)

loss_history = []
for epoch in range(5000):
    for train_x, train_y in train_ds:
        train_step(train_x, train_y)
    loss_history.append(train_loss.result())
    print(f'Epoch {epoch}, loss: {loss_history[-1]}')
    train_loss.reset_states()

plt.figure()
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.plot(loss_history)

plt.figure()
plt.plot(x, y, label='original')
plt.plot(x, model(list(data_set.batch(len(x)))[0][0]), label='predicted')
plt.legend()
plt.show()
The situation turned out to be less straightforward than I had assumed. These were the mistakes:
1) I was finding the normalizing factor correctly, but had to change this:
for (int i = 0; i < numTrainingSets; i++)
{
//Find and update Normalization factor(as shown in the question)
//Normalize the training example
}
to
for (int i = 0; i < numTrainingSets; i++)
{
//Find Normalization factor (as shown in the question)
}
for (int i = 0; i < numTrainingSets; i++)
{
//Normalize the training example
}
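The two-pass structure above can be sketched as follows. This is a minimal illustration rather than the original program: `findMax` and `normalize` are hypothetical helper names, and the data is assumed to live in `std::vector` instead of the raw arrays used in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Pass 1: scan the whole data set and return its largest value.
double findMax(const Matrix& data) {
    assert(!data.empty() && !data[0].empty());
    double maxx = data[0][0];
    for (const auto& row : data)
        for (double v : row)
            maxx = std::max(maxx, v);
    return maxx;
}

// Pass 2: divide every value by the factor found in pass 1.
void normalize(Matrix& data, double maxx) {
    assert(maxx != 0.0);
    for (auto& row : data)
        for (double& v : row)
            v /= maxx;
}
```

To mirror the question, one would take the larger of `findMax(inputs)` and `findMax(outputs)` as the single factor and then call `normalize` on both sets, so that no example is divided by a maximum that is still changing.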
Also, the validation set was earlier generated as :
int numTestSets = 500;
for (int i = 0; i < numTestSets; i++)
{
//Generate data
double p = (2*PI*i/numTestSets)/MAXX;
//And other steps...
}
whereas the training data was generated with numTrainingSets = 100. Hence the p values generated for the training set and those generated for the validation set lie in different ranges. So I had to set numTestSets = numTrainingSets.
Lastly,
Is this normalizing right?
I had been wrongly normalizing the actual result too!
As shown in the question:
double p = (2*PI*i/numTestSets)/MAXX;
x.push_back(p); //FORMS THE X-AXIS FOR PLOTTING
///Actual Result
double res = 0.2+0.4*pow(p, 2)+0.3*p*sin(15*p)+0.05*cos(50*p);
Notice: the p used to generate this actual result has been normalized (unnecessarily).
This is the final result after resolving these issues...
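A sketch of the corrected evaluation loop, under stated assumptions: `predict` here is a stand-in passed as a function (the real network from the question is not shown), and `target`, `Curves` and `evaluate` are illustrative names. The point is only that the expected value is computed from the raw input, while the network sees the normalized input and its output is scaled back.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Target function from the question, evaluated on the raw (un-normalized) input.
double target(double p) {
    return 0.2 + 0.4 * p * p + 0.3 * p * std::sin(15 * p) + 0.05 * std::cos(50 * p);
}

struct Curves {
    std::vector<double> x, expected, predicted;
};

// `predict` is a placeholder for the trained network: it takes a
// normalized input and returns a normalized output.
Curves evaluate(int numTestSets, double maxx,
                const std::function<double(double)>& predict) {
    assert(numTestSets > 0 && maxx != 0.0);
    const double PI = std::acos(-1.0);
    Curves c;
    for (int i = 0; i < numTestSets; i++) {
        double p = 2 * PI * i / numTestSets;              // raw input
        c.x.push_back(p);                                 // x-axis for plotting
        c.expected.push_back(target(p));                  // actual result from the raw input
        c.predicted.push_back(maxx * predict(p / maxx));  // normalize in, de-normalize out
    }
    return c;
}
```

With a perfect stand-in for the network, the two curves coincide, which is a quick way to confirm that the normalization and de-normalization steps cancel out.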
I want to create a 1D plot from an image. Then I want to determine the maxima and their distances to each other in C++.
I am looking for some tips on how I could approach this.
I load the image as a cv::Mat. I searched OpenCV but only found the histogram function, which is not what I need. I want a cross section of the image, from left to right.
Does anyone have an idea?
Well I have the following picture:
From this I want to create a 1D plot like in the following picture (I created the plot in ImageJ).
Here you can see the maxima (I could refine it with "smooth").
I want to determine the positions of these maxima and then the distances between them.
I have to get to the 1D plot somehow. I suppose I can find the maxima with a derivative?
++++++++++ UPDATE ++++++++++
I have now written this to get a 1D plot:
cv::Mat img = cv::imread(imgFile.toStdString(), cv::IMREAD_ANYDEPTH | cv::IMREAD_COLOR);
cv::cvtColor(img, img, cv::COLOR_BGR2GRAY);
uint8_t* data = img.data;
int width = img.cols;
int height = img.rows;
int stride = img.step;
std::vector<double> vPlot(width, 0);
for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
        uint8_t val = data[i * stride + j];
        vPlot[j] = vPlot[j] + val;
    }
}
std::ofstream file;
file.open("path\\plot.csv");
for (int i = 0; i < vPlot.size(); i++) {
    file << vPlot[i];
    file << ";";
}
file.close();
When I plot this in Excel I get this:
That does not look as smooth as in ImageJ. Did I do something wrong?
I need it to look like the ImageJ plot, i.e. smoother.
OK, I got it:
for (int i = 0; i < vPlot.size(); i++) {
    vPlot[i] = vPlot[i] / height;
}
OK, but I don't know how to get the maxima and their distances.
Once I have the local maxima (I don't know how to find them), I can calculate the distances between them from the indices of the vector elements.
Does anybody have an idea how to get the local maxima out of the vector that I plotted above?
I have now written this to find the maxima:
// find maxima
std::vector<int> idxMax;
int flag = 0;
for (int i = 1; i < avg.size(); i++) {
    double diff = avg[i] - avg[i-1];
    if (diff < 0) {
        if (flag > 0) {
            idxMax.push_back(i);
            flag = -1;
        }
    }
    if (diff >= 0) {
        if (flag <= 0) {
            flag = 1;
        }
    }
}
But more maxima are found than wanted. The length of the vector varies, and so does the number of peaks. The peaks can be close together or far apart. They are also not always the same height, as can be seen in the picture.
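One common way to suppress spurious maxima is to smooth the profile first and then accept only local maxima above a minimum height. The following is a minimal sketch under those assumptions, not a tuned solution: `smooth`, `findPeaks`, `peakDistances`, `radius` and `minHeight` are illustrative names, and suitable values depend on the images.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Centered moving average; the window shrinks at the borders.
std::vector<double> smooth(const std::vector<double>& v, int radius) {
    assert(radius >= 0);
    const int n = static_cast<int>(v.size());
    std::vector<double> out(v.size(), 0.0);
    for (int i = 0; i < n; i++) {
        double sum = 0.0;
        int count = 0;
        for (int k = i - radius; k <= i + radius; k++) {
            if (k >= 0 && k < n) { sum += v[k]; count++; }
        }
        out[i] = sum / count;  // count >= 1 because k == i is always in range
    }
    return out;
}

// Indices strictly higher than both neighbours and at least minHeight.
// Flat plateaus are not reported by this simple test.
std::vector<int> findPeaks(const std::vector<double>& v, double minHeight) {
    std::vector<int> peaks;
    for (std::size_t i = 1; i + 1 < v.size(); i++) {
        if (v[i] > v[i - 1] && v[i] > v[i + 1] && v[i] >= minHeight)
            peaks.push_back(static_cast<int>(i));
    }
    return peaks;
}

// Distances between consecutive peaks, measured in vector elements (pixels).
std::vector<int> peakDistances(const std::vector<int>& peaks) {
    std::vector<int> d;
    for (std::size_t i = 1; i < peaks.size(); i++)
        d.push_back(peaks[i] - peaks[i - 1]);
    return d;
}
```

A call such as `peakDistances(findPeaks(smooth(vPlot, 3), threshold))` would then give the spacing between the remaining peaks; a larger `radius` merges nearby bumps into one peak, at the cost of blurring peaks that really are close together.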
So, I am trying to create my own neural network. Something really simple.
My input is the MNIST database of handwritten digits.
Input: 28*28 neurons (Images).
Output: 10 neurons (0/1/2/3/4/5/6/7/8/9).
So my network is as follows: 28*28 -> 15 -> 10.
The problem lies in my estimated output. It seems I have a gradient explosion.
The output given by my network is here: https://pastebin.com/EFpBGAZd
As you can see, the first estimated output is wrong, so my network adjusts the weights through backpropagation. But it doesn't seem to update the weights correctly: the estimated output is too high compared to the second-highest value.
So the first estimated output remains the highest estimated output for the subsequent training examples (13 in my example).
My backpropagation code:
VOID BP(NETWORK &Network, double Target[OUTPUT_NEURONS]) {
double DeltaETotalOut = 0;
double DeltaOutNet = 0;
double DeltaErrorNet = 0;
double DeltaETotalWeight = 0;
double Error = 0;
double ErrorTotal = 0;
double OutputUpdatedWeights[OUTPUT_NEURONS*HIDDEN_NEURONS] = { 0 };
unsigned int _indexOutput = 0;
double fNetworkError = 0;
//Calculate Error
for (int i = 0; i < OUTPUT_NEURONS; i++) {
fNetworkError += 0.5*pow(Target[i] - Network.OLayer.Cell[i].Output, 2);
}
Network.Error = fNetworkError;
//Output Neurons
for (int i = 0; i < OUTPUT_NEURONS; i++) {
DeltaETotalOut = -(Target[i] - Network.OLayer.Cell[i].Output);
DeltaOutNet = ActivateSigmoidPrime(Network.OLayer.Cell[i].Output);
for (int j = 0; j < HIDDEN_NEURONS; j++) {
OutputUpdatedWeights[_indexOutput] = Network.OLayer.Cell[i].Weight[j] - 0.5 * DeltaOutNet*DeltaETotalOut* Network.HLayer.Cell[j].Output;
_indexOutput++;
}
}
//Hidden Neurons
for (int i = 0; i < HIDDEN_NEURONS; i++) {
ErrorTotal = 0;
for (int k = 0; k < OUTPUT_NEURONS; k++) {
DeltaETotalOut = -(Target[k] - Network.OLayer.Cell[k].Output);
DeltaOutNet = ActivateSigmoidPrime(Network.OLayer.Cell[k].Output);
DeltaErrorNet = DeltaETotalOut * DeltaOutNet;
Error = DeltaErrorNet * Network.OLayer.Cell[k].Weight[i];
ErrorTotal += Error;
}
DeltaOutNet = ActivateSigmoidPrime(Network.HLayer.Cell[i].Output);
for (int j = 0; j < INPUT_NEURONS; j++) {
DeltaETotalWeight = ErrorTotal * DeltaOutNet*Network.ILayer.Image[j];
Network.HLayer.Cell[i].Weight[j] -= 0.5 * DeltaETotalWeight;
}
}
//Update Weights
_indexOutput = 0;
for (int i = 0; i < OUTPUT_NEURONS; i++) {
for (int j = 0; j < HIDDEN_NEURONS; j++) {
Network.OLayer.Cell[i].Weight[j] = OutputUpdatedWeights[_indexOutput];
_indexOutput++;
}
}}
How can I solve this issue?
I didn't work on the hidden layer or the biases; could that be the cause?
Thanks
Well, backpropagation is notoriously hard to implement and especially to debug (I guess everyone who has done it can relate), and it's even harder to debug code written by others.
After a quick look over your code, I'm quite surprised that you calculate a negative delta term. Are you using ReLU or a sigmoid function? I'm quite sure there is more. But I'd suggest you stay away from MNIST until you get your network to solve XOR.
I've written a summary in pseudocode of how to implement backpropagation. I'm sure you'll be able to translate it into C++ quite easily.
Strange convergence in simple Neural Network
In my experience neural networks should really be implemented with matrix operations. This will make your code faster and easier to debug.
The way to debug backpropagation is to use finite difference. For a loss function J(theta) we can approximate the gradient in each dimension with (J(theta + epsilon*d) - J(theta))/epsilon with d a one-hot vector representing one dimension (note the similarity to a derivative).
https://en.wikipedia.org/wiki/Finite_difference_method
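The finite-difference check described above can be sketched like this. It is a generic illustration, not tied to the network in the question: `numericalGradient` and `maxGradientError` are hypothetical helper names, the loss `J` is passed in as a function of a flat parameter vector, and the default `epsilon` is an assumed value.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Forward-difference estimate of dJ/dtheta_i for every parameter i.
std::vector<double> numericalGradient(
        const std::function<double(const std::vector<double>&)>& J,
        std::vector<double> theta, double epsilon = 1e-6) {
    assert(epsilon > 0.0);
    std::vector<double> grad(theta.size());
    const double base = J(theta);
    for (std::size_t i = 0; i < theta.size(); i++) {
        const double saved = theta[i];
        theta[i] = saved + epsilon;             // theta + epsilon * d, d one-hot at i
        grad[i] = (J(theta) - base) / epsilon;
        theta[i] = saved;                       // restore before the next dimension
    }
    return grad;
}

// Largest absolute difference between an analytic and a numerical gradient.
double maxGradientError(const std::vector<double>& analytic,
                        const std::vector<double>& numeric) {
    assert(analytic.size() == numeric.size());
    double worst = 0.0;
    for (std::size_t i = 0; i < analytic.size(); i++)
        worst = std::max(worst, std::fabs(analytic[i] - numeric[i]));
    return worst;
}
```

In practice one would flatten the network weights into `theta`, let `J` run a forward pass on a fixed example, and compare the result with the gradient produced by backpropagation. A large mismatch in a particular index points to the layer whose update is wrong.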
I wrote a program that loads, saves, and performs the fft and ifft on black and white png images. After much debugging headache, I finally got some coherent output only to find that it distorted the original image.
input:
fft:
ifft:
As far as I have tested, the pixel data in each array is stored and converted correctly. Pixels are stored in two arrays, 'data' which contains the b/w value of each pixel and 'complex_data' which is twice as long as 'data' and stores real b/w value and imaginary parts of each pixel in alternating indices. My fft algorithm operates on an array structured like 'complex_data'. After code to read commands from the user, here's the code in question:
if (cmd == "fft")
{
if (height > width) size = height;
else size = width;
N = (int)pow(2.0, ceil(log((double)size)/log(2.0)));
temp_data = (double*) malloc(sizeof(double) * width * 2); //array to hold each row of the image for processing in FFT()
for (i = 0; i < (int) height; i++)
{
for (j = 0; j < (int) width; j++)
{
temp_data[j*2] = complex_data[(i*width*2)+(j*2)];
temp_data[j*2+1] = complex_data[(i*width*2)+(j*2)+1];
}
FFT(temp_data, N, 1);
for (j = 0; j < (int) width; j++)
{
complex_data[(i*width*2)+(j*2)] = temp_data[j*2];
complex_data[(i*width*2)+(j*2)+1] = temp_data[j*2+1];
}
}
transpose(complex_data, width, height); //tested
free(temp_data);
temp_data = (double*) malloc(sizeof(double) * height * 2);
for (i = 0; i < (int) width; i++)
{
for (j = 0; j < (int) height; j++)
{
temp_data[j*2] = complex_data[(i*height*2)+(j*2)];
temp_data[j*2+1] = complex_data[(i*height*2)+(j*2)+1];
}
FFT(temp_data, N, 1);
for (j = 0; j < (int) height; j++)
{
complex_data[(i*height*2)+(j*2)] = temp_data[j*2];
complex_data[(i*height*2)+(j*2)+1] = temp_data[j*2+1];
}
}
transpose(complex_data, height, width);
free(temp_data);
free(data);
data = complex_to_real(complex_data, image.size()/4); //tested
image = bw_data_to_vector(data, image.size()/4); //tested
cout << "*** fft success ***" << endl << endl;
void FFT(double* data, unsigned long nn, int f_or_b){ // f_or_b is 1 for fft, -1 for ifft
unsigned long n, mmax, m, j, istep, i;
double wtemp, w_real, wp_real, wp_imaginary, w_imaginary, theta;
double temp_real, temp_imaginary;
// reverse-binary reindexing to separate even and odd indices
// and to allow us to compute the FFT in place
n = nn<<1;
j = 1;
for (i = 1; i < n; i += 2) {
if (j > i) {
swap(data[j-1], data[i-1]);
swap(data[j], data[i]);
}
m = nn;
while (m >= 2 && j > m) {
j -= m;
m >>= 1;
}
j += m;
};
// here begins the Danielson-Lanczos section
mmax = 2;
while (n > mmax) {
istep = mmax<<1;
theta = f_or_b * (2 * M_PI/mmax);
wtemp = sin(0.5 * theta);
wp_real = -2.0 * wtemp * wtemp;
wp_imaginary = sin(theta);
w_real = 1.0;
w_imaginary = 0.0;
for (m = 1; m < mmax; m += 2) {
for (i = m; i <= n; i += istep) {
j = i + mmax;
temp_real = w_real * data[j-1] - w_imaginary * data[j];
temp_imaginary = w_real * data[j] + w_imaginary * data[j-1];
data[j-1] = data[i-1] - temp_real;
data[j] = data[i] - temp_imaginary;
data[i-1] += temp_real;
data[i] += temp_imaginary;
}
wtemp = w_real;
w_real += w_real * wp_real - w_imaginary * wp_imaginary;
w_imaginary += w_imaginary * wp_real + wtemp * wp_imaginary;
}
mmax=istep;
}}
My ifft is the same only with the f_or_b set to -1 instead of 1. My program calls FFT() on each row, transposes the image, calls FFT() on each row again, then transposes back. Is there maybe an error with my indexing?
Not an actual answer, as this question is debug-only, so here are some hints instead:
your results are really bad
it should look like this:
first line is the actual DFFT result
Re,Im,Power is amplified by a constant otherwise you would see a black image
the last image is IDFFT of the original not amplified Re,IM result
the second line is the same, but the DFFT result is wrapped by half the image size in both x and y to match the common results in most DIP/CV texts
As you can see if you IDFFT back the wrapped results the result is not correct (checker board mask)
You have just single image as DFFT result
is it power spectrum?
or did you forget to include the imaginary part? Only when viewing, or perhaps also somewhere in the computation?
is your 1D DFFT working?
for real data the result should be symmetric
check the links from my comment and compare the results for some sample 1D array
debug/repair your 1D FFT first and only then move to the next level
do not forget to test Real and complex data ...
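The symmetry hint can be turned into a small test. The sketch below uses a naive O(N^2) DFT as a reference, assuming the common convention with a negative exponent for the forward transform (the sign does not affect the symmetry property); `naiveDFT` and `isConjugateSymmetric` are illustrative names, not part of the code in the question.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive O(N^2) DFT, meant only as a reference to compare a fast version against.
std::vector<cd> naiveDFT(const std::vector<cd>& x) {
    const std::size_t N = x.size();
    const double PI = std::acos(-1.0);
    std::vector<cd> X(N);
    for (std::size_t k = 0; k < N; k++)
        for (std::size_t n = 0; n < N; n++)
            X[k] += x[n] * std::polar(1.0, -2.0 * PI * double(k * n) / double(N));
    return X;
}

// For real input, X[k] should equal the complex conjugate of X[N-k].
bool isConjugateSymmetric(const std::vector<cd>& X, double tol = 1e-9) {
    assert(tol >= 0.0);
    const std::size_t N = X.size();
    for (std::size_t k = 1; k < N; k++)
        if (std::abs(X[k] - std::conj(X[N - k])) > tol) return false;
    return true;
}
```

Running the same short real-valued array through both the reference and the fast routine, and checking the symmetry on each, is a cheap way to spot a wrong index or sign before moving to 2D.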
your IDFFT looks BW (no gray) saturated
so did you amplify the DFFT results to see the image and used that for IDFFT instead of the original DFFT result?
also check if you do not round to integers somewhere along the computation
beware of (I)DFFT overflows/underflows
If your image pixel intensities are large and the image resolution is high too, then your computation could lose precision. I have never seen this in images, but if your image is HDR then it is possible. This is a common problem with convolution computed by DFFT for big polynomials.
Thank you everyone for your opinions. All that stuff about memory corruption, while it makes a point, is not the root of the problem. The sizes of data I'm mallocing are not overly large, and I am freeing them in the right places. I had a lot of practice with this while learning C. The problem was not the fft algorithm either, nor even my 2D implementation of it.
All I missed was the scaling by 1/(M*N) at the very end of my ifft code. Because the image is 512x512, I needed to scale my ifft output by 1/(512*512). Also, my fft looks like white noise because the pixel data was not rescaled to fit between 0 and 255.
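The effect of the missing factor can be reproduced on a tiny image. This is a sketch with a naive, unscaled DFT standing in for the FFT routine from the question; `dft1d`, `dft2d` and `idft2d` are illustrative names, and the sign convention (-1 forward, +1 inverse) is an assumption of this example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;
using Image = std::vector<std::vector<cd>>;

// Unscaled naive 1D DFT; sign = -1 for forward, +1 for inverse.
std::vector<cd> dft1d(const std::vector<cd>& x, int sign) {
    const std::size_t N = x.size();
    const double PI = std::acos(-1.0);
    std::vector<cd> X(N);
    for (std::size_t k = 0; k < N; k++)
        for (std::size_t n = 0; n < N; n++)
            X[k] += x[n] * std::polar(1.0, sign * 2.0 * PI * double(k * n) / double(N));
    return X;
}

// Unscaled 2D DFT: transform every row, then every column.
Image dft2d(Image img, int sign) {
    assert(!img.empty() && !img[0].empty());
    const std::size_t M = img.size(), N = img[0].size();
    for (auto& row : img) row = dft1d(row, sign);
    for (std::size_t c = 0; c < N; c++) {
        std::vector<cd> col(M);
        for (std::size_t r = 0; r < M; r++) col[r] = img[r][c];
        col = dft1d(col, sign);
        for (std::size_t r = 0; r < M; r++) img[r][c] = col[r];
    }
    return img;
}

// Inverse transform including the 1/(M*N) factor.
Image idft2d(const Image& spectrum) {
    Image img = dft2d(spectrum, +1);
    const double scale = 1.0 / double(img.size() * img[0].size());
    for (auto& row : img)
        for (auto& v : row) v *= scale;
    return img;
}
```

Without the scaling, the round trip returns every pixel multiplied by M*N, which for a 512x512 image saturates any 8-bit display range and would explain a pure black-and-white result.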
Suggest you look at the article http://www.yolinux.com/TUTORIALS/C++MemoryCorruptionAndMemoryLeaks.html
Christophe has a good point but he is wrong about it not being related to the problem because it seems that in modern times using malloc instead of new()/free() does not initialise memory or select best data type which would result in all problems listed below:-
Possible causes are:
Sign of a number changing somewhere, I have seen similar issues when a platform invoke has been used on a dll and a value is passed by value instead of reference. It is caused by memory not necessarily being empty so when your image data enters it will have boolean maths performed on its values. I would suggest that you make sure memory is empty before you put your image data there.
Memory rotating right (ROR in assembly language) or left (ROL). This will occur if data types are being used which do not necessarily match, e.g. a signed value entering an unsigned data type, or if the number of bits differs between one variable and another.
Data being lost due to an unsigned value entering a signed variable. Outcomes are 1 bit being lost because it will be used to determine negative or positive, or at extremes if twos complement takes place the number will become inverted in meaning, look for twos complement on wikipedia.
Also see how memory should be cleared/assigned before use. http://www.cprogramming.com/tutorial/memory_debugging_parallel_inspector.html
I want to segment a car plate to get separate characters.
I found an article where such segmentation is performed using brightness histograms (as I understand it, the sum of all non-zero pixels).
How can I calculate such a histogram? I would really appreciate any help!
std::vector<int> computeColumnHistogram(const cv::Mat& in) {
    // Assumes a single-channel 8-bit image (CV_8UC1).
    std::vector<int> histogram(in.cols, 0); // Create a zeroed histogram of the necessary size
    for (int y = 0; y < in.rows; y++) {
        const uchar* p_row = in.ptr<uchar>(y); // Get a pointer to the y-th row of the image
        for (int x = 0; x < in.cols; x++)
            histogram[x] += p_row[x]; // Update histogram value for this image column
    }
    // Normalize if you want (you'll get the average value per column):
    // for (int x = 0; x < in.cols; x++)
    //     histogram[x] /= in.rows;
    return histogram;
}
Or use reduce as suggested by Berak, either calling
cv::reduce(in, out, 0, CV_REDUCE_AVG);
or
cv::reduce(in, out, 0, CV_REDUCE_SUM, CV_32S);
out is a cv::Mat, and it will have a single row.
I have a kernel filter that I generated and I want to apply it to my image, but I could not get the right result by doing this:
I could use a different method as well. Since I am not too familiar with OpenCV, I need help. Thanks.
channels[c] is the image that was read;
int size = 5; // Gaussian filter box side size
double gauss[5][5];
int sidestp = (size - 1) / 2;
// I have a function to generate the gaussiankernel filter
float sum = 0;
for (int x = 1; x < channels[c].cols - 1; x++){
for (int y = 1; y < channels[c].rows - 1; y++){
for (int i = -size; i <= size; i++){
for (int j = -sidestp; j <= sidestp; j++){
sum = sum + gauss[i + sidestp][j + sidestp] * channels[c].at<uchar>(x - i, y - j);
}
}
result.at<uchar>(y, x) = sum;
}
}
OpenCV has an inbuilt function filter2D that does this convolution for you.
You need to provide your source and destination images, along with the custom kernel (as a Mat), and a few more arguments. See this if it still bothers you.
Just to add to the previous answer: since you are performing a Gaussian blur, you can use the OpenCV GaussianBlur function (check here). Unlike filter2D, it takes the standard deviations as input parameters.
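For reference, applying a kernel by hand amounts to the loop below. This is a plain C++ sketch without OpenCV, so `Grid` and `applyKernel` are illustrative names; it computes a correlation (which equals convolution for a symmetric kernel such as a Gaussian) and handles borders by clamping coordinates, which is a simplification and may differ from the border mode a library uses.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Apply a square, odd-sized kernel to every pixel. Coordinates that fall
// outside the image are clamped to the nearest valid pixel.
Grid applyKernel(const Grid& img, const Grid& kernel) {
    assert(!img.empty() && !img[0].empty());
    assert(!kernel.empty() && kernel.size() % 2 == 1);
    const int rows = static_cast<int>(img.size());
    const int cols = static_cast<int>(img[0].size());
    const int half = (static_cast<int>(kernel.size()) - 1) / 2;
    Grid result(rows, std::vector<double>(cols, 0.0));
    for (int y = 0; y < rows; y++) {
        for (int x = 0; x < cols; x++) {
            double sum = 0.0;  // reset for every output pixel
            for (int i = -half; i <= half; i++) {
                for (int j = -half; j <= half; j++) {
                    const int yy = std::min(std::max(y + i, 0), rows - 1);
                    const int xx = std::min(std::max(x + j, 0), cols - 1);
                    sum += kernel[i + half][j + half] * img[yy][xx];
                }
            }
            result[y][x] = sum;  // row index first, then column
        }
    }
    return result;
}
```

Three details matter here: the accumulator starts at zero for each pixel, both kernel offsets run from `-half` to `half`, and the row index comes before the column index on both the read and the write.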