Artificial Neural Network with large inputs & outputs - C++

I've been following Dave Miller's ANN C++ Tutorial, and I've been having some problems getting it to function as expected.
You can view the code I'm working with here. It's an XCode project, but includes the main.cpp and data set file.
Previously, this program would only give outputs between -1 and 1, presumably due to the use of the tanh activation function. I've rescaled the data so I can feed in my much larger values and still get valid outputs: I simply multiply the input values by 0.0001 and multiply the output values by 10000.
The training data I'm using is the included CSV file. The last column is the expected output, the rest are inputs. Am I using the wrong mathematical function for these data?
Would you say that this is actually learning? This whole thing has stressed me out so much; I understand the theory behind ANNs but just can't seem to implement it from scratch myself.
The net recent average error definitely gets smaller and smaller, which to me suggests that it is learning.
I'm sorry if I haven't explained myself very well; I'm very new to ANNs and this whole thing is very confusing to me. My university lecturers are useless when it comes to the practical side; they only teach us the theory.
I've been playing around with the eta and alpha values, along with the number of hidden layers.

You explained yourself quite well. If the net recent average error is getting lower and lower, it probably means that the network is actually learning, but here is my suggestion for how to be completely sure.
Take your CSV file and split it into two files: one containing about 10% of all the data, the other containing the remaining 90%.
Start with an untrained network, run the 10% file through the net, and for each line save the difference between the actual output and the expected result.
Then train the network using only the 90% file, and finally re-run the 10% file through the net and compare the differences you had on the first run with the latest ones.
You should find that the new results are much closer to the expected values than the first time, which would be solid proof that your network is learning.
Does this make sense? If not, please share some code or send me a link to the exercise you are running, and I will try to explain it in code.
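Here is a minimal C++ sketch of that comparison, assuming the Net class from Miller's tutorial with its feedForward and getResults methods; the averageError helper and the single-output error measure are illustrative assumptions, not part of the tutorial:

#include <cmath>
#include <vector>

// Mean absolute error of the net over a hold-out set. Assumes the
// tutorial's Net class and its feedForward/getResults methods; this
// helper and the single-output assumption are illustrative.
double averageError(Net &net,
                    const std::vector<std::vector<double>> &inputs,
                    const std::vector<std::vector<double>> &targets)
{
    double sum = 0.0;
    for (size_t i = 0; i < inputs.size(); ++i) {
        std::vector<double> results;
        net.feedForward(inputs[i]);  // forward pass only, no weight update
        net.getResults(results);
        sum += std::fabs(results[0] - targets[i][0]);
    }
    return sum / inputs.size();
}

Measure this once on the 10% file with the untrained net, train on the 90% file (feedForward plus backProp per sample), then measure again on the same 10% file; the second number should be clearly smaller.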

Related

Extracting number of bits in a macroblock from VVC VTM reference software

I am new to VVC and I am going through the reference software's code trying to understand it. I have encoded and decoded videos using the reference software. I want to extract the bitstream from it; specifically, I want to know the number of bits in each macroblock. I am not sure which class I should be working with; for now I am looking at mv.cpp, QuantRDOQ.cpp, and TrQuant.cpp.
I am afraid of messing the code up completely, and I don't know where to add what lines of code.
[Linked images: "Result after calculating and displaying the difference" (Start and Final)]
P.S. The linked pictures are from after my problem was solved; I attached them because of my query in the comments.
As the error says, getNumBins() is not supported by the CABAC estimator. So you should make sure you call it only during the actual encoding, and not during the RDO (rate-distortion optimization) search.
This should do the job:
if (isEncoding())
{
  before = m_BinEncoder.getNumBins();
}
coding_unit( cu, partitioner, cuCtx );
if (isEncoding())
{
  after = m_BinEncoder.getNumBins();
  diff = after - before;
}
The simplest solution that I'm aware of is at the encoder side.
The trick is to compute the difference in the number of written bits "before" and "after" encoding a Coding Unit (CU) (aka macroblock). This stuff happens in the CABACWriter.cpp file.
You should go to the coding_tree() function, where the coding_unit() function is called; coding_unit() is responsible for context-coding all syntax elements in the current CU.
There, you may call the function getNumBins() twice: once before and once after coding_unit(). The difference of the two values should do the job for you.
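For reference, here is the snippet above with the surrounding declarations filled in, as it might look inside coding_tree(); the counter variables and the printf line are illustrative additions, only isEncoding(), m_BinEncoder.getNumBins(), and the coding_unit() call come from the VTM source:

uint64_t before = 0; // illustrative counter, not part of VTM
if (isEncoding())
{
  before = m_BinEncoder.getNumBins();
}
coding_unit( cu, partitioner, cuCtx );
if (isEncoding())
{
  // bins spent context-coding this CU
  uint64_t cuBits = m_BinEncoder.getNumBins() - before;
  printf( "CU bits: %llu\n", (unsigned long long) cuBits );
}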

Neural Network gives same output for different inputs, doesn't learn

I have a neural network written in standard C++11 which I believe follows the back-propagation algorithm correctly (based on this). If I output the error at each step of the algorithm, however, it seems to oscillate without damping over time. I've tried removing momentum entirely and choosing a very small learning rate (0.02), but it still oscillates at roughly the same amplitude per network (with each network having a different amplitude within a certain range).
Further, all inputs result in the same output (a problem I found posted here before, although for a different language. The author also mentions that he never got it working.)
The code can be found here.
To summarize how I have implemented the network (a skeletal sketch follows this list):
Neurons hold the current weights to the neurons ahead of them, previous changes to those weights, and the sum of all inputs.
Neurons can have their value (sum of all inputs) accessed, or can output the result of passing said value through a given activation function.
NeuronLayers act as Neuron containers and set up the actual connections to the next layer.
NeuronLayers can send the actual outputs to the next layer (instead of pulling from the previous).
FFNeuralNetworks act as containers for NeuronLayers and manage forward-propagation, error calculation, and back-propagation. They can also simply process inputs.
The input layer of an FFNeuralNetwork sends its weighted values (value * weight) to the next layer. Each neuron in each layer afterwards outputs the weighted result of the activation function unless it is a bias, or the layer is the output layer (biases output the weighted value, the output layer simply passes the sum through the activation function).
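Here is a skeletal C++ sketch of the structure that summary describes; member names and types are illustrative guesses based on the description above, with method bodies omitted:

#include <vector>

// Skeletal sketch of the described design; details are illustrative.
struct Neuron {
    std::vector<double> weights;       // weights to the neurons ahead
    std::vector<double> weightDeltas;  // previous weight changes (momentum)
    double value = 0.0;                // sum of all inputs received

    double output(double (*activation)(double)) const {
        return activation(value);      // pass the summed value through the activation
    }
};

struct NeuronLayer {
    std::vector<Neuron> neurons;
    void feedForward(NeuronLayer &next);  // push weighted outputs to the next layer
};

struct FFNeuralNetwork {
    std::vector<NeuronLayer> layers;
    std::vector<double> process(const std::vector<double> &inputs);  // forward propagation
    void backPropagate(const std::vector<double> &targets);          // error + weight updates
};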
Have I made a fundamental mistake in the implementation (a misunderstanding of the theory), or is there some simple bug I haven't found yet? If it is a bug, where might it be?
Why might the error oscillate by the amount it does (around ±(0.2 ± learning rate)) even with a very low learning rate? Why might all the outputs be the same, no matter the input?
I've gone over most of it so much that I might be skipping over something, but I think I may have a plain misunderstanding of the theory.
It turns out I was just staring at the FFNeuralNetwork parts too much and accidentally used the wrong input set to confirm the correctness of the network. It actually does work correctly with the right learning rate, momentum, and number of iterations.
Specifically, in main, I was using the full inputs array instead of the smaller array named in to test the outputs of the network.

Some confusion over Numpy + Scipy + matplotlib Spectrum Analyzer code

I've been attempting to understand the code at the bottom of http://www.frank-zalkow.de/en/code-snippets/create-audio-spectrograms-with-python.html, though sadly I haven't been getting anywhere with it. I don't think I'm expected to understand most of the code, as I have limited experience with FFTs, but unfortunately I'm also having trouble understanding how the graph is generated. A trial-and-error approach is also yielding very limited progress, because my computer lags heavily and each graph takes a relatively long time to generate.
With that being said, I need a way to scale the graph so that it only displays values up to 5000 Hz, though still on a logarithmic scale. I'd also like to understand how the wav file is sampled, and what values I can edit in order to take more samples per second. Can somebody explain how both of these points work, and how I can edit the code in order to fulfill these requirements?
Hm, this code is by me, so I'll gladly help you understand it. It's maybe not best practice and there may be several ways to improve it – suggestions are welcome. But at least it worked for me.
The function stft does a standard short-time Fourier transform of an audio signal with the help of numpy strides. The function logscale_spec takes an stft and scales it logarithmically. This is maybe a bit dirty and there must be a better way to do it, but it worked for me. plotstft is the function that finally reads a wave file via scipy.io.wavfile, combines the prior two functions, and makes a plot with matplotlib's imshow. If you have a mono wave file you should be able to just call plotstft("/path/to/mono.wav").
That was an overview – if I should explain some things in more detail, just say so.
To your questions. To leave out some frequency values: you can get the frequency values of the fft with np.fft.fftfreq(binsize, 1./sr). You just have to find the index of your cutoff value and leave out the values of the stft above it.
I don't understand your second question... You can have a look at all the samples of your wave file like this:
>>> import scipy.io.wavfile as wav
>>> x = wav.read("/path/to/file.wav")
>>> x
(44100, array([4554752, 4848551, 3981874, ..., 2384923, 2040309, 294912], dtype=int32))
>>> x[1]
array([4554752, 4848551, 3981874, ..., 2384923, 2040309, 294912], dtype=int32)

Neural Network seems to work fine until used for processing data (all of the results are practically the same)

I have recently implemented a typical 3 layer neural network (input -> hidden -> output) and I'm using the sigmoid function for activation. So far, the host program has 3 modes:
Creation, which seems to work fine. It creates a network with a specified number of input, hidden and output neurons, initializes the weights to either random values or zero.
Training, which loads a dataset, computes the output of the network then backpropagates the error and updates the weights. As far as I can tell, this works ok. The weights change, but not extremely, after training on the dataset.
Processing, which seems to run ok. However, the output for the dataset which was used for training, or any other dataset for that matter, is very bad. It's usually either just a continuous stream of 1's with an occasional 0.999999, or every output value for every input is 0.9999 with only the last digits differing between inputs. As far as I could tell there was no correlation between those last two digits and what was supposed to be output.
How should I go about figuring out what's not working right?
You need to find a set of parameters (number of neurons, learning rate, number of iterations for training) that works well for classifying previously unseen data. People often achieve this by separating their data into three groups: training, validation and testing.
Whatever you decide to do, just remember that it really doesn't make sense to be testing on the same data with which you trained, because any halfway reasonable classification method should be getting close to everything right under such a setup.
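For illustration, here is a minimal C++ sketch of such a three-way split; the Sample type and the 60/20/20 proportions are arbitrary choices, not something from the question:

#include <algorithm>
#include <random>
#include <vector>

// One training example; the field layout is an illustrative guess.
struct Sample { std::vector<double> inputs, targets; };

// Shuffle the data and split it 60/20/20 into train/validation/test.
void splitData(std::vector<Sample> data,
               std::vector<Sample> &train,
               std::vector<Sample> &validation,
               std::vector<Sample> &test)
{
    std::mt19937 rng(42);  // fixed seed so the split is reproducible
    std::shuffle(data.begin(), data.end(), rng);
    const size_t nTrain = data.size() * 60 / 100;
    const size_t nVal   = data.size() * 20 / 100;
    train.assign(data.begin(), data.begin() + nTrain);
    validation.assign(data.begin() + nTrain, data.begin() + nTrain + nVal);
    test.assign(data.begin() + nTrain + nVal, data.end());
}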

How to process multiple inputs in c++ for artificial neural network

How does C++ process multiple inputs for an artificial neural network in real-time?
I'm assuming this is without using a spiking neural network, but a more traditional one (i.e. just a basic neural network as described here)
http://www.ai-junkie.com/ann/evolved/nnt1.html
Is this possible in a real-time world? I was thinking one would have to either process each input individually (which will always result in the same output, hence the dilemma), or accrue a certain number of inputs per time threshold and then process them at once...
Then again, what does someone do with multiple instances of the same input? Process it twice?
I ask this because I'm looking at neuralbot, which I believe uses a normal neural network, but I'm trying to understand ANNs first before I delve into it, and am not sure how an ANN processes multiple inputs before producing target output(s).
Your question is not really clear but I'll try to answer. :)
You can see an ANN (Artificial Neural Network) as a particular case of an adaptive filter.
There are three main elements:
A sequence of inputs x(n).
A Parametric Variable Filter. In this case the filter is the ANN and the parameters are the neuron weights.
An Update Algorithm that updates the filter parameters according to the error between desired and actual output. For ANNs the most widely used update algorithm is backpropagation.
With an ANN there are two steps:
The Training Step. This is the hard part. You start with random neuron weights. You have a sequence of inputs and their desired outputs, and you run the ANN with the update algorithm on. When the error is under a certain threshold you can say that your ANN is trained. This step is usually done off-line (not in real time).
The Execution Step. You have the trained ANN. Now just feed it the inputs in sequence and use the outputs. This is usually a fast operation and can be done in real time (if that is what you mean).
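As a rough sketch of what such a real-time execution loop could look like in C++ (TrainedNet and readSensors are hypothetical placeholders, not from any library):

#include <vector>

// Hypothetical trained network exposing only a fast forward pass.
struct TrainedNet {
    std::vector<double> process(const std::vector<double> &in) { return in; } // stub
};

std::vector<double> readSensors() { return {0.0, 1.0}; } // stub input source

int main() {
    TrainedNet net;  // weights already trained off-line
    for (int step = 0; step < 100; ++step) {
        std::vector<double> out = net.process(readSensors()); // one forward pass
        // use 'out' immediately; no weight updates happen at this stage
    }
}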
Now, what do you mean by "multiple inputs at once"? First of all, standard computers can do only a very small number of operations at once; a standard PC has 4-8 cores, so it can do roughly 4-8 operations simultaneously. That is far too few for any real-world ANN application.
You said:
what does someone do with multiple instances of the same input? Process it twice?
The answer is yes. The "Execution Step" is so fast that there is no reason not to do it. In the "Training Step", duplicated inputs can be removed before training starts (because the training inputs are known a priori). So there is no problem with this. :)