I am using the EM model in OpenCV and have run into difficulty with the log-likelihood values output by EM::train. I am training the EM models on separate (but correlated) data, and the log-likelihood values are linked to the absolute values of the data: when the training values are greater than 100, EM::train returns strongly positive log-likelihood values, and when the training values are less than one, large negative values are returned. My understanding is that this doesn't make sense given the way EM works.
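(For reference, the per-sample log-likelihood of even a single one-dimensional Gaussian component is $\log p(x) = -\tfrac{1}{2}\log(2\pi\sigma^2) - \frac{(x-\mu)^2}{2\sigma^2}$, which can be positive whenever the fitted density exceeds 1, e.g. for very small $\sigma$ relative to the data scale.)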
The section of code that trains the models is below, in case I am making any silly mistakes (I am quite new to OpenCV). The std::cout output confirms that the log-likelihood values are greater than one when the training values are greater than 100, and smaller than -1 when the training values are less than 1.
My problem is similar to this post:
OpenCV: Output of the predict function of Expectation Maximization
except that there is no way around using the log-likelihoods without implementing something like a DPGMM.
Many Thanks,
Sam
{
    double logs = 0;

    // EM model with g mixture components
    cv::EM model(g, 1, parameters);
    // train on the i-th column; per-sample log-likelihoods go into log_likelihoods
    model.train(sample_input.col(i), log_likelihoods);

    // sum (and print) the per-sample log-likelihoods returned by train()
    for (int j = 0; j < number_selected; j++)
    {
        logs += log_likelihoods.at<double>(j);
        std::cout << log_likelihoods.at<double>(j) << ' ';
    }

    mean_of_logs[i] += 0.1 * logs;
    reduced_models.push_back(model);
}
This is a follow-up question to my previous question about emulating aggregate functions (like in PGSQL) in BigQuery.
The solution proposed in the previous question does indeed work for cases where the function applied to each window is independent of the previous window, like calculating a simple average. But it breaks down for recursive functions like the exponential moving average, where the formula is:
EMA[i] = price[i] * k + EMA[i-1] * (1 - k)
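For clarity, this is what the recursion looks like written out procedurally (an illustrative Python sketch of the formula only, not BigQuery code; the function name is mine):
# Illustrative only: EMA seeded with a simple moving average over the first window,
# with the smoothing constant k hard-coded to 0.05 as in the UDF below.
def ema_series(prices, window_size, k=0.05):
    first = prices[:window_size]
    ema = sum(first) / len(first)            # SMA seeds EMA for the first window
    out = [ema]
    for price in prices[window_size:]:
        ema = price * k + ema * (1 - k)      # EMA[i] = price[i]*k + EMA[i-1]*(1-k)
        out.append(ema)
    return out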
Using the same example from the previous question,
CREATE OR REPLACE FUNCTION temp_db.ema_func(arr ARRAY<INT64>, window_size INT64)
RETURNS INT64 LANGUAGE js AS """
  if (arr.length <= window_size) {
    // calculate a simple moving average till the end of the first window
    var SMA = 0;
    for (var i = 0; i < arr.length; i++) {
      SMA = SMA + arr[i];
    }
    return SMA / arr.length;
  } else {
    // start calculation of EMA, where EMA[i-1] is the SMA we calculated for the first window
    // note: hard-coded constant (k) for the sake of simplicity
    // the problem: where do I get EMA[i-1] or prev_EMA from?
    // in this example, we only need the most recent value, but in the general case we would
    // potentially have to do other calculations with the new value
    return arr[arr.length - 1] * 0.05 + prev_ema * (1 - 0.05);
  }
""";

select s_id, temp_db.ema_func(ARRAY_AGG(s_price) over (partition by s_id order by s_date rows 40 preceding), 40) as temp_col
from temp_db.s_table;
Storing a state variable as a custom type is very easy in PGSQL and is part of the aggregate function's parameters. Would it be possible to emulate the same functionality in BigQuery?
I don't think it can be done generically in BigQuery; I'd rather look at the specific case and see whether some reasonable workaround is possible. Meanwhile, recursion and aggregate UDFs are not supported [hopefully yet] in BigQuery, so you might want to submit the respective feature request(s).
In the meantime, check out BigQuery scripting, though I don't think your case will fit there.
I am building a penalized regression model with 2 parameters, $\lambda$ and $\alpha$. I am trying to find optimal values for these parameters, so I consider a grid of different values: say n_lambda different $\lambda$ values and n_alpha different $\alpha$ values. In order to test the performance of the model, I consider $n$ data observations, for which I have the true value of my response variable, and I compute the predictions for these observations for each parameter pair.
I store my predictions in a 3D array with dimensions (n_lambda, n_alpha, n_observations). This means that element [0, 0, :] of this array contains the predictions for the n observations for the first value of $\lambda$ and the first value of $\alpha$.
Now I want to compute, for each of my predictions, the mean squared error. I know I can do this using nested for loops like:
import numpy as np
from sklearn.metrics import mean_squared_error

error_matrix = np.zeros((n_lambda, n_alpha))
for i in range(n_lambda):
    for j in range(n_alpha):
        error_matrix[i, j] = mean_squared_error(true_value, prediction[i, j, :])
This works, but nested for loops are not really optimal. I guess there must be a better way to get what I want. I have tried using the map function, but it is not working, so I would appreciate any advice.
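One way to express this without explicit Python loops is NumPy broadcasting. A minimal sketch, assuming prediction is a NumPy array of shape (n_lambda, n_alpha, n) and true_value has shape (n,), as described above:
import numpy as np

# Subtracting broadcasts true_value across the first two axes; averaging the
# squared differences over the last axis gives one MSE per (lambda, alpha) pair.
error_matrix = ((prediction - true_value) ** 2).mean(axis=2)   # shape (n_lambda, n_alpha)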
I'm using the WEKA Explorer to run a 10-fold cross-validation. I output the predictions to a CSV file. Because the 10-fold approach shuffles the order of the data, I do not know which specific instances are correctly or incorrectly classified.
I mean, by looking at the CSV I do not know which specific 1 or 0 is classified as 1 or 0. Is there any way to see the classification result for every specific instance in the test set for every fold? For example, it would be great if the CSV recorded the ID of the instance being classified.
One alternative would be for me to implement the 10-fold approach manually; i.e., I could create the 10 ARFF files and then run on each of them a percentage split of 90/10 (and preserve order). This solution looks pretty elaborate, effort-intensive, and error-prone.
Thanks for your help!
To do that you need to do the following for every fold:
double[] res = new double[testSet.numInstances()];
for (int j = 0; j < testSet.numInstances(); j++) {
    // classifyInstance returns the predicted class as a double (the class index for nominal classes)
    res[j] = classifier.classifyInstance(testSet.get(j));
}
Now the res array has the classification result for every instance in the test set. You can use this information as you want.
For example, you can print the attributes of each instance (e.g., if an attribute is a string, you can print it, before applying the filter, with testSet.get(j).stringValue(positionOfTheAttributeYouWantToPrint)) followed by the classification result.
Note that if the classification result is a nominal value, you can print it like this:
testSet.classAttribute().value((int) res[j])
I'm working on a web service written in Java that uses the Weka J48 algorithm to classify some attributes. First it builds the classifier, and then it classifies an instance using the classifier tree.
This is part of the code I have for the classifydata method:
fc.buildClassifier(train);
for (int i = 0; i < test.numInstances(); i++)
{
    double pred = fc.classifyInstance(test.instance(i));
    predicated = (test.classAttribute().value((int) pred));
}
where fc is the FilteredClassifier that was previously set up, train is the data used to build the classifier, and test is the instance to classify.
I'm also not sure whether this code is doing a proper classification; if you could confirm that, it would be nice.
What I really want is to get the "accuracy percentage". I don't really know if that is what it is called, but I don't know how else to refer to it. Basically, I want something that returns the accuracy percentage of the classification result. Imagine I have a simple tree that has only 2 classifications, "1" or "2". Imagine I classify an instance and the result is "2". Now I want something that will return how accurate it is for the instance to be a "2"; in other words, the probability of it really being a "2".
I hope I made myself clear, because this is kind of new to me as well.
For this you have to use the distributionForInstance() method:
double[] probabilityDistribution = fc.distributionForInstance(test.instance(i));
Then, if you have the two class values "1" and "2" (and you added the attribute/class values in that order to your class attribute), you can get the probabilities with which the given test instance belongs to each of the two class values like this:
// Probability of the test instance being a "1"
double classAtt1Prob = probabilityDistribution[0];
// Probability of the test instance being a "2"
double classAtt2Prob = probabilityDistribution[1];
I am trying to make a simple artificial neural network work with the backpropagation algorithm. I have created an ANN and I believe I have implemented the BP algorithm correctly, but I may of course be wrong.
Right now, I am trying to train the network by giving it two random numbers (a, b) between 0 and 0.5, and having it add them. Then, of course, each time the output the network gives is compared to the theoretical answer of a + b (which will always be achievable by the sigmoid function).
Strangely, the output always converges to a number between 0 and 1 (as it must, because of the sigmoid function), but the random numbers I'm putting in seem to have no effect on it.
Edit: Sorry, it appears it doesn't converge. Here is an image of the output:
The weights are randomly distributed between -1 and 1, but I have also tried between 0 and 1.
I also tried giving it two constant numbers (0.35,0.9) and trying to train it to spit out 0.5. This works and converges very fast to 0.5. I have also trained it to spit out 0.5 if I give it any two random numbers between 0 and 1, and this also works.
If instead, my target is:
vector<double> target;
target.push_back(.5);
Then it converges very quickly, even with random inputs:
I have tried a couple different networks, since I made it very easy to add layers to my network. The standard one I am using is one with two inputs, one layer of 2 neurons, and a second layer of only one neuron (the output neuron). However, I have also tried adding a few layers, and adding neurons to them. It doesn't seem to change anything. My learning rate is equal to 1.0, though I tried it equal to 0.5 and it wasn't much different.
Does anyone have any idea of anything I could try?
Is this even something an ANN is capable of? I can't imagine it wouldn't be, since they can be trained to do such complicated things.
Any advice? Thanks!
Here is where I train it:
//Initialize it. This will be one with 2 layers, the first having 2 Neurons and the second (output layer) having 1.
vector<int> networkSize;
networkSize.push_back(2);
networkSize.push_back(1);
NeuralNetwork myNet(networkSize, 2);

for (int i = 0; i < 5000; i++) {
    double a = randSmallNum();
    double b = randSmallNum();
    cout << "\n\n\nInputs: " << a << ", " << b << " with expected target: " << a + b;

    vector<double> myInput;
    myInput.push_back(a);
    myInput.push_back(b);

    vector<double> target;
    target.push_back(a + b);

    cout << endl << "Iteration " << i;
    vector<double> output = myNet.backPropagate(myInput, target);
    cout << "Output gotten: " << output[0];
    resultPlot << i << "\t" << abs(output[0] - target[0]) << endl;
}
Edit: I set up my network following this guide: A pdf. I implemented "Worked example 3.1" and got the exact same results they did, so I think my implementation is correct, at least as far as theirs is.
As #macs states, the maximum output of the standard sigmoid is 1, so if you try to add n numbers from [0, 1], your target should be normalized, i.e. sum(A1, A2, ..., An) / n.
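For instance (a small illustrative sketch in Python, with names of my own choosing, not code from the post):
import random

n = 2
inputs = [random.uniform(0.0, 1.0) for _ in range(n)]
target = sum(inputs) / n    # normalized target, guaranteed to lie in [0, 1]
# ... train against 'target', then multiply the network's output by n to recover the estimated sum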
In a model such as this, the sigmoid function (both in the output and in the intermediate layers) is used mainly for producing something that resembles a 0/1 toggle while still being a continuous function, so using it to represent a range of numbers is not what this kind of network is designed to do. This is because it is designed mostly with classification problems in mind.
There are, of course, other NN models that can do that sort of thing (for example, dropping the sigmoid on the output and just keeping it as a sum of its children).
If you can redefine your model in terms of classifying the input, you'll probably get better results.
Some examples of similar tasks for which the network will be more suitable:
Test whether the output is bigger or smaller than a certain constant - this should be very easy.
Output: A series of outputs, each representing a different potential value (for example, one output each for the values between 0 and 10, one for 'more than 10', and one for 'less than 0'). You will want your network to round the result to the nearest integer.
A tricky one will be to try and create a boolean representation of the output by having multiple output nodes.
None of these will give you the precision that you may want, though, since by nature NNs are more 'fuzzy'.