Weka training result returns 0 in Java

I am new to Weka. I am using Weka in Java to train a model on an Android phone, and I load the ARFF file when the program starts.
In the training set, the nominal result attribute is declared as @attribute Result {1,2,3,4}. I therefore expected to receive only 1, 2, 3, or 4 as a result. But when I classify data in real time on the phone, a lot of 0s and -1s appear in the results.
My question is: is it possible for Weka to return a classification result outside the nominal value set? In my case I declared the result in the training set as 1,2,3,4, but it returned a lot of 0s.
Thanks a lot
Below is my code.
int result = 0;
try {
    // Build the new instance and attach it to the dataset's header information
    Instance inst = new DenseInstance(1.0, vals);
    data.setClassIndex(data.numAttributes() - 1);
    data.add(inst);
    inst.setDataset(data);
    // classifyInstance() returns a double; cast it to int for the result
    result = (int) m_classifier.classifyInstance(inst);
} catch (Exception e1) {
    e1.printStackTrace();
}

If you want to debug the code, you can attach the Weka source (weka-src.jar) to weka.jar and then step into the function to see what happens.
It's unlikely that someone has run into exactly the same problem as you, so you may need to debug it yourself. I think that's the most effective way forward for now.
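One thing worth checking while you debug: for a nominal class attribute, classifyInstance() returns the zero-based index of the predicted class value, not the label itself, so a result of 0 may simply mean the first label, "1". A minimal sketch of mapping the index back to the label, assuming m_classifier and data are set up as in the question:
// classifyInstance() gives the index into the class attribute's list of values.
double predictedIndex = m_classifier.classifyInstance(inst);
// Look up the corresponding label ("1", "2", "3" or "4" in this training set).
String predictedLabel = data.classAttribute().value((int) predictedIndex);
int result = Integer.parseInt(predictedLabel);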

Related

How to correctly format input and resize output data while using TensorRT engine?

I'm trying to implement a deep learning model in the TensorRT runtime. The model conversion step went fine and I'm fairly confident about it.
There are two parts I'm currently struggling with: copying data from host to device (e.g. from OpenCV into TensorRT) and getting the output in the right shape so I can read the right data. So my questions are:
How does the shape of the input dims relate to the memory buffer? What is the difference when the model input dims are NCHW versus NHWC? When I read an OpenCV image it is NHWC, and the model input is also NHWC; do I have to rearrange the buffer data, and if yes, what is the actual consecutive memory format I need? Or simply, what format or sequence of data is the engine expecting?
About the output (assuming the input is correctly buffered): how do I get the right result shape for each task (detection, classification, etc.)?
E.g. an array or something that looks similar to what I'd work with in Python.
I've read the Nvidia docs and they're not beginner-friendly at all.
// Let's say I have a model that has a dynamic-shape input dim in NHWC format.
auto input_dims = nvinfer1::Dims4{1, 386, 342, 3}; // Using fixed H, W for testing
context->setBindingDimensions(input_idx, input_dims);
auto input_size = getMemorySize(input_dims, sizeof(float));
// How do I format an OpenCV Mat to these dims, and if I encounter a new input dim format, how do I adapt to that?
The expected output dims are something like (1,32,53,8), for example; the output ends up in a raw buffer behind a pointer, and I don't know the order of the data needed to reconstruct the expected array shape.
// Run TensorRT inference
void* bindings[] = {input_mem, output_mem};
bool status = context->enqueueV2(bindings, stream, nullptr);
if (!status)
{
    std::cout << "[ERROR] TensorRT inference failed" << std::endl;
    return false;
}
auto output_buffer = std::unique_ptr<int[]>{new int[output_size]};
if (cudaMemcpyAsync(output_buffer.get(), output_mem, output_size, cudaMemcpyDeviceToHost, stream) != cudaSuccess)
{
    std::cout << "ERROR: CUDA memory copy of output failed, size = " << output_size << " bytes" << std::endl;
    return false;
}
cudaStreamSynchronize(stream);
// How do I use this output_buffer to form the right output shape, (1,32,53,8) in this case?
Could you please edit your question and tell us which model you're using, if it's a commonly known NN, perhaps one we can download to test locally?
That said, here is an answer to the parts that don't depend on the model (even though knowing it would help):
How does the shape of the input dims relate to the memory buffer?
If the input is NxCxHxW, you need to allocate N*C*H*W*sizeof(float) bytes for it on your CPU and GPU. To be more precise, you need to allocate space on the GPU for all the bindings, and on the CPU only for the input and output bindings.
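For example, the getMemorySize helper used in the question's code is presumably something along these lines (a sketch, not the actual implementation):
#include <NvInfer.h>  // nvinfer1::Dims
#include <cstddef>

// Total byte size of one binding, given its dims and the element size.
size_t getMemorySize(const nvinfer1::Dims& dims, size_t elem_size)
{
    size_t count = 1;
    for (int i = 0; i < dims.nbDims; ++i)
        count *= static_cast<size_t>(dims.d[i]);
    return count * elem_size;  // e.g. N*H*W*C * sizeof(float)
}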
When I read an OpenCV image it is NHWC, and the model input is also NHWC; do I have to rearrange the buffer data?
No, you do not have to rearrange the buffer data. If you did have to convert between NHWC and NCHW, you could check this or google 'opencv NHWC to NCHW'.
Full working code example here, especially this function.
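As a rough sketch of the host-to-device copy for the NHWC case (this assumes the network takes float32 input and that normalization and channel-order handling are done elsewhere; input_mem and stream are the same objects as in the question):
#include <opencv2/opencv.hpp>
#include <cuda_runtime_api.h>
#include <cassert>

// Copy one NHWC image from an OpenCV Mat into the device input buffer.
// An HWC cv::Mat is already stored row-major as H*W*C, so a single image
// can be uploaded as-is for an NHWC input with N == 1.
bool uploadImage(void* input_mem, cudaStream_t stream)
{
    cv::Mat img = cv::imread("input.jpg");                // 8-bit BGR, HWC
    cv::Mat blob;
    img.convertTo(blob, CV_32FC3, 1.0 / 255.0);           // float32, still HWC
    assert(blob.isContinuous());                          // data laid out as H*W*C
    size_t input_size = blob.total() * blob.elemSize();   // H * W * C * sizeof(float)
    return cudaMemcpyAsync(input_mem, blob.ptr<float>(0), input_size,
                           cudaMemcpyHostToDevice, stream) == cudaSuccess;
}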
Or simply, what format or sequence of data is the engine expecting?
This depends on how the neural network was trained. You should in general know exactly which kind of preprocessing and image data format were used to train the NN, and you should use the same libraries to load and process images if possible. It's an open problem in ML: if you try to replicate the results of a paper and use its model, but the preprocessing hasn't been open-sourced, you might get worse results. In the "worst" case you can implement both NHWC and NCHW and test which of them works.
About the output (assuming the input is correctly buffered): how do I get the right result shape for each task (detection, classification, etc.)? E.g. an array or something that looks similar to what I'd work with in Python.
Answering this properly requires knowing which NNs you are referring to, but I myself do the following:
Load the TensorRT .engine file in my code like this and deserialize like this
Print the bindings like this (a rough sketch also follows below)
Then I know the size of the input binding or bindings if there are many inputs, and the size of the output binding or bindings if there are many outputs.
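For reference, printing the bindings can be as simple as the following sketch, using the same binding API the question already relies on (setBindingDimensions / enqueueV2):
#include <NvInfer.h>
#include <iostream>

// List every binding with its name and dimensions, so you can see exactly which
// shapes the engine expects as input and produces as output.
void printBindings(const nvinfer1::ICudaEngine& engine)
{
    for (int i = 0; i < engine.getNbBindings(); ++i)
    {
        nvinfer1::Dims d = engine.getBindingDimensions(i);
        std::cout << (engine.bindingIsInput(i) ? "input  " : "output ")
                  << engine.getBindingName(i) << ":";
        for (int j = 0; j < d.nbDims; ++j)
            std::cout << " " << d.d[j];
        std::cout << std::endl;
    }
}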
This way you know the right result shape for each task. I hope this answered your question. If not, please add detailed comments and edit your post to be more precise. Thank you.
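As for reconstructing the (1,32,53,8) output from the question specifically: with the default linear format, the buffer you copy back from the device is just that tensor flattened in row-major order (the same layout NumPy uses for reshape), so recovering the shape is pure index arithmetic. A sketch, assuming the output binding really is N x H x W x C floats:
#include <vector>

// Row-major indexing into the flat output buffer reported as (N, H, W, C).
const int N = 1, H = 32, W = 53, C = 8;
std::vector<float> output(N * H * W * C);  // filled via cudaMemcpyAsync + cudaStreamSynchronize

// Element (n, h, w, c) of the logical output tensor:
float value_at(const std::vector<float>& buf, int n, int h, int w, int c)
{
    return buf[((n * H + h) * W + w) * C + c];
}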
I've read the Nvidia docs and they're not beginner-friendly at all.
Yes, I agree. You're better off searching for TensorRT C++ (or Python) repositories on GitHub and studying their code. Have you seen the TensorRT samples? It doesn't really take many lines of code to implement TensorRT inference.

How to use the tf.contrib.rnn.ConvLSTMCell class in TensorFlow

I would like to use a convolutional LSTM in my research, but I'm having a difficult time figuring out exactly how to implement this class in TensorFlow. Here is what I have so far. I get no errors, but I am seriously doubting my implementation. Can anyone confirm whether I am doing this correctly?
n_input = 4
x = tf.placeholder(tf.float32, shape=[None, n_input, HEIGHT, WIDTH, 2])
y = tf.placeholder(tf.float32, shape=[None, HEIGHT, WIDTH, 2])

convLSTM_cell = tf.contrib.rnn.ConvLSTMCell(
    conv_ndims=2,
    input_shape=[HEIGHT, WIDTH, DEPTH],
    output_channels=2,
    kernel_shape=[3, 3]
)

outputs, states = tf.nn.dynamic_rnn(convLSTM_cell, x, dtype=tf.float32)

weights = tf.Variable(tf.random_normal([3, 3, 2, 2]))
biases = tf.Variable(tf.random_normal([2]))
conv_out = tf.nn.conv2d(outputs[-1], weights, strides=[1, 1, 1, 1], padding='SAME')
out = tf.nn.sigmoid(conv_out + biases)
UPDATE:
Printing the shape of outputs gives (?, 4, 436, 1024, 2), but I think I want (?, 5, 436, 1024, 2) or (?, 1, 436, 1024, 2).
UPDATE2:
According to a fellow lab mate, the 4 outputs correspond to the LSTM outputs for each frame, so it is working correctly. Apparently all I have to do is take output #4, and that is the predicted future time frame.
A stackoverflow confirmation would put my mind at ease on this whole thing.
Yes, you are correct!
The output dimension will match the input dimension. If you actually want a (?, 5, 436, 1024, 2) output, you will have to look at the history, state.h; the last four of it ([-4:]) will still correspond to the output.
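For the single-frame prediction, one way to make the intended slice explicit (a sketch using the names from the question; with the default time_major=False, dynamic_rnn returns outputs of shape [batch, time, HEIGHT, WIDTH, 2], so the last frame sits on the time axis):
# outputs: [batch, n_input, HEIGHT, WIDTH, 2] from tf.nn.dynamic_rnn
last_output = outputs[:, -1]  # [batch, HEIGHT, WIDTH, 2], prediction for the last frame
conv_out = tf.nn.conv2d(last_output, weights, strides=[1, 1, 1, 1], padding='SAME')
out = tf.nn.sigmoid(conv_out + biases)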

Why is GLSL log returning the wrong result? (Intel Driver)

I am doing some maths on the GPU and reading back the result.
I am getting the wrong value from log. I have verified this for the values 0-10, 20, 30 and 40.
If I hard-code the value (as you can see below under "verify") I get the right result out. However, if I call log with the same hard-coded value, which should produce the same result, I get the wrong value out.
This is the kind of thing I have been doing in my function.
vec4 IScale(vec4 value)
{
    switch (uScaleType_i)
    {
        case Log:
            //value = log(value);
            value = vec4(1,1,1,1);
            value.r = log(5);
            //verify
            //value.r = 0.698970004
            break;
        case Sqrt:
            value = sqrt(value);
            break;
        case None:
            break;
    }
    return value;
}
I am wondering whether there is any sense to this. I have put the results I am getting back into Excel and made a graph. At first it's almost as if the value is double the correct one, but it's not quite that clean, and it drifts further and further away.
Is there any explanation for this other than a driver issue? I can't think of anything else to check!
And if so, how can I possibly work around it, other than refactoring my code to do this on the CPU? And why can't I find evidence online to back this up? I am completely and utterly baffled!
I am running on a laptop with:
(Intel(R) HD Graphics 4000 with 132 ext.)
p.s. Sqrt is fine and I get the values I would expect.
p.p.s. I checked: I have not accidentally created a function called "log".
I believe you are tripping over the base used for the log. In Excel the default base is 10, whereas in GLSL log() uses base e.
To get the expected result you should divide by the log of the base you want:
value = log(value)/log(10);
Or in Excel you can use LN(RC[-1]) instead.
This is as per the specification. log() will return the natural logarithm, i.e. the logarithm to the base e. Not the base 10 logarithm.
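That also explains the "almost double" pattern: ln(x) / log10(x) = ln(10) ≈ 2.303 for every x, so the natural log is a constant factor larger than the expected base-10 value (for example ln(5) ≈ 1.609 versus log10(5) ≈ 0.699), and the absolute gap grows as the values get bigger. A small helper along these lines keeps the shader readable (a sketch for the IScale function above):
// Base-10 log helper; GLSL's built-in log() is the natural log (base e).
vec4 log10(vec4 v)
{
    return log(v) / log(10.0);  // ln(v) / ln(10) == log10(v)
}

// Inside IScale():
//     case Log:
//         value = log10(value);
//         break;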

weka: how to generate libsvm training parameter

I am running LibSVM through Weka. Its output accuracy looks good to me, so I am planning to write an SVM model myself. However, Weka didn't print any training parameters, such as the number of support vectors, so I cannot do anything with it. Searching the web, I found somebody saying it should generate parameters like the following:
optimization finished, #iter = 27
nu = 0.058475864943863545
obj = -1.871013102744184, rho = -0.19357337828800944
nSV = 9, nBSV = 0
Total nSV = 9
But how come I didn't see any of them? Is there a step I missed? Please help me. Thanks a lot.
Weka writes the output you mentioned to stderr.
So if you started weka.sh or weka.bat from a terminal (or a "command window" if you are on Windows), you should see that output appear in your terminal window after clicking "classify".
If you want to access this information from scripts, you can redirect the output to a file and read that file in.
Here is how to edit the startup file weka.sh / weka.bat.
Edit this line (it is probably the last line) in order to write log info to a file instead of the terminal window:
java -cp $CP -Xmx8092m weka.gui.GUIChooser 2>>/opt/weka-stable/weka.log &
You can also add a properties file to your home directory to add more fine-grained behaviour.
https://weka.wikispaces.com/Properties+file
(You probably can also access information via the Weka Java API somehow, but you did not ask for that)

Callgrind: Profile a specific part of my code

I'm trying to profile (with Callgrind) a specific part of my code by removing noise and computation that I don't care about.
Here is an example of what I want to do:
for (int i = 0; i < maxSample; ++i) {
    // Prepare data to be processed...
    // Method to be profiled with these data
    // Post operation on the data
}
My use case is a regression test: I want to make sure that the method in question is still fast enough (something like less than 10% extra instructions since the last implementation).
This is why I'd like to have clean output from Callgrind.
(I need the for loop so that a significant amount of data is processed, which gives a good estimate of the behaviour of the method I want to profile.)
My first try was to change the code to:
for (int i = 0; i < maxSample; ++i) {
    // Prepare data to be processed...
    CALLGRIND_START_INSTRUMENTATION;
    // Method to be profiled with these data
    CALLGRIND_STOP_INSTRUMENTATION;
    // Post operation on the data
}
CALLGRIND_DUMP_STATS;
This adds the Callgrind macros to control the instrumentation. I also added the --instr-atstart=no option to be sure that I profile only the part of the code I want...
Unfortunately, with this configuration, when I launch my executable under Callgrind it never ends... It is not just a question of slowness, because a fully instrumented run lasts less than one minute.
I also tried
for (int i = 0; i < maxSample; ++i) {
    // Prepare data to be processed...
    CALLGRIND_TOGGLE_COLLECT;
    // Method to be profiled with these data
    CALLGRIND_TOGGLE_COLLECT;
    // Post operation on the data
}
CALLGRIND_DUMP_STATS;
(or the --toggle-collect="myMethod" option)
But Callgrind gave me a log without any calls (KCachegrind is white as snow :( and shows zero instructions...).
Did I use the macros/options correctly? Any idea of what I need to change in order to get the expected result?
I finally managed to solve this issue... This was a config issue:
I kept the code
for (int i = 0; i < maxSample; ++i) {
    // Prepare data to be processed...
    CALLGRIND_TOGGLE_COLLECT;
    // Method to be profiled with these data
    CALLGRIND_TOGGLE_COLLECT;
    // Post operation on the data
}
CALLGRIND_DUMP_STATS;
But I ran Callgrind with --collect-atstart=no (and without --instr-atstart=no!!!) and it worked perfectly, in a reasonable time (~1 min).
The issue with the START/STOP instrumentation was that Callgrind dumps a file (callgrind.out.#number) at each iteration (each STOP), so it was really, really slow... (after 5 min I had only 5,000 runs of a 300,000-iteration benchmark... unsuitable for a regression test).
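In other words, the working setup is the TOGGLE_COLLECT version of the loop above combined with an invocation along these lines (the binary name is illustrative):
valgrind --tool=callgrind --collect-atstart=no ./my_benchmark
# produces a single callgrind.out.<pid> dump to open in KCachegrind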
The --toggle-collect option is very picky about how you specify the method to use as the trigger. You actually need to specify its argument list as well, and even the whitespace needs to match! Use the method name exactly as it appears in the callgrind output. For instance, I am using this invocation:
$ valgrind \
    --tool=callgrind \
    --collect-atstart=no \
    "--toggle-collect=ctrl_simulate(float, int)" \
    ./swaag
Please observe:
The double quotes around the option.
The argument list including parentheses.
The whitespace after the comma character.