Caffe: Multi-Label Images with Varying Number of Labels - computer-vision

I have a dataset where the images have a VARYING number of labels. The number of labels is between 1 and 5. There are 100 classes.
After googling, it seems like an HDF5 db with a slice layer can deal with multiple labels, as in the following URL.
The only problem is that it assumes a fixed number of labels. Following this approach, I would have to create a 1x100 matrix where an entry is 1 for each labeled class and 0 for every other class, as in the following definition:
layers {
  name: "slice0"
  type: SLICE
  bottom: "label"
  top: "label_matrix"
  slice_param {
    slice_dim: 1
    slice_point: 100
  }
}
where each image has a label that looks like (1,0,0,...1,...0,....,0,1), a vector with 100 dimensions.
I apologize that my question is somewhat vague, but is this a feasible idea? Or is there a better approach to this problem?
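For concreteness, the 1x100 encoding described in the question could be built roughly like this. It is only a sketch: the helper name to_multi_hot is made up for illustration, and it assumes class ids run from 0 to 99.

```python
NUM_CLASSES = 100  # from the question: 100 classes


def to_multi_hot(class_ids, num_classes=NUM_CLASSES):
    """Return a list with 1.0 at every labeled class index and 0.0 elsewhere.

    to_multi_hot is a hypothetical helper, not part of Caffe or HDF5.
    """
    vec = [0.0] * num_classes
    for c in class_ids:
        if not 0 <= c < num_classes:
            raise ValueError("class id out of range: %d" % c)
        vec[c] = 1.0
    return vec


# An image tagged with three of the 100 classes.
example = to_multi_hot([0, 42, 99])
```

Every image then gets a vector of the same length, no matter how many of the 1 to 5 labels it carries.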

I get that you have 5 types of labels that are not always present for each data point. 1 of the 5 labels is for 100-way classification. Correct so far?
I would suggest always writing all 5 labels into your HDF5 file and using a special value when a label is missing. You can then use the ignore_label option to skip computing the loss for that label in that iteration. To use it, add loss_param { ignore_label: Y } to the loss layer in your network prototxt definition, where Y is a scalar.
The backpropagated error will only be a function of labels that are present. If input X does not have a valid value for a label, the network will still produce an estimate for that label. But it will not be penalized for it. The output is produced without any effect on how the weights are updated in that iteration. Only outputs for non-missing labels contribute to the error signal and the weight gradients.
It seems that only the Accuracy and SoftmaxWithLoss layers support ignore_label.
Each label is a 1x5 matrix. The first entry can be for the 100-way classification (e.g. [0-99]) and entries 2:5 have scalars that reflect the values that the other labels can take. The order of the columns is the same for all entries in your dataset. A missing label is marked by a special value of your choosing. This special value has to lie outside the set of valid label values. This will depend on what those labels represent. If a label value of -1 never occurs you can use this to flag a missing label.
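A minimal sketch of how such fixed-width label rows might be assembled before writing them to HDF5. The column names, the helper functions, and the choice of -1 as the missing marker are all assumptions made for illustration; the mask only mimics which entries would contribute to the loss.

```python
MISSING = -1  # assumed marker; must lie outside every label's valid range

# Hypothetical column order; it has to be identical for every data point.
LABEL_ORDER = ["class100", "attr_a", "attr_b", "attr_c", "attr_d"]


def make_label_row(present, order=LABEL_ORDER, missing=MISSING):
    """Build a 1x5 row, filling absent labels with the missing marker."""
    unknown = set(present) - set(order)
    if unknown:
        raise KeyError("unknown label names: %s" % sorted(unknown))
    return [present.get(name, missing) for name in order]


def loss_mask(row, missing=MISSING):
    """True where a label is present, i.e. where the loss would be computed."""
    return [value != missing for value in row]


row = make_label_row({"class100": 37, "attr_b": 2})
mask = loss_mask(row)
# row  == [37, -1, 2, -1, -1]
# mask == [True, False, True, False, False]
```

Setting ignore_label to the same marker in each loss layer would then skip the entries where the mask is False.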

Related

Scikit: How do I display actual labels instead of 0s and 1s as indexes for a confusion matrix

I have a confusion matrix, but it has 0, 1, 2 as indexes/labels instead of the actual labels. Is there any way to display the actual labels as the index of the confusion matrix in scikit-learn?
If you are using the built-in confusion_matrix function, it has a labels parameter to which you can pass the actual labels.
From the documentation:
labels : array, shape = [n_classes], optional List of labels to index
the matrix. This may be used to reorder or select a subset of labels.
If none is given, those that appear at least once in y_true or y_pred
are used in sorted order.
If you want to plot the confusion matrix, you can follow this official scikit-learn example:
http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py
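To make the role of the labels list more concrete, here is a plain-Python sketch of the idea. It is not scikit-learn's implementation; confusion_counts is a hypothetical helper, and the sample data is invented. The list fixes the row and column order, so the same list can be reused as the headers when displaying the matrix.

```python
def confusion_counts(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true labels, columns predictions.

    Hypothetical helper that only illustrates what an explicit label list
    controls; samples whose labels are not in `labels` are skipped.
    """
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        if t in index and p in index:
            matrix[index[t]][index[p]] += 1
    return matrix


# Invented example data.
y_true = ["cat", "dog", "cat", "bird", "dog"]
y_pred = ["cat", "cat", "cat", "bird", "dog"]
labels = ["bird", "cat", "dog"]

matrix = confusion_counts(y_true, y_pred, labels)

# Print the matrix with the actual labels as row and column headers.
print(" " * 6 + "".join("%6s" % name for name in labels))
for name, row in zip(labels, matrix):
    print("%6s" % name + "".join("%6d" % n for n in row))
```

With scikit-learn itself, passing the same list as the labels argument should give the same row and column order.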

RRD Graph - Change line colour by value

I have a RRD database with data:
"DS:pkts_transmitted:GAUGE:120:0:U",
"DS:pkts_received:GAUGE:120:0:U",
"DS:pkts_lost:GAUGE:120:0:U",
"DS:rtt_min:GAUGE:120:0:U",
"DS:rtt_avg:GAUGE:120:0:U",
"DS:rtt_max:GAUGE:120:0:U",
I want the avg line to change colour whenever I lose packets.
For example, if I lose 5 packets, make the line blue; if I lose 10, make it red.
I have seen people do this, but I have read the documentation and can't find out how.
The way to do this is to actually have multiple lines defined (one of each colour) and hide the ones you don't want to see at any time, using calculations.
For example, say we have an RRD with two DSs:
DS:x:GAUGE:60:0:U
DS:y:GAUGE:60:0:1
Now, we want to show the line for x in red if y is 0, and blue if it is 1. To do this, we create two calculated values, x1 and x2.
CDEF:x1=y,0,EQ,x,UNKN,IF
CDEF:x2=y,1,EQ,x,UNKN,IF
Thus, x1 is active if y=0 and x2 if y=1. Yes, this could be simplified, but I'm showing it like this for the example.
Now, we can make lines using these:
LINE:x1#ff0000:MyLine
LINE:x2#0000ff
Note that the second line doesn't need a legend. Now, the line will appear to change colour depending on the value of the y metric, since at any time the other line will be UNKN and therefore not displayed.
You can extend this, of course, to have multiple colours and more complex thresholds.
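As a rough illustration of the extension to several thresholds, the following Python sketch simulates what the CDEFs would compute for the rtt_avg and pkts_lost data sources from the question. None stands in for UNKN, the thresholds 5 and 10 come from the question, and the function and series names are made up. The CDEF expressions in the comments are an untested translation and should be checked against the rrdgraph_rpn documentation.

```python
UNKN = None  # stands in for RRDtool's unknown value


def split_by_loss(rtt_avg, pkts_lost, blue_at=5, red_at=10):
    """Split one series into three; exactly one is known at each step.

    Roughly equivalent CDEFs (untested, names are hypothetical):
      CDEF:avg_ok=pkts_lost,5,LT,rtt_avg,UNKN,IF
      CDEF:avg_blue=pkts_lost,10,GE,UNKN,pkts_lost,5,GE,rtt_avg,UNKN,IF,IF
      CDEF:avg_red=pkts_lost,10,GE,rtt_avg,UNKN,IF
    """
    ok, blue, red = [], [], []
    for rtt, lost in zip(rtt_avg, pkts_lost):
        ok.append(rtt if lost < blue_at else UNKN)
        blue.append(rtt if blue_at <= lost < red_at else UNKN)
        red.append(rtt if lost >= red_at else UNKN)
    return ok, blue, red


# Invented sample values: four steps with 0, 5, 12 and 9 lost packets.
ok, blue, red = split_by_loss([20.0, 25.0, 40.0, 22.0], [0, 5, 12, 9])
```

Each of the three series would then get its own LINE with its own colour, and only the first needs a legend.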

What is a safety wall and how do I use it?

I've Googled and found zero answers for "safety wall", so I'm pretty sure that's not the correct term. I'll explain myself:
From what I've read, the idea is to take a two-dimensional array and place it inside a larger array that has one extra cell on each side, so that I stay safely within the limits I've created.
What is the right term for this technique and how would I use it?
As others have said, you need to search for "sentinel" or something like "sentinel control".
You can use sentinel control when you don't know the size or limits of your input in advance. For example, suppose you are writing a program that calculates the average grade of a class, but you don't know how many students are in the class. Or you are filling an array whose limits you don't know. In those cases you can use sentinel control.
Let's look at this example:
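#include <iostream>

int main()
{
    int grade;
    int totalGrade = 0;
    int studentCount = 0;
    // Read grades until the user enters the sentinel value -1
    // (also stop if the input ends or is not a number).
    while (std::cin >> grade && grade != -1)
    {
        totalGrade += grade;
        ++studentCount;
    }
    if (studentCount > 0)
        std::cout << "Average: " << static_cast<double>(totalGrade) / studentCount << '\n';
    return 0;
}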
So if you don't know how many values the user will enter, you can use sentinel control. You can also read more about sentinel values.
These are usually referred to as "ghost cells", and are often used in numerical simulations or image processing where you are applying a kernel (such as a smoothing or difference operator) to an array. They allow you to apply the kernel without special-casing the edges.
For example, suppose you want to smooth out an image. You could use a kernel like:
0.0 0.1 0.0
0.1 0.6 0.1
0.0 0.1 0.0
You apply this by taking the source image, and for every pixel, you compute the value of the destination pixel by centering the kernel on the source pixel and adding up the weighted contributions of the 9 covered pixels (0.6 times the value of the source pixel, plus 0.1 times the value of each of the pixels above, below, and to the sides). Do this for every pixel and you'll end up with a smoothed version of your original image.
This works well, but the question is "what do you do at the border cells?" Rather than having complicated if/then logic for the border cases (which can be tricky and can degrade performance), you can just add 1 layer of ghost cells to each side.
Of course, you have to pick values for the ghost cells before you run your algorithm. How you pick their values depends on your algorithm. You might choose to set them all to zero, but in the case of the smoothing kernel, this will darken your image at its borders, so that's probably not what you want. A better plan would be to fill the ghost cells with the value of the nearest non-ghost cell.
You also need to figure out how many ghost cells you need, which depends on the size of your kernel. For a 3x3 kernel like above, you need 1 layer of ghost cells (to take care of the part of the kernel that might "hang off" the edge). More complicated kernels might require more (a 5x5 kernel would require 2 layers, etc).
You can google "ghost cell computation" to find out more (add 'computation' or you'll get a lot of biology results!)
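Here is a small Python sketch of the whole idea, using the 3x3 smoothing kernel from above and ghost cells filled with the value of the nearest real cell. The function names are made up for illustration, and plain nested lists are used instead of an array library to keep it self-contained.

```python
KERNEL = [
    [0.0, 0.1, 0.0],
    [0.1, 0.6, 0.1],
    [0.0, 0.1, 0.0],
]


def pad_with_ghost_cells(image, layers=1):
    """Surround the image with rings of ghost cells copied from the nearest real cell."""
    rows, cols = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    return [
        [image[clamp(r, rows)][clamp(c, cols)] for c in range(-layers, cols + layers)]
        for r in range(-layers, rows + layers)
    ]


def smooth(image, kernel=KERNEL):
    """Apply the kernel to every pixel with no special cases at the borders."""
    k = len(kernel) // 2  # ghost layers needed: 1 for 3x3, 2 for 5x5
    padded = pad_with_ghost_cells(image, layers=k)
    out = []
    for r in range(len(image)):
        out_row = []
        for c in range(len(image[0])):
            total = 0.0
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    # (r + k, c + k) is the pixel's position in the padded array.
                    total += kernel[dr + k][dc + k] * padded[r + k + dr][c + k + dc]
            out_row.append(total)
        out.append(out_row)
    return out


flat = [[10.0] * 4 for _ in range(3)]
smoothed = smooth(flat)  # a uniform image stays uniform, borders included
```

The inner loops never test whether a pixel is on the border; the padding absorbs the part of the kernel that hangs off the edge.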

Image classification using cascaded boosting in scikit-learn - why would classification terminate early?

I've pasted all my code here in case you'd need that to understand my question: Plotting a graph on axes but getting no results while trying to classify image based on HoG features
My question is: given approximately 500 images (the Caltech "Cars 2001" dataset) with 48 HoG features each, what possible reasons can there be for the boosting to terminate early? What could cause a perfect fit, or a problem with the boosted sample weights, and how can such problems be solved? The specific algorithm I'm using is SAMME, a multiclass Adaboost classifier. I'm using Python 2.7 on Anaconda.
When I checked certain variables during the classification of my dataset, setting the n_estimators parameter to be 600, I found that:
discrete_test_errors: consisted of 1 item instead of being an array of 600 values
discrete_estimator_errors: again a single value instead of an array of 600 values
real_test_errors: just one item again instead of 600
discrete_estimator_weights: array([1.])
n_trees_discrete and n_trees_real: 1 instead of 600
Problem solved. My training array, labels, had no negatives; it consisted only of labels for one class. Of course the training would terminate immediately!
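A quick sanity check along these lines would have caught the problem before fitting. This is just a sketch: check_class_balance is a made-up helper name. As far as I can tell, the boosting loop stops as soon as a weak learner reaches zero training error, which a single-class training set makes trivial.

```python
from collections import Counter


def check_class_balance(labels, min_classes=2):
    """Raise if the training labels contain fewer than min_classes classes.

    Hypothetical helper; returns the per-class counts otherwise.
    """
    counts = Counter(labels)
    if len(counts) < min_classes:
        raise ValueError(
            "need at least %d classes, found %d: %s"
            % (min_classes, len(counts), dict(counts))
        )
    return counts


# Invented labels: 1 = car, 0 = background.
counts = check_class_balance([1, 1, 0, 1, 0])
```

Calling it on the labels array right before fit would turn a silent one-estimator result into an explicit error.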

Multiple scans by key

I have one 4-channel HSVL image - Hue, Saturation, Value (floats), Label (unsigned int).
The task is to compute an array of sums of Hues, Saturations, and Values, for each unique label. For example, I will be able to access the output Sum[of pixels with label 455] = { Hue: 500, Sat: 100, Val: 200 }. The size of the image is about 5 MP, and there are about 3000 different labels.
My idea is to have ~32 scans over parts of the image, that will produce 32 x nLabels sums. Then I can scan over the 32 partitions of the image, to arrive at nLabel sum structures.
Does a "scan by key" algorithm exist that solves this exact type of problem?
If you want to do this with CUDA, the following could help.
Since you only need the sum values, I think what you need is "reduce by key". Thrust provides an implementation thrust::reduce_by_key() which could meet your needs.
But before using it, you have to sort all the pixels by the labels. This can be done with thrust::sort_by_key()
You may also be interested in thrust::zip_iterator, which can zip the 3 channels HSV into a single value iterator for sorting and reduction.
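To show what the sort-then-reduce approach computes, here is a CPU-only Python sketch of the same semantics. It is not Thrust code and says nothing about GPU performance; sums_by_label is a hypothetical helper, and each pixel is assumed to be a (label, hue, sat, val) tuple, which plays the role of the zipped HSV value.

```python
from itertools import groupby
from operator import itemgetter


def sums_by_label(pixels):
    """Return {label: (sum_hue, sum_sat, sum_val)} for (label, h, s, v) tuples."""
    # Step 1: sort by key so equal labels become contiguous (the sort_by_key role).
    ordered = sorted(pixels, key=itemgetter(0))
    sums = {}
    # Step 2: reduce each contiguous run of equal keys (the reduce_by_key role).
    for label, group in groupby(ordered, key=itemgetter(0)):
        h = s = v = 0.0
        for _, hue, sat, val in group:
            h += hue
            s += sat
            v += val
        sums[label] = (h, s, v)
    return sums


# Invented sample pixels with two distinct labels.
pixels = [
    (455, 0.5, 0.25, 0.125),
    (7, 0.25, 0.5, 0.75),
    (455, 0.5, 0.25, 0.5),
]
result = sums_by_label(pixels)
```

The sort is what makes the reduction work: a reduce-by-key of this kind only merges consecutive elements with equal keys.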