How to determine the time taken by PCA in Weka? - weka

As a mathematician, I'd like to know how long PCA takes to perform the reduction. I'd like to measure the elapsed time in seconds, as one does with tic and toc in MATLAB.

Related

Calculating the median at each pixel for multiple frames OpenCV (C++)

The task is to extract a stable background from a video. The idea is to choose n random frames from the video and take the median of each pixel across those frames. I was doing this in Python using NumPy:
medianFrame = np.median(frames, axis=0).astype(dtype=np.uint8)
But now I need to perform the same task in C++. I have tried the naive way of splitting the channels and iterating over all rows*cols pixels for the n frames to calculate the median frame, but it is not efficient at all and takes far longer than np.median did. I also tried to use xtensor for the task but wasn't able to get it to work. Any suggestions or direction on how to approach this would greatly help me. Thanks!
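For reference, the NumPy approach described in the question can be sketched as below. This is only an illustration on made-up data: the tiny synthetic `background`, the frame count, and the single bright pixel standing in for a moving object are all assumptions, not part of the original post.

```python
import numpy as np

# Hypothetical data: a small random "background" image (height 4, width 6, 3 channels).
rng = np.random.default_rng(0)
background = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# Build 5 frames; each one has a single bright pixel at a different
# location, standing in for a moving foreground object.
frame_list = []
for i in range(5):
    frame = background.copy()
    frame[i % 4, i, :] = 255
    frame_list.append(frame)

# Stack into shape (n_frames, height, width, channels) and take the
# per-pixel, per-channel median along the frame axis.
frames = np.stack(frame_list)
median_frame = np.median(frames, axis=0).astype(np.uint8)
```

Because each pixel is disturbed in at most one of the five frames, the median recovers the background at every pixel in this toy case.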

Using class weight to balance data set lowers accuracy in RBF SVM

I have been using sklearn to learn on some data. This is a binary classification task and I am using an RBF kernel. My data set is quite unbalanced (80:20) and I'm using only 120 samples, with about 10 features (I've been experimenting with a few fewer). Since I set class_weight="auto", the accuracy I've calculated from a cross-validated (10 folds) grid search has dropped dramatically. Why?
I will include a couple of validation accuracy heatmaps to demonstrate the difference.
NOTE: the top heatmap is from before class_weight was changed to "auto".
Accuracy is not the best metric to use when dealing with an unbalanced dataset. Say you have 99 positive examples and 1 negative example: if you predict every output to be positive, you still get 99% accuracy, even though you have misclassified the only negative example. You may have gotten high accuracy in the first case because your predictions fell mostly on the side with the larger number of samples.
When you set class_weight="auto", the imbalance is taken into account, so your predictions may have moved toward the center. You can cross-check this by plotting histograms of the predictions.
My suggestion is: don't use accuracy as the performance metric; use something like the F1 score or AUC.
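The 99-to-1 example above can be checked with a few lines of plain Python. This is a minimal sketch: the helper functions are written here for illustration (they are not sklearn calls), and returning 0.0 when a score is undefined is an assumed convention.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def recall_for_class(y_true, y_pred, label):
    """Fraction of samples of class `label` that were predicted as `label`."""
    total = sum(1 for t in y_true if t == label)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return hits / total if total else 0.0  # assumed convention: 0.0 if undefined

def f1_for_class(y_true, y_pred, label):
    """F1 score treating `label` as the class of interest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # assumed convention: 0.0 if undefined

# 99 positive examples, 1 negative example, and a model that always predicts positive.
y_true = [1] * 99 + [0]
y_pred = [1] * 100

acc = accuracy(y_true, y_pred)
f1_negative = f1_for_class(y_true, y_pred, label=0)
# Balanced accuracy: the mean of the per-class recalls.
balanced_acc = (recall_for_class(y_true, y_pred, 1) + recall_for_class(y_true, y_pred, 0)) / 2
```

Accuracy looks excellent here, while the F1 score of the minority class and the balanced accuracy both expose that the negative class is never predicted.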

Estimate a linear combination of regression coefficients in SAS

I'm using an LMM (linear mixed model) in SAS, and I would like to get an estimate (and a p-value) of a linear combination of some of the regression coefficients.
Say that the model is:
b0 + b1*Time + b2*X1 + b3*X2 + b4*(Time*X1)
and say that I want to get an estimate and a p-value for b1 + b4.
What should I do?

Compute frequency of sinusoidal signal, C++

I have a sinusoidal-like signal, and I would like to compute its frequency.
I tried to implement something, but it looks very difficult. Any ideas?
So far I have a vector of timesteps and values. How can I get the frequency from this?
Thank you.
If the input signal is a perfect sinusoid, you can calculate the frequency from the time between positive-going zero crossings. Find 2 consecutive instances where the signal goes from negative to positive and measure the time between them, then invert this number to convert from period to frequency. Note that this is only as accurate as your sample interval, and it does not account for any potential aliasing.
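A minimal Python sketch of the zero-crossing idea is shown below. It goes slightly beyond the description above in two ways, both of which are assumptions of this sketch: it linearly interpolates each crossing time between samples, and it averages the period over all rising crossings rather than using just two. The function name and the 5 Hz test signal are made up for illustration.

```python
import math

def frequency_from_zero_crossings(times, values):
    """Estimate frequency from the average spacing of rising zero crossings."""
    crossings = []
    for i in range(1, len(values)):
        v0, v1 = values[i - 1], values[i]
        if v0 < 0.0 <= v1:
            # Linearly interpolate the crossing time between the two samples.
            frac = -v0 / (v1 - v0)
            crossings.append(times[i - 1] + frac * (times[i] - times[i - 1]))
    if len(crossings) < 2:
        raise ValueError("need at least two rising zero crossings")
    # Average period over all consecutive rising crossings.
    period = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return 1.0 / period

# Hypothetical test signal: a 5 Hz sine sampled at 1 kHz for one second.
sample_rate = 1000.0
times = [n / sample_rate for n in range(1000)]
values = [math.sin(2 * math.pi * 5.0 * t - 0.3) for t in times]
estimate = frequency_from_zero_crossings(times, values)
```

On noisy data this approach can pick up spurious crossings, so some smoothing or hysteresis would likely be needed before using it.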
You could try autocorrelating the signal. An autocorrelation can be calculated rapidly by following these steps:
Perform an FFT of the signal.
Multiply each complex value by its complex conjugate.
Perform the inverse FFT of the result.
The leftmost peak (at zero lag) will always be the highest, as the signal always correlates best with itself. The second-highest peak, however, can be used to calculate the sinusoid's frequency.
For example, if the second peak occurs at an offset (lag) of 50 points, the sample rate is 16 kHz, and the window is 1 second, then the resulting frequency is 16000 / 50 = 320 Hz. You can even use interpolation to get a more accurate estimate of the peak position and thus a more accurate sinusoid frequency. This method is computationally intensive but is very good at estimating the frequency even after significant amounts of noise have been added!
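The FFT-based autocorrelation steps above can be sketched in Python with NumPy as follows. The function name is hypothetical, and two details are assumptions of this sketch rather than part of the answer: the signal is zero-padded to twice its length to avoid circular wrap-around, and the zero-lag peak is skipped by starting the peak search at the first lag where the autocorrelation goes negative.

```python
import numpy as np

def frequency_from_autocorrelation(signal, sample_rate):
    """Estimate the dominant frequency from the first non-zero-lag autocorrelation peak."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()  # remove any DC offset
    n = len(signal)
    # Step 1: FFT of the signal (zero-padded to 2n, an assumption of this sketch).
    spectrum = np.fft.rfft(signal, 2 * n)
    # Step 2: multiply each complex value by its complex conjugate.
    power = spectrum * np.conj(spectrum)
    # Step 3: inverse FFT of the result gives the autocorrelation.
    autocorr = np.fft.irfft(power)[:n]
    # Skip the zero-lag peak: start searching once the curve has dipped below zero.
    negative = np.where(autocorr < 0)[0]
    if len(negative) == 0:
        raise ValueError("no periodicity found")
    start = negative[0]
    lag = start + int(np.argmax(autocorr[start:]))
    return sample_rate / lag

# The worked example from the text: 16 kHz sample rate, 1 second window, 320 Hz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 320.0 * t)
estimate = frequency_from_autocorrelation(signal, sample_rate)
```

Here the peak is found at an integer lag only; interpolating around the peak, as suggested above, would refine the estimate for frequencies whose period is not a whole number of samples.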

Expectation Maximization opencv-Log Likelihood value

I'm estimating the parameters of a GMM using EM.
When I run the EM code in my Matlab script, I get a single "log-likelihood" value.
However, in OpenCV the output of EM.train is a matrix that contains the log-likelihood value of every sample.
How do I get a single log-likelihood value? Do I need to take the minimum of the log-likelihood values of all samples, or their sum?
You need the sum of the log probabilities of the data points that you used to estimate the probability density function. That sum is the log-likelihood of your estimate.
You can find a good explanation in the book "Pattern Recognition and Machine Learning".
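To illustrate why the sum is the right choice, here is a minimal pure-Python sketch (it does not use OpenCV). The one-dimensional two-component mixture, its parameters, and the sample values are all made up for illustration. Assuming independent samples, the likelihood of the data set is the product of the per-sample likelihoods, so its logarithm is the sum of the per-sample log-likelihoods.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def per_sample_log_likelihoods(samples, weights, means, variances):
    """Log of the mixture density at each sample (one value per sample)."""
    return [
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in samples
    ]

# Hypothetical mixture parameters and data.
weights = [0.5, 0.5]
means = [0.0, 3.0]
variances = [1.0, 1.0]
samples = [-1.2, -0.8, 0.1, 2.9, 3.3]

per_sample = per_sample_log_likelihoods(samples, weights, means, variances)
# The single log-likelihood value for the whole data set is the sum.
total_log_likelihood = sum(per_sample)
```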