fftw + opencv inconsistent output - c++

I recently tried to implement an FFT function for OpenCV's Mat.
My implementation is mainly inspired by FFTW's code samples and by:
FFTW-OpenCV
I paid close attention to adapting the size of the input image in order to speed up the processing.
It seems that I did something wrong, because the output is always a black image.
Here is my implementation:
void fft2_32f(const cv::Mat1f& _src, cv::Mat2f& dst)
{
    cv::Mat2f src;

    const int rows = cv::getOptimalDFTSize(_src.rows);
    const int cols = cv::getOptimalDFTSize(_src.cols);

    if(_src.isContinuous() && _src.rows == rows && _src.cols == cols)
    {
        // note: _src.size(), not src.size() -- src is still empty at this point.
        src = cv::Mat2f::zeros(_src.size());
        dst = cv::Mat2f::zeros(_src.size());

        // 1) copy the source into a complex matrix (the imaginary component is set to 0).
        cblas_scopy(src.total(), _src.ptr<float>(), 1, src.ptr<float>(), 2);

        // 2) prepare and apply the transform.
        fftwf_complex* ptr_in = reinterpret_cast<fftwf_complex*>(src.ptr<float>());
        fftwf_complex* ptr_out = reinterpret_cast<fftwf_complex*>(dst.ptr<float>());

        fftwf_plan fft = fftwf_plan_dft_2d(src.rows, src.cols, ptr_in, ptr_out, FFTW_FORWARD, FFTW_ESTIMATE);
        fftwf_execute(fft);
        fftwf_destroy_plan(fft);

        // 3) normalize. cblas_sscal scales in place; the original cblas_saxpy call
        // aliased its input and output, which computes dst *= (1 + 1/total) instead
        // of dst *= 1/total.
        cblas_sscal(dst.rows * dst.step1(), 1.f/dst.total(), dst.ptr<float>(), 1);
    }
    else
    {
        src = cv::Mat2f::zeros(rows, cols);
        dst = cv::Mat2f::zeros(rows, cols);

        // 1) copy the source into a complex matrix (the imaginary component is set to 0).
        support::parallel_for(cv::Range(0, _src.rows), [&src, &_src](const cv::Range& range)->void
        {
            for(int r=range.start; r<range.end; r++)
            {
                int c=0;
                const float* it_src = _src[r];
                float* it_dst = src.ptr<float>(r);
#if CV_ENABLE_UNROLLED
                for(; c<=_src.cols-4; c+=4, it_src+=4, it_dst+=8)
                {
                    *it_dst = *it_src;
                    *(it_dst+2) = *(it_src+1);
                    *(it_dst+4) = *(it_src+2);
                    *(it_dst+6) = *(it_src+3);
                }
#endif
                for(; c<_src.cols; c++, it_src++, it_dst+=2)
                    *it_dst = *it_src;
            }
        }, 0x80);

        // 2) prepare and apply the transform.
        fftwf_complex* ptr_in = reinterpret_cast<fftwf_complex*>(src.ptr<float>());
        fftwf_complex* ptr_out = reinterpret_cast<fftwf_complex*>(dst.ptr<float>());

        fftwf_plan fft = fftwf_plan_dft_2d(src.rows, src.cols, ptr_in, ptr_out, FFTW_FORWARD, FFTW_ESTIMATE);
        fftwf_execute(fft);
        fftwf_destroy_plan(fft);

        // 3) normalize (see the note on cblas_sscal above).
        cblas_sscal(dst.rows * dst.step1(), 1.f/dst.total(), dst.ptr<float>(), 1);
    }
}
Note:
The parallel_for implementation is inspired by: How to use lambda as a parameter to parallel_for_
Thanks in advance for any help.

I figured out my issue.
This function, written as is, works perfectly (at least for the purpose I made it for).
My issue was the calling code:

cv::Mat dst = cv::Mat::zeros(src.size(), CV_32FC2);
cv::Mat1f srcw = src;
cv::Mat2f dstw = dst;
fft2_32f(srcw, dstw); // reallocates dstw to the optimal size for receiving the output, depending on the size of srcw ... so dstw is reallocated, but dst is not.
dst.copyTo(_outputVariable);

In that case the correct information is stored in dstw but not in dst, because of the reallocation inside the function.
So when I tried to visualize my data, I got a black image.
The proper call should be:

cv::Mat dst;
cv::Mat1f srcw = src;
cv::Mat2f dstw;
fft2_32f(srcw, dstw); // reallocates dstw to the optimal size for receiving the output, depending on the size of srcw.
dst = dstw;
dst.copyTo(_outputVariable); // or dstw.copyTo(_outputVariable);

With that code I get the proper output.
Note: depending on the application, a ROI (take a look at the operator()(const cv::Rect&) of OpenCV's Mat container) corresponding to the size of the input may be useful in order to preserve the original dimensions, as in the sketch below.
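For example, a minimal sketch, reusing srcw and dstw from the call above:

// keep only the region matching the original input size;
// the ROI shares data with dstw, so clone() it if a deep copy is needed.
cv::Mat2f roi = dstw(cv::Rect(0, 0, srcw.cols, srcw.rows));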
Thank you for your help :).
Can someone help me mark this topic as closed, please?

Related

How to set input with image for tensorflow-lite in c++?

I am trying to move our TensorFlow model from the Python+Keras version to TensorFlow Lite with C++ on an embedded platform.
It looks like I don't know how to properly set the input for the interpreter.
The input shape should be (1, 224, 224, 3).
As input I am taking an image with OpenCV and converting it to RGB with CV_BGR2RGB.
std::unique_ptr<tflite::FlatBufferModel> model_stage1 =
    tflite::FlatBufferModel::BuildFromFile("model1.tflite");
TFLITE_MINIMAL_CHECK(model_stage1 != nullptr);

// Build the interpreter
tflite::ops::builtin::BuiltinOpResolver resolver_stage1;
std::unique_ptr<Interpreter> interpreter_stage1;
tflite::InterpreterBuilder(*model_stage1, resolver_stage1)(&interpreter_stage1);
TFLITE_MINIMAL_CHECK(interpreter_stage1 != nullptr);

cv::Mat cvimg = cv::imread(imagefile);
if(cvimg.data == NULL) {
    printf("=== IMAGE READ ERROR ===\n");
    return 0;
}

cv::cvtColor(cvimg, cvimg, CV_BGR2RGB);
uchar* input_1 = interpreter_stage1->typed_input_tensor<uchar>(0);
memcpy( ... );
I have an issue with the proper setup of memcpy for this uchar type.
When I do it like this, I get a segfault at runtime:
memcpy(input_1, cvimg.data, cvimg.total() * cvimg.elemSize());
How should I properly fill the input in this case?
To convert my comments into an answer:
Memcpy might not be the right approach here. OpenCV stores an image as one contiguous array of per-pixel color values (RGB-ordered here after the cvtColor call; BGR by default). A plain cv::Mat does not support range-based for, but a cv::Mat3b view of it does, so it is possible to iterate over these RGB chunks via:

cv::Mat3b rgbimg = cvimg; // 3-channel, 8-bit view of the same data
for (const auto& rgb : rgbimg) {
    // now rgb[0] is the red value, rgb[1] green and rgb[2] blue.
}
And writing values to a TensorFlow Lite typed_input_tensor should be done like this, where i is the index and x the assigned value:

interpreter->typed_input_tensor<uchar>(0)[i] = x;

So, using the rgbimg view from above, the loop could look like this:

uchar* input = interpreter->typed_input_tensor<uchar>(0); // hoisted out of the loop
for (size_t i = 0; i < rgbimg.total(); ++i) {
    const cv::Vec3b& rgb = rgbimg((int)(i / rgbimg.cols), (int)(i % rgbimg.cols));
    input[3*i + 0] = rgb[0];
    input[3*i + 1] = rgb[1];
    input[3*i + 2] = rgb[2];
}
This is how you can do it, at least for the single-channel case. It assumes that the OpenCV buffer is contiguous, so in this case the tensor dims are (1, x, y, 1).

float* out = interpreter->typed_tensor<float>(input); // 'input' is the input tensor's index
input_type = interpreter->tensor(input)->type;

img.convertTo(img, CV_32F, 255.f/input_std);
cv::subtract(img, cv::Scalar(input_mean/input_std), img);

float* in = img.ptr<float>(0);
memcpy(out, in, img.rows * img.cols * sizeof(float));

OpenCV version - 4.3.0
TF Lite version - 2.0.0
nada's approach is also correct. Pick whichever suits your programming style; the memcpy version will, however, be comparatively faster.
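For the asker's three-channel uchar case, a memcpy version could look like the minimal sketch below. It assumes the model's input really is a 1x224x224x3 uint8 tensor and that the tensors have been allocated; a common cause of the segfault above is copying an image whose byte count does not match the tensor's.

// Hypothetical sketch for the 1x224x224x3 uint8 input described in the question.
TFLITE_MINIMAL_CHECK(interpreter_stage1->AllocateTensors() == kTfLiteOk);
cv::resize(cvimg, cvimg, cv::Size(224, 224));  // match the tensor's width/height
cv::cvtColor(cvimg, cvimg, CV_BGR2RGB);
if (!cvimg.isContinuous())
    cvimg = cvimg.clone();                     // memcpy needs one flat buffer
uchar* input_1 = interpreter_stage1->typed_input_tensor<uchar>(0);
memcpy(input_1, cvimg.data, cvimg.total() * cvimg.elemSize());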

3D reconstruction from multiple images with one camera

So, I've been trying to get a 3D point cloud from a sequence of images of an object. I have successfully obtained a decent point cloud with two images: I matched features on both images, found the fundamental matrix and, from that, extracted P' (the camera matrix for the second view). For the first view, I set P = K(I | 0), where K is the matrix of the camera intrinsics.
But I haven't been able to extend this approach to several images. My idea was to slide a two-image window through the sequence (e.g. match image1 with image2, find 3D points, match image2 with image3, then find more 3D points, and so on). For each following image pair, P would be made of a cumulative rotation matrix and a cumulative translation vector (this would allow me to keep bringing the points into the first camera's coordinate system). But this is not working at all. What I want to know is whether this approach makes sense at all.
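For reference, a minimal sketch of the usual way relative poses are chained, assuming the convention that x_{k+1} = R * x_k + t maps points from camera k's frame to camera k+1's frame:

#include <opencv2/core.hpp>

// Chain a new relative pose (R, t) into the cumulative pose (R_cum, t_cum),
// starting from R_cum = I and t_cum = 0 for the first view. Note the rotated
// translation: a plain t_cum + t does not compose correctly.
void accumulatePose(const cv::Mat& R, const cv::Mat& t,
                    cv::Mat& R_cum, cv::Mat& t_cum)
{
    t_cum = R * t_cum + t;  // rotate the old translation, then add the new one
    R_cum = R * R_cum;
}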
In my code below, P_prev is P and Pl is P'. This is just the part that I think is relevant.
Mat combinedPointCloud;
Mat P_prev;

P_prev = (Mat_<double>(3,4) << cameraMatrix.at<double>(0,0), cameraMatrix.at<double>(0,1), cameraMatrix.at<double>(0,2), 0,
                               cameraMatrix.at<double>(1,0), cameraMatrix.at<double>(1,1), cameraMatrix.at<double>(1,2), 0,
                               cameraMatrix.at<double>(2,0), cameraMatrix.at<double>(2,1), cameraMatrix.at<double>(2,2), 0);

for(int i = 1; i < images.size(); i++) {
    Mat points3D;
    image1 = images[i-1];
    image2 = images[i];

    matchTwoImages(image1, image2, imgpts1, imgpts2);
    P = findSecondProjectionMatrix(cameraMatrix, imgpts1, imgpts2);

    P.col(0).copyTo(R.col(0));
    P.col(1).copyTo(R.col(1));
    P.col(2).copyTo(R.col(2));
    P.col(3).copyTo(t.col(0));

    if(i == 1) {
        Pl = P;
    } else {
        R_aux = R_prev * R;
        t_aux = t_prev + t; // note: the relative translation is not rotated here

        R_aux.col(0).copyTo(Pl.col(0));
        R_aux.col(1).copyTo(Pl.col(1));
        R_aux.col(2).copyTo(Pl.col(2));
        t_aux.col(0).copyTo(Pl.col(3));
    }

    triangulatePoints(P_prev, Pl, imgpts1, imgpts2, points3D); // points3D is 4xN

    // Transforming to euclidean by hand, because I couldn't make
    // OpenCV's convertPointsFromHomogeneous work.
    aux.create(3, points3D.cols, CV_32F); // aux is 3xN; CV_32F to match the float accesses below
    for(int j = 0; j < points3D.cols; j++) {
        aux.at<float>(0, j) = points3D.at<float>(0, j)/points3D.at<float>(3, j);
        aux.at<float>(1, j) = points3D.at<float>(1, j)/points3D.at<float>(3, j);
        aux.at<float>(2, j) = points3D.at<float>(2, j)/points3D.at<float>(3, j);
    }
    aux.copyTo(points3D); // copyTo reallocates points3D to 3xN

    Pl.col(0).copyTo(R_prev.col(0));
    Pl.col(1).copyTo(R_prev.col(1));
    Pl.col(2).copyTo(R_prev.col(2));
    Pl.col(3).copyTo(t_prev.col(0));
    P_prev = Pl;

    if(i==1) {
        points3D.copyTo(combinedPointCloud);
    } else {
        hconcat(combinedPointCloud, points3D, combinedPointCloud);
    }
}
show3DCloud(combinedPointCloud);

How to determine PHOW features for an image in C++ with vlfeat and opencv?

I have implemented a PHOW features detector in Matlab, as follows:
[frames, descrs] = vl_phow(im);
which is a wrapper around this code:
...
for i = 1:4
    ims = vl_imsmooth(im, scales(i) / 3) ;
    [frames{i}, descrs{i}] = vl_dsift(ims, 'Fast', 'Step', step, 'Size', scales(i)) ;
end
...
I'm doing an implementation in C++ with OpenCV and vlfeat. This is part of my implementation code to calculate PHOW features for an image (Mat image):
...
//convert into float array
float* img_vec = im2single(image);

//create filter
VlDsiftFilter* vlf = vl_dsift_new(image.cols, image.rows);

double bin_sizes[] = { 3, 4, 5, 6 };
double magnif = 3;
double* scales = (double*)malloc(4*sizeof(double));
for (size_t i = 0; i < 4; i++)
{
    scales[i] = bin_sizes[i] / magnif;
}
for (size_t i = 0; i < 4; i++)
{
    double sigma = sqrt(pow(scales[i], 2) - 0.25);

    //smooth float array image
    float* img_vec_smooth = (float*)malloc(image.rows*image.cols*sizeof(float));
    vl_imsmooth_f(img_vec_smooth, image.cols, img_vec, image.cols, image.rows, image.cols, sigma, sigma);

    //run DSIFT
    vl_dsift_process(vlf, img_vec_smooth);

    //number of keypoints found
    int keypoints_num = vl_dsift_get_keypoint_num(vlf);

    //extract keypoints
    const VlDsiftKeypoint* vlkeypoints = vl_dsift_get_keypoints(vlf);

    //descriptors dimension
    int dim = vl_dsift_get_descriptor_size(vlf);

    //extract descriptors
    const float* descriptors = vl_dsift_get_descriptors(vlf);
    ...
//return all descriptors of different scales
I'm not sure whether the return should be the set of all descriptors for all scales, which requires a lot of storage space when we are processing several images, or the result of some operation between the descriptors of the different scales.
Can you help me with this doubt?
Thanks
You can do either. The simplest would be to simply concatenate the different levels; I believe this is what VLFeat does (at least the documentation doesn't say it does anything more). Removing those below your contrast threshold should help, but you'll still have several thousand descriptors (depending on the size of your image). You could also compare the descriptors occurring near the same location and prune some out; it's a bit of a time-space trade-off. Generally, I've seen the bin sizes spaced (by intervals of 2, but it could be more), which should reduce the need to check for overlapping descriptors. A sketch of the concatenation is below.
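For the concatenation, a minimal sketch (the helper name appendScaleDescriptors is hypothetical; descriptors, keypoints_num and dim are the variables from the question's loop):

#include <opencv2/core.hpp>

// Append the descriptors produced at one scale to a growing matrix with one
// row per keypoint. The DSIFT filter owns the 'descriptors' buffer and
// overwrites it at the next scale, so the data is deep-copied here.
void appendScaleDescriptors(cv::Mat& all, const float* descriptors,
                            int keypoints_num, int dim)
{
    cv::Mat scale(keypoints_num, dim, CV_32F,
                  const_cast<float*>(descriptors)); // wraps the buffer, no copy yet
    all.push_back(scale.clone());                   // deep copy into 'all'
}

Calling this once per bin size inside the loop leaves all the PHOW descriptors stacked in a single (total keypoints) x dim matrix, which is easy to store or feed into a vocabulary builder.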

Convert Matlab based Ridge Segment Function into C++

I am going to perform ridge segmentation on an input image using OpenCV. On the internet, I found Matlab code that fits quite well with my goal:
function [normim, mask, maskind] = ridgesegment(im, blksze, thresh)
    im = normalise(im, 0, 1);  % normalise to have zero mean, unit std dev
    fun = inline('std(x(:))*ones(size(x))');
    stddevim = blkproc(im, [blksze blksze], fun);
    mask = stddevim > thresh;
    maskind = find(mask);
    % Renormalise image so that the *ridge regions* have zero mean, unit
    % standard deviation.
    im = im - mean(im(maskind));
    normim = im/std(im(maskind));
end
So I tried to convert it to C++. Up to now, I have only been able to finish these parts:

cv::Mat ridgeSegment(cv::Mat inputImg, int blockSize, double thresh)
{
    cv::normalize(inputImg, inputImg, 0, 1.0, cv::NORM_MINMAX, CV_8UC1);
    blkproc(inputImg, cv::Size(blockSize, blockSize), thresh);
    ... // how to do the next steps ????
}

cv::Mat blkproc(cv::Mat img, cv::Size size, double thresh)
{
    cv::Mat croppedImg;
    for (int i = 0; i < img.cols; i += size.width)
    {
        for (int j = 0; j < img.rows; j += size.height)
        {
            croppedImg = img(cv::Rect(i, j, size.width, size.height)).clone();
            //perform standard deviation calculation here???
        }
    }
    return croppedImg;
}
I don't know how to proceed further here, especially with stddevim and the later parts. Could someone explain and show me the rest? Thank you in advance.
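One possible translation of the remaining steps, as a minimal sketch: it assumes a single-channel input, and note that the Matlab normalise(im, 0, 1) used here means zero mean and unit standard deviation, not min-max scaling to [0, 1]:

#include <algorithm>
#include <opencv2/opencv.hpp>

// Sketch of ridgesegment(): blockwise standard deviation, threshold into a
// mask, then renormalise so the ridge region has zero mean and unit std dev.
void ridgeSegment(const cv::Mat& input, int blksze, double thresh,
                  cv::Mat& normim, cv::Mat& mask)
{
    cv::Mat im;
    input.convertTo(im, CV_32F);

    cv::Scalar m, s;
    cv::meanStdDev(im, m, s);
    im = (im - m[0]) / s[0];                       // normalise(im, 0, 1)

    mask = cv::Mat::zeros(im.size(), CV_8U);
    for (int y = 0; y < im.rows; y += blksze)
        for (int x = 0; x < im.cols; x += blksze)
        {
            // clip the block at the border instead of requiring the image
            // size to be a multiple of blksze
            cv::Rect r(x, y, std::min(blksze, im.cols - x),
                             std::min(blksze, im.rows - y));
            cv::meanStdDev(im(r), m, s);           // blkproc's std(x(:))
            if (s[0] > thresh)
                mask(r).setTo(255);
        }

    cv::meanStdDev(im, m, s, mask);                // statistics over the ridge region only
    normim = (im - m[0]) / s[0];
}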

How to smooth a histogram?

I want to smooth a histogram.
Therefore I tried to smooth the internal matrix of cvHistogram.
typedef struct CvHistogram
{
    int type;
    CvArr* bins;
    float thresh[CV_MAX_DIM][2]; /* for uniform histograms */
    float** thresh2;             /* for non-uniform histograms */
    CvMatND mat;                 /* embedded matrix header for array histograms */
}
CvHistogram;
I tried to smooth the matrix like this:
cvCalcHist( planes, hist, 0, 0 ); // Compute histogram
(...)
// smooth histogram with Gaussian Filter
cvSmooth( hist->mat, hist_img, CV_GAUSSIAN, 3, 3, 0, 0 );
Unfortunately, this does not work, because cvSmooth needs a CvMat as input instead of a CvMatND, and I couldn't convert the CvMatND into a CvMat (the CvMatND is 2-dimensional in my case).
Is there anybody who can help me? Thanks.
You can use the same basic algorithm used for a mean filter, just calculating the average:

for(int i = 1; i < NBins - 1; ++i)
{
    hist[i] = (hist[i - 1] + hist[i] + hist[i + 1]) / 3;
}
Optionally you can use a slightly more flexible algorithm allowing you to easily change the window size.
int winSize = 5;
int winMidSize = winSize / 2;
for(int i = winMidSize; i < NBins - winMidSize; ++i)
{
    float mean = 0;
    for(int j = i - winMidSize; j <= (i + winMidSize); ++j)
    {
        mean += hist[j];
    }
    hist[i] = mean / winSize;
}
But bear in mind that this is just one simple technique.
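One caveat: both loops above smooth in place, so hist[i - 1] has already been smoothed when it is read. A minimal out-of-place variant (assuming hist is a plain float array of NBins values copied out of the histogram):

#include <algorithm>
#include <vector>

std::vector<float> smoothed(hist, hist + NBins);       // work on a copy
for (int i = winMidSize; i < NBins - winMidSize; ++i)
{
    float sum = 0;
    for (int j = i - winMidSize; j <= i + winMidSize; ++j)
        sum += hist[j];                                // reads only original values
    smoothed[i] = sum / winSize;
}
std::copy(smoothed.begin(), smoothed.end(), hist);     // write the result back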
If you really want to do it using OpenCV tools, I recommend the OpenCV forum: http://tech.groups.yahoo.com/group/OpenCV/join
You can dramatically change the "smoothness" of a histogram by changing the number of bins you use. A good rule of thumb is to have sqrt(n) bins if you have n data points; for example, 10,000 samples would suggest about 100 bins. You might try applying this heuristic to your histogram and see if you get a better result.