CImg - Saliency with spectral approach - C++

I am trying to implement the spectral residual approach to compute the saliency of an image with CImg, but I'm having trouble getting there.
This might seem like a repost of this question (spectral residual saliency detection in C++ with CImg), but I think I fixed the two mistakes from that question (atan2 and the FFT arguments).
Here's my code:
int main(int argc, char * argv[]) {
    const char * input_file = "img/pic.png";
    CImg<float> input = CImg<float>(input_file);
    const CImg<float> mask(3,3,1,1,1.0f/9.0f);

    resize_fft(input);                   // Resize for FFT
    CImg<float> gray = any2gray(input);  // To single-channel grayscale

    CImgList<float> fft = gray.get_FFT();
    CImg<float> amp = (fft[0].get_pow(2) + fft[1].get_pow(2)).get_sqrt();
    CImg<float> amp_log = (amp + 1.0f).get_log().get_normalize(0, 255);
    CImg<float> phase = fft[1].get_atan2(fft[0]);

    CImg<float> residual = amp_log - amp_log.get_convolve(mask);
    CImg<float> real = residual.get_exp();

    CImg<float>::FFT(real, phase, true);

    real.save("img/001.png");
    real.normalize(0, 255).save("img/002.png");
    return 1;
}
Both saved pictures, 001 and 002, end up looking like noise, as if they were still in the frequency domain.
I don't know what's wrong with what I'm doing; can you guys help me?
Thanks.

Firstly, it looks like you forgot to smooth real with a Gaussian filter.
Secondly, the line CImg<float>::FFT(real, phase, true); is suspect. I don't know the CImg library, but I can follow what you are trying to do. When you do the inverse FFT, I think both the real part and the imaginary part you pass in are wrong. The formulae in the paper are somewhat misleading; reading the MATLAB code is clearer.
If you are familiar with complex numbers, you will see that computing the phase variable is not necessary here.
Here is pseudocode to replace that line:
fft[0] = fft[0] ./ amp .* residual;
fft[1] = fft[1] ./ amp .* residual;
//Inverse Fourier Transform
CImg<float>::FFT(fft[0], fft[1], true);
real = fft[0].get_pow(2) + fft[1].get_pow(2);
real.get_convolve(Gaussian filter with sigma = 8)
All the operators with a leading dot denote element-wise operations.
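For reference, here is a rough CImg translation of that pseudocode. It is an untested sketch: it reuses the fft, amp, and residual variables from the question, takes exp(residual) as the new spectral amplitude (the question's real variable), and uses CImg's blur() as an approximate Gaussian with sigma = 8.
// Sketch only: assumes fft, amp and residual already exist as in the question.
CImg<float> new_amp = residual.get_exp();            // new spectral amplitude
CImg<float> scale   = new_amp.get_div(amp + 1e-8f);  // element-wise, avoid division by zero

fft[0].mul(scale);                                   // keep the original phase,
fft[1].mul(scale);                                   // only rescale the magnitude

CImg<float>::FFT(fft[0], fft[1], true);              // inverse Fourier transform

CImg<float> saliency = fft[0].get_sqr() + fft[1].get_sqr(); // squared magnitude
saliency.blur(8.0f);                                 // quasi-Gaussian smoothing, sigma ~ 8
saliency.normalize(0, 255).save("img/saliency.png");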

Related

Unrolled label of a cap using chessboard pattern (OpenCV C++)

I'm trying to use a chessboard pattern to get information about the cylinder mapping and rectify the "distortion" so that the image shows the cap surface unrolled. I made a first test with a one-shot calibration and cv::fisheye::undistortImage to get an undistortion (two images attached).
// runCalibrationFishEye
void runCalibrationFishEye(cv::Mat& image, cv::Matx33d&, cv::Vec4d&);
cv::Mat removeFisheyeLensDist(cv::Mat&, cv::Matx33d&, cv::Vec4d&);
Note that I am not interested in calibrating the camera to get metric values; I just want to use the chessboard information to unroll the image of the cylinder surface.
The final aim is to use the rectified images of 4 cameras and to stitch them into one unrolled image.
Do I need to do a full calibration of the camera, or is there another way to get a remap of the cylinder surface?
I will try to implement this interesting unwarp method: https://dsp.stackexchange.com/questions/2406/how-to-flatten-the-image-of-a-label-on-a-food-jar/2409#2409
cap with chessboard
Rectification
I have found a similar approach to a different problem, but with similar mathematics, and it was solved without a calibration pattern (link here). It's an approximation, but the result is good enough.
The user Hammer gave an answer that helped me reach a solution. I changed the way he does the mapping, using OpenCV's remap (a sketch of how this mapping can be fed to cv::remap follows the function below). The formula to recalculate the coordinates is just as he gave it, using different values, with some preprocessing to adjust the image (rotation, zoom, and other adjustments). (Image: unrolled image.) I am now working on reducing the distortion so that it is not so pronounced at the edges, but the main question is solved.
cv::Point2f convert_pt(cv::Point2f point, int w, int h)
{
    cv::Point2f pc(point.x - w / 2, point.y - h / 2);
    float f = w;
    float r = w;
    float omega = w / 2;
    float z0 = f - sqrt(r*r - omega*omega);

    // Formula to remap the cylinder
    float zc = (2 * z0 + sqrt(4 * z0*z0 - 4 * (pc.x*pc.x / (f*f) + 1)*(z0*z0 - r*r))) / (2 * (pc.x*pc.x / (f*f) + 1));

    cv::Point2f final_point(pc.x*zc / f, pc.y*zc / f);
    final_point.x += w / 2;
    final_point.y += h / 2;
    return final_point;
}
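For completeness, here is a minimal sketch of how such a per-pixel mapping can be fed to cv::remap. The unrollWithRemap wrapper and its parameters are my own naming, not from the original answer; only convert_pt comes from the code above.
#include <opencv2/opencv.hpp>

// Hypothetical driver: builds dense remap tables from convert_pt and warps the image.
cv::Mat unrollWithRemap(const cv::Mat& src)
{
    cv::Mat map_x(src.size(), CV_32FC1);
    cv::Mat map_y(src.size(), CV_32FC1);

    for (int y = 0; y < src.rows; ++y)
        for (int x = 0; x < src.cols; ++x)
        {
            // For every destination pixel, convert_pt gives the source pixel to sample.
            cv::Point2f p = convert_pt(cv::Point2f((float)x, (float)y), src.cols, src.rows);
            map_x.at<float>(y, x) = p.x;
            map_y.at<float>(y, x) = p.y;
        }

    cv::Mat dst;
    cv::remap(src, dst, map_x, map_y, cv::INTER_LINEAR, cv::BORDER_CONSTANT);
    return dst;
}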

OpenCV 3.1 Stitch images in order they were taken

I am building an Android app to create panoramas. The user captures a set of images, and those images are sent to my native stitch function, which is based on https://github.com/opencv/opencv/blob/master/samples/cpp/stitching_detailed.cpp.
Since the images are in order, I would like to match each image only to the next image in the vector.
I found an Intel article that was doing just that, with the following code:
vector<MatchesInfo> pairwise_matches;
BestOf2NearestMatcher matcher(try_gpu, match_conf);
Mat matchMask(features.size(), features.size(), CV_8U, Scalar(0));
for (int i = 0; i < num_images - 1; ++i)
{
    matchMask.at<char>(i, i + 1) = 1;
}
matcher(features, pairwise_matches, matchMask);
matcher.collectGarbage();
The problem is, this won't compile. I'm guessing it's because I'm using OpenCV 3.1.
Then I found somewhere that this code should do the same:
int range_width = 2;
BestOf2NearestRangeMatcher matcher(range_width, try_cuda, match_conf);
matcher(features, pairwise_matches);
matcher.collectGarbage();
And for most of my samples this works fine. However, sometimes, especially when I'm stitching a large set of images (around 15), some objects appear on top of each other and in places they shouldn't.
I've also noticed that the "beginning" (left side) of the end result is not the first image in the vector either, which is strange.
I am using "orb" as features_type and "ray" as ba_cost_func. It seems I can't use SURF with OpenCV 3.1.
The rest of my initial parameters look like this:
bool try_cuda = false;
double compose_megapix = -1; //keeps resolution for final panorama
float match_conf = 0.3f; //0.3 default for orb
string ba_refine_mask = "xxxxx";
bool do_wave_correct = true;
WaveCorrectKind wave_correct = detail::WAVE_CORRECT_HORIZ;
int blend_type = Blender::MULTI_BAND;
float blend_strength = 5;
double work_megapix = 0.6;
double seam_megapix = 0.08;
float conf_thresh = 0.5f;
int expos_comp_type = ExposureCompensator::GAIN_BLOCKS;
string seam_find_type = "dp_colorgrad";
string warp_type = "spherical";
So could anyone enlighten me as to why this is not working and how I should match my features? Any help or direction would be much appreciated!
TL;DR: I want to stitch images in the order they were taken, but the code above is not working for me. How can I do that?
So I found out that the issue here is not the order in which the images are stitched, but rather the rotation that is estimated for the camera parameters in the homography-based estimator and the ray bundle adjuster.
Those rotation angles are estimated assuming a purely self-rotating camera, while my use case involves a user rotating the camera (which means there will be some translation too).
Because of that (I guess), the horizontal angles (around the Y axis) are highly overestimated, which means the algorithm considers the set of images to cover >= 360 degrees, resulting in overlapping areas that shouldn't overlap.
I still haven't found a solution for that problem, though.
matcher() takes a UMat as the mask instead of a Mat object, so try the following code:
vector<MatchesInfo> pairwise_matches;
BestOf2NearestMatcher matcher(try_gpu, match_conf);
Mat matchMask(features.size(), features.size(), CV_8U, Scalar(0));
for (int i = 0; i < num_images - 1; ++i)
{
    matchMask.at<char>(i, i + 1) = 1;
}
UMat umask = matchMask.getUMat(ACCESS_READ);
matcher(features, pairwise_matches, umask);
matcher.collectGarbage();

How to compute 2D log-chromaticity?

My goal is to remove shadows from an image. I use C++ and OpenCV. I surely lack a sufficient math background, and not being a native English speaker makes everything harder to understand.
After reading different approaches to removing shadows, I found a method which should work for me, but it relies on something called "2D chromaticity" and "2D log-chromaticity space", and even these terms seem to be used inconsistently across sources. There are many papers on the topic; a few are listed here:
http://www.cs.cmu.edu/~efros/courses/LBMV09/Papers/finlayson-eccv-04.pdf
http://www2.cmp.uea.ac.uk/Research/compvis/Papers/DrewFinHor_ICCV03.pdf
http://www.cvc.uab.es/adas/publications/alvarez_2008.pdf
http://ivrgwww.epfl.ch/alumni/fredemba/papers/FFICPR06.pdf
I tore Google apart searching for the right words and explanations. The best I found is Illumination invariant image, which did not help me much.
I tried to reproduce the formula log(G/R), log(B/R) described in the first paper, page 3, to get figures similar to Fig. 2b.
As input I used http://en.wikipedia.org/wiki/File:Gretag-Macbeth_ColorChecker.jpg
The output I get is:
My source code:
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;
int main( int argc, char** argv ) {
Mat src;
src = imread( argv[1], 1 );
if( !src.data )
{ return -1; }
Mat image( 600, 600, CV_8UC3, Scalar(127,127,127) );
int cn = src.channels();
uint8_t* pixelPtr = (uint8_t*)src.data;
for(int i=0 ; i< src.rows;i++) {
for(int j=0 ; j< src.cols;j++) {
Scalar_<uint8_t> bgrPixel;
bgrPixel.val[0] = pixelPtr[i*src.cols*cn + j*cn + 0]; // B
bgrPixel.val[1] = pixelPtr[i*src.cols*cn + j*cn + 1]; // G
bgrPixel.val[2] = pixelPtr[i*src.cols*cn + j*cn + 2]; // R
if(bgrPixel.val[2] !=0 ) { // avoid division by zero
float a= image.cols/2+50*(log((float)bgrPixel.val[0] / (float)bgrPixel.val[2])) ;
float b= image.rows/2+50*(log((float)bgrPixel.val[1] / (float)bgrPixel.val[2])) ;
if(!isinf(a) && !isinf(b))
image.at<Vec3b>(a,b)=Vec3b(255,2,3);
}
}
}
imshow("log-chroma", image );
imwrite("log-chroma.png", image );
waitKey(0);
}
What am I missing or misunderstanding?
From reading the paper Recovery of Chromaticity Image Free from Shadows via Illumination Invariance that you posted, and your code, I guess the problem is that your coordinate system (X/Y axes) is linear, while in the paper the coordinate system is log(R/G) by log(B/G).
This is the closest I can figure. Reading through this:
http://www2.cmp.uea.ac.uk/Research/compvis/Papers/DrewFinHor_ICCV03.pdf
I came across the sentence:
"Fig. 2(a) shows log-chromaticities for the 24 surfaces of a Macbeth ColorChecker Chart, (the six neutral patches all belong to the same
cluster). If we now vary the lighting and plot median values
for each patch, we see the curves in Fig. 2(b)."
If you look closely at the log-chromaticity plot, you see 19 blobs, corresponding to each of the 18 colors in the Macbeth chart, plus the sum of all the 6 grayscale targets in the bottom row:
Explanation of Log Chromaticities
With one picture, we can only get one point of each blob: we take the median value inside each target and plot it. To get the plot from the paper, we would have to create multiple images under different lighting. We might be able to do this by varying the temperature of the image in an image editor.
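As an illustration only (the patch rectangles and the helper name are my assumptions, not part of the original post), the per-patch median could be computed roughly like this, using log(R/G) and log(B/G) as in the paper:
#include <opencv2/opencv.hpp>
#include <algorithm>
#include <cmath>
#include <vector>

// Median log-chromaticity of one colour patch, given its ROI in a BGR image.
cv::Point2f medianLogChroma(const cv::Mat& bgr, const cv::Rect& patch)
{
    std::vector<float> xs, ys;
    for (int y = patch.y; y < patch.y + patch.height; ++y)
        for (int x = patch.x; x < patch.x + patch.width; ++x)
        {
            cv::Vec3b p = bgr.at<cv::Vec3b>(y, x);
            if (p[0] == 0 || p[1] == 0 || p[2] == 0) continue;  // skip zeros to avoid log(0)
            xs.push_back(std::log((float)p[2] / p[1]));          // log(R/G)
            ys.push_back(std::log((float)p[0] / p[1]));          // log(B/G)
        }
    std::nth_element(xs.begin(), xs.begin() + xs.size() / 2, xs.end());
    std::nth_element(ys.begin(), ys.begin() + ys.size() / 2, ys.end());
    return cv::Point2f(xs[xs.size() / 2], ys[ys.size() / 2]);   // one point per patch
}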
For now, I just looked at the color patches in the original image and plotted the points:
Input:
Color Patches Used
Output:
Log Chromaticity
The graph dots are not all in the same place as the paper, but I figure it's fairly close. Would someone please check my work to see if this makes sense?
In that OpenCV code I got an "undefined identifier" error for the function isinf(), and I solved it by replacing it with _finite(). That might be an issue with the Visual Studio version.
if(!isinf(a) && !isinf(b)) ----> if(_finite(a) && _finite(b))
Include this header:
#include<float.h>
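A more portable alternative (my suggestion, not part of the original answer) is std::isfinite from <cmath>, which works on both MSVC and GCC/Clang with C++11 and later:
#include <cmath>

// Portable finiteness check, usable in place of isinf()/_finite().
inline bool bothFinite(float a, float b)
{
    return std::isfinite(a) && std::isfinite(b);
}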

Finding extrinsics between cameras

I'm in the situation where I need to find the relative camera poses between two or more cameras based on image correspondences (the cameras are not at the same point). To solve this I tried the same approach as described here (code below).
cv::Mat calibration_1 = ...;
cv::Mat calibration_2 = ...;
cv::Mat calibration_target = calibration_1;
calibration_target.at<float>(0, 2) = 0.5f * frame_width;  // principal point
calibration_target.at<float>(1, 2) = 0.5f * frame_height; // principal point

auto fundamental_matrix = cv::findFundamentalMat(left_matches, right_matches, CV_RANSAC);
fundamental_matrix.convertTo(fundamental_matrix, CV_32F);
cv::Mat essential_matrix = calibration_2.t() * fundamental_matrix * calibration_1;

cv::SVD svd(essential_matrix);
cv::Matx33f w(0, -1, 0,
              1,  0, 0,
              0,  0, 1);
cv::Matx33f w_inv(0, 1, 0,
                 -1, 0, 0,
                  0, 0, 1);
cv::Mat rotation_between_cameras = svd.u * cv::Mat(w) * svd.vt; // HZ 9.19
But in most of my cases I get extremely weird results. So my next thought was to use a full-fledged bundle adjuster (which should do what I am looking for?!). Currently my only big dependency is OpenCV, and it only has an undocumented bundle adjustment implementation.
So the questions are:
Is there a bundle adjuster which has no dependencies and uses a license that allows commercial use?
Are there other easy ways to find the extrinsics?
Are objects with very different distances to the cameras a problem? (heavy parallax)
Thanks in advance
I'm also working on the same problem and facing similar issues.
Here are some suggestions:
Modify the essential matrix before decomposition:
Compute [U W Vt] = SVD(E) and use the new E' = U * diag(s, s, 0) * Vt, where s = (W(0,0) + W(1,1)) / 2 (a sketch of this step follows below).
2-stage fundamental matrix estimation:
Recalculate the fundamental matrix with the RANSAC inliers.
These steps should make the rotation estimation more robust to noise.
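A minimal sketch of that projection step with OpenCV's SVD (variable names follow the question's code, which stores the matrix as CV_32F; this is my illustration, not the answerer's code):
// Project E onto the space of valid essential matrices (two equal singular
// values, one zero) before decomposing it into R and t.
cv::SVD svd(essential_matrix, cv::SVD::FULL_UV);
float s = 0.5f * (svd.w.at<float>(0) + svd.w.at<float>(1));
cv::Mat S = (cv::Mat_<float>(3, 3) << s, 0, 0,
                                      0, s, 0,
                                      0, 0, 0);
cv::Mat essential_fixed = svd.u * S * svd.vt;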
You have to generate four different solutions and select the one with the largest number of points having positive Z coordinates (i.e., in front of both cameras). The solutions are generated by inverting the sign of the fundamental matrix and by substituting w with w_inv, which you did not do even though you calculated w_inv. Are you reusing somebody else's code?
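For reference, OpenCV 3.0+ wraps exactly this check, trying all four decompositions and keeping the one with the most points in front of both cameras. A sketch (my own, assuming both cameras share roughly the same intrinsics and the matches are the undistorted pixel coordinates from the question):
// left_matches/right_matches: matched points; calibration_1: 3x3 camera matrix.
cv::Mat E = cv::findEssentialMat(left_matches, right_matches, calibration_1,
                                 cv::RANSAC, 0.999, 1.0);
cv::Mat R, t;
cv::recoverPose(E, left_matches, right_matches, calibration_1, R, t);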

What algorithm does OpenCV's Bayer conversion use?

I would like to implement a GPU Bayer to RGB image conversion algorithm, and I was wondering what algorithm OpenCV's cvtColor function uses. Looking at the source, I see what appears to be a variable-number-of-gradients algorithm and a basic algorithm that could maybe be bilinear interpolation. Does anyone have experience with this that they could share with me, or perhaps know of GPU code to convert from Bayer to BGR format?
The source code is in imgproc/src/color.cpp. I'm looking for a link to it. Bayer2RGB_ and Bayer2RGB_VNG_8u are the functions I'm looking at.
Edit: Here's a link to the source.
http://code.opencv.org/projects/opencv/repository/revisions/master/entry/modules/imgproc/src/color.cpp
I've already implemented a bilinear interpolation algorithm, but it doesn't seem to work very well for my purposes. The picture looks ok, but I want to compute HOG features from it and in that respect it doesn't seem like a good fit.
The default is 4-way linear interpolation, or variable number of gradients if you specify the VNG version.
See ..\modules\imgproc\src\color.cpp for details.
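For illustration (my own example, not from the answer), selecting between the two in OpenCV looks roughly like this; the Bayer pattern constant depends on your sensor layout and is an assumption here:
#include <opencv2/opencv.hpp>

// bayer: single-channel CV_8UC1 raw image; pattern assumed to be BG here.
void demosaic(const cv::Mat& bayer, cv::Mat& bgr_linear, cv::Mat& bgr_vng)
{
    cv::cvtColor(bayer, bgr_linear, cv::COLOR_BayerBG2BGR);     // default (bilinear) demosaic
    cv::cvtColor(bayer, bgr_vng,    cv::COLOR_BayerBG2BGR_VNG); // variable number of gradients
}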
I submitted a simple linear CUDA Bayer->RGB(A) conversion to OpenCV; I haven't followed whether it has been accepted, but it should be in the bug tracker.
It's based on the code in the CUDA Bayer/CFA demosaicing example.
Here is a sample of how to use cv::gpu in your own code.
/*------- RG CCD, BGRA output ----------------------------*/
__global__ void bayerRG(const cv::gpu::DevMem2Db in, cv::gpu::PtrStepb out)
{
    // Note: called for every 2x2 cell, so x/y are the top-left of the cell;
    // use x+1, y+1 for the right/bottom pixels of the pair.
    // R G
    // G B

    // src
    int x = 2 * ((blockIdx.x*blockDim.x) + threadIdx.x);
    int y = 2 * ((blockIdx.y*blockDim.y) + threadIdx.y);

    uchar r, g, b;

    // 'R'
    r = (in.ptr(y)[x]);
    g = (in.ptr(y)[x-1]+in.ptr(y)[x+1]+(in.ptr(y-1)[x]+in.ptr(y+1)[x]))/4;
    b = (in.ptr(y-1)[x-1]+in.ptr(y-1)[x+1]+(in.ptr(y+1)[x-1]+in.ptr(y+1)[x+1]))/4;
    ((uchar4*)out.ptr(y))[x] = make_uchar4( b,g,r,0xff);

    // 'G' in R row
    r = (in.ptr(y)[x]+in.ptr(y)[x+2])/2;
    g = (in.ptr(y)[x+1]);
    b = (in.ptr(y-1)[x+1]+in.ptr(y+1)[x+1])/2;
    ((uchar4*)out.ptr(y))[x+1] = make_uchar4( b,g,r,0xff);

    // 'G' in B row
    r = (in.ptr(y)[x]+in.ptr(y+2)[x])/2;
    g = (in.ptr(y+1)[x]);
    b = (in.ptr(y+1)[x-1]+in.ptr(y+1)[x+2])/2;
    ((uchar4*)out.ptr(y+1))[x] = make_uchar4( b,g,r,0xff);

    // 'B'
    r = (in.ptr(y)[x]+in.ptr(y)[x+2]+in.ptr(y+2)[x]+in.ptr(y+2)[x+2])/4;
    g = (in.ptr(y+1)[x]+in.ptr(y+1)[x+2]+in.ptr(y)[x+1]+in.ptr(y+2)[x+1])/4;
    b = (in.ptr(y+1)[x+1]);
    ((uchar4*)out.ptr(y+1))[x+1] = make_uchar4( b,g,r,0xff);
}

/* called from */
extern "C" void cuda_bayer(const cv::gpu::DevMem2Db& img, cv::gpu::PtrStepb out)
{
    dim3 threads(16,16);
    dim3 grid((img.cols/2)/(threads.x), (img.rows/2)/(threads.y));

    bayerRG<<<grid,threads>>>(img,out);
    cudaThreadSynchronize();
}
Currently, to my knowledge, the best debayer out there is DFPD (directional filtering with a posteriori decision), as explained in this paper. The paper is quite explanatory, and you can easily prototype this approach in MATLAB. Here's a blog post comparing the results of DFPD to a debayer based on a linear approach. You can clearly see the improvement in artifacts, colors and sharpness.
As far as I know, at this point it is using adaptive homogeneity-directed demosaicing, explained in a paper by Hirakawa and in many other sources on the web.