SIFT orientations in OpenCV implementation - c++

In the OpenCV implementation of SIFT, each keypoint has an angle in degrees (ranging from 180 to -180) that represents its calculated orientation. Since SIFT assigns the dominant orientation of a keypoint using a histogram with 10-degree bins, how can we get this continuous range of angles? Shouldn't the values come in 10-degree steps?
Is that because of the histogram smoothing?
This is the code where keypoint.angle is assigned a value. Can you help me understand how this value is obtained?
float omax = calcOrientationHist(gauss_pyr[o*(nOctaveLayers+3) + layer],
Point(c1, r1),
cvRound(SIFT_ORI_RADIUS * scl_octv),
SIFT_ORI_SIG_FCTR * scl_octv,
hist, n);
float mag_thr = (float)(omax * SIFT_ORI_PEAK_RATIO);
for( int j = 0; j < n; j++ )
{
int l = j > 0 ? j - 1 : n - 1;
int r2 = j < n-1 ? j + 1 : 0;
if( hist[j] > hist[l] && hist[j] > hist[r2] && hist[j] >= mag_thr )
{
float bin = j + 0.5f * (hist[l]-hist[r2]) / (hist[l] - 2*hist[j] + hist[r2]);
bin = bin < 0 ? n + bin : bin >= n ? bin - n : bin;
kpt.angle = 360.f - (float)((360.f/n) * bin);
if(std::abs(kpt.angle - 360.f) < FLT_EPSILON)
kpt.angle = 0.f;
keypoints.push_back(kpt);
}
}

I think that I found the answer to my question.
A parabola is fit to the 3 histogram values closest to each peak to interpolate the peak position for better accuracy. That's why we get a continuous range of values instead of 10-degree steps.
This is a link showing how to fit a parabola to 3 points:
Curve fitting

Related

Image Rotation without cropping

Hello,
With the code below, I rotate my cv::Mat object (I'm not using any OpenCV functions apart from load/save/color conversion, as this is an academic project), and I receive a cropped image.
Rotation function:
float rads = angle*3.1415926/180.0;
float _cos = cos(-rads);
float _sin = sin(-rads);
float xcenter = (float)(src.cols)/2.0;
float ycenter = (float)(src.rows)/2.0;
for(int i = 0; i < src.rows; i++)
for(int j = 0; j < src.cols; j++){
int x = ycenter + ((float)(i)-ycenter)*_cos - ((float)(j)-xcenter)*_sin;
int y = xcenter + ((float)(i)-ycenter)*_sin + ((float)(j)-xcenter)*_cos;
if (x >= 0 && x < src.rows && y >= 0 && y < src.cols) {
dst.at<cv::Vec4b>(i ,j) = src.at<cv::Vec4b>(x, y);
}
else {
dst.at<cv::Vec4b>(i ,j)[3] = 0;
}
}
I would like to know how I can keep the full image every time I rotate it.
Am I missing something in my function?
Thanks in advance.
The rotated image usually has to be larger than the old image to store all pixel values.
Each point (x, y) is mapped to
(x', y') = (x*cos(rads) - y*sin(rads), x*sin(rads) + y*cos(rads))
An image with height h and width w, centered at (0, 0), has its corners (x, y) at
(w/2, h/2)
(w/2, -h/2)
(-w/2, h/2)
(-w/2, -h/2)
After rotation it has a new height of
h' = 2*y' = 2 * (w/2*sin(rads) + h/2*cos(rads))
and a new width of
w' = 2*x' = 2 * (w/2*cos(rads) + h/2*sin(rads))
for 0 <= rads <= pi/2. It follows that w * h <= w' * h', and for rads != k*pi/2 with k = 0, 1, 2, ... it is w * h < w' * h'.
In any case, the area of the rotated image is the same size as or larger than the area of the old image.
If you use the old size, you cut off the corners.
Example:
Your image has h=1, w=1 and rads=pi/4. You need a new image with h'=sqrt(2)=1.41421356237 and w'=sqrt(2)=1.41421356237 to store all pixel values. The point (1,1) is mapped to (0, sqrt(2)).
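The size calculation above can be sketched as a small helper. The names RotatedSize and rotatedBounds are hypothetical (they are not OpenCV API), and taking absolute values of sine and cosine is my own extension so that the formulas also hold for angles outside 0..pi/2.

```cpp
#include <cmath>

// Hypothetical result type for illustration only.
struct RotatedSize {
    int width;
    int height;
};

// Smallest integer canvas that holds a w x h image rotated by rads
// around its centre: w' = w*|cos| + h*|sin|, h' = w*|sin| + h*|cos|.
RotatedSize rotatedBounds(int w, int h, double rads)
{
    const double c = std::fabs(std::cos(rads));
    const double s = std::fabs(std::sin(rads));
    // The small epsilon keeps floating-point noise (e.g. cos(pi/2) being
    // slightly non-zero) from pushing an exact result up by one pixel.
    const double eps = 1e-9;
    RotatedSize out;
    out.width  = static_cast<int>(std::ceil(w * c + h * s - eps));
    out.height = static_cast<int>(std::ceil(w * s + h * c - eps));
    return out;
}
```

The destination image would then be allocated with this size, and the loop in the question would iterate over the destination rows and columns and use the destination centre when mapping back to the source.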

Transform images with bezier curves

I'm using this article, nonlingr, as a source to understand non-linear transformations. In the section GLYPHS ALONG A PATH, the author explains how to use a parametric curve to transform an image. I'm trying to apply a cubic Bezier curve to an image, but I have been unsuccessful. This is my code:
OUT.aloc(IN.width(), IN.height());
//get the control points...
wVector p0(values[vindex], values[vindex+1], 1);
wVector p1(values[vindex+2], values[vindex+3], 1);
wVector p2(values[vindex+4], values[vindex+5], 1);
wVector p3(values[vindex+6], values[vindex+7], 1);
//this is to calculate t based on x
double trange = 1 / (OUT.width()-1);
//curve coefficients
double A = (-p0[0] + 3*p1[0] - 3*p2[0] + p3[0]);
double B = (3*p0[0] - 6*p1[0] + 3*p2[0]);
double C = (-3*p0[0] + 3*p1[0]);
double D = p0[0];
double E = (-p0[1] + 3*p1[1] - 3*p2[1] + p3[1]);
double F = (3*p0[1] - 6*p1[1] + 3*p2[1]);
double G = (-3*p0[1] + 3*p1[1]);
double H = p0[1];
//apply the transformation
for(long i = 0; i < OUT.height(); i++){
for(long j = 0; j < OUT.width(); j++){
//t = x / width
double t = trange * j;
//apply the article given formulas
double x_path_d = 3*t*t*A + 2*t*B + C;
double y_path_d = 3*t*t*E + 2*t*F + G;
double angle = 3.14159265/2.0 + std::atan(y_path_d / x_path_d);
mapped_point.Set((t*t*t)*A + (t*t)*B + t*C + D + i*std::cos(angle),
(t*t*t)*E + (t*t)*F + t*G + H + i*std::sin(angle),
1);
//test if the point is inside the image
if(mapped_point[0] < 0 ||
mapped_point[0] >= OUT.width() ||
mapped_point[1] < 0 ||
mapped_point[1] >= IN.height())
continue;
OUT.setPixel(
long(mapped_point[0]),
long(mapped_point[1]),
IN.getPixel(j, i));
}
}
Applying this code to a 300x196 RGB image, all I get is a black screen no matter which control points I use. It is hard to find information about this kind of transformation: searching for parametric curves, all I find is how to draw them, not how to apply them to images. Can someone help me transform an image with a Bezier curve?
IMHO, applying a curve to an image sounds like using a LUT. You evaluate the curve for each possible image value and then replace each image value with the corresponding value on the curve. In other words, create a look-up table with one entry for each possible value in the image (e.g. 0, 1, ..., 255 for an 8-bit grayscale image). That is a 256x2 matrix: the first column holds the values from 0 to 255 and the second column holds the corresponding values of the curve.
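As a rough sketch of the look-up-table idea, the snippet below tabulates an arbitrary tone curve once and then remaps 8-bit pixels through the table. The names buildCurveLut and applyLut are hypothetical, and the curve is assumed to map normalized values in [0, 1] to [0, 1]; since the table index is the input value, only the second column of the matrix needs to be stored.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>
#include <functional>
#include <vector>

// Evaluates the curve once for each of the 256 possible 8-bit values.
std::array<std::uint8_t, 256> buildCurveLut(const std::function<double(double)>& curve)
{
    std::array<std::uint8_t, 256> lut{};
    for (int v = 0; v < 256; ++v) {
        const double y = curve(v / 255.0);
        const double clamped = std::min(1.0, std::max(0.0, y));  // keep result in range
        lut[v] = static_cast<std::uint8_t>(std::lround(clamped * 255.0));
    }
    return lut;
}

// Replaces every pixel value by its table entry.
void applyLut(std::vector<std::uint8_t>& pixels, const std::array<std::uint8_t, 256>& lut)
{
    for (std::uint8_t& p : pixels)
        p = lut[p];
}
```

Note that this changes pixel values, not pixel positions, so it covers tone curves rather than the geometric warp described in the question.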

Algorithm for adjustment of image levels

I need to implement in C++ an algorithm for adjusting image levels that works similarly to the Levels function in Photoshop or GIMP. That is, the inputs are: the RGB color image to be adjusted, white point, black point, midtone point, and output from/to values. But I haven't yet found any information on how to perform this adjustment. Could someone recommend an algorithm description or materials to study?
So far I've come up with the following code myself, but it doesn't give the expected result (similar to what I see, for example, in GIMP): the image becomes too light. Below is the current fragment of my code:
const int normalBlackPoint = 0;
const int normalMidtonePoint = 127;
const int normalWhitePoint = 255;
const double normalLowRange = normalMidtonePoint - normalBlackPoint + 1;
const double normalHighRange = normalWhitePoint - normalMidtonePoint;
int blackPoint = 53;
int midtonePoint = 110;
int whitePoint = 168;
int outputFrom = 0;
int outputTo = 255;
double outputRange = outputTo - outputFrom + 1;
double lowRange = midtonePoint - blackPoint + 1;
double highRange = whitePoint - midtonePoint;
double fullRange = whitePoint - blackPoint + 1;
double lowPart = lowRange / fullRange;
double highPart = highRange / fullRange;
int dim(256);
cv::Mat lut(1, &dim, CV_8U);
for(int i = 0; i < 256; ++i)
{
double p = i > normalMidtonePoint
? (static_cast<double>(i - normalMidtonePoint) / normalHighRange) * highRange * highPart + lowPart
: (static_cast<double>(i + 1) / normalLowRange) * lowRange * lowPart;
int v = static_cast<int>(outputRange * p ) + outputFrom - 1;
if(v < 0) v = 0;
else if(v > 255) v = 255;
lut.at<uchar>(i) = v;
}
....
cv::Mat sourceImage = cv::imread(inputFileName, CV_LOAD_IMAGE_COLOR);
if(!sourceImage.data)
{
std::cerr << "Error: couldn't load image " << inputFileName << "." << std::endl;
continue;
}
#if 0
const int forwardConversion = CV_BGR2YUV;
const int reverseConversion = CV_YUV2BGR;
#else
const int forwardConversion = CV_BGR2Lab;
const int reverseConversion = CV_Lab2BGR;
#endif
cv::Mat convertedImage;
cv::cvtColor(sourceImage, convertedImage, forwardConversion);
// Extract the L channel
std::vector<cv::Mat> convertedPlanes(3);
cv::split(convertedImage, convertedPlanes);
cv::LUT(convertedPlanes[0], lut, convertedPlanes[0]);
//dst.copyTo(convertedPlanes[0]);
cv::merge(convertedPlanes, convertedImage);
cv::Mat resImage;
cv::cvtColor(convertedImage, resImage, reverseConversion);
cv::imwrite(outputFileName, resImage);
Pseudocode for Photoshop's Levels Adjustment
First, calculate the gamma correction value to use for the midtone adjustment (if desired). The following roughly simulates Photoshop's technique, which applies gamma 9.99-1.00 for midtone values 0-128, and 1.00-0.01 for 128-255.
Apply gamma correction:
Gamma = 1
MidtoneNormal = Midtones / 255
If Midtones < 128 Then
MidtoneNormal = MidtoneNormal * 2
Gamma = 1 + ( 9 * ( 1 - MidtoneNormal ) )
Gamma = Min( Gamma, 9.99 )
Else If Midtones > 128 Then
MidtoneNormal = ( MidtoneNormal * 2 ) - 1
Gamma = 1 - MidtoneNormal
Gamma = Max( Gamma, 0.01 )
End If
GammaCorrection = 1 / Gamma
Then, for each channel value R, G, B (0-255) for each pixel, do the following in order.
Apply the input levels:
ChannelValue = 255 * ( ( ChannelValue - ShadowValue ) /
( HighlightValue - ShadowValue ) )
Apply the midtones:
If Midtones <> 128 Then
ChannelValue = 255 * ( Pow( ( ChannelValue / 255 ), GammaCorrection ) )
End If
Apply the output levels:
ChannelValue = ( ChannelValue / 255 ) *
( OutHighlightValue - OutShadowValue ) + OutShadowValue
Where:
All channel and adjustment parameter values are integers, 0-255 inclusive
Shadow/Midtone/HighlightValue are the input adjustment values (defaults 0, 128, 255)
OutShadow/HighlightValue are the output adjustment values (defaults 0, 255)
You should optimize things and make sure values are kept in bounds (like 0-255 for each channel)
For a more accurate simulation of Photoshop, you can use a non-linear interpolation curve if Midtones < 128. Photoshop also chops off the darkest and lightest 0.1% of the values by default.
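The pseudocode above can be folded into a single 256-entry table, since the result depends only on the input channel value. The following is a minimal sketch under that assumption; the function name buildLevelsLut is hypothetical, the clamping and the guard against a zero input range are my additions, and the non-linear interpolation and 0.1% clipping mentioned above are not modelled.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <cstdint>

// Builds a lookup table for the levels adjustment described in the pseudocode.
// All parameters are expected in 0..255 (defaults would be 0, 128, 255, 0, 255).
std::array<std::uint8_t, 256> buildLevelsLut(int shadow, int midtones, int highlight,
                                             int outShadow, int outHighlight)
{
    // Midtone value -> gamma, following the pseudocode.
    double gamma = 1.0;
    double midNormal = midtones / 255.0;
    if (midtones < 128) {
        midNormal *= 2.0;
        gamma = std::min(1.0 + 9.0 * (1.0 - midNormal), 9.99);
    } else if (midtones > 128) {
        midNormal = midNormal * 2.0 - 1.0;
        gamma = std::max(1.0 - midNormal, 0.01);
    }
    const double gammaCorrection = 1.0 / gamma;

    // Assumption: avoid division by zero when highlight == shadow.
    const double inRange = std::max(1, highlight - shadow);

    std::array<std::uint8_t, 256> lut{};
    for (int i = 0; i < 256; ++i) {
        // Input levels, normalized to 0..1 and clamped.
        double v = (i - shadow) / inRange;
        v = std::min(1.0, std::max(0.0, v));
        // Midtones.
        if (midtones != 128)
            v = std::pow(v, gammaCorrection);
        // Output levels.
        v = v * (outHighlight - outShadow) + outShadow;
        lut[i] = static_cast<std::uint8_t>(std::min(std::max(std::lround(v), 0L), 255L));
    }
    return lut;
}
```

With OpenCV, the table could be copied into a 1x256 CV_8U matrix and applied to each channel with cv::LUT, as the code in the question already does for a single plane.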
Ignoring the midtone/Gamma, the Levels function is a simple linear scaling.
All input values are first linearly scaled so that all values less than or equal to the black point are set to 0, and all values greater than or equal to the white point are set to 255.
Then all values are linearly scaled from 0/255 to the output range.
For the mid-point, it depends on what you actually mean by that.
In GIMP, there is a Gamma value. The Gamma value is a simple exponent of the input values (after restricting to the black/white points).
For Gamma == 1, the values are not changed.
For gamma < 1, the values are darkened.

How can I calculate the curvature of an extracted contour by opencv?

I did use the findcontours() method to extract contour from the image, but I have no idea how to calculate the curvature from a set of contour points. Can somebody help me? Thank you very much!
While the theory behind Gombat's answer is correct, there are some errors in the code as well as in the formulae (the denominator t+n-x should be t+n-t). I have made several changes:
* use symmetric derivatives to get more precise locations of curvature maxima
* allow a step size for the derivative calculation (can be used to reduce noise from noisy contours)
* works with closed contours
Fixes:
* return infinity as curvature if denominator is 0 (not 0)
* added square calculation in denominator
* correct checking for 0 divisor
std::vector<double> getCurvature(std::vector<cv::Point> const& vecContourPoints, int step)
{
std::vector< double > vecCurvature( vecContourPoints.size() );
if (vecContourPoints.size() < step)
return vecCurvature;
// A contour counts as closed when its first and last points are at most one pixel apart.
auto frontToBack = vecContourPoints.front() - vecContourPoints.back();
bool isClosed = ((int)std::max(std::abs(frontToBack.x), std::abs(frontToBack.y))) <= 1;
cv::Point2f pplus, pminus;
cv::Point2f f1stDerivative, f2ndDerivative;
for (int i = 0; i < vecContourPoints.size(); i++ )
{
const cv::Point2f& pos = vecContourPoints[i];
int maxStep = step;
if (!isClosed)
{
maxStep = std::min(std::min(step, i), (int)vecContourPoints.size()-1-i);
if (maxStep == 0)
{
vecCurvature[i] = std::numeric_limits<double>::infinity();
continue;
}
}
int iminus = i-maxStep;
int iplus = i+maxStep;
pminus = vecContourPoints[iminus < 0 ? iminus + vecContourPoints.size() : iminus];
pplus = vecContourPoints[iplus >= (int)vecContourPoints.size() ? iplus - (int)vecContourPoints.size() : iplus];
f1stDerivative.x = (pplus.x - pminus.x) / (iplus-iminus);
f1stDerivative.y = (pplus.y - pminus.y) / (iplus-iminus);
f2ndDerivative.x = (pplus.x - 2*pos.x + pminus.x) / ((iplus-iminus)/2*(iplus-iminus)/2);
f2ndDerivative.y = (pplus.y - 2*pos.y + pminus.y) / ((iplus-iminus)/2*(iplus-iminus)/2);
double curvature2D;
double divisor = f1stDerivative.x*f1stDerivative.x + f1stDerivative.y*f1stDerivative.y;
if ( std::abs(divisor) > 10e-8 )
{
curvature2D = std::abs(f2ndDerivative.y*f1stDerivative.x - f2ndDerivative.x*f1stDerivative.y) /
pow(divisor, 3.0/2.0 ) ;
}
else
{
curvature2D = std::numeric_limits<double>::infinity();
}
vecCurvature[i] = curvature2D;
}
return vecCurvature;
}
For me, curvature is:
k(t) = |x'(t)*y''(t) - y'(t)*x''(t)| / (x'(t)^2 + y'(t)^2)^(3/2)
where t is the position inside the contour and x(t) and y(t) return the corresponding x and y values. See here.
So, according to my definition of curvature, one can implement it this way:
std::vector< float > vecCurvature( vecContourPoints.size() );
cv::Point2f posOld, posOlder;
cv::Point2f f1stDerivative, f2ndDerivative;
for (size_t i = 0; i < vecContourPoints.size(); i++ )
{
const cv::Point2f& pos = vecContourPoints[i];
if ( i == 0 ){ posOld = posOlder = pos; }
f1stDerivative.x = pos.x - posOld.x;
f1stDerivative.y = pos.y - posOld.y;
f2ndDerivative.x = - pos.x + 2.0f * posOld.x - posOlder.x;
f2ndDerivative.y = - pos.y + 2.0f * posOld.y - posOlder.y;
float curvature2D = 0.0f;
if ( std::abs(f2ndDerivative.x) > 10e-4 && std::abs(f2ndDerivative.y) > 10e-4 )
{
curvature2D = sqrt( std::abs(
pow( f2ndDerivative.y*f1stDerivative.x - f2ndDerivative.x*f1stDerivative.y, 2.0f ) /
pow( f2ndDerivative.x + f2ndDerivative.y, 3.0 ) ) );
}
vecCurvature[i] = curvature2D;
posOlder = posOld;
posOld = pos;
}
It works on non-closed point lists as well. For closed contours, you may want to change the boundary behavior (for the first iterations).
UPDATE:
Explanation for the derivatives:
The derivative of a continuous one-dimensional function f(t) is:
f'(t) = lim_{n -> 0} (f(t+n) - f(t)) / ((t+n) - t)
But we are in a discrete space and have two discrete functions f_x(t) and f_y(t), where the smallest step for t is one:
f'(t) ≈ f(t+1) - f(t)
The second derivative is the derivative of the first derivative:
f''(t) ≈ f'(t+1) - f'(t)
Using the approximation of the first derivative, this yields:
f''(t) ≈ f(t+2) - 2*f(t+1) + f(t)
There are other approximations for the derivatives; if you search for them, you will find a lot.
Here's a python implementation mainly based on Philipp's C++ code. For those interested, more details on the derivation can be found in Chapter 10.4.2 of:
Klette & Rosenfeld, 2004: Digital Geometry
import numpy as np

def getCurvature(contour, stride=1):
    # contour is assumed to be an N x 2 NumPy array of (x, y) points
    curvature = []
    assert stride < len(contour), "stride must be shorter than length of contour"
    for i in range(len(contour)):
        # wrap the neighbour indices around the ends of the contour
        before = i - stride + len(contour) if i - stride < 0 else i - stride
        after = i + stride - len(contour) if i + stride >= len(contour) else i + stride
        f1x, f1y = (contour[after] - contour[before]) / stride
        f2x, f2y = (contour[after] - 2 * contour[i] + contour[before]) / stride**2
        denominator = (f1x**2 + f1y**2)**3 + 1e-11
        curvature_at_i = np.sqrt(4 * (f2y * f1x - f2x * f1y)**2 / denominator) if denominator > 1e-12 else -1
        curvature.append(curvature_at_i)
    return curvature
EDIT:
You can use convexityDefects from OpenCV; here's a link.
Here is a code example that finds fingers based on their contour (variable res); source.
import math
import cv2

def calculateFingers(res, drawing):  # -> finished bool, cnt: finger count
    # convexity defect
    hull = cv2.convexHull(res, returnPoints=False)
    if len(hull) > 3:
        defects = cv2.convexityDefects(res, hull)
        if defects is not None:  # avoid crashing. (BUG not found)
            cnt = 0
            for i in range(defects.shape[0]):  # calculate the angle
                s, e, f, d = defects[i][0]
                start = tuple(res[s][0])
                end = tuple(res[e][0])
                far = tuple(res[f][0])
                a = math.sqrt((end[0] - start[0]) ** 2 + (end[1] - start[1]) ** 2)
                b = math.sqrt((far[0] - start[0]) ** 2 + (far[1] - start[1]) ** 2)
                c = math.sqrt((end[0] - far[0]) ** 2 + (end[1] - far[1]) ** 2)
                angle = math.acos((b ** 2 + c ** 2 - a ** 2) / (2 * b * c))  # cosine theorem
                if angle <= math.pi / 2:  # angle less than 90 degrees, treat as a finger
                    cnt += 1
                    cv2.circle(drawing, far, 8, [211, 84, 0], -1)
            return True, cnt
    return False, 0
In my case, I used roughly the same function to estimate the bending of a board while extracting the contour.
OLD COMMENT:
I am currently working on much the same thing. There is great information in this post; I'll come back with a solution when I have it ready.
Regarding Jonasson's answer: shouldn't there be a tuple on the right side here too? I believe it won't unpack:
f1x,f1y=(contour[after]-contour[before])/stride
f2x,f2y=(contour[after]-2*contour[i]+contour[before])/stride**2

Efficient C/C++ algorithm on 2-dimensional max-sum window

I have a c[N][M] matrix where I apply a max-sum operation over a (K+1)² window. I am trying to reduce the complexity of the naive algorithm.
In particular, here's my code snippet in C++:
int N,M,K;
std::cin >> N >> M >> K;
std::pair< unsigned , unsigned > opt[N][M];
unsigned c[N][M];
// Read values for c[i][j]
// Initialize all opt[i][j] at (0,0).
for ( int i = 0; i < N; i ++ ) {
for ( int j = 0; j < M ; j ++ ) {
unsigned max = 0;
int posX = i, posY = j;
for ( int ii = i; (ii >= i - K) && (ii >= 0); ii -- ) {
for ( int jj = j; (jj >= j - K) && (jj >= 0); jj -- ) {
// Ignore the (i,j) position
if (( ii == i ) && ( jj == j )) {
continue;
}
if ( opt[ii][jj].second > max ) {
max = opt[ii][jj].second;
posX = ii;
posY = jj;
}
}
}
opt[i][j].first = opt[posX][posY].second;
opt[i][j].second = c[i][j] + opt[posX][posY].first;
}
}
The goal of the algorithm is to compute opt[N-1][M-1].
Example: for N = 4, M = 4, K = 2 and:
c[N][M] = 4 1 1 2
          6 1 1 1
          1 2 5 8
          1 1 8 0
... the result should be opt[N-1][M-1] = {14, 11}.
The running complexity of this snippet is however O(N M K²). My goal is to reduce the running time complexity. I have already seen posts like this, but it appears that my "filter" is not separable, probably because of the sum operation.
More information (optional): this is essentially an algorithm which develops the optimal strategy in a "game" where:
Two players lead a single team in a N × M dungeon.
Each position of the dungeon has c[i][j] gold coins.
Starting position: (N-1,M-1) where c[N-1][M-1] = 0.
The active player chooses the next position to move the team to, from position (x,y).
The next position can be any of (x-i, y-j), i <= K, j <= K, i+j > 0. In other words, they can move only left and/or up, up to a step K per direction.
The player who just moved the team gets the coins in the new position.
The active player alternates each turn.
The game ends when the team reaches (0,0).
Optimal strategy for both players: maximize their own sum of gold coins, if they know that the opponent is following the same strategy.
Thus, opt[i][j].first represents the coins of the player who will now move from (i,j) to another position. opt[i][j].second represents the coins of the opponent.
Here is an O(N * M) solution.
Let's fix the lower row (r). If the maximum over all rows between r - K and r is known for every column, this problem reduces to the well-known sliding window maximum problem. So it is possible to compute the answer for a fixed row in O(M) time.
Let's iterate over all rows in increasing order. For each column, finding the maximum over all rows between r - K and r is a sliding window maximum problem, too. Processing each column takes O(N) time for all rows.
The total time complexity is O(N * M).
However, there is one issue with this solution: it does not exclude the (i, j) element. It is possible to fix this by running the algorithm described above twice (with K * (K + 1) and (K + 1) * K windows) and then merging the results (a (K + 1) * (K + 1) square without a corner is the union of two rectangles of size K * (K + 1) and (K + 1) * K).
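For readers unfamiliar with the sliding window maximum, here is a minimal sketch of the deque-based technique and of the two-pass (columns, then rows) combination described above. It works on a static matrix and does not exclude the (i, j) element; the function names slidingMax and windowMax2D are hypothetical. In the actual dynamic program the values of opt are produced row by row, so the same deques would have to be maintained incrementally rather than over a finished matrix.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// For every i, the maximum of a[max(0, i-k) .. i], in O(n) total.
std::vector<unsigned> slidingMax(const std::vector<unsigned>& a, std::size_t k)
{
    std::vector<unsigned> out(a.size());
    std::deque<std::size_t> dq;  // indices whose values decrease from front to back
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Drop candidates that can never be a maximum again.
        while (!dq.empty() && a[dq.back()] <= a[i])
            dq.pop_back();
        dq.push_back(i);
        // Drop candidates that fell out of the window [i-k, i].
        while (dq.front() + k < i)
            dq.pop_front();
        out[i] = a[dq.front()];
    }
    return out;
}

// Maximum over the (k+1) x (k+1) window ending at each cell, in O(N * M).
std::vector<std::vector<unsigned>> windowMax2D(
    const std::vector<std::vector<unsigned>>& m, std::size_t k)
{
    const std::size_t rows = m.size();
    const std::size_t cols = rows ? m[0].size() : 0;
    std::vector<std::vector<unsigned>> out(rows, std::vector<unsigned>(cols));

    // Pass 1: per column, maximum over rows r-k .. r.
    std::vector<unsigned> column(rows);
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r)
            column[r] = m[r][c];
        const std::vector<unsigned> best = slidingMax(column, k);
        for (std::size_t r = 0; r < rows; ++r)
            out[r][c] = best[r];
    }
    // Pass 2: per row, maximum of the column maxima over columns c-k .. c.
    for (std::size_t r = 0; r < rows; ++r)
        out[r] = slidingMax(out[r], k);
    return out;
}
```

Each index enters and leaves a deque at most once, which is where the linear time per row or column comes from.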