I recently implemented a linear sampled gaussian blur based on this article: Linear Sampled Gaussian Blur
It generally came out well; however, there appears to be slight aliasing on text and thinner collections of pixels. I'm pretty stumped as to what is causing this: is it an issue with my shader or weight calculations, or is it an inherent drawback of using this method?
I'd like to add that I don't run into this issue when I sample each pixel regularly instead of using bilinear filtering.
Any insights are much appreciated. Here's a code sample of how I work out my weights:
int support = int(sigma * 3.0f);
float total = 0.0f;

weights.push_back(exp(-(0*0)/(2*sigma*sigma)) / (sqrt(2*constants::pi)*sigma));
total += weights.back();
offsets.push_back(0);

for (int i = 1; i <= support; i++)
{
    float w1 = exp(-(i*i)/(2*sigma*sigma)) / (sqrt(2*constants::pi)*sigma);
    float w2 = exp(-((i+1)*(i+1))/(2*sigma*sigma)) / (sqrt(2*constants::pi)*sigma);
    weights.push_back(w1 + w2);
    total += 2.0f * weights[i];
    offsets.push_back((i * w1 + (i + 1) * w2) / weights[i]);
}

for (int i = 0; i < support; i++)
{
    weights[i] /= total;
}
And here is the fragment shader (there is another vertical version of this shader too):
void main()
{
    vec3 acc = texture2D(tex_object, v_tex_coord.st).rgb * weights[0];

    for (int i = 1; i < NUM_SAMPLES; i++)
    {
        acc += texture2D(tex_object, v_tex_coord.st + (vec2(offsets[i], 0.0) / tex_size)).rgb * weights[i];
        acc += texture2D(tex_object, v_tex_coord.st - (vec2(offsets[i], 0.0) / tex_size)).rgb * weights[i];
    }

    gl_FragColor = vec4(acc, 1.0);
}
Here is a screenshot depicting the issue:
This looks like a correct gaussian blur to me. The extent to which text is disrupted depends on your sigma. What value are you using?
Also I would check the scaling matrix for the projection you are using.
If you want to blur without affecting text and thin pixel lines, you might think of:
- compositing the result with the output of a mild high-pass filter
- using a smaller sigma
- changing the shape of the kernel so it's not Gaussian: rather than exp(-i*i/(s*s)), you might try a function with higher excess kurtosis. You could try a linear up/down function (see the sketch below), or one of the functions listed on this page instead: http://en.wikipedia.org/wiki/Kurtosis . They will all lead to blurs with varying degrees of disruption to fine detail.
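For the linear up/down option, a rough sketch could look like this (untested, reusing sigma, support, weights and total from your code; the same offsets/normalisation logic would still apply):

// Triangular ("linear up/down") falloff as an alternative to the Gaussian weights.
std::vector<float> weights;
int support = int(sigma * 3.0f);
float total = 0.0f;
for (int i = 0; i <= support; ++i)
{
    float w = 1.0f - float(i) / float(support + 1); // 1 at the centre, falling linearly towards 0
    weights.push_back(w);
    total += (i == 0) ? w : 2.0f * w;               // off-centre taps are sampled on both sides
}
for (float& w : weights)
    w /= total;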
This is an inherent issue with the bilinear filtering. It's unavoidable.
I have a problem when recomputing the surface normals of a mesh in open3d. The problem is that the estimation is not good enough and I don't know how to make it better. From the image below it cannot be seen that the mesh has a hole in the middle of the belly.
However, if the same mesh is seen from the side it clearly has a hole.
If I then use MeshLab to do the normal estimation, the hole can suddenly be clearly seen. MeshLab supports 4 different functions for the normal estimation of a mesh, but the results are more or less the same no matter which function I use.
Here is the same mesh after estimating normals with MeshLab.
I find it very strange that open3d does not even come close to the accuracy of MeshLab's normal estimation, and I believe that it is most likely because I am missing some important calculation before using open3d's normal estimation function.
Here is the code I use for the normal estimation in open3d:
void ReconstructionSystem::constructMeshDeformation(glm::vec3& intersectionPosition, glm::vec3& robotPosition) {
    double depthOfProbe = glm::distance(intersectionPosition.z, robotPosition.z);
    double affectedArea = 0.015 * (depthOfProbe * 100.0); // 0.015 is an arbitrary value; the function simply works well with it
    if (affectedArea > 0.08) {
        affectedArea = 0.08;
    }

    deformatedMesh = std::make_shared<open3d::geometry::TriangleMesh>(*final_mesh);
    int i = 0;
    for (const Eigen::Vector3d& vertex : deformatedMesh->vertices_) {
        glm::vec3 vex = glm::vec3(vertex.x(), vertex.y(), vertex.z());
        double dist = glm::distance(vex, intersectionPosition);
        if (dist < affectedArea) {
            double ratio = dist / affectedArea;
            double deformationAmount = glm::cos((2.0 * M_PI * ratio) / 4.0);
            deformatedMesh->vertices_.at(i).z() -= depthOfProbe * deformationAmount;
        }
        i++;
    }
    *deformatedMesh = deformatedMesh->ComputeVertexNormals();
}
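Something like the following clean-up pass is the kind of "important calculation" I suspect might be missing before the normal computation. These are standard open3d::geometry::TriangleMesh methods, but I have not verified that they help with this particular mesh:

// Hypothetical clean-up before estimating normals (untested on this mesh).
auto mesh = std::make_shared<open3d::geometry::TriangleMesh>(*final_mesh);
mesh->RemoveDuplicatedVertices();    // merge duplicated vertices so neighbouring faces actually share them
mesh->RemoveDuplicatedTriangles();
mesh->RemoveDegenerateTriangles();   // drop zero-area faces that skew the averaged normals
mesh->ComputeTriangleNormals(true);  // per-face normals first
mesh->ComputeVertexNormals(true);    // then averaged, normalized per-vertex normals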
I'm attempting to improve performance of the OpenCV lanczos interpolation algorithm for applying homography transformations to astronomical images, as it is prone to ringing artefacts around stars in some images.
My approach is to apply homography twice, once using lanczos and once using bilinear filtering which is not susceptible to ringing, but doesn't perform as well at preserving detail. I then use the bilinear-interpolated output as a guide image, and clamp the lanczos-interpolated output to the guide if it undershoots the guide by more than a given percentage.
I have working code (below) but have 2 questions:
It doesn't seem optimal to iterate across elements in the Mat. Is there a better way of doing the compare and replace loop using OpenCV Mat methods?
My overall approach is computationally expensive - I'm applying homography to the entire Mat twice. Is there an overall better approach to preventing deringing of lanczos interpolation? (Rewriting the entire algorithm plus all the various optimisations that OpenCV makes available is not an option for me.)
warpPerspective(in, out, H, Size(target_rx, target_ry), interpolation, BORDER_TRANSPARENT);
if (interpolation == OPENCV_LANCZOS4) {
    int count = 0;
    // factor sets how big an undershoot can be tolerated
    double factor = 0.75;
    // Create guide image
    warpPerspective(in, guide, H, Size(target_rx, target_ry), OPENCV_LINEAR, BORDER_TRANSPARENT);
    // Compare the two, replace out pixels with guide pixels if too far out
    for (int i = 0; i < out.rows; i++) {
        const double* outi = out.ptr<double>(i);
        const double* guidei = guide.ptr<double>(i);
        for (int j = 0; j < out.cols; j++) {
            if (outi[j] < guidei[j] * factor) {
                out.at<double>(i, j) = guidei[j];
                count++;
            }
        }
    }
}
With a steer from Christoph Rackwitz, the answer was surprisingly simple:
compare(out, (guide * factor), mask, CMP_LT);
guide.copyTo(out, mask);
Thanks :)
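For completeness, here is roughly how it slots in, in place of the per-pixel loop (single-channel Mats assumed, as above):

cv::Mat mask;
// mask becomes 255 wherever the Lanczos output undershoots the bilinear guide by more than factor
cv::compare(out, guide * factor, mask, cv::CMP_LT);
// clamp only those pixels to the guide value
guide.copyTo(out, mask);
// the old count is simply the number of masked pixels
int count = cv::countNonZero(mask);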
In my program, I am downscaling an image of 500px or larger to an extreme level of approx 16px-32px. The source image is user-specified so I do not have control over its size. As you can imagine, few pixel interpolations hold up and inevitably the result is heavily aliased.
I've tried bilinear, bicubic and square average sampling. The square average sampling actually provides the most decent results but the smaller it gets, the larger the sampling radius has to be. As a result, it gets quite slow - slower than the other interpolation methods.
I have also tried an adaptive square average sampling so that the smaller it gets the greater the sampling radius, while the closer it is to its original size, the smaller the sampling radius. However, it produces problems and I am not convinced this is the best approach.
So the question is: What is the recommended type of pixel interpolation that is fast and works well on such extreme levels of downscaling?
I do not wish to use a library so I will need something that I can code by hand and isn't too complex. I am working in C++ with VS 2012.
Here's some example code I've tried as requested (hopefully without errors from my pseudo-code cut and paste). This performs a 7x7 average downscale and although it's a better result than bilinear or bicubic interpolation, it also takes quite a hit:
// Sizing control
ctl(0): "Resize",Range=(0,800),Val=100

// Variables
float fracx, fracy;
int Xnew, Ynew, p, q, Calc;
int x, y, z, p1, q1, i, j;

// New image dimensions
Xnew = image->width * ctl(0) / 100;
Ynew = image->height * ctl(0) / 100;

for (y = 0; y < image->height; y++){            // rows
    for (x = 0; x < image->width; x++){         // columns
        p1 = (int)x * image->width / Xnew;
        q1 = (int)y * image->height / Ynew;
        for (z = 0; z < 3; z++){                // channels
            Calc = 0;                           // reset the accumulator for each channel
            for (i = -3; i <= 3; i++) {
                for (j = -3; j <= 3; j++) {
                    Calc += (int)(src(p1 - i, q1 - j, z));
                } //j
            } //i
            Calc /= 49;
            pset(x, y, z, Calc);
        } // channels
    } // columns
} // rows
Thanks!
The first point is to use pointers to your data. Never use indexed access at every pixel: when you write src(p1-i,q1-j,z) or pset(x, y, z, Calc), how much computation is being done? Use pointers to the data and manipulate those.
Second: your algorithm is wrong. You don't want an average filter; you want to lay a grid over your source image and, for every grid cell, compute the average and put it in the corresponding pixel of the output image.
The specific solution should be tailored to your data representation, but it could be something like this:
#include <algorithm>   // std::transform
#include <cstring>     // memset
#include <functional>  // std::divides
#include <vector>

std::vector<uint32_t> accum(Xnew);
std::vector<uint32_t> count(Xnew);
uint32_t *paccum, *pcount;
uint8_t* pin = /*pointer to input data*/;
uint8_t* pout = /*pointer to output data*/;

for (int dr = 0, sr = 0, w = image->width, h = image->height; sr < h; ++dr) {
    memset(paccum = accum.data(), 0, Xnew * 4);
    memset(pcount = count.data(), 0, Xnew * 4);
    while (sr * Ynew / h == dr) {
        paccum = accum.data();
        pcount = count.data();
        for (int dc = 0, sc = 0; sc < w; ++sc) {
            *paccum += *pin;   // accumulate the source pixel into the current grid cell
            *pcount += 1;
            ++pin;
            if (sc * Xnew / w > dc) {
                ++dc;
                ++paccum;
                ++pcount;
            }
        }
        sr++;
    }
    std::transform(begin(accum), end(accum), begin(count), pout, std::divides<uint32_t>());
    pout += Xnew;
}
This was written using my own library (still in development) and it seems to work, but I changed the variable names afterwards to make it simpler here, so I don't guarantee anything!
The idea is to have a local buffer of 32 bit ints which can hold the partial sum of all pixels in the rows which fall in a row of the output image. Then you divide by the cell count and save the output to the final image.
The first thing you should do is set up a performance evaluation system to measure how much any change impacts performance.
As said previously, you should not use indexes but pointers, for a (probably) substantial speed-up, and you should not simply average, since a basic averaging of pixels is essentially a blur filter.
I would highly advise you to rework your code to use "kernels". A kernel is the matrix representing the ratio of each pixel used. That way, you will be able to test different strategies and optimize quality.
Example of kernels:
https://en.wikipedia.org/wiki/Kernel_(image_processing)
Upsampling/downsampling kernel:
http://www.johncostella.com/magic/
Note: from the code it seems you apply a 3x3 kernel, though it was initially done as a 7x7 kernel. The equivalent 3x3 kernel, as posted, would be:
[1 1 1]
[1 1 1] * 1/9
[1 1 1]
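To make this concrete, applying such a normalized 3x3 kernel at one output pixel could look something like this (reusing your src()/pset() accessors as they appear in the question; untested):

// Normalized 3x3 box kernel (weights sum to 1).
const float kernel[3][3] = {
    { 1.0f/9, 1.0f/9, 1.0f/9 },
    { 1.0f/9, 1.0f/9, 1.0f/9 },
    { 1.0f/9, 1.0f/9, 1.0f/9 },
};
float acc = 0.0f;
for (int i = -1; i <= 1; ++i)
    for (int j = -1; j <= 1; ++j)
        acc += kernel[i + 1][j + 1] * src(p1 + i, q1 + j, z);
pset(x, y, z, (int)(acc + 0.5f)); // round to the nearest integer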
I tried to create a shader that paints all edges black, as you might know from cel shading. I've googled a lot and found many articles and source code on how to create black outlines. Unfortunately, I do not understand most of them:
I found this article about feature edge rendering and tried it like this. Unfortunately, only the silhouette is black, not the edges that lie within the mesh. The same goes for this article.
Then I found this article about the Frei-Chen edge detector, but I have no idea how the whole thing works, even after studying the description for quite a while.
Could someone give me some help how to program such a shader?
EDIT: I do not use textures for my meshes.
Since I got a few downvotes for being too unspecific, I want to refer to the Frei-Chen edge detector. Here's the fragment shader code from Rastergrid:
#version 330 core

uniform sampler2D image;

out vec4 color;

void main(void)
{
    mat3 I;
    float cnv[9];
    vec3 sample;

    /* fetch the 3x3 neighbourhood and use the RGB vector's length as intensity value */
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            sample = texelFetch(image, ivec2(gl_FragCoord) + ivec2(i - 1, j - 1), 0).rgb;
            I[i][j] = length(sample);
        }

    /* calculate the convolution values for all the masks */
    for (int i = 0; i < 9; i++) {
        float dp3 = dot(G[i][0], I[0]) + dot(G[i][1], I[1]) + dot(G[i][2], I[2]);
        cnv[i] = dp3 * dp3;
    }

    float M = (cnv[0] + cnv[1]) + (cnv[2] + cnv[3]);
    float S = (cnv[4] + cnv[5]) + (cnv[6] + cnv[7]) + (cnv[8] + M);

    color = vec4(sqrt(M/S));
}
I skipped the G[9] matrix since this would blow up the code too much.
So I would be very thankful if somebody could tell me how the assignment
color = vec4(sqrt(M/S));
is supposed to work, since sqrt(M/S) returns a single float that is passed to a vec4() constructor. Thanks!
This is discussed in the GLSL specification. Constructing a vec4 from a single scalar produces a vec4 with each component set to that scalar, so vec4(sqrt(M/S)) is equivalent to vec4(v, v, v, v) where v = sqrt(M/S).
5.4.2 Vector and Matrix Constructors
Constructors can be used to create vectors or matrices from a set of scalars, vectors, or matrices. This includes the ability to shorten vectors.
If there is a single scalar parameter to a vector constructor, it is used to initialize all components of the constructed vector to that scalar’s value
How this is useful, I could not say. Duplicating data across multiple channels of an image is a big waste of memory bandwidth...
Basically, I want to detect a fault in an image using logistic regression. I'm hoping to get some feedback on my approach, which is as follows:
For training:
- Take a small section of the image marked "bad" and "good"
- Greyscale them, then break them up into a series of 5*5 pixel segments
- Calculate the histogram of pixel intensities for each of these segments
- Pass the histograms along with the labels to the Logistic Regression class for training

For classification: break the whole image into 5*5 segments and predict "good"/"bad" for each segment.
Using the sigmoid function, the hypothesis is:
1 / (1 + e^(-xθ))
Where x is the input values and theta (θ) is the weights. I use gradient descent to train it. My code for this is:
void LogisticRegression::Train(float **trainingSet, float *labels, int m)
{
    float tempThetaValues[m_NumberOfWeights];
    for (int iteration = 0; iteration < 10000; ++iteration)
    {
        // Reset the temp values for theta.
        memset(tempThetaValues, 0, m_NumberOfWeights * sizeof(float));
        float error = 0.0f;
        // For each training example in the set
        for (int trainingExample = 0; trainingExample < m; ++trainingExample)
        {
            float *x = trainingSet[trainingExample];
            float y = labels[trainingExample];
            // Partial derivative of the cost function.
            float h = Hypothesis(x) - y;
            for (int i = 0; i < m_NumberOfWeights; ++i)
            {
                tempThetaValues[i] += h * x[i];
            }
            float cost = h - y; // Actual J(theta), Cost(x,y), keeps giving NaN; use MSE for now
            error += cost * cost;
        }
        // Update the weights using batch gradient descent.
        for (int theta = 0; theta < m_NumberOfWeights; ++theta)
        {
            m_pWeights[theta] = m_pWeights[theta] - 0.1f * tempThetaValues[theta];
        }
        printf("Cost on iteration[%d] = %f\n", iteration, error);
    }
}
Where sigmoid and the hypothesis are calculated using:
float LogisticRegression::Sigmoid(float z) const
{
    return 1.0f / (1.0f + exp(-z));
}

float LogisticRegression::Hypothesis(float *x) const
{
    float z = 0.0f;
    for (int index = 0; index < m_NumberOfWeights; ++index)
    {
        z += m_pWeights[index] * x[index];
    }
    return Sigmoid(z);
}
And the final prediction is given by:
int LogisticRegression::Predict(float *x)
{
    return Hypothesis(x) > 0.5f;
}
As we are using a histogram of intensities, the input and weight arrays are 255 elements. My hope is to use it on something like a picture of an apple with a bruise and use it to identify the bruised parts. The (normalized) histograms for the whole bruised and apple training sets look something like this:
For the "good" sections of the apple (y=0):
For the "bad" sections of the apple (y=1):
I'm not 100% convinced that using the intensities alone will produce the results I want, but even so, using it on a clearly separable data set isn't working either. To test it I passed it a labeled, completely white image and a completely black image. I then ran it on the small image below:
Even on this image it fails to identify any segments as being black.
Using MSE I see that the cost converges downwards to a point where it remains: for the black and white test it starts at about 250 and settles on 100. The apple chunks start at about 4000 and settle on 1600.
What I can't tell is where the issues are.
Is the approach sound but the implementation broken? Is logistic regression the wrong algorithm to use for this task? Is gradient descent not robust enough?
I forgot to answer this... Basically the problem was in my histograms, which weren't being memset to 0 when they were generated. As to the overall question of whether logistic regression on greyscale images was a good solution, the answer is no. Greyscale just didn't provide enough information for good classification. Using all colour channels was a bit better, but I think the complexity of the problem I was trying to solve (bruises in apples) was a bit much for simple logistic regression on its own. You can see the results on my blog here.
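For reference, the histogram generation needs its bins zeroed before accumulating; a sketch of that step (the helper name and signature here are illustrative, not my actual code):

#include <cstring>

// Builds a normalised 256-bin intensity histogram for one greyscale segment.
void buildHistogram(const unsigned char* segment, int numPixels, float* histogram /* 256 bins */)
{
    std::memset(histogram, 0, 256 * sizeof(float)); // the step that was originally missing
    for (int p = 0; p < numPixels; ++p)
        histogram[segment[p]] += 1.0f;              // count each intensity value
    for (int bin = 0; bin < 256; ++bin)
        histogram[bin] /= (float)numPixels;         // normalise so the bins sum to 1
}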