I have been trying to learn more about matrices in OpenGL; right now I'm stuck trying to understand where things are stored inside the modelview matrix (translation, scaling, rotations, etc.). This is obviously very important, as understanding matrices is one of the first steps towards fully understanding modern OpenGL.
I have been trying to find some good articles, and I've currently found 2: (1,2)
However, I still don't understand where the values are stored; any help is very appreciated (links, pointers, etc.).
Here is a reference for how the different (affine) transformation matrices are constructed:
Identity:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
Translate (x, y, z):
1 0 0 x
0 1 0 y
0 0 1 z
0 0 0 1
Scale (sx, sy, sz):
sx 0 0 0
0 sy 0 0
0 0 sz 0
0 0 0 1
Rotate about the x axis (by angle t):
1 0 0 0
0 cos(t) -sin(t) 0
0 sin(t) cos(t) 0
0 0 0 1
Rotate about the y axis (by angle t):
cos(t) 0 sin(t) 0
0 1 0 0
-sin(t) 0 cos(t) 0
0 0 0 1
Rotate about the z axis (by angle t):
cos(t) -sin(t) 0 0
sin(t) cos(t) 0 0
0 0 1 0
0 0 0 1
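A practical note on where these values end up in memory: OpenGL consumes matrices as a flat, column-major array of 16 floats, so the translation components x, y, z from the matrix above land at indices 12, 13 and 14 of that array. Here is a minimal sketch of building such an array; the function name is my own, and only the column-major layout expected by calls like glLoadMatrixf is assumed.

// Sketch: fill a column-major float[16] with a translation matrix,
// the layout that glLoadMatrixf / glMultMatrixf expect.
#include <cstring>

void makeTranslation(float m[16], float x, float y, float z)
{
    std::memset(m, 0, 16 * sizeof(float));   // start from all zeros
    m[0] = m[5] = m[10] = m[15] = 1.0f;      // identity diagonal
    m[12] = x;                               // the fourth column holds the
    m[13] = y;                               // translation in column-major
    m[14] = z;                               // storage
}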
-2 -1 0
-1 1 1
0 1 2
This is a 3x3 emboss kernel. How should I write it as a 5x5?
As I understand it, these filters take directional differences (see the Wikipedia page).
We can decompose your filter into these directions:
 0 -1  0      0  0  0     -2  0  0
 0  0  0     -1  0  1      0  0  0
 0  1  0      0  0  0      0  0  2
So, I think you can expand it over these three directions, keeping the same emphasis on each:
 0  0 -1  0  0      0  0  0  0  0     -2  0  0  0  0
 0  0 -1  0  0      0  0  0  0  0      0 -2  0  0  0
 0  0  0  0  0     -1 -1  0  1  1      0  0  0  0  0
 0  0  1  0  0      0  0  0  0  0      0  0  0  2  0
 0  0  1  0  0      0  0  0  0  0      0  0  0  0  2
So, the final kernel would be
-2 0 -1 0 0
0 -2 -1 0 0
-1 -1 1 1 1
0 0 1 2 0
0 0 1 0 2
Maybe you can also try interpolating the filter coefficients marked as x:
-2 x -1 0 0
x -2 -1 0 0
-1 -1 1 1 1
0 0 1 2 x
0 0 1 x 2
The simple solution to fitting any lower-dimensional convolution kernel into a higher-dimensional matrix of the same rank is to surround it with zero weights. This is especially true for a concept like embossing, which is arguably more interested in the immediate direction of change than in the rate at which that change is changing. That is, for this embossing matrix:
-2 -1 0
-1 1 1
0 1 2
you could equivalently use this in 5x5:
0 0 0 0 0
0 -2 -1 0 0
0 -1 1 1 0
0 0 1 2 0
0 0 0 0 0
Granted, this will give you a different visual effect than a kernel with every part of the matrix filled in; but sometimes, especially with edge detection, immediate clarity is more important, and we aren't always displaying the result. If this were something like a Gaussian blur kernel, a greater range could improve the effect, but embossing isn't that different conceptually from Sobel-Feldman, and it may be better to keep it tight.
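To make the zero-padding idea concrete, here is a small sketch that centres a 3x3 kernel inside a 5x5 kernel of zero weights; the function name and array layout are my own choices for illustration.

// Sketch: centre a 3x3 kernel inside a 5x5 kernel of zeros.
void embed3x3(const float k3[3][3], float k5[5][5])
{
    for (int y = 0; y < 5; ++y)
        for (int x = 0; x < 5; ++x)
            k5[y][x] = 0.0f;                 // zero weights everywhere

    for (int y = 0; y < 3; ++y)
        for (int x = 0; x < 3; ++x)
            k5[y + 1][x + 1] = k3[y][x];     // original kernel in the centre
}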
I want to rotate an object so that its face side points towards the center of another one, but I have some problems with it: when I try to rotate an object towards another one that lies on the X axis, it works properly [first two screenshots], but when I try to rotate it as shown in the screenshots, everything breaks down [second two screenshots].
Before1:
1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
After1:
0 0 -1 0
-0 -1 0 0
1 0 0 0
0 0 0 1
Before2:
0 0 -1 0
-0 -1 0 0
1 0 0 0
0 0 0 1
After2:
0 0 -0.707107 0
0.5 -0.5 0 0
0.707107 -0.707107 0 0
0 0 0 1
Here's my code:
void ConcreteObject::faceObjectTo(ConcreteObject otherObject) {
    Vector<double> temp = {0, 1, 0};    // world "up" hint

    // Direction from this object towards the other one
    Vector<double> forward = otherObject.getCenter() - this->getCenter();
    forward.normalize();

    Vector<double> right = temp.cross(forward);
    right.normalize();

    Vector<double> up = forward.cross(right);

    // Write the basis vectors into the upper-left 3x3 of the transform
    Matrix<double> newMatrix = this->getTransformMatrix().getCurrentState();
    newMatrix(0, 0) = right[0];
    newMatrix(0, 1) = right[1];
    newMatrix(0, 2) = right[2];
    newMatrix(1, 0) = up[0];
    newMatrix(1, 1) = up[1];
    newMatrix(1, 2) = up[2];
    newMatrix(2, 0) = forward[0];
    newMatrix(2, 1) = forward[1];
    newMatrix(2, 2) = forward[2];

    TransformMatrix newObjectMatrix(newMatrix);
    this->setTransformMatrix(newObjectMatrix);
}
You need to normalize right; there's no reason for temp and forward to be orthogonal, so even if they are unit vectors, their cross product need not be a unit vector.
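For reference, here is a self-contained sketch of the same basis construction with the normalization made explicit; the Vec3 type and helper names are my own stand-ins for the question's Vector<double> class, and the degenerate case where forward is parallel to the up hint is left unhandled, as in the original.

// Sketch: build an orthonormal right/up/forward basis pointing from one
// centre towards another. Vec3 and the helpers are illustrative only.
#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 cross(const Vec3& a, const Vec3& b) {
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

Vec3 normalized(const Vec3& v) {
    double len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return { v.x / len, v.y / len, v.z / len };
}

void buildBasis(const Vec3& from, const Vec3& to,
                Vec3& right, Vec3& up, Vec3& forward) {
    forward = normalized({ to.x - from.x, to.y - from.y, to.z - from.z });
    Vec3 hint = { 0.0, 1.0, 0.0 };              // world "up" hint
    right = normalized(cross(hint, forward));   // hint and forward are not
                                                // orthogonal in general, so
                                                // this must be re-normalized
    up = cross(forward, right);                 // already unit length, since
                                                // forward and right are
                                                // orthogonal unit vectors
}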
I am writing a Tetris clone; it is almost done except for the collisions. For example, in order to move the Z piece I use this method:
void PieceZ::movePieceDown()
{
    drawBlock(x1, y1++);
    drawBlock(x2, y2++);
    drawBlock(x3, y3++);
    drawBlock(x4, y4++);
}
and in order to rotate a piece I use a setter (because the coordinates are private). For rotation I use a 90-degree clockwise rotation matrix. For example, if I want to move (x1, y1), and (x2, y2) is my origin, then to get the x and y of the new block:
newX = (y1-y2) + x2;
newY = (x2-x1) + y2 + 1;
That works to some extent; the piece starts out as:
0 0 0 0
0 1 1 0
0 0 1 1
0 0 0 0
Then as planned it rotates to:
0 0 0 1
0 0 1 1
0 0 1 0
0 0 0 0
And then it rotates to Piece S:
0 0 0 0
0 0 1 1
0 1 1 0
0 0 0 0
And then it just alternates between the second and the third stages.
My calculations are wrong, but I can't figure out where; I just need a little hint.
OK, here is how it should go (roughly):
Determine the point you want to rotate the piece about (this could be the upper or lower corner, or the center) and call it origin.
Calculate the new x: newX = y - origin.y;
Calculate the new y: newY = -x + origin.x;
This should work (I got the idea from Wikipedia and rotation matrices: https://en.wikipedia.org/wiki/Transformation_matrix).
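As a sketch of those steps in code, with the pivot added back at the end so the result is an absolute board coordinate (the struct and function names are mine, and whether this reads as clockwise or counter-clockwise on screen depends on which way your y axis points):

// Sketch: rotate a cell 90 degrees about an origin cell.
struct Cell { int x, y; };

Cell rotate90(Cell p, Cell origin)
{
    int relX = p.x - origin.x;                   // offset from the pivot
    int relY = p.y - origin.y;
    // apply the rotation matrix to the offset, then translate back
    return { origin.x + relY, origin.y - relX };
}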
I am trying to index the sparse array mlf with keys such as BEpos and BEneg, one key per row. The problem is that most commands are not meant to deal with such large input: bin2dec requires a clean binary string without spaces, the regexp hack fails with too many rows, and so on.
How can I work with sparse keys to access the sparse data?
Example
K>> mlf=sparse([],[],[],2^31,1);
BEpos=Cg(pos,:)
BEpos =
(1,1) 1
(2,3) 1
(2,4) 1
K>> mlf(bin2dec(num2str(BEpos)))=1
Error using bin2dec (line 36)
Binary string must be 52 bits or less.
K>> num2str(BEpos)
ans =
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
K>> bin2dec(num2str('1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'))
Error using bin2dec (line 36)
Binary string must be 52 bits or less.
K>> regexprep(num2str(BEpos),'[^\w'']','')
Error using regexprep
The 'STRING' input must be a one-dimensional array
of char or cell arrays of strings.
Manually it works:
K>> mlf(bin2dec('1000000000000000000000000000000'))
ans =
All zero sparse: 1-by-1
Consider a different approach using manual binary to decimal conversions:
pows = pow2(size(BEpos,2)-1 : -1 : 0);
inds = uint32(BEpos*pows.')
I haven't benchmarked this, but it might work faster than bin2dec and cell arrays.
How it works
This is pretty simple: the powers of 2 are calculated and stored in pows (assuming the MSB is in the leftmost position). Then they are multiplied by the bits in the matching positions and summed to produce the corresponding decimal values.
Try to index with this:
inds = uint32( bin2dec(cellstr(num2str(BEpos,'%d'))) );
Could you help me find the right algorithm for image resizing? I have an image of a number. The maximum size is 200x200, and I need to get an image of size 15x15 or even less. The image is monochrome (black and white) and the result should be the same. That's the info about my task.
I've already tried one algorithm; here it is:
// xscale, yscale - decrease/increase rate
for (int f = 0; f <= 49; f++)
{
    for (int g = 0; g <= 49; g++) // 49+1 - final size
    {
        xpos = (int)f * xscale;
        ypos = (int)g * yscale;
        picture3[f][g] = picture4[xpos][ypos];
    }
}
But it won't work for decreasing an image, which is my primary target.
Could you help me find an algorithm that could solve this problem (the quality doesn't have to be perfect, and speed doesn't matter)? Some information about it would be great too, considering that I'm a newbie. Of course, a short piece of C/C++ code (or a library) would be perfect as well.
Edit:
I've found an algorithm. Will it be suitable for compressing from 200 to 20?
The general approach is to filter the input down to the smaller size, then threshold to convert back to monochrome. The easiest filter to implement is a simple average, and it often produces OK results. The sinc filter is theoretically the best, but it's impractical to implement and has ringing artifacts which are often undesirable. Many other filters are available, such as Lanczos or tent (the generalized form of bilinear).
Here's a version of an average filter combined with thresholding. It assumes picture4 is the input with pixel values of 0 or 1, and the output picture3 is in the same format. I also assumed that x is the least significant dimension, which is opposite to the usual mathematical notation and opposite to the coordinates in your question.
int thumbwidth = 15;
int thumbheight = 15;
double xscale = (thumbwidth + 0.0) / width;
double yscale = (thumbheight + 0.0) / height;
double threshold = 0.5 / (xscale * yscale);
double yend = 0.0;
for (int f = 0; f < thumbheight; f++) // y on output
{
    double ystart = yend;
    yend = (f + 1) / yscale;
    if (yend >= height) yend = height - 0.000001;
    double xend = 0.0;
    for (int g = 0; g < thumbwidth; g++) // x on output
    {
        double xstart = xend;
        xend = (g + 1) / xscale;
        if (xend >= width) xend = width - 0.000001;
        double sum = 0.0;
        for (int y = (int)ystart; y <= (int)yend; ++y)
        {
            double yportion = 1.0;
            if (y == (int)ystart) yportion -= ystart - y;
            if (y == (int)yend) yportion -= y + 1 - yend;
            for (int x = (int)xstart; x <= (int)xend; ++x)
            {
                double xportion = 1.0;
                if (x == (int)xstart) xportion -= xstart - x;
                if (x == (int)xend) xportion -= x + 1 - xend;
                sum += picture4[y][x] * yportion * xportion;
            }
        }
        picture3[f][g] = (sum > threshold) ? 1 : 0;
    }
}
I've now tested this code. Here's the input 200x200 image, followed by a nearest-neighbor reduction to 15x15 (done in Paint Shop Pro), followed by the results of this code. I'll leave you to decide which is more faithful to the original; the difference would be much more obvious if the original had some fine detail.
To properly downscale an image, you should divide it into square blocks of pixels and then use something like bilinear interpolation to find the right color for the single pixel that replaces each NxN block.
Since I'm not so good at the math involved, I'm not going to try to give you an example of how the code would look. Sorry :(
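For what it's worth, here is a rough sketch of the block idea under the simplifying assumption that the output size divides the input size evenly; it averages each block and thresholds the result, which is a simpler stand-in for full bilinear interpolation. All names and the row-major 0/1 buffer layout are my own choices.

// Sketch: shrink a binary image by averaging square blocks and thresholding.
// Assumes oldw / neww and oldh / newh divide evenly; src holds 0/1 values.
void downscaleByBlocks(const unsigned char* src, int oldw, int oldh,
                       unsigned char* dst, int neww, int newh)
{
    int bw = oldw / neww;                        // block width
    int bh = oldh / newh;                        // block height
    for (int y = 0; y < newh; ++y) {
        for (int x = 0; x < neww; ++x) {
            int sum = 0;
            for (int by = 0; by < bh; ++by)
                for (int bx = 0; bx < bw; ++bx)
                    sum += src[(y * bh + by) * oldw + (x * bw + bx)];
            // set the output pixel if at least half the block is set
            dst[y * neww + x] = (2 * sum >= bw * bh) ? 1 : 0;
        }
    }
}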
Since you're fine with using a library, you could look into the ImageMagick C++ bindings.
You could also output the image in a simple format like PBM, and then call the ImageMagick convert command to resize it:
system("convert input.pbm -resize 10x10 -compress none output.pbm");
Sample input file (note: you don't need to use a new line for each row):
P1
20 20
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 0 0 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The output file:
P1
10 10
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 1 1 0 1 1 0
0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 1 1 0 1 1 0 0 0 0 0 0 1 1 1 1
1 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0
I've found an implementation of bilinear interpolation, in C code.
Assuming that:
a - pointer to the primary array (which we need to stretch/compress)
oldw - primary width
oldh - primary height
b - pointer to the secondary array (which we get after compressing/stretching)
neww - secondary width
newh - secondary height
#include <stdio.h>
#include <math.h>
#include <sys/types.h>

void resample(void *a, void *b, int oldw, int oldh, int neww, int newh)
{
    int i;
    int j;
    int l;
    int c;
    float t;
    float u;
    float tmp;
    float d1, d2, d3, d4;
    u_int p1, p2, p3, p4; /* nearby pixels */
    u_char red, green, blue;

    for (i = 0; i < newh; i++) {
        for (j = 0; j < neww; j++) {
            tmp = (float) (i) / (float) (newh - 1) * (oldh - 1);
            l = (int) floor(tmp);
            if (l < 0) {
                l = 0;
            } else {
                if (l >= oldh - 1) {
                    l = oldh - 2;
                }
            }
            u = tmp - l;

            tmp = (float) (j) / (float) (neww - 1) * (oldw - 1);
            c = (int) floor(tmp);
            if (c < 0) {
                c = 0;
            } else {
                if (c >= oldw - 1) {
                    c = oldw - 2;
                }
            }
            t = tmp - c;

            /* coefficients */
            d1 = (1 - t) * (1 - u);
            d2 = t * (1 - u);
            d3 = t * u;
            d4 = (1 - t) * u;

            /* nearby pixels: a[i][j] */
            p1 = *((u_int*)a + (l * oldw) + c);
            p2 = *((u_int*)a + (l * oldw) + c + 1);
            p3 = *((u_int*)a + ((l + 1) * oldw) + c + 1);
            p4 = *((u_int*)a + ((l + 1) * oldw) + c);

            /* color components */
            blue = (u_char)p1 * d1 + (u_char)p2 * d2 + (u_char)p3 * d3 + (u_char)p4 * d4;
            green = (u_char)(p1 >> 8) * d1 + (u_char)(p2 >> 8) * d2 + (u_char)(p3 >> 8) * d3 + (u_char)(p4 >> 8) * d4;
            red = (u_char)(p1 >> 16) * d1 + (u_char)(p2 >> 16) * d2 + (u_char)(p3 >> 16) * d3 + (u_char)(p4 >> 16) * d4;

            /* new pixel R G B */
            *((u_int*)b + (i * neww) + j) = (red << 16) | (green << 8) | (blue);
        }
    }
}
I hope it will be useful for other users. Nevertheless, I still doubt whether it will work in my situation (when not stretching, but compressing, an array). Any ideas?
I think you need interpolation. There are a lot of algorithms; for example, you can use bilinear interpolation.
If you use Win32, then the StretchBlt function may help.
The StretchBlt function copies a bitmap from a source rectangle into a destination rectangle, stretching or compressing the bitmap to fit the dimensions of the destination rectangle, if necessary. The system stretches or compresses the bitmap according to the stretching mode currently set in the destination device context.
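A hedged sketch of how the call might look for this task; the device contexts are placeholders for whatever you already have your source and destination bitmaps selected into, and HALFTONE is chosen because it averages source pixels when shrinking.

// Sketch: shrink a 200x200 bitmap in hdcSrc down to 15x15 in hdcDst.
#include <windows.h>

void shrinkWithStretchBlt(HDC hdcDst, HDC hdcSrc)
{
    SetStretchBltMode(hdcDst, HALFTONE);   // average source pixels when shrinking
    SetBrushOrgEx(hdcDst, 0, 0, nullptr);  // recommended after setting HALFTONE
    StretchBlt(hdcDst, 0, 0, 15, 15,       // destination rectangle
               hdcSrc, 0, 0, 200, 200,     // source rectangle
               SRCCOPY);
}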
One approach to downsizing a 200x200 image to, say, 100x100 would be to take every 2nd pixel along each row and column. I'll leave you to roll your own code for downsizing to a size which is not a divisor of the original size, and I provide no warranty as to the suitability of this approach for your problem.
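A minimal sketch of that decimation, assuming even source dimensions and a row-major 0/1 buffer (the names are mine):

// Sketch: halve a binary image by keeping every 2nd pixel in each direction.
void halveByDecimation(const unsigned char* src, int oldw, int oldh,
                       unsigned char* dst)
{
    for (int y = 0; y < oldh / 2; ++y)
        for (int x = 0; x < oldw / 2; ++x)
            dst[y * (oldw / 2) + x] = src[(2 * y) * oldw + (2 * x)];
}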