x,y,z to vector offset - c++

I know this may sound stupid but I'm going crazy with this XD
I'm loading an image (with ImageMagick) into a 1D vector, so that I have something like:
012345678...
RGBRGBRGB...
Where 0, 1, 2, ... are obviously the indexes of the vector, and R, G, and B are respectively the red byte, green byte, and blue byte.
So I have a WIDTH x HEIGHT x 3 byte vector.
Now, let's say I want to access the x,y,z byte, where z is the index of the color: what is the transformation formula to get a linear offset into the vector?

This expression produces an index to color component z at pixel (x,y):
((y * WIDTH) + x) * 3 + z
Assumptions are:
Data is placed in row-major order.
No padding/alignment bytes are used between rows.
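For example, a minimal sketch of the lookup under those assumptions (the function name, WIDTH and the vector name are placeholders, not the asker's code):
#include <cstddef>
#include <vector>

// Index of color component z (0 = R, 1 = G, 2 = B) at pixel (x, y).
inline std::size_t rgb_offset(std::size_t x, std::size_t y, std::size_t z, std::size_t width)
{
    return (y * width + x) * 3 + z;
}

// Usage: unsigned char green = pixels[rgb_offset(x, y, 1, WIDTH)];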

Assuming your data is stored as a series of rows (not a crazy assumption), you can find byte x,y,z at y*WIDTH*3 + 3*x + z

Coordinates of a pixel between two images

I am looking for a solution to easily compute the pixel coordinate from two images.
Question: If you take the following code, how could I compute the pixel coordinate that changed from the "QVector difference" ? Is it possible to have an (x,y) coordinate and find on the currentImage which pixel it represents ?
char *previousImage;
char *currentImage;
QVector<long> difference;
for(int i = 0 ; i < CurrentImageSize; i++)
{
    // Check if pixels are the same (we can also do it with RGB values, this is just for the example)
    if(*previousImage != *currentImage)
    {
        difference.push_back(i); // remember the offset of the byte that changed
    }
    currentImage++;
    previousImage++;
}
EDIT:
More information about this topic:
The image is in RGB format
The width, the height and the bpp of both images are known
I have a pointer to the bytes representing the image
The main objective here is to clearly know what the new value of a pixel that changed between the two images is, and to know which pixel it is (its coordinates)
There is not enough information to answer, but I will try to give you some idea.
You have declared char *previousImage;, which implies to me that you have a pointer to the bytes representing an image. You need more than that to interpret the image.
You need to know the pixel format. You mention RGB, so for the time being let's assume that the image uses 3 bytes for each pixel and the order is RGB.
You need to know the width of the image.
Given the above 2, you can calculate the "Row Stride", which is the number of bytes that a row takes up. This is usually the "bytes per pixel" * "image width", but it is typically padded out to be divisible by 4. So 3 bpp and a width of 15 would be 45 bytes + 3 bytes of padding to make the row stride 48.
Given that, if you have an index into the image data, you first integer-divide it by the row stride to get the row (Y coordinate).
The X coordinate is the remainder (index mod the row stride) integer-divided by the bytes per pixel.
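As a rough sketch of that reverse mapping (rowStride and bytesPerPixel are assumed to be known, as described above):
// Recover (x, y) from a byte index into the image buffer.
void index_to_xy(long index, int rowStride, int bytesPerPixel, int& x, int& y)
{
    y = static_cast<int>(index / rowStride);                   // integer division gives the row
    x = static_cast<int>((index % rowStride) / bytesPerPixel); // remainder gives the column
}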
From what I understand, you want to compute the displacement or motion that occurred between two images. E.g. for each pixel I(x, y, t=previous) in previousImage, you want to know where it went in currentImage, and what its new coordinate I(x, y, t=current) is.
If that is the case, then it's called motion estimation, or measuring the optical flow. There are many algorithms for that, which rely on more or less complex hypotheses depending on the objects you observe in the image sequence.
The simplest hypothesis is that if you follow a moving pixel I(x, y, t) in the scene you observe, its luminance will remain constant over time. In other words, dI(x, y, t) / dt = 0.
Since I(x, y, t) is a function of three parameters (space and time) with two unknowns, and there is only one equation, this is an ill-posed problem with no easy solution. Many of the algorithms add an additional hypothesis so that the problem can be solved with a unique solution.
You can use existing libraries that will do this for you; one of the most popular is OpenCV.
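If you go the OpenCV route, a minimal sketch with its dense Farneback flow might look like this (it assumes previousMat and currentMat are same-sized BGR cv::Mat frames built from your raw buffers; it is not the asker's code):
#include <opencv2/opencv.hpp>

cv::Mat dense_flow(const cv::Mat& previousMat, const cv::Mat& currentMat)
{
    cv::Mat prevGray, currGray, flow;
    cv::cvtColor(previousMat, prevGray, cv::COLOR_BGR2GRAY);
    cv::cvtColor(currentMat, currGray, cv::COLOR_BGR2GRAY);
    // One 2D displacement vector per pixel.
    cv::calcOpticalFlowFarneback(prevGray, currGray, flow, 0.5, 3, 15, 3, 5, 1.2, 0);
    // flow.at<cv::Point2f>(y, x) is the estimated displacement of pixel (x, y).
    return flow;
}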

How to get a value from CV_8UC4 matrix

I successfully create and fill a matrix with depth and RGB data from the Kinect V2 libfreenect2 library like so:
cv::Mat(registered.height, registered.width, CV_8UC4, registered.data).copyTo(cpu_depth);
cv::imshow("depth", cpu_depth);
I believe this matrix is equivalent to [X,Y,Z,R,G,B,A] for each point within the image. How do I access the unsigned char values within the matrix?
I have tried like this:
uchar xValue = cpu_depth.at(cv::Point(20, 20))[0];
but it doesn't compile and I feel I am missing something very obvious.
I figured it out. You need to tell at() the element type - four one-byte channels - via the template argument. So to correctly access points within the matrix you do this:
uchar xValue = cpu_depth.at<cv::Vec4b>(cv::Point(20, 20))[0];
This matrix is NOT equivalent to [X,Y,Z,R,G,B,A] for each point. This matrix is a 2-dimensional array of cv::Vec4b elements (i.e. cv::Vec<uchar, 4> elements - one uchar element per channel). Each element can be (R, G, B, A) or (x, y, z, val) or something else - it's just 4 values at position (x, y).
Thus, to access the element at position (x, y) for the desired channel you can use any of the following options:
cpu_depth.at<cv::Vec4b>(cv::Point(x, y))[channel] - get channel value at point (x, y);
cpu_depth.at<cv::Vec4b>(y, x)[channel] - get the channel value at point (x, y); the matrix's first index is the row, which is why y comes first and then x;
*(cpu_depth.ptr<uchar>(y) + 4 * x + channel) - dereference a pointer into the y-th row at the x-th column, i.e. the value at position (x, y).
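For instance, a minimal check (assuming cpu_depth really is CV_8UC4) that all three accessors return the same value:
#include <opencv2/opencv.hpp>
#include <cassert>

void check_access(const cv::Mat& cpu_depth, int x, int y, int channel)
{
    uchar viaPoint  = cpu_depth.at<cv::Vec4b>(cv::Point(x, y))[channel];
    uchar viaRowCol = cpu_depth.at<cv::Vec4b>(y, x)[channel];       // row first, then column
    uchar viaPtr    = *(cpu_depth.ptr<uchar>(y) + 4 * x + channel); // raw pointer into row y
    assert(viaPoint == viaRowCol && viaRowCol == viaPtr);
}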

screen.h header file method confusion

// Obtain the stride (the number of bytes between pixels on different rows)
screen_get_buffer_property_iv(mScreenPixelBuffer, SCREEN_PROPERTY_STRIDE, &mStride);
I don't understand what the first line means about there being bytes between pixels on different rows. The second line is the function call through which the stride is obtained.
If we have a rectangular bunch of pixels (a screen, bitmap, or some such), there must be a way for a program to calculate the position of a pixel. Let's call this sort of bunch of pixels a "surface".
The surface can be split into individual pixels, and we could just put them in a very long row and number them from 0 to some large number (e.g. a 1280 x 1024 screen would have 1310720 pixels). But if you show this long row of pixels on a screen, it makes more sense to talk about lines of pixels that are 1280 pixels long, with 1024 rows of them.
Now, let's say we want to draw a line from pixel 100,100 to 100,200. We can easily write that as:
int i;
for(i = 0; i < 100; i++)
{
    setpixel(surface, 100, 100+i, colour);
}
Now, if we want to implement setpixel, what do we need to do? One thing would be to translate our x, y coordinates (100, 100+i) into a location in our "long row of pixels".
The general formula tends to be (x + y * width) * bytes_per_pixel. So if we have a 32bpp image (four bytes per pixel), that would make (100 + (100+i) * 1280) * 4
However, to make it easier to design the graphics chip, there are often limits like "the width of a surface must be a multiple of X", where X is usually 16, 32, 64 or some other power of 2. Sometimes it has to be a power of two directly (for example, textures in early OpenGL can only be 2^n x 2^n pixels in size - you don't have to USE the entire texture). And this is where stride comes in.
Say we want to have a bitmap of 100 x 100 pixels, but the graphics chip that we use to draw the bitmap to the screen has a rule that surfaces MUST be a multiple of 32 pixels wide. So we make something like this:
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
XXXXXXXXXX...
The X's here represent the actual pixels (10 per X) in our bitmap, and the ... represents the 28 pixels of "waste" we have to add to make the graphics chip happy.
Now the formula using the width doesn't work, because from the point of view of the software creating the bitmap the width is 100 pixels. We need to change the math to make up for the "extra space at the end of each row of pixels":
(x + y * stride) * bytes_per_pixel
Now, the stride is 128, but the width is 100 pixels.
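A minimal setpixel sketch along those lines (the Surface struct and its field names are hypothetical; it assumes a 32bpp surface stored row-major with the stride measured in pixels, as above):
struct Surface {
    unsigned char* data;     // start of the pixel buffer
    int stride;              // pixels per row in memory, including the padding
    int bytes_per_pixel;     // 4 for a 32bpp surface
};

void setpixel(Surface& s, int x, int y, unsigned int colour)
{
    unsigned char* p = s.data + (x + y * s.stride) * s.bytes_per_pixel;
    p[0] = colour & 0xFF;             // assumes little-endian BGRA-style byte packing
    p[1] = (colour >> 8) & 0xFF;
    p[2] = (colour >> 16) & 0xFF;
    p[3] = (colour >> 24) & 0xFF;
}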
Stride here refers to array stride, the number of bytes between memory locations that correspond to the beginning of adjacent rows of an array, in this case of pixels.
In a fully packed array, the stride equals the size of an individual pixel multiplied by the number of pixels in the row. For performance reasons, arrays are frequently aligned so that each row takes a "round" number of bytes, typically a multiple of a power of two. The byte size of the row, aka the stride, cannot be computed from the other array parameters and must be known in order to correctly calculate the memory position of an arbitrary pixel.

Using X,Y Coords With A Bitmap Image

I have a C/C++ program running on my Linux box that loads a 24-bit bitmap, reads the two headers and then stores the image data in a char* variable. I have verified this works by dumping that variable's contents into a raw binary file and comparing it to the original bitmap plus offset. I used the code from HERE unmodified; it takes care of reordering into RGB and the bottom-up layout.
Now if I have a list of coordinates like X, Y, Width, Height, how the heck do I translate these into byte offsets into my image?!
In MY CODE you see that I am calculating the width of one scanline and the glyph location to find Y, then adding a scanline for each y+1. Similarly for X I am iterating three bytes at a time. And finally I store those three bytes sequentially into my temporary character array.
In truth I do not need the pixel data, as the glyph is 0xFF or 0x00 with no smoothing. I included it to make sure my bits were accounted for.
HERE is the image I am using.
EDIT: --------------------------------------------
As mentioned below, my math was a bit quirky. Fixed the line in the i,j,k loop to:
tmpChar[i][j][k] = img.data[(((Y+j) * imgWidth) + (X + i)) * imgBPP + k];
As for my program's output, see HERE. As you can see it loads the bitmap fine and the header info is proper, but when I try to display the contents of the tmpChar array it's all 0xFF (I used a signed int so 0xFF = -1 and 0x00 = +0).
The layout in memory of the image is (ignoring that I might have reversed R, G and B):
[R of pixel 0] [G of pixel 0] [B of pixel 0] ....... [B of (0, imgWidth-1)] [R of pixel (1, 0)] .....
So to calculate the offset of any given pixel: offset = ((Y * imgWidth) + X) * imgBPP + colorByte.
Giving, for your inner loop, as far as I can tell and assuming your X and Y for the character are correct:
tmpChar[i][j][k] = img.data[(((Y+j) * imgWidth) + (X + i)) * imgBPP + k];
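Put together, the copy loops would look something like this sketch (img, imgWidth, imgBPP, tmpChar, X and Y are the asker's names; the glyph dimensions are assumed):
// Copy a glyphWidth x glyphHeight block whose top-left corner is at (X, Y).
for (int j = 0; j < glyphHeight; ++j)
    for (int i = 0; i < glyphWidth; ++i)
        for (int k = 0; k < imgBPP; ++k)
            tmpChar[i][j][k] = img.data[(((Y + j) * imgWidth) + (X + i)) * imgBPP + k];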
I guess that the pixels are stored in an upside-down order in memory, as is usual with the BMP file format:
Normally pixels are stored "upside-down" with respect to normal image
raster scan order, starting in the lower left corner, going from left
to right, and then row by row from the bottom to the top of the image
So your code may be reading the wrong block of pixels.
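If that is the case, a possible fix is simply to flip the row index when computing the offset (using the asker's imgHeight/imgWidth/imgBPP names):
// Row 0 in the file is the bottom row of the image, so flip Y before indexing.
long offset = (((imgHeight - 1 - (Y + j)) * imgWidth) + (X + i)) * imgBPP + k;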

Working out positions with array indexes

I have an array that represents a grid
For the sake of this example we will start the array at 1 rather than 0, because I realized this after doing the picture and can't be bothered to edit it.
In this example blue would have an index of 5, green an index of 23 and red 38
Each color represents an object, and the array index represents where the object is. I have implemented very simple gravity, whereby if the grid cell underneath, x + (WIDTH * (y + 1)), is empty, then that cell becomes occupied by this object and the cell the object was in becomes empty.
This all works well in its current form, but what I want to do is make it so that red is the gravity point, so that in this example, blue will move to array index 16 and then 27.
This is not too bad, but how would the object be able to work out dynamically where to move, as in the example of the green grid? How can I get it to move to the correct index?
Also, what would be the best way to iterate through the array to 'find' the location of red? I should also note that red won't always be at 38
Any questions please ask, also thank you for your help.
This sounds very similar to line rasterization. Just imagine the grid to be a grid of pixels. Now when you draw a line from the green point to the red point, the pixels/cells that the line will pass are the cells that the green point should travel along, which should indeed be the shortest path from the green point to the red point along the discrete grid cells. You then just stop once you encounter a non-empty grid cell.
Look for Bresenham's algorithm as THE school book algorithm for line rasterization.
And for searching the red point, just iterate over the array linearly until you find it, then keep track of its grid position, like William already suggested in his answer.
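A minimal integer Bresenham sketch (assuming zero-based (x, y) grid cells; stop walking the returned cells as soon as you hit a non-empty one):
#include <cstdlib>
#include <utility>
#include <vector>

std::vector<std::pair<int, int>> bresenham(int x0, int y0, int x1, int y1)
{
    std::vector<std::pair<int, int>> cells;
    int dx = std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    while (true) {
        cells.emplace_back(x0, y0);       // cell visited by the line
        if (x0 == x1 && y0 == y1) break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
    }
    return cells;
}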
x = x position
y = y position
cols = number of columns across in your grid
(y * cols) + x = absolute index in the array for any x, y
you could generalize this in a function:
int get_index(int x, int y, int gridcols)
{
    return (gridcols * y) + x;
}
It should be noted that this works for ZERO BASED INDICES.
This is assuming I am understanding what you're talking about at all...
As for the second question, for any colored element you have, you should keep a value in memory (possibly stored in a structure) that keeps track of its position so you don't have to search for it at all.
struct _THING {
    int xpos;
    int ypos;
};
Using the get_index() function, you could find the index of the grid cell below it by calling it like this:
index_below = get_index(thing.xpos, thing.ypos + 1, gridcols);
thing.ypos++; // increment the thing's y now since it has moved down
simple...
IF YOU WANT TO DO IT IN REVERSE, as in finding the x,y position from the array index, you can use the modulus operator and division.
ypos = array_index / total_cols; // division without remainder
xpos = array_index % total_cols; // gives the remainder
You could generalize this in a function like this:
// x and y are reference parameters used to return the values
void get_positions_from_index(int array_index, int total_columns, int& x, int& y)
{
    y = array_index / total_columns;
    x = array_index % total_columns;
}
Whenever you're referring to an array index, it must be zero-based. However, when you are referring to the number of columns, that value will be 1-based for the calculations. x and y positions will also be zero based.
Probably easiest would be to work entirely in a system of (x,y) coordinates to calculate gravity and switch to the array coordinates when you finally need to lookup and store objects.
In your example, consider (2, 4) (red) to be the center of gravity; (5, 1) (blue) needs to move in the direction (2-5, 4-1) == (-3, 3) by some distance n. You get to decide how simple you want n to be -- it could be that you move your objects to an adjoining element, including diagonals, so blue moves to (5-1, 1+1) == (4, 2). Or perhaps you could move objects by some scalar multiple of the unit vector that describes the direction you need to move. (Say, heavier objects move further because the attraction of gravity is stronger. Or lighter objects move further because they have less inertia to overcome. Or objects move further the closer they are to the gravity well, because gravity is an inverse-square law.)
Once you've sorted out the virtual coordinates of your universe, convert your numbers (4, 2) back to an array index via the same simple linear formula (y * columns + x, adjusted for your 1-based indexing) -- or just use multidimensional arrays and truncate your floating-point results to get your array indexes.
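A minimal sketch of the "adjoining element, including diagonals" option, assuming zero-based (x, y) grid coordinates (sign() and step_toward() are hypothetical helpers):
// Returns -1, 0 or +1 depending on the sign of v.
int sign(int v) { return (v > 0) - (v < 0); }

// Move (x, y) one grid step (including diagonals) toward the gravity point (gx, gy).
void step_toward(int& x, int& y, int gx, int gy)
{
    x += sign(gx - x);
    y += sign(gy - y);
}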