Why should pixel processing iterate over (h, w) rather than (w, h) or (x, y)? - C++

Some sample code about image processing using OpenCV gives something like this:
for(i=0;i<height;i++)
{
    for(j=0;j<width;j++)
    {
        if(pointPolygonTest(Point(i,j),myPolygon))
        {
            // do some processing
        }
    }
}
In the iteration, why do we loop over the height first and then the width? And why is the Point constructed as (height, width), that is, (y, x)?

The ranges [0, height) and [0, width) are the boundaries of your working area.
This code tests which pixels of the whole image are inside the polygon myPolygon.
The word "whole" means you should check every pixel of your image, so you iterate from 0 to height for Y and from 0 to width for X.

Actually here, the row/column convention is used to iterate over the whole image.
height = Number of Rows
width = Number of Columns
The image is being accessed row-wise. The outer loop iterates over the rows of the image and the inner loop iterates over the columns. So i is the current row and j is the current column of the image.
The inner loop processes a complete row of the image.
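To make the row/column versus (x, y) relationship concrete, here is a minimal sketch that does not use OpenCV itself. The `Pt` struct and the `insideBox` test are hypothetical stand-ins for `cv::Point` and the polygon test. It shows that with i as the row and j as the column, the pixel's point is (x = j, y = i). Since `cv::Point` also takes x first, the equivalent OpenCV call would presumably be `Point(j, i)` rather than `Point(i, j)`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for cv::Point: x is the column, y is the row.
struct Pt {
    int x;
    int y;
};

// Hypothetical region test: an axis-aligned box [x0, x1) x [y0, y1)
// used here in place of a real polygon test.
bool insideBox(Pt p, int x0, int y0, int x1, int y1) {
    return p.x >= x0 && p.x < x1 && p.y >= y0 && p.y < y1;
}

// Visit every pixel row by row and collect those inside the box.
std::vector<Pt> pixelsInBox(int width, int height,
                            int x0, int y0, int x1, int y1) {
    assert(width >= 0 && height >= 0);
    std::vector<Pt> hits;
    for (int i = 0; i < height; ++i) {      // i = row    = y
        for (int j = 0; j < width; ++j) {   // j = column = x
            Pt p{j, i};                     // note the order: (x, y) = (j, i)
            if (insideBox(p, x0, y0, x1, y1)) {
                hits.push_back(p);
            }
        }
    }
    return hits;
}
```

Iterating rows in the outer loop also matches how image rows are usually laid out in memory, so the inner loop walks adjacent bytes.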

Related

How to iterate through specific elements in a vector C++?

I'm making a game with C++ and SFML and was wondering if there's a way to iterate through specific elements in a vector. I have a vector of tiles which makes up the game world, but depending on the game map's size, (1000 x 1000 tiles) iterating through all of them seems very inefficient. I was wondering if there was a way to say "for each tile in vector of tiles that (fits a condition)". Right now, my code for drawing these tiles looks like this:
void Tile::draw()
{
    for (const auto& TILE : tiles)
    {
        if (TILE.sprite.getGlobalBounds().intersects(Game::drawCuller.getGlobalBounds()))
        {
            Game::window.draw(TILE.sprite);
        }
    }
}
As you can see, I'm only drawing the tiles in the view (or drawculler). If the vector is too large, it will take a really long time to iterate through it. This greatly impacts my fps. When I have a 100 x 100 tile map, I get around 800 fps, but when I use a 1000 x 1000 tile map, I get roughly 25 fps due to the lengthy iteration. I know that I could separate my tiles into chunks and only iterate through the ones in the current chunk, but I wanted something a little easier to implement. Any help would be appreciated :)
Given the following assumptions:
Your tiles are likely arranged on a regular grid with a (column, row) index.
Your tiles are likely inserted into your vector in row-major order, and the vector is likely fully populated. So the index of a tile in your vector is likely (row * numColumns + column).
Your view is likely axis-aligned to the grid (where you can't rotate your view - as is the case with many 2d tile-based games)
If those assumptions hold true, then you can easily iterate through the appropriate range of tiles with a nested loop.
for (int row = minRow; row <= maxRow; ++row) {
    for (int column = minColumn; column <= maxColumn; ++column) {
        int index = row * numColumns + column;
        // Here you can...
        doSomethingWith(tiles[index]);
    }
}
This just requires that you can compute the minRow, maxRow, minColumn, and maxColumn from your Game::drawCuller.getGlobalBounds(). You haven't disclosed the details, but it's likely something like a rectangle in world coordinates (which might be in some units like meters). It's likely either a left, top, width, height style rectangle or a min, max style bounds rectangle. Assuming the latter:
minViewColumn = floor((bounds.minInMeters.x - originOfGridInMeters.x) / gridTileSizeInMeters);
maxViewColumn = ceil((bounds.maxInMeters.x - originOfGridInMeters.x) / gridTileSizeInMeters);
// similarly for rows
minViewRow = floor((bounds.minInMeters.y - originOfGridInMeters.y) / gridTileSizeInMeters);
maxViewRow = ceil((bounds.maxInMeters.y - originOfGridInMeters.y) / gridTileSizeInMeters);
The originOfGridInMeters is the global coordinates of top-left corner of the tile at (row=0, column=0), which may very well be (0, 0), conveniently, if you set up your world like that. And gridTileSizeInMeters is, well, just that; presumably your tiles have a square aspect ratio in world space.
If the view is permitted to go outside the extents of the tile array, minViewColumn, (and the other iterator ranges) may now be less than 0 or greater than or equal to the number of columns in your tile array. So, it would then be necessary to compute minColumn from minViewColumn by clipping it to the range of tiles stored in your grid. (Same goes for the other iteration extents.)
// Clip to the range of valid rows and columns.
minColumn = min(max(minViewColumn, 0), numColumns - 1);
maxColumn = min(max(maxViewColumn, 0), numColumns - 1);
minRow = min(max(minViewRow, 0), numRows - 1);
maxRow = min(max(maxViewRow, 0), numRows - 1);
Now do that loop I showed you above, and you're good to go!
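Putting the pieces together, here is a minimal self-contained sketch of the range computation and the nested loop. The names `TileRange`, `visibleRange` and `countVisible` are made up for illustration, and the sketch assumes square tiles, an axis-aligned view and a fully populated row-major vector, as stated above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Inclusive range of tile indices along one axis.
struct TileRange {
    int min;
    int max;
};

// Convert a view interval [viewMin, viewMax] in world units to an inclusive
// range of tile indices, clipped to [0, tileCount - 1].
// gridOrigin and tileSize are assumed names for the grid's placement.
TileRange visibleRange(float viewMin, float viewMax,
                       float gridOrigin, float tileSize, int tileCount) {
    assert(tileSize > 0.0f && tileCount > 0);
    int lo = static_cast<int>(std::floor((viewMin - gridOrigin) / tileSize));
    int hi = static_cast<int>(std::ceil((viewMax - gridOrigin) / tileSize));
    lo = std::min(std::max(lo, 0), tileCount - 1);
    hi = std::min(std::max(hi, 0), tileCount - 1);
    return {lo, hi};
}

// Count the tiles a nested loop over the visible range would touch.
int countVisible(TileRange rows, TileRange cols, int numColumns) {
    int visited = 0;
    for (int row = rows.min; row <= rows.max; ++row) {
        for (int col = cols.min; col <= cols.max; ++col) {
            int index = row * numColumns + col;  // row-major index
            (void)index;                         // draw tiles[index] here
            ++visited;
        }
    }
    return visited;
}
```

With 32-unit tiles, a view spanning 64 to 160 on one axis maps to tile indices 2 through 5, so the loop cost depends on the view size rather than on the map size.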
I was wondering if there was a way to say "for each tile in vector of tiles that (fits a condition)"
In general, no. The only way to know if an element fits a condition is to look at it and see if it fits the condition. You can't do that without iterating over all the elements and checking the condition for each.
The way to avoid this is to build some sort of index structure. For instance, if you have tiles with attributes that change rarely, you could pre-build vectors of pointers to all of your tiles with some attribute. That way you can check the condition only once (or rarely) instead of on each frame. For instance you could build separate vectors of all of your blue tiles, all of your red tiles, and all of your green tiles. Then if you want to iterate over all of the tiles of a certain color you could do "for each blue tile" directly instead of "for each tile, if it's blue". This generally trades storage/memory usage for execution speed.
The same concept applies to your specific situation, as you mentioned. You can pre-build caches of chunks, and quickly filter out whole chunks that aren't near your camera. This will prevent you from having to check every tile to see if it's in view.
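As a rough illustration of the index idea, the sketch below builds a lookup from a tile attribute to the positions of matching tiles once, so later passes can go straight to "each blue tile". The `Tile` struct and its `color` field are hypothetical. Indices are stored rather than pointers so the lookup stays valid if the tile vector reallocates.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical tile with a rarely changing attribute.
struct Tile {
    std::string color;
};

// Build, once, a lookup from color to the indices of tiles with that color.
std::map<std::string, std::vector<std::size_t>>
buildColorIndex(const std::vector<Tile>& tiles) {
    std::map<std::string, std::vector<std::size_t>> index;
    for (std::size_t i = 0; i < tiles.size(); ++i) {
        index[tiles[i].color].push_back(i);
    }
    return index;
}
```

The index has to be rebuilt or patched whenever a tile's color changes, which is the storage-for-speed trade described above.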

Create continuous matrix of rectangles from set of rectangles

I have a set of objects (each object contains a rectangle and a value assigned to it) which is kept in a vector container.
See picture below:
I need to create a matrix by drawing horizontal and vertical lines at each y/x lower left (LL) / upper right (UR) coordinate like below:
I need to assign value = 0 to each new empty rectangle, and assign the old values to the rectangles that lie inside the initial rectangles.
I've implemented this with a naive algorithm, but it is too slow when I have a huge number of rectangles. My algorithm basically does the following:
- Stores all rectangles in a map container. Each element of the map contains the set of rectangles with the same LL Y coordinate, sorted by LL X coordinate, i.e. the key is the LL Y coordinate.
- Stores all X/Y coordinates in set containers.
- Iterates over the Y/X coordinate containers, and for each new rectangle finds out whether it exists in the map. If it exists, it gets the existing value, otherwise it gets the value 0. That is, for each new rectangle it looks up its LL Y coordinate in the map; if such a Y exists, it searches through the corresponding value (the set of rectangles), otherwise it searches the whole map.
Is there an efficient algorithm to get the needed results?
For n rectangles this can be solved easily in O(n^3) time (or just O(n^2) time if at most a bounded number of rectangles intersect) by looking at the problem a different way. This should be adequate for handling up to thousands of rectangles in a few seconds.
Also, unless some other constraints are added to the problem, the latter time bound is optimal: that is, there exist inputs consisting of n non-intersecting rectangles for which O(n^2) smaller grid rectangles will need to be output (which of course requires O(n^2) time). An example such input is n width-1 rectangles, all having equal bottommost y co-ord and having heights 1, 2, ..., n.
Grid size bounds
First of all, notice that there can be at most 2n vertical lines, and at most 2n horizontal lines, since each input rectangle introduces at most 2 of each kind (it may introduce fewer if one or both vertical lines are also the edge(s) of some already-considered rectangle, and likewise for horizontal lines). So there can be at most (2*n - 1)^2 = O(n^2) cells in the grid defined by these lines.
The grid cell co-ordinate system
We can invent a co-ordinate system for grid cells in which each cell is identified by its lower-left corner, and the co-ordinates of an intersection of two grid lines is given simply by the number of horizontal grid lines below it and the number of vertical grid lines to its left (so that the bottommost, leftmost grid cell has co-ords (0, 0), the cell to its right has co-ords (1, 0), the cell two cells above that cell has co-ords (1, 2), etc.)
The algorithm
For each input rectangle having LL co-ords (x1, y1) and UR co-ords (x2, y2), we determine the horizontal and vertical intervals that it occupies within the new grid co-ordinate system, and then simply iterate through every cell (i, j) belonging to this rectangular region (i.e., every grid cell (i, j) such that toGridX(x1) <= i < toGridX(x2) and toGridY(y1) <= j < toGridY(y2)) with a nested for loop, recording in a hashtable that the ID (colour?) for the cell at (i, j) should be the colour of the current input rectangle. Input rectangles should be processed in decreasing z-order (implicitly at least there seems to be such an order, from your example) so that for any cell covered by more than one input rectangle, the hashtable will wind up recording whatever the "nearest" rectangle's colour is. Finally, iterate through the hash table, converting each grid co-ord pair (i, j) back to the LL and UR co-ords of the input-space rectangle that corresponds to this grid cell, and output this rectangle with the ID given by the value for this hash key.
Preprocessing
In order to accomplish the above, we need two things: a way to map input-space co-ordinates to grid co-ordinates (to determine the horizontal and vertical grid intervals for a given input rectangle), and a way to map grid co-ordinates back to input-space co-ordinates (to generate the output rectangles in the final step). Both operations are easy to do via that old workhorse, sorting.
Given any corner (x, y) of some input rectangle, the grid x co-ordinate corresponding to x, toGridX(x), is simply the rank position of x within the sorted list of all distinct x positions of vertical edges that are present among the input rectangles. Similarly, toGridY(y) is just the rank position of y within the sorted list of all distinct y positions of horizontal edges that are present among the input rectangles. In the other direction, for any grid co-ordinate (i, j), the corresponding input-space x co-ordinate, fromGridX(i), is simply the i-th smallest x co-ord (ignoring duplicates) of any vertical edge among the input rectangles, and similarly for fromGridY(j). These can all be computed as follows (all array indices start at 0, and I show only how to do it for x co-ords; y co-ords are similar):
For each rectangle i in the input having LL co-ords (x1, y1) and UR co-ords (x2, y2):
    Append the two-element array [x1, i] to the list-of-arrays VERT.
    Append the two-element array [x2, i] to the list-of-arrays VERT.
Sort the list VERT in increasing order by its first item.
Combine elements in VERT having identical x co-ords. Specifically:
    Set j = 0.
    For i from 1 to 2n-1:
        If VERT[i][0] == VERT[j][0] then append VERT[i][1] to VERT[j] (thereby forming an array of length 3 or more at position j), otherwise set j = j + 1 and overwrite VERT[j] with the two-element array VERT[i].
    Delete VERT[j+1] and all later elements from VERT.
By this time, for any i, VERT[i] is an array that contains (in its second and subsequent positions) the IDs of every input rectangle that uses, as either its left or right edge, the ith-leftmost distinct vertical line used by any input rectangle -- or in other words, the rank-i vertical line. We now "invert" this:
For i from 0 to length(VERT)-1:
    For j from 1 to length(VERT[i])-1:
        Set toGridX[VERT[i][j]] = i.
For i from 0 to length(VERT)-1:
    Set fromGridX[i] = VERT[i][0].
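For illustration, here is a compact sketch of the same idea with two substitutions that should be named plainly. It finds the rank of a coordinate with a binary search over the sorted distinct coordinates instead of an inverted table, which sidesteps keeping separate entries for a rectangle's left and right edges. It also stores the result in a dense 2D vector rather than a hashtable. The `Rect` layout and function names are assumptions, and the input is assumed to be ordered so that later rectangles should win where they overlap.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical input record: lower-left (x1, y1), upper-right (x2, y2), value.
struct Rect {
    int x1, y1, x2, y2;
    int value;
};

// Sorted list of distinct coordinates; position in the list is the grid rank.
std::vector<int> distinctSorted(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
    return v;
}

// Rank of coordinate c within the sorted distinct list (toGridX / toGridY).
std::size_t toGrid(const std::vector<int>& coords, int c) {
    return static_cast<std::size_t>(
        std::lower_bound(coords.begin(), coords.end(), c) - coords.begin());
}

// Returns grid[j][i] = value of the last rectangle covering cell (i, j),
// or 0 for cells that no rectangle covers. xs and ys receive the
// grid-line positions (fromGridX / fromGridY).
std::vector<std::vector<int>> buildGrid(const std::vector<Rect>& rects,
                                        std::vector<int>& xs,
                                        std::vector<int>& ys) {
    std::vector<int> allX, allY;
    for (const Rect& r : rects) {
        allX.push_back(r.x1); allX.push_back(r.x2);
        allY.push_back(r.y1); allY.push_back(r.y2);
    }
    xs = distinctSorted(allX);
    ys = distinctSorted(allY);
    if (xs.size() < 2 || ys.size() < 2) return {};

    std::vector<std::vector<int>> grid(ys.size() - 1,
                                       std::vector<int>(xs.size() - 1, 0));
    for (const Rect& r : rects) {
        for (std::size_t j = toGrid(ys, r.y1); j < toGrid(ys, r.y2); ++j)
            for (std::size_t i = toGrid(xs, r.x1); i < toGrid(xs, r.x2); ++i)
                grid[j][i] = r.value;
    }
    return grid;
}
```

Cell (i, j) of the result corresponds to the output rectangle with LL corner (xs[i], ys[j]) and UR corner (xs[i+1], ys[j+1]).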
Running time
As previously established, there are at most O(n^2) grid cells. Each of the n input rectangles can occupy at most all of these cells, each of which is visited once per input rectangle, for a time bound of O(n^3). Note that this is an extremely pessimistic time bound, and for example if none (or none but a bounded number) of your rectangles overlap, then it drops to O(n^2) since no grid cell will ever be visited more than once.
I suspect the lookups and iterations are not fast enough. Things like 'otherwise it searches the whole map' point out that you do very heavy computations.
What I think you need is a 2D data structure. A k-d tree or a BSP would work, but the easiest to understand and implement would be a quad tree.
In a quad tree each node represents a rectangle in your space. Each node can be split into 4 children by selecting the mid point along the 2 dimensions and having the children represent the 4 resulting rectangles. Each node also holds the value that you want to assign to the area and an extra flag if the value is uniform.
To mark a rectangle with some value, you start from the root and recursively:
If the input rectangle covers the node rectangle you set the value to that node, mark it as uniform and return.
If the input rectangle and the node rectangle don't touch just return.
If the node is marked as uniform, copy the value to its children and mark the node not uniform.
Recursively call for the 4 children (you might have to create them).
On the way back, check if the 4 children have the same value and are all marked as uniform and if so mark the node as uniform and set the same value as the children.
The main advantage of this approach is that you get to mark large areas of your map quickly. You can also prove that marking an area is O(log N) where N is the size of your map (with a larger constant than the usual tree).
You can find a more detailed explanation and some helpful images on Wikipedia.
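The marking steps listed above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: integer coordinates, half-open boxes, and a root that is a square whose side is a power of two (so a 1x1 node never needs splitting). The names `Box`, `QuadNode`, `mark` and `at` are made up here.

```cpp
#include <array>
#include <memory>

// Axis-aligned half-open box [x0, x1) x [y0, y1) on an integer map.
struct Box {
    int x0, y0, x1, y1;
};

struct QuadNode {
    Box area;
    int value = 0;
    bool uniform = true;  // true: the whole area holds `value`
    std::array<std::unique_ptr<QuadNode>, 4> child;

    explicit QuadNode(Box b) : area(b) {}

    void mark(const Box& r, int v) {
        // The rectangles do not overlap: nothing to do.
        if (r.x1 <= area.x0 || r.x0 >= area.x1 ||
            r.y1 <= area.y0 || r.y0 >= area.y1) return;
        // The input rectangle covers this node entirely.
        if (r.x0 <= area.x0 && r.x1 >= area.x1 &&
            r.y0 <= area.y0 && r.y1 >= area.y1) {
            value = v;
            uniform = true;
            for (auto& c : child) c.reset();
            return;
        }
        // Partial overlap: split at the midpoints, pushing the value down.
        int mx = area.x0 + (area.x1 - area.x0) / 2;
        int my = area.y0 + (area.y1 - area.y0) / 2;
        if (uniform) {
            const Box parts[4] = {{area.x0, area.y0, mx, my},
                                  {mx, area.y0, area.x1, my},
                                  {area.x0, my, mx, area.y1},
                                  {mx, my, area.x1, area.y1}};
            for (int k = 0; k < 4; ++k) {
                child[k].reset(new QuadNode(parts[k]));
                child[k]->value = value;
            }
            uniform = false;
        }
        for (auto& c : child) c->mark(r, v);
        // On the way back, merge if all four children agree.
        bool same = true;
        for (auto& c : child)
            same = same && c->uniform && c->value == child[0]->value;
        if (same) {
            value = child[0]->value;
            uniform = true;
            for (auto& c : child) c.reset();
        }
    }

    // Value stored at integer cell (x, y), assumed to lie inside `area`.
    int at(int x, int y) const {
        if (uniform) return value;
        for (const auto& c : child)
            if (x >= c->area.x0 && x < c->area.x1 &&
                y >= c->area.y0 && y < c->area.y1) return c->at(x, y);
        return value;
    }
};
```

Marking a rectangle that covers a whole quadrant stops at that quadrant's node, which is where the speed-up for large areas comes from.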
Assuming you know the top- and bottom-most y and the left- and right-most x, extend the four vectors belonging to each rectangle to the respective max and min x and y points. Keep a set of extended vertical vectors and a set of extended horizontal ones. Whenever an extended vector is added, it will necessarily intersect with each vector in the perpendicular list - the intersections are the cell coordinates of the matrix.
Once the list of cell coordinates is made, iterate over them and assign values appropriately, looking up if they are in or out of an original rectangle. I'm not too versed in data structures for rectangles, but it seems to me that two interval trees, one for horizontal, the other for vertical could find that answer in O(log n) time per query, where n is the number of intervals in the tree.
All together, this method seems to be O(n * log m) time, where n is the number of cell coordinates in the resultant matrix and m is the number of original rectangles.

Coordinates of a pixel between two images

I am looking for a solution to easily compute the pixel coordinate from two images.
Question: If you take the following code, how could I compute the pixel coordinate that changed from the "QVector difference" ? Is it possible to have an (x,y) coordinate and find on the currentImage which pixel it represents ?
char *previousImage;
char *currentImage;
QVector<LONG> difference;
for(int i = 0 ; i < CurrentImageSize; i++)
{
    //Check if pixels are the same (we can also do it with RGB values, this is just for the example)
    if(previousImagePixel != currentImagePixel)
    {
        difference.push_back(currentImage - previousImage);
    }
    currentImage++;
}
EDIT:
More information about this topic:
The image is in RGB format
The width, the height and the bpp of both images are known
I have a pointer to the bytes representing the image
The main objective here is to know the new value of each pixel that changed between the two images and to know which pixel it is (its coordinates)
There is not enough information to answer, but I will try to give you some idea.
You have declared char *previousImage;, which implies to me that you have a pointer to the bytes representing an image. You need more than that to interpret the image.
You need to know the pixel format. You mention RGB, so for the time being let's assume that the image uses 3 bytes for each pixel and the order is RGB
You need to know the width of the image.
Given the above 2, you can calculate the "Row Stride", which is the number of bytes that a row takes up. This is usually the "bytes per pixel" * "image width", but it is typically padded out to be divisible by 4. So 3 bytes per pixel and a width of 15 would be 45 bytes + 3 bytes of padding, making the row stride 48.
Given that, if you have an index into the image data, you first integer-divide it against the row stride to get the row (Y coordinate).
The X coordinate is the (index mod the row stride) integer-divided by the bytes per pixel.
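As a small sketch of that arithmetic, the helpers below convert between a byte offset and pixel coordinates. The function names are made up for illustration, and the 4-byte row padding is an assumption about the image layout that you should check against your actual format.

```cpp
#include <cassert>
#include <cstddef>

struct PixelPos {
    std::size_t x;
    std::size_t y;
};

// Bytes per row, rounded up to a multiple of 4 (an assumption about the
// image layout; some formats store rows without padding).
std::size_t rowStride(std::size_t width, std::size_t bytesPerPixel) {
    return (width * bytesPerPixel + 3) / 4 * 4;
}

// Map a byte offset into the image buffer to pixel coordinates.
PixelPos offsetToPixel(std::size_t offset, std::size_t stride,
                       std::size_t bytesPerPixel) {
    assert(stride > 0 && bytesPerPixel > 0);
    return {(offset % stride) / bytesPerPixel, offset / stride};
}

// Inverse mapping: byte offset of the first channel of pixel (x, y).
std::size_t pixelToOffset(std::size_t x, std::size_t y, std::size_t stride,
                          std::size_t bytesPerPixel) {
    return y * stride + x * bytesPerPixel;
}
```

For the 15-pixel-wide example, byte offset 100 falls in row 2 at x = 1. An offset that lands in the padding bytes yields an x that is not less than the width, so it is worth checking for that case.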
From what I understand, you want to compute the displacement or motion that occurred between two images. That is, for each pixel I(x, y, t=previous) in previousImage, you want to know where it went in currentImage, and what its new coordinate I(x, y, t=current) is.
If that is the case, then it's called motion estimation, or measuring the optical flow. There are many algorithms for that, which rely on more or less complex hypotheses, depending on the objects you observe in the image sequence.
The simplest hypothesis is that if you follow a moving pixel I(x, y, t) in the scene you observe, its luminance will remain constant over time. In other words, dI(x,y,t) / dt = 0.
Since I(x, y, t) is a function of three parameters (space and time) with two unknowns, and there is only one equation, this is an ill-defined problem that has no easy solution. Many of the algorithms add an additional hypothesis, so that the problem can be solved with a unique solution.
You can use existing libraries which will do that for you; one popular choice is OpenCV.
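Written out with the chain rule, the constancy hypothesis gives the usual optical flow constraint. Here u and v are symbols introduced for this sketch, denoting the unknown pixel velocities along x and y:

```latex
\frac{dI}{dt}
  = \frac{\partial I}{\partial x}\,u
  + \frac{\partial I}{\partial y}\,v
  + \frac{\partial I}{\partial t}
  = 0,
\qquad u = \frac{dx}{dt},\quad v = \frac{dy}{dt}
```

The three partial derivatives can be estimated from the two images, which leaves one equation per pixel for the two unknowns u and v.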

Adaptive median filter with Opencv c++

I have a problem writing the code for the adaptive median filter.
What is the best way to compute the minimum, maximum and median pixel intensity?
So far I read every pixel value of the image:
for (int y = 0; y < h; y++)
{
    uchar *ptr = (uchar*)(img->imageData + y * step);
    for (int x = 0; x < w; x++){
        printf("%u, ", ptr[x]);
    }
    printf("\n");
}
For the maxima and minima in a rectangular window, I would look to van Herk's dilation algorithm, as grayscale dilation corresponds to the maximum operator, and grayscale erosion to the minimum operator and a rectangular structuring element could be decomposed to a vertical and a horizontal line.
For the median filtering I would look to moving histogram techniques.
For the min/max pixel you'll need to record the value of the first pixel and then compare each other pixel to it, storing the new value if it's lower/higher respectively. OpenCV provides cv::minMaxLoc to make this easy.
For the median you'll need to sort all pixels and select the middle one (once sorted of course, finding the min/max is trivial as they'll be on either end of the list). This is more tricky, how far have you got and what is not working?
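Here is a minimal sketch of the sort-based approach for a single window, written without OpenCV so it stands alone. The `WindowStats` struct and `windowStats` function are hypothetical names, and the image is assumed to be an 8-bit grayscale buffer in row-major order. It is the simple version, not the faster moving-histogram technique mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct WindowStats {
    unsigned char min;
    unsigned char max;
    unsigned char median;
};

// Statistics of the (2*radius+1)^2 window centred on (cx, cy) in a
// row-major 8-bit grayscale image; pixels outside the image are skipped.
// For an even number of samples the upper of the two middle values is used.
WindowStats windowStats(const std::vector<unsigned char>& img, int w, int h,
                        int cx, int cy, int radius) {
    assert(static_cast<int>(img.size()) == w * h);
    assert(cx >= 0 && cx < w && cy >= 0 && cy < h && radius >= 0);
    std::vector<unsigned char> values;
    for (int y = std::max(0, cy - radius); y <= std::min(h - 1, cy + radius); ++y)
        for (int x = std::max(0, cx - radius); x <= std::min(w - 1, cx + radius); ++x)
            values.push_back(img[y * w + x]);
    // Once sorted, min and max sit at the ends and the median in the middle.
    std::sort(values.begin(), values.end());
    return {values.front(), values.back(), values[values.size() / 2]};
}
```

An adaptive median filter would call something like this with a growing radius until the median lies strictly between the minimum and maximum.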

Traverse a 2.5D grid

I'm trying to figure out how to traverse a 2.5D grid in an efficient manner. The grid itself is 2D, but each cell in the grid has a float min/max height. The line to traverse is defined by two 3D floating point coordinates. I want to stop traversing the line if the range of z values between entering/exiting a grid cell doesn't overlap with the min/max height for that cell.
I'm currently using the 2D DDA algorithm to traverse the grid cells in order (see picture), but I'm not sure how to calculate the z value when each grid cell is reached. If I could do that, I could test the z value when entering/leaving the cell against the min/max height for the cell.
Is there a way to modify this algorithm that allows z to be calculated when each grid cell is entered? Or is there a better traversal algorithm that would allow me to do that?
Here's the current code I'm using:
void Grid::TraceGrid(const Point3<float>& start, const Point3<float>& end, GridCallback callback )
{
    // calculate and normalize the 2D direction vector
    Point2<float> direction=end-start;
    float length=direction.getLength( );
    direction/=length;
    // calculate delta using the grid resolution
    Point2<float> delta(m_gridresolution/fabs(direction.x), m_gridresolution/fabs(direction.y));
    // calculate the starting/ending points in the grid
    Point2<int> startGrid((int)(start.x/m_gridresolution), (int)(start.y/m_gridresolution));
    Point2<int> endGrid((int)(end.x/m_gridresolution), (int)(end.y/m_gridresolution));
    Point2<int> currentGrid=startGrid;
    // calculate the direction step in the grid based on the direction vector
    Point2<int> step(direction.x>=0?1:-1, direction.y>=0?1:-1);
    // calculate the distance to the next grid cell from the start
    Point2<float> currentDistance(((step.x>0?start.x:start.x+1)*m_gridresolution-start.x)/direction.x, ((step.y>0?start.y:start.y+1)*m_gridresolution-start.y)/direction.y);
    while(true)
    {
        // pass currentGrid to the callback
        float z = 0.0f; // need to calculate z value somehow
        bool bstop=callback(currentGrid, z);
        // check if the callback wants to stop or the end grid cell was reached
        if(bstop||currentGrid==endGrid) break;
        // traverse to the next grid cell
        if(currentDistance.x<currentDistance.y) {
            currentDistance.x+=delta.x;
            currentGrid.x+=step.x;
        } else {
            currentDistance.y+=delta.y;
            currentGrid.y+=step.y;
        }
    }
}
It seems like a 3D extension of the Bresenham Line Algorithm would work. You would iterate over X and independently track the error for the Y and Z components of your line segment to determine the Y and Z values for each corresponding X value. You just stop when the accumulated error in Z reaches some critical level which would indicate it is outside of your min/max.
For each cell, you know from which cell you came from. This means you know from which side you came from. Calculating z at the intersection of the green line and a given grid line seems trivial.
I figured out a good way to do it. Add to the start of the function:
float fzoffset=end.z-start.z;
Point2<float> deltaZ(fzoffset/fabs(end.x-start.x), fzoffset/fabs(end.y-start.y));
Point2<float> currentOffset((step.x>0?start.x:start.x+1)*m_gridresolution-start.x, (step.y>0?start.y:start.y+1)*m_gridresolution-start.y);
Inside the loop where currentDistance.x/.y are incremented, add:
currentOffset.x+=m_gridresolution; //When stepping in the x axis
currentOffset.y+=m_gridresolution; //When stepping in the y axis
Then to calculate z at each step:
z=currentOffset.x*deltaZ.x+start.z; //When stepping in the x axis
z=currentOffset.y*deltaZ.y+start.z; //When stepping in the y axis
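For comparison, here is a self-contained sketch of a closely related formulation, not the exact code above. It tracks the traversal as a parameter t running from 0 at the start point to 1 at the end point, so z at any grid-line crossing is simply start.z + t * (end.z - start.z). The `Visit` struct and `traceGrid` signature are made up for this example, and it records every visited cell instead of calling a callback.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

struct Visit {
    int cx, cy;      // grid cell
    float zEnter;    // z where the segment enters the cell
    float zExit;     // z where the segment leaves the cell
};

// Walk the cells crossed by the segment (x0,y0,z0)-(x1,y1,z1) on a grid of
// square cells of size `cell`. t runs from 0 at the start to 1 at the end,
// so z at any crossing is z0 + t * (z1 - z0).
std::vector<Visit> traceGrid(float x0, float y0, float z0,
                             float x1, float y1, float z1, float cell) {
    const float dx = x1 - x0, dy = y1 - y0, dz = z1 - z0;
    int cx = static_cast<int>(std::floor(x0 / cell));
    int cy = static_cast<int>(std::floor(y0 / cell));
    const int ex = static_cast<int>(std::floor(x1 / cell));
    const int ey = static_cast<int>(std::floor(y1 / cell));
    const int stepX = dx >= 0 ? 1 : -1, stepY = dy >= 0 ? 1 : -1;
    const float inf = std::numeric_limits<float>::infinity();

    // t needed to cross one whole cell along each axis.
    const float tDeltaX = dx != 0 ? cell / std::fabs(dx) : inf;
    const float tDeltaY = dy != 0 ? cell / std::fabs(dy) : inf;
    // t at which the segment first hits the next grid line on each axis.
    float tMaxX = dx != 0 ? ((cx + (stepX > 0 ? 1 : 0)) * cell - x0) / dx : inf;
    float tMaxY = dy != 0 ? ((cy + (stepY > 0 ? 1 : 0)) * cell - y0) / dy : inf;

    std::vector<Visit> visits;
    float tEnter = 0.0f;
    while (true) {
        const float tNext = std::min(tMaxX, tMaxY);
        // Stop at the end cell, or when the next crossing lies past the end.
        const bool last = (cx == ex && cy == ey) || tNext >= 1.0f;
        const float tExit = last ? 1.0f : tNext;
        visits.push_back({cx, cy, z0 + tEnter * dz, z0 + tExit * dz});
        if (last) break;
        if (tMaxX < tMaxY) { tMaxX += tDeltaX; cx += stepX; }
        else               { tMaxY += tDeltaY; cy += stepY; }
        tEnter = tExit;
    }
    return visits;
}
```

Each visit's [zEnter, zExit] range can then be tested against the cell's min/max height, stopping the walk as soon as the two ranges do not overlap.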