I'm trying to implement a Gaussian Blur from scratch (using C++). In the code below I've hard-coded the Gaussian kernel I'm using. I only kept one dimension as I'm trying to use the optimization I've read about where you can do a horizontal convolution pass and a vertical one over that to make your blur more efficient. Unfortunately, I'm running into some issues. Here is my code:
float gKern[5] = {0.05448868, 0.24420134, 0.40261995, 0.24420134, 0.05448868};
int** gaussianBlur(int** image, int height, int width) {
int **ret = new int*[height];
for(int i = 0; i < height; i++) {
ret[i] = new int[width];
}
for (int i = 0; i < height; i++) {
for (int j = 0; j < width; j++) {
if (i == 0) {
ret[i][j] = (gKern[0] * image[2][j]) + (gKern[1] * image[1][j]) + (gKern[2] * image[0][j]) + (gKern[3] * image[1][j]) + (gKern[4] * image[2][j]);
} else if (i == 1) {
ret[i][j] = (gKern[0] * image[1][j]) + (gKern[1] * image[0][j]) + (gKern[2] * image[1][j]) + (gKern[3] * image[2][j]) + (gKern[4] * image[3][j]);
} else if (i == (height - 2)) {
ret[i][j] = (gKern[0] * image[i - 2][j]) + (gKern[1] * image[i - 1][j]) + (gKern[2] * image[i][j]) + (gKern[3] * image[i + 1][j]) + (gKern[4] * image[i][j]);
} else if (i == (height - 1)) {
ret[i][j] = (gKern[0] * image[i - 2][j]) + (gKern[1] * image[i - 1][j]) + (gKern[2] * image[i][j]) + (gKern[3] * image[i - 1][j]) + (gKern[4] * image[i - 2][j]);
} else {
ret[i][j] = (gKern[0] * image[i - 2][j]) + (gKern[1] * image[i - 1][j]) + (gKern[2] * image[i][j]) + (gKern[3] * image[i + 1][j]) + (gKern[4] * image[i + 2][j]);
}
}
}
int** temp = image;
image = ret;
for (int i = 0; i < height; i++) {
for (int j = 0; j < width; j++) {
if (j == 0) {
ret[i][j] = (gKern[0] * image[i][2]) + (gKern[1] * image[i][1]) + (gKern[2] * image[i][0]) + (gKern[3] * image[i][1]) + (gKern[4] * image[i][2]);
} else if (j == 1) {
ret[i][j] = (gKern[0] * image[i][1]) + (gKern[1] * image[i][0]) + (gKern[2] * image[i][1]) + (gKern[3] * image[i][2]) + (gKern[4] * image[i][3]);
} else if (j == (width - 2)) {
ret[i][j] = (gKern[0] * image[i][j - 2]) + (gKern[1] * image[i][j - 1]) + (gKern[2] * image[i][j]) + (gKern[3] * image[i][j + 1]) + (gKern[4] * image[i][j]);
} else if (j == (width - 1)) {
ret[i][j] = (gKern[0] * image[i][j - 2]) + (gKern[1] * image[i][j - 1]) + (gKern[2] * image[i][j]) + (gKern[3] * image[i][j - 1]) + (gKern[4] * image[i][j - 2]);
} else {
ret[i][j] = (gKern[0] * image[i][j - 2]) + (gKern[1] * image[i][j - 1]) + (gKern[2] * image[i][j]) + (gKern[3] * image[i][j + 1]) + (gKern[4] * image[i][j + 2]);
}
}
}
image = temp;
return ret;
}
The first pass (the first for block) seems to work fine as when I comment out the second block I do get a slightly blurred image. But when I use both I get a choppy "weird" image, as shown below (the first image is my grayscale input, the second is the choppy output):
The problem is with the pointers you use.
The function starts with image as input and ret as the intermediate result of the first step.
The second step must use ret as input, and write either to the original input (overwrite the input image) or to a new image. Instead, you do:
int** temp = image;
image = ret;
// read from image and write to ret
image = temp;
return ret;
That is, going into the second pass, both image and ret point to the same data, so you read from and write to the same buffer. Next you do a pointer assignment that has no effect (image is never used after this) and return the intermediate buffer.
If you want to write to the input image, simply swap the image and ret pointers before the second pass:
std::swap(image, ret);
If you don’t want that, you’ll have to new another image to write into.
It is bad practice to use an array of arrays to store an image. If you look into the source code of any image processing library, you’ll see they allocate a single large memory block for the image, which stores all image rows concatenated. Knowing the width of the image, you know how to index: image[x + y*width].
This not only simplifies code (no loops to allocate a single image), but it also greatly speeds up code: there is no pointer lookup any more, and all data is close together to best use the cache.
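As a rough sketch of the single-block layout described above: the Image type and its at() accessor below are hypothetical names chosen for illustration, not something from the question, and std::vector stands in for the manual new/delete.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal image type: one contiguous block, rows stored back to back.
struct Image {
    int width;
    int height;
    std::vector<int> pixels;  // size == width * height

    Image(int w, int h)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, 0) {
        assert(w > 0 && h > 0);
    }

    // Pixel at column x, row y, stored at index x + y * width.
    int& at(int x, int y) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return pixels[static_cast<std::size_t>(x) + static_cast<std::size_t>(y) * width];
    }
};
```

A single allocation replaces the per-row loop, and img.at(x, y) reads or writes the same element as img.pixels[x + y * width].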
This whole code can be simplified significantly by following the advice above: the two passes can be done with the same code. Write a function that filters one line of the image. It takes a pointer to the first pixel, a line length, and a step (which is 1 for horizontal lines, and width for vertical lines). This 1D function is then called in a loop over the lines in another function. This second function is then called once to do the horizontal pass, and once to do the vertical pass.
In this situation, it is easy to avoid intermediate images by using a buffer of the size of a single image line. Write into that buffer, then copy the whole line back into the input image after it is filtered. This means you have a single buffer of size max(width,height) rather than a buffer of size width*height.
The 1D filter function can also be simplified. That loop should not have any if statements; they will significantly slow down execution. Instead, special-case the first two and last two pixels, and loop only over the bulk of the pixels where you don’t have to worry about the image edge.
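Putting the last few paragraphs together, one possible sketch looks like this. It assumes float pixels in a single contiguous block (the question uses int**), and the names filterLine, gaussianBlur and kKernel are made up for this example. The edge pixels are mirrored the same way as in the question's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

static const float kKernel[5] = {0.05448868f, 0.24420134f, 0.40261995f,
                                 0.24420134f, 0.05448868f};

// Filters one line in place. `line` points at its first pixel, `length` is the
// number of pixels, `step` is the distance between consecutive pixels (1 for a
// row, width for a column). `buffer` must have room for `length` floats.
void filterLine(float* line, int length, std::ptrdiff_t step, float* buffer) {
    assert(length >= 4);
    auto px = [=](int i) { return line[i * step]; };
    auto tap = [](float a, float b, float c, float d, float e) {
        return kKernel[0] * a + kKernel[1] * b + kKernel[2] * c +
               kKernel[3] * d + kKernel[4] * e;
    };
    const int n = length;
    // First two pixels: mirror the missing neighbours around pixel 0.
    buffer[0] = tap(px(2), px(1), px(0), px(1), px(2));
    buffer[1] = tap(px(1), px(0), px(1), px(2), px(3));
    // Bulk of the line: no edge handling, no branches.
    for (int i = 2; i < n - 2; ++i)
        buffer[i] = tap(px(i - 2), px(i - 1), px(i), px(i + 1), px(i + 2));
    // Last two pixels: mirror the missing neighbours around pixel n - 1.
    buffer[n - 2] = tap(px(n - 4), px(n - 3), px(n - 2), px(n - 1), px(n - 2));
    buffer[n - 1] = tap(px(n - 3), px(n - 2), px(n - 1), px(n - 2), px(n - 3));
    // Copy the filtered line back over the input.
    for (int i = 0; i < n; ++i) line[i * step] = buffer[i];
}

// Blurs `image` (width * height floats, row-major) in place.
void gaussianBlur(std::vector<float>& image, int width, int height) {
    assert(image.size() == static_cast<std::size_t>(width) * height);
    std::vector<float> buffer(std::max(width, height));
    for (int y = 0; y < height; ++y)  // horizontal pass, step 1
        filterLine(image.data() + static_cast<std::size_t>(y) * width, width, 1, buffer.data());
    for (int x = 0; x < width; ++x)   // vertical pass, step width
        filterLine(image.data() + x, height, width, buffer.data());
}
```

Because each line is written to the buffer first and copied back afterwards, the second pass reads the finished output of the first pass and never reads half-filtered data from the line it is working on.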
I am trying to understand this code:
void stencil(const int nx, const int ny, const int width, const int height,
double* image, double* tmp_image)
{
for (int j = 1; j < ny + 1; ++j) {
for (int i = 1; i < nx + 1; ++i) {
tmp_image[j + i * height] = image[j + i * height] * 3.0 / 5.0;
tmp_image[j + i * height] += image[j + (i - 1) * height] * 0.5 / 5.0;
tmp_image[j + i * height] += image[j + (i + 1) * height] * 0.5 / 5.0;
tmp_image[j + i * height] += image[j - 1 + i * height] * 0.5 / 5.0;
tmp_image[j + i * height] += image[j + 1 + i * height] * 0.5 / 5.0;
}
}
}
The 1-d array notation is very confusing. I am trying to convert it to a 2-d notation (which I find easier to read). Could someone point me in the right direction as to how I can accomplish this?
All this code is doing is creating a new image from an original image by taking 60% from the corresponding pixel and 10% from each neighboring pixel.
When you see tmp_image[j + i * height], read it as tmp_image[i][j].
Changing the code to literally use 2D syntax may require knowing at least one of the dimensions at compile time, whereas now it is a runtime argument. So that might be a non-starter, unless you're using C++ and want to write or use a matrix class instead of plain arrays.
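If C++ is an option, a small wrapper can give you the (i, j) notation while both dimensions stay runtime values. GridView below is a hypothetical helper written for this answer, not a standard class; it assumes the same layout as the original code, data[j + i * height].

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning 2D view over a flat array laid out as data[j + i * height].
class GridView {
public:
    GridView(double* data, int width, int height)
        : data_(data), width_(width), height_(height) {
        assert(data != nullptr && width > 0 && height > 0);
    }
    double& operator()(int i, int j) const {
        assert(i >= 0 && i < width_ && j >= 0 && j < height_);
        return data_[j + static_cast<std::size_t>(i) * height_];
    }
private:
    double* data_;
    int width_;
    int height_;
};

// Same arithmetic as the original stencil, written with 2D indexing.
void stencil(int nx, int ny, int width, int height, double* image, double* tmp_image) {
    GridView in(image, width, height);
    GridView out(tmp_image, width, height);
    for (int j = 1; j < ny + 1; ++j) {
        for (int i = 1; i < nx + 1; ++i) {
            out(i, j)  = in(i, j)     * 3.0 / 5.0;
            out(i, j) += in(i - 1, j) * 0.5 / 5.0;
            out(i, j) += in(i + 1, j) * 0.5 / 5.0;
            out(i, j) += in(i, j - 1) * 0.5 / 5.0;
            out(i, j) += in(i, j + 1) * 0.5 / 5.0;
        }
    }
}
```

The index arithmetic lives in one place, so the loop body reads as "this pixel plus its four neighbours".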
I have been able to successfully implement midpoint displacement in a numbered array in another program, so I've tried to implement it in a 3D world to create terrain. However, the outcome of the algorithm isn't what I expected.
void MidPointDisplacement(float grid[][WIDTH], int left, int right, int top, int bottom, int row, int col)
{
int centreX = (left + right) / 2; //Get the centre of the row
int centreY = (top + bottom) / 2; //Get the centre of the column
if (centreX == left)
{
return;
}
if (centreY == bottom)
{
return;
}
grid[top][centreX] = ((grid[top][left] + grid[top][right]) / 2) + jitter; //define top
grid[bottom][centreX] = ((grid[bottom][left] + grid[bottom][right]) / 2) + jitter; //define bottom
grid[centreY][left] = ((grid[top][left] + grid[bottom][left]) / 2) + jitter; //define left
grid[centreY][right] = ((grid[top][right] + grid[bottom][right]) / 2) + jitter; //define right
grid[centreX][centreY] = ((grid[centreY][left] + grid[centreY][right] + grid[top][centreX] + grid[bottom][centreX]) / 4) + jitter; //Get centre
//decreased the random values
RANDMAX / 2;
RANDMIN / 2;
MidPointDisplacement(grid, centreX, right, centreY, bottom, row, col);
MidPointDisplacement(grid, left, centreX, top, centreY, row, col);
MidPointDisplacement(grid, centreX, right, top, centreY, row, col);
MidPointDisplacement(grid, left, centreX, centreY, bottom, row, col);
}
The result of the above is this:
3D Midpoint Displacement
However I expected a terrain like this:
What I expected
Is there any reason why this may be? I initially thought it could have been the jitter; however, decreasing the initial value doesn't solve the problem.
float jitter = rand() % (int)(RANDMAX - RANDMIN + 1) + RANDMIN;
Jitter is a global float that takes a random value between RANDMAX (initial value of 5.0f) and RANDMIN (initial value of -5.0f).
Solution:
To fix the above error I deleted the jitter I had created in the program and instead made jitter a float variable of 1.
In the MidpointDisplacement function I changed this:
grid[top][centreX] = ((grid[top][left] + grid[top][right]) / 2) + jitter; //define top
grid[bottom][centreX] = ((grid[bottom][left] + grid[bottom][right]) / 2) + jitter; //define bottom
grid[centreY][left] = ((grid[top][left] + grid[bottom][left]) / 2) + jitter; //define left
grid[centreY][right] = ((grid[top][right] + grid[bottom][right]) / 2) + jitter; //define right
grid[centreX][centreY] = ((grid[centreY][left] + grid[centreY][right] + grid[top][centreX] + grid[bottom][centreX]) / 4) + jitter; //Get centre
To this:
grid[top][centreX] = ((grid[top][left] + grid[top][right]) / 2) + ((rand() % 256) - 128) / 128.0F * JITTER_RANGE * jitter; //define top
grid[bottom][centreX] = ((grid[bottom][left] + grid[bottom][right]) / 2) + ((rand() % 256) - 128) / 128.0F * JITTER_RANGE * jitter; //define bottom
grid[centreY][left] = ((grid[top][left] + grid[bottom][left]) / 2) + ((rand() % 256) - 128) / 128.0F * JITTER_RANGE * jitter; //define left
grid[centreY][right] = ((grid[top][right] + grid[bottom][right]) / 2) + ((rand() % 256) - 128) / 128.0F * JITTER_RANGE * jitter; //define right
grid[centreY][centreX] = ((grid[centreY][left] + grid[centreY][right] + grid[top][centreX] + grid[bottom][centreX]) / 4) + ((rand() % 256) - 128) / 128.0F * JITTER_RANGE * jitter; //Get centre
I then called the function recursively as before, but this time with the jitter divided by 2.
Also, I discovered that I had centreX and centreY the wrong way around in this line (shown here in its original, uncorrected form):
grid[centreX][centreY] = ((grid[centreY][left] + grid[centreY][right] + grid[top][centreX] + grid[bottom][centreX]) / 4) + jitter; //Get centre
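For reference, here is a compact sketch of how the pieces of the fix above could fit together. It is not the exact code from my project: the grid is a std::vector of rows instead of a fixed-size array, JITTER_RANGE and jitter are folded into a single jitter parameter, the stop condition compares centreY with top, and randomOffset is a helper name invented for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

using Grid = std::vector<std::vector<float>>;

// Random offset in [-scale, scale), built the same way as in the fix above.
float randomOffset(float scale) {
    return ((std::rand() % 256) - 128) / 128.0f * scale;
}

void midPointDisplacement(Grid& grid, int left, int right, int top, int bottom, float jitter) {
    assert(left <= right && top <= bottom);
    const int centreX = (left + right) / 2;
    const int centreY = (top + bottom) / 2;
    if (centreX == left || centreY == top) return;  // region too small to split further

    // Edge midpoints: average of the two corners on that edge, plus a fresh offset.
    grid[top][centreX]    = (grid[top][left] + grid[top][right]) / 2 + randomOffset(jitter);
    grid[bottom][centreX] = (grid[bottom][left] + grid[bottom][right]) / 2 + randomOffset(jitter);
    grid[centreY][left]   = (grid[top][left] + grid[bottom][left]) / 2 + randomOffset(jitter);
    grid[centreY][right]  = (grid[top][right] + grid[bottom][right]) / 2 + randomOffset(jitter);
    // Centre: row index first, then column index.
    grid[centreY][centreX] = (grid[centreY][left] + grid[centreY][right] +
                              grid[top][centreX] + grid[bottom][centreX]) / 4 + randomOffset(jitter);

    const float next = jitter / 2;  // halve the displacement at each finer level
    midPointDisplacement(grid, left, centreX, top, centreY, next);
    midPointDisplacement(grid, centreX, right, top, centreY, next);
    midPointDisplacement(grid, left, centreX, centreY, bottom, next);
    midPointDisplacement(grid, centreX, right, centreY, bottom, next);
}
```

The two points that mattered in my case are visible here: every written cell gets its own random offset, and the offset scale shrinks as the squares get smaller.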
I have a uint8_t YUYV 422 (Interleaved) image array in memory and I want to be able to flip it both vertically and horizontally. I have successfully implemented a vertical flip but I'm having a problem with flipping both horizontally and vertically at the same time.
My code for the vertical flip, below, works perfectly.
int counter = 0;
int array_width = 2; // YUYV
for (int h = (m_Width * m_Height * array_width) - m_Width * array_width; h >= 0; h -= m_Width * array_width) // h >= 0 so the first source row is copied too
{
for (int w = 0; w < m_Width * array_width; w++)
{
flipped[counter] = buffer[h + w];
counter++;
}
}
However, the following vertical and horizontal flip code appears to work but there is a loss of definition. To better understand what I am referring to, please see my sample images.
int x = 0;
for (int n = m_Width * m_Height * 2 - 1; n >= 0; n -= 4)
{
flipped[x] = buffer[n - 3]; // Y0
flipped[x + 1] = buffer[n - 2]; // U
flipped[x + 2] = buffer[n - 1]; // Y1
flipped[x + 3] = buffer[n]; // V
x += 4;
}
As you can see, I am moving the YUYV components and keeping them in the same order. I don't believe that I am dropping pixels so I don't understand why I am losing definition. To reiterate, I don't see this problem when flipping vertically (Using the first code snippet).
Here is the reference image, please note the stem of the lamp:
This is the flipped image, the stem of the lamp has lost definition:
You also need to swap Y0 and Y1 in your loop.
int x = 0;
for (int n = m_Width * m_Height * 2 - 1; n >= 3; n -= 4)
{
flipped[x] = buffer[n - 1]; // Y1->Y0
flipped[x + 1] = buffer[n - 2]; // U
flipped[x + 2] = buffer[n - 3]; // Y0->Y1
flipped[x + 3] = buffer[n]; // V
x += 4;
}
While I was at it, since you're accessing n - 3 I changed the loop condition to be absolutely sure it was safe.
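For completeness, here is the same idea as a self-contained function. The name flipBoth and the use of std::vector are choices made for this sketch; it assumes a packed YUYV buffer of exactly width * height * 2 bytes with an even width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Flips a packed YUYV 4:2:2 frame both vertically and horizontally.
std::vector<std::uint8_t> flipBoth(const std::vector<std::uint8_t>& buffer,
                                   int width, int height) {
    assert(width > 0 && width % 2 == 0 && height > 0);
    const std::size_t size = static_cast<std::size_t>(width) * height * 2;
    assert(buffer.size() == size);
    std::vector<std::uint8_t> flipped(size);
    for (std::size_t dst = 0; dst < size; dst += 4) {
        const std::size_t src = size - 4 - dst;  // mirrored 4-byte group
        flipped[dst]     = buffer[src + 2];      // Y1 -> Y0
        flipped[dst + 1] = buffer[src + 1];      // U
        flipped[dst + 2] = buffer[src];          // Y0 -> Y1
        flipped[dst + 3] = buffer[src + 3];      // V
    }
    return flipped;
}
```

Each 4-byte group holds two pixels that share one U and one V sample, so reversing the pixel order means reversing the groups and also swapping the two luma bytes inside each group.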
m_Width * m_Height * 2 is not a multiple of 4 (the number of data blocks in the YUYV format). Try changing '2' into '4', and also array_width.
How can functions like this:
void Map::Display()
{
if(initialized)
{
HRESULT hr;
int hScrollPos = GetScrollPos(M_HWnd, SB_HORZ);
int vScrollPos = GetScrollPos(M_HWnd, SB_VERT);
D2D1_RECT_F region = {0,0,TILE_WIDTH,TILE_HEIGHT};
D2D1_RECT_F tFRegion = {0,0,TILE_WIDTH,21}; // tile front's region
Coor coor;
int tileHeight;
RECT rect;
GetWindowRect(M_HWnd, &rect);
int HWndWidth = rect.right - rect.left;
int HWndHeight = rect.bottom - rect.top;
pRT->BeginDraw();
pRT->Clear(D2D1::ColorF(0.45f, 0.76f, 0.98f, 1.0f));
pRT->SetAntialiasMode(D2D1_ANTIALIAS_MODE_ALIASED);
for(int x=0; x<nTiles; x++)
{
coor = ppTile[x]->Getcoor();
tileHeight = ppTile[x]->Getheight();
pRT->SetTransform(D2D1::Matrix3x2F::Identity());
if((coor.GetX() - 1) * (TILE_WIDTH * 0.5) - hScrollPos > 0 - TILE_WIDTH &&
(coor.GetX() - 1) * (TILE_WIDTH * 0.5) - hScrollPos < HWndWidth &&
((coor.GetY() - 1) * (TILE_HEIGHT * 0.5) * 1.5f) + ((MAX_MAP_HEIGHT - tileHeight) * (TILE_PIXEL_PER_LAYER)) + TILE_HEIGHT - vScrollPos > 0 - (TILE_HEIGHT * 2.5) &&
((coor.GetY() - 1) * (TILE_HEIGHT * 0.5) * 1.5f) + ((MAX_MAP_HEIGHT - tileHeight) * (TILE_PIXEL_PER_LAYER)) + TILE_HEIGHT - vScrollPos < HWndHeight)
{
/* Draws tiles */
pRT->SetTransform(D2D1::Matrix3x2F::Translation(
(coor.GetX() - 1) * (TILE_WIDTH * 0.5) - hScrollPos,
((coor.GetY() - 1) * (TILE_HEIGHT * 0.5) * 1.5f) + ((MAX_MAP_HEIGHT - tileHeight) * (TILE_PIXEL_PER_LAYER)) + TILE_HEIGHT - vScrollPos
));
pRT->FillRectangle( &region, pBmpTileBrush[ppTile[x]->GetType() + 1]);
/* Draws tiles' front */
if((coor.Y - 1) / 2 < mapSizeY - 1) // If we are not in the front row,
{
if(coor.X > 1)
{
for(int diffH = tileHeight - ppTile[x + mapSizeX - 1]->Getheight(); diffH == 0; diffH--)
{
pRT->SetTransform(D2D1::Matrix3x2F::Identity());
pRT->SetTransform(D2D1::Matrix3x2F::Translation(
(coor.GetX() - 1) * (TILE_WIDTH * 0.5) - hScrollPos,
((coor.GetY() - 1) * (TILE_HEIGHT * 0.5) * 1.5f) + ((MAX_MAP_HEIGHT - tileHeight) * (TILE_PIXEL_PER_LAYER)) + TILE_HEIGHT - vScrollPos + (TILE_HEIGHT * 0.75) + (diffH * TILE_PIXEL_PER_LAYER)
));
pRT->FillRectangle( &tFRegion, pBmpTileFrontBrush[ppTile[x]->GetType()]);
}
}
if(((coor.X -1) / 2) + 1 < mapSizeX)
{
for(int diffH = tileHeight - ppTile[x + mapSizeX]->Getheight(); diffH == 0; diffH--)
{
pRT->SetTransform(D2D1::Matrix3x2F::Identity());
pRT->SetTransform(D2D1::Matrix3x2F::Translation(
(coor.GetX() - 1) * (TILE_WIDTH * 0.5) - hScrollPos,
((coor.GetY() - 1) * (TILE_HEIGHT * 0.5) * 1.5f) + ((MAX_MAP_HEIGHT - tileHeight) * (TILE_PIXEL_PER_LAYER)) + TILE_HEIGHT - vScrollPos + (TILE_HEIGHT * 0.75) + (diffH * TILE_PIXEL_PER_LAYER)
));
pRT->FillRectangle( &tFRegion, pBmpTileFrontBrush[ppTile[x]->GetType()]);
}
}
if(coor.X == 1 || (coor.X - 1) / 2 == mapSizeY - 1) // If the tile if at any of left or right edge,
{
for(int n = ((TH * 1.5) / TPPL) - (ppTile[x + mapSizeY + mapSizeY - 1]->Getheight() - tileHeight); n>=0; n--)
{
pRT->SetTransform(D2D1::Matrix3x2F::Identity());
pRT->SetTransform(D2D1::Matrix3x2F::Translation(
(coor.X - 1) * (TILE_WIDTH * 0.5) - hScrollPos,
((coor.Y - 1) * (TILE_HEIGHT * 0.5) * 1.5f) + ((MAX_MAP_HEIGHT - tileHeight) * (TILE_PIXEL_PER_LAYER)) + TILE_HEIGHT - vScrollPos + (TILE_HEIGHT * 0.75) + (n * TILE_PIXEL_PER_LAYER)
));
pRT->FillRectangle( &tFRegion, pBmpTileFrontBrush[ppTile[x]->GetType()]);
}
}
}
else // If we are in the front row
{
for(int h = tileHeight; h >= 0; h--)
{
pRT->SetTransform(D2D1::Matrix3x2F::Identity());
pRT->SetTransform(D2D1::Matrix3x2F::Translation(
(coor.GetX() - 1) * (TILE_WIDTH * 0.5) - hScrollPos,
((coor.GetY() - 1) * (TILE_HEIGHT * 0.5) * 1.5f) + ((MAX_MAP_HEIGHT - tileHeight) * (TILE_PIXEL_PER_LAYER)) + TILE_HEIGHT - vScrollPos + (TILE_HEIGHT * 0.75) + (h * TILE_PIXEL_PER_LAYER)
));
pRT->FillRectangle( &tFRegion, pBmpTileFrontBrush[ppTile[x]->GetType()]);
}
}
}
}
pRT->SetAntialiasMode(D2D1_ANTIALIAS_MODE_PER_PRIMITIVE);
hr = pRT->EndDraw();
}
}
this:
Tile* Map::GetClickedTile(short xPos, short yPos)
{
Tile* pNoClickedTile = NULL;
int hScrollPos = GetScrollPos(M_HWnd, SB_HORZ);
int vScrollPos = GetScrollPos(M_HWnd, SB_VERT);
if(xPos < (mapSizeX * TILE_WIDTH) - hScrollPos) // If the click is within width of the map then...
{
Coor coor;
int height;
int currentTile;
int tileDistanceFromTop;
/* Checks if click is in an odd row of tiles */
int column = (xPos + hScrollPos) / TILE_WIDTH;
for (int y=mapSizeY-1; y>=0; y--)
{
currentTile = column + (y * (mapSizeX+mapSizeX-1));
coor = ppTile[currentTile]->Getcoor();
height = ppTile[currentTile]->Getheight();
tileDistanceFromTop = ((coor.Y / 2) * TILE_HEIGHT * 1.5f) + // Distance between two tiles
( (MAX_MAP_HEIGHT - height) * TILE_PIXEL_PER_LAYER) -
vScrollPos +
SPACE_LEFT_FOR_BACKGROUND;
/*if (tileDistanceFromTop < 0) // If the tile is partially hidden,
tileDistanceFromTop = tileDistanceFromTop % TILE_HEIGHT; // then % TILE_HEIGHT*/
if( yPos > tileDistanceFromTop &&
yPos < tileDistanceFromTop + TILE_HEIGHT)
{
/* Get relative coordinates */
int rpx = xPos % TILE_WIDTH;
int rpy = ( (yPos - SPACE_LEFT_FOR_BACKGROUND) -
(y * (TILE_HEIGHT /2) ) -
( ( MAX_MAP_HEIGHT - height) * TILE_PIXEL_PER_LAYER) +
vScrollPos) %
TILE_HEIGHT;
/* Checks if click is withing area of current tile */
if (rpy + (rpx / (TILE_WIDTH /16)) > TILE_HEIGHT * 0.25f && // if click is Down Right the Upper Left slope and,
rpy + (rpx / (TILE_WIDTH /16)) < TILE_HEIGHT * 1.25f && // it is UL the LR slope and,
rpy - (rpx / (TILE_WIDTH /16)) < TILE_HEIGHT * 0.75f && // it is UR the LL slope and,
rpy - (rpx / (TILE_WIDTH /16)) > TILE_HEIGHT * -0.25f) // it is DL the UR slope,
return ppTile[currentTile]; // Then return currentTile
}
}
/* Checks if click is in an even row of tiles */
column = (xPos + hScrollPos - (TILE_WIDTH/2)) / TILE_WIDTH;
for (int y=mapSizeY-2; y>=0; y--)
{
currentTile = column + (y * (mapSizeX+mapSizeX-1)) + mapSizeX;
coor = ppTile[currentTile]->Getcoor();
height = ppTile[currentTile]->Getheight();
tileDistanceFromTop = (((coor.Y - 1) / 2) * TILE_HEIGHT * 1.5f) + // Distance between two tiles
( (MAX_MAP_HEIGHT - height) * TILE_PIXEL_PER_LAYER) +
(TILE_HEIGHT * 0.75) -
vScrollPos +
SPACE_LEFT_FOR_BACKGROUND;
/*if (tileDistanceFromTop < 0)
tileDistanceFromTop = tileDistanceFromTop % TILE_HEIGHT;*/
if( yPos > tileDistanceFromTop &&
yPos < tileDistanceFromTop + TILE_HEIGHT)
{
/* Get relative coordinates */
int rpx = xPos % TILE_WIDTH;
int rpy = (int)((yPos - SPACE_LEFT_FOR_BACKGROUND) -
(y * (TILE_HEIGHT /2) ) -
( ( MAX_MAP_HEIGHT - height) * TILE_PIXEL_PER_LAYER) -
(TILE_HEIGHT * 0.675) +
vScrollPos) %
TILE_HEIGHT;
/* Checks if click is withing area of current tile */
if (rpy + (rpx / (TILE_WIDTH /16)) > TILE_HEIGHT * 0.25f && // if click is Down Right the Upper Left slope and,
rpy + (rpx / (TILE_WIDTH /16)) < TILE_HEIGHT * 1.25f && // it is UL the LR slope and,
rpy - (rpx / (TILE_WIDTH /16)) < TILE_HEIGHT * 0.75f && // it is UR the LL slope and,
rpy - (rpx / (TILE_WIDTH /16)) > TILE_HEIGHT * -0.25f) // it is DL the UR slope,
return ppTile[currentTile]; // Then return currentTile
}
}
}
return pNoClickedTile;
}
Or even this:
int Map::GetTileNByCoor(Coor coor)
{
return (coor.X / 2 + ((coor.Y - 1) * mapSizeY) - (coor.Y / 2));
}
be made easier to read? As my code grows bigger, I realize how important, if not at times necessary, it is to have clean, easy-to-read code. What are some tips to make code like the above cleaner?
My general refactoring practice is usually to do the following:
Pull out names for things that aren't apparent in the code. You can use local variables to give descriptive names to small pieces of code. So, in cases like your last example, what does coor.X / 2 + ((coor.Y - 1) * mapSizeY) represent?
In most cases it's better to have things named well than to worry about storing local variables (they are destroyed when the function returns, and you usually don't need to worry about memory or speed at such a fine grain).
Pull out groups of executing code into methods. A good rule of thumb is that if your function is more than 6 lines of code, you can probably pull a smaller function out of it. Your code will then read more like a description of what it's actually doing.
A very common place to look at this is loops. You can almost always pull the code inside a loop into its own function, with a good descriptive name.
After you have pulled out methods, you can group common shared functionality into smaller objects. It's almost always better to have smaller objects working together to do the work, than to have giant objects that do a lot of work. You want your objects to each have a single responsibility.
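As an illustration of the first two points, the screen-position expression that is repeated throughout Display() could be pulled out roughly like this. The constant values are placeholders (the real ones are defined elsewhere in your project), and ScreenPos, tileScreenPos and isTileVisible are names invented for this sketch.

```cpp
#include <cassert>

// Placeholder values; the real constants live elsewhere in the asker's project.
constexpr int TILE_WIDTH = 64;
constexpr int TILE_HEIGHT = 32;
constexpr int TILE_PIXEL_PER_LAYER = 8;
constexpr int MAX_MAP_HEIGHT = 10;

struct ScreenPos {
    float x;
    float y;
};

// Screen position of a tile, given its map coordinates, its height in layers
// and the current scroll offsets.
ScreenPos tileScreenPos(int coorX, int coorY, int tileHeight, int hScrollPos, int vScrollPos) {
    const float columnOffset = (coorX - 1) * (TILE_WIDTH * 0.5f);
    const float rowOffset = (coorY - 1) * (TILE_HEIGHT * 0.5f) * 1.5f;
    const float elevationOffset = (MAX_MAP_HEIGHT - tileHeight) * TILE_PIXEL_PER_LAYER;
    return {columnOffset - hScrollPos,
            rowOffset + elevationOffset + TILE_HEIGHT - vScrollPos};
}

// Replaces the four-line visibility condition in Display().
bool isTileVisible(ScreenPos pos, int windowWidth, int windowHeight) {
    assert(windowWidth > 0 && windowHeight > 0);
    return pos.x > -TILE_WIDTH && pos.x < windowWidth &&
           pos.y > -(TILE_HEIGHT * 2.5f) && pos.y < windowHeight;
}
```

With these two helpers, the body of the loop in Display() can compute the position once, test isTileVisible, and reuse the same position for every Translation call.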
Pretty solid code, well done. I would consider:
Comment the function itself at a high-level, and then add better comments for all the significant blocks in the code, and for anything unusually tricky.
Use descriptive consts or #defines for all the magic variables you're using. Why multiply by 0.675? What does 0.675 represent? Ditto 0.25, 1.25, -0.25 etc.
Turn things like the "Checks if click is withing area of current tile" test (and others) into a separate method that you call, for example isClickInsideTile(x,y,tile).
Add debug trace so that the next person responsible can enable debug to get diagnostics.
PS good job with your variable names and method names.
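To make the second and third suggestions concrete, the hit test might end up looking something like this. The constant values are placeholders, the edge names are my guess based on the comments in the question, and the assert on rpx is an assumption about the caller.

```cpp
#include <cassert>

// Placeholder values; the real constants live elsewhere in the asker's project.
constexpr int TILE_WIDTH = 64;
constexpr int TILE_HEIGHT = 32;

// Hypothetical names for the magic numbers, taken from the comments in the question.
constexpr float UPPER_LEFT_EDGE  = TILE_HEIGHT * 0.25f;
constexpr float LOWER_RIGHT_EDGE = TILE_HEIGHT * 1.25f;
constexpr float LOWER_LEFT_EDGE  = TILE_HEIGHT * 0.75f;
constexpr float UPPER_RIGHT_EDGE = TILE_HEIGHT * -0.25f;

// rpx and rpy are the click position relative to the tile, as computed in
// GetClickedTile(). Returns true if the click falls inside the diamond.
bool isClickInsideTile(int rpx, int rpy) {
    assert(rpx >= 0 && rpx < TILE_WIDTH);
    const int slopeOffset = rpx / (TILE_WIDTH / 16);
    const int alongDescending = rpy + slopeOffset;  // compared against the UL and LR edges
    const int alongAscending = rpy - slopeOffset;   // compared against the LL and UR edges
    return alongDescending > UPPER_LEFT_EDGE && alongDescending < LOWER_RIGHT_EDGE &&
           alongAscending < LOWER_LEFT_EDGE && alongAscending > UPPER_RIGHT_EDGE;
}
```

Both loops in GetClickedTile() can then call the same function instead of repeating the four comparisons.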
I'm trying to run an integer-to-integer lifting 5/3 on an image of lena. I've been following the paper "A low-power Low-memory system for wavelet-based image compression" by Walker, Nguyen, and Chen (Link active as of 7 Oct 2015).
I'm running into issues though. The image just doesn't seem to come out quite right. I appear to be overflowing slightly in the green and blue channels which means that subsequent passes of the wavelet function find high frequencies where there ought not to be any. I'm also pretty sure I'm getting something else wrong as I am seeing a line of the s0 image at the edges of the high frequency parts.
My function is as follows:
bool PerformHorizontal( Col24* pPixelsIn, Col24* pPixelsOut, int width, int pixelPitch, int height )
{
const int widthDiv2 = width / 2;
int y = 0;
while( y < height )
{
int x = 0;
while( x < width )
{
const int n = (x) + (y * pixelPitch);
const int n2 = (x / 2) + (y * pixelPitch);
const int s = n2;
const int d = n2 + widthDiv2;
// Non-lifting 5 / 3
/*pPixelsOut[n2 + widthDiv2].r = pPixelsIn[n + 2].r - ((pPixelsIn[n + 1].r + pPixelsIn[n + 3].r) / 2) + 128;
pPixelsOut[n2].r = ((4 * pPixelsIn[n + 2].r) + (2 * pPixelsIn[n + 2].r) + (2 * (pPixelsIn[n + 1].r + pPixelsIn[n + 3].r)) - (pPixelsIn[n + 0].r + pPixelsIn[n + 4].r)) / 8;
pPixelsOut[n2 + widthDiv2].g = pPixelsIn[n + 2].g - ((pPixelsIn[n + 1].g + pPixelsIn[n + 3].g) / 2) + 128;
pPixelsOut[n2].g = ((4 * pPixelsIn[n + 2].g) + (2 * pPixelsIn[n + 2].g) + (2 * (pPixelsIn[n + 1].g + pPixelsIn[n + 3].g)) - (pPixelsIn[n + 0].g + pPixelsIn[n + 4].g)) / 8;
pPixelsOut[n2 + widthDiv2].b = pPixelsIn[n + 2].b - ((pPixelsIn[n + 1].b + pPixelsIn[n + 3].b) / 2) + 128;
pPixelsOut[n2].b = ((4 * pPixelsIn[n + 2].b) + (2 * pPixelsIn[n + 2].b) + (2 * (pPixelsIn[n + 1].b + pPixelsIn[n + 3].b)) - (pPixelsIn[n + 0].b + pPixelsIn[n + 4].b)) / 8;*/
pPixelsOut[d].r = pPixelsIn[n + 1].r - (((pPixelsIn[n].r + pPixelsIn[n + 2].r) >> 1) + 127);
pPixelsOut[s].r = pPixelsIn[n].r + (((pPixelsOut[d - 1].r + pPixelsOut[d].r) >> 2) - 64);
pPixelsOut[d].g = pPixelsIn[n + 1].g - (((pPixelsIn[n].g + pPixelsIn[n + 2].g) >> 1) + 127);
pPixelsOut[s].g = pPixelsIn[n].g + (((pPixelsOut[d - 1].g + pPixelsOut[d].g) >> 2) - 64);
pPixelsOut[d].b = pPixelsIn[n + 1].b - (((pPixelsIn[n].b + pPixelsIn[n + 2].b) >> 1) + 127);
pPixelsOut[s].b = pPixelsIn[n].b + (((pPixelsOut[d - 1].b + pPixelsOut[d].b) >> 2) - 64);
x += 2;
}
y++;
}
return true;
}
There is definitely something wrong but I just can't figure it out. Can anyone with slightly more brain than me point out where I am going wrong? It's worth noting that you can see the un-lifted version of the Daub 5/3 above the working code, and this, too, gives me the same artifacts ... I'm very confused as I have had this working once before (it was over 2 years ago and I no longer have that code).
Any help would be much appreciated :)
Edit: I appear to have eliminated my overflow issues by clamping the low pass pixels to the 0 to 255 range. I'm slightly concerned this isn't the right solution though. Can anyone comment on this?
You can do some tests with extreme values to see the possibility of overflow. Example:
pPixelsOut[d].r = pPixelsIn[n + 1].r - (((pPixelsIn[n].r + pPixelsIn[n + 2].r) >> 1) + 127);
If:
pPixelsIn[n ].r == 255
pPixelsIn[n+1].r == 0
pPixelsIn[n+2].r == 255
Then:
pPixelsOut[d].r == -382
But if:
pPixelsIn[n ].r == 0
pPixelsIn[n+1].r == 255
pPixelsIn[n+2].r == 0
Then:
pPixelsOut[d].r == 128
You have a range of 511 possible values (-382 .. 128), so, in order to avoid overflow or clamping, you would need one extra bit, some quantization, or another encoding type!
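If you would rather not reason about the extremes by hand, a brute-force check over all 8-bit inputs is cheap. The function name predictRange is made up for this sketch; a, b and c stand for pPixelsIn[n], pPixelsIn[n + 1] and pPixelsIn[n + 2].

```cpp
#include <cassert>
#include <utility>

// Smallest and largest value the biased predict step can produce for 8-bit inputs.
std::pair<int, int> predictRange() {
    int lo = 0 - (((0 + 0) >> 1) + 127);  // value for a = b = c = 0
    int hi = lo;
    for (int a = 0; a <= 255; ++a) {
        for (int b = 0; b <= 255; ++b) {
            for (int c = 0; c <= 255; ++c) {
                const int d = b - (((a + c) >> 1) + 127);
                if (d < lo) lo = d;
                if (d > hi) hi = d;
            }
        }
    }
    assert(lo <= hi);
    return {lo, hi};
}
```

The result matches the two hand-picked cases: the minimum is -382 and the maximum is 128, which does not fit in 8 bits.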
I'm assuming the data have already been thresholded?
I also don't get why you're adding back in +127 and -64.
OK, I can losslessly run the forward and then the inverse transform as long as I store the output of the forward transform in a short. Obviously this takes up a little more space than I was hoping for, but it does give me a good starting point for going into the various compression algorithms. You can also, nicely, transform two 4-component pixels at a time using SSE2 instructions. This is the standard C forward transform I came up with:
const int16_t dr = (int16_t)pPixelsIn[n + 1].r - ((((int16_t)pPixelsIn[n].r + (int16_t)pPixelsIn[n + 2].r) >> 1));
const int16_t sr = (int16_t)pPixelsIn[n].r + ((((int16_t)pPixelsOut[d - 1].r + dr) >> 2));
const int16_t dg = (int16_t)pPixelsIn[n + 1].g - ((((int16_t)pPixelsIn[n].g + (int16_t)pPixelsIn[n + 2].g) >> 1));
const int16_t sg = (int16_t)pPixelsIn[n].g + ((((int16_t)pPixelsOut[d - 1].g + dg) >> 2));
const int16_t db = (int16_t)pPixelsIn[n + 1].b - ((((int16_t)pPixelsIn[n].b + (int16_t)pPixelsIn[n + 2].b) >> 1));
const int16_t sb = (int16_t)pPixelsIn[n].b + ((((int16_t)pPixelsOut[d - 1].b + db) >> 2));
pPixelsOut[d].r = dr;
pPixelsOut[s].r = sr;
pPixelsOut[d].g = dg;
pPixelsOut[s].g = sg;
pPixelsOut[d].b = db;
pPixelsOut[s].b = sb;
It is trivial to create the inverse of this (a VERY simple bit of algebra). It's worth noting, btw, that you need to invert the image from right to left, bottom to top. Next I'll see if I can shunt this data into uint8_ts and lose a bit or two of accuracy. For compression this really isn't a problem.
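To show the round trip, here is a sketch of the same lifting steps on a single channel of one line. It differs from my image code in a few ways that are assumptions of this example: it works on a plain std::vector, it mirrors the missing neighbour at each end of the line, and it runs predict and update as two separate loops, so the inverse here does not depend on traversal order. The names forward53 and inverse53 are made up. It also relies on >> of a negative int being an arithmetic shift, which holds on common compilers and is guaranteed from C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Forward integer 5/3 lifting on one line of even length.
// Output layout: first half low-pass (s), second half high-pass (d).
std::vector<std::int16_t> forward53(const std::vector<std::uint8_t>& x) {
    assert(x.size() >= 2 && x.size() % 2 == 0);
    const std::size_t half = x.size() / 2;
    std::vector<std::int16_t> out(x.size());
    for (std::size_t i = 0; i < half; ++i) {  // predict: d = odd - average of even neighbours
        const int left = x[2 * i];
        const int right = (i + 1 < half) ? x[2 * i + 2] : x[2 * i];  // mirror at the right edge
        out[half + i] = static_cast<std::int16_t>(x[2 * i + 1] - ((left + right) >> 1));
    }
    for (std::size_t i = 0; i < half; ++i) {  // update: s = even + quarter of neighbouring d
        const int dPrev = out[half + (i > 0 ? i - 1 : 0)];  // mirror at the left edge
        const int dCur = out[half + i];
        out[i] = static_cast<std::int16_t>(x[2 * i] + ((dPrev + dCur) >> 2));
    }
    return out;
}

// Inverse: undo the update step, then undo the predict step.
std::vector<std::uint8_t> inverse53(const std::vector<std::int16_t>& c) {
    assert(c.size() >= 2 && c.size() % 2 == 0);
    const std::size_t half = c.size() / 2;
    std::vector<std::uint8_t> x(c.size());
    for (std::size_t i = 0; i < half; ++i) {  // recover the even samples
        const int dPrev = c[half + (i > 0 ? i - 1 : 0)];
        const int dCur = c[half + i];
        x[2 * i] = static_cast<std::uint8_t>(c[i] - ((dPrev + dCur) >> 2));
    }
    for (std::size_t i = 0; i < half; ++i) {  // recover the odd samples
        const int left = x[2 * i];
        const int right = (i + 1 < half) ? x[2 * i + 2] : x[2 * i];
        x[2 * i + 1] = static_cast<std::uint8_t>(c[half + i] + ((left + right) >> 1));
    }
    return x;
}
```

Since the inverse subtracts exactly the integer term that the forward step added, inverse53(forward53(line)) returns the original line bit for bit, as long as the coefficients are kept in 16 bits.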