Quick code to resize DIB image and maintain good img quality

There are many algorithms for image resizing (Lanczos, bicubic, bilinear, etc.), but most of them are fairly complex and therefore consume too much CPU.
What I need is fast, relatively simple C++ code to resize images with acceptable quality.
Here is an example of what I'm currently doing:
for (int y = 0; y < height; y ++)
{
int srcY1Coord = int((double)(y * srcHeight) / height);
int srcY2Coord = min(srcHeight - 1, max(srcY1Coord, int((double)((y + 1) * srcHeight) / height) - 1));
for (int x = 0; x < width; x ++)
{
int srcX1Coord = int((double)(x * srcWidth) / width);
int srcX2Coord = min(srcWidth - 1, max(srcX1Coord, int((double)((x + 1) * srcWidth) / width) - 1));
int srcPixelsCount = (srcX2Coord - srcX1Coord + 1) * (srcY2Coord - srcY1Coord + 1);
RGB32 color32;
UINT32 r(0), g(0), b(0), a(0);
for (int xSrc = srcX1Coord; xSrc <= srcX2Coord; xSrc ++)
for (int ySrc = srcY1Coord; ySrc <= srcY2Coord; ySrc ++)
{
RGB32 curSrcColor32 = pSrcDIB->GetDIBPixel(xSrc, ySrc);
r += curSrcColor32.r; g += curSrcColor32.g; b += curSrcColor32.b; a += curSrcColor32.alpha;
}
color32.r = BYTE(r / srcPixelsCount); color32.g = BYTE(g / srcPixelsCount); color32.b = BYTE(b / srcPixelsCount); color32.alpha = BYTE(a / srcPixelsCount);
SetDIBPixel(x, y, color32);
}
}
The code above is fast enough, but the quality is not acceptable when scaling pictures up.
So, does anyone already have a fast, good-quality C++ code sample for scaling DIBs?
Note: I was using StretchDIBits before. It was extremely slow when I needed to downsize a 10000x10000 picture to 100x100. My code is much, much faster; I just want a bit higher quality.
P.S. I'm using my own SetPixel/GetPixel functions, which work directly on the data array and are fast. This is not a device context!

Why are you doing it on the CPU? Using GDI, there's a good chance of some hardware acceleration. Use StretchBlt and SetStretchBltMode.
In pseudocode:
create source dc and destination dc using CreateCompatibleDC
create source and destination bitmaps
SelectObject source bitmap into source DC and dest bitmap into dest DC
SetStretchBltMode
StretchBlt
release DCs

All right, here is the answer; I had to do it myself... It works very well for scaling pictures up (for scaling down, my initial code works well too). I hope someone finds a good use for it; it's fast enough and produces very good picture quality.
for (int y = 0; y < height; y ++)
{
double srcY1Coord = (y * srcHeight) / (double)height;
int srcY1CoordInt = (int)(srcY1Coord);
double srcY2Coord = ((y + 1) * srcHeight) / (double)height - 0.00000000001;
int srcY2CoordInt = min(maxSrcYcoord, (int)(srcY2Coord));
double yMultiplierForFirstCoord = (0.5 * (1 - (srcY1Coord - srcY1CoordInt)));
double yMultiplierForLastCoord = (0.5 * (srcY2Coord - srcY2CoordInt));
for (int x = 0; x < width; x ++)
{
double srcX1Coord = (x * srcWidth) / (double)width;
int srcX1CoordInt = (int)(srcX1Coord);
double srcX2Coord = ((x + 1) * srcWidth) / (double)width - 0.00000000001;
int srcX2CoordInt = min(maxSrcXcoord, (int)(srcX2Coord));
RGB32 color32;
ASSERT(srcX1Coord < srcWidth && srcY1Coord < srcHeight);
double r(0), g(0), b(0), a(0), multiplier(0);
for (int xSrc = srcX1CoordInt; xSrc <= srcX2CoordInt; xSrc ++)
for (int ySrc = srcY1CoordInt; ySrc <= srcY2CoordInt; ySrc ++)
{
RGB32 curSrcColor32 = pSrcDIB->GetDIBPixel(xSrc, ySrc);
double xMultiplier = xSrc < srcX1Coord ? (0.5 * (1 - (srcX1Coord - srcX1CoordInt))) : (xSrc >= srcX2Coord ? (0.5 * (srcX2Coord - srcX2CoordInt)) : 0.5);
double yMultiplier = ySrc < srcY1Coord ? yMultiplierForFirstCoord : (ySrc >= srcY2Coord ? yMultiplierForLastCoord : 0.5);
double curPixelMultiplier = xMultiplier + yMultiplier;
if (curPixelMultiplier > 0)
{
r += (curSrcColor32.r * curPixelMultiplier); g += (curSrcColor32.g * curPixelMultiplier); b += (curSrcColor32.b * curPixelMultiplier); a += (curSrcColor32.alpha * curPixelMultiplier);
multiplier += curPixelMultiplier;
}
}
color32.r = BYTE(r / multiplier); color32.g = BYTE(g / multiplier); color32.b = BYTE(b / multiplier); color32.alpha = BYTE(a / multiplier);
SetDIBPixel(x, y, color32);
}
}
P.S. Please don't ask why I'm not using StretchDIBits. Leave comments for those who understand that a system API is not always available or acceptable.
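For comparison with the weighting scheme above, a conventional bilinear resize can be sketched as follows. This is a minimal sketch, not the code from this answer: `Pixel32`, `blend` and `resizeBilinear` are hypothetical names, and the image is assumed to be a tightly packed, top-down array of 32-bit pixels rather than a real DIB with row padding.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the RGB32 type used above.
struct Pixel32 { std::uint8_t r, g, b, alpha; };

// Blend four neighbouring channel values with horizontal weight wx and
// vertical weight wy (both in [0, 1]), rounding to the nearest byte.
std::uint8_t blend(std::uint8_t p00, std::uint8_t p10,
                   std::uint8_t p01, std::uint8_t p11,
                   double wx, double wy)
{
    const double top = p00 + (p10 - p00) * wx;
    const double bottom = p01 + (p11 - p01) * wx;
    return static_cast<std::uint8_t>(top + (bottom - top) * wy + 0.5);
}

// Bilinear resize with aligned pixel centers: output pixel x samples the
// source position (x + 0.5) * srcW / dstW - 0.5, clamped to the image.
std::vector<Pixel32> resizeBilinear(const std::vector<Pixel32>& src,
                                    int srcW, int srcH, int dstW, int dstH)
{
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH);

    std::vector<Pixel32> dst(static_cast<std::size_t>(dstW) * dstH);
    const double scaleX = static_cast<double>(srcW) / dstW;
    const double scaleY = static_cast<double>(srcH) / dstH;

    for (int y = 0; y < dstH; ++y) {
        const double fy = std::min(std::max((y + 0.5) * scaleY - 0.5, 0.0), srcH - 1.0);
        const int y0 = static_cast<int>(fy);
        const int y1 = std::min(y0 + 1, srcH - 1);
        const double wy = fy - y0;
        const Pixel32* row0 = &src[static_cast<std::size_t>(y0) * srcW];
        const Pixel32* row1 = &src[static_cast<std::size_t>(y1) * srcW];

        for (int x = 0; x < dstW; ++x) {
            const double fx = std::min(std::max((x + 0.5) * scaleX - 0.5, 0.0), srcW - 1.0);
            const int x0 = static_cast<int>(fx);
            const int x1 = std::min(x0 + 1, srcW - 1);
            const double wx = fx - x0;

            Pixel32 out;
            out.r = blend(row0[x0].r, row0[x1].r, row1[x0].r, row1[x1].r, wx, wy);
            out.g = blend(row0[x0].g, row0[x1].g, row1[x0].g, row1[x1].g, wx, wy);
            out.b = blend(row0[x0].b, row0[x1].b, row1[x0].b, row1[x1].b, wx, wy);
            out.alpha = blend(row0[x0].alpha, row0[x1].alpha,
                              row1[x0].alpha, row1[x1].alpha, wx, wy);
            dst[static_cast<std::size_t>(y) * dstW + x] = out;
        }
    }
    return dst;
}
```

Bilinear filtering reads only four source pixels per output pixel, so it suits upscaling. For strong downscaling, averaging the whole source block as in the question's loop is usually the better fit.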

Again, why do it on the CPU? Why not use OpenGL / DirectX and fragment shaders? In pseudocode:
upload source texture (cache it if it's to be reused)
create destination texture
use shader program
render quad
download output texture
where shader program is the filtering method you're using. The GPU is much better at processing pixels than CPU/GetPixel/SetPixel.
You could probably find fragment shaders for lots of different filtering methods on the web - GPU Gems is a good place to start.

Related

Change Perlin noise algorithm to work with continuous procedural generation

Right now I have a Perlin noise function to which I pass a buffer of seeds and another buffer that the function fills with the noise values. I am using this to procedurally generate the heights of the vertices in a terrain. The problem is that the terrain is currently limited to the size of the buffer, but I want to continuously generate chunks that are consistent with each other, and I don't see how to do that with the function I am using. Here is the code for the algorithm. Is there anything I can change to make it work?
inline void perlInNoise2D(int nWidth,int nHeight, float *Seed, int nOctaves, float fBias, float *fOutput)
{
for(int x = 0; x < nWidth; x++)
{
for(int y = 0; y < nHeight; y++)
{
float fNoise = 0.0f;
float fScale = 1.0f;
float fScaleAccum = 0.0f;
for(int o = 0; o < nOctaves;o++)
{
int nPitch = nWidth >> o;
int sampleX1 = (x / nPitch) * nPitch;
int sampleY1 = (y / nPitch) * nPitch;
int sampleX2 = (sampleX1 + nPitch) % nWidth;
int sampleY2 = (sampleY1 + nPitch) % nWidth;
float fBlendX = (float)(x - sampleX1) / (float) nPitch;
float fBlendY = (float)(y - sampleY1) / (float) nPitch;
float fSampleT = (1.0f - fBlendX) * Seed[sampleY1 * nWidth + sampleX1] + fBlendX * Seed[sampleY1 * nWidth + sampleX2];
float fSampleB = (1.0f - fBlendX) * Seed[sampleY2 * nWidth + sampleX1] + fBlendX * Seed[sampleY2 * nWidth + sampleX2];
fNoise += (fBlendY * (fSampleB - fSampleT) + fSampleT) * fScale;
fScaleAccum += fScale;
fScale = fScale / fBias;
}
fOutput[(y * nWidth) + x] = fNoise / fScaleAccum;
}
}
}

Presumably this is tied in to a "map reveal" mechanism?
A common technique is to generate overlapping chunks and average them together. As a simple example, you generate chunks of 2*nWidth by 2*nHeight. You'd then have 4 overlapping chunks at any XY pos. At the edge of the map, you'll have a strip where not all chunks have been generated. When this part of the map needs to be revealed, you generate those chunks on the fly. This moves the edge outwards.
The averaging process already smooths out the boundary effects. You can make this more effective by smoothing out each individual chunk near its edges. Since the chunk edges do not coincide, the smoothing of different chunks does not coincide either. A simple triangle smooth could be sufficient (i.e. the smooth window is 1 in the middle, 0 at the edge, and linear in between) but you could also use a gaussian or any other function that peaks in the middle and gradually smooths towards the chunk edge.
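The overlapping-chunk idea above can be sketched roughly as follows. The sketch assumes square chunks of side 2 * n that start at every multiple of n, so every cell lies in four chunks, and it blends them with the triangle window described above. `chunkValue`, `tent`, `floorDiv` and `blendedHeight` are hypothetical names; `chunkValue` is only a hash-based placeholder for running the real noise function on a chunk seeded from its coordinates.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Placeholder for "the noise of chunk (cx, cy) at local cell (lx, ly)".
// A real program would run its noise function once per chunk, seeded from
// (cx, cy), and cache the result; a hash keeps this sketch self-contained.
float chunkValue(int cx, int cy, int lx, int ly)
{
    std::uint32_t h = 2166136261u;
    const int keys[4] = { cx, cy, lx, ly };
    for (int k : keys) {
        h = (h ^ static_cast<std::uint32_t>(k)) * 16777619u;
        h ^= h >> 15;
    }
    return (h & 0xFFFFFFu) / static_cast<float>(0xFFFFFF); // in [0, 1]
}

// Triangle window over a chunk of side 2 * n: largest at the chunk center,
// falling linearly towards (but never reaching) zero at the edges.
float tent(int local, int n)
{
    return 1.0f - std::fabs(local + 0.5f - n) / n;
}

// Integer division that rounds towards negative infinity (b > 0).
int floorDiv(int a, int b)
{
    return (a >= 0) ? a / b : -((-a + b - 1) / b);
}

// Height at world cell (x, y) as the window-weighted average of the
// 2 x 2 chunks that cover it.
float blendedHeight(int x, int y, int n)
{
    assert(n > 0);
    const int bx = floorDiv(x, n);
    const int by = floorDiv(y, n);
    float sum = 0.0f;
    float weightSum = 0.0f;
    for (int cy = by - 1; cy <= by; ++cy) {
        for (int cx = bx - 1; cx <= bx; ++cx) {
            const int lx = x - cx * n; // local coordinate in [0, 2n)
            const int ly = y - cy * n;
            const float w = tent(lx, n) * tent(ly, n);
            sum += w * chunkValue(cx, cy, lx, ly);
            weightSum += w;
        }
    }
    return sum / weightSum;
}
```

Because the result depends only on world coordinates, any chunk can be generated on demand and will agree with its neighbours. With this particular window the two weights along each axis sum to 1, so the division by `weightSum` is only a safeguard.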

What is the highest bit depth greyscale image I can export from FreeImage?

As context, I'm working with building a topographic program which needs relatively extreme detail. I do not expect the files to be small, and they do not formally need to be viewed on a monitor, they just need to have very high resolution.
I know that most image formats are limited to 8 bpp, on account of the standard limits on both monitors (at a reasonable price) and on human perception. However, 2⁸ is just 256 possible values, which induces plateauing artifacts in a reconstructed displacement. 2¹⁶ may be close enough at 65,536 possible values, which I have achieved.
I'm using FreeImage and DLang to construct the data, currently on a Linux Mint machine.
However, when I went on to 2³², software support seemed to fade on me. I tried a TIFF of this form and nothing seemed to be able to interpret it, either showing a completely (or mostly) transparent image (remembering that I didn't expect any monitor to really support 2³² shades of a channel) or complaining about being unable to decode the RGB data. I imagine that it's because it was assumed to be an RGB or RGBA image.
FreeImage is reasonably well documented for most purposes, but I'm now wondering, what is the highest-precision single-channel format I can export, and how would I do it? Can anyone provide an example? Am I really limited, in any typical and not-home-rolled image format, to 16-bit? I know that's high enough for, say, medical imaging, but I'm sure I'm not the first person to try to aim higher and we science-types can be pretty ambitious about our precision-level…
Did I make a glaring mistake in my code? Is there something else I should try instead for this kind of precision?
Here's my code.
The 16-bit TIFF that worked
void writeGrayscaleMonochromeBitmap(const double width, const double height) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT16, cast(int)width, cast(int)height);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
ushort v = cast(ushort)((x * 0xFFFF)/width);
ubyte[2] bytes = nativeToLittleEndian(cast(ushort)(x/width * 0xFFFF));
scanline[x * ushort.sizeof + 0] = bytes[0];
scanline[x * ushort.sizeof + 1] = bytes[1];
}
}
FreeImage_Save(FIF_TIFF, bitmap, "test.tif", TIFF_DEFAULT);
FreeImage_Unload(bitmap);
}
The 32-bit TIFF that didn't really work
void writeGrayscaleMonochromeBitmap32(const double width, const double height) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT32, cast(int)width, cast(int)height);
writeln(width, ", ", height);
writeln("Width: ", FreeImage_GetWidth(bitmap));
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
writeln(y, ": ", scanline);
for(int x = 0; x < width; x++) {
//writeln(x, " < ", width);
uint v = cast(uint)((x/width) * 0xFFFFFFFF);
writeln("V: ", v);
ubyte[4] bytes = nativeToLittleEndian(v);
scanline[x * uint.sizeof + 0] = bytes[0];
scanline[x * uint.sizeof + 1] = bytes[1];
scanline[x * uint.sizeof + 2] = bytes[2];
scanline[x * uint.sizeof + 3] = bytes[3];
}
}
FreeImage_Save(FIF_TIFF, bitmap, "test32.tif", TIFF_NONE);
FreeImage_Unload(bitmap);
}
Thanks for any pointers.

For a single channel, the highest available from FreeImage is 32-bit, as FIT_UINT32. However, the file format must be capable of this, and as of the moment, only TIFF appears to be up to the task (See page 104 of the Stanford Documentation). Additionally, most monitors are incapable of representing more than 8-bits-per-sample, 12 in extreme cases, so it is very difficult to read data back out and have it render properly.
A unit test that compares the bytes before marshaling to the bitmap with the bytes sampled from the same bitmap afterward shows that the data is in fact being encoded.
To imprint data to a 16-bit gray scale (currently supported by J2K, JP2, PGM, PGMRAW, PNG and TIF), you would do something like this:
void toFreeImageUINT16PNG(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT16, cast(int)width, cast(int)height);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
//This magic has to happen with the y-coordinate in order to keep FreeImage from following its default behavior, and generating
//the image upside down.
ushort v = cast(ushort)(data[cast(ulong)(((height - 1) - y) * width + x)] * 0xFFFF); //((x * 0xFFFF)/width);
ubyte[2] bytes = nativeToLittleEndian(v);
scanline[x * ushort.sizeof + 0] = bytes[0];
scanline[x * ushort.sizeof + 1] = bytes[1];
}
}
FreeImage_Save(FIF_PNG, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
Of course you would want to make adjustments for your target file type. To export as 48-bit RGB16, you would do this.
void toFreeImageColorPNG(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_RGB16, cast(int)width, cast(int)height);
uint pitch = FreeImage_GetPitch(bitmap);
uint bpp = FreeImage_GetBPP(bitmap);
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
ulong offset = cast(ulong)((((height - 1) - y) * width + x) * 3);
ushort r = cast(ushort)(data[(offset + 0)] * 0xFFFF);
ushort g = cast(ushort)(data[(offset + 1)] * 0xFFFF);
ushort b = cast(ushort)(data[(offset + 2)] * 0xFFFF);
ubyte[6] bytes = nativeToLittleEndian(r) ~ nativeToLittleEndian(g) ~ nativeToLittleEndian(b);
scanline[(x * 3 * ushort.sizeof) + 0] = bytes[0];
scanline[(x * 3 * ushort.sizeof) + 1] = bytes[1];
scanline[(x * 3 * ushort.sizeof) + 2] = bytes[2];
scanline[(x * 3 * ushort.sizeof) + 3] = bytes[3];
scanline[(x * 3 * ushort.sizeof) + 4] = bytes[4];
scanline[(x * 3 * ushort.sizeof) + 5] = bytes[5];
}
}
FreeImage_Save(FIF_PNG, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
Lastly, to encode a UINT32 greyscale image (limited purely to TIFF at the moment), you would do this.
void toFreeImageTIF32(string fileName, const double width, const double height, double[] data) {
FIBITMAP *bitmap = FreeImage_AllocateT(FIT_UINT32, cast(int)width, cast(int)height);
//DEBUG
int xtest = cast(int)(width/2);
int ytest = cast(int)(height/2);
uint comp1a = cast(uint)(data[cast(ulong)(((height - 1) - ytest) * width + xtest)] * 0xFFFFFFFF);
writeln("initial: ", nativeToLittleEndian(comp1a));
for(int y = 0; y < height; y++) {
ubyte *scanline = FreeImage_GetScanLine(bitmap, y);
for(int x = 0; x < width; x++) {
//This magic has to happen with the y-coordinate in order to keep FreeImage from following its default behavior, and generating
//the image upside down.
ulong i = cast(ulong)(((height - 1) - y) * width + x);
uint v = cast(uint)(data[i] * 0xFFFFFFFF);
ubyte[4] bytes = nativeToLittleEndian(v);
scanline[x * uint.sizeof + 0] = bytes[0];
scanline[x * uint.sizeof + 1] = bytes[1];
scanline[x * uint.sizeof + 2] = bytes[2];
scanline[x * uint.sizeof + 3] = bytes[3];
}
}
//DEBUG
ulong index = cast(ulong)(xtest * uint.sizeof);
writeln("Final: ", FreeImage_GetScanLine(bitmap, ytest)
[index .. index + uint.sizeof]);
FreeImage_Save(FIF_TIFF, bitmap, fileName.toStringz);
FreeImage_Unload(bitmap);
}
I've yet to find a program, built by anyone else, which will readily render a 32-bit gray-scale image on a monitor's available palette. However, I left my checking code in; it consistently writes out the same byte array at both the top DEBUG point and the bottom one, and that's consistent enough for me.
Hopefully this will help someone else out in the future.
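The quantize-and-pack step used by the functions above can be checked without FreeImage. The following C++ sketch is an illustration under stated assumptions, not FreeImage API: `quantize32`, `packRows32` and `readSample32` are hypothetical helpers, samples are assumed to be normalized to [0, 1], and values are rounded and clamped rather than truncated as in the D code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map a normalized sample to the full 32-bit range. Out-of-range input
// (including NaN) is clamped instead of overflowing the cast.
std::uint32_t quantize32(double v)
{
    if (!(v > 0.0)) return 0u;
    if (v >= 1.0) return 0xFFFFFFFFu;
    return static_cast<std::uint32_t>(std::llround(v * 4294967295.0));
}

// Pack top-down samples into bottom-up rows of little-endian 32-bit
// values, mirroring the flipped row index used in the loops above.
std::vector<std::uint8_t> packRows32(const std::vector<double>& data,
                                     int width, int height)
{
    assert(width > 0 && height > 0);
    assert(data.size() == static_cast<std::size_t>(width) * height);

    std::vector<std::uint8_t> out(data.size() * 4);
    for (int y = 0; y < height; ++y) {
        const std::size_t srcRow = static_cast<std::size_t>(height - 1 - y);
        for (int x = 0; x < width; ++x) {
            const std::uint32_t v = quantize32(data[srcRow * width + x]);
            const std::size_t o = (static_cast<std::size_t>(y) * width + x) * 4;
            out[o + 0] = static_cast<std::uint8_t>(v & 0xFFu);
            out[o + 1] = static_cast<std::uint8_t>((v >> 8) & 0xFFu);
            out[o + 2] = static_cast<std::uint8_t>((v >> 16) & 0xFFu);
            out[o + 3] = static_cast<std::uint8_t>((v >> 24) & 0xFFu);
        }
    }
    return out;
}

// Read one sample back from the packed bytes; storedRow counts rows in
// storage order (row 0 is the bottom row of the picture).
std::uint32_t readSample32(const std::vector<std::uint8_t>& bytes,
                           int width, int x, int storedRow)
{
    const std::size_t o = (static_cast<std::size_t>(storedRow) * width + x) * 4;
    return static_cast<std::uint32_t>(bytes[o])
         | (static_cast<std::uint32_t>(bytes[o + 1]) << 8)
         | (static_cast<std::uint32_t>(bytes[o + 2]) << 16)
         | (static_cast<std::uint32_t>(bytes[o + 3]) << 24);
}
```

Comparing `quantize32` of an input sample with `readSample32` at the flipped row is the same before/after check as the two DEBUG prints, without needing a viewer that can display 32-bit greyscale.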

How to downsample a not-power-of-2 texture in UnrealEngine?

I am rendering the viewport at a resolution such as 1920x1080 multiplied by an oversampling value such as 4. Now I need to downsample from the rendered resolution of 7680x4320 back to 1920x1080.
Are there any functions in Unreal I could use for that? Or any library (Windows only) that handles this nicely?
Or what would be a proper way of writing this on my own?
We tried to implement downsampling, but it only works if SnapshotScale is 2; when it's higher than 2, it doesn't seem to have any effect on image quality.
UTexture2D* AAVESnapShotManager::DownsampleTexture(UTexture2D* Texture)
{
UTexture2D* Result = UTexture2D::CreateTransient(RenderSettings.imageWidth, RenderSettings.imageHeight, PF_B8G8R8A8);
void* TextureDataVoid = Texture->PlatformData->Mips[0].BulkData.Lock(LOCK_READ_ONLY);
void* ResultDataVoid = Result->PlatformData->Mips[0].BulkData.Lock(LOCK_READ_WRITE);
FColor* TextureData = (FColor*)TextureDataVoid;
FColor* ResultData = (FColor*)ResultDataVoid;
int32 WindowSize = RenderSettings.resolutionScale / 2;
for (int x = 0; x < Result->GetSizeX(); ++x)
{
for (int y = 0; y < Result->GetSizeY(); ++y)
{
const uint32 ResultIndex = y * Result->GetSizeX() + x;
uint32_t R = 0, G = 0, B = 0, A = 0;
int32 Samples = 0;
for (int32 dx = -WindowSize; dx < WindowSize; ++dx)
{
for (int32 dy = -WindowSize; dy < WindowSize; ++dy)
{
int32 PosX = (x * RenderSettings.resolutionScale + dx);
int32 PosY = (y * RenderSettings.resolutionScale + dy);
if (PosX < 0 || PosX >= Texture->GetSizeX() || PosY < 0 || PosY >= Texture->GetSizeY())
{
continue;
}
size_t TextureIndex = PosY * Texture->GetSizeX() + PosX;
FColor& Color = TextureData[TextureIndex];
R += Color.R;
G += Color.G;
B += Color.B;
A += Color.A;
++Samples;
}
}
ResultData[ResultIndex] = FColor(R / Samples, G / Samples, B / Samples, A / Samples);
}
}
Texture->PlatformData->Mips[0].BulkData.Unlock();
Result->PlatformData->Mips[0].BulkData.Unlock();
Result->UpdateResource();
return Result;
}
I expect a high-quality oversampled texture output that works with any positive integer value of SnapshotScale.

I have a suggestion. It's not really direct, but it involves no writing of image filtering or importing of libraries.
Make an unlit Material with nodes TextureObject->TextureSample-> connect to Emissive.
Use the texture you start with in your function to populate the Texture Object on a Material Instance Dynamic of the material.
Use the "Draw Material to Render Target" function to draw the Material Instance Dynamic to a Render Target that is pre-set with your target resolution.
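If a CPU fallback is still wanted, a plain box filter for an integer scale factor could look like the sketch below. It is an illustration under assumptions, not Unreal API: `Rgba8` and `downsampleBox` are hypothetical names, the buffer is assumed to be tightly packed and row-major, and the source size is assumed to be an exact multiple of the scale. Each output pixel averages the scale x scale block that starts at (x * scale, y * scale).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit RGBA pixel; the channel order does not matter here.
struct Rgba8 { std::uint8_t r, g, b, a; };

// Box-filter downsample by an integer factor. Rows are the outer loop so
// that the source is read in memory order.
std::vector<Rgba8> downsampleBox(const std::vector<Rgba8>& src,
                                 int srcW, int srcH, int scale)
{
    assert(scale > 0 && srcW > 0 && srcH > 0);
    assert(srcW % scale == 0 && srcH % scale == 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH);

    const int dstW = srcW / scale;
    const int dstH = srcH / scale;
    const std::uint32_t count = static_cast<std::uint32_t>(scale) * scale;
    const std::uint32_t half = count / 2; // added before dividing to round

    std::vector<Rgba8> dst(static_cast<std::size_t>(dstW) * dstH);
    for (int y = 0; y < dstH; ++y) {
        for (int x = 0; x < dstW; ++x) {
            std::uint32_t r = 0, g = 0, b = 0, a = 0;
            for (int dy = 0; dy < scale; ++dy) {
                const std::size_t row =
                    (static_cast<std::size_t>(y) * scale + dy) * srcW;
                for (int dx = 0; dx < scale; ++dx) {
                    const Rgba8& p = src[row + static_cast<std::size_t>(x) * scale + dx];
                    r += p.r;
                    g += p.g;
                    b += p.b;
                    a += p.a;
                }
            }
            Rgba8 out;
            out.r = static_cast<std::uint8_t>((r + half) / count);
            out.g = static_cast<std::uint8_t>((g + half) / count);
            out.b = static_cast<std::uint8_t>((b + half) / count);
            out.a = static_cast<std::uint8_t>((a + half) / count);
            dst[static_cast<std::size_t>(y) * dstW + x] = out;
        }
    }
    return dst;
}
```

Unlike the loop in the question, the window here is not centered on x * scale and never leaves the image, so every output pixel averages exactly scale * scale samples. Whether that difference explains the behaviour seen above a scale of 2 is not established here.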

Masking image and video in opencv c++

I am a beginner in image processing, especially in OpenCV C++, and I have a problem in my work. In C# with EmguCV it is possible to mask image and video files based on an ROI. Is it possible to make masks the same way in OpenCV C++? I have tried to use an ROI in OpenCV C++, but the result only crops the image, unlike the example that I attached. I have also attached the pseudocode for masking in C# with EmguCV, but I have not yet found a C++ version. I am looking forward to any answer. Thank you.
pixelSize, out long processingTime)
{
int x = imageInput.Width / pixelSize;
int y = imageInput.Height / pixelSize;
Mat imageBlock = new Mat();
Point darkestBlockPoint = new Point();
int darkestBlockValue = 100000;
//AppendLogTxt("", "y,x,value", "masking");
for (int i = marginV; i < y - marginV; i++)
{
for (int j = marginH; j < x - marginH; j++)
{
imageBlock = new Mat(imageInput, new Rectangle(j * pixelSize, i * pixelSize, pixelSize, pixelSize));
MCvScalar avg = CvInvoke.Mean(imageBlock);
//AppendLogTxt("", i.ToString() + "," + j.ToString() + "," + avg.V0.ToString(), "masking");
if ((int)avg.V0 < darkestBlockValue)
{
darkestBlockValue = (int)avg.V0;
darkestBlockPoint.X = j;
darkestBlockPoint.Y = i;
}
}
}
darkestBlockPoint.X = darkestBlockPoint.X * pixelSize + pixelSize / 2;
darkestBlockPoint.Y = darkestBlockPoint.Y * pixelSize + pixelSize / 2;
return darkestBlockPoint;
}
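The difference between cropping and masking can be sketched without any library. In this minimal sketch the output keeps the input size and every pixel outside the region of interest is set to zero; `Roi` and `maskOutsideRoi` are hypothetical names, and a single-channel 8-bit, row-major image is assumed. In OpenCV C++ the same effect is normally obtained by filling an 8-bit mask inside the ROI and calling `cv::Mat::copyTo` with that mask.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical rectangle type: top-left corner plus size, in pixels.
struct Roi { int x, y, width, height; };

// Return an image of the same size in which only the pixels inside the
// ROI are kept; everything else is black. The ROI is clipped to the image.
std::vector<std::uint8_t> maskOutsideRoi(const std::vector<std::uint8_t>& image,
                                         int width, int height, Roi roi)
{
    assert(width > 0 && height > 0);
    assert(image.size() == static_cast<std::size_t>(width) * height);

    std::vector<std::uint8_t> out(image.size(), 0);
    const int x0 = std::max(roi.x, 0);
    const int y0 = std::max(roi.y, 0);
    const int x1 = std::min(roi.x + roi.width, width);   // exclusive
    const int y1 = std::min(roi.y + roi.height, height); // exclusive
    if (x1 <= x0) {
        return out; // ROI lies completely outside the image
    }

    for (int y = y0; y < y1; ++y) {
        const std::ptrdiff_t row = static_cast<std::ptrdiff_t>(y) * width;
        std::copy(image.begin() + row + x0,
                  image.begin() + row + x1,
                  out.begin() + row + x0);
    }
    return out;
}
```

A crop would instead return only the `roi.width` by `roi.height` pixels, which matches the result described in the question.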

Character recognition from an image C++

*Note: while this post is mostly asking about bilinear interpolation, I kept the title more general and included extra information in case someone has ideas on how I could do this better.
I have been having trouble implementing a way to identify letters from an image in order to create a word search solving program. For mainly educational but also portability purposes, I have been attempting this without the use of a library. It can be assumed that the image the characters are picked from contains nothing but the puzzle. Although this page only recognizes a small set of characters, I have been using it to guide my efforts, along with this one as well.
As the article suggested, I have an image of each letter scaled down to 5x5 to compare each unknown letter to. I have had the best success by scaling the unknown down to 5x5 using bilinear resampling and summing the squares of the differences in intensity of each corresponding pixel in the known and unknown images. To attempt to get more accurate results, I also added the square of the difference in width:height ratios and in the white:black pixel ratios of the top half and bottom half of each image. The known image with the closest "difference score" to the unknown image is then considered to be the unknown letter. The problem is that this seems to have only about 50% accuracy.
To improve this I have tried using larger samples (15x15 instead of 5x5), but this proved even less effective. I also tried to go through the known and unknown images looking for features and shapes, and to determine a match based on two images having about the same number of the same features. This proved less effective than the original method. For example, shapes like the following were identified and counted up (where ■ represents a black pixel):
■ ■ ■ ■
■ ■
So here is an example: the following image gets loaded:
The program then converts it to monochrome by determining if each pixel has an intensity above or below the average intensity of an 11x11 square using a summed area table, fixes the skew and picks out the letters by identifying an area of relatively equal spacing. I then use the intersecting horizontal and vertical spaces to get a general idea of where each character is. Next I make sure that the entire letter is contained in each square picked out by going line by line, above, below, left and right of the original square until the square's border detects no dark pixels on it.
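The local-mean thresholding step described above can be sketched with a summed-area table as follows. This is a sketch under assumptions, not the code of the program in question: `buildSat` and `thresholdLocalMean` are hypothetical names, the image is assumed to be 8-bit greyscale in row-major order, the window is clipped at the borders, and a pixel counts as dark only when it is strictly below the window mean. A radius of 5 gives the 11x11 square.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Summed-area table with an extra leading row and column of zeros:
// sat[(y + 1) * (w + 1) + (x + 1)] holds the sum of all pixels in the
// rectangle from (0, 0) to (x, y) inclusive.
std::vector<std::uint64_t> buildSat(const std::vector<std::uint8_t>& gray,
                                    int w, int h)
{
    assert(w > 0 && h > 0);
    assert(gray.size() == static_cast<std::size_t>(w) * h);
    const std::size_t stride = static_cast<std::size_t>(w) + 1;
    std::vector<std::uint64_t> sat(stride * (static_cast<std::size_t>(h) + 1), 0);
    for (int y = 0; y < h; ++y) {
        std::uint64_t rowSum = 0;
        for (int x = 0; x < w; ++x) {
            rowSum += gray[static_cast<std::size_t>(y) * w + x];
            sat[(y + 1) * stride + (x + 1)] = sat[y * stride + (x + 1)] + rowSum;
        }
    }
    return sat;
}

// Mark a pixel as ink (1) when it is darker than the mean of the
// (2 * radius + 1)-sided window around it, otherwise 0.
std::vector<std::uint8_t> thresholdLocalMean(const std::vector<std::uint8_t>& gray,
                                             int w, int h, int radius)
{
    assert(radius >= 0);
    const std::vector<std::uint64_t> sat = buildSat(gray, w, h);
    const std::size_t stride = static_cast<std::size_t>(w) + 1;
    std::vector<std::uint8_t> ink(gray.size(), 0);

    for (int y = 0; y < h; ++y) {
        const int y0 = std::max(y - radius, 0);
        const int y1 = std::min(y + radius, h - 1);
        for (int x = 0; x < w; ++x) {
            const int x0 = std::max(x - radius, 0);
            const int x1 = std::min(x + radius, w - 1);
            const std::uint64_t area =
                static_cast<std::uint64_t>(x1 - x0 + 1) * (y1 - y0 + 1);
            // Four table lookups give the window sum in constant time.
            const std::uint64_t sum = sat[(y1 + 1) * stride + (x1 + 1)]
                                    + sat[y0 * stride + x0]
                                    - sat[y0 * stride + (x1 + 1)]
                                    - sat[(y1 + 1) * stride + x0];
            const std::size_t i = static_cast<std::size_t>(y) * w + x;
            // Compare pixel * area with sum to avoid a division per pixel.
            ink[i] = (gray[i] * area < sum) ? 1 : 0;
        }
    }
    return ink;
}
```

The table costs one pass over the image, after which each window mean takes four lookups regardless of the window size.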
Then I take each letter, resample it and compare it to the known images.
*Note: the known samples are using arial font size 12, rescaled in photoshop to 5x5 using bilinear interpolation.
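The comparison step can be sketched as a nearest-template search over the sum of squared intensity differences. The sketch covers only the pixel term of the score; the ratio terms mentioned earlier are left out. `Glyph`, `Known`, `squaredDifference` and `classify` are hypothetical names, and intensities are assumed to be floats in [0, 1].

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// A letter resampled to 5x5, stored row by row.
using Glyph = std::array<float, 25>;

// One labelled reference sample.
struct Known {
    char letter;
    Glyph pixels;
};

// Sum of squared per-pixel differences; 0 means the glyphs are identical.
float squaredDifference(const Glyph& a, const Glyph& b)
{
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Return the letter of the known sample with the smallest difference
// score. The known set must not be empty.
char classify(const Glyph& unknown, const std::vector<Known>& knowns)
{
    assert(!knowns.empty());
    float bestScore = std::numeric_limits<float>::max();
    char bestLetter = knowns.front().letter;
    for (const Known& k : knowns) {
        const float score = squaredDifference(unknown, k.pixels);
        if (score < bestScore) {
            bestScore = score;
            bestLetter = k.letter;
        }
    }
    return bestLetter;
}
```

Several samples may share the same letter in the known set; the search still returns the letter of the single closest sample.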
Here is an example of a successful match:
The following letter is picked out:
scaled down to:
which looks like
from afar. This is successfully matched to the known N sample:
Here is a failed match:
is picked out and scaled down to:
which, to no real surprise does not match to the known R sample
I changed how images are picked out, so that the letter is not cut off as you can see in the above images so I believe the issue comes from scaling the images down. Currently I am using bilinear interpolation to resample the image. To understand how exactly this works with downsampling I referred to the second answer in this post and came up with the following code. Previously I have tested that this code works (at least to a "this looks ok" point) so it could be a combination of factors causing problems.
void Image::scaleTo(int width, int height)
{
int originalWidth = this->width;
int originalHeight = this->height;
Image * originalData = new Image(this->width, this->height, 0, 0);
for (int i = 0; i < this->width * this->height; i++) {
int x = i % this->width;
int y = i / this->width;
originalData->setPixel(x, y, this->getPixel(x, y));
}
this->resize(width, height); //simply resizes the image, after the resize it is just a black bmp.
double factorX = (double)originalWidth / width;
double factorY = (double)originalHeight / height;
float * xCenters = new float[originalWidth]; //the following stores the "centers" of each pixel.
float * yCenters = new float[originalHeight];
float * newXCenters = new float[width];
float * newYCenters = new float[height];
//1 represents one of the originally sized pixel's side length
for (int i = 0; i < originalWidth; i++)
xCenters[i] = i + 0.5;
for (int i = 0; i < width; i++)
newXCenters[i] = (factorX * i) + (factorX / 2.0);
for (int i = 0; i < height; i++)
newYCenters[i] = (factorY * i) + (factorY / 2.0);
for (int i = 0; i < originalHeight; i++)
yCenters[i] = i + 0.5;
/* p[0] p[1]
p
p[2] p[3] */
//the following will find the closest points to the sampled pixel that still remain in this order
for (int x = 0; x < width; x++) {
for (int y = 0; y < height; y++) {
POINT p[4]; //POINT used is the Win32 struct POINT
float pDists[4] = { FLT_MAX, FLT_MAX, FLT_MAX, FLT_MAX };
float xDists[4];
float yDists[4];
for (int i = 0; i < originalWidth; i++) {
for (int j = 0; j < originalHeight; j++) {
float xDist = abs(xCenters[i] - newXCenters[x]);
float yDist = abs(yCenters[j] - newYCenters[y]);
float dist = sqrt(xDist * xDist + yDist * yDist);
if (xCenters[i] < newXCenters[x] && yCenters[j] < newYCenters[y] && dist < pDists[0]) {
p[0] = { i, j };
pDists[0] = dist;
xDists[0] = xDist;
yDists[0] = yDist;
}
else if (xCenters[i] > newXCenters[x] && yCenters[j] < newYCenters[y] && dist < pDists[1]) {
p[1] = { i, j };
pDists[1] = dist;
xDists[1] = xDist;
yDists[1] = yDist;
}
else if (xCenters[i] < newXCenters[x] && yCenters[j] > newYCenters[y] && dist < pDists[2]) {
p[2] = { i, j };
pDists[2] = dist;
xDists[2] = xDist;
yDists[2] = yDist;
}
else if (xCenters[i] > newXCenters[x] && yCenters[j] > newYCenters[y] && dist < pDists[3]) {
p[3] = { i, j };
pDists[3] = dist;
xDists[3] = xDist;
yDists[3] = yDist;
}
}
}
//channel is a typedef for unsigned char
//getOPixel(point) is a macro for originalData->getPixel(point.x, point.y)
float r1 = (xDists[3] / (xDists[2] + xDists[3])) * getOPixel(p[2]).r + (xDists[2] / (xDists[2] + xDists[3])) * getOPixel(p[3]).r;
float r2 = (xDists[1] / (xDists[0] + xDists[1])) * getOPixel(p[0]).r + (xDists[0] / (xDists[0] + xDists[1])) * getOPixel(p[1]).r;
float interpolated = (yDists[0] / (yDists[0] + yDists[3])) * r1 + (yDists[3] / (yDists[0] + yDists[3])) * r2;
channel r = (channel)round(interpolated);
r1 = (xDists[3] / (xDists[2] + xDists[3])) * getOPixel(p[2]).g + (xDists[2] / (xDists[2] + xDists[3])) * getOPixel(p[3]).g; //yDist[3]
r2 = (xDists[1] / (xDists[0] + xDists[1])) * getOPixel(p[0]).g + (xDists[0] / (xDists[0] + xDists[1])) * getOPixel(p[1]).g; //yDist[0]
interpolated = (yDists[0] / (yDists[0] + yDists[3])) * r1 + (yDists[3] / (yDists[0] + yDists[3])) * r2;
channel g = (channel)round(interpolated);
r1 = (xDists[3] / (xDists[2] + xDists[3])) * getOPixel(p[2]).b + (xDists[2] / (xDists[2] + xDists[3])) * getOPixel(p[3]).b; //yDist[3]
r2 = (xDists[1] / (xDists[0] + xDists[1])) * getOPixel(p[0]).b + (xDists[0] / (xDists[0] + xDists[1])) * getOPixel(p[1]).b; //yDist[0]
interpolated = (yDists[0] / (yDists[0] + yDists[3])) * r1 + (yDists[3] / (yDists[0] + yDists[3])) * r2;
channel b = (channel)round(interpolated);
this->setPixel(x, y, { r, g, b });
}
}
delete[] xCenters;
delete[] yCenters;
delete[] newXCenters;
delete[] newYCenters;
delete originalData;
}
I have the utmost respect for anyone even remotely willing to sift through this to try and help. Any and all suggestions will be greatly appreciated.
UPDATE:
So, as suggested, I started augmenting the known data set with scaled-down letters from word searches. This improved accuracy from about 50% to 70% (the percentages are calculated from a very small sample size, so take the numbers lightly). I'm using the original set of characters as a base (this original set was actually the most accurate of the sets I've tried, e.g. a set calculated using the same resampling algorithm, a set using a different font, etc.), and I am manually adding knowns to that set. I manually assign the first 20 or so images picked out in a search their corresponding letter and save them into the known set folder. I am still choosing the closest match out of the entire known set for each letter. Is this still a good method, or should some kind of change be made?
I also implemented a feature where, if a letter is about a 90% match with a known letter, I assume the match is correct and add the current "unknown" to the list of knowns. I could see this going both ways: it could either (a) make the program more accurate over time or (b) solidify the original guess and possibly make the program less accurate over time. So far I have not noticed this cause a change, either for the better or for the worse. Am I on the right track with this? I'm not going to call this solved until I get the accuracy a little higher and test the program on more examples.
