Floyd Steinberg Dithering gray(pgm ascii) to black-white (pbm ascii) - c++

I have an image in PGM format.
After running this function:
void convertWithDithering(array_type& pixelgray)
{
int oldpixel;
int newpixel;
int quant_error;
for (int y = 0; y< HEIGHT-1; y++){
for (int x = 1; x<WIDTH-1; x++){
oldpixel = pixelgray[x][y];
newpixel = (oldpixel > 128) ? 0 : 1;
pixelgray[x][y] = newpixel;
quant_error = oldpixel - newpixel;
pixelgray[x+1][y] = pixelgray[x+1][y] + 7/16 * quant_error;
pixelgray[x-1][y+1] = pixelgray[x-1][y+1] + 3/16 * quant_error;
pixelgray[x ][y+1]=pixelgray[x ][y+1]+ 5/16 * quant_error;
pixelgray[x+1][y+1] = pixelgray[x+1][y+1]+ 1/16 * quant_error;
}
}
}
I get this.
I want to get the same image, only in black and white.

Last time I had similar smearing with a PGM file, it was because I saved the data in a file opened with fopen(filename,"w");
The file had a lot of \r\n line endings (OS: Windows), whereas it needed only \n. Maybe your issue is something like that. Save the file in binary mode, with
fopen(filename,"wb");
Edit: Aside from the smearing, your implementation of Floyd–Steinberg dithering is incorrect.
First, your error propagation should be XXX * quant_error / 16 instead of XXX/16 * quant_error (which is always 0, because XXX/16 is an integer division).
Then, you are mixing up the two color spaces (0/1 and 0->255). A correct way to handle it is to always use the 0->255 space by changing the test line to
newpixel = (oldpixel > 128) ? 255 : 0;
(note that the order 255 : 0 is important; if you use 0 : 255, the algorithm won't work)
At the end of the function, your array will contain only 0 or 255. If you want, you can iterate one more time to convert it to 0-1, but I think it is easier to do that when you write your pbm file.
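The two fixes above can be put together roughly as follows. This is only a sketch under a few assumptions that are not in the question: the image is stored as image[y][x] in a vector of vectors (the names GrayImage, ditherFloydSteinberg and toPbmBit are made up for illustration), and neighbours outside the image are simply skipped rather than shrinking the loop bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grey image stored row by row, values nominally in 0..255: image[y][x].
using GrayImage = std::vector<std::vector<int>>;

// Floyd-Steinberg dithering in place. Afterwards every pixel is 0 or 255.
void ditherFloydSteinberg(GrayImage& image)
{
    const std::size_t height = image.size();
    const std::size_t width = height ? image[0].size() : 0;

    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const int oldPixel = image[y][x];
            const int newPixel = (oldPixel > 128) ? 255 : 0;
            image[y][x] = newPixel;
            const int error = oldPixel - newPixel;

            // Multiply before dividing so integer division does not yield 0.
            if (x + 1 < width)
                image[y][x + 1] += error * 7 / 16;
            if (y + 1 < height) {
                if (x > 0)
                    image[y + 1][x - 1] += error * 3 / 16;
                image[y + 1][x] += error * 5 / 16;
                if (x + 1 < width)
                    image[y + 1][x + 1] += error * 1 / 16;
            }
        }
    }
}

// PBM uses 1 for black and 0 for white, so map 0 -> 1 and 255 -> 0.
int toPbmBit(int ditheredPixel)
{
    return ditheredPixel == 0 ? 1 : 0;
}
```

Doing the 0/255 to 1/0 mapping only when writing the pbm file keeps the dithering itself in a single value range.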

It looks like the conversion to PBM outside of this function is doing something wrong. It looks like it is converting one PGM pixel to several PBM pixels, which results in this 'smearing' effect. Just a wild guess though.
The function itself looks okay to me, apart from one little thing: because you use int for everything, all your 5/16 * quant_error terms will be zero. Rather use floats or doubles and make it 5.0/16.0.


Reading BMP file into an array

I am writing a longer program and I found myself needing to read a .bmp file into an array in a specific way so that the rest of the program can use it without extensive rewrites. I failed to find older answers that would resolve my problem, and I am pretty much at the beginner stages.
The image I am trying to read is used to create a text font, so I want to read it character by character into an array, where the pixels belonging to one character are added in order to a 2d bool (true if pixel is not black) array [character_id] [pixel_n]. The dimensions of characters are predetermined and known, and the file is cropped so that they all appear in a single row with no unaccounted margins.
This is the specific file I am trying to read, though here it might not show up as .bmp
As an example, shown here, I want to read the pixels in the order of the yellow line, then jump to another character. For clarity each character is 5px wide and 11px high, with 1px of margin on both sides horizontally.
Based on what I was able to find, I have written a function to do it, but I fail to make it work as intended, as far as I can tell even the pixel values are not being read correctly:
void readBMP(char* filename)
{
int i;
FILE* f = fopen(filename, "rb");
unsigned char info[54];
// read the 54-byte header
fread(info, sizeof(unsigned char), 54, f);
// extract image height and width from header
int width = *(int*)&info[18];
int height = *(int*)&info[22];
// number of pixels in total
int size = 3 * width * height;
unsigned char* data = new unsigned char[size];
// number of characters to read
int counter1 = size / ((font_width + 2) * font_height) / 3 ;
// read the rest of the data at once
fread(data, sizeof(unsigned char), size, f);
fclose(f);
//loop that goes from character to character
for(int i = 0; i < counter1; i++)
{
int tmp = 0;
//loop that reads one character into font_ref array
for(int j = 0; j < font_height; j++)
{
//loop for each row of a character
for(int k = 0; k < font_width; k++)
{
int w = static_cast<int>(data[3*(j*(font_width+2)*(counter1) + i*(font_width + 2) + 1 + k + j*font_width + j)-1]);
if( w != 0 )
font_ref [i][(tmp)] = 1;
else
font_ref [i][(tmp)] = 0;
tmp++;
}
}
}
}
(bool font_ref [150][font_width*font_height]; is the array where the font is being loaded and stored)
This code reads something, but the result is a seemingly random mess and I am unable to resolve that. Here is an example of the lowercase alphabet printed using another function in the program, where white pixels represent true bools. I am aware that some libraries exist to work with graphical files; however, in this program I wanted to avoid them if possible to learn more lower-level things, and the goal is rather limited and specific.
Thank you in advance for any help with the issue.
The main errors are in the offset computation for a pixel in the bitmap data:
int w = static_cast<int>(data[3*(j*(font_width+2)*(counter1) + i*(font_width + 2) + 1 + k + j*font_width + j)-1]);
j*(font_width+2)*(counter1) - This doesn't take into account that:
(a) although you say the file is cropped, there is extra black space to the right of the last character cell, so the true width must be used;
(b) (as drescherjm and user3386109 mentioned) padding bytes are appended to the rows so that their length is a multiple of four bytes.
+ j*font_width + j)-1 - This part makes no sense; perhaps you tried to compensate for the above errors.
This would be correct:
int w = data[j*(3*width+3&~3)+3*(i*(font_width+2)+1+k)];
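The offset arithmetic in that one-liner can be split into small helpers, which may make it easier to check. This is a sketch only; the function names are invented here, and it assumes a 24-bit BMP. Note that row counts rows in the order they are stored in the file, and BMP files with a positive height store the bottom row first.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per row of a 24-bit BMP, rounded up to a multiple of four.
std::size_t bmpRowStride(std::size_t widthInPixels)
{
    return (3 * widthInPixels + 3) & ~static_cast<std::size_t>(3);
}

// Byte offset of the first (blue) channel of pixel (x, row) in the pixel
// data, where row counts rows in the order they are stored in the file.
std::size_t bmpPixelOffset(std::size_t widthInPixels, std::size_t x, std::size_t row)
{
    return row * bmpRowStride(widthInPixels) + 3 * x;
}

// Image column of pixel k inside character cell i, for the layout described
// in the question: cells are cellWidth pixels wide plus a 1 px margin on
// each side.
std::size_t glyphColumn(std::size_t i, std::size_t k, std::size_t cellWidth)
{
    return i * (cellWidth + 2) + 1 + k;
}
```

With these, the lookup becomes data[bmpPixelOffset(width, glyphColumn(i, k, font_width), j)].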

(C++)(Visual Studio) Change RGB to Grayscale

I am accessing the image like so:
pDoc = GetDocument();
int iBitPerPixel = pDoc->_bmp->bitsperpixel; // used to see if grayscale(8 bits) or RGB (24 bits)
int iWidth = pDoc->_bmp->width;
int iHeight = pDoc->_bmp->height;
BYTE *pImg = pDoc->_bmp->point; // pointer used to point at pixels in the image
int Wp = iWidth;
const int area = iWidth * iHeight;
int r; // red pixel value
int g; // green pixel value
int b; // blue pixel value
int gray; // gray pixel value
BYTE *pImgGS = pImg; // grayscale image pixel array
and attempting to change the rgb image to gray like so:
// convert RGB values to grayscale at each pixel, then put in grayscale array
for (int i = 0; i<iHeight; i++)
for (int j = 0; j<iWidth; j++)
{
r = pImg[i*iWidth * 3 + j * 3 + 2];
g = pImg[i*iWidth * 3 + j * 3 + 1];
b = pImg[i*Wp + j * 3];
r * 0.299;
g * 0.587;
b * 0.144;
gray = std::round(r + g + b);
pImgGS[i*Wp + j] = gray;
}
finally, this is how I try to draw the image:
//draw the picture as grayscale
for (int i = 0; i < iHeight; i++) {
for (int j = 0; j < iWidth; j++) {
// this should set every corresponding grayscale picture to the current picture as grayscale
pImg[i*Wp + j] = pImgGS[i*Wp + j];
}
}
}
original image:
and the resulting image that I get is this:
First, check that the image type is 24 bits per pixel.
Second, allocate memory for pImgGS:
BYTE* pImgGS = (BYTE*)malloc(sizeof(BYTE)*iWidth *iHeight);
Please refer to this article to see how bmp data is saved. bmp images are saved upside down. Also, the first 54 bytes of the file are headers (a 14-byte BITMAPFILEHEADER followed by a 40-byte BITMAPINFOHEADER).
Hence you should access the values in the following way,
double r,g,b;
unsigned char gray;
for (int i = 0; i<iHeight; i++)
{
for (int j = 0; j<iWidth; j++)
{
r = (double)pImg[(i*iWidth + j)*3 + 2];
g = (double)pImg[(i*iWidth + j)*3 + 1];
b = (double)pImg[(i*iWidth + j)*3 + 0];
r= r * 0.299;
g= g * 0.587;
b= b * 0.114;
gray = floor((r + g + b + 0.5));
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
}
}
If there is padding present, then first determine the pitch (the number of bytes per row, including padding) and step through the rows with it. Refer to this to understand pitch and padding.
double r,g,b;
unsigned char gray;
long index=0;
for (int i = 0; i<iHeight; i++)
{
for (int j = 0; j<iWidth; j++)
{
r = (double)pImg[index+ (j)*3 + 2];
g = (double)pImg[index+ (j)*3 + 1];
b = (double)pImg[index+ (j)*3 + 0];
r= r * 0.299;
g= g * 0.587;
b= b * 0.114;
gray = floor((r + g + b + 0.5));
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
}
index =index +pitch;
}
While drawing the image, as pImg is 24bpp, you need to copy each gray value three times, once to each of the R, G and B channels. If you ultimately want to save the grayscale image in bmp format, then you again have to write the bmp data upside down, or you can simply skip the flip when converting to gray here:
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
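For reference, the padded variant can be written as one self-contained function. This is a sketch under assumptions: the function name bgrToGray is made up, the input is a bottom-up 24-bit buffer in B, G, R order, and the pitch is derived from the width by rounding up to a multiple of four bytes rather than read from a header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a bottom-up, 24-bit BGR buffer with padded rows into a top-down,
// tightly packed 8-bit grey buffer.
std::vector<std::uint8_t> bgrToGray(const std::uint8_t* bgr, int width, int height)
{
    // Row length in bytes, rounded up to a multiple of four.
    const std::size_t pitch =
        (static_cast<std::size_t>(width) * 3 + 3) & ~static_cast<std::size_t>(3);
    std::vector<std::uint8_t> gray(static_cast<std::size_t>(width) * height);

    for (int row = 0; row < height; ++row) {
        const std::uint8_t* src = bgr + row * pitch;
        // Stored row 0 is the bottom line of the picture.
        std::uint8_t* dst =
            gray.data() + static_cast<std::size_t>(height - 1 - row) * width;
        for (int x = 0; x < width; ++x) {
            const double b = src[3 * x + 0];
            const double g = src[3 * x + 1];
            const double r = src[3 * x + 2];
            // BT.601 weights; they sum to 1, so the result stays in 0..255.
            dst[x] = static_cast<std::uint8_t>(0.299 * r + 0.587 * g + 0.114 * b + 0.5);
        }
    }
    return gray;
}
```

Because the weights sum to 1, a white pixel maps to 255 and cannot overflow the unsigned char.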
tl; dr:
Make one common path. Convert everything to 32-bits in a well-defined manner, and do not use image dimensions or coordinates. Refactor the YCbCr conversion ( = grey value calculation) into a separate function, this is easier to read and runs at exactly the same speed.
The lengthy stuff
First, you seem to have been confused by strides and offsets. The artefact that you see is because you accidentally wrote out one value (and in total only one third of the data) when you should have written three values.
One can get confused with this easily, but here it happened because you do useless stuff that you did not need to do in the first place. You are iterating over coordinates left to right, top to bottom, and painstakingly calculating the correct byte offset in the data for each location.
However, you're doing a full-screen effect, so what you really want is to iterate over the complete image. Who cares about the width and height? You know the beginning of the data, and you know the length. One loop over the complete blob will do the same (provided the rows carry no padding bytes, i.e. the row length in bytes is already a multiple of four), only faster, with less obscure code, and fewer opportunities for getting something wrong.
Next, 24-bit bitmaps are common as files, but they are rather unusual for in-memory representation because the format is nasty to access and unsuitable for hardware. Drawing such a bitmap will require a lot of work from the driver or the graphics hardware (it will work, but it will not work well). Therefore, 32-bit depth is usually a much better, faster, and more comfortable choice. It is much more "natural" to access program-wise.
You can rather trivially convert 24-bit to 32-bit. Iterate over the complete bitmap data and write out a complete 32-bit word for each 3 byte-tuple read. Windows bitmaps ignore the A channel (the highest-order byte), so just leave it zero, or whatever.
Also, there is no such thing as an 8-bit greyscale bitmap. This simply doesn't exist. Although there exist bitmaps that look like greyscale bitmaps, they are in reality paletted 8-bit bitmaps where (incidentally) the bmiColors member contains all greyscale values.
Therefore, unless you can guarantee that you will only ever process images that you have created yourself, you cannot just rely that e.g. the values 5 and 73 correspond to 5/255 and 73/255 greyscale intensity, respectively. That may be the case, but it is in general a wrong assumption.
In order to be on the safe side as far as correctness goes, you must convert your 8-bit greyscale bitmaps to real colors by looking up the indices (the bitmap's grey values are really indices) in the palette. Otherwise, you could be loading a greyscale image where the palette is the other way around (so 5 would mean 250 and 250 would mean 5), or a bitmap which isn't greyscale at all.
So... you want to convert 24-bit and you want to convert 8-bit bitmaps, both to 32-bit depth. That means you do all the annoying what-if stuff once at the beginning, and the rest is one identical common path. That's a good thing.
What you will be showing on-screen is always a 32-bit bitmap where the topmost byte is ignored, and the lower three are all the same value, resulting in what looks like a shade of grey. That's simple, and simple is good.
Note that if you do a BT.601 style YCbCr conversion (as indicated by your use of the constants 0.299, 0.587, and 0.144; the BT.601 blue weight is actually 0.114), and if your 8-bit greyscale images are perceptual (this is something you must know, there is no way of telling from the file!), then for 100% correctness, you need to do the inverse transformation when converting from paletted 8-bit to RGB. Otherwise, your final result will look almost right, but not quite. If your 8-bit greyscales are linear, i.e. were created without using the above constants (again, you must know, you cannot tell from the image), you need to copy everything as-is (here, doing the conversion would make it look almost-but-not-quite right).
About the RGB-to-greyscale conversion, you do not need an extra greyscale bitmap just to hold the values that you never need again afterwards. You can read the three color values from the loaded bitmap, calculate Y, and directly build the 32-bit ARGB word, which you then write out to the final bitmap. This saves one entirely useless round-trip to memory which is not necessary.
Something like this:
uint32_t* out = (uint32_t*) output_bitmap_data;
for(int i = 0; i < inputSize; i += 3, in += 3) // advance the input pointer by one pixel as well
{
uint8_t Y = calc_greyscale(in[0], in[1], in[2]);
*out++ = (Y<<16) | (Y<<8) | Y;
}
Alternatively, you can also do the from-whatever-to-32 conversion, and then do the to-greyscale conversion in-place there. This, in turn, introduces an extra round-trip to memory, but the code becomes much, much easier overall.
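A self-contained version of the single-pass idea might look like the sketch below. The names calcGreyscale and bgr24ToGrey32 are hypothetical, and it assumes the input is tightly packed B, G, R triples with no row padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: BT.601 luma from 8-bit channels, rounded to nearest.
std::uint8_t calcGreyscale(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return static_cast<std::uint8_t>(0.299f * r + 0.587f * g + 0.114f * b + 0.5f);
}

// Expand tightly packed BGR triples into 32-bit words whose three low bytes
// all hold the grey value; the top byte is left at zero.
std::vector<std::uint32_t> bgr24ToGrey32(const std::vector<std::uint8_t>& in)
{
    std::vector<std::uint32_t> out;
    out.reserve(in.size() / 3);
    for (std::size_t i = 0; i + 2 < in.size(); i += 3) {
        // Bytes are stored blue first, so red is at i + 2.
        const std::uint32_t y = calcGreyscale(in[i + 2], in[i + 1], in[i]);
        out.push_back((y << 16) | (y << 8) | y);
    }
    return out;
}
```

Keeping the grey calculation in its own small function leaves the loop readable, and a compiler will normally inline it.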

C++ GDI+ bitmap manipulation needs speed up on byte operations

I'm using GDI+ in C++ to manipulate some Bitmap images, changing the colour and resizing the images. My code is very slow at one particular point and I was looking for some potential ways to speed up the line that's been highlighted in the VS2013 Profiler
for (UINT y = 0; y < 3000; ++y)
{
//one scanline at a time because bitmaps are stored wrong way up
byte* oRow = (byte*)bitmapData1.Scan0 + (y * bitmapData1.Stride);
for (UINT x = 0; x < 4000; ++x)
{
//get grey value from 0.114*Blue + 0.299*Red + 0.587*Green
byte grey = (oRow[x * 3] * .114) + (oRow[x * 3 + 1] * .587) + (oRow[x * 3 + 2] * .299); //THIS LINE IS THE HIGHLIGHTED ONE
//rest of manipulation code
}
}
Any handy hints on how to handle this arithmetic line better? It's causing massive slow downs in my code
Thanks in advance!
Optimization depends heavily on the compiler used and the target system. But here are some hints which may be useful. Avoid the index multiplications:
Instead of:
byte grey = (oRow[x * 3] * .114) + (oRow[x * 3 + 1] * .587) + (oRow[x * 3 + 2] * .299); //THIS LINE IS THE HIGHLIGHTED ONE
use...
//get grey value from 0.114*Blue + 0.299*Red + 0.587*Green
byte grey = (*oRow) * .114;
oRow++;
grey += (*oRow) * .587;
oRow++;
grey += (*oRow) * .299;
oRow++;
You can put the increment of the pointer on the same line. I put it on a separate line for better understanding.
Also, instead of multiplying by a float you can use a table, which can be faster than arithmetic. This depends on the CPU and the table size, but you can give it a shot:
// somwhere global or class attributes
byte tred[256];
byte tgreen[256];
byte tblue[256];
...at startup...
// Only init once at startup
// I am ignoring the warnings, you should not :-)
for(int i=0;i<256;i++)
{
tblue[i]=i*.114;
tgreen[i]=i*.587;
tred[i]=i*.299;
}
...in the loop...
// pixel bytes are stored in B, G, R order
byte grey = tblue[*oRow];
oRow++;
grey += tgreen[*oRow];
oRow++;
grey += tred[*oRow];
oRow++;
Also, 256*256*256 entries is not that large, so you could build one big table. But since that table will be larger than the usual CPU cache, I don't expect it to be much faster.
As suggested, you could do math in integer, but you could also try floats instead of doubles (.114f instead of .114), which are usually quicker and you don't need the precision.
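As an illustration of the integer route, here is a minimal sketch using 8.8 fixed point. The function name greyScanline and the weights 29, 150 and 77 (which sum to 256 and approximate .114, .587 and .299) are choices made for this example, not something taken from the question. Whether it is actually faster than the float version is something to check with the profiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Convert one scanline of packed B, G, R bytes to grey using 8.8 fixed point.
// 29 + 150 + 77 == 256, so the weights approximate 0.114, 0.587 and 0.299.
void greyScanline(const std::uint8_t* bgr, std::uint8_t* grey, std::size_t pixelCount)
{
    for (std::size_t i = 0; i < pixelCount; ++i, bgr += 3) {
        const unsigned sum = 29u * bgr[0] + 150u * bgr[1] + 77u * bgr[2];
        // Add half of 256 before shifting to round to nearest.
        grey[i] = static_cast<std::uint8_t>((sum + 128u) >> 8);
    }
}
```

In the loop from the question this would be called once per row, for example greyScanline(oRow, greyRow, 4000), where greyRow is a hypothetical output buffer.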
Do the loop like this, instead, to save on pointer math. Creating a temporary pointer like this won't cost because the compiler will understand what you're up to.
for(UINT x = 0; x < 12000; x+=3)
{
byte* pVal = &oRow[x];
....
}
This code is also easily threadable - the compiler can do it for you automatically in various ways; here's one, using parallel for:
https://msdn.microsoft.com/en-us/library/dd728073.aspx
If you have 4 cores, that's a 4x speedup, just about.
Also be sure to check release vs debug build - you don't know the perf until you run it in release/optimized mode.
You could premultiply values like oRow[x * 3] * .114 and put them into an array. oRow[x*3] has 256 possible values, so you can easily create an array aMul1 of 256 values from 0->255, each multiplied by .114. Then use aMul1[oRow[x * 3]] to find the multiplied value. And the same for the other components.
Actually you could even create such an array for full RGB values, i.e. your pixel is 888, so you will need an array of size 256*256*256, which is 16777216 = ~16MB. Whether this would speed up your process, you would have to check yourself with the profiler.
In general I've found that more direct pointer management, intermediate instructions, fewer instructions (on most CPUs, they're all equal cost these days), and fewer memory fetches (e.g. tables are not the answer more often than they are) is the usual optimum, short of going to direct assembly. Vectorization, especially explicit, is also helpful, as is dumping the assembly of the function and confirming the inner bits conform to your expectations. Try this:
for (UINT y = 0; y < 3000; ++y)
{
//one scanline at a time because bitmaps are stored wrong way up
byte* oRow = (byte*)bitmapData1.Scan0 + (y * bitmapData1.Stride);
byte *p = oRow;
byte *pend = p + 4000 * 3;
for(; p != pend; p+=3){
const float grey = p[0] * .114f + p[1] * .587f + p[2] * .299f;
}
//alternatively with an autovectorizing compiler
for(; p != pend; p+=3){
#pragma unroll //or use a compiler option to unroll loops
//make sure vectorization and relevant instruction sets are enabled - this is effectively a dot product so the following intrinsic fits the bill:
//https://msdn.microsoft.com/en-us/library/bb514054.aspx
//vector types or compiler intrinsics are more reliable often too... but get compiler specific or architecture dependent respectively.
float grey = 0;
const float w[3] = {.114f, .587f, .299f};
for(int c = 0; c < 3; ++c){
grey += w[c] * p[c];
}
}
}
Consider fooling around with OpenCL and targeting your CPU to see how fast you could solve with CPU specific optimizations and easily multiple cores - OpenCL covers this up for you pretty well and provides built in vector ops and dot product.

Image Processing (vector subscript out of range)

I have spent probably too many hours looking for tutorials on image processing (WITHOUT the use of external libraries) with no real success. If anyone knows any good tutorials that can be found that can help in this way, I'd really appreciate that.
I am pretty new to coding (this is my first year in college), and the assignment our professor is asking for requires original code to transform 24-bit bitmap images.
I found a question in StackExchange that shows rotation of an image without use of external libraries:
My code rotates a bmp picture correctly but only if the number of pixels is a muliple of 4... can anyone see whats wrong?
Using this code (with the starter project we were given and I had to build upon), I was able to create this code:
Byte is defined as a typedef of unsigned chars.
void BMPImage::RotateImage()
{
vector<byte> newBMP(m_BIH.biWidth * m_BIH.biHeight);
long newHeight = m_BIH.biWidth; /* Preserving the original width */
m_BIH.biWidth = m_BIH.biHeight; /* Setting the width as the height*/
m_BIH.biHeight = newHeight; /* Using the value of the original width, we set it as the new height */
for (int r = 0; r < m_BIH.biHeight; r++)
{
for (int c = 0; c < m_BIH.biWidth; c++)
{
long y = c + (r*m_BIH.biHeight);
long x = c + (r*m_BIH.biWidth - r - 1) + (m_BIH.biHeight*c);
newBMP[y] = m_ImageData[x];
}
}
m_ImageData = newBMP;
}
This code doesn't show any red squigglies, but when I try to execute the rotation, I get a vector subscript out of range error message pop-up. I've only used vectors in one assignment before, so I don't know where the issue is. Help please!
I think the issue might be here:
m_ImageData = newBMP;
Assume your newBMP has width = 1 and height = 2 then
vector<byte> newBMP(m_BIH.biWidth * m_BIH.biHeight);
would be an array of size 2 with valid index range [0, 1]. Your index calculation
long y = c + (r*m_BIH.biHeight);
would be 2 for c = 0 and r = 1. But 2 is not a valid index for your vector and with
newBMP[y] = ...
you access an element that is not part of the vector. Your index x would be -1 for this example.
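For comparison, here is a sketch of a 90 degree clockwise rotation with explicit index arithmetic. It makes assumptions that the starter project may not share: the function name rotate90Clockwise is made up, pixels are 3 bytes each, rows are stored top-down, and there are no padding bytes at the end of each row. Note that the buffer holds width * height * 3 bytes, not width * height.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rotate a tightly packed, top-down, 3-bytes-per-pixel image by 90 degrees
// clockwise. The result is height pixels wide and width pixels high.
std::vector<std::uint8_t> rotate90Clockwise(const std::vector<std::uint8_t>& src,
                                            int width, int height)
{
    assert(src.size() == static_cast<std::size_t>(width) * height * 3);
    std::vector<std::uint8_t> dst(src.size());
    const int newWidth = height;
    const int newHeight = width;

    for (int r = 0; r < newHeight; ++r) {
        for (int c = 0; c < newWidth; ++c) {
            // Destination (r, c) comes from source row height-1-c, column r.
            const std::size_t from =
                (static_cast<std::size_t>(height - 1 - c) * width + r) * 3;
            const std::size_t to = (static_cast<std::size_t>(r) * newWidth + c) * 3;
            for (int ch = 0; ch < 3; ++ch)
                dst[to + ch] = src[from + ch];
        }
    }
    return dst;
}
```

While debugging, replacing operator[] with .at() on the vectors turns an out-of-range index into a std::out_of_range exception in every build configuration.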

Writing to .BMP - distorted image

I'd like to write a normal map to a .bmp file, so I've implemented a simple .bmp writer first:
void BITMAPLOADER::writeHeader(std::ofstream& out, int width, int height)
{
BITMAPFILEHEADER tWBFH;
tWBFH.bfType = 0x4d42;
tWBFH.bfSize = 14 + 40 + (width*height*3);
tWBFH.bfReserved1 = 0;
tWBFH.bfReserved2 = 0;
tWBFH.bfOffBits = 14 + 40;
BITMAPINFOHEADER tW2BH;
memset(&tW2BH,0,40);
tW2BH.biSize = 40;
tW2BH.biWidth = width;
tW2BH.biHeight = height;
tW2BH.biPlanes = 1;
tW2BH.biBitCount = 24;
tW2BH.biCompression = 0;
out.write((char*)(&tWBFH),14);
out.write((char*)(&tW2BH),40);
}
bool TERRAINLOADER::makeNormalmap(unsigned int width, unsigned int height)
{
std::ofstream file;
file.open("terrainnormal.bmp");
if(!file)
{
file.close();
return false;
}
bitmaploader.writeHeader(file,width,height);
for(int y = 0; y < height; y++)
{
for(int x = 0; x < width; x++)
{
file << static_cast<unsigned char>(255*x/height); //(unsigned char)((getHeight(float(x)/float(width),float(y)/float(height))));
file << static_cast<unsigned char>(0); //(unsigned char)((getHeight(float(x)/float(width),float(y)/float(height))));
file << static_cast<unsigned char>(0); //(unsigned char)((getHeight(float(x)/float(width),float(y)/float(height))));
};
};
file.close();
return true;
};
The writeHeader(...) function is from SO, from a solved,working post. (I've forgot the name of it)
The getHeight(...) is using bicubic interpolation, so I can write it to big resolution images, and it stays smooth. It will be also used for collision detection and now is used as a LOD factor for my clipmaps.
Now the problem is that this outputs a distorted image. The pictures will tell everything I think:
The expected/distorted result(s):
for the heightmap: I have the function that describes a mesh: getHeight(x,z). It gives back the correct results because I've tested it with shaders (by sending heights as vertex attribs) too. The image downloaded from internet:
And with the y(x,z) function values written to a .BMP: (the commented out part of the code):
With a simple function: file << static_cast<unsigned char>(255*(float)x/height)
which should be a simple blend from black to white to the right.
I used an image size of 256 x 256, because I've read it should be multiple of 4. I CAN use libraries, but I'd like to solve this problem without one. So, what caused this distortion?
EDIT:
On the last image some lines are also colored, but they shouldn't be. This post is similar, but my heightmap is not distorted linearly as in this post: Image Distortion with Lock Bits
EDIT:
Another strange issue is that when I don't make all colors the same, the colors get distorted too. For example, if I set only RED to the heights and leave G and B at 0, the result is not only RED, but a noisy colored heightmap.
EDIT /comments/
If I understood them right, there's the header, then comes my pixel data. Now before the pixel data there must be 4 * n bytes. So that padding means that after the header I put some more data to fill the space.
For example, assuming (I will look up how to get it exactly) my header is 55 bytes, then I should add 1 more byte to it because 55+1 = 56 and 4|56.
So
file << static_cast<unsigned char>('a');
for(int y = 1; y <= width; y++)
{
for(int x = 1; x <= height; x++)
{
file << static_cast<unsigned char>(x);
file << static_cast<unsigned char>(x);
file << static_cast<unsigned char>(x);
};
};
should be correct.
But I realized the real issue (as Jigsore commented). When I cast from int to char, it seems like a 1-digit number becomes 1 byte, a 2-digit number 2 bytes, and a 3-digit number 3 bytes. Clamping the height to 3 digits works well, but the image is a bit whitish, because the 'darkest' color becomes (100,100,100) instead of (0,0,0). Also, this is the cause of the non-regular distortion, because it depends on how many 'hills' or 'mountains' there are in one row. How can I solve this (I hope) last problem? I don't want to compress the image to the 100-256 range. ;)
Open your file in binary mode.
Under Windows, if you open a file in the default text mode, it will write an extra 0x0d (Return) character before every 0x0a (Linefeed) that gets written out. The first time this happens it will change the colors of the following pixels, as the RGB order gets out of alignment. After it happens 3 times you'll be off by a full pixel.