C++ - Trying to read a .PPM image, unexpected output

I'm developing a uni project for reading the image data of P6-type, 255-depth .ppm images. The problem I encounter is that when I try to print the average values of each color (R,G,B) for an image, the output I get is wrong (the professor has given us an output file which says what float values to expect for each given image).
Now, I'm at a loss here. Through many checks, I have concluded that the function reads the whole data from the image, without leaving out pixels or whatever, converts them correctly from 0-255 to 0.f - 1.f values (by dividing by 255.0), adds every red, every green and every blue value to three separate counters and then divides them by the Width*Height of the given image to get the desired average brightness of each colour. I will provide the part of the function that does this for calculating the average red for a 960*642 image (sorry for the hardcoded stuff, it's just there for debugging purposes).
The output I get for this is 0.58... when it should be 0.539068. seekg() is called with 14 as an argument because position 14 is the last space after the header and before the data. Could you provide any insight into why this isn't working as expected? One thing I found through the checks is that the sum I get after adding all the red float values is not a float but an int. Possible loss of data? I'm grasping at straws here.
Here is the code:
std::ifstream infile;
infile.open("Image02.ppm", std::ios::in | std::ios::binary);
const unsigned char* buffer;
float * data_ptr;
infile.seekg(0, std::ios::end);
int length = infile.tellg(); //calculating length of data
buffer = new unsigned char[length];
ptr = new unsigned char[length];
data_ptr = new float[length];
infile.seekg(14, std::ios::beg); //restoring pointer to the start of data stream
infile.read((char*)buffer, length); //reading the image
for (int i = 0; i < length; i++){ //casting the char data to floats to get the 0-255 values
data_ptr[i] = ((float)buffer[i]);
data_ptr[i] = data_ptr[i] / 255.f; // converting to 0.0 - 1.0
}
int j = 0;
float a = 0.f;
while (j < length){ //calculating sum of red pixel values
a = a + data_ptr[j];
j = j + 3;
}
std::cout << a / (960*642); //calculating average
FYI, PPM image files that are P6 have their image data stored from left to right, with the first line being line 0 and the last line of the image being the last. They are structured like this R G B R G B R G B so on, where the first RGB correspond to the first pixel and so forth.
Thanks in advance!

Only the pixel bytes should go into the average calculation.
But in your source code, length is the size of the whole file, header included, so the loops run over 14 extra garbage values that are not pixel data.
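A minimal sketch of one way to avoid the extra values: parse the header instead of hardcoding the offset, then read exactly width * height * 3 bytes. The function names (meanRgb, readP6, readP6FromBytes) are made up for this illustration, and the header parser is simplified in that it does not handle '#' comment lines. The sums are accumulated as doubles, since a float accumulator can lose precision when adding several hundred thousand values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Mean of each channel of interleaved 8-bit RGB data, scaled to 0..1.
std::array<double, 3> meanRgb(const std::vector<unsigned char>& pixels) {
    assert(!pixels.empty() && pixels.size() % 3 == 0);
    std::array<double, 3> sum{0.0, 0.0, 0.0};
    for (std::size_t i = 0; i < pixels.size(); i += 3) {
        sum[0] += pixels[i];      // red
        sum[1] += pixels[i + 1];  // green
        sum[2] += pixels[i + 2];  // blue
    }
    const double scale = 255.0 * static_cast<double>(pixels.size() / 3);
    return {sum[0] / scale, sum[1] / scale, sum[2] / scale};
}

// Parses a P6 header from `in`, then reads exactly width * height * 3 bytes.
// Simplification: '#' comment lines in the header are not handled.
// Returns false if the header or the pixel data is malformed or truncated.
bool readP6(std::istream& in, std::vector<unsigned char>& pixels) {
    std::string magic;
    std::size_t width = 0, height = 0;
    int maxValue = 0;
    if (!(in >> magic >> width >> height >> maxValue)) return false;
    if (magic != "P6" || maxValue != 255 || width == 0 || height == 0) return false;
    in.get();  // the single whitespace byte that terminates the header

    pixels.resize(width * height * 3);
    in.read(reinterpret_cast<char*>(pixels.data()),
            static_cast<std::streamsize>(pixels.size()));
    return in.gcount() == static_cast<std::streamsize>(pixels.size());
}

// Convenience wrapper for a file that is already in memory.
bool readP6FromBytes(const std::string& bytes, std::vector<unsigned char>& pixels) {
    std::istringstream in(bytes);
    return readP6(in, pixels);
}
```

For a file on disk, readP6 can be given a std::ifstream opened with std::ios::binary, and meanRgb(pixels)[0] is then the average red value.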

Related

Reading BMP file into an array

I am writing a longer program and I found myself needing to read a .bmp file into an array in a specific way so that the rest of the program can use it without extensive rewrites. I failed to find older answers that would resolve my problem, and I am pretty much at the beginner stages.
The image I am trying to read is used to create a text font, so I want to read it character by character into an array, where the pixels belonging to one character are added in order to a 2d bool (true if pixel is not black) array [character_id] [pixel_n]. The dimensions of characters are predetermined and known, and the file is cropped so that they all appear in a single row with no unaccounted margins.
This is the specific file I am trying to read, though here it might not show up as .bmp
As an example, shown here, I want to read the pixels in the order of the yellow line, then jump to another character. For clarity each character is 5px wide and 11px high, with 1px of margin on both sides horizontally.
Based on what I was able to find, I have written a function to do it, but I fail to make it work as intended, as far as I can tell even the pixel values are not being read correctly:
void readBMP(char* filename)
{
int i;
FILE* f = fopen(filename, "rb");
unsigned char info[54];
// read the 54-byte header
fread(info, sizeof(unsigned char), 54, f);
// extract image height and width from header
int width = *(int*)&info[18];
int height = *(int*)&info[22];
// number of pixels in total
int size = 3 * width * height;
unsigned char* data = new unsigned char[size];
// number of characters to read
int counter1 = size / ((font_width + 2) * font_height) / 3 ;
// read the rest of the data at once
fread(data, sizeof(unsigned char), size, f);
fclose(f);
//loop that goes from character to character
for(int i = 0; i < counter1; i++)
{
int tmp = 0;
//loop that reads one character into font_ref array
for(int j = 0; j < font_height; j++)
{
//loop for each row of a character
for(int k = 0; k < font_width; k++)
{
int w = static_cast<int>(data[3*(j*(font_width+2)*(counter1) + i*(font_width + 2) + 1 + k + j*font_width + j)-1]);
if( w != 0 )
font_ref [i][(tmp)] = 1;
else
font_ref [i][(tmp)] = 0;
tmp++;
}
}
}
}
(bool font_ref [150][font_width*font_height]; is the array where the font is being loaded and stored)
this code reads something, but the result is a seemingly random mess and I am unable to resolve that. Here is an example of lowercase alphabet printed using another function in the program, where white pixels represent true bools. I am aware that some libraries exist to work with graphical files, however in this program I wanted to possibly avoid that to learn more lower-level things, and the goal is rather limited and specific.
Thank you in advance for any help with the issue.
The main errors are in the offset computation for a pixel in the bitmap data:
int w = static_cast<int>(data[3*(j*(font_width+2)*(counter1) + i*(font_width + 2) + 1 + k + j*font_width + j)-1]);
j*(font_width+2)*(counter1): this doesn't take into account that
- although you say the file is cropped, there is extra black space to the right of the last character cell, so the true width must be used;
- (as drescherjm and user3386109 mentioned) padding bytes are appended to the rows so that their length is a multiple of four bytes.
+ j*font_width + j)-1: this part makes no sense; perhaps you tried to compensate for the above errors.
This would be correct (each row takes 3*width bytes, rounded up to a multiple of four):
int w = data[j*((3*width + 3) & ~3) + 3*(i*(font_width+2) + 1 + k)];
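The same arithmetic, split into small helpers, as a sketch only. The names bmpRowStride, bmpPixelOffset and glyphColumn are invented for this illustration; the cell layout (font_width + 2 pixels per character with a 1 px margin on each side) is taken from the question. Here row counts rows in the order they are stored in the file; with a positive height in the header, BMP stores the bottom row of the image first.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per stored row of a 24-bit BMP: 3 bytes per pixel,
// rounded up to the next multiple of 4.
std::size_t bmpRowStride(std::size_t width) {
    return (3 * width + 3) & ~static_cast<std::size_t>(3);
}

// Offset of the first (blue) byte of pixel x in stored row `row`,
// relative to the start of the pixel data.
std::size_t bmpPixelOffset(std::size_t width, std::size_t row, std::size_t x) {
    assert(x < width);
    return row * bmpRowStride(width) + 3 * x;
}

// x coordinate of column k of character i, for cells that are
// fontWidth + 2 pixels wide with a 1 px margin on the left.
std::size_t glyphColumn(std::size_t i, std::size_t k, std::size_t fontWidth) {
    assert(k < fontWidth);
    return i * (fontWidth + 2) + 1 + k;
}
```

With these, the lookup in the inner loop becomes data[bmpPixelOffset(width, j, glyphColumn(i, k, font_width))].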

How can I use openimageIO to store RGB values in arrays? (using C++, OpenGL)

I am using openimageIO to read and display an image from a JPG file, and I now need to store the RGB values in arrays so that I can manipulate and re-display them later.
I want to do something like this:
for (int i=0; i<picturesize;i++)
{
Rarray[i]=pixelredvalue;
Garray[i]=pixelgreenvalue;
Barray[i]=pixelbluevalue;
}
This is an openimageIO source that I found online: https://people.cs.clemson.edu/~dhouse/courses/404/papers/openimageio.pdf
"Section 3.2: Advanced Image Output" (pg 35) is the closest to what I'm doing, but I don't understand how I can use the channels to write pixel data to arrays. I also don't fully understand the difference between "writing" and "storing in an array". This is the piece of code in the reference that I am talking about:
int channels = 4;
ImageSpec spec (width, length, channels, TypeDesc::UINT8);
spec.channelnames.clear ();
spec.channelnames.push_back ("R");
spec.channelnames.push_back ("G");
spec.channelnames.push_back ("B");
spec.channelnames.push_back ("A");
I managed to read the image and display it using the code in the reference, but now I need to store all the pixel values in my array.
Here is another useful piece of code from the link, but again, I can't understand how to retrieve the individual RGB values and place them into an array:
#include <OpenImageIO/imageio.h>
OIIO_NAMESPACE_USING
...
const char *filename = "foo.jpg";
const int xres = 640, yres = 480;
const int channels = 3; // RGB
unsigned char pixels[xres*yres*channels];
ImageOutput *out = ImageOutput::create (filename);
if (! out)
return;
ImageSpec spec (xres, yres, channels, TypeDesc::UINT8);
out->open (filename, spec);
out->write_image (TypeDesc::UINT8, pixels);
out->close ();
ImageOutput::destroy (out);
But this is about writing to a file, and still does not solve my problem. This is on page 35.
Let's assume that your code for reading an image looks like this (snippet from the OpenImageIO 1.7 Programmer Documentation, Chapter 4.1 Image Input Made Simple, page 55):
ImageInput *in = ImageInput::open (filename);
const ImageSpec &spec = in->spec();
int xres = spec.width;
int yres = spec.height;
int channels = spec.nchannels;
std::vector<unsigned char> pixels (xres*yres*channels);
in->read_image (TypeDesc::UINT8, &pixels[0]);
in->close();
ImageInput::destroy (in);
Now all the bytes of the image are contained in std::vector<unsigned char> pixels.
If you want to access the RGB values of the pixel at position x, y, then you can do it like this (a row is xres pixels long, so the row index is multiplied by xres):
int pixel_addr = (y * xres + x) * channels;
unsigned char red = pixels[pixel_addr];
unsigned char green = pixels[pixel_addr + 1];
unsigned char blue = pixels[pixel_addr + 2];
Since all the pixels are stored in pixels, there is no reason to store them in separate arrays for the 3 color channels.
But if you want to store the red, green and blue values in separate arrays, then you can do it like this:
std::vector<unsigned char> Rarray(xres*yres);
std::vector<unsigned char> Garray(xres*yres);
std::vector<unsigned char> Barray(xres*yres);
for (int i=0; i<xres*yres; i++)
{
    Rarray[i] = pixels[i*channels];
    Garray[i] = pixels[i*channels + 1];
    Barray[i] = pixels[i*channels + 2];
}
Of course, this requires the pixels to be tightly packed in pixels (line alignment of 1).
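The split can also be wrapped in a small function that does not depend on OpenImageIO at all. This is only a sketch: Planes and splitChannels are hypothetical names, and it assumes tightly packed data in which R, G and B are the first three channels of every pixel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One plane per colour channel.
struct Planes {
    std::vector<unsigned char> r, g, b;
};

// Splits tightly packed interleaved pixels into separate R, G and B planes.
// Any channels after the first three (for example alpha) are ignored.
Planes splitChannels(const std::vector<unsigned char>& pixels,
                     std::size_t xres, std::size_t yres, std::size_t channels) {
    assert(channels >= 3);
    assert(pixels.size() == xres * yres * channels);
    const std::size_t count = xres * yres;
    Planes out;
    out.r.resize(count);
    out.g.resize(count);
    out.b.resize(count);
    for (std::size_t i = 0; i < count; ++i) {
        out.r[i] = pixels[i * channels];
        out.g[i] = pixels[i * channels + 1];
        out.b[i] = pixels[i * channels + 2];
    }
    return out;
}
```

The pixel at position x, y then has index y * xres + x in each plane.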

(C++)(Visual Studio) Change RGB to Grayscale

I am accessing the image like so:
pDoc = GetDocument();
int iBitPerPixel = pDoc->_bmp->bitsperpixel; // used to see if grayscale(8 bits) or RGB (24 bits)
int iWidth = pDoc->_bmp->width;
int iHeight = pDoc->_bmp->height;
BYTE *pImg = pDoc->_bmp->point; // pointer used to point at pixels in the image
int Wp = iWidth;
const int area = iWidth * iHeight;
int r; // red pixel value
int g; // green pixel value
int b; // blue pixel value
int gray; // gray pixel value
BYTE *pImgGS = pImg; // grayscale image pixel array
and attempting to change the rgb image to gray like so:
// convert RGB values to grayscale at each pixel, then put in grayscale array
for (int i = 0; i<iHeight; i++)
for (int j = 0; j<iWidth; j++)
{
r = pImg[i*iWidth * 3 + j * 3 + 2];
g = pImg[i*iWidth * 3 + j * 3 + 1];
b = pImg[i*Wp + j * 3];
r * 0.299;
g * 0.587;
b * 0.144;
gray = std::round(r + g + b);
pImgGS[i*Wp + j] = gray;
}
finally, this is how I try to draw the image:
//draw the picture as grayscale
for (int i = 0; i < iHeight; i++) {
for (int j = 0; j < iWidth; j++) {
// this should set every corresponding grayscale picture to the current picture as grayscale
pImg[i*Wp + j] = pImgGS[i*Wp + j];
}
}
}
original image:
and the resulting image that I get is this:
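First, check that the image really is 24 bits per pixel.
Second, allocate separate memory for pImgGS (in your code it points to the same buffer as pImg):
BYTE* pImgGS = (BYTE*)malloc(sizeof(BYTE) * iWidth * iHeight);
Please refer to this article to see how BMP data is stored. BMP images are stored upside down (bottom row first). Also, the first 54 bytes of the file are headers (a 14-byte BITMAPFILEHEADER followed by a 40-byte BITMAPINFOHEADER), not pixel data.
Hence you should access the values in the following way: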
double r, g, b;
unsigned char gray;
for (int i = 0; i < iHeight; i++)
{
    for (int j = 0; j < iWidth; j++)
    {
        r = (double)pImg[(i*iWidth + j)*3 + 2];
        g = (double)pImg[(i*iWidth + j)*3 + 1];
        b = (double)pImg[(i*iWidth + j)*3 + 0];
        r = r * 0.299;
        g = g * 0.587;
        b = b * 0.114; // BT.601 blue weight; with 0.144 the weights sum to more than 1
        gray = floor((r + g + b + 0.5));
        pImgGS[(iHeight-i-1)*iWidth + j] = gray;
    }
}
If there is padding present, first determine the pitch (the number of bytes per row, padding included) and step through the rows with it. Refer to this to understand pitch and padding.
double r, g, b;
unsigned char gray;
long index = 0; // byte offset of the current row; pitch = bytes per row including padding
for (int i = 0; i < iHeight; i++)
{
    for (int j = 0; j < iWidth; j++)
    {
        r = (double)pImg[index + j*3 + 2];
        g = (double)pImg[index + j*3 + 1];
        b = (double)pImg[index + j*3 + 0];
        r = r * 0.299;
        g = g * 0.587;
        b = b * 0.114; // BT.601 blue weight; with 0.144 the weights sum to more than 1
        gray = floor((r + g + b + 0.5));
        pImgGS[(iHeight-i-1)*iWidth + j] = gray;
    }
    index = index + pitch;
}
While drawing the image: as pImg is 24bpp, you need to copy each gray value three times, once into each of the R, G and B channels. If you ultimately want to save the grayscale image in BMP format, you have to write the BMP data upside down again, or you can simply skip the flip when converting to gray, i.e. this step:
pImgGS[(iHeight-i-1)*iWidth + j] = gray;
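As a self-contained sketch of the pitch-aware loop: bgrToGray is a hypothetical helper, not part of the question's code. It assumes B, G, R byte order within each pixel, takes the pitch as a parameter, and keeps the rows in the order in which they are stored (no vertical flip).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts 24-bit B,G,R rows to one grey byte per pixel.
// `pitch` is the number of bytes from the start of one row to the start
// of the next (at least 3 * width; BMP rounds it up to a multiple of 4).
std::vector<unsigned char> bgrToGray(const unsigned char* bgr,
                                     std::size_t width, std::size_t height,
                                     std::size_t pitch) {
    assert(pitch >= 3 * width);
    std::vector<unsigned char> gray(width * height);
    for (std::size_t row = 0; row < height; ++row) {
        const unsigned char* p = bgr + row * pitch;  // padding bytes are skipped here
        for (std::size_t x = 0; x < width; ++x, p += 3) {
            // BT.601 weights; they sum to 1, so the result fits in a byte.
            const double y = 0.299 * p[2] + 0.587 * p[1] + 0.114 * p[0];
            gray[row * width + x] = static_cast<unsigned char>(y + 0.5);
        }
    }
    return gray;
}
```

To display the result through a 24bpp buffer, each grey value would then be written to all three channels of the corresponding pixel.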
tl; dr:
Make one common path. Convert everything to 32-bits in a well-defined manner, and do not use image dimensions or coordinates. Refactor the YCbCr conversion ( = grey value calculation) into a separate function, this is easier to read and runs at exactly the same speed.
The lengthy stuff
First, you seem to have confused strides and offsets. The artefact that you see is because you accidentally wrote out one value per pixel (and in total only one third of the data) when you should have written three values.
This is easy to get confused by, but here it happened because you do work that you did not need to do in the first place. You are iterating over coordinates left to right, top to bottom, and painstakingly calculating the correct byte offset in the data for each location.
However, you're doing a full-screen effect, so what you really want is iterate over the complete image. Who cares about the width and height? You know the beginning of the data, and you know the length. One loop over the complete blob will do the same, only faster, with less obscure code, and fewer opportunities of getting something wrong.
Next, 24-bit bitmaps are common as files, but they are rather unusual for in-memory representation because the format is nasty to access and unsuitable for hardware. Drawing such a bitmap will require a lot of work from the driver or the graphics hardware (it will work, but it will not work well). Therefore, 32-bit depth is usually a much better, faster, and more comfortable choice. It is much more "natural" to access program-wise.
You can rather trivially convert 24-bit to 32-bit. Iterate over the complete bitmap data and write out a complete 32-bit word for each 3 byte-tuple read. Windows bitmaps ignore the A channel (the highest-order byte), so just leave it zero, or whatever.
Also, there is no such thing as an 8-bit greyscale bitmap. This simply doesn't exist. Although there exist bitmaps that look like greyscale bitmaps, they are in reality paletted 8-bit bitmaps where (incidentally) the bmiColors member contains all greyscale values.
Therefore, unless you can guarantee that you will only ever process images that you have created yourself, you cannot just rely that e.g. the values 5 and 73 correspond to 5/255 and 73/255 greyscale intensity, respectively. That may be the case, but it is in general a wrong assumption.
In order to be on the safe side as far as correctness goes, you must convert your 8-bit greyscale bitmaps to real colors by looking up the indices (the bitmap's grey values are really indices) in the palette. Otherwise, you could be loading a greyscale image where the palette is the other way around (so 5 would mean 250 and 250 would mean 5), or a bitmap which isn't greyscale at all.
So... you want to convert 24-bit and you want to convert 8-bit bitmaps, both to 32-bit depth. That means you do all the annoying what-if stuff once at the beginning, and the rest is one identical common path. That's a good thing.
What you will be showing on-screen is always a 32-bit bitmap where the topmost byte is ignored, and the lower three are all the same value, resulting in what looks like a shade of grey. That's simple, and simple is good.
Note that if you do a BT.601 style YCbCr conversion (as indicated by your constants 0.299 and 0.587; note that the BT.601 blue weight is 0.114, not 0.144), and if your 8-bit greyscale images are perceptual (this is something you must know, there is no way of telling from the file!), then for 100% correctness you need to do the inverse transformation when converting from paletted 8-bit to RGB. Otherwise, your final result will look almost right, but not quite. If your 8-bit greyscales are linear, i.e. were created without using the above constants (again, you must know, you cannot tell from the image), you need to copy everything as-is (here, doing the conversion would make it look almost-but-not-quite right).
About the RGB-to-greyscale conversion: you do not need an extra greyscale bitmap just to hold values that you never need again afterwards. You can read the three color values from the loaded bitmap, calculate Y, and directly build the 32-bit ARGB word, which you then write out to the final bitmap. This saves one entirely useless round-trip to memory.
Something like this:
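uint32_t* out = (uint32_t*) output_bitmap_data;
const uint8_t* in = (const uint8_t*) input_bitmap_data; // assumed name for the 24-bit source data
for(int i = 0; i < inputSize; i += 3, in += 3) // advance the input by one 3-byte pixel per step
{
    uint8_t Y = calc_greyscale(in[0], in[1], in[2]);
    *out++ = (Y<<16) | (Y<<8) | Y;
}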
Alternatively, you can also do the from-whatever-to-32 conversion, and then do the to-greyscale conversion in-place there. This, in turn, introduces an extra round-trip to memory, but the code becomes much, much easier overall.
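For illustration, the direct 24-to-32-bit path could look roughly like the sketch below. calcGreyscale and greyTo32 are hypothetical names, the source is assumed to be tightly packed B, G, R data without row padding, and the BT.601 weights are applied with integer arithmetic (weights scaled by 1000, with rounding).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// BT.601 luma from 8-bit R, G, B, using integer weights scaled by 1000.
std::uint8_t calcGreyscale(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint8_t>((299u * r + 587u * g + 114u * b + 500u) / 1000u);
}

// Converts tightly packed 24-bit B,G,R data into 32-bit words of the form
// 0x00YYYYYY (top byte left at zero, the three lower bytes all equal to Y).
std::vector<std::uint32_t> greyTo32(const std::uint8_t* in, std::size_t inputSize) {
    assert(inputSize % 3 == 0);
    std::vector<std::uint32_t> out;
    out.reserve(inputSize / 3);
    for (std::size_t i = 0; i < inputSize; i += 3) {
        // Bytes are stored as B, G, R, so red is the third byte of each pixel.
        const std::uint32_t y = calcGreyscale(in[i + 2], in[i + 1], in[i]);
        out.push_back((y << 16) | (y << 8) | y);
    }
    return out;
}
```

If the rows of the source bitmap are padded, the padding bytes have to be skipped at the end of each row, so a single flat loop like this is only valid when 3 * width is already a multiple of 4.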

C++ - Convert uint8_t* image data to double** image data

I am working on a C++ function (inside my iOS app) where I have image data in the form uint8_t*.
I obtained the image data using the CVPixelBufferGetBaseAddress() method of the iOS SDK:
uint8_t *bPixels = (uint8_t *)CVPixelBufferGetBaseAddress(imageBuffer);
I have another function (from a third-party source) that does some of the image processing I would like to use on my image data, but these functions take the image data as double**.
Does anyone have any idea how to go about converting this?
What other information can I provide?
The constructor prototype for the class that use double** look like:
Image(double **iPixels, unsigned int iWidth, unsigned int iHeight);
Your uint8_t *bPixels seems to hold the image data as a 1-dimensional contiguous array of length height*width. So to access the pixel in the x-th row and y-th column you have to write bPixels[x*width+y].
Image() seems to work on 2-dimensional arrays. To access the same pixel there you would have to write iPixels[x][y].
So you need to copy your existing 1-dimensional array to a 2-dimensional one:
double **mypixels = new double* [height];
for (int x=0; x<height; x++)
{
    mypixels[x] = new double [width];
    for (int y=0; y<width; y++)
        mypixels[x][y] = bPixels[x*width+y]; // attention here, maybe normalization is necessary
                                             // e.g. mypixels[x][y] = bPixels[x*width+y] / 255.0
}
Because your 1-dimensional array has pixels of type uint8_t and the 2-dimensional one has pixels of type double, you must allocate new memory. If both had the same pixel type, a more elegant solution (a simple mapping of row pointers) would be:
uint8_t **mypixels = new uint8_t* [height];
for (int x=0; x<height; x++)
mypixels[x] = bPixels+x*width;
Attention: besides the possibly necessary normalization, there is also a potential problem with index compatibility! My examples assume that the 1-dimensional array is stored row by row and that the functions working on the 2-dimensional array index it with [x][y] (that is, row first, then column). The declaration of Image(), however, might suggest that it expects its arrays to be indexed with [y][x].
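A variant of the copy that avoids one new[] per row, as a sketch only. toDoubleRows is a hypothetical helper; it assumes one byte per pixel and row-major storage, like the examples above, and takes the distance between source rows as a parameter because pixel buffers may pad their rows (for a CVPixelBuffer that value is available from CVPixelBufferGetBytesPerRow()).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies single-channel 8-bit rows into `storage` as doubles in 0..1 and
// fills `rows` with one pointer per row, so that rows.data() can be
// passed wherever a double** is expected.
// `bytesPerRow` is the distance between source rows and may exceed `width`.
void toDoubleRows(const std::uint8_t* src, std::size_t width, std::size_t height,
                  std::size_t bytesPerRow,
                  std::vector<double>& storage, std::vector<double*>& rows) {
    assert(bytesPerRow >= width);
    storage.assign(width * height, 0.0);
    rows.resize(height);
    for (std::size_t y = 0; y < height; ++y) {
        rows[y] = storage.data() + y * width;
        const std::uint8_t* srcRow = src + y * bytesPerRow;
        for (std::size_t x = 0; x < width; ++x) {
            rows[y][x] = srcRow[x] / 255.0;  // normalization is an assumption
        }
    }
}
```

The call would then be something like Image img(rows.data(), width, height). Whether Image copies the data is not known from the prototype, so storage and rows should stay alive for as long as the Image object is in use.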
I'm going to take a giant bunch of guesses here in hopes that this will lead you towards getting at the documentation and answering back. If there's no further documentation, well, here's a starting point.
Guess 1) The Image constructor requires a doubly dimensioned array where each component is an R,G,B,Alpha channel in that order. So iPixels[0] is the red data, iPixels[1] is the green data, etc.
Guess 2) Because it's not integer data, the values range from 0 to 1.
Guess 3) All of this must be pre-allocated.
Guess 4) Image data is row-major
Guess 5) Source data is BGRA
So with that in mind, starting with bPixels
double *redData = new double[width*height];
double *greenData = new double[width*height];
double *blueData = new double[width*height];
double *alphaData = new double[width*height];
double **iPixels = new double*[4];
iPixels[0] = redData;
iPixels[1] = greenData;
iPixels[2] = blueData;
iPixels[3] = alphaData;
for(int y = 0;y < height;y++)
{
    for(int x = 0;x < width;x++)
    {
        int alpha = bPixels[(y*width + x)*4 + 3];
        int red = bPixels[(y*width +x)*4 + 2];
        int green = bPixels[(y*width + x)*4 + 1];
        int blue = bPixels[(y*width + x)*4];
        redData[y*width + x] = red/255.0;
        greenData[y*width + x] = green/255.0;
        blueData[y*width + x] = blue/255.0;
        alphaData[y*width + x] = alpha/255.0;
    }
}
Image newImage(iPixels,width,height);
Some of the things that can go wrong:
- The source is not BGRA but RGBA, which will make the colors all wrong.
- The data is not row-major, or the destination is not expected in slices, which will make things look all screwed up and/or seg-fault.

Writing to .BMP - distorted image

I'd like to write a normal map to a .bmp file, so I've implemented a simple .bmp writer first:
void BITMAPLOADER::writeHeader(std::ofstream& out, int width, int height)
{
BITMAPFILEHEADER tWBFH;
tWBFH.bfType = 0x4d42;
tWBFH.bfSize = 14 + 40 + (width*height*3);
tWBFH.bfReserved1 = 0;
tWBFH.bfReserved2 = 0;
tWBFH.bfOffBits = 14 + 40;
BITMAPINFOHEADER tW2BH;
memset(&tW2BH,0,40);
tW2BH.biSize = 40;
tW2BH.biWidth = width;
tW2BH.biHeight = height;
tW2BH.biPlanes = 1;
tW2BH.biBitCount = 24;
tW2BH.biCompression = 0;
out.write((char*)(&tWBFH),14);
out.write((char*)(&tW2BH),40);
}
bool TERRAINLOADER::makeNormalmap(unsigned int width, unsigned int height)
{
std::ofstream file;
file.open("terrainnormal.bmp");
if(!file)
{
file.close();
return false;
}
bitmaploader.writeHeader(file,width,height);
for(int y = 0; y < height; y++)
{
for(int x = 0; x < width; x++)
{
file << static_cast<unsigned char>(255*x/height); //(unsigned char)((getHeight(float(x)/float(width),float(y)/float(height))));
file << static_cast<unsigned char>(0); //(unsigned char)((getHeight(float(x)/float(width),float(y)/float(height))));
file << static_cast<unsigned char>(0); //(unsigned char)((getHeight(float(x)/float(width),float(y)/float(height))));
};
};
file.close();
return true;
};
The writeHeader(...) function is from SO, from a solved, working post. (I've forgotten the name of it.)
The getHeight(...) is using bicubic interpolation, so I can write it to big resolution images, and it stays smooth. It will be also used for collision detection and now is used as a LOD factor for my clipmaps.
Now the problem is that this outputs a distorted image. The pictures will tell everything I think:
The expected/distorted result(s):
for the heightmap: I have the function that describes a mesh: getHeight(x,z). It gives back the correct results because I've tested it with shaders (by sending heights as vertex attribs) too. The image downloaded from internet:
And with the y(x,z) function values written to a .BMP: (the commented out part of the code):
With a simple function: file << static_cast<unsigned char>(255*(float)x/height)
which should be a simple blend from black to white to the right.
I used an image size of 256 x 256, because I've read it should be multiple of 4. I CAN use libraries, but I'd like to solve this problem without one. So, what caused this distortion?
EDIT:
On the last image some lines are also colored, but they shouldn't be. This post is similar, but my heightmap is not distorted linearly as in this post: Image Distortion with Lock Bits
EDIT:
Another strange issue is that when I don't make all colors the same, it gets distorted in colors too. For example, if I set only the RED to the heights and leave G and B at 0, the result is not only red, but a noisy colored heightmap.
EDIT /comments/
If I understood them right, there's the header, then comes my pixel data. Now before the pixel data there must be 4 * n bytes. So that padding means that after the header I put some more data to fill the gap.
For example, assuming (I will look up how to get it exactly) my header is 55 bytes, then I should add 1 more byte to it because 55+1 = 56 and 4|56.
So
file << static_cast<unsigned char>('a');
for(int y = 1; y <= width; y++)
{
for(int x = 1; x <= height; x++)
{
file << static_cast<unsigned char>(x);
file << static_cast<unsigned char>(x);
file << static_cast<unsigned char>(x);
};
};
should be correct.
But I realized the real issue (as Jigsore commented). When I cast from int to char, it seems like a 1-digit number becomes 1 byte, a 2-digit number 2 bytes, and a 3-digit number 3 bytes. Clamping the height to 3 digits works well, but the image is a bit whitish, because the 'darkest' color becomes (100,100,100) instead of (0,0,0). Also, this is the cause of the non-regular distortion, because it depends on how many 'hills' or 'mountains' there are in one row. How can I solve this (hopefully the last) problem? I don't want to compress the image to the 100-256 range. ;)
Open your file in binary mode.
Under Windows, if you open a file in the default text mode, an extra 0x0d (carriage return) byte is written before every 0x0a (line feed) that you write. The first time this happens, the colors of the following pixels change, because the RGB order gets out of alignment. After it happens 3 times you'll be off by a full pixel.
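A rough sketch of a writer that avoids both the text-mode translation and any dependence on struct layout: every header field is written byte by byte with put(). The names putLittleEndian, writeBmp24 and encodeBmp24 are invented for this example; the pixel data is assumed to be B, G, R, bottom row first, without padding (the padding is added while writing).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes the low `count` bytes of `value` in little-endian order.
void putLittleEndian(std::ostream& out, std::uint32_t value, int count) {
    for (int i = 0; i < count; ++i) {
        out.put(static_cast<char>((value >> (8 * i)) & 0xFFu));
    }
}

// Writes a 24-bit uncompressed BMP. `bgr` holds width * height * 3 bytes,
// bottom row first, without padding; row padding is added here.
void writeBmp24(std::ostream& out, std::uint32_t width, std::uint32_t height,
                const std::vector<unsigned char>& bgr) {
    assert(bgr.size() == static_cast<std::size_t>(width) * height * 3);
    const std::uint32_t rowBytes = width * 3;
    const std::uint32_t padding = (4 - rowBytes % 4) % 4;
    const std::uint32_t dataSize = (rowBytes + padding) * height;

    out.put('B').put('M');                   // bfType
    putLittleEndian(out, 54 + dataSize, 4);  // bfSize
    putLittleEndian(out, 0, 4);              // bfReserved1 and bfReserved2
    putLittleEndian(out, 54, 4);             // bfOffBits

    putLittleEndian(out, 40, 4);             // biSize
    putLittleEndian(out, width, 4);          // biWidth
    putLittleEndian(out, height, 4);         // biHeight (positive: bottom-up)
    putLittleEndian(out, 1, 2);              // biPlanes
    putLittleEndian(out, 24, 2);             // biBitCount
    putLittleEndian(out, 0, 4);              // biCompression (BI_RGB)
    putLittleEndian(out, dataSize, 4);       // biSizeImage
    for (int i = 0; i < 4; ++i) putLittleEndian(out, 0, 4);  // resolution and palette fields

    for (std::uint32_t row = 0; row < height; ++row) {
        out.write(reinterpret_cast<const char*>(bgr.data()) +
                      static_cast<std::size_t>(row) * rowBytes,
                  rowBytes);
        for (std::uint32_t p = 0; p < padding; ++p) out.put('\0');
    }
}

// Returns the encoded file as a string, handy for inspecting the bytes.
std::string encodeBmp24(std::uint32_t width, std::uint32_t height,
                        const std::vector<unsigned char>& bgr) {
    std::ostringstream out;
    writeBmp24(out, width, height, bgr);
    return out.str();
}
```

For a real file, the stream would be opened as std::ofstream file("terrainnormal.bmp", std::ios::binary) and passed to writeBmp24. Using put() or write() instead of operator<< also makes it explicit that each value is written as exactly one raw byte.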