C++: Write BMP image format error on WINDOWS - c++

I have the strangest problem here. I'm using the same code (copy-pasted) from Linux on Windows to read and write a BMP image. For some reason everything works perfectly fine on Linux, but on Windows 10 I can't open the resulting images, and I receive an error message that says something like this:
"It looks like we don't support this file format."
Do you have any idea what should I do? I will put the code below.
EDIT:
I've solved the padding problem and now it writes the images, but they are completely white. Any idea why? I've also updated the code.
struct BMP {
int width;
int height;
unsigned char header[54];
unsigned char *pixels;
int size;
int row_padded;
};
void writeBMP(string filename, BMP image) {
string fileName = "Output Files\\" + filename;
FILE *out = fopen(fileName.c_str(), "wb");
fwrite(image.header, sizeof(unsigned char), 54, out);
unsigned char tmp;
for (int i = 0; i < image.height; i++) {
for (int j = 0; j < image.width * 3; j += 3) {
// Convert (B, G, R) to (R, G, B)
tmp = image.pixels[j];
image.pixels[j] = image.pixels[j + 2];
image.pixels[j + 2] = tmp;
}
fwrite(image.pixels, sizeof(unsigned char), image.row_padded, out);
}
fclose(out);
}
BMP readBMP(string filename) {
BMP image;
string fileName = "Input Files\\" + filename;
FILE *f = fopen(fileName.c_str(), "rb");
if (f == NULL)
throw "Argument Exception";
fread(image.header, sizeof(unsigned char), 54, f); // read the 54-byte header
// extract image height and width from header
image.width = *(int *) &image.header[18];
image.height = *(int *) &image.header[22];
image.row_padded = (image.width * 3 + 3) & (~3);
image.pixels = new unsigned char[image.row_padded];
unsigned char tmp;
for (int i = 0; i < image.height; i++) {
fread(image.pixels, sizeof(unsigned char), image.row_padded, f);
for (int j = 0; j < image.width * 3; j += 3) {
// Convert (B, G, R) to (R, G, B)
tmp = image.pixels[j];
image.pixels[j] = image.pixels[j + 2];
image.pixels[j + 2] = tmp;
}
}
fclose(f);
return image;
}
From my point of view this code should be cross-platform, but it's not. Why?
Thanks for help

Check the header
The header must start with the following two signature bytes: 0x42 0x4D ("BM"). If it starts with anything else, a third-party application will conclude that the file doesn't contain a BMP picture, despite the .bmp file extension.
The size and the way pixels are stored are also a little more complex than you expect: you assume that the number of bits per pixel is 24 and that no compression is used. This is not guaranteed. If it's not the case, you might read more data than is available and corrupt the file when writing it back.
Furthermore, the size of the header also depends on the BMP version you are using, which you can detect using the 4-byte integer at offset 14.
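The checks described above can be sketched roughly as follows. This is only an illustration: `BmpInfo`, `readLE16`, `readLE32` and `parseSimpleBmpHeader` are hypothetical names that do not appear in the question, and the function accepts only the layout the question's code seems to assume (a 14-byte file header followed by a 40-byte info header, 24 bits per pixel, no compression).

```cpp
#include <cstdint>

// Hypothetical helper type, not part of the original code.
struct BmpInfo {
    std::uint32_t dataOffset;   // file offset 10: where the pixel array starts
    std::uint32_t dibSize;      // file offset 14: size of the info header (identifies the BMP version)
    std::int32_t  width;        // file offset 18
    std::int32_t  height;       // file offset 22 (negative means top-down rows)
    std::uint16_t bitsPerPixel; // file offset 28
    std::uint32_t compression;  // file offset 30 (0 means uncompressed)
};

// BMP fields are stored little-endian; assembling them byte by byte
// avoids any dependence on the host's byte order or alignment rules.
static std::uint32_t readLE32(const unsigned char* p) {
    return std::uint32_t(p[0]) | (std::uint32_t(p[1]) << 8) |
           (std::uint32_t(p[2]) << 16) | (std::uint32_t(p[3]) << 24);
}

static std::uint16_t readLE16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Returns true only for a "BM" file with a 40-byte info header,
// 24 bits per pixel and no compression. 'out' is filled once the
// signature has been accepted.
bool parseSimpleBmpHeader(const unsigned char header[54], BmpInfo& out) {
    if (header[0] != 0x42 || header[1] != 0x4D) {
        return false; // missing "BM" signature
    }
    out.dataOffset   = readLE32(header + 10);
    out.dibSize      = readLE32(header + 14);
    out.width        = static_cast<std::int32_t>(readLE32(header + 18));
    out.height       = static_cast<std::int32_t>(readLE32(header + 22));
    out.bitsPerPixel = readLE16(header + 28);
    out.compression  = readLE32(header + 30);
    return out.dibSize == 40 && out.bitsPerPixel == 24 && out.compression == 0;
}
```

A caller would read the first 54 bytes with fread(), pass them to `parseSimpleBmpHeader`, and refuse to continue (or take a different code path) when it returns false.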
Improve your code
When you load a file, check the signature, the BMP version, the number of bits per pixel and the compression. For debugging purposes, consider dumping the header to check it manually:
for (int i=0; i<54; i++)
cout << hex << (int)image.header[i] << " "; // cast so each byte prints as a number, not as a character
cout << dec << endl;
Furthermore, when you call fread(), check that the number of bytes read matches the number you asked for, so that you can be sure you're not working with uninitialized buffer data.
Edit:
Having checked the dump, it appears that the format is as expected. But comparing the padded size in the header with the padded size that you calculated shows that the error is here:
image.row_padded = (image.width * 3 + 3) & (~3); // ok size of a single row rounded up to multiple of 4
image.pixels = new unsigned char[image.row_padded]; // oops ! A little short ?
In fact you read row by row, but you only keep the last row in memory! This differs from your first version, where you read all the pixels of the picture.
Similarly, you write the last row repeated height times.
Reconsider your padding and work with the total padded size:
image.row_padded = (image.width * 3 + 3) & (~3); // ok size of a single row rounded up to multiple of 4
image.size_padded = image.row_padded * image.height; // padded full size
image.pixels = new unsigned char[image.size_padded]; // yeah !
if (fread(image.pixels, sizeof(unsigned char), image.size_padded, f) != image.size_padded) {
cout << "Error: all bytes couldn't be read"<<endl;
}
else {
... // process the pixels as expected
}
...

Related

Writing read_jpeg and decode_jpeg functions for TensorFlow Lite C++

TensorFlow Lite has a good C++ image classification example in their repo, here.
However, I'm working with .jpeg and this example is restricted to decoding .bmp images with bitmap_helpers.cc.
I'm trying to create my own jpeg decoder but I'm not well versed in image processing so could use some help. I'm reusing this jpeg decoder as a third party helper library. In the example's bmp decoding, I don't quite understand what's the deal with calculating row_sizes and taking in the bytes array after the header. Could anyone shed some light into how this would apply for a jpeg decoder? Or, even better, is there already a C++ decode_jpeg function hiding somewhere which I have not found?
The final implementation must be in TensorFlow Lite in C++.
thank you so much!
EDIT:
Below is what I have so far. I don't get the same confidence values as when I use the Python example of the image classifier for the same input image and tflite model, so this is a clear indication that something is wrong. I essentially copied and pasted the row_size calculation from read_bmp without understanding it, so I suspect that might be the issue. What is row_size meant to represent?
std::vector<uint8_t> decode_jpeg(const uint8_t* input, int row_size, int width, int height) {
// Channels will always be 3. Hardcode it for now.
int channels = 3;
// The output that will contain the data for TensorFlow to process.
std::vector<uint8_t> output(height * width * channels);
// Go through every pixel of the image.
for(int i = 0; i < height; i++) {
int src_pos;
int dst_pos;
for(int j = 0; j < width; j++) {
src_pos = i * row_size + j * channels;
dst_pos = (i * width + j) * channels;
// Put RGB channel data into the output array.
output[dst_pos] = input[src_pos + 2];
output[dst_pos + 1] = input[src_pos + 1];
output[dst_pos + 2] = input[src_pos];
}
}
return output;
}
std::vector<uint8_t> read_jpeg(const std::string& input_jpeg_name, int* width, int* height, Settings* s) {
// Size and buffer.
size_t size;
unsigned char *buf;
// Open the input file.
FILE *f;
f = fopen(input_jpeg_name.c_str(), "rb");
if (!f) {
if (s->verbose) LOG(INFO) << "Error opening the input file\n";
exit(-1);
}
// Read the file.
fseek(f, 0, SEEK_END);
// Get the file size.
size = ftell(f);
// Get file data into buffer.
buf = (unsigned char*)malloc(size);
fseek(f, 0, SEEK_SET);
size_t read = fread(buf, 1, size, f);
// Close the file.
fclose(f);
// Decode the file.
Decoder decoder(buf, size);
if (decoder.GetResult() != Decoder::OK)
{
if (s->verbose) LOG(INFO) << "Error decoding the input file\n";
exit(-1);
}
// Get the image from the decoded file.
unsigned char* img = decoder.GetImage();
// Get image width and height.
*width = decoder.GetWidth();
*height = decoder.GetHeight();
// TODO: Understand what this row size means. Don't just copy and paste.
const int row_size = (8 * *channels * *width + 31) / 32 * 4;
// Decode the JPEG.
return decode_jpeg(img, row_size, *width, *height);
}
The library you are using already handles the decoding for you; decoder.GetImage() returns the raw RGB data. You do not need to calculate any sizes whatsoever.
Things like row_size are specific to the BMP file format. BMP files may contain padding bytes in addition to the pixel color data, and the original code was handling that.
BMP files also store pixel values in BGR order, which is why you have the reversed ordering in your original code:
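To make the meaning of row_size concrete, here is a small sketch of the same formula together with a loop that drops the padding. `bmpRowSize` and `stripRowPadding` are hypothetical names introduced only for this illustration; they are not part of bitmap_helpers.cc or of the JPEG decoder library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes occupied by one row in a BMP file: the row's bits rounded up to a
// whole number of 32-bit words. Same arithmetic as in the question.
int bmpRowSize(int width, int channels) {
    return (8 * channels * width + 31) / 32 * 4;
}

// Copy a buffer with padded BMP-style rows into a tightly packed buffer.
// Channel order is left untouched; only the padding bytes are skipped.
std::vector<std::uint8_t> stripRowPadding(const std::uint8_t* input,
                                          int width, int height, int channels) {
    const std::size_t rowSize = static_cast<std::size_t>(bmpRowSize(width, channels));
    const std::size_t packed  = static_cast<std::size_t>(width) * channels;
    std::vector<std::uint8_t> output(packed * height);
    for (int y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < packed; ++x) {
            output[y * packed + x] = input[y * rowSize + x];
        }
    }
    return output;
}
```

For a 3-channel image, a width of 4 gives 12 bytes of pixels and a row size of 12 (no padding), while a width of 5 gives 15 bytes of pixels and a row size of 16 (one padding byte per row). A buffer that has no such padding, which is what the answer says the decoder returns, should not be indexed with row_size at all.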
// Put RGB channel data into the output array.
output[dst_pos] = input[src_pos + 2];
output[dst_pos + 1] = input[src_pos + 1];
output[dst_pos + 2] = input[src_pos];
The code below should work for you (note that the decode_jpeg function does not perform any decoding):
std::vector<uint8_t> decode_jpeg(const uint8_t* input, int width, int height) {
// Channels will always be 3. Hardcode it for now.
int channels = 3;
// The output that will contain the data for TensorFlow to process.
std::vector<uint8_t> output(height * width * channels);
// Copy pixel data to output
for (size_t i = 0; i < height*width*channels; ++i)
{
output[i] = input[i];
}
return output;
}
std::vector<uint8_t> read_jpeg(const std::string& input_jpeg_name, int* width, int* height, Settings* s) {
// Size and buffer.
size_t size;
unsigned char *buf;
// Open the input file.
FILE *f;
f = fopen(input_jpeg_name.c_str(), "rb");
if (!f) {
if (s->verbose) LOG(INFO) << "Error opening the input file\n";
exit(-1);
}
// Read the file.
fseek(f, 0, SEEK_END);
// Get the file size.
size = ftell(f);
// Get file data into buffer.
buf = (unsigned char*)malloc(size);
fseek(f, 0, SEEK_SET);
size_t read = fread(buf, 1, size, f);
// Close the file.
fclose(f);
// Decode the file.
Decoder decoder(buf, size);
if (decoder.GetResult() != Decoder::OK)
{
if (s->verbose) LOG(INFO) << "Error decoding the input file\n";
exit(-1);
}
// Get the image from the decoded file.
unsigned char* img = decoder.GetImage();
// Get image width and height.
*width = decoder.GetWidth();
*height = decoder.GetHeight();
// Decode the JPEG.
return decode_jpeg(img, *width, *height);
}

Converting bitmap data to rgb visual c++

I am trying to take a bitmap image and get the RGB values of the pixels. What I currently have will open the bitmap file and read the pixel data:
#define _CRT_SECURE_NO_DEPRECATE
#include "findColor.h"
#include <vector>
#include <iostream>
int findColor(std::string path) {
std::vector<std::string> averageColor; //Will hold the average hex color of each image in order.
std::string currentImage;
currentImage = path + std::to_string(i) + ".btm";
FILE* f = fopen(currentImage.c_str(), "rb");
unsigned char info[54]; //Bitmap header is 54 bytes
fread(info, sizeof(unsigned char), 54, f); //reading the header
// extract image height and width from header
int width, height;
memcpy(&width, info + 18, sizeof(int));
memcpy(&height, info + 22, sizeof(int));
int heightSign = 1;
if (height < 0) {
heightSign = -1;
}
int size = 3 * width * height; //size of image in bytes. 3 bytes per pixel.
unsigned char* data = new unsigned char[size]; // allocate 3 bytes per pixel
fread(data, sizeof(unsigned char), size, f); // read the rest of the data at once
fclose(f); //close image.
for (i = 0; i < size; i += 3) //Flip the image data? It is stored as BGR flipping it to RGB?
{
unsigned char tmp = data[i-33];
data[i] = data[i + 2];
data[i + 2] = tmp;
}
return 0;
}
I really don't know where to go from here. Any responses will be appreciated.

Working with BMP - loading and saving

My friends and I are trying to write an app that works with BMP files, and we want to keep it as simple as possible because we're just starting to learn C and C++. Copying worked well once we used the real (padded) size of the lines, but now I wanted to add a grayscale effect and ran into another problem: the right side of the picture is moved to the left (see the pictures). What's causing this problem?
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <unistd.h>
using namespace std;
void ReadBMP()
{
FILE* f = fopen("test2.bmp", "rb");
FILE* w = fopen("zapis.bmp", "wb");
if(f == NULL)
throw "Argument Exception";
unsigned char info[54];
fread(info, sizeof(unsigned char), 54, f);
fwrite(info, sizeof(unsigned char), 54, w);
int width = *(int*)&info[18];
int height = *(int*)&info[22];
cout << endl;
cout << "Width: " << width << endl;
cout << "Height: " << height << endl;
int realwidth = 3*width+(4 - ((3*width)%4))%4;
int volume = height * realwidth;
unsigned char* data = new unsigned char[volume];
fwrite(info, sizeof(unsigned char), 54, w);
fread(data, sizeof(unsigned char), volume, f);
unsigned char color = 0;
for(int i = 0; i < volume; i+=3)
{
color = 0;
color+=data[i]*0.114;
color+=data[i+1]*0.587;
color+=data[i+2]*0.299;
data[i] = color;
data[i+1] = color;
data[i+2] = color;
}
fwrite(data, sizeof(unsigned char), volume, w);
fclose(f);
fclose(w);
delete(data);
}
int main()
{
ReadBMP();
return 0;
}
Input image
Output image
Your formula for the size of the image data is wrong. First you need to find the pitch by multiplying the width by the bytes per pixel (3 for a 24-bit image) and then rounding up to the next multiple of 4. Then multiply the pitch by the height:
int byte_width = width * 3;
int pitch = byte_width + (4 - byte_width % 4) % 4;
int volume = pitch * height;
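Once the pitch is known, the pixel loop can walk the buffer row by row so that the padding bytes at the end of each row are never treated as pixel data. The following is a minimal sketch under the assumption of a 24-bit, uncompressed image stored as B, G, R; `grayscaleRows` is a hypothetical name, and rounding to the nearest value is a choice made here rather than something taken from the question.

```cpp
#include <cstddef>
#include <vector>

// Convert a 24-bit BGR buffer with padded rows to grayscale in place.
// 'pitch' is the padded row length in bytes.
void grayscaleRows(std::vector<unsigned char>& data, int width, int height, int pitch) {
    for (int y = 0; y < height; ++y) {
        unsigned char* row = data.data() + static_cast<std::size_t>(y) * pitch;
        for (int x = 0; x < width; ++x) {
            unsigned char* px = row + x * 3; // px[0] = B, px[1] = G, px[2] = R
            // Sum the weighted channels in floating point and convert once,
            // instead of truncating after every partial sum.
            const double gray = px[0] * 0.114 + px[1] * 0.587 + px[2] * 0.299;
            const unsigned char value = static_cast<unsigned char>(gray + 0.5);
            px[0] = value;
            px[1] = value;
            px[2] = value;
        }
        // Bytes row[width * 3] .. row[pitch - 1] are padding and stay untouched.
    }
}
```

With `data` holding `pitch * height` bytes read from the file, a call such as `grayscaleRows(data, width, height, pitch)` leaves the buffer ready to be written back with a single fwrite().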
unsigned char info[54];
fread(info, sizeof(unsigned char), 54, f);
// fwrite(info, sizeof(unsigned char), 54, w); --- comment this line !!!!!!!
int width = *(int*)&info[18];
int height = *(int*)&info[22];
You're unnecessarily writing the header to the file twice.
As I see in my own code (written about 20 years ago), each line of the image is padded with 0 or more empty bytes to align the start of the next line. It seems that you calculate the alignment incorrectly.
I'll just copy and paste it here:
unsigned short paddingSize;
unsigned short bitsPerLine = width * bitsPerPixel;
if(1 == bitsPerPixel || 4 == bitsPerPixel)
{
if(bitsPerLine % 8)
bitsPerLine += 8;
paddingSize = (bitsPerLine/8) % 2;
}
else if(8 == bitsPerPixel)
paddingSize = 0x0003 & ~((bitsPerLine/8) % 4 - 1);
else
paddingSize = (bitsPerLine/8) % 2;
The real size of each line is calculatedSize + paddingSize, where calculatedSize is the exact size of the line in bytes, i.e. ceil(bitsPerLine/8), or (bitsPerLine + 7)/8 in C/C++.
What I can say about the code is that it has been debugged and it works, but I don't remember why all these checks are there.

OpenCV: how to read .pfm files?

Is there a way to read .pfm files in OpenCV?
Thank you very much for any suggestions!
PFM is an uncommon image format and I don't know why the Middlebury dataset chose to use that, probably because it uses floating point values.
Anyway I was able to read the images with OpenCV:
import numpy as np
import cv2
groundtruth = cv2.imread('disp0.pfm', cv2.IMREAD_UNCHANGED)
Note the IMREAD_UNCHANGED flag. Somehow it is able to read all the correct values even if OpenCV does not support it.
But wait a minute: inf values are commonly used to set INVALID pixel disparity, so to properly display the image you should do:
# Remove infinite value to display
groundtruth[groundtruth==np.inf] = 0
# Normalize and convert to uint8
groundtruth = cv2.normalize(groundtruth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
# Show
cv2.imshow("groundtruth", groundtruth)
cv2.waitKey(0)
cv2.destroyAllWindows()
Based on the description of the ".pfm" file format (see http://netpbm.sourceforge.net/doc/pfm.html), I wrote the following read/write functions, which depend only on the standard C/C++ library. They have worked well for reading and writing PFM files such as the ground-truth disparity ".pfm" files from Middlebury Computer Vision (see http://vision.middlebury.edu/stereo/submit3/).
#ifndef _PGM_H_
#define _PGM_H_
#include <fstream>
#include <iostream>
#include <algorithm>
#include <string>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <bitset> /*std::bitset<32>*/
#include <cstdio>
enum PFM_endianness { BIG, LITTLE, ERROR};
class PFM {
public:
PFM() : height(0), width(0), endianess(0.f) {} // defined inline so the header links on its own
inline bool is_little_big_endianness_swap(){
if (this->endianess == 0.f) {
std::cerr << "this-> endianness is not assigned yet!\n";
exit(0);
}
else {
uint32_t endianness = 0xdeadbeef;
//std::cout << "\n" << std::bitset<32>(endianness) << std::endl;
unsigned char * temp = (unsigned char *)&endianness;
//std::cout << std::bitset<8>(*temp) << std::endl;
// Compare the first byte directly; "x ^ 0xef == 0" would parse as "x ^ (0xef == 0)".
PFM_endianness endianType_ = (*temp == 0xef ? LITTLE :
*temp == 0xde ? BIG : ERROR);
// ".pfm" format file specifies that:
// positive scale means big endianess;
// negative scale means little endianess.
return ((BIG == endianType_) && (this->endianess < 0.f))
|| ((LITTLE == endianType_) && (this->endianess > 0.f));
}
}
template<typename T>
T * read_pfm(const std::string & filename) {
FILE * pFile;
pFile = fopen(filename.c_str(), "rb");
char c[100];
if (pFile != NULL) {
fscanf(pFile, "%s", c);
// strcmp() returns 0 if they are equal.
if (!strcmp(c, "Pf")) {
fscanf(pFile, "%s", c);
// atoi: ASCII to integer.
// itoa: integer to ASCII.
this->width = atoi(c);
fscanf(pFile, "%s", c);
this->height = atoi(c);
int length_ = this->width * this->height;
fscanf(pFile, "%s", c);
this->endianess = atof(c);
fseek(pFile, 0, SEEK_END);
long lSize = ftell(pFile);
long pos = lSize - this->width*this->height * sizeof(T);
fseek(pFile, pos, SEEK_SET);
T* img = new T[length_];
//cout << "sizeof(T) = " << sizeof(T);
fread(img, sizeof(T), length_, pFile);
fclose(pFile);
/* The raster is a sequence of pixels, packed one after another,
* with no delimiters of any kind. They are grouped by row,
* with the pixels in each row ordered left to right and
* the rows ordered bottom to top.
*/
T* tbimg = (T *)malloc(length_ * sizeof(T));// top-to-bottom.
//PFM SPEC image stored bottom -> top reversing image
for (int i = 0; i < this->height; i++) {
memcpy(&tbimg[(this->height - i - 1)*(this->width)],
&img[(i*(this->width))],
(this->width) * sizeof(T));
}
if (this->is_little_big_endianness_swap()){
std::cout << "little-big endianness transformation is needed.\n";
// little-big endianness transformation is needed.
union {
T f;
unsigned char u8[sizeof(T)];
} source, dest;
for (int i = 0; i < length_; ++i) {
source.f = tbimg[i];
for (unsigned int k = 0, s_T = sizeof(T); k < s_T; k++)
dest.u8[k] = source.u8[s_T - k - 1];
tbimg[i] = dest.f;
//cout << dest.f << ", ";
}
}
delete[] img;
return tbimg;
}
else {
std::cout << "Invalid magic number!"
<< " Expected Pf (meaning grayscale pfm)!\n";
fclose(pFile);
exit(0);
}
}
else {
std::cout << "Cannot open file " << filename
<< ", or it does not exist!\n";
// pFile is NULL here, so there is nothing to close.
exit(0);
}
}
template<typename T>
void write_pfm(const std::string & filename, const T* imgbuffer,
const float & endianess_) {
std::ofstream ofs(filename.c_str(), std::ifstream::binary);
// ** 1) Identifier Line: The identifier line contains the characters
// "PF" or "Pf". PF means it's a color PFM.
// Pf means it's a grayscale PFM.
// ** 2) Dimensions Line:
// The dimensions line contains two positive decimal integers,
// separated by a blank. The first is the width of the image;
// the second is the height. Both are in pixels.
// ** 3) Scale Factor / Endianness:
// The Scale Factor / Endianness line is a queer line that jams
// endianness information into an otherwise sane description
// of a scale. The line consists of a nonzero decimal number,
// not necessarily an integer. If the number is negative, that
// means the PFM raster is little endian. Otherwise, it is big
// endian. The absolute value of the number is the scale
// factor for the image.
// The scale factor tells the units of the samples in the raster.
// You use somehow it along with some separately understood unit
// information to turn a sample value into something meaningful,
// such as watts per square meter.
ofs << "Pf\n"
<< this->width << " " << this->height << "\n"
<< endianess_ << "\n";
/* PFM raster:
* The raster is a sequence of pixels, packed one after another,
* with no delimiters of any kind. They are grouped by row,
* with the pixels in each row ordered left to right and
* the rows ordered bottom to top.
* Each pixel consists of 1 or 3 samples, packed one after another,
* with no delimiters of any kind. 1 sample for a grayscale PFM
* and 3 for a color PFM (see the Identifier Line of the PFM header).
* Each sample consists of 4 consecutive bytes. The bytes represent
* a 32 bit string, in either big endian or little endian format,
* as determined by the Scale Factor / Endianness line of the PFM
* header. That string is an IEEE 32 bit floating point number code.
* Since that's the same format that most CPUs and compiler use,
* you can usually just make a program use the bytes directly
* as a floating point number, after taking care of the
* endianness variation.
*/
int length_ = this->width*this->height;
this->endianess = endianess_;
T* tbimg = (T *)malloc(length_ * sizeof(T));
// PFM SPEC image stored bottom -> top reversing image
for (int i = 0; i < this->height; i++) {
memcpy(&tbimg[(this->height - i - 1)*this->width],
&imgbuffer[(i*this->width)],
this->width * sizeof(T));
}
if (this->is_little_big_endianness_swap()) {
std::cout << "little-big endianness transformation is needed.\n";
// little-big endianness transformation is needed.
union {
T f;
unsigned char u8[sizeof(T)];
} source, dest;
for (int i = 0; i < length_; ++i) {
source.f = tbimg[i];
for (size_t k = 0, s_T = sizeof(T); k < s_T; k++)
dest.u8[k] = source.u8[s_T - k - 1];
tbimg[i] = dest.f;
//cout << dest.f << ", ";
}
}
ofs.write((char *)tbimg, this->width*this->height * sizeof(T));
ofs.close();
free(tbimg);
}
inline float getEndianess(){return endianess;}
inline int getHeight(void){return height;}
inline int getWidth(void){return width;}
inline void setHeight(const int & h){height = h;}
inline void setWidth(const int & w){width = w;}
private:
int height;
int width;
float endianess;
};
#endif /* PGM_H_ */
Forgive me for leaving lots of useless comments in the code.
A simple example showing the read/write:
int main(){
PFM pfm_rw;
std::string temp = "img/Motorcycle/disp0GT.pfm";
float * p_disp_gt = pfm_rw.read_pfm<float>(temp);
//int imgH = pfm_rw.getHeight();
//int imgW = pfm_rw.getWidth();
//float scale = pfm_rw.getEndianess();
std::string temp2 = "result/Motorcycle/disp0GT_n1.pfm";
pfm_rw.write_pfm<float>(temp2, p_disp_gt, -1.0f);
free(p_disp_gt); // read_pfm returns a buffer allocated with malloc
return 0;
}
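The endianness rule quoted in the comments above (a negative scale factor means little-endian samples, otherwise big-endian) can be isolated into a few small functions. This is a sketch rather than a drop-in replacement for the class: `hostIsLittleEndian`, `pfmNeedsByteSwap` and `swapFloatBytes` are hypothetical names, and the code assumes a 4-byte float.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "this sketch assumes 32-bit floats");

// True when the machine running this code stores the low-order byte first.
bool hostIsLittleEndian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// PFM rule: a negative scale factor means the raster is little-endian,
// a positive one means big-endian. A swap is needed when the file's
// byte order differs from the host's.
bool pfmNeedsByteSwap(float scale) {
    const bool fileIsLittleEndian = scale < 0.0f;
    return fileIsLittleEndian != hostIsLittleEndian();
}

// Reverse the four bytes of one sample. memcpy is used instead of a
// union or a pointer cast to stay within defined behaviour.
float swapFloatBytes(float value) {
    unsigned char in[4];
    std::memcpy(in, &value, 4);
    const unsigned char reversed[4] = { in[3], in[2], in[1], in[0] };
    float out;
    std::memcpy(&out, reversed, 4);
    return out;
}
```

A reader would call `pfmNeedsByteSwap` once with the scale value parsed from the header and, if it returns true, apply `swapFloatBytes` to every sample after loading the raster.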
As far as I know, OpenCV doesn't support reading PFM files directly.
You can refer to the code snippet here for a simple PFM reader, which will enable you to read PFM files into COLOR *data with COLOR defined as follows:
typedef struct {
float r;
float g;
float b;
} COLOR;

How to read char array representing a pixel as unsigned int

I am writing a C++ command line application that will apply the Haar transform to the pixels of a BMP image. I have successfully been able to extract the header information and determine the byte array size for the pixels. After filling a char[pixelHeight][rowSizeInBytes] with the pixel data from the file, I am reading each pixel (24 bits for the BMP I'm using) into a vector. It is working on my machine, but I would like to know whether my implementation for converting the char array representing a pixel into an unsigned int is safe and/or the idiomatic C++ way. I am assuming a little-endian architecture.
unsigned char pixelData[infoHeader->pixelHeight][rowSize];
fseek(pFile, basicHeader->pixelDataOffset, SEEK_SET);
fread(&pixelData, pixelArraySize, 1, pFile);
for(int row = 0; row < infoHeader->pixelHeight; row++)
{
for(int i = 0; i < rowSize; i = i + 3)
{
unsigned char blue = pixelData[row][i];
unsigned char green = pixelData[row][i + 1];
unsigned char red = pixelData[row][i + 2];
char vals[4];
vals[0] = blue;
vals[1] = green;
vals[2] = red;
vals[3] = '\0';
unsigned int pixelVal = *((unsigned int *)vals);
pixelVec.push_back(pixelVal);
}
}
No, this is unidiomatic. You should code what you mean rather than relying on the endianness of the system. For example:
unsigned int pixelVal = static_cast<unsigned int>(blue) |
(static_cast<unsigned int>(green) << 8) |
(static_cast<unsigned int>(red) << 16);
This assumes your intention was to get a vector with specific values for unsigned integers. If your intention was to get a vector with specific bytes, you should use a vector of byte-sized structures, not unsigned integers.
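As a companion to the shift-based packing above, the same idea works in reverse when the individual channels are needed again. The function names below (`packBgr`, `blueOf`, `greenOf`, `redOf`) are hypothetical and only illustrate the layout used in this answer: blue in bits 0-7, green in bits 8-15, red in bits 16-23.

```cpp
#include <cstdint>

// Pack three 8-bit channels into one 32-bit value, independent of the
// host's byte order.
std::uint32_t packBgr(std::uint8_t blue, std::uint8_t green, std::uint8_t red) {
    return static_cast<std::uint32_t>(blue) |
           (static_cast<std::uint32_t>(green) << 8) |
           (static_cast<std::uint32_t>(red) << 16);
}

// Extract each channel again with a shift and a mask.
std::uint8_t blueOf(std::uint32_t pixel)  { return static_cast<std::uint8_t>(pixel & 0xFF); }
std::uint8_t greenOf(std::uint32_t pixel) { return static_cast<std::uint8_t>((pixel >> 8) & 0xFF); }
std::uint8_t redOf(std::uint32_t pixel)   { return static_cast<std::uint8_t>((pixel >> 16) & 0xFF); }
```

Because the value is built arithmetically, `packBgr(0x11, 0x22, 0x33)` yields 0x332211 on both little-endian and big-endian machines, which the pointer cast in the question does not guarantee.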