I'm writing a simple PGM file reader for a basic CV project, and I'm having a weird issue. My method seems to work fine for square files (255 x 255, for example), but when I try to read a non-square file (300 x 246), I get weird output. One file reads to a certain point and then dumps ESCAPE characters (ASCII 27) into the remainder of the image (see below), and others just won't read. I think this might be flawed logic or a memory issue. Any help would be appreciated.
// Process files of binary type (P5)
else if (holdString[1] == '5') {
    // Assign fileType value
    fileType = 5;
    // Read in comments and discard
    getline(fileIN, holdString);
    // Read in image Width value
    fileIN >> width;
    // Read in image Height value
    fileIN >> height;
    // Read in Maximum Grayscale Value
    fileIN >> max;
    // Use 1 byte per pixel if the maximum value fits in a single byte
    if (max < 256) {
        // Collection variable for bytes
        char readChar;
        // Assign image dynamic memory
        *image = new int*[height];
        for (int index = 0; index < height; index++) {
            (*image)[index] = new int[width];
        }
        // Read in 1 byte at a time
        for (int row = 0; row < height; row++) {
            for (int column = 0; column < width; column++) {
                fileIN.get(readChar);
                (*image)[row][column] = (int) readChar;
            }
        }
        // Close the file
        fileIN.close();
    } else {
        // Assign image dynamic memory
        // Read in 2 bytes at a time
        // Close the file
    }
}
Tinkered with it a bit and came up with at least most of a solution. Using the .read() function, I was able to pull the whole file in and then copy it piece by piece into the int array. I kept the dynamic memory because I want to read files of different sizes, but I paid more attention to how the data is read into the array, so thank you for the suggestion, Mark. The edits work well on files up to 1000 pixels wide or tall, which is fine for what I'm using it for. Beyond that the image distorts, but I'll still take that over not reading the file at all.
if (max < 256) {
    // Collection variable for bytes
    int size = height * width;
    unsigned char* data = new unsigned char[size];
    // Assign image dynamic memory
    *image = new int*[height];
    for (int index = 0; index < height; index++) {
        (*image)[index] = new int[width];
    }
    // Read the whole pixel block in one call
    fileIN.read(reinterpret_cast<char*>(data), size);
    // Close the file
    fileIN.close();
    // Set data to the image
    for (int row = 0; row < height; row++) {
        for (int column = 0; column < width; column++) {
            (*image)[row][column] = (int) data[row * width + column];
        }
    }
    // Delete temporary memory
    delete[] data;
}
Related
I want to apply a simple derive/gradient filter, [-1, 0, 1], to an image from a .ppm file.
The raw binary data from the .ppm file is read into a one-dimensional array:
uint8_t* raw_image_data;
size_t n_rows, n_cols, depth;
// Open the file as an input binary file
std::ifstream file;
file.open("test_image.ppm", std::ios::in | std::ios::binary);
if (!file.is_open()) { /* error */ }
std::string temp_line;
// Check that it's a valid P6 file
if (!(std::getline(file, temp_line) && temp_line == "P6")) { /* error */ }
// Then skip all the comments (lines that begin with a #)
while (std::getline(file, temp_line) && temp_line.at(0) == '#');
// Try to read in the info about the number of rows and columns
try {
    n_rows = std::stoi(temp_line.substr(0, temp_line.find(' ')));
    n_cols = std::stoi(temp_line.substr(temp_line.find(' ') + 1, temp_line.size()));
    std::getline(file, temp_line);
    depth = std::stoi(temp_line);
} catch (const std::invalid_argument& e) { /* stoi has failed */ }
// Allocate memory and read in all image data from ppm
raw_image_data = new uint8_t[n_rows * n_cols * 3];
file.read((char*)raw_image_data, n_rows * n_cols * 3);
file.close();
I then read a grayscale image from the data into a two-dimensional array, called image_grayscale:
uint8_t** image_grayscale;
image_grayscale = new uint8_t*[n_rows];
for (size_t i = 0; i < n_rows; ++i) {
    image_grayscale[i] = new uint8_t[n_cols];
}
// Convert linear array of raw image data to 2d grayscale image
size_t counter = 0;
for (size_t r = 0; r < n_rows; ++r) {
    for (size_t c = 0; c < n_cols; ++c) {
        image_grayscale[r][c] = 0.21*raw_image_data[counter]
                              + 0.72*raw_image_data[counter+1]
                              + 0.07*raw_image_data[counter+2];
        counter += 3;
    }
}
I want to write the resulting filtered image to another two-dimensional array, gradient_magnitude:
uint32_t** gradient_magnitude;
// Allocate memory
gradient_magnitude = new uint32_t*[n_rows];
for (size_t i = 0; i < n_rows; ++i) {
    gradient_magnitude[i] = new uint32_t[n_cols];
}
// Filtering operation
int32_t grad_h, grad_v;
for (int r = 1; r < n_rows-1; ++r) {
    for (int c = 1; c < n_cols-1; ++c) {
        grad_h = image_grayscale[r][c+1] - image_grayscale[r][c-1];
        grad_v = image_grayscale[r+1][c] - image_grayscale[r-1][c];
        gradient_magnitude[r][c] = std::sqrt(pow(grad_h, 2) + pow(grad_v, 2));
    }
}
Finally, I write the filtered image to a .ppm output.
std::ofstream out;
out.open("output.ppm", std::ios::out | std::ios::binary);
// ppm header
out << "P6\n" << n_rows << " " << n_cols << "\n" << "255\n";
// Write data to file
for (int r = 0; r < n_rows; ++r) {
    for (int c = 0; c < n_cols; ++c) {
        for (int i = 0; i < 3; ++i) {
            out.write((char*) &gradient_magnitude[r][c], 1);
        }
    }
}
out.close();
The output image, however, is a mess.
When I simply set grad_v = 0; in the loop (i.e. solely calculate the horizontal gradient), the output is seemingly correct:
When I instead set grad_h = 0; (i.e. solely calculate the vertical gradient), the output is strange:
It seems like part of the image has been circularly shifted, but I cannot understand why. Moreover, I have tried with many images and the same issue occurs.
Can anyone see any issues? Thanks so much!
OK, the first clue is that the image looks circularly shifted. That hints that the strides are wrong. The core of your problem is simple:
n_rows = std::stoi(temp_line.substr(0, temp_line.find(' ')));
n_cols = std::stoi(temp_line.substr(temp_line.find(' ')+1,temp_line.size()));
but in the documentation you can read:
Each PPM image consists of the following:
A "magic number" for identifying the file type. A ppm image's magic number is the two
characters "P6".
Whitespace (blanks, TABs, CRs, LFs).
A width, formatted as ASCII characters in decimal.
Whitespace.
A height, again in ASCII decimal.
[...]
Width is columns, height is rows. So that's the classical error that you get when implementing image processing stuff: swapping rows and columns.
From a didactic point of view, why did you make this mistake? My guess: poor debugging tools. After making a working example from your question (effort I would have saved if you had provided an MCVE), I ran to the end of image loading and used Image Watch to see the content of your image with #mem(raw_image_data, UINT8, 3, n_cols, n_rows, n_cols*3). Result:
Ok, let's try to swap them: #mem(raw_image_data, UINT8, 3, n_rows, n_cols, n_rows*3). Result:
Much better. Unfortunately I don't know how to specify RGB instead of BGR in the Image Watch #mem pseudo command, hence the wrong colors.
Then, back to your code: please compile with all warnings on. I'd also use more of the std::stream features for parsing your input and less std::stoi() and find(). Avoid manual memory allocation by using std::vector, and make a (possibly templated) class for images. Even if you stick to your pointer-to-pointer layout, don't make a separate new for each row: make a single new for the row-0 pointer and have the other pointers point into it:
uint8_t** image_grayscale = new uint8_t*[n_rows];
image_grayscale[0] = new uint8_t[n_rows*n_cols];
for (size_t i = 1; i < n_rows; ++i) {
    image_grayscale[i] = image_grayscale[i - 1] + n_cols;
}
Same effect, but easier to deallocate and to manage as a single piece of memory. For example, saving as a PGM becomes:
{
    std::ofstream out("output.pgm", std::ios::binary);
    out << "P5\n" << n_rows << " " << n_cols << "\n" << "255\n";
    out.write(reinterpret_cast<char*>(image_grayscale[0]), n_rows*n_cols);
}
Fill your borders! Using the single-allocation style shown above, you can do it as:
uint32_t** gradient_magnitude = new uint32_t*[n_rows];
gradient_magnitude[0] = new uint32_t[n_rows*n_cols];
for (size_t i = 1; i < n_rows; ++i) {
    gradient_magnitude[i] = gradient_magnitude[i - 1] + n_cols;
}
std::fill_n(gradient_magnitude[0], n_rows*n_cols, 0);
Finally, the gradient magnitude is an integer value between 0 and 360 (you used a uint32_t), but you save only its least significant byte! Of course that's wrong. You need to map from [0,360] to [0,255]. How? You can saturate (if greater than 255, set to 255) or apply a linear scaling (*255/360). You can do other things too, but that's not important here.
Here you can see the result on a zoomed version of the three cases: saturate, scale, only LSB (wrong):
With the wrong version you see dark pixels where the value should be higher than 255.
I'm trying to read an 8-bit image in binary/raw file format and put every pixel in a row of a CSV file, including the 12 neighbours within 2 pixels in x, y, z. I started by just trying to write the value of each pixel into the CSV file:
// ----------------------- Create array to hold input values for ml
short p[1308*1308*200][13];
ofstream full_stack;
full_stack.open("full_stack.csv");
int index;
// ----------------------- for loop execution
for (int x = 0; x < 1308; x++) {
    for (int y = 0; y < 1308; y++) {
        for (int z = 0; z < 200; z++) {
            index = x + 1308*y + 1308*1308*z;
            myData.read(buf, sizeof(buf));
            memcpy(&value, buf, sizeof(buf));
            p[index][0] = value;
        }
    }
}
for (int i = 0; i < 1308*1308*200; i++) {
    for (int j = 0; j < 13; j++) {
        full_stack << p[i][j] << endl;
    }
}
full_stack.close();
}
As @Sid_S points out, you're attempting to declare an 8-gigabyte array on the stack. The stack in a typical application on a typical machine these days is around 1-2 megabytes. You need to allocate the array dynamically using malloc(), new, or a C++ collection like std::vector<short>. Given that you have a 2-dimensional array, you'd need something like std::vector<std::vector<short>>.
I have a program that reads from a really big binary file (48 MB) and then passes the data to a matrix of custom structs named pixel:
struct pixel {
int r;
int g;
int b;
};
Opening the file:
ifstream myFile(inputPath, ios::binary);
pixel **matrixPixel;
The read of the file is done this way:
int position = 0;
for (int i = 0; i < HEIGHT; ++i) {
    for (int j = 0; j < WIDTH; ++j) {
        if (!myFile.eof()) {
            myFile.seekg(position, ios::beg);
            myFile.read((char *) &matrixPixel[i][j].r, 1); // red byte
            myFile.seekg(position + HEIGHT * WIDTH, ios::beg);
            myFile.read((char *) &matrixPixel[i][j].g, 1); // green byte
            myFile.seekg(position + HEIGHT * WIDTH * 2, ios::beg);
            myFile.read((char *) &matrixPixel[i][j].b, 1); // blue byte
            ++position;
        }
    }
}
myFile.close();
The thing is that, for a big file like the one at the beginning, it takes a lot of time (~7 min) and it's supposed to be optimized. How could I read from the file in less time?
So, the structure of the data you're storing in memory looks like this:
rgbrgbrgbrgbrgbrgbrgbrgbrgbrgb..............rgb
But the structure of the file you're reading looks like this (assuming your code's logic is correct):
rrrrrrrrrrrrrrrrrrrrrrrrrrr....
ggggggggggggggggggggggggggg....
bbbbbbbbbbbbbbbbbbbbbbbbbbb....
And in your code, you're translating between the two. Fundamentally, that's going to be slow. And what's more, you've chosen to read the file by making manual seeks to arbitrary points in the file. That's going to slow things down even more.
The first thing you can do is streamline the Hard Disk reads:
for (int channel = 0; channel < 3; channel++) {
    for (int i = 0; i < HEIGHT; ++i) {
        for (int j = 0; j < WIDTH; ++j) {
            if (!myFile.eof()) {
                switch (channel) {
                    case 0: myFile.read((char *) &matrixPixel[i][j].r, 1); break;
                    case 1: myFile.read((char *) &matrixPixel[i][j].g, 1); break;
                    case 2: myFile.read((char *) &matrixPixel[i][j].b, 1); break;
                }
            }
        }
    }
}
That requires the fewest changes to your code, and will speed up your code, but the code will probably still be slow.
A better approach, which increases CPU use but dramatically reduces Hard Disk use (which, in the vast majority of applications, will result in a speed-up), would be to store the data like so:
std::vector<unsigned char> reds(WIDTH * HEIGHT);
std::vector<unsigned char> greens(WIDTH * HEIGHT);
std::vector<unsigned char> blues(WIDTH * HEIGHT);

// The stream can be checked for errors resulting from EOF or other issues.
myFile.read(reinterpret_cast<char*>(reds.data()), WIDTH * HEIGHT);
myFile.read(reinterpret_cast<char*>(greens.data()), WIDTH * HEIGHT);
myFile.read(reinterpret_cast<char*>(blues.data()), WIDTH * HEIGHT);

std::vector<pixel> pixels(WIDTH * HEIGHT);
for (size_t index = 0; index < WIDTH * HEIGHT; index++) {
    pixels[index].r = reds[index];
    pixels[index].g = greens[index];
    pixels[index].b = blues[index];
}
The final, best approach, is to change how the binary file is formatted, because the way it appears to be formatted is insane (from a performance perspective). If the file is reformatted to the rgbrgbrgbrgbrgb style (which is far more standard in the industry), your code simply becomes this:
struct pixel {
    unsigned char red, green, blue;
}; // You'll never read values above 255 when doing byte-length color values.

std::vector<pixel> pixels(WIDTH * HEIGHT);
myFile.read(reinterpret_cast<char*>(pixels.data()), WIDTH * HEIGHT * 3);
This is extremely short, and is probably going to outperform all the other methods. But of course, that may not be an option for you.
I haven't tested any of these methods (and there may be a typo or two) but all of these methods should be faster than what you're currently doing.
A faster method would be to read the bitmap into a buffer:
uint8_t buffer[HEIGHT][WIDTH];
const unsigned int bitmap_size_in_bytes = sizeof(buffer);
myFile.read(reinterpret_cast<char*>(buffer), bitmap_size_in_bytes);
An even faster method is to read more than one bitmap into memory.
I'm trying to read a .pgm version P5 file. The header is in plain text, then the actual data is stored as raw bytes. The header can be an arbitrary length. How can I start reading byte by byte after reading in the plain-text lines?
int main()
{
    // Declare
    int rows = 0, cols = 0, maxVal = 0;
    ifstream infile("image.pgm");
    string inputLine = "";
    string trash = "";

    // First line "P5"
    getline(infile, inputLine);

    // Ignore lines with comments
    getline(infile, trash);
    while (trash[0] == '#')
    {
        getline(infile, trash);
    }

    // Get the rows and cols
    istringstream iss(trash);
    getline(iss, inputLine, ' ');
    rows = atoi(inputLine.c_str());
    getline(iss, inputLine, ' ');
    cols = atoi(inputLine.c_str());

    // Get the last plain-text line, maxVal
    getline(infile, inputLine);
    maxVal = atoi(inputLine.c_str());

    // Now start reading individual bytes
    Matrix<int, rows, cols> m;

    // Now comes the data
    for (int i = 0; i < rows; i++)
    {
        for (int j = 0; j < cols; j++)
        {
            // store data into matrix
        }
    }
    system("Pause");
    return 0;
}
Use ifstream::read to read a block of binary data and copy it into a buffer. You know the size of the data from the image dimensions in the header.
If your matrix object has a method that exposes its storage address, you can read into it directly; otherwise read into a temporary buffer and then copy that into the Matrix. Reading a byte at a time is likely to be very slow.
I want to be able to compare 2 images (same format) and perform a bit-level comparison on those images: 1) create structs for the headers; 2) open the files and read the contents starting at the image data offset from the SOI marker; 3) store the respective values in a 3D array or a vector; 4) do an element-wise comparison and return a result. I have successfully been able to do this for a BMP using fread(), with a 3D array as a container and methods to allocate and deallocate memory (but BMPs are uncompressed images). Somehow this process seems a lot harder for JPEGs and TIFFs. Even after understanding the header formats for these 2 formats, my code says that it cannot read the color at element [45][24]. I have looked at several other options like libjpeg and CImg, but I would like to get some points of view before I jump into a new library.
My code for bmp is as follows :
...snip...
unsigned char*** create3dArray(FILE **fptr1, int height, int width, int depth)
{
    unsigned char*** databuff = new unsigned char **[height];
    // Allocate an array for each element of the first array
    for (int x = 0; x < height; ++x)
    {
        databuff[x] = new unsigned char *[width];
        // Allocate an array of integers for each element of this array
        for (int y = 0; y < width; ++y)
        {
            databuff[x][y] = new unsigned char[depth];
            // Specify an initial value (if desired)
            for (int z = 0; z < depth; ++z)
            {
                databuff[x][y][z] = -1;
            }
        }
    }
    if ((sizeof(fheader) != 14) || (sizeof(iheader) != 40))
    {
        printf("Header structs are not properly packed\n");
        return 0;
    }
    if (fread(&fheader, sizeof(fheader), 1, *fptr1) != 1)
    {
        printf("Couldn't read fheader.\n");
        return 0;
    }
    if (fread(&iheader, sizeof(iheader), 1, *fptr1) != 1)
    {
        printf("Couldn't read iheader.\n");
        return 0;
    }
    // uncomment to get an idea of what the headers look like.
    if ((iheader.height != height) || (iheader.width != width) || (iheader.bits != 24))
    {
        printf("This only works for 512x512 24-color bitmaps\n");
        return 0;
    }
    if (fheader.offset != 54) {
        printf("This only works if the offset is equal to 54\n");
        return 0;
    }
    for (int i = 0; i < iheader.height; i++) {
        for (int j = 0; j < iheader.width; j++) {
            if (fread(&databuff[i][j][0], 3, 1, *fptr1) != 1) {
                printf("Couldn't read colors for element [%d][%d]\n", i, j);
                return 0;
            }
        }
    }
    return databuff;
}

template <typename Tx>
void destroy3dArray(Tx*** myArray)
{
    delete[] **myArray;
    delete[] *myArray;
    delete[] myArray;
}
int main()
{
    FILE *fptr1, *fptr2; // two file pointers, one for each file
    int count = 0;
    float total_bits = 0;
    float ber = 0; // variable for bit error rate
    int width, height, depth;
    cout << "Please enter height of the image " << endl;
    cin >> height;
    cout << "Please enter width of the image " << endl;
    cin >> width;
    cout << "Please enter depth of the image. The max depth can be 3 for RGB values" << endl;
    cin >> depth;
    char *filename = "lena512.bmp";
    char *filename2 = "lena512_2.bmp";
    //std::string trueBinaryDataInString[512][512][3];
    if ((fptr1 = fopen(filename, "r")) == NULL) {
        printf("Couldn't open file %s for reading.\n", filename);
        return 1;
    }
    unsigned char*** trueArray = create3dArray(&fptr1, height, width, depth);
    for (int i = 0; i < height; i++)
    {
        //std::cout << "Row " << i << std::endl;
        for (int j = 0; j < width; j++)
        {
            for (int k = 0; k < depth; k++)
            {
                total_bits += ToBinary(trueArray[i][j][k]).length();
            }
            //std::cout << endl;
        }
        //std::cout << endl;
    }
    std::cout << total_bits << endl;
    //createAnddestroy3dArray<unsigned char> MyArray;
    if ((fptr2 = fopen(filename2, "r")) == NULL) {
        printf("Couldn't open file %s for reading.\n", filename2);
        return 1;
    }
    unsigned char*** trueArray2 = create3dArray(&fptr2, height, width, depth);
    /*for(int i=0;i<512;i++)
    {
        std::cout << "Row " << i << std::endl;
        for(int j=0;j<512;j++)
        {
            for(int k=0;k<3;k++)
            {
                std::cout<<" "<<ToBinary(trueArray2[i][j][k]);
            }
            std::cout<<endl;
        }
        std::cout<<endl;
    }
    */
    /******** BIT Error Rate Calculation ******/
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width; j++)
        {
            for (int k = 0; k < depth; k++)
            {
                if (ToBinary(trueArray[i][j][k]) != ToBinary(trueArray2[i][j][k]))
                {
                    std::cout << ToBinary(trueArray[i][j][k]) << " " << ToBinary(trueArray2[i][j][k]) << endl;
                    count++;
                }
                else
                    continue;
            }
        }
    }
    ber = (count / total_bits) * 100;
    std::cout << "Bit Error Rate (BER) = " << ber << endl;
    destroy3dArray<unsigned char>(trueArray);  // Deallocating memory for array 1
    destroy3dArray<unsigned char>(trueArray2); // Deallocating memory for array 2
    return 0;
}
JPEG and TIFF are compressed formats with perhaps more degrees of freedom in encoding images than you might expect.
So you are approaching the problem from the wrong angle. To support a choice of imaging formats, you need libraries to read and decompress the files into bitmaps, such as 24-bit RGB. There might also be a color-space conversion required, as one of the compared images might decompress into 4:2:2 YUV while the other is 4:2:0, etc.
Leveraging an image library of your choice (perhaps you have OS constraints as well), you would be able to load and decompress the files into a 2D array of pixels in the format of your interest. Having done that, you would feed it into your C++ number-crunching code and do the comparison from there.
Successfully parsing and handling the possible variations in JPEG and TIFF files is hard. There are a surprising number of details: color depth, progressive encoding, EXIF data, thumbnails, - the list goes on. Take advantage of the libraries and don't reinvent the wheel. Use libjpeg and libtiff to load suitable (RGB?) buffers for comparison.
http://www.libtiff.org/
http://www.ijg.org/
FWIW, libpng is pretty good, too - if you want to extend your image comparison to that format, as well. http://www.libpng.org/pub/png/libpng.html