jpeg and tiff Pixel value extraction - c++

I want to be able to compare two images (same format) and perform a bit-level comparison on them. My plan is: 1) create structs for the headers; 2) open the files and read the contents starting at the image data offset from the SOI marker; 3) store the respective values in a 3D array or a vector; 4) do an element-wise comparison and return a result. I have successfully done this for a BMP using fread(), with a 3D array as the container and methods that allocate and deallocate memory (but BMPs are uncompressed images). Somehow this process seems a lot harder for JPEGs and TIFFs. Even after understanding the header format for these two formats, my code says that it cannot read the color at element [45][24]. I have looked at several other options like libjpeg and CImg, but I would like to get some points of view before I jump into a new library.
My code for BMP is as follows:
...snip...
unsigned char*** create3dArray(FILE **fptr1,int height,int width,int depth)
{
unsigned char*** databuff = new unsigned char **[height];
// Allocate an array for each element of the first array
for(int x = 0; x < height; ++x)
{
databuff[x] = new unsigned char *[width];
// Allocate an array of integers for each element of this array
for(int y = 0; y < width; ++y)
{
databuff[x][y] = new unsigned char[depth];
// Specify an initial value (if desired)
for(int z = 0; z < depth; ++z)
{
databuff[x][y][z] = -1;
}
}
}
if ((sizeof(fheader) != 14) || (sizeof(iheader) != 40))
{
printf("Header structs are not properly packed\n");
return 0;
}
if (fread(&fheader, sizeof(fheader), 1, *fptr1) != 1)
{
printf("Couldn't read fheader.\n");
return 0;
}
if (fread(&iheader, sizeof(iheader), 1, *fptr1) != 1)
{
printf("Couldn't read iheader.\n");
return 0;
}
// uncomment to get an idea of what the headers look like.
if ((iheader.height != height) || (iheader.width != width) || (iheader.bits != 24))
{
printf("This only works for 512x512 24-color bitmaps\n");
return 0;
}
if (fheader.offset != 54) {
printf("This only works if the offset is equal to 54\n");
return 0;
}
for (int i = 0; i < iheader.height; i++) {
for (int j = 0; j < iheader.width; j++) {
if (fread(&databuff[i][j][0], 3, 1, *fptr1) != 1 ){
printf("Couldn't read colors for element [%d][%d]\n", i, j);
return 0;
}
}
}
return databuff;
}
template <typename Tx>
void destroy3dArray(Tx*** myArray)
{
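// Note: this frees only the first row's pointer array and the first pixel's
// channel array; every other allocation made in create3dArray is leaked.
// Freeing all of them requires passing the height and width in as well.
delete[] **myArray;
delete[] *myArray;
delete[] myArray;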
}
int main()
{
FILE *fptr1,*fptr2; // two file pointers one for each file.
int count=0;
float total_bits=0;
float ber=0; //variable for bit error rate
int width,height,depth;
cout<<"Please enter height of the image "<<endl;
cin>>height;
cout<<"Please enter width of the image "<<endl;
cin>>width;
cout<<"Please enter depth of the image. The max depth can be 3 for RGB values"<<endl;
cin>>depth;
const char *filename = "lena512.bmp";
const char *filename2 = "lena512_2.bmp";
//std::string trueBinaryDataInString[512][512][3];
if ((fptr1 = fopen(filename, "rb")) == NULL) { // "rb": the pixel data is binary
printf("Couldn't open file %s for reading.\n", filename);
return 1;
}
unsigned char*** trueArray = create3dArray(&fptr1,height,width,depth);
for(int i=0;i<height;i++)
{
//std::cout << "Row " << i << std::endl;
for(int j=0;j<width;j++)
{
for(int k=0;k<depth;k++)
{
total_bits += ToBinary(trueArray[i][j][k]).length();
}
//std::cout<<endl;
}
//std::cout<<endl;
}
std::cout << total_bits<<endl;
//createAnddestroy3dArray<unsigned char> MyArray;
if ((fptr2 = fopen(filename2, "rb")) == NULL) { // "rb": the pixel data is binary
printf("Couldn't open file %s for reading.\n", filename2);
return 1;
}
unsigned char*** trueArray2 = create3dArray(&fptr2,height,width,depth);
/*for(int i=0;i<512;i++)
{
std::cout << "Row " << i << std::endl;
for(int j=0;j<512;j++)
{
for(int k=0;k<3;k++)
{
std::cout<<" "<<ToBinary(trueArray2[i][j][k]);
}
std::cout<<endl;
}
std::cout<<endl;
}
*/
/******** BIT Error Rate Calculation ******/
for(int i=0;i<height;i++)
{
for(int j=0;j<width;j++)
{
for(int k=0;k<depth;k++)
{
if(ToBinary(trueArray[i][j][k])!= ToBinary(trueArray2[i][j][k]))
{
std::cout<<ToBinary(trueArray[i][j][k])<< " " <<ToBinary(trueArray2[i] [j][k])<<endl;
count++;
}
else
continue;
}
}
}
ber = (count/total_bits)*100;
std::cout<<"Bit Error Rate (BER) = "<<ber<<endl;
destroy3dArray<unsigned char>(trueArray); //Deallocating memory for array 1
destroy3dArray<unsigned char>(trueArray2); //Deallocating memory for array 2
return 0;
}

JPEG and TIFF are compressed formats with more degrees of freedom in encoding images than you might expect.
So you are approaching the problem from the wrong angle. To support a choice of image formats you need libraries that read and decompress the files into a bitmap, such as 24-bit RGB or something else. A color space conversion might also be required, as one of the compared images might be decompressed into 4:2:2 YUV and the other into 4:2:0, etc.
Using an image library of your choice (perhaps you have OS constraints as well), you can load and decompress the files into a 2D array of pixels in the format you are interested in. Once that is done, you feed that array into your C++ number-crunching code and do the comparison from there.
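As a rough sketch of that last step, assuming some library has already decoded both files into equally sized 8-bit RGB buffers: the helper names below are hypothetical, and counting differing bits with XOR is my own choice here, not the question's ToBinary comparison.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: counts the bits that differ between two decoded
// pixel buffers of identical size (e.g. height * width * 3 bytes of RGB).
std::size_t count_bit_errors(const std::vector<std::uint8_t>& a,
                             const std::vector<std::uint8_t>& b)
{
    if (a.size() != b.size()) {
        throw std::invalid_argument("buffers must have the same size");
    }
    std::size_t errors = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // XOR leaves a 1 in every bit position where the two bytes differ.
        errors += std::bitset<8>(a[i] ^ b[i]).count();
    }
    return errors;
}

// Bit error rate as a percentage of all compared bits.
double bit_error_rate_percent(const std::vector<std::uint8_t>& a,
                              const std::vector<std::uint8_t>& b)
{
    if (a.empty()) return 0.0;
    return 100.0 * static_cast<double>(count_bit_errors(a, b))
                 / static_cast<double>(a.size() * 8);
}
```

Because the buffers are flat, the same code works regardless of which format the pixels were decoded from.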

Successfully parsing and handling the possible variations in JPEG and TIFF files is hard. There are a surprising number of details: color depth, progressive encoding, EXIF data, thumbnails, and the list goes on. Take advantage of the libraries and don't reinvent the wheel. Use libjpeg and libtiff to load suitable (RGB?) buffers for comparison.
http://www.libtiff.org/
http://www.ijg.org/
FWIW, libpng is pretty good, too - if you want to extend your image comparison to that format, as well. http://www.libpng.org/pub/png/libpng.html

Related

Why does iterating over a hex array stop early?

Problem and Code
I am working with code to take a screenshot on a Raspberry Pi. Using some magic from the VC handler, I can take a screenshot and store it in memory with calloc. I can use this to store the data in a file as a ppm image with the requisite header using:
void * image;
image = calloc(1, width * 3 * height);
// code to store data into *image
FILE *fp = fopen("myfile.ppm", "wb");
fprintf(fp, "P6\n%d %d\n255\n", width, height);
fwrite(image, width*3*height, 1, fp);
fclose(fp);
This successfully stores the data. I can access it and view it normally.
However, if I instead try to inspect the data which are being put into the file for debugging purposes by printing:
int cnt = 0;
std::string imstr = (char *)image;
for (int i=0; i<(width*3*height); i++) {
std::cout << (int)imstr[i] << " " << cnt << std::endl;
cnt += 1;
}
I segfault early. The numbers which are returned in the print make sense for the context (e.g. color values <255)
Example Numbers
In the case of a 1280 x 768 x 3 image, my cnt stops at 64231. The value it stops at doesn't seem to have any relation to the sizeof char or int.
I think I'm missing something obvious here, but I can't see it. Any suggestions?
Very probably you have at least one null character in (char *)image, so the std::string is shorter than width*3*height: its constructor only uses the characters up to that first null character.
Use a std::array rather than a std::string initialized like that.
The way you are converting the image data to a std::string is wrong. If the image's raw data contains any 0x00 bytes then the std::string will be truncated, causing your loop to access out of bounds of the std::string. And if the image's raw data does not contain any 0x00 bytes then the std::string constructor will try to read past the bounds of the image's allocated memory.
You need to take the image's size into account when constructing the std::string, eg:
size_t cnt = 0;
std::string imstr(static_cast<char*>(image), width*3*height);
for (size_t i = 0; i < imstr.size(); ++i) {
std::cout << static_cast<int>(imstr[i]) << " " << cnt << std::endl;
++cnt;
}
Otherwise, simply don't convert the image to std::string at all. You can iterate the image's raw data directly instead, eg:
size_t cnt = 0, imsize = width*3*height;
char *imdata = static_cast<char*>(image);
for (size_t i = 0; i < imsize; ++i) {
std::cout << static_cast<int>(imdata[i]) << " " << cnt << std::endl;
++cnt;
}
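To see the truncation in isolation, here is a tiny sketch. The helper name bytes_to_string is made up for illustration; it just wraps the pointer-plus-length std::string constructor used above.

```cpp
#include <cstddef>
#include <string>

// Builds a std::string from raw bytes using an explicit length, so embedded
// 0x00 bytes are kept instead of being treated as a terminator.
std::string bytes_to_string(const void* data, std::size_t size)
{
    return std::string(static_cast<const char*>(data), size);
}
```

With the bytes {10, 20, 0, 30}, std::string(ptr) has length 2 because it stops at the zero byte, while bytes_to_string(ptr, 4) has length 4.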

Calculating the vertical gradient of 2D image causes strange output

I want to apply a simple derive/gradient filter, [-1, 0, 1], to an image from a .ppm file.
The raw binary data from the .ppm file is read into a one-dimensional array:
uint8_t* raw_image_data;
size_t n_rows, n_cols, depth;
// Open the file as an input binary file
std::ifstream file;
file.open("test_image.ppm", std::ios::in | std::ios::binary);
if (!file.is_open()) { /* error */ }
std::string temp_line;
// Check that it's a valid P6 file
if (!(std::getline(file, temp_line) && temp_line == "P6")) {}
// Then skip all the comments (lines that begin with a #)
while (std::getline(file, temp_line) && temp_line.at(0) == '#');
// Try read in the info about the number of rows and columns
try {
n_rows = std::stoi(temp_line.substr(0, temp_line.find(' ')));
n_cols = std::stoi(temp_line.substr(temp_line.find(' ')+1,temp_line.size()));
std::getline(file, temp_line);
depth = std::stoi(temp_line);
} catch (const std::invalid_argument & e) { /* stoi has failed */}
// Allocate memory and read in all image data from ppm
raw_image_data = new uint8_t[n_rows*n_cols*3];
file.read((char*)raw_image_data, n_rows*n_cols*3);
file.close();
I then read a grayscale image from the data into a two-dimensional array, called image_grayscale:
uint8_t** image_grayscale;
image_grayscale = new uint8_t*[n_rows];
for (size_t i = 0; i < n_rows; ++i) {
image_grayscale[i] = new uint8_t[n_cols];
}
// Convert linear array of raw image data to 2d grayscale image
size_t counter = 0;
for (size_t r = 0; r < n_rows; ++r) {
for (size_t c = 0; c < n_cols; ++c) {
image_grayscale[r][c] = 0.21*raw_image_data[counter]
+ 0.72*raw_image_data[counter+1]
+ 0.07*raw_image_data[counter+2];
counter += 3;
}
}
I want to write the resulting filtered image to another two-dimensional array, gradient_magnitude:
uint32_t** gradient_magnitude;
// Allocate memory
gradient_magnitude = new uint32_t*[n_rows];
for (size_t i = 0; i < n_rows; ++i) {
gradient_magnitude[i] = new uint32_t[n_cols];
}
// Filtering operation
int32_t grad_h, grad_v;
for (int r = 1; r < n_rows-1; ++r) {
for (int c = 1; c < n_cols-1; ++c) {
grad_h = image_grayscale[r][c+1] - image_grayscale[r][c-1];
grad_v = image_grayscale[r+1][c] - image_grayscale[r-1][c];
gradient_magnitude[r][c] = std::sqrt(pow(grad_h, 2) + pow(grad_v, 2));
}
}
Finally, I write the filtered image to a .ppm output.
std::ofstream out;
out.open("output.ppm", std::ios::out | std::ios::binary);
// ppm header
out << "P6\n" << n_rows << " " << n_cols << "\n" << "255\n";
// Write data to file
for (int r = 0; r < n_rows; ++r) {
for (int c = 0; c < n_cols; ++c) {
for (int i = 0; i < 3; ++i) {
out.write((char*) &gradient_magnitude[r][c],1);
}
}
}
out.close();
The output image, however, is a mess.
When I simply set grad_v = 0; in the loop (i.e. solely calculate the horizontal gradient), the output is seemingly correct:
When I instead set grad_h = 0; (i.e. solely calculate the vertical gradient), the output is strange:
It seems like part of the image has been circularly shifted, but I cannot understand why. Moreover, I have tried with many images and the same issue occurs.
Can anyone see any issues? Thanks so much!
Ok, first clue is that the image looks circularly shifted. This hints that strides are wrong. The core of your problem is simple:
n_rows = std::stoi(temp_line.substr(0, temp_line.find(' ')));
n_cols = std::stoi(temp_line.substr(temp_line.find(' ')+1,temp_line.size()));
but in the documentation you can read:
Each PPM image consists of the following:
A "magic number" for identifying the file type. A ppm image's magic number is the two
characters "P6".
Whitespace (blanks, TABs, CRs, LFs).
A width, formatted as ASCII characters in decimal.
Whitespace.
A height, again in ASCII decimal.
[...]
Width is columns, height is rows. So that's the classical error that you get when implementing image processing stuff: swapping rows and columns.
From a didactic point of view, why did you make this mistake? My guess: poor debugging tools. After making a working example from your question (an effort I would have been spared if you had provided an MCVE), I ran to the end of image loading and used Image Watch to see the content of your image with @mem(raw_image_data, UINT8, 3, n_cols, n_rows, n_cols*3). Result:
OK, let's try to swap them: @mem(raw_image_data, UINT8, 3, n_rows, n_cols, n_rows*3). Result:
Much better. Unfortunately I don't know how to specify RGB instead of BGR in the Image Watch @mem pseudo-command, hence the wrong colors.
Then we come back to your code: please compile with all warnings on. Then I'd use more of the std::stream features for parsing your input and less std::stoi() or find(). Avoid memory allocation by using std::vector and make a (possibly template) class for images. Even if you stick to your pointer to pointer, don't make multiple new for each row: make a single new for the pointer at row 0, and have the other pointers point to it:
uint8_t** image_grayscale = new uint8_t*[n_rows];
image_grayscale[0] = new uint8_t[n_rows*n_cols];
for (size_t i = 1; i < n_rows; ++i) {
image_grayscale[i] = image_grayscale[i - 1] + n_cols;
}
Same effect, but easier to deallocate and to manage as a single piece of memory. For example, saving as a PGM becomes:
{
std::ofstream out("output.pgm", std::ios::binary);
out << "P5\n" << n_rows << " " << n_cols << "\n" << "255\n";
out.write(reinterpret_cast<char*>(image_grayscale[0]), n_rows*n_cols);
}
Fill your borders! Using the single allocation style I showed you you can do it as:
uint32_t** gradient_magnitude = new uint32_t*[n_rows];
gradient_magnitude[0] = new uint32_t[n_rows*n_cols];
for (size_t i = 1; i < n_rows; ++i) {
gradient_magnitude[i] = gradient_magnitude[i - 1] + n_cols;
}
std::fill_n(gradient_magnitude[0], n_rows*n_cols, 0);
Finally the gradient magnitude is an integer value between 0 and 360 (you used a uint32_t). Then you save only the least significant byte of it! Of course it's wrong. You need to map from [0,360] to [0,255]. How? You can saturate (if greater than 255 set to 255) or apply a linear scaling (*255/360). Of course you can do also other things, but it's not important.
Here you can see the result on a zoomed version of the three cases: saturate, scale, only LSB (wrong):
With the wrong version you see dark pixels where the value should be higher than 255.
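The two mappings mentioned above (saturate and linear scaling) might be sketched as follows. The function names and the constant are mine, and 360 is the truncated value of sqrt(255^2 + 255^2).

```cpp
#include <algorithm>
#include <cstdint>

// Largest magnitude for 8-bit input: sqrt(255^2 + 255^2) truncates to 360.
constexpr std::uint32_t kMaxMagnitude = 360;

// Saturation: clamp everything above 255 to 255.
std::uint8_t saturate_to_byte(std::uint32_t magnitude)
{
    return static_cast<std::uint8_t>(std::min<std::uint32_t>(magnitude, 255));
}

// Linear scaling: map [0, 360] onto [0, 255] with integer arithmetic.
std::uint8_t scale_to_byte(std::uint32_t magnitude)
{
    const std::uint32_t clamped = std::min(magnitude, kMaxMagnitude);
    return static_cast<std::uint8_t>(clamped * 255 / kMaxMagnitude);
}
```

For a magnitude of 300, saturation gives 255 and scaling gives 212, whereas keeping only the least significant byte gives 44, which is why strong edges come out dark in the wrong version.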

Populating an array from a .txt file

I'm trying to populate an array from a .txt file that I am reading. I am using this code as a function to read the file:
double* read_text(const char *fileName, int sizeR, int sizeC)
{
double* data = new double[sizeR*sizeC];
int i = 0;
ifstream myfile(fileName);
if (myfile.is_open())
{
while (myfile.good())
{
if (i > sizeR*sizeC - 1) break;
myfile >> *(data + i);
//cout << *(data + i) << ' '; // Displays converted data.
i++;
}
myfile.close();
}
else cout << "Unable to open file";
//cout << i;
return data;
}
Now when I read the file I am trying to take the elements from the 1D data array and store them into a 2D array.
I've tried to create an array in a public class, however I have no idea on how to move the data that I am reading to a 2D array.
I know it's not very clear but basically I'm doing the nearest neighbour search algorithm to compare 2 images. I have taken one image and converted it into the values using this bit of code above. However now I am trying to store the data that I am reading into a 2D public array?
Here is a more compact version of reading in a 2D matrix:
int quantity = sizeR * sizeC;
double * matrix = new double [quantity];
double value = 0.0;
double * p_cell = matrix;
//...
while ((myfile >> value) && (quantity > 0))
{
*p_cell++ = value;
--quantity;
}
In the above code snippet, a pointer is used to point to the next slot or cell of the matrix (2D array). The pointer is incremented after each read.
The quantity is decremented as a safety check to prevent buffer overrun.
Assuming every double returned represents a pixel, you can define a function that retrieves pixels like so:
double get_pixel(int x, int y, double* data, int sizeC)
{
return data[x + y*sizeC];
}
Where sizeC is the width of the image (number of columns).
You can then use the function above to fill your 2D array like so:
for(int i = 0; i < sizeC; i++)
for(int j = 0; j < sizeR; j++)
my2Darray[i][j] = get_pixel(i, j, data, sizeC);
But then notice how unnecessary this is. You don't really need a 2D array :) keep it simple and efficient.
The function above could be a part of a struct that represents the Image where you'd have sizeC, sizeR and data defined as members.
struct Image
{
int sizeC;
int sizeR;
double* data;
double get_pixel(int x, int y)
{
return data[x + y*sizeC];
}
};
Then to access the image pixels you can simply do:
Image img;
// read image data and stuff
double p = img.get_pixel(4, 2);
You can even make it look prettier by overloading operator() instead of using get_pixel, so retrieving the pixel would look something like:
double p = img(4, 2);
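One possible shape for that struct, as a sketch only: it swaps the raw double* for a std::vector<double> (my substitution, so the memory is released automatically) and adds a constructor that the original does not have.

```cpp
#include <cstddef>
#include <vector>

// Sketch of an image wrapper over one contiguous, row-major buffer.
struct Image {
    std::size_t sizeC;         // width (number of columns)
    std::size_t sizeR;         // height (number of rows)
    std::vector<double> data;  // sizeR * sizeC values

    Image(std::size_t columns, std::size_t rows)
        : sizeC(columns), sizeR(rows), data(columns * rows, 0.0) {}

    // x is the column, y is the row, matching get_pixel above.
    double& operator()(std::size_t x, std::size_t y) { return data[x + y * sizeC]; }
    double operator()(std::size_t x, std::size_t y) const { return data[x + y * sizeC]; }
};
```

The non-const overload returns a reference, so img(4, 2) = 7.5; writes a pixel and double p = img(4, 2); reads it back.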

Why do I get segfault trying to write a value in an array into a CSV file

I'm trying to read an 8-bit image in binary or raw file format, and put every pixel in a row in a CSV file and include the 12 neighbours within 2 pixels in x, y, z. I started by just trying to write the value of each pixel in a
// ----------------------- Create pointer to hold input values for ml
short p[1308*1308*200][13];
ofstream full_stack;
full_stack.open("full_stack.csv");
int index;
// // ----------------------- for loop execution
for( int x = 0; x < 1308; x++ ) {
for( int y = 0; y < 1308; y++ ) {
for( int z = 0; z < 200; z++ ) {
index = x+1308*y+1308*1308*z;
myData.read(buf, sizeof(buf));
memcpy(&value, buf, sizeof(buf));
p[index][0] = value;
}
}
}
for ( int i = 0; i < 1308*1308*200; i++){
for ( int j = 0; j < 13; j++){
full_stack << p[i][j] << endl;
}
}
full_stack.close();
}
As @Sid_S points out, you're attempting to declare an array of more than 8 gigabytes on the stack (1308*1308*200*13 shorts at 2 bytes each). The default stack in a typical application on a typical machine these days is only a few megabytes (1 MB on Windows, 8 MB on most Linux distributions). You need to allocate the array dynamically using malloc(), new, or a C++ container like std::vector<short>. Given that you have a 2-dimensional array, you'd need to do something like std::vector<std::vector<short>>.
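As an alternative to nested vectors, a single flat std::vector<short> with computed indices keeps everything in one heap allocation. VoxelTable and its members are hypothetical names for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container: one heap allocation holding `features` values
// for every voxel of an nx * ny * nz volume.
struct VoxelTable {
    std::size_t nx, ny, nz, features;
    std::vector<short> values;

    VoxelTable(std::size_t nx_, std::size_t ny_, std::size_t nz_, std::size_t features_)
        : nx(nx_), ny(ny_), nz(nz_), features(features_),
          values(nx_ * ny_ * nz_ * features_, 0) {}

    // Same voxel index as the question: x + nx*y + nx*ny*z.
    short& at(std::size_t x, std::size_t y, std::size_t z, std::size_t feature)
    {
        const std::size_t voxel = x + nx * y + nx * ny * z;
        return values[voxel * features + feature];
    }
};
```

Note that VoxelTable(1308, 1308, 200, 13) still needs about 8.9 GB of memory. Moving the array to the heap removes the stack overflow, but if the machine does not have that much RAM, the data would have to be processed in smaller slabs.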

PGM File Reader Doesn't Read Asymmetric Files

I'm writing a simple PGM file reader for a basic CV idea, and I'm having a weird issue. My method seems to work alright for symmetric files (255 x 255, for example), but when I try to read an asymmetric file (300 x 246), I get some weird input. One file reads to a certain point and then dumps ESCAPE characters (ASCII 27) into the remainder of the image (see below), and others just won't read. I think this might be some flawed logic or a memory issue. Any help would be appreciated.
// Process files of binary type (P5)
else if(holdString[1] == '5') {
// Assign fileType value
fileType = 5;
// Read in comments and discard
getline(fileIN, holdString);
// Read in image Width value
fileIN >> width;
// Read in image Height value
fileIN >> height;
// Read in Maximum Grayscale Value
fileIN >> max;
// Determine byte size if Maximum value is over 256 (1 byte)
if(max < 256) {
// Collection variable for bytes
char readChar;
// Assign image dynamic memory
*image = new int*[height];
for(int index = 0; index < height; index++) {
(*image)[index] = new int[width];
}
// Read in 1 byte at a time
for(int row = 0; row < height; row++) {
for(int column = 0; column < width; column++) {
fileIN.get(readChar);
(*image)[row][column] = (int) readChar;
}
}
// Close the file
fileIN.close();
} else {
// Assign image dynamic memory
// Read in 2 bytes at a time
// Close the file
}
}
Tinkered with it a bit, and came up with at least most of a solution. Using the .read() function, I was able to draw the whole file in and then read it piece by piece into the int array. I kept the dynamic memory because I wanted to draw in files of different sizes, but I did pay more attention to how it was read into the array, so thank you for the suggestion, Mark. The edits seem to work well on files up to 1000 pixels wide or tall, which is fine for what I'm using it for. After, it distorts, but I'll still take that over not reading the file.
if(max < 256) {
// Collection variable for bytes
int size = height * width;
unsigned char* data = new unsigned char[size];
// Assign image dynamic memory
*image = new int*[height];
for(int index = 0; index < height; index++) {
(*image)[index] = new int[width];
}
// Read in 1 byte at a time
fileIN.read(reinterpret_cast<char*>(data), size * sizeof(unsigned char));
// Close the file
fileIN.close();
// Set data to the image
for(int row = 0; row < height; row++) {
for(int column = 0; column < width; column++) {
(*image)[row][column] = (int) data[row*width+column];
}
}
// Delete temporary memory
delete[] data;
}