OpenCV vs byte array - c++

I am working on a simple C++ image processing application and deciding whether to use OpenCV for loading the image and accessing individual pixels.
My current approach is to open the image with fopen, read the 54-byte header, and load the remaining bytes into a char* array.
To access a specific pixel I use
long* q = (long*)(bmpData + x*3 + (bmpSize.height - y - 1) * bmpSize.stride);
To perform a simple color check, e.g. "is blue?":
if ((*q | 0xFF000000) == 0xFFFF0000) //for some reason RGB is reversed to BGR
//do something here
Is OpenCV any faster considering all the function calls, parsing, etc.?

The bitmap file header is indeed 54 bytes, and you can't just skip it. You have to read it to find the width, height and bit count, calculate the row padding if necessary, and pick up any other information you need.
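The padding calculation mentioned above can be sketched as follows. This is a minimal sketch, assuming an uncompressed 24-bit, bottom-up bitmap whose pixel bytes are already in memory; the names bmpStride, pixelAt, isPureBlue and Bgr are made up for illustration and do not come from the question or from any library.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes per row of an uncompressed BMP: every row is padded to a multiple of 4 bytes.
std::size_t bmpStride(std::size_t width, std::size_t bitsPerPixel) {
    return ((width * bitsPerPixel + 31) / 32) * 4;
}

// 24-bit BMP pixels are stored in blue, green, red order.
struct Bgr { std::uint8_t b, g, r; };

// Returns pixel (x, y) of a bottom-up 24-bit image, with y = 0 meaning the top row.
// Assumption: data points at the first pixel byte, i.e. just past the headers.
Bgr pixelAt(const std::uint8_t* data, std::size_t width, std::size_t height,
            std::size_t x, std::size_t y) {
    const std::size_t stride = bmpStride(width, 24);
    const std::uint8_t* p = data + (height - 1 - y) * stride + x * 3;
    return Bgr{p[0], p[1], p[2]};
}

// Byte-wise color check; avoids reading a 4th byte past the pixel.
bool isPureBlue(Bgr c) { return c.b == 0xFF && c.g == 0 && c.r == 0; }
```

Comparing the three bytes individually sidesteps both the byte-order question and the read past the end of the buffer that a 4-byte load on the last pixel would cause.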
Depending on how the file is opened, OpenCV reads the header and then reads the pixels directly into a buffer. The only change is that the rows are flipped so the image is right side up.
cv::Mat mat = cv::imread("filename.bmp", CV_LOAD_IMAGE_COLOR);
uint8_t* data = (uint8_t*)mat.data;
The header checks and the small changes made by OpenCV will not significantly affect performance. The bottleneck is mainly reading the file from disk. The difference in performance will be difficult to measure unless you are doing a very specific task, for example reading only 3 bytes from a very large file without loading the whole file.
OpenCV is overkill for this task, so you may prefer another library, for example CImg as suggested in the comments. Smaller libraries load faster, which might be noticeable when your program starts.
The following code is a test run on Windows.
For a large 16 MB bitmap file, the result is almost identical for OpenCV versus plain C++.
For a small 200 KB bitmap file, the result is 0.00013 seconds to read in plain C++ and 0.00040 seconds for OpenCV. Note that the plain C++ version is not doing much besides reading the bytes.
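#include <chrono>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <iostream>
#include <vector>
#include <windows.h> // BITMAPFILEHEADER, BITMAPINFOHEADER
#include <opencv2/opencv.hpp>
class stopwatch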
{
std::chrono::time_point<std::chrono::system_clock> time_start, time_end;
public:
stopwatch() { reset();}
void reset(){ time_start = std::chrono::system_clock::now(); }
void print(const char* title)
{
time_end = std::chrono::system_clock::now();
std::chrono::duration<double> diff = time_end - time_start;
if(title) std::cout << title;
std::cout << diff.count() << "\n";
}
};
int main()
{
const char* filename = "filename.bmp";
//I use `fake` to prevent the compiler from over-optimization
//and skipping the whole loop. But it may not be necessary here
int fake = 0;
//open the file 100 times
int count = 100;
stopwatch sw;
for(int i = 0; i < count; i++)
{
//plain c++
std::ifstream fin(filename, std::ios::binary);
fin.seekg(0, std::ios::end);
int filesize = (int)fin.tellg();
fin.seekg(0, std::ios::beg);
std::vector<uint8_t> pixels(filesize - 54);
BITMAPFILEHEADER hd;
BITMAPINFOHEADER bi;
fin.read((char*)&hd, sizeof(hd));
fin.read((char*)&bi, sizeof(bi));
fin.read((char*)pixels.data(), pixels.size());
fake += pixels[i];
}
sw.print("time fstream: ");
sw.reset();
for(int i = 0; i < count; i++)
{
//opencv:
cv::Mat mat = cv::imread(filename, CV_LOAD_IMAGE_COLOR);
uint8_t* pixels = (uint8_t*)mat.data;
fake += pixels[i];
}
sw.print("time opencv: ");
printf("show some fake calculation: %d\n", fake);
return 0;
}

Related

How can one write a multi-dimensional vector of image data to an output file?

Question:
Is there a good way to write a 3D float vector of size (9000,9000,4) to an output file in C++?
My C++ program generates a 9000x9000 image matrix with 4 color values (R, G, B, A) for each pixel. I need to save this data as an output file to be read into a numpy.array() (or similar) using python at a later time. Each color value is saved as a float (can be larger than 1.0) which will be normalized in the python portion of the code.
Currently, I am writing the (9000,9000,4) sized vector into a CSV file with 81 million lines and 4 columns. This is slow for reading and writing and it creates large files (~650MB).
NOTE: I run the program multiple times (up to 20) for each trial, so read/write times and file sizes add up.
Current C++ Code:
This is the snippet that initializes and writes the 3D vector.
// initializes the vector with data from 'makematrix' class instance
vector<vector<vector<float>>> colorMat = makematrix->getMatrix();
outfile.open("../output/11_14MidRed9k8.csv",std::ios::out);
if (outfile.is_open()) {
outfile << "r,g,b,a\n"; // writes column labels
for (unsigned int l=0; l<colorMat.size(); l++) { // 0 to 8999
for (unsigned int m=0; m<colorMat[0].size(); m++) { // 0 to 8999
outfile << colorMat[l][m][0] << ',' << colorMat[l][m][1] << ','
<< colorMat[l][m][2] << ',' << colorMat[l][m][3] << '\n';
}
}
}
outfile.close();
Summary:
I am willing to change the output file type, the data structures I used, or anything else that would make this more efficient. Any and all suggestions are welcome!
Use the old C file functions and a binary format:
// assumes colorMat is a flat std::vector<Pixel> (see the tips below), not the nested vector
auto startT = chrono::high_resolution_clock::now();
FILE* f = fopen("example.bin", "wb");
if (f) {
const int imgWidth = 9000;
const int imgHeight = 9000;
fwrite(&imgWidth, sizeof(imgWidth), 1, f);
fwrite(&imgHeight, sizeof(imgHeight), 1, f);
for (unsigned int i=0; i<colorMat.size(); ++i)
{
fwrite(&colorMat[i], sizeof(struct Pixel), 1, f);
}
fclose(f); // only close a file that was actually opened
}
auto endT = chrono::high_resolution_clock::now();
cout << "Time taken : " << chrono::duration_cast<chrono::seconds>(endT-startT).count() << endl;
The format is the following:
[ImageWidth][ImageHeight][RGBA][RGBA][RGBA]... for all ImageWidth * ImageHeight pixels.
Your sample ran in 119s on my machine. This code ran in 2s.
But please note that the file will be huge anyway: 9000 x 9000 pixels at 16 bytes each is roughly 1.3 GB, the equivalent of two 8K images without any kind of compression.
Besides that, some tips on your code:
Don't use a vector of floats to represent each pixel. A pixel won't have more components than RGBA. Instead, create a simple struct with four floats.
You don't need to loop through width and height separately. Internally all lines are stored sequentially, one after the other. It is easier to create a one-dimensional array of width * height elements.
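The two tips above can be combined into a small sketch. It assumes a flat std::vector<Pixel> in row-major order and 32-bit width and height fields; the function names writeImage and readImage are hypothetical and are not part of the original answer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

struct Pixel { float r, g, b, a; };
static_assert(sizeof(Pixel) == 4 * sizeof(float), "Pixel must be tightly packed");

// Writes [width][height][RGBA][RGBA]... as raw binary. Returns true on success.
bool writeImage(const char* path, std::int32_t width, std::int32_t height,
                const std::vector<Pixel>& pixels) {
    if (width < 0 || height < 0 ||
        pixels.size() != static_cast<std::size_t>(width) * height) return false;
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    // The whole pixel buffer goes out in a single fwrite call.
    const bool ok = std::fwrite(&width, sizeof width, 1, f) == 1
                 && std::fwrite(&height, sizeof height, 1, f) == 1
                 && std::fwrite(pixels.data(), sizeof(Pixel), pixels.size(), f) == pixels.size();
    return std::fclose(f) == 0 && ok;
}

// Reads a file produced by writeImage back into memory.
bool readImage(const char* path, std::int32_t& width, std::int32_t& height,
               std::vector<Pixel>& pixels) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    bool ok = std::fread(&width, sizeof width, 1, f) == 1
           && std::fread(&height, sizeof height, 1, f) == 1
           && width >= 0 && height >= 0;
    if (ok) {
        pixels.resize(static_cast<std::size_t>(width) * height);
        ok = std::fread(pixels.data(), sizeof(Pixel), pixels.size(), f) == pixels.size();
    }
    std::fclose(f);
    return ok;
}
```

On the Python side, something along the lines of numpy.fromfile with dtype float32 and an 8-byte offset, followed by a reshape to (height, width, 4), should be able to load this layout, assuming the writing and reading machines share the same byte order.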

Unable to create image bitmap c++

My goal is to analyse image by pixels (to determine color). I want to create bitmap in C++ from image path:
string path = currImg.path;
cout << path << " " << endl;
Then I do some type conversions, which are needed because the Bitmap constructor does not accept a plain string:
wstring path_wstr = wstring(path.begin(), path.end());
const wchar_t* path_wchar_t = path_wstr.c_str();
And finally construct Bitmap:
Bitmap* img = new Bitmap(path_wchar_t);
In debugging mode I see that the Bitmap is just null.
How can I construct a Bitmap so I can scan the photo pixel by pixel and find each pixel's color?
Either Gdiplus::GdiplusStartup was not called and the function fails, or the file name doesn't exist and the function fails. Either way img ends up NULL or unusable.
A wrong file name is likely in the above code because of the incorrect UTF-16 conversion. A raw string-to-wstring copy works only if the source is ASCII. This is very likely to fail on non-English systems (it can easily fail even on English systems). Use MultiByteToWideChar instead. Ideally, use UTF-16 from the start (though that is a bit difficult in a console program).
int main()
{
Gdiplus::GdiplusStartupInput tmp;
ULONG_PTR token;
Gdiplus::GdiplusStartup(&token, &tmp, NULL);
test_gdi();
Gdiplus::GdiplusShutdown(token);
return 0;
}
Test to make sure the function succeeded before going further.
void test_gdi()
{
std::string str = "c:\\path\\filename.bmp";
int size = MultiByteToWideChar(CP_ACP, 0, str.c_str(), -1, 0, 0);
std::wstring u16(size, 0);
MultiByteToWideChar(CP_ACP, 0, str.c_str(), -1, &u16[0], size);
Gdiplus::Bitmap* bmp = new Gdiplus::Bitmap(u16.c_str());
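if (!bmp)
return; //print error
if (bmp->GetLastStatus() != Gdiplus::Ok)
{
delete bmp;
return; //print error: the file could not be loaded
}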
int w = bmp->GetWidth();
int h = bmp->GetHeight();
for (int y = 0; y < h; y++)
for (int x = 0; x < w; x++)
{
Gdiplus::Color clr;
bmp->GetPixel(x, y, &clr);
auto red = clr.GetR();
auto grn = clr.GetG();
auto blu = clr.GetB();
}
delete bmp;
}
First you need to define the headers of the bitmap file format and read them from the file.
The pixel data follows the headers. The headers also contain the offset at which the pixel data starts.
You can then read the pixel data in one go, calculating its size from the width, height and bytes per pixel.
You also need to account for the padding at the end of each row, which is present whenever the row size in bytes is not a multiple of four.
Basically, you need to write a bitmap image parser.
Make sure to open the bitmap file in binary mode.
More info here:
https://en.wikipedia.org/wiki/BMP_file_format
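The parsing steps described above might look roughly like this. It is a minimal sketch that assumes the common 14-byte file header followed by a 40-byte BITMAPINFOHEADER, with the whole file already loaded into a byte vector; BmpInfo and parseBmpHeader are illustrative names, not an existing API.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct BmpInfo {
    std::uint32_t dataOffset;   // bfOffBits: where the pixel data starts
    std::int32_t  width;        // biWidth
    std::int32_t  height;       // biHeight (positive means rows are stored bottom-up)
    std::uint16_t bitsPerPixel; // biBitCount
    std::size_t   stride;       // bytes per row, including padding
};

// BMP fields are little-endian; assemble them byte by byte so host byte order does not matter.
static std::uint16_t readLE16(const std::uint8_t* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
static std::uint32_t readLE32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0]) | (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Fills out from the first 54 bytes of the file. Returns false if this is not a usable BMP.
bool parseBmpHeader(const std::vector<std::uint8_t>& file, BmpInfo& out) {
    if (file.size() < 54 || file[0] != 'B' || file[1] != 'M') return false;
    out.dataOffset   = readLE32(&file[10]);
    out.width        = static_cast<std::int32_t>(readLE32(&file[18]));
    out.height       = static_cast<std::int32_t>(readLE32(&file[22]));
    out.bitsPerPixel = readLE16(&file[28]);
    if (out.width <= 0 || out.height == 0) return false;
    // Each row is padded to a multiple of 4 bytes.
    out.stride = ((static_cast<std::size_t>(out.width) * out.bitsPerPixel + 31) / 32) * 4;
    return true;
}
```

With these values, the pixel rows start at dataOffset and each row occupies stride bytes.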

GL Screenshot Breaks on viewport resize…sometimes

I’m developing a plugin for SIMDIS (basically military google earth), written in c++ using VS 2012. It’s a pretty nifty little thing to auto plot points, and one of its functions is to take a series of screenshots of the view-port and save the images off so they can be used/processed somewhere else. This works fine too… until you re-size the view-port one too many times. Re-size is done by clicking the corner of the window and dragging it bigger and smaller, and the program may launch full screen or windowed mode; either way it works fine the first few sets… or as long as the window is not re-sized.
When it breaks, the program will still march happily along, create the files, and fill them with data at what seems to be an appropriate size for whatever resolution image I’m trying to generate… but the format becomes no-good. It will still be a *.bmp, but windows stops being able to understand it. No errors are thrown though, (I think, I’m not catching any GL errors?[if that’s possible?]).
I can’t get it to consistently happen with a specific number of actions, but it seems to start failing after 3-7 view-port re-sizes. I don’t know if this is a problem with my screenshot code, an issue with the SIMDIS program or plugin, a GL issue, or what. I’ve tested it on multiple machines.
Has anyone run into this problem before? Is there something specific I should be doing that I’m not? Is this a problem native to the parent program (SIMDIS), or something I can work with/around with GL commands I don’t know about?
Screenshot code follows:
#include "TakeScreenshot.h" //has "#include <gl/GL.h>" etc...
TakeScreenshot::TakeScreenshot()
{
}
std::vector<int> * TakeScreenshot::TakeAScreenshotBMP(const char* filename)
{
//std::cout << "Screenshot! ";
std::vector<int> * returnVec = new std::vector<int>();
int VPort[4] = {0,0,0,0};
int FSize = 0;
int PackStore = 0;
//get GL viewport dimensions, x,y,w,h into vport
glGetIntegerv(GL_VIEWPORT,VPort);
//make a framebuffer, RGB
FSize = VPort[2]*VPort[3]*3;
unsigned char PStore[8294400];// 4k sized buffer
//store settings
glGetIntegerv(GL_PACK_ALIGNMENT, &PackStore);
//unpack to byte order
glPixelStorei(GL_PACK_ALIGNMENT, 1);
//read the gl buffer into our buffer
glReadPixels(VPort[0],VPort[1],VPort[2],VPort[3],GL_RGB,GL_UNSIGNED_BYTE,&PStore);
//Pass back settings
glPixelStorei(GL_PACK_ALIGNMENT, PackStore);
///
//set up file info
///
BITMAPINFOHEADER BMIH; //info header
BMIH.biSize = sizeof(BITMAPINFOHEADER);
BMIH.biSizeImage= VPort[2] * VPort[3] * 3;
BMIH.biWidth = VPort[2];
BMIH.biHeight = VPort[3];
BMIH.biPlanes = 1;
BMIH.biBitCount = 24;
BMIH.biCompression = BI_RGB;
BITMAPFILEHEADER bmfh;//file header
int nBitsOffset = sizeof(BITMAPFILEHEADER) + BMIH.biSize;
LONG lImageSize = BMIH.biSizeImage;
LONG lFileSize = nBitsOffset + lImageSize;
bmfh.bfType = 'B' + ('M'<<8);
bmfh.bfOffBits = nBitsOffset;
bmfh.bfSize = lFileSize;
bmfh.bfReserved1 = bmfh.bfReserved2 = 0;
// swap r and b values because GL has them backwards for BMP format.
unsigned char SwapByte;
for(int loop = 0; loop<FSize; loop+=3)
{
SwapByte = PStore[loop];
PStore[loop] = PStore[loop+2];
PStore[loop +2] = SwapByte;
}
///
// File writing section
///
FILE *pFile;
pFile = fopen(filename, "wb");
//if something borked
if(pFile == NULL)
{
std::cout << "TakeScreenshot::TakeAScreenshotBMP>> Error; was not able to create file (Permisions?)" << std::endl;
returnVec->push_back(-1);
returnVec->push_back(-1);
return returnVec; //exit
}
UINT nWrittenFileHeaderSize = fwrite(&bmfh,1,sizeof(BITMAPFILEHEADER), pFile);
UINT nWrittenInfoHeaderSize = fwrite(&BMIH,1,sizeof(BITMAPINFOHEADER), pFile);
UINT nWrittenDIBDataSize = fwrite(&PStore, 1, lImageSize, pFile);
fclose(pFile);
//some return data for processing later
returnVec->push_back(VPort[2]);
returnVec->push_back(VPort[3]);
return returnVec;
}
TakeScreenshot::~TakeScreenshot(void)
{
}
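What follows is not a confirmed diagnosis of the problem described above, only a sketch of things worth ruling out in the listing. The fixed 8294400-byte stack buffer holds exactly 1920 x 1440 RGB pixels, so a larger viewport would overflow it. The pixels are read with GL_PACK_ALIGNMENT set to 1, while a 24-bit BMP expects every row padded to a multiple of 4 bytes, so the file would only come out well-formed when the viewport width happens to be a multiple of 4. Several BITMAPINFOHEADER fields (biClrUsed, for example) are also never assigned. The helper below, whose name packedRgbToBmpRows is hypothetical, shows one way to build a heap-allocated, correctly padded BGR buffer from tightly packed RGB rows.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts tightly packed RGB rows (as read with GL_PACK_ALIGNMENT = 1) into
// BGR rows padded to a multiple of 4 bytes, the layout a 24-bit BMP expects.
// Returns an empty vector if the input is too small for the given dimensions.
std::vector<std::uint8_t> packedRgbToBmpRows(const std::vector<std::uint8_t>& rgb,
                                             std::size_t width, std::size_t height) {
    const std::size_t srcStride = width * 3;
    const std::size_t dstStride = (srcStride + 3) / 4 * 4;
    if (rgb.size() < srcStride * height) return {};
    std::vector<std::uint8_t> out(dstStride * height, 0); // padding bytes stay zero
    for (std::size_t y = 0; y < height; ++y) {
        const std::uint8_t* src = &rgb[y * srcStride];
        std::uint8_t* dst = &out[y * dstStride];
        for (std::size_t x = 0; x < width; ++x) {
            dst[x * 3 + 0] = src[x * 3 + 2]; // blue
            dst[x * 3 + 1] = src[x * 3 + 1]; // green
            dst[x * 3 + 2] = src[x * 3 + 0]; // red
        }
    }
    return out;
}
```

If this route is taken, biSizeImage and the file size would be derived from the size of the returned buffer instead of from width * height * 3, and the header structs would be zero-initialized before the individual fields are set.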

Export buffer to WAV in C++

I have a simple program that creates a single cycle sine wave and puts the float numbers to a buffer. Then this is exported to a text file.
But I want to be able to export it to a WAV file (24 bit). Is there a simple way of doing it like on the text file?
Here is the code I have so far:
#include <iostream>
#include <fstream>
#include <cmath>
using namespace std;
int main ()
{
long double pi = 3.14159265359; // Declaration of PI
ofstream textfile; // Text object
textfile.open("sine.txt"); // Creating the txt
double samplerate = 44100.00; // Sample rate
double frequency = 200.00; // Frequency
int bufferSize = (1/frequency)*samplerate; // Buffer size
double buffer[bufferSize]; // Buffer
for (int i = 0; i <= (1/frequency)*samplerate; ++i) // Single cycle
{
buffer[i] = sin(frequency * (2 * pi) * i / samplerate); // Putting into buffer the float values
textfile << buffer[i] << endl; // Exporting to txt
}
textfile.close(); // Closing the txt
return 0; // Success
}
First you need to open the stream for binary.
ofstream stream;
stream.open("sine.wav", ios::out | ios::binary);
Next you'll need to write out a wave header. You can search to find the details of the wave file format. The important bits are the sample rate, bit depth, and length of the data.
int bufferSize = (1/frequency)*samplerate;
stream.write("RIFF", 4); // RIFF chunk
write32(stream, 36 + bufferSize*sizeof(int)); // RIFF chunk size in bytes
stream.write("WAVE", 4); // WAVE chunk
stream.write("fmt ", 4); // fmt chunk
write32(stream, 16); // size of fmt chunk
write16(stream, 1); // Format = PCM
write16(stream, 1); // # of Channels
write32(stream, samplerate); // Sample Rate
write32(stream, samplerate*sizeof(int)); // Byte rate
write16(stream, sizeof(int)); // Frame size
write16(stream, 24); // Bits per sample
stream.write("data", 4); // data chunk
write32(stream, bufferSize*sizeof(int)); // data chunk size in bytes
Now that the header is out of the way, you'll just need to modify your loop to first convert the double (-1.0,1.0) samples into 32-bit signed int. Truncate the bottom 8-bits since you only want 24-bit and then write out the data. Just so you know, it is common practice to store 24-bit samples inside of a 32-bit word because it is much easier to stride through using native types.
for (int i = 0; i < bufferSize; ++i) // Single cycle
{
double tmp = sin(frequency * (2 * pi) * i / samplerate);
int intVal = (int)(tmp * 2147483647.0) & 0xffffff00;
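stream.write(reinterpret_cast<const char*>(&intVal), sizeof(intVal)); // raw bytes; operator<< would write the number as text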
}
A couple other things:
1) I don't know how you weren't overflowing buffer by using the <= in your loop. I changed it to a <.
2) Again regarding the buffer size. I'm not sure if you are aware, but you can't represent a repeating waveform with a single cycle for all frequencies. What I mean is that for most frequencies, if you use this code and expect to play the waveform on repeat, you're going to hear a glitch on every cycle. It works for nicely synchronous frequencies such as 1 kHz at a 48 kHz sample rate, because there are exactly 48 samples per cycle and the wave comes back around to exactly the same phase. 999.9 Hz will be a different story though.

How do I read JPEG and PNG pixels in C++ on Linux?

I'm doing some image processing, and I'd like to read each pixel value individually in JPEG and PNG images.
In my deployment scenario, it would be awkward for me to use a 3rd party library (as I have restricted access on the target computer), but I'm assuming that there's no standard C or C++ library for reading JPEG/PNG...
So, if you know of a way of not using a library then great, if not then answers are still welcome!
The C standard does not include a library for reading these file formats.
However, most programs, especially on the Linux platform, use the same libraries to decode these image formats:
for JPEG it's libjpeg, and for PNG it's libpng.
The chances that these libraries are already installed are very high.
http://www.libpng.org
http://www.ijg.org
This is a small routine I dug out of 10-year-old source code (using libjpeg):
#include <stdio.h> /* must come before jpeglib.h, which relies on FILE and size_t */
#include <jpeglib.h>
int loadJpg(const char* Name) {
unsigned char a, r, g, b;
int width, height;
struct jpeg_decompress_struct cinfo;
struct jpeg_error_mgr jerr;
FILE * infile; /* source file */
JSAMPARRAY pJpegBuffer; /* Output row buffer */
int row_stride; /* physical row width in output buffer */
if ((infile = fopen(Name, "rb")) == NULL) {
fprintf(stderr, "can't open %s\n", Name);
return 0;
}
cinfo.err = jpeg_std_error(&jerr);
jpeg_create_decompress(&cinfo);
jpeg_stdio_src(&cinfo, infile);
(void) jpeg_read_header(&cinfo, TRUE);
(void) jpeg_start_decompress(&cinfo);
width = cinfo.output_width;
height = cinfo.output_height;
unsigned char * pDummy = new unsigned char [width*height*4];
unsigned char * pTest = pDummy;
if (!pDummy) {
printf("NO MEM FOR JPEG CONVERT!\n");
return 0;
}
row_stride = width * cinfo.output_components;
pJpegBuffer = (*cinfo.mem->alloc_sarray)
((j_common_ptr) &cinfo, JPOOL_IMAGE, row_stride, 1);
while (cinfo.output_scanline < cinfo.output_height) {
(void) jpeg_read_scanlines(&cinfo, pJpegBuffer, 1);
for (int x = 0; x < width; x++) {
a = 0; // alpha value is not supported on jpg
r = pJpegBuffer[0][cinfo.output_components * x];
if (cinfo.output_components > 2) {
g = pJpegBuffer[0][cinfo.output_components * x + 1];
b = pJpegBuffer[0][cinfo.output_components * x + 2];
} else {
g = r;
b = r;
}
*(pDummy++) = b;
*(pDummy++) = g;
*(pDummy++) = r;
*(pDummy++) = a;
}
}
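(void) jpeg_finish_decompress(&cinfo);
jpeg_destroy_decompress(&cinfo);
fclose(infile); /* close only after libjpeg has finished with the file */
/* BMap, Height, Width and Depth belong to the original program and are not shown here */
BMap = (int*)pTest;
Height = height;
Width = width;
Depth = 32;
return 1; /* success; the failure paths above return 0 */
}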
For jpeg, there is already a library called libjpeg, and there is libpng for png. The good news is that they compile right in and so target machines will not need dll files or anything. The bad news is they are in C :(
Also, don't even think of trying to read the files yourself. If you want an easy-to-read format, use PPM instead.
Unfortunately, the JPEG format is compressed, so you would have to decompress the image before reading individual pixels. This is a non-trivial task. If you can't use a library, you may want to refer to one to see how it decompresses the image. One open-source option is CImg, hosted on SourceForge.
Since it could use the exposure, I'll mention one other library to investigate: The IM Toolkit, which is hosted at Sourceforge. It is cross platform, and abstracts the file format completely away from the user, allowing an image to be loaded and processed without worrying about most of the details. It does support both PNG and JPEG out of the box, and can be extended with other import filters if needed.
It comes with a large collection of image processing operators as well...
It also has a good quality binding to Lua.
As Nils pointed out, there is no such thing as a C or C++ standard library for JPEG compression and image manipulation.
If you are able to use a third-party library, you may want to try GDAL, which supports JPEG, PNG and dozens of other formats, compressions and media.
Here is a simple example that shows how to read pixel data from a JPEG file using the GDAL C++ API:
#include <gdal_priv.h>
#include <cassert>
#include <iostream>
#include <string>
#include <vector>
int main()
{
GDALAllRegister(); // once per application
// Assume 3-band image with 8-bit per pixel per channel (24-bit depth)
std::string const file("/home/mloskot/test.jpg");
// Open file with image data
GDALDataset* ds = static_cast<GDALDataset*>(GDALOpen(file.c_str(), GA_ReadOnly));
assert(0 != ds);
// Example 1 - Read multiple bands at once, assume 8-bit depth per band
{
int const ncols = ds->GetRasterXSize();
int const nrows = ds->GetRasterYSize();
int const nbands = ds->GetRasterCount();
int const nbpp = GDALGetDataTypeSize(GDT_Byte) / 8;
std::vector<unsigned char> data(ncols * nrows * nbands * nbpp);
CPLErr err = ds->RasterIO(GF_Read, 0, 0, ncols, nrows, &data[0], ncols, nrows, GDT_Byte, nbands, 0, 0, 0, 0);
assert(CE_None == err);
// ... use data
}
// Example 2 - Read band 1 only, scanline by scanline, assume 8-bit depth per band
{
GDALRasterBand* band1 = ds->GetRasterBand(1);
assert(0 != band1);
int const ncols = band1->GetXSize();
int const nrows = band1->GetYSize();
int const nbpp = GDALGetDataTypeSize(GDT_Byte) / 8;
std::vector<unsigned char> scanline(ncols * nbpp);
for (int i = 0; i < nrows; ++i)
{
CPLErr err = band1->RasterIO(GF_Read, 0, i, ncols, 1, &scanline[0], ncols, 1, GDT_Byte, 0, 0); // row offset i selects the current scanline
assert(CE_None == err);
// ... use scanline
}
}
GDALClose(ds); // release the dataset
return 0;
}
There is a more complete GDAL API tutorial available.
I've had good experiences with the DevIL library. It supports a wide range of image formats and follows a function-style very similar to OpenGL.
Granted, it is a library, but it's definitely worth a try.
Since the other answers already mention that you will most likely need to use a library, take a look at ImageMagick and see if it is possible to do what you need it to do. It comes with a variety of different ways to interface with the core functionality of ImageMagick, including libraries for almost every single programming language available.
Homepage: ImageMagick
If speed is not a problem, you can try LodePNG, which takes a very minimalist approach to PNG loading and saving.
Or even go with picoPNG from the same author, which is a self-contained PNG loader in a single function.