Well greetings to you all :)
A few days ago I finally managed to create a functional C++ class to make .bmp images. Even though it's functional (no errors yet), it isn't efficient in terms of speed (in my opinion). Doing a few tests to see how much time it took to write different sizes of images, I ended up with these results:
Image Dimensions    Time taken (in seconds)    Comparison to the 1000x1000 image
10x100              0.0491                     x 1000 = 49.1 seconds
100x100             0.2471                     x 100  = 24.7 seconds
100x1000            2.3276                     x 10   = 23.3 seconds
1000x1000           22.515                     x 1    = 22.5 seconds
1000x10000          224.76                     / 10   = 22.4 seconds
For example, the 10x100 image had 1000 pixels (each with an ARGB value [32 bits or 4 bytes]) plus the 54 bytes for the header, so it took 0.05 seconds to write 4054 bytes (chars).
I feel this is super slow, because my computer can copy a ~85MB file in a second or two. I'm using fstream to do the writing to disk, and any help to make the class go faster is appreciated. Thank You!!!
My class is called SimpleBMP and here it is (I only included the relevant functions):
#include <fstream>
#include <string>

class SimpleBMP{
    struct PIXEL{
        unsigned char A, R, G, B;
    } *PixelArray;
    unsigned char *BMPHEADER, *BMPINFOHEADER;
    std::string DATA;
    unsigned int Size_Of_BMP, Size_Of_PixelArray;
    int BMP_Width, BMP_Height;
public:
    void SetPixel(int Column, int Row, unsigned char A, unsigned char R, unsigned char G, unsigned char B){
        PixelArray[(Row*BMP_Width)+Column].A = A;
        PixelArray[(Row*BMP_Width)+Column].R = R;
        PixelArray[(Row*BMP_Width)+Column].G = G;
        PixelArray[(Row*BMP_Width)+Column].B = B;
    }
    bool MakeImage(std::string Name){
        Name.append(".bmp");
        std::ofstream OffFile(Name, std::ios::out|std::ios::binary);
        if(OffFile.is_open()){
            DATA.clear();
            for(int temp = 0; temp < 14; temp++){
                BMPHEADER[temp] = 0x00;
            }
            BMPHEADER[0] = 'B';
            BMPHEADER[1] = 'M';
            BMPHEADER[2] = Size_Of_BMP;
            BMPHEADER[3] = (Size_Of_BMP >> 8);
            BMPHEADER[4] = (Size_Of_BMP >> 16);
            BMPHEADER[5] = (Size_Of_BMP >> 24);
            BMPHEADER[10] = 0x36;
            for(int temp = 0; temp < 40; temp++){
                BMPINFOHEADER[temp] = 0x00;
            }
            BMPINFOHEADER[0] = 0x28;
            for(int temp = 0; temp < 4; temp++){
                BMPINFOHEADER[temp+4] = (BMP_Width >> (temp*8));
            }
            for(int temp = 0; temp < 4; temp++){
                BMPINFOHEADER[temp+8] = (BMP_Height >> (temp*8));
            }
            BMPINFOHEADER[12] = 0x01;
            BMPINFOHEADER[14] = 0x20;
            for(int temp = 0; temp < 4; temp++){
                BMPINFOHEADER[temp+20] = (Size_Of_PixelArray >> (temp*8));
            }
            BMPINFOHEADER[24] = 0x13;
            BMPINFOHEADER[25] = 0x0b;
            BMPINFOHEADER[28] = 0x13;
            BMPINFOHEADER[29] = 0x0b;
            for(int temp = 0; temp < 14; temp++){
                DATA.push_back(BMPHEADER[temp]);
            }
            for(int temp = 0; temp < 40; temp++){
                DATA.push_back(BMPINFOHEADER[temp]);
            }
            for(int temp = 0; temp < (Size_Of_PixelArray/4); temp++){
                DATA.push_back(PixelArray[temp].B);
                DATA.push_back(PixelArray[temp].G);
                DATA.push_back(PixelArray[temp].R);
                DATA.push_back(PixelArray[temp].A);
            }
            OffFile.write(DATA.c_str(), Size_Of_BMP);
            OffFile.close();
            return true;
        }
        else
            return false;
    }
};
When running timing tests you should compile your project in release mode (for example, -O2 -DNDEBUG with g++, or the Release configuration in Visual Studio). Debug mode in most environments introduces additional checks and code, and the debug libraries linked can include extra checks such as bounds checking and iterator validation. All of this can introduce performance hits that a release build does not have.
There are other optimizations that you can apply, such as reserving memory in DATA before filling it; this reduces the number of reallocations and copies made as the buffer grows. Although the performance gain may not be significant, it can definitely help. I suggest running your code through a profiler to see where the bottlenecks actually are and optimizing accordingly.
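As a rough illustration, here is a minimal standalone sketch of the reserve() idea; the sizes are hypothetical and the loop stands in for the per-channel push_back calls in MakeImage:
#include <string>

int main(){
    // Hypothetical sizes for a 1000x1000 ARGB image: 54 header bytes + 4 bytes per pixel.
    const unsigned int Size_Of_PixelArray = 1000 * 1000 * 4;
    const unsigned int Size_Of_BMP = 54 + Size_Of_PixelArray;
    std::string DATA;
    DATA.reserve(Size_Of_BMP);   // one allocation up front, no regrowth while appending
    for(unsigned int i = 0; i < Size_Of_PixelArray; i++)
        DATA.push_back(static_cast<char>(i & 0xFF));   // stand-in for the per-channel pushes
    return 0;
}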
If you know you are on a little-endian machine, you can skip the re-packing of the data entirely and, after writing the two headers, write the PixelArray data directly:
OffFile.write((char *)PixelArray, Size_Of_PixelArray);
It may not be quite as portable, but it will certainly speed up the saving to file.
(And you could have
#ifdef LITTLE_ENDIAN
struct PIXEL{
    unsigned char A, R, G, B;
};
#else
struct PIXEL{
    unsigned char B, G, R, A;
};
#endif
PIXEL *PixelArray;
in the declaration.)
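Note that LITTLE_ENDIAN is not a macro every compiler defines for you under that exact name. A sketch of one way to set it up, assuming GCC or Clang (which predefine __BYTE_ORDER__) with MSVC treated as little-endian:
// Note: system headers such as <endian.h> may already define a constant named
// LITTLE_ENDIAN, so a project-specific name is safer in real code.
#if defined(_MSC_VER) || \
    (defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define LITTLE_ENDIAN
#endif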
Related
I am trying to do a simple image processing filter where the pixel values are divided by two to reduce the intensity, and I am developing the hardware for it, so I am using Vivado HLS to generate the IP. As explained here https://forums.xilinx.com/t5/High-Level-Synthesis-HLS/Float-numbers-with-hls-stream/m-p/942747, to send floating-point numbers in an hls stream a union needs to be used, and I did the same. However, the results don't match for the red and green components of the image, whereas they do match for the blue component. It is a very simple algorithm where a pixel value is divided by two.
I have been trying to resolve it but I am not able to see where the problem is. I have attached all the files below; can someone help me resolve it?
////header file
#include "ap_fixed.h"
#include "hls_stream.h"

typedef union {
    unsigned int i;
    float r;
    float g;
    float b;
} conv;

typedef hls::stream<unsigned int> Stream_t;
void ftest(Stream_t& Sin, Stream_t& Sout);
////testbench
#include "stream_check_h.hpp"

int main()
{
    Mat img_rev = imread("C:/Users/20181217/Desktop/images/imgs/output_fwd_v3.png"); // (256x512)
    Mat final_img(img_rev.rows, img_rev.cols, CV_8UC3);
    Mat ref_img(img_rev.rows, img_rev.cols, CV_8UC3);
    Stream_t S1, S2;
    int err_r = 0;
    int err_g = 0;
    int err_b = 0;
    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            conv c;
            c.r = (float)img_rev.at<Vec3b>(i,j)[0];
            c.g = (float)img_rev.at<Vec3b>(i,j)[1];
            c.b = (float)img_rev.at<Vec3b>(i,j)[2];
            S1 << c.i;
        }
    }
    ftest(S1, S2);
    conv c;
    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            S2 >> c.i;
            final_img.at<Vec3b>(i,j)[0] = (unsigned char)c.r;
            final_img.at<Vec3b>(i,j)[1] = (unsigned char)c.g;
            final_img.at<Vec3b>(i,j)[2] = (unsigned char)c.b;
            ref_img.at<Vec3b>(i,j)[0] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[0])/2.0);
            ref_img.at<Vec3b>(i,j)[1] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[1])/2.0);
            ref_img.at<Vec3b>(i,j)[2] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[2])/2.0);
        }
    }
    Mat diff;
    diff = abs(final_img - ref_img);
    cout << diff;
    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            if((int)diff.at<Vec3b>(i,j)[0] > 0)
            {
                err_r++;
                cout << "expected value: " << (int)ref_img.at<Vec3b>(i,j)[0] << ", final_value: " << (int)final_img.at<Vec3b>(i,j)[0] << ", actual value:" << (int)img_rev.at<Vec3b>(i,j)[0] << endl;
            }
            if((int)diff.at<Vec3b>(i,j)[1] > 0)
                err_g++;
            if((int)diff.at<Vec3b>(i,j)[2] > 0)
                err_b++;
        }
    }
    cout << "number of errors: " << err_r << ", " << err_g << ", " << err_b;
    return 0;
}

////core
#include "stream_check_h.hpp"

void ftest(Stream_t& Sin, Stream_t& Sout)
{
    conv cin, cout;
    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            Sin >> cin.i;
            cout.r = cin.r/2.0;
            cout.g = cin.g/2.0;
            cout.b = cin.b/2.0;
            Sout << cout.i;
        }
    }
}
When I debugged, it showed that the blue components of the pixels match. For one red pixel it showed me the following:
expected value: 22, final_value: 14, actual value:45
and the total errors for red, green, and blue are:
number of errors: 126773, 131072, 0
I am not able to see why it is going wrong for red and green. I posted here hoping a fresh set of eyes would help spot the problem.
Thanks in advance
I'm assuming you're using a 32-bit-wide stream carrying three 8-bit unsigned RGB channels per pixel (CV_8UC3). I believe the problem with the union type in your case is that its three float members overlap in memory (unlike the single float value in the example you cite): r, g and b are all aliases for the same four bytes, so writing one of them overwrites the others, and the division effectively operates on a single value rather than three separate channels.
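Here is a minimal standalone sketch of that aliasing (the union mirrors the one from the question; reading the other members this way is itself type-punning, but it makes the overlap visible):
#include <cstdio>

typedef union {
    unsigned int i;
    float r;
    float g;
    float b;
} conv;

int main()
{
    conv c;
    c.r = 45.0f;   // written first...
    c.g = 123.0f;  // ...overwritten here...
    c.b = 22.0f;   // ...and overwritten again: all three members share the same 4 bytes
    std::printf("r=%f g=%f b=%f\n", c.r, c.g, c.b);  // in practice prints 22 for all three
    return 0;
}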
A possible workaround I quickly came up with is to cast the unsigned int you're getting from the stream into an ap_uint<32>, chop it into the R, G, B chunks (with the range() method), divide, then assemble the result and stream it back out.
unsigned int packet;
Sin >> packet;
ap_uint<32> packet_uint32 = *((ap_uint<32>*)&packet); // casting (not elegant, but works)
ap_uint<8> b = packet_uint32.range(7, 0);   // unsigned, since the channel values are 0-255
ap_uint<8> g = packet_uint32.range(15, 8);
ap_uint<8> r = packet_uint32.range(23, 16); // in case they are in the wrong bit range/order, just flip the r, g, b assignments
b /= 2;
g /= 2;
r /= 2;
packet_uint32.range(7, 0) = b;
packet_uint32.range(15, 8) = g;
packet_uint32.range(23, 16) = r;
packet = packet_uint32.to_int();
Sout << packet;
NOTE: I've reused the same variables in the code above; HLS shouldn't complain about it and should still come out with a good RTL. In case it doesn't, just create new ones.
I'm trying to fill a buffer for a mono wave file with two different frequencies but failing at it. I'm using CLion on Ubuntu 18.04.
I know the buffer size is equal to duration*sample_rate, so I'm creating an int16_t buffer with that size. I tried filling it with one note first:
for(int i = 0; i < frame_total; i++)
    audio[i] = static_cast<int16_t>(128 + 127 * sin(i));

which generated a nice long beeep. Then I changed it to:

for(int i = 0; i < frame_total; i++)
    audio[i] = static_cast<int16_t>(128 + 127 * sin(i*2));

which generated a higher beeeep, but when trying to do the following:

for(int i = 0; i < frame_total/2; i++)
    audio[i] = static_cast<int16_t>(128 + 127 * sin(i*2));
for(int i = frame_total/2; i < frame_total; i++)
    audio[i] = static_cast<int16_t>(128 + 127 * sin(i));

I expected it to write the higher beep in the first half of the audio and fill the other half with the "normal" beep, but the *.wav file just plays the first note the entire time.
#define FORMAT_AUDIO 1
#define FORMAT_SIZE 16

struct wave_header{
    // Header
    char riff[4];
    int32_t file_size;
    char wave[4];
    // Format
    char fmt[4];
    int32_t format_size;
    int16_t format_audio;
    int16_t num_channels;
    int32_t sample_rate;
    int32_t byte_rate;
    int16_t block_align;
    int16_t bits_per_sample;
    // Data
    char data[4];
    int32_t data_size;
};
void write_header(ofstream &music_file, int16_t bits, int32_t samples, int32_t duration){
    wave_header wav_header{};
    int16_t channels_quantity = 1;
    int32_t total_data = duration * samples * channels_quantity * bits/8;
    int32_t file_data = 4 + 8 + FORMAT_SIZE + 8 + total_data;
    wav_header.riff[0] = 'R';
    wav_header.riff[1] = 'I';
    wav_header.riff[2] = 'F';
    wav_header.riff[3] = 'F';
    wav_header.file_size = file_data;
    wav_header.wave[0] = 'W';
    wav_header.wave[1] = 'A';
    wav_header.wave[2] = 'V';
    wav_header.wave[3] = 'E';
    wav_header.fmt[0] = 'f';
    wav_header.fmt[1] = 'm';
    wav_header.fmt[2] = 't';
    wav_header.fmt[3] = ' ';
    wav_header.format_size = FORMAT_SIZE;
    wav_header.format_audio = FORMAT_AUDIO;
    wav_header.num_channels = channels_quantity;
    wav_header.sample_rate = samples;
    wav_header.byte_rate = samples * channels_quantity * bits/8;
    wav_header.block_align = static_cast<int16_t>(channels_quantity * bits / 8);
    wav_header.bits_per_sample = bits;
    wav_header.data[0] = 'd';
    wav_header.data[1] = 'a';
    wav_header.data[2] = 't';
    wav_header.data[3] = 'a';
    wav_header.data_size = total_data;
    music_file.write((char*)&wav_header, sizeof(wave_header));
}
int main(int argc, char const *argv[]) {
    int16_t bits = 8;
    int32_t samples = 44100;
    int32_t duration = 4;
    ofstream music_file("music.wav", ios::out | ios::binary);
    int32_t frame_total = samples * duration;
    auto* audio = new int16_t[frame_total];
    for(int i = 0; i < frame_total/2; i++)
        audio[i] = static_cast<int16_t>(128 + 127 * sin(i*2));
    for(int i = frame_total/2; i < frame_total; i++)
        audio[i] = static_cast<int16_t>(128 + 127 * sin(i));
    write_header(music_file, bits, samples, duration);
    music_file.write(reinterpret_cast<char*>(audio), sizeof(int16_t)*frame_total);
    return 0;
}
There are two major issues with your code.
The first one is that you may be writing an invalid header depending on your compiler settings and environment.
The reason is that the wave_header struct is not packed in memory, and may contain padding between members. Therefore, when you do:
music_file.write((char*)&wav_header, sizeof(wave_header));
It may write something that isn't a valid WAV header. Even if you are lucky enough to get exactly the bytes you wanted, it is a good idea to fix this, because the layout may change at any moment and it certainly isn't portable.
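One way to rule the padding out (a sketch; the pragma form below is accepted by GCC, Clang and MSVC) is to pack the struct so its in-memory layout matches the on-disk WAV header byte for byte:
#include <cstdint>

#pragma pack(push, 1)   // no compiler-inserted padding while this is in effect
struct wave_header{
    char riff[4];
    int32_t file_size;
    char wave[4];
    char fmt[4];
    int32_t format_size;
    int16_t format_audio;
    int16_t num_channels;
    int32_t sample_rate;
    int32_t byte_rate;
    int16_t block_align;
    int16_t bits_per_sample;
    char data[4];
    int32_t data_size;
};
#pragma pack(pop)

static_assert(sizeof(wave_header) == 44, "WAV header must be exactly 44 bytes");
With these particular members there happens to be little or no padding on common compilers anyway, but packing plus the static_assert makes the guarantee explicit instead of relying on luck.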
The second issue is that the call to write the actual wave:
music_file.write(reinterpret_cast<char*>(audio),sizeof(char)*frame_total);
Is writing exactly half the amount of data you are expecting. The actual size of the data pointed by audio is sizeof(int16_t) * frame_total.
This explains why you are only hearing the first part of the wave you wrote.
This was solved by changing the buffer (audio) from int16_t to int8_t since I was trying to write 8bit audio.
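For reference, here is a sketch of what that resolution can look like in main, with one assumption flagged: 8-bit WAV samples are conventionally unsigned and centred at 128, so uint8_t is used below rather than the int8_t mentioned above; write_header is assumed to be the function from the question.
#include <cmath>
#include <cstdint>
#include <fstream>
using namespace std;

int main() {
    int16_t bits = 8;
    int32_t samples = 44100;
    int32_t duration = 4;
    ofstream music_file("music.wav", ios::out | ios::binary);
    int32_t frame_total = samples * duration;
    auto* audio = new uint8_t[frame_total];          // one byte per 8-bit sample
    for(int i = 0; i < frame_total/2; i++)
        audio[i] = static_cast<uint8_t>(128 + 127 * sin(i*2));
    for(int i = frame_total/2; i < frame_total; i++)
        audio[i] = static_cast<uint8_t>(128 + 127 * sin(i));
    write_header(music_file, bits, samples, duration);
    music_file.write(reinterpret_cast<char*>(audio), sizeof(uint8_t) * frame_total);
    delete[] audio;
    return 0;
}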
To convert a number from base 10 to base 2, I thought of reading its bits directly from memory instead of performing bit shifts (>>).
Consider the program:
int n = 14;
bool* p = (bool*)&n;
for(int i = 0; i < 32; i++)
    cout << *(p + i);
The program is not giving the correct output.
The program below works:
int n = 14;
bool *p = (bool*)&n;
for(int i = 0; i < 32; i++){
    cout << *p;
    n = n >> 1;
}
Bit shifting seems like unnecessary overhead to me. Also, please point out the error in the first code snippet.
If you really are worried that shifting is too expensive (which is only really true on some older CPUs, like the 68008, for example), you can work with constant bit masks like so:
const unsigned int table [] = {0x00000001, 0x00000002, 0x00000004, 0x00000008,
                               0x00000010, 0x00000020, 0x00000040, 0x00000080,
                               // ...rinse and repeat until bit 31....
                              };

int isSet (unsigned int value, int bitNo) {
    return ((value & table [bitNo]) != 0);
}
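A hypothetical usage sketch (assuming the table above has been filled out to all 32 entries) that prints the bits of n from the least significant bit up:
#include <iostream>

int main() {
    unsigned int n = 14;
    for (int bit = 0; bit < 32; ++bit)
        std::cout << isSet(n, bit);   // for n == 14 this prints 0111 followed by zeros
    std::cout << '\n';
    return 0;
}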
What you have shown in your first snippet is not going to work, for a gazillion of reasons, some of which are in the comments you have received: a bool* cast steps through bytes, not bits, so *(p + i) reads one whole byte per iteration and, with a typical 4-byte int, runs 28 bytes past the end of n.
This is my first question on Stack Overflow. I'm new to image processing and to C++, and I'm working with bitmap files now. The bitmap file I create with C++ cannot be opened by any viewer. I inspected it in a hex editor and there was random data in the image size field of the info header; after editing that field in the hex editor, the bitmap is viewable. I don't know what is wrong with the code.
The header (bitmap.h) I created is as follows
#include<iostream>
#include<fstream>
using namespace std;

struct BmpSignature
{
    unsigned char data[2];
    BmpSignature(){ data[0] = data[1] = 0; }
};

struct BmpHeader
{
    unsigned int fileSize;    // the size of the full image including the headers; 4 bytes wide
    unsigned short reserved1; // this field is reserved; 2 bytes wide
    unsigned short reserved2; // this field is also reserved; 2 bytes wide
    unsigned int dataOffset;  // the starting location of the image data array
};

struct BmpInfoHeader
{
    unsigned int size;          // the size of the bitmap info header; 4 bytes wide
    unsigned int width;         // the width of the image
    unsigned int height;        // the height of the image
    unsigned short planes;      // the number of planes in the image
    unsigned short bitCount;    // the number of bits per pixel in the image, e.g. 24 bits or 8 bits
    unsigned short compression; // says whether the image is compressed or not
    unsigned int ImageSize;     // the actual size of the image data
    unsigned int XPixelsPerM;   // the number of pixels per metre in the X direction; usually 2834
    unsigned int YPixelsPerM;   // the number of pixels per metre in the Y direction; usually 2834
    unsigned int ColoursUsed;   // the number of colours used in the image
    unsigned int ColoursImp;    // the number of important colours in the image; usually 0 if all colours are important
};
The cpp file I created is as follows (Create_Bitmap.cpp):
#include"bitmap.h"
#include<cmath>
#include<fstream>
using namespace std;
int main()
{
ofstream fout;
fout.open("D:/My Library/test1.bmp", ios::out |ios::binary);
BmpHeader header;
BmpInfoHeader infoheader;
BmpSignature sign;
infoheader.size = 40;
infoheader.height = 15;
infoheader.width = 15;
infoheader.planes = 1;
infoheader.bitCount = 8;
infoheader.compression = 0;
infoheader.ImageSize = 0;
infoheader.XPixelsPerM = 0;
infoheader.YPixelsPerM = 0;
infoheader.ColoursUsed = 0;
infoheader.ColoursImp = 0;
unsigned char* pixelData;
int pad=0;
for (int i = 0; i < infoheader.height * infoheader.width; i++)
{
if ((i) % 16 == 0) pad++;
}
int arrsz = infoheader.height * infoheader.width + pad;
pixelData = new unsigned char[arrsz];
unsigned char* offsetData;
offsetData = new unsigned char[4 * 256];
int xn = 0;
int yn = 4 * 256;
for (int i = 0; i < yn; i+=4)
{
offsetData[i] = xn;
offsetData[i+1] = xn;
offsetData[i+2] = xn;
offsetData[i+3] = 0;
xn++;
}
int num = 0;
for (int i = 0; i < arrsz; i++)
{
pixelData[i] = i;
}
sign.data[0] = 'B'; sign.data[1] = 'M';
header.fileSize = 0;
header.reserved1 = header.reserved2 = 0;
header.dataOffset = 0;
fout.seekp(0, ios::beg);
fout.write((char*)&sign, sizeof(sign));
fout.seekp(2, ios::beg);
fout.write((char*)&header, sizeof(header));
fout.seekp(14, ios::beg);
fout.write((char*)&infoheader, sizeof(infoheader));
fout.seekp(54, ios::beg);
fout.write((char*)offsetData, yn);
fout.write((char*)pixelData, arrsz);
fout.close();
delete[] pixelData;
delete[] offsetData;
return 0;
}
I have attached a screenshot of the created BMP file in a hex editor with the image size field selected.
[Screenshot: Bitmap Image opened in Hex Editor]
After replacing the contents of that field in the hex editor, the bitmap file can be viewed with an image viewer. I don't know what is wrong in this code.
So you want to write in BMP format? Remember that the compiler may insert padding into C++ POD structs, so you may need a compiler pragma to pack them. Also make sure you use little-endian byte order for all integers, but that should already be the case since you are on Windows, presumably on x86.
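A sketch of the packing idea, mirroring the structs from bitmap.h above. With compression declared as a 2-byte field and the usual 4-byte alignment for unsigned int, the compiler inserts two bytes of uninitialized padding in front of ImageSize, and those bytes end up inside the 40-byte header you write out; in the real BITMAPINFOHEADER compression is a 4-byte field, so widening it and packing the structs gives the exact on-disk layout:
#include <cstdint>

#pragma pack(push, 1)          // no compiler-inserted padding inside these structs
struct BmpHeader
{
    uint32_t fileSize;
    uint16_t reserved1;
    uint16_t reserved2;
    uint32_t dataOffset;
};

struct BmpInfoHeader
{
    uint32_t size;
    int32_t  width;
    int32_t  height;
    uint16_t planes;
    uint16_t bitCount;
    uint32_t compression;      // 4 bytes wide in the real BITMAPINFOHEADER
    uint32_t ImageSize;
    int32_t  XPixelsPerM;
    int32_t  YPixelsPerM;
    uint32_t ColoursUsed;
    uint32_t ColoursImp;
};
#pragma pack(pop)

static_assert(sizeof(BmpHeader) == 12, "signature (2 bytes) is written separately, as in the question");
static_assert(sizeof(BmpInfoHeader) == 40, "BITMAPINFOHEADER must be 40 bytes");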
I created a function to read bit data from an arbitrary bit position of a char array into a long long int, and another function to write bit data into a char array from a vector<bool>.
However, I do not really like my implementation, since I think it does more work than necessary to read and write the bits.
Can anyone look at my implementation and enlighten me with a better one?
Here is my implementation to write bit data from vector<bool> bitdataForModification:
unsigned char*& str is the original char array
int startBitLocation is the arbitrary bit position to start writing at
int sizeToModify is the size in bits of the modification
void setBitDataToBitPosition(unsigned char*& str, int startBitLocation, int sizeToModify, std::vector<bool>& bitdataForModification){
    int endBitLocation = startBitLocation + sizeToModify - 1;
    int sizeChar = (endBitLocation - startBitLocation)/8;
    //Save leftover data
    int startCharPosition = startBitLocation/8;
    int startLeftOverBits = startBitLocation%8;
    //endPosition
    int endCharPosition = endBitLocation/8;
    int endLeftOverBits = 7 - (endBitLocation%8);
    unsigned char tempChar = str[startCharPosition];
    unsigned char tempLastChar = str[endCharPosition]; // store last char
    int posBitdata = 0;
    for(int i = 0; i < startLeftOverBits; i++){
        str[startCharPosition] <<= 1;
        str[startCharPosition] = (str[startCharPosition] | ((tempChar >> (7-i)) & 0x1));
    }
    for(int i = startCharPosition*8 + startLeftOverBits; i <= endBitLocation; i++){
        str[i/8] <<= 1;
        if(posBitdata <= endBitLocation){
            str[i/8] = (str[i/8] | ((bitdataForModification[posBitdata]) & 0x1));
            posBitdata++;
        }
    }
    for(int i = 0; i < endLeftOverBits; i++){
        str[endCharPosition] <<= 1;
        str[endCharPosition] = (str[endCharPosition] | ((tempChar >> i) & 0x1));
    }
}
I do not like the above function because it copies from the original char[] into temporary chars and then copies back the bits I need.
The following is the read function I implemented.
It reads from the char array and copies the data into a long long int:
void getBitDataFromBitPosition(unsigned char* str, int startBitLocation, int sizeToRead, unsigned long long* data){
    int endBitLocation = startBitLocation + sizeToRead;
    int sizeChar = (endBitLocation - startBitLocation)/8;
    int startCharPosition = startBitLocation/8;
    int endCharPosition = endBitLocation/8 + 1;
    vector<bool> bitData;
    int bitCnt = 0;
    for(int i = startCharPosition; i < endCharPosition; i++){
        unsigned char tempChar = str[i];
        for(int j = 7; j >= 0; j--){
            int curLoc = ((i*8) + (bitCnt%8));
            if(curLoc >= startBitLocation){
                if(curLoc < endBitLocation){
                    bool temp = ((str[i] >> j) & 0x1);
                    bitData.push_back(temp);
                }
            }
            bitCnt++;
        }
    }
    for(int i = bitData.size() - 1; i >= 0; i--){
        *data <<= 1;
        *data = (*data | bitData[bitData.size()-i-1]);
    }
}
I think it is wasteful to copy into a bool vector and then copy back into the long long int.
Can anyone provide a better solution?
Thank you in advance!
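For reference, a hypothetical usage sketch of the two functions above (the buffer contents and bit positions are made up for illustration; both functions are assumed to be declared in scope):
#include <vector>

int main(){
    unsigned char buf[4] = {0x00, 0x00, 0x00, 0x00};
    unsigned char* p = buf;

    // Write the 5-bit pattern 10110 starting at bit position 3.
    std::vector<bool> bits = {1, 0, 1, 1, 0};
    setBitDataToBitPosition(p, 3, 5, bits);

    // Read the same 5 bits back into a 64-bit integer.
    unsigned long long value = 0;
    getBitDataFromBitPosition(p, 3, 5, &value);
    // value is now 0b10110 == 22.
    return 0;
}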