How to read pixels from MNIST digit database and create the iplimage - c++

Sorry, this may be somewhat of a duplicate, but I am not able to fix it. I am working on a handwritten OCR application and use the MNIST digit database for the training process. I use the following code to read pixels from the database and re-create the image. The program doesn't give any error, but it produces a meaningless image (totally black, with unclear pixel patterns) as output. Can someone explain the reason for that? Please help.
Here is my code:
int reverseInt(int i) {
    unsigned char c1, c2, c3, c4;
    c1 = i & 255;
    c2 = (i >> 8) & 255;
    c3 = (i >> 16) & 255;
    c4 = (i >> 24) & 255;
    return ((int)c1 << 24) + ((int)c2 << 16) + ((int)c3 << 8) + c4;
}

void create_image(CvSize size, int channels, unsigned char* data[28][28], int imagenumber) {
    string imgname; ostringstream imgstrm; string fullpath;
    imgstrm << imagenumber;
    imgname = imgstrm.str();
    fullpath = "D:\\" + imgname + ".jpg";
    IplImage *imghead = cvCreateImageHeader(size, IPL_DEPTH_16S, channels);
    imghead->imageData = (char *)data;
    cvSaveImage(fullpath.c_str(), imghead);
}

int main() {
    ifstream file("D:\\train-images.idx3-ubyte", ios::binary);
    if (file.is_open())
    {
        int magic_number = 0; int number_of_images = 0; int r; int c;
        int n_rows = 0; int n_cols = 0; CvSize size; unsigned char temp = 0;
        file.read((char*)&magic_number, sizeof(magic_number));
        magic_number = reverseInt(magic_number);
        file.read((char*)&number_of_images, sizeof(number_of_images));
        number_of_images = reverseInt(number_of_images);
        file.read((char*)&n_rows, sizeof(n_rows));
        n_rows = reverseInt(n_rows);
        file.read((char*)&n_cols, sizeof(n_cols));
        n_cols = reverseInt(n_cols);
        unsigned char *arr[28][28];
        for (int i = 0; i < number_of_images; ++i)
        {
            for (r = 0; r < n_rows; ++r)
            {
                for (c = 0; c < n_cols; ++c)
                {
                    file.read((char*)&temp, sizeof(temp));
                    arr[r][c] = &temp;
                }
            }
            size.height = r; size.width = c;
            create_image(size, 1, arr, i);
        }
    }
    return 0;
}

You have:
unsigned char temp=0;
...
file.read((char*)&temp,sizeof(temp));
With that you are reading a byte into a single char, and overwriting it with each subsequent byte in the file.
When you do this:
create_image(size,3, &temp, i);
temp is only one character long and just contains the last byte in the file, so your image ends up being whatever happens to be in memory after temp.
You need to allocate an array to hold the image data and increment a pointer into it as you fill it with data.
Also, you are creating a 3-channel image, but the MNIST data is only single channel, right?
Also,
imghead->imageData=(char *)data;
should be
cvSetData(imghead, data, size.width)
and
unsigned char *arr[28][28];
should be
unsigned char arr[28][28];
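Putting those fixes together, a minimal sketch of the corrected reading/saving code might look like this (untested, assuming the 28x28 images of the MNIST training set and the old C API used in the question):
void create_image(CvSize size, int channels, unsigned char data[28][28], int imagenumber) {
    ostringstream imgstrm;
    imgstrm << imagenumber;
    string fullpath = "D:\\" + imgstrm.str() + ".jpg";
    IplImage *imghead = cvCreateImageHeader(size, IPL_DEPTH_8U, channels); // 8-bit pixels, not 16S
    cvSetData(imghead, data, size.width);   // tell OpenCV where the pixels are and the row stride
    cvSaveImage(fullpath.c_str(), imghead);
    cvReleaseImageHeader(&imghead);
}

// ... inside the loop over images in main():
unsigned char arr[28][28];                  // real storage, not an array of pointers
for (r = 0; r < n_rows; ++r)
    for (c = 0; c < n_cols; ++c) {
        file.read((char*)&temp, sizeof(temp));
        arr[r][c] = temp;                   // copy the byte itself, not its address
    }
size.height = n_rows; size.width = n_cols;
create_image(size, 1, arr, i);              // MNIST images are single channel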

I also wanted to use MNIST with OpenCV and this question was the closest I got.
I thought I'd post a "copy & paste -> be happy" version based on cv::Mat instead of IplImage, since it is easier to work with. Also, cv::Mat has been the preferred type since OpenCV 2.x.
This method gets you a vector of pairs of cv::Mat images and labels as ints. Have fun.
std::vector<std::pair<cv::Mat,int>> loadBinary(const std::string &datapath, const std::string &labelpath){
    std::vector<std::pair<cv::Mat,int>> dataset;
    std::ifstream datas(datapath, std::ios::binary);
    std::ifstream labels(labelpath, std::ios::binary);

    if (!datas.is_open() || !labels.is_open())
        throw std::runtime_error("binary files could not be loaded");

    int magic_number = 0; int number_of_images = 0; int r; int c;
    int n_rows = 0; int n_cols = 0; unsigned char temp = 0;

    // parse data header
    datas.read((char*)&magic_number, sizeof(magic_number));
    magic_number = reverseInt(magic_number);
    datas.read((char*)&number_of_images, sizeof(number_of_images));
    number_of_images = reverseInt(number_of_images);
    datas.read((char*)&n_rows, sizeof(n_rows));
    n_rows = reverseInt(n_rows);
    datas.read((char*)&n_cols, sizeof(n_cols));
    n_cols = reverseInt(n_cols);

    // parse label header - ignore
    int dummy;
    labels.read((char*)&dummy, sizeof(dummy));
    labels.read((char*)&dummy, sizeof(dummy));

    for (int i = 0; i < number_of_images; ++i){
        cv::Mat img(n_rows, n_cols, CV_32FC1);
        for (r = 0; r < n_rows; ++r){
            for (c = 0; c < n_cols; ++c){
                datas.read((char*)&temp, sizeof(temp));
                img.at<float>(r, c) = 1.0 - ((float)temp) / 255.0; // invert the 0..255 values
            }
        }
        labels.read((char*)&temp, sizeof(temp));
        dataset.push_back(std::make_pair(img, (int)temp));
    }
    return dataset;
}
just the same as above:
int reverseInt(int i) {
    unsigned char c1, c2, c3, c4;
    c1 = i & 255; c2 = (i >> 8) & 255; c3 = (i >> 16) & 255; c4 = (i >> 24) & 255;
    return ((int)c1 << 24) + ((int)c2 << 16) + ((int)c3 << 8) + c4;
}
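A minimal usage sketch (the file paths are placeholders, and the display step assumes OpenCV's highgui is available):
#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // load images and labels (paths are placeholders)
    std::vector<std::pair<cv::Mat,int>> mnist =
        loadBinary("train-images.idx3-ubyte", "train-labels.idx1-ubyte");
    std::cout << "loaded " << mnist.size() << " samples" << std::endl;

    // show the first digit; pixel values are floats in [0,1] (CV_32FC1)
    std::cout << "label of first sample: " << mnist[0].second << std::endl;
    cv::imshow("first digit", mnist[0].first);
    cv::waitKey(0);
    return 0;
}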

Related

problem with sending a float number in a stream in vivado_hls

I am trying to implement a simple image processing filter where the pixel values are divided by two to reduce the intensity, and I am trying to develop the hardware for it, so I am using Vivado HLS to generate the IP. As explained here https://forums.xilinx.com/t5/High-Level-Synthesis-HLS/Float-numbers-with-hls-stream/m-p/942747, a union needs to be used to send floating-point numbers in an HLS stream, and I did the same. However, the results don't match for the red and green components of the image, whereas they do match for the blue component. It is a very simple algorithm where a pixel value is divided by two.
I have been trying to resolve it but I am not able to see where the problem is. I have attached all the files below; can someone help me resolve it?
//// header file
#include "ap_fixed.h"
#include "hls_stream.h"

typedef union {
    unsigned int i;
    float r;
    float g;
    float b;
} conv;

typedef hls::stream<unsigned int> Stream_t;

void ftest(Stream_t& Sin, Stream_t& Sout);
//// testbench
#include "stream_check_h.hpp"

int main()
{
    Mat img_rev = imread("C:/Users/20181217/Desktop/images/imgs/output_fwd_v3.png"); // (256x512)
    Mat final_img(img_rev.rows, img_rev.cols, CV_8UC3);
    Mat ref_img(img_rev.rows, img_rev.cols, CV_8UC3);
    Stream_t S1, S2;
    int err_r = 0;
    int err_g = 0;
    int err_b = 0;

    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            conv c;
            c.r = (float)img_rev.at<Vec3b>(i,j)[0];
            c.g = (float)img_rev.at<Vec3b>(i,j)[1];
            c.b = (float)img_rev.at<Vec3b>(i,j)[2];
            S1 << c.i;
        }
    }

    ftest(S1, S2);

    conv c;
    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            S2 >> c.i;
            final_img.at<Vec3b>(i,j)[0] = (unsigned char)c.r;
            final_img.at<Vec3b>(i,j)[1] = (unsigned char)c.g;
            final_img.at<Vec3b>(i,j)[2] = (unsigned char)c.b;
            ref_img.at<Vec3b>(i,j)[0] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[0]) / 2.0);
            ref_img.at<Vec3b>(i,j)[1] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[1]) / 2.0);
            ref_img.at<Vec3b>(i,j)[2] = (unsigned char)(((float)img_rev.at<Vec3b>(i,j)[2]) / 2.0);
        }
    }

    Mat diff;
    cout << diff;
    diff = abs(final_img - ref_img);

    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            if((int)diff.at<Vec3b>(i,j)[0] > 0)
            {
                err_r++;
                cout << "expected value: " << (int)ref_img.at<Vec3b>(i,j)[0] << ", final_value: " << (int)final_img.at<Vec3b>(i,j)[0] << ", actual value:" << (int)img_rev.at<Vec3b>(i,j)[0] << endl;
            }
            if((int)diff.at<Vec3b>(i,j)[1] > 0)
                err_g++;
            if((int)diff.at<Vec3b>(i,j)[2] > 0)
                err_b++;
        }
    }
    cout << "number of errors: " << err_r << ", " << err_g << ", " << err_b;
    return 0;
}
//// core
#include "stream_check_h.hpp"

void ftest(Stream_t& Sin, Stream_t& Sout)
{
    conv cin, cout;
    for(int i = 0; i < 256; i++)
    {
        for(int j = 0; j < 512; j++)
        {
            Sin >> cin.i;
            cout.r = cin.r / 2.0;
            cout.g = cin.g / 2.0;
            cout.b = cin.b / 2.0;
            Sout << cout.i;
        }
    }
}
When I debugged, it showed that the blue components of the pixels are matching. For one red pixel it showed me the following:
expected value: 22, final_value: 14, actual value:45
and the total errors for red, green, and blue are:
number of errors: 126773, 131072, 0
I am not able to see why it is going wrong for red and green. I posted here hoping a fresh set of eyes would help with my problem.
Thanks in advance.
I'm assuming you're using a 32-bit-wide stream carrying 3 RGB channels of 8-bit unsigned pixels (CV_8UC3). I believe the problem with the union type in your case is the overlapping of its three float members (unlike the single float value in the example you cite). This means that by doing the division, you're actually doing it over the whole 32 bits of data you're receiving.
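A tiny sketch (with made-up pixel values) of what that overlap means in practice: every assignment to r, g or b writes the same four bytes, so only the last one survives.
#include <iostream>

typedef union {
    unsigned int i;
    float r;
    float g;
    float b;
} conv;

int main() {
    conv c;
    c.r = 45.0f;    // writes the shared 4 bytes
    c.g = 100.0f;   // overwrites them
    c.b = 200.0f;   // overwrites them again
    // r, g and b all read back as 200 - the first two channels are lost
    std::cout << c.r << " " << c.g << " " << c.b << std::endl;
    return 0;
}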
A possible workaround I quickly came up with would be to cast the unsigned int you're getting from the stream into an ap_uint<32> type, then chop it into the R, G and B chunks (with the range() method) and divide. Finally, assemble the result back and stream it out.
unsigned int packet;
Sin >> packet;
ap_uint<32> packet_uint32 = *((ap_uint<32>*)&packet); // casting (not elegant, but works)

// use unsigned 8-bit chunks so pixel values above 127 are not treated as negative
ap_uint<8> b = packet_uint32.range(7, 0);
ap_uint<8> g = packet_uint32.range(15, 8);
ap_uint<8> r = packet_uint32.range(23, 16); // In case they are in the wrong bit range/order, just flip the r, g, b assignments

b /= 2;
g /= 2;
r /= 2;

packet_uint32.range(7, 0) = b;
packet_uint32.range(15, 8) = g;
packet_uint32.range(23, 16) = r;

packet = packet_uint32.to_int();
Sout << packet;
NOTE: I've reused the same variables in the code above; HLS shouldn't complain about it and should still produce good RTL. In case it does complain, just create new ones.

How to read N bytes from a file continuously until the EOF

I am writing a WAV to Base64 converter program.
I am trying the following code snippet:
vector<char> in(3);
std::string out = "abcd"; // four-letter garbage value as initializer
ifstream file_ptr(filename.c_str(), ios::in | ios::binary);
unsigned int threebytes = 0;

// Apply the Base64 encoding algorithm
do {
    threebytes = (unsigned int) file_ptr.rdbuf()->sgetn(&in[0], 3);
    if (threebytes > 0) {
        EncodeBlock(in, out, (int)threebytes); // Apply conversion algorithm to convert 3 bytes into 4
        outbuff = outbuff + out;               // Append the 4 bytes got from above step to the output
    }
} while (threebytes == in.size());
file_ptr.close();
In EncodeBlock, where the Base64 encoding algorithm is written:
void EncodeBlock(const std::vector<char>& in, std::string& out, int len) {
    using namespace std;
    cb64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    out[0] = cb64[(int) (in[0] >> 2)];
    out[1] = cb64[(int) (((in[0] << 6) >> 2) | (in[1] >> 4))];
    out[2] = (len > 1) ?
        cb64[(int) (((in[1] << 4) >> 2) | (in[2] >> 6))] :
        '=';
    out[3] = (len > 2) ?
        cb64[(int) ((in[2] << 2) >> 2)] :
        '=';
}
cb64 is a 64-character-long string, but the index generated by the bit manipulation sometimes falls out of range (0 to 63).
Why?
The resolution to this was to handle the bit manipulation correctly.
The 8-bit char values are promoted to a wider type before the shifts, which introduces 24 extra (possibly sign-extended) bits that need to be masked back to 0.
So,
out[0] = cb64[(unsigned int) ((in[0] >> 2) & 0x003f)];
out[1] = cb64[(unsigned int) ((((in[0] << 6) >> 2) | (in[1] >> 4)) & 0x003f)]; // ... and so on; the & 0x003f mask keeps every index in range
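For reference, here is a minimal self-contained sketch of the same idea, written with the standard Base64 bit layout and unsigned intermediates; it is an illustration under those assumptions, not the exact expressions from the answer above:
#include <string>
#include <vector>

// Sketch: mask every table index to 0..63 and shift unsigned values,
// so sign extension of negative chars can never push an index out of range.
void EncodeBlock(const std::vector<char>& in, std::string& out, int len) {
    static const std::string cb64 =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    unsigned char b0 = in[0];
    unsigned char b1 = (len > 1) ? in[1] : 0;
    unsigned char b2 = (len > 2) ? in[2] : 0;
    out[0] = cb64[(b0 >> 2) & 0x3f];                 // top 6 bits of byte 0
    out[1] = cb64[((b0 << 4) | (b1 >> 4)) & 0x3f];   // low 2 bits of byte 0 + top 4 bits of byte 1
    out[2] = (len > 1) ? cb64[((b1 << 2) | (b2 >> 6)) & 0x3f] : '=';
    out[3] = (len > 2) ? cb64[b2 & 0x3f] : '=';
}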

Most Significant Byte Computation

I am trying to store a four-byte value (most significant byte first) that holds the total length of the data. I found a code snippet to compute this, but I didn't get four bytes of data in the output. Instead I only got a two-byte value.
char bytesLen[4] ;
unsigned int blockSize = 535;
bytesLen[0] = (blockSize & 0xFF);
bytesLen[1] = (blockSize >> 8) & 0xFF;
bytesLen[2] = (blockSize >> 16) & 0xFF;
bytesLen[3] = (blockSize >> 24) & 0xFF;
std::cout << "bytesLen: " << bytesLen << '\n';
Did I miss something in my code?
No, you didn't. You're outputting the array as a C string, which is null-terminated. The third byte is zero, so only two characters will be shown.
This is not a sensible way to output binary values.
Also, you're saving the least significant byte first, not the most significant. For most-significant-first you have to reverse the order of the bytes.
This shows how to do the same thing without shift operators and bitmasks:
#include <iostream>
#include <iomanip>
// C++11
#include <cstdint>

int main(void)
{
    // with union, the memory allocation is shared
    union {
        uint8_t bytes[4];
        uint32_t n;
    } length;

    // see htonl if needs to be in network byte order
    // or ntohl if from network byte order to host
    length.n = 535;

    std::cout << std::hex;
    for(int i = 0; i < 4; i++) {
        std::cout << (unsigned int)length.bytes[i] << " ";
    }
    std::cout << std::endl;
    return 0;
}
If you want the most significant byte first, then you've got the bytes in reversed order.
You get incorrect output because you treat everything as a C string even though it is not. Get rid of the char type and fix the printing.
In C++, it would look like this:
#include <iostream>
#include <cstdint>

int main()
{
    uint8_t bytesLen[sizeof(uint32_t)];
    uint32_t blockSize = 535;

    bytesLen[3] = (blockSize >> 0) & 0xFF;
    bytesLen[2] = (blockSize >> 8) & 0xFF;
    bytesLen[1] = (blockSize >> 16) & 0xFF;
    bytesLen[0] = (blockSize >> 24) & 0xFF;

    bool removeZeroes = true;

    std::cout << "bytesLen: 0x";
    for(size_t i = 0; i < sizeof(bytesLen); i++)
    {
        if(bytesLen[i] != 0)
        {
            removeZeroes = false;
        }
        if(!removeZeroes)
        {
            std::cout << std::hex << (int)bytesLen[i];
        }
    }
    std::cout << std::endl;

    return 0;
}
Here's the fixed code [untested]. Note this won't compile as is. You'll need to reorder it slightly, but it should help:
unsigned char bytesLen[4];
unsigned int blockSize = 535;

// little endian
#if 0
bytesLen[0] = (blockSize & 0xFF);
bytesLen[1] = (blockSize >> 8) & 0xFF;
bytesLen[2] = (blockSize >> 16) & 0xFF;
bytesLen[3] = (blockSize >> 24) & 0xFF;

// big endian
#else
bytesLen[3] = (blockSize & 0xFF);
bytesLen[2] = (blockSize >> 8) & 0xFF;
bytesLen[1] = (blockSize >> 16) & 0xFF;
bytesLen[0] = (blockSize >> 24) & 0xFF;
#endif

char tmp[9];

char *
pretty_print(char *dst, unsigned char *src)
{
    const char *hex = "0123456789ABCDEF";
    char *bp = dst;
    int chr;

    for (int idx = 0; idx <= 3; ++idx) {
        chr = src[idx];
        *bp++ = hex[(chr >> 4) & 0x0F];
        *bp++ = hex[(chr >> 0) & 0x0F];
    }
    *bp = 0;

    return dst;
}

std::cout << "bytesLen: " << pretty_print(tmp, bytesLen) << '\n';
UPDATE:
Based upon your follow-up question: to concatenate binary data, we cannot use string-like functions such as sprintf [because the binary data may have 0x00 inside, which would cut the string transfer short]. Also, if the binary data had no 0x00 in it, the string functions would run beyond the end of the array(s) looking for it, and bad things would happen. The string functions also assume signed char data, and when dealing with raw binary we want to use unsigned char.
Here's something to try:
unsigned char finalData[1000]; // size is just example
unsigned char bytesLen[4];
unsigned char blockContent[300];
unsigned char *dst;
dst = finalData;
memcpy(dst,bytesLen,sizeof(bytesLen));
dst += sizeof(bytesLen);
memcpy(dst,blockContent,sizeof(blockContent));
dst += sizeof(blockContent);
// append more if needed in similar way ...
Note: The above presupposes that blockContent is of fixed size. If it were to have a variable number of bytes, we'd need to replace sizeof(blockContent) with (e.g.) bclen, where bclen is the actual number of bytes in blockContent.
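As a sketch, the variable-length case could look like this (bclen is a hypothetical variable holding the actual byte count of blockContent):
size_t bclen = 123;                      // e.g. the actual number of valid bytes in blockContent
unsigned char *dst = finalData;

memcpy(dst, bytesLen, sizeof(bytesLen));
dst += sizeof(bytesLen);

memcpy(dst, blockContent, bclen);        // copy only the bytes that are really there
dst += bclen;

size_t totalBytes = dst - finalData;     // how much of finalData is now filled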

C++ write a number on two bytes

I am new to low-level C++, and I find it a bit hard to understand how to manipulate bits. I am trying to do the following for use in a compression algorithm I am writing:
unsigned int num = ...;       // we want to store this number
unsigned int num_size = 3;    // this is the maximum size of the number in bits,
                              // and can be anything from 1 bit to 32
unsigned int pos = 7;         // the starting pos on the 1st bit;
                              // this can be anything from 1 to 8
char a;
char b;
If num_size is 3 and pos is 7, for example, we must store num on the 7th and 8th bits of a and on the 1st bit of b.
How about just:
a = num << (pos-1);
b = ((num << (pos-1)) & 0xFF00) >> 8;
To read num back, just:
num = ((unsigned int)a + ((unsigned int)b << 8)) >> (pos - 1);
Note, this doesn't do any sanity checks, such as whether all the relevant bits fit in a and b; you'll have to do that yourself.
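A slightly more defensive sketch (an illustration only) that masks num to num_size bits first, assuming a and b start out zeroed and num_size + pos - 1 <= 16:
unsigned int masked  = num & ((1u << num_size) - 1);   // keep only the low num_size bits
unsigned int shifted = masked << (pos - 1);            // slide them to start at bit 'pos'

unsigned char a = shifted & 0xFF;                      // low byte
unsigned char b = (shifted >> 8) & 0xFF;               // high byte

// reading the value back:
unsigned int readback = ((a | ((unsigned int)b << 8)) >> (pos - 1))
                        & ((1u << num_size) - 1);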
For this specific case, the highest number that fits into two unsigned chars is actually 65535.
#include <iostream>

unsigned char high(int input)
{
    return (input >> 8) & 0xFF;
}

unsigned char low(int input)
{
    return input & 0xFF;
}

int value(unsigned char low, unsigned char high)
{
    return low | (high << 8);
}

int main()
{
    int num = 65535;
    unsigned char l = low(num);
    unsigned char h = high(num);
    int val = value(l, h);
    // cast the bytes to int so they print as numbers rather than raw characters
    std::cout << "l: " << (int)l << " h: " << (int)h << " val: " << val;
}

Swap endian in pcm audio

I've made a simple program to swap endianness in PCM audio (2 channels, 48 kHz, 24-bit), but only one channel is swapped correctly; the second one is still little-endian (I've checked the generated output in CoolEdit 2000). Could anybody give me some guidance on what's wrong in my code?
inline int endian_swap(unsigned int x)
{
    unsigned char c1, c2, c3, c4;
    c1 = x & 255;
    c2 = (x >> 8) & 255;
    c3 = (x >> 16) & 255;
    c4 = (x >> 24) & 255;
    return ((int)c1 << 24) + ((int)c2 << 16) + ((int)c3 << 8) + c4;
}

int main()
{
    FILE *fpIn, *fpOut;
    short x;

    fpIn = fopen("audio.pcm", "rb");
    fpOut = fopen("out.pcm", "wb");

    int test = sizeof(short);
    int count = 0;
    int swaped = 0;

    while( fread(&x, sizeof(int), 1, fpIn) == 1 )
    {
        swaped = endian_swap(x);
        fwrite(&swaped, sizeof(int), 1, fpOut);
    }

    system("pause");
    return 0;
}
Best regards!
You are reading the file one int at a time, but an int is probably either 16-bit or 32-bit, and you say you have 24-bit audio.
You should modify your code to read three chars at a time into a char[3] array. You will then also need to modify your endian_swap function to operate on a char[3] (this is easy; just swap the contents of the first and last elements of the array!).
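A minimal sketch of that approach (untested, and assuming the samples are packed as plain 3-byte frames with no header to skip):
#include <cstdio>

int main()
{
    FILE *fpIn  = fopen("audio.pcm", "rb");
    FILE *fpOut = fopen("out.pcm", "wb");
    if (!fpIn || !fpOut)
        return 1;

    unsigned char sample[3];                      // one 24-bit sample
    while (fread(sample, sizeof(sample), 1, fpIn) == 1)
    {
        unsigned char tmp = sample[0];            // swap the first and last byte
        sample[0] = sample[2];
        sample[2] = tmp;
        fwrite(sample, sizeof(sample), 1, fpOut);
    }

    fclose(fpIn);
    fclose(fpOut);
    return 0;
}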
You declared short x. Try declaring unsigned int x.