Convert bytes from a file to an integer c++

I am trying to parse a .dat file, reading it byte by byte with this code (the name of the file is in arv[1]):
std::ifstream is (arv[1], std::ifstream::binary);
if (is) {
is.seekg (0, is.end);
int length = is.tellg();
is.seekg (0, is.beg);
char * buffer = new char [length];
is.read (buffer,length);
if (is)
std::cout << "all characters read successfully.";
else
std::cout << "error: only " << is.gcount() << " could be read";
is.close();
}
Now the whole file is in the buffer variable. The file contains 32-bit numbers; how can I iterate over the buffer, reading 4 bytes at a time, and convert them to integers?

First of all, you have a memory leak: you dynamically allocate a character array but never delete[] it.
Use std::string instead:
std::string buffer(length,0);
is.read (&buffer[0],length);
Now, assuming the integer was written correctly and has been read correctly into buffer, you can treat this character array as a pointer to an integer:
int myInt = *(int*)&buffer[0];
(do you understand why?)
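If you have more than one integer stored:
std::vector<int> integers;
// stop before any trailing partial integer so we never read past the buffer
for (std::size_t i = 0; i + sizeof(int) <= buffer.size(); i += sizeof(int)) {
integers.push_back(*(int*)&buffer[i]);
}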
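The pointer cast above relies on the buffer being suitably aligned and on the compiler tolerating access to char storage through an int pointer. A more conservative variant copies the bytes with std::memcpy. The following is a minimal sketch, not the answer's own method: the helper name bytesToInts is hypothetical, and std::int32_t is assumed to match the 32-bit values in the file. Values are still interpreted in the host's byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies each complete 4-byte group of `bytes` into an
// int32_t. The bytes are interpreted in the host's byte order.
std::vector<std::int32_t> bytesToInts(const std::string& bytes)
{
    std::vector<std::int32_t> out;
    out.reserve(bytes.size() / sizeof(std::int32_t));
    // Stop before a trailing partial group so we never read past the end.
    for (std::size_t i = 0; i + sizeof(std::int32_t) <= bytes.size();
         i += sizeof(std::int32_t)) {
        std::int32_t value;
        std::memcpy(&value, bytes.data() + i, sizeof(value));
        out.push_back(value);
    }
    return out;
}
```

With the buffer from the answer, the call would be bytesToInts(buffer).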

Instead of:
char * buffer = new char [length];
is.read (buffer,length);
You can use:
int numIntegers = length/sizeof(int);
int* buffer = new int[numIntegers];
is.read(reinterpret_cast<char*>(buffer), numIntegers*sizeof(int));
Update, in response to OP's comment
I am not seeing any problems with the approach I suggested. Here's a sample program and the output I see using g++ 4.9.2.
#include <iostream>
#include <fstream>
#include <cstdlib>
void writeData(char const* filename, int n)
{
std::ofstream out(filename, std::ios::binary);
for ( int i = 0; i < n; ++i )
{
int num = std::rand();
out.write(reinterpret_cast<char*>(&num), sizeof(int));
}
}
void readData(char const* filename)
{
std::ifstream is(filename, std::ifstream::binary);
if (is)
{
is.seekg (0, is.end);
int length = is.tellg();
is.seekg (0, is.beg);
int numIntegers = length/sizeof(int);
int* buffer = new int [numIntegers];
std::cout << "Number of integers: " << numIntegers << std::endl;
is.read(reinterpret_cast<char*>(buffer), numIntegers*sizeof(int));
if (is)
std::cout << "all characters read successfully." << std::endl;
else
std::cout << "error: only " << is.gcount() << " could be read" << std::endl;
for (int i = 0; i < numIntegers; ++i )
{
std::cout << buffer[i] << std::endl;
}
delete[] buffer; // release the array allocated with new[]
}
}
int main()
{
writeData("test.bin", 10);
readData("test.bin");
}
Output
Number of integers: 10
all characters read successfully.
1481765933
1085377743
1270216262
1191391529
812669700
553475508
445349752
1344887256
730417256
1812158119
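The same read can be sketched with std::vector, so that no delete[] is needed and the result can be returned to the caller. This is an illustrative variant under assumptions: the function name readInts is hypothetical, and std::int32_t stands in for the int used above.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <vector>

// Hypothetical helper: reads every complete 32-bit value from a binary file.
// Returns an empty vector if the file cannot be opened or is empty.
std::vector<std::int32_t> readInts(const char* filename)
{
    std::ifstream is(filename, std::ios::binary);
    if (!is) return {};

    // Determine the file length, then rewind.
    is.seekg(0, std::ios::end);
    const std::streamoff length = is.tellg();
    is.seekg(0, std::ios::beg);
    if (length <= 0) return {};

    std::vector<std::int32_t> values(
        static_cast<std::size_t>(length) / sizeof(std::int32_t));
    is.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(values.size() * sizeof(std::int32_t)));

    // Keep only the integers that were completely read.
    values.resize(static_cast<std::size_t>(is.gcount()) / sizeof(std::int32_t));
    return values;
}
```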

Related

How to read large files in segments?

I'm using small files currently for testing and will scale up once it works.
I made a file bigFile.txt that has:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
I'm running this to segment the data that is being read from the file:
#include <iostream>
#include <fstream>
#include <memory>
using namespace std;
int main()
{
ifstream file("bigfile.txt", ios::binary | ios::ate);
cout << file.tellg() << " Bytes" << '\n';
ifstream bigFile("bigfile.txt");
constexpr size_t bufferSize = 4;
unique_ptr<char[]> buffer(new char[bufferSize]);
while (bigFile)
{
bigFile.read(buffer.get(), bufferSize);
// print the buffer data
cout << buffer.get() << endl;
}
}
This gives me the following result:
26 Bytes
ABCD
EFGH
IJKL
MNOP
QRST
UVWX
YZWX
Notice how in the last line, after 'Z', the characters 'WX' are repeated?
How do I get rid of them so that the output stops at the end of the file?
cout << buffer.get() uses the const char* overload, which prints a null-terminated C string.
But your buffer isn't null-terminated, and istream::read() can read fewer characters than the buffer size. So when you print the buffer, you end up printing old characters that were already there, until the next null character is encountered.
Use istream::gcount() to determine how many characters were read, and print exactly that many characters. For example, using std::string_view:
#include <iostream>
#include <fstream>
#include <memory>
#include <string_view>
using namespace std;
int main()
{
ifstream file("bigfile.txt", ios::binary | ios::ate);
cout << file.tellg() << " Bytes" << "\n";
file.seekg(0, std::ios::beg); // rewind to the beginning
constexpr size_t bufferSize = 4;
unique_ptr<char[]> buffer = std::make_unique<char[]>(bufferSize);
while (file)
{
file.read(buffer.get(), bufferSize);
auto bytesRead = file.gcount();
if (bytesRead == 0) {
// EOF
break;
}
// print the buffer data
cout << std::string_view(buffer.get(), bytesRead) << endl;
}
}
Note also that there's no need to open the file again - you can rewind the original one to the beginning and read it.
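The gcount() idea can also be packaged as a small function. The sketch below is only an illustration: the name readChunks and its parameters are hypothetical, and it collects the chunks in memory instead of printing them, which is only sensible for small test files.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the file's contents split into chunks of at
// most `chunkSize` bytes; only the final chunk may be shorter.
std::vector<std::string> readChunks(const std::string& path, std::size_t chunkSize)
{
    std::vector<std::string> chunks;
    if (chunkSize == 0) return chunks;  // avoid looping forever on empty reads

    std::ifstream file(path, std::ios::binary);
    std::vector<char> buffer(chunkSize);

    // read() reports failure on the short final chunk, so also test gcount().
    while (file.read(buffer.data(), static_cast<std::streamsize>(buffer.size())) ||
           file.gcount() > 0) {
        // Copy exactly the bytes that were read, never the stale tail.
        chunks.emplace_back(buffer.data(), static_cast<std::size_t>(file.gcount()));
    }
    return chunks;
}
```

For the 26-byte test file and a chunk size of 4, the last chunk holds only "YZ".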
The problem is that you don't overwrite the buffer's content. Here's what your code does:
It reads the beginning of the file.
When it reaches 'YZ', it reads it and overwrites only the buffer's first two characters ('U' and 'V'), because it has reached the end of the file.
One easy fix is to clear the buffer before each file read:
#include <iostream>
#include <fstream>
#include <array>
int main()
{
std::ifstream bigFile("bigfile.txt", std::ios::binary | std::ios::ate);
int fileSize = bigFile.tellg();
std::cout << bigFile.tellg() << " Bytes" << '\n';
bigFile.seekg(0);
constexpr size_t bufferSize = 4;
std::array<char, bufferSize> buffer;
while (bigFile)
{
for (int i(0); i < bufferSize; ++i)
buffer[i] = '\0';
bigFile.read(buffer.data(), bufferSize);
// Print the buffer data
std::cout.write(buffer.data(), bufferSize) << '\n';
}
}
I also changed:
The std::unique_ptr<char[]> to a std::array, since we don't need dynamic allocation here and std::arrays are safer than C-style arrays.
The printing instruction to std::cout.write, because the original caused undefined behavior (see @paddy's comment). std::cout << prints a null-terminated string (a sequence of characters terminated by a '\0' character), whereas std::cout.write prints a fixed number of characters.
The second file opening to a call to the std::istream::seekg method (see @rustyx's answer).
Another (and most likely more efficient) way of doing this is to read the file character by character, put the characters in the buffer, and print the buffer when it's full. After the main for loop, we print the buffer one last time if it holds characters that haven't been printed yet.
#include <iostream>
#include <fstream>
#include <array>
int main()
{
std::ifstream bigFile("bigfile.txt", std::ios::binary | std::ios::ate);
int fileSize = bigFile.tellg();
std::cout << bigFile.tellg() << " Bytes" << '\n';
bigFile.seekg(0);
constexpr size_t bufferSize = 4;
std::array<char, bufferSize> buffer;
int bufferIndex;
for (int i(0); i < fileSize; ++i)
{
// Add one character to the buffer
bufferIndex = i % bufferSize;
buffer[bufferIndex] = bigFile.get();
// Print the buffer data
if (bufferIndex == bufferSize - 1)
std::cout.write(buffer.data(), bufferSize) << '\n';
}
// Overwrite the stale characters that weren't replaced (in this case 'W' and 'X')
for (++bufferIndex; bufferIndex < bufferSize; ++bufferIndex)
buffer[bufferIndex] = '\0';
// Print the buffer for the last time if it hasn't been already
if (fileSize % bufferSize /* != 0 */)
std::cout.write(buffer.data(), bufferSize) << '\n';
}

Separate the 4 channels (R, G, G, B) of a .raw image file and save them as valid images in C++

I have to take a .raw12 file, separate the 4 channels (R, G, G, B) and save them as valid images (8-bits in memory) without using any external library (.ppm) in C++.
You can read about raw12 here and ppm file format here.
I have written the code but it is giving me this output.
I have tried a lot of things but it is always giving me output similar to the one above. I think there is a problem in datatype conversion but I am not sure.
I have been trying to debug it for 2 days, still with no luck.
Here is this code.
#include <fstream>
#include <iostream>
using namespace std;
const int BUFFERSIZE = 4096;
int main ()
{
ifstream infile;
infile.open("portrait.raw12", ios::binary | ios::in);
ofstream outfile;
outfile.open("Redtwo.ppm", ios::binary);
//outfile.write("P6 ", 3);
//outfile.write("1536 2048 ", 8);
//outfile.write("2048 ", 4);
//outfile.write("255 ", 4);
//outfile << "P6\n" << 1536 << "\n" << 2048 << "\n255\n";
outfile << "P6" << "\n"
<< 1536 << " "
<< 2048 << "\n"
<< 255 << "\n"
;
uint8_t * bufferRow = new uint8_t[BUFFERSIZE];
if(!infile)
{
cout<<"Failed to open"<<endl;
}
int size=1536*2048*3;
char * RedChannel=new char[size];
int GreenChannel_1,GreenChannel_2,BlueChannel;
int rowNum=0;
int i=0;
int j=0;
int pixel=1;
while(rowNum<3072)
{
infile.read(reinterpret_cast<char*>(bufferRow), BUFFERSIZE);
if(rowNum%2==0)
{
while(i<BUFFERSIZE)
{
RedChannel[j]=(uint8_t)bufferRow[i];
GreenChannel_1=((uint8_t)(bufferRow[i+1] & 0x0F) << 4) | ((uint8_t)(bufferRow[i+2] >> 4) & 0x0F);
i+=3;
//Collect s;
//s.r=(char)RedChannel[j];
//s.g=(char)0;
//s.b=(char)0;
//unsigned char c = (unsigned char)(255.0f * (float)RedChannel[j] + 0.5f);
//outfile.write((char*) &c, 3);
//outfile.write((char*) 255, sizeof(c));
//outfile.write(reinterpret_cast<char*>(&RedChannel), 4);
if(pixel<=3 && rowNum<5)
{
cout<<"RedChannel: "<<RedChannel[j]<<endl;
if(pixel!=3)
cout<<"GreenChannel 1: "<<GreenChannel_1<<endl;
}
pixel++;
j++;
}
RedChannel[j]='\n';
j++;
}
else
{
while(i<BUFFERSIZE)
{
GreenChannel_2=(uint8_t)bufferRow[i];
BlueChannel=((uint8_t)(bufferRow[i+1] & 0x0F) << 4) | ((uint8_t)(bufferRow[i+2] >> 4) & 0x0F);
i+=3;
if(pixel<=3 && rowNum<5)
{
cout<<"GreenChannel 2: "<<GreenChannel_2<<endl;
if(pixel!=3)
cout<<"BlueChannel: "<<BlueChannel<<endl;
}
pixel++;
}
}
rowNum++;
i=0;
pixel=1;
if(rowNum<5)
cout<<" "<<endl;
}
infile.close();
outfile.write(RedChannel, size);
outfile.close();
}
I have simplified your code quite a lot and made it output just one single channel, since your question says you should generate one image per channel.
Now you have something working, you can add back in the other parts - it's easier to start with something that works! I won't do all your challenge for you... that would be no fun and leave you without any sense of achievement. You can do the rest - good luck!
#include <fstream>
#include <iostream>
using namespace std;
// Enough for one line of the input image
const int BUFFERSIZE = 4096 * 3;
int main(){
ifstream infile;
ofstream outfile;
infile.open("portrait.raw12", ios::binary | ios::in);
outfile.open("result.pgm", ios::binary);
// Write single channel PGM file
outfile << "P5\n2048 1536\n255\n";
unsigned char * bufferRow = new unsigned char[BUFFERSIZE];
if(!infile)
{
cout<<"Failed to open"<<endl;
}
int size=2048*1536;
unsigned char * RedChannel=new unsigned char[size];
unsigned char * Redp = RedChannel;
for(int rowNum=0;rowNum<1536;rowNum++){
// Read an entire row
infile.read(reinterpret_cast<char*>(bufferRow), BUFFERSIZE);
if(rowNum%2==0){
for(int i=0;i<BUFFERSIZE;i+=3){
*Redp++=bufferRow[i];
}
}
}
infile.close();
outfile.write(reinterpret_cast<char*>(RedChannel), size);
outfile.close();
}
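To add the remaining channels back in, it may help to isolate the bit unpacking in one function. The sketch below is an assumption-laden illustration: the helper name unpack12To8 is hypothetical, and the byte layout is inferred from the bit operations in the question's code, not checked against the raw12 specification. It returns the top 8 bits of every 12-bit sample.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper. Assumed layout (taken from the question's code, not
// verified against the raw12 specification): every 3 bytes hold two 12-bit
// samples A and B as
//   byte0 = A[11:4], byte1 = A[3:0] followed by B[11:8], byte2 = B[7:0].
// Returns the top 8 bits of every sample, in file order.
std::vector<std::uint8_t> unpack12To8(const std::vector<std::uint8_t>& packed)
{
    std::vector<std::uint8_t> out;
    out.reserve(packed.size() / 3 * 2);
    // Process complete 3-byte groups only; a trailing partial group is ignored.
    for (std::size_t i = 0; i + 3 <= packed.size(); i += 3) {
        out.push_back(packed[i]);  // A[11:4]
        out.push_back(static_cast<std::uint8_t>(
            ((packed[i + 1] & 0x0F) << 4) | (packed[i + 2] >> 4)));  // B[11:4]
    }
    return out;
}
```

Under that assumption, the even-indexed outputs of an even row would be one channel and the odd-indexed outputs the other, so each channel can be copied into its own image buffer.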

Finding int value in large binary file c++

I tried to make a program that loads chunks of a large file (we're speaking of a few MBs), searches for a value, and prints its address and value. However, every few reads my program hits the !myfile error branch, it prints a weird symbol instead of the value (although I've used hex in cout), the addresses seem to loop, and it doesn't seem to find all the values. I've tried for a long time and gave up, so I'm asking experienced coders out there to find the issue.
I should note that I'm trying to find a 32-bit value in this file, but all I could make was a program that checks bytes; I'd need assistance with that too.
#include <iostream>
#include <fstream>
#include <climits>
#include <sstream>
#include <windows.h>
#include <math.h>
using namespace std;
int get_file_size(std::string filename) // path to file
{
FILE *p_file = NULL;
p_file = fopen(filename.c_str(),"rb");
fseek(p_file,0,SEEK_END);
int size = ftell(p_file);
fclose(p_file);
return size;
}
int main( void )
{
ifstream myfile;
myfile.open( "file.bin", ios::binary | ios::in );
char addr_start = 0,
addr_end = 0,
temp2 = 0x40000;
bool found = false;
cout << "\nEnter address start (Little endian, hex): ";
cin >> hex >> addr_start;
cout << "\nEnter address end (Little endian, hex): ";
cin >> hex >> addr_end;
unsigned long int fsize = get_file_size("file.bin");
char buffer[100];
for(int counter = fsize; counter != 0; counter--)
{
myfile.read(buffer,100);
if(!myfile)
{
cout << "\nAn error has occurred. Bytes read: " << myfile.gcount();
myfile.clear();
}
for(int x = 0; x < 100 - 1 ; x++)
{
if(buffer[x] >= addr_start && buffer[x] <= addr_end)
cout << "Addr: " << (fsize - counter * x) << " Value: " << hex << buffer[x] << endl;
}
}
myfile.close();
system("PAUSE"); //Don't worry about its inefficiency
}
A simple program to search for a 32-bit integer in a binary file:
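#include <cstdint>
#include <cstdlib>
#include <fstream>
#include <iostream>
using namespace std;
int main(void)
{
ifstream data_file("my_file.bin", ios::binary);
if (!data_file)
{
cerr << "Error opening my_file.bin.\n";
return EXIT_FAILURE; // stop if the file could not be opened
}
const uint32_t search_key = 0x12345678U;
uint32_t value;
// read one 32-bit value at a time until a read fails (end of file)
while (data_file.read((char *) &value, sizeof(value)))
{
if (value == search_key)
{
cout << "Found value.\n";
break;
}
}
return EXIT_SUCCESS;
}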
You could augment the performance by reading into a buffer and searching the buffer.
//...
const unsigned int BUFFER_SIZE = 1024;
static uint32_t buffer[BUFFER_SIZE];
bool found = false;
while (!found)
{
// read() reports failure on a short final chunk, so rely on gcount()
data_file.read((char *) &(buffer[0]), sizeof(buffer));
const std::streamsize bytes_read = data_file.gcount();
if (bytes_read <= 0)
{
break; // nothing left to search
}
const unsigned int values_read = ((unsigned int) bytes_read) / sizeof(uint32_t);
for (unsigned int index = 0U; index < values_read; ++index)
{
if (buffer[index] == search_key)
{
cout << "Value found.\n";
found = true; // also leave the outer loop
break;
}
}
}
With the above code, when the read fails, the number of bytes should be checked, and if any bytes were read, the buffer then searched.
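Both loops above compare values only at offsets that are multiples of 4 from the start of the file. If the value may sit at any byte offset, every offset has to be checked. The following is a minimal sketch of that idea on data already loaded into memory; the helper name findAll is hypothetical, and the comparison uses the host's byte order, like the code above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: returns every byte offset at which `key` occurs in
// `data`, comparing in the host's byte order.
std::vector<std::size_t> findAll(const std::vector<char>& data, std::uint32_t key)
{
    std::vector<std::size_t> offsets;
    // Slide a 4-byte window over the data one byte at a time.
    for (std::size_t i = 0; i + sizeof(key) <= data.size(); ++i) {
        std::uint32_t value;
        std::memcpy(&value, data.data() + i, sizeof(value));  // safe for unaligned offsets
        if (value == key) offsets.push_back(i);
    }
    return offsets;
}
```

When the file is read in chunks instead, the last 3 bytes of each chunk would have to be carried over to the next one so that matches spanning a chunk boundary are not missed.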

Reading a char* from a binary file

I have to store the name of a file in a binary file that I am writing. I have currently written it like this:
void write(map<char, bits> &bitstring,map<char,int> &ccount, string header,string fname,ofstream &outf)
{
ifstream inf(header+fname);
cout << "FName: " << fname << endl;
const char * pName = fname.c_str();
fname = header+ fname + ".mcp";
const char * c = fname.c_str();
FILE* pFile;
pFile = fopen(c, "w+b");
inf.seekg(ios::beg, ios::end);
int size = inf.tellg();
int length = 0;
string data = "";
int magicNum = 2262;
int fileNameSize = strlen(pName);
fwrite(&fileNameSize, sizeof(int), 1, pFile);
cout <<"FIle Name Size: "<< fileNameSize << endl;
fwrite(pName, fileNameSize, 1, pFile);
fclose(pFile);
}
And I also send the size of the file name, so that I know how much data I need to read to get the whole file name.
void read2(string fname, map<char, int> &charCounts, std::vector<bool> &bits,ofstream &outf)
{
string fname1 = fname + ".mcp", outfile = "binarycheck";
bool testDone = false, counts = false;
std::ifstream inf(fname1, std::ios::binary);
std::ofstream ouf("binarycheck.txt", std::ios::binary);
char character;
int count[1] = { 0 };
int checkcount = 0;
int mNum[1] = { 0 }, size[1] = { 0 };
int FNamesize = 0;
inf.read((char*)&FNamesize, sizeof(int));
char *name=new char[FNamesize+1];
inf.read(name, FNamesize);
name[FNamesize] = '\0';
string str(name);
cout << "File Name: ";
cout << std::string(name) << endl;
cout << "Magic Num: " << mNum[0] << endl;
cout << "File Name Size: " << FNamesize<< endl;
inf.close();
}
I get the Size correctly, but I have no idea how to iterate through name in order to save it back as a string. I tried using a vector but it didn't really help since inf.read uses a char* as its first parameter.
Any help would be great.
Well, in a fluke accident I ended up solving my own issue. For some reason when I declared
FILE* pFile;
pFile = fopen(c, "w+b");
Before the declaration of
const char * pName = fname.c_str();
The call corrupted the value of pName before it was written to the file, which is what caused the errors. Problem solved!
Seeing as you're using ifstream, why not also use ofstream? Then it would just be ofs << filename to store and ifs >> filename to read where filename is a string. No need to faff around with the length yourself.
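If the length-prefixed layout from the question is kept (for example because file names may contain spaces, which operator>> would split on), the name can be read straight into a std::string, since &name[0] yields the char* that read() expects. This is a hedged sketch: the helpers saveName and loadName are hypothetical, and a std::uint32_t prefix is assumed in place of the question's int.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper: writes a 32-bit length followed by the raw characters.
void saveName(const std::string& path, const std::string& name)
{
    std::ofstream out(path, std::ios::binary);
    const std::uint32_t size = static_cast<std::uint32_t>(name.size());
    out.write(reinterpret_cast<const char*>(&size), sizeof(size));
    out.write(name.data(), static_cast<std::streamsize>(name.size()));
}

// Hypothetical helper: reads the length, then that many characters directly
// into a std::string. Returns an empty string if the prefix cannot be read.
std::string loadName(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    std::uint32_t size = 0;
    if (!in.read(reinterpret_cast<char*>(&size), sizeof(size))) return {};
    if (size == 0) return {};

    std::string name(size, '\0');
    in.read(&name[0], static_cast<std::streamsize>(size));
    // Keep only what was actually read if the file was truncated.
    name.resize(static_cast<std::size_t>(in.gcount()));
    return name;
}
```

No manual new[] or terminating '\0' is needed, because std::string tracks its own length.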

lz4 compression c++ example [duplicate]

I am in the process of writing an LZ4 converter from CSV to compressed binary files (high-volume forex tick data CSV), in the hope of reducing the storage and disk bandwidth requirements on my tiny VPS.
Self-contained code to illustrate:
#include <string>
#include <fstream>
#include <iostream>
#include "lz4.h"
using namespace std;
int main()
{
char szString[] = "2013-01-07 00:00:04,0.98644,0.98676 2013-01-07 00:01:19,0.98654,0.98676 2013-01-07 00:01:38,0.98644,0.98696";
const char* pchSource = szString;
int nInputSize = sizeof(szString);
cout <<"- pchSource -" << endl << pchSource << endl;
cout <<"nbytes = "<< nInputSize << endl << endl;
ofstream source("pchSource.txt");
source << pchSource;
int nbytesPassed = 0;
int nMaxCompressedSize = LZ4_compressBound(nInputSize);
char *pszDest = new char[nMaxCompressedSize];
nbytesPassed = LZ4_compress(pchSource, pszDest, nInputSize);
cout <<"- pszDest Compressed-" << endl;
cout <<"nbytesPassed = "<< nbytesPassed << endl;
cout << pszDest << endl << endl;
// pszDest garbage ?
char *pszDestUnCompressed = new char[nInputSize];
LZ4_uncompress(pszDest, pszDestUnCompressed, nInputSize);
cout <<"- pszDestUnCompressed -" << endl;
cout <<"nbytesPassed = "<< nbytesPassed << endl;
cout << pszDestUnCompressed << endl << endl;
//pszDestUnCompressed is correct ?
delete[] pszDestUnCompressed;
pszDestUnCompressed = 0;
// ok lets write compressed pszDest to pszDest.dat
ofstream outCompressedFile("pszDest.dat",std::ofstream::binary);
outCompressedFile.write (pszDest,nMaxCompressedSize);
delete[] pszDest;
pszDest = 0;
//read it back in and try to uncompress it
ifstream infile("pszDest.dat",std::ifstream::binary);
infile.seekg (0,infile.end);
int nCompressedInputSize = infile.tellg();
infile.seekg (0);
char* buffer = new char[nCompressedInputSize];
infile.read (buffer,nCompressedInputSize);
const char* pchbuffer = buffer;
char* pszUnCompressedFile = new char[nInputSize];
nbytesPassed = LZ4_uncompress(pchbuffer, pszUnCompressedFile, nInputSize);
cout <<"- pszUnCompressedFile -" << endl;
cout <<"nbytesPassed = "<< nbytesPassed << endl;
cout << pszUnCompressedFile << endl;
//write uncompressed pszDest.dat to pszUnCompressedFile.txt
ofstream outUncompressedSource("pszUnCompressedFile.txt");
outUncompressedSource << pszUnCompressedFile;
// On my system 32bit ArchLinux 3.7.10-1 - gcc 4.7.2-4
// file contains random Garbage
delete[] buffer;
buffer = 0;
delete[] pszUnCompressedFile;
pszUnCompressedFile = 0;
return 0;
}
CONSOLE OUTPUT :
- pchSource -
2013-01-07 00:00:04,0.98644 .....
nbytes = 108
- pszDest Compressed-
nbytesPassed = 63
�2013-01-07 00:
- pszDestUnCompressed -
nbytesPassed = 63
2013-01-07 00:00:04,0.98644 .....
- pszUnCompressedFile -
nbytesPassed = -17
�W��W�-07 q
Process returned 0 (0x0) execution time : 0.010 s
Press ENTER to continue.
I'm obviously missing something. Apart from the samples included in the source, are there any other usage examples?
All working now, thanks. Here is the code for anyone who is interested:
#include <fstream>
#include <iostream>
#include "lz4.h"
using namespace std;
int main()
{
char szSource[] = "2013-01-07 00:00:04,0.98644,0.98676 2013-01-07 00:01:19,0.98654,0.98676 2013-01-07 00:01:38,0.98644,0.98696";
int nInputSize = sizeof(szSource);
// compress szSource into pchCompressed
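// size the output for the worst case; compressed data can be larger than the input
char* pchCompressed = new char[LZ4_compressBound(nInputSize)];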
int nCompressedSize = LZ4_compress((const char *)(&szSource), pchCompressed, nInputSize);
// write pchCompressed to binary lz4.dat
ofstream outBinaryFile("lz4.dat",ofstream::binary);
outBinaryFile.write(pchCompressed, nCompressedSize);
outBinaryFile.close();
delete[] pchCompressed;
pchCompressed = 0;
//read compressed binary file (assume we pass/encode nInputSize but don't know nCompressedSize)
ifstream infCompressedBinaryFile( "lz4.dat", ifstream::binary );
//Get compressed file size for buffer
infCompressedBinaryFile.seekg (0,infCompressedBinaryFile.end);
int nCompressedInputSize = infCompressedBinaryFile.tellg();
infCompressedBinaryFile.clear();
infCompressedBinaryFile.seekg(0,ios::beg);
//Read file into buffer
char* pchCompressedInput = new char[nCompressedInputSize];
infCompressedBinaryFile.read(pchCompressedInput,nCompressedInputSize);
infCompressedBinaryFile.close();
// Decompress buffer
char* pchDeCompressed = new char[nInputSize]; //(nCompressedInputSize *2) +8
LZ4_uncompress(pchCompressedInput, pchDeCompressed, nInputSize);
delete[] pchCompressedInput;
pchCompressedInput = 0;
// write decompressed pchDeCompressed to lz4.txt
ofstream outFile("lz4.txt");
outFile.write(pchDeCompressed, nInputSize);
outFile.close();
delete[] pchDeCompressed;
pchDeCompressed = 0;
return 0;
}
I am also working on a simple CLI csv to binary I/O example here