Split a binary file into chunks in C++

I've been bashing my head against trying to divide a file into chunks so that I can send it over sockets. I can read and write a file easily without splitting it into chunks. The code below runs and kind of works: it writes a text file, but the output contains a garbage character. If this were just for text, that would be no problem, but JPEGs don't survive the garbage.
I've been at it for a few days and have done my research, so it's time to get some help. I do want to stick strictly to binary reads, as this needs to handle any file.
I've seen a lot of slick examples out there (none of them worked for me with JPEGs), mostly something along the lines of while(file)... I subscribe to the camp that says: if you know the size, use a for loop, not a while loop.
Thank you for the help!!
vector<char*> readFile(const char* fn){
vector<char*> v;
ifstream::pos_type size;
char * memblock;
ifstream file;
file.open(fn,ios::in|ios::binary|ios::ate);
if (file.is_open()) {
size = fileS(fn);
file.seekg (0, ios::beg);
int bs = size/3; // arbitrary. Actual program will use the socket send size
int ws = 0;
int i = 0;
for(i = 0; i < size; i+=bs){
if(i+bs > size)
ws = size%bs;
else
ws = bs;
memblock = new char [ws];
file.read (memblock, ws);
v.push_back(memblock);
}
}
else{
exit(-4);
}
return v;
}
int main(int argc, char **argv) {
vector<char*> v = readFile("foo.txt");
ofstream myFile ("bar.txt", ios::out | ios::binary);
for(vector<char*>::iterator it = v.begin(); it!=v.end(); ++it ){
myFile.write(*it,strlen(*it));
}
}

The problem is that you are using strlen to calculate the size of the array to be written. Binary data can contain 0 bytes, and strlen stops counting at the first one, so you would not be writing the right size. Instead, use a pair of char*,int where the int specifies the number of bytes to be written and you will be golden.
Like:
#include <iostream>
#include <vector>
#include <fstream>
#include <stdlib.h>
#include <string.h>
using namespace std;
ifstream::pos_type fileS(const char* fn)
{
ifstream file;
file.open(fn,ios::in|ios::binary);
file.seekg(0, ios::end);
ifstream::pos_type ret= file.tellg();
file.seekg(0,ios::beg);
ret=ret-file.tellg();
file.close();
return ret;
}
vector< pair<char*,int> > readFile(const char* fn){
vector< pair<char*,int> > v;
ifstream::pos_type size;
char * memblock;
ifstream file;
file.open(fn,ios::in|ios::binary|ios::ate);
if (file.is_open()) {
size = fileS(fn);
file.seekg (0, ios::beg);
int bs = size/3; // arbitrary. Actual program will use the socket send size
int ws = 0;
int i = 0;
cout<<"size:"<<size<<" bs:"<<bs<<endl;
for(i = 0; i < size; i+=bs){
if(i+bs > size)
ws = size%bs;
else
ws = bs;
cout<<"read:"<<ws<<endl;
memblock = new char [ws];
file.read (memblock, ws);
v.push_back(make_pair(memblock,ws));
}
}
else{
exit(-4);
}
return v;
}
int main(int argc, char **argv) {
vector< pair<char*,int> > v = readFile("a.png");
ofstream myFile ("out.png", ios::out | ios::binary);
for(vector< pair<char*,int> >::iterator it = v.begin(); it!=v.end(); ++it ){
pair<char*,int> p=*it;
myFile.write(p.first,p.second);
}
}

myFile.write(*it,strlen(*it));
Is using string length on binary data. I suspect that is your culprit. If not, it's certainly a code-smell.

You should never do this:
myFile.write(*it,strlen(*it));
on binary data. strlen counts bytes until it hits a byte which contains a 0 (NUL as we like to say, but it's an honest 0). If you read enough binary data, you will hit a NUL, and you'll get a short count. But actually the situation could be a lot worse, because nowhere do you store the NUL for strlen to find. You're just counting on there being one beyond the end of the datablock you acquire to read the file into.
So don't do that. Remember the number of bytes in each block (you could use a vector<pair<char*, int> > but there are a lot of more C++-like possibilities) and use that to write the data.
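One of the "more C++-like possibilities" mentioned above is to let each chunk be a std::vector<char>, which carries its own length and frees its own memory. The following is only a sketch under that assumption: the function names read_chunks and write_chunks are made up for illustration, and the chunk size would be the socket send size in the real program.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <utility>
#include <vector>

// Read a file in binary mode and split it into chunks of at most chunk_size bytes.
// Every chunk remembers its own length, so no strlen is needed later.
std::vector<std::vector<char>> read_chunks(const std::string& path, std::size_t chunk_size)
{
    assert(chunk_size > 0);
    std::vector<std::vector<char>> chunks;
    std::ifstream in(path, std::ios::binary);
    while (in) {
        std::vector<char> chunk(chunk_size);
        in.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        const std::streamsize got = in.gcount(); // bytes actually read
        if (got <= 0)
            break;
        chunk.resize(static_cast<std::size_t>(got)); // last chunk may be short
        chunks.push_back(std::move(chunk));
    }
    return chunks;
}

// Write the chunks back out using each chunk's stored size.
bool write_chunks(const std::string& path, const std::vector<std::vector<char>>& chunks)
{
    std::ofstream out(path, std::ios::binary);
    for (const auto& chunk : chunks)
        out.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
    out.flush();
    return static_cast<bool>(out);
}
```

Calling write_chunks("out.png", read_chunks("a.png", 4096)) should produce a byte-for-byte copy, because embedded 0 bytes are treated like any other byte.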

Related

Why do I see the contents of binary files?

Why can I open and view a binary file, and why do its contents look so odd and unreadable?
http://codepad.org/OwX99H0p
Enter a string str -> char arr1[] -> OUTPUT.DAT
OUTPUT.DAT -> char arr2[] -> printed to the screen
The code in question:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
void NhapMang(char *&arr, string str , int &n)
{
n = str.length();
arr = new char[n];
for (int i = 0; i < n;i++)
{
arr[i] = str[i];
}
}
void XuatMang(char *arr, int n)
{
for (int i = 0; i < n;i++)
{
cout << arr[i];
}
}
void GhiFile(ofstream &FileOut, char *arr, int n)
{
FileOut.open("OUTPUT.DAT", ios::out | ios::binary);
FileOut.write(arr, n*sizeof(char));
FileOut.close();
}
void DocFile(ifstream &FileInt, char *&arr, int n)
{
FileInt.open("OUTPUT.DAT", ios::in | ios::binary);
arr = new char[n];
FileInt.read(arr, n*sizeof(char));
FileInt.close();
}
int main()
{
char *arr1;
int n1;
fflush(stdin);
string str;
getline(cin, str);
NhapMang(arr1, str,n1);
ofstream FileOut;
GhiFile(FileOut, arr1, n1);
char *arr2;
int n2 = n1;
ifstream FileInt;
DocFile(FileInt, arr2, n2);
XuatMang(arr2, n2);
delete[] arr1;
delete[] arr2;
system("pause");
return 0;
}
You're ultimately storing data in a file. What this data represents is up to you. Keep in mind that it's all '1's and '0's in the end. When you open the file you've created with a text editor, the editor will try to interpret this data as text, which doesn't necessarily give a readable result.
Imagine storing a liquid in a bottle. If you don't label it, no one knows what it is. If you then pour this liquid in your car, it will try to use this as gasoline and potentially wreck your engine. Computers, fortunately, are much more forgiving.
Most files store information about how the data can be interpreted in their headers so programs can check if the file type is supported or not. So a media player, for example, will most likely tell you that the format of this file is not supported instead of trying to interpret the data as media.
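As a concrete illustration of a header check, PNG files always start with the same 8-byte signature, so a program can compare the first bytes of a file against it before trying to decode anything. This is a minimal sketch: has_signature and kPngSignature are names invented for this example, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Returns true if the file at `path` begins with the bytes in `signature`.
bool has_signature(const std::string& path, const std::vector<unsigned char>& signature)
{
    assert(!signature.empty());
    std::vector<char> header(signature.size());
    std::ifstream in(path, std::ios::binary);
    if (!in.read(header.data(), static_cast<std::streamsize>(header.size())))
        return false; // could not open the file, or it is shorter than the signature
    for (std::size_t i = 0; i < signature.size(); ++i)
        if (static_cast<unsigned char>(header[i]) != signature[i])
            return false;
    return true;
}

// The 8-byte signature at the start of every PNG file.
const std::vector<unsigned char> kPngSignature = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
```

A file written by the program in the question would fail this check, which is roughly how an image viewer decides to report an unsupported format.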

Reading Binary Files into an array of ints c++

I have a method which writes a binary file from an int array. (it could be wrong too)
void bcdEncoder::writeBinaryFile(unsigned int packedBcdArray[], int size)
{
fstream binaryIo;
binaryIo.open("PridePrejudice.bin", ios::out| ios::binary | ios::trunc);
binaryIo.seekp(0);
binaryIo.write((char*)packedBcdArray, size * sizeof(packedBcdArray[0]));
binaryIo.seekp(0);
binaryIo.close();
}
I need to now read that binary file back. And preferably have it read it back into another array of unsigned ints without any information loss.
I have something like the following code, but I have no idea on how reading binary files really works, and no idea how to read it into an array of ints.
void bcdEncoder::readBinaryFile(string fileName)
{
// myArray = my dynamic int array
fstream binaryIo;
binaryIo.open(fileName, ios::in | ios::binary | ios::trunc);
binaryIo.seekp(0);
binaryIo.seekg(0);
binaryIo.read((int*)myArray, size * sizeof(myFile));
binaryIo.close();
}
Question:
How to complete the implementation of the function that reads binary files?
If you're using C++, use the nice std library. Note that the >> and << operators read and write the numbers as text, not as raw bytes, so the values must be separated by whitespace in the file.
vector<unsigned int> bcdEncoder::readBinaryFile(string fileName)
{
    vector<unsigned int> ret; //std::list may be preferable for large files
    ifstream in{ fileName };
    unsigned int current;
    // Loop on the extraction itself so a failed read is never stored
    while (in >> current) {
        ret.emplace_back(current);
    }
    return ret;
}
Writing is just as simple (for this we'll accept an unsigned int[] but a std library container would be preferable):
void bcdEncoder::writeBinaryFile(string fileName, unsigned int arr[], size_t len)
{
    ofstream f { fileName };
    for (size_t i = 0; i < len; i++)
        f << arr[i] << ' '; // separator so the values can be read back
}
Here's the same thing but with an std::vector
void bcdEncoder::writeBinaryFile(string fileName, const vector<unsigned int>& arr)
{
    ofstream f { fileName };
    for (auto&& i : arr)
        f << i << ' '; // separator so the values can be read back
}
To simplify read operation consider storing size (i.e the number of elements in the array) before the data:
void bcdEncoder::writeBinaryFile(unsigned int packedBcdArray[], int size)
{
fstream binaryIo;
binaryIo.open("PridePrejudice.bin", ios::out| ios::binary | ios::trunc);
binaryIo.seekp(0);
binaryIo.write((char*)&size, sizeof(size)); // store the number of elements first
binaryIo.write((char*)packedBcdArray, size * sizeof(packedBcdArray[0]));
binaryIo.close();
}
The read would look something like:
void bcdEncoder::readBinaryFile(string fileName)
{
std::vector<unsigned int> myData;
int size;
fstream binaryIo;
binaryIo.open(fileName, ios::in | ios::binary); // open for reading only; ios::trunc is for output streams
binaryIo.read((char*)&size, sizeof(size)); // read the number of elements
myData.resize(size); // allocate memory for an array
binaryIo.read((char*)myData.data(), size * sizeof(unsigned int));
binaryIo.close();
// todo: do something with myData
}
Modern alternative using std::array
Here's a code snippet that uses more modern C++ to read a binary file into an std::array.
const int arraySize = 9216; // Hard-coded
std::array<uint8_t, arraySize> fileArray;
std::ifstream binaryFile("<my-binary-file>", std::ios::in | std::ios::binary);
if (binaryFile.is_open()) {
binaryFile.read(reinterpret_cast<char*>(fileArray.data()), arraySize);
}
Because you're using an std::array, you need to know the size of the file at compile time (or rather, you need to know that the file has at least that many bytes available). If you don't know the size of the file ahead of time, use a std::vector and look at this example here: https://stackoverflow.com/a/36661779/1576548
Thanks for the tips guys, looks like I worked it out!! A major part of my problem was that half the arguments and syntax I added to the methods were not required, and actually messed things up. Here are my working methods.
void bcdEncoder::writeBinaryFile(unsigned int packedBcdArray[], int size, string fileName)
{
ofstream binaryIo;
binaryIo.open(fileName.substr(0, fileName.length() - 4) + ".bin", ios::binary);
if (binaryIo.is_open()) {
binaryIo.write((char*)packedBcdArray, size * sizeof(packedBcdArray[0]));
binaryIo.close();
// Send binary file to reader
readBinaryFile(fileName.substr(0, fileName.length() - 4) + ".bin", size);
}
else
cout << "Error writing bin file..." << endl;
}
And the read:
void bcdEncoder::readBinaryFile(string fileName, int size)
{
AllocateArray packedData(size);
unsigned int *packedArray = packedData.createIntArray();
ifstream binaryIo;
binaryIo.open(fileName, ios::binary);
if (binaryIo.is_open()) {
binaryIo.read((char*)packedArray, size * sizeof(packedArray[0]));
binaryIo.close();
decodeBCD(packedArray, size * 5, fileName);
}
else
cout << "Error reading bin file..." << endl;
}
With the AllocateArray being my class that creates dynamic arrays without vectors somewhat safely with destructors included.

How do I access .HGT SRTM files in C++?

Here is a similar question on the topic with a good description of the file:
how to read NASA .hgt binary files
I am fairly new to programming in general and my efforts thus far have been very limited. My ultimate goal is to access the elevation data and store it in a 2D array for easy access. I have been trying to read the file 2 bytes at a time, as has been suggested, but I don't know what to do next. Here is what I've got so far:
#include <iostream>
#include <fstream>
using namespace std;
int main ()
{
ifstream::pos_type size;
char * memblock;
ifstream file ("N34W119.hgt", ios::in|ios::binary|ios::ate);
if (file.is_open())
{
size = 2;
memblock = new char [size];
file.seekg (0, ios::beg);
file.read (memblock, size);
//I don't know what to do next
file.close();
}
return 0;
}
Thanks for any help!
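// SRTM_version is 1201 or 3601 (samples per row and per column)
const int SRTM_version = 1201;
static int height[SRTM_version][SRTM_version]; // static: may be too large for the stack
unsigned char buffer[2];
for ( int r = 0; r < SRTM_version ; r++ ) {
    for ( int c = 0 ; c < SRTM_version; c++ ) {
        // Read the next sample: two bytes, most significant byte first
        if ( !file.read(reinterpret_cast<char*>(buffer), sizeof(buffer)) )
            return -1; // file is shorter than expected
        // Samples are signed 16-bit values, so convert through short
        height[r][c] = (short)((buffer[0] << 8) | buffer[1]);
    }
}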
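Putting the pieces together, a self-contained loader could look like the sketch below. It assumes the usual SRTM layout (a square grid of big-endian signed 16-bit samples stored row by row); load_hgt and its side parameter are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Load a square .hgt tile with `side` samples per row (1201 or 3601 for SRTM).
// Returns an empty vector if the file cannot be read completely.
std::vector<std::vector<std::int16_t>> load_hgt(const std::string& path, std::size_t side)
{
    assert(side > 0);
    std::ifstream file(path, std::ios::binary);
    std::vector<unsigned char> row_bytes(side * 2); // two bytes per sample
    std::vector<std::vector<std::int16_t>> height(side, std::vector<std::int16_t>(side));
    for (std::size_t r = 0; r < side; ++r) {
        if (!file.read(reinterpret_cast<char*>(row_bytes.data()),
                       static_cast<std::streamsize>(row_bytes.size())))
            return {};
        for (std::size_t c = 0; c < side; ++c) {
            // Combine the bytes as unsigned values, high byte first,
            // then reinterpret the 16-bit pattern as a signed sample.
            const unsigned int hi = row_bytes[2 * c];
            const unsigned int lo = row_bytes[2 * c + 1];
            height[r][c] = static_cast<std::int16_t>((hi << 8) | lo);
        }
    }
    return height;
}
```

With this, load_hgt("N34W119.hgt", 1201) would return the grid so that height[r][c] is the elevation at row r and column c. Reading into unsigned char avoids the sign-extension problem that shows up when the bytes are held in a plain char buffer.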

Serialize and deserialize vector in binary

I am having problems trying to serialise a vector (std::vector) into a binary format and then correctly deserialise it and be able to read the data. This is my first time using a binary format (I was using ASCII but that has become too hard to use now) so I am starting simple with just a vector of ints.
Whenever I read the data back the vector always has the right length but the data is either 0, undefined or random.
class Example
{
public:
std::vector<int> val;
};
WRITE:
Example example = Example();
example.val.push_back(10);
size_t size = sizeof BinaryExample + (sizeof(int) * example.val.size());
std::fstream file ("Levels/example.sld", std::ios::out | std::ios::binary);
if (file.is_open())
{
file.seekg(0);
file.write((char*)&example, size);
file.close();
}
READ:
BinaryExample example = BinaryExample();
std::ifstream::pos_type size;
std::ifstream file ("Levels/example.sld", std::ios::in | std::ios::binary | std::ios::ate);
if (file.is_open())
{
size = file.tellg();
file.seekg(0, std::ios::beg);
file.read((char*)&example, size);
file.close();
}
Does anyone know what I am doing wrong or what to do or be able to point me in the direction that I need to do?
You can't unserialise a non-POD class by overwriting an existing instance as you seem to be trying to do - you need to give the class a constructor that reads the data from the stream and constructs a new instance of the class with it.
In outline, given something like this:
class A {
public:
A();
A( istream & is );
void serialise( ostream & os );
private:
vector <int> v;
};
then serialise() would write the length of the vector followed by the vector contents. The constructor would read the vector length, resize the vector using the length, then read the vector contents:
void A :: serialise( ostream & os ) {
size_t vsize = v.size();
os.write((char*)&vsize, sizeof(vsize));
os.write((char*)&v[0], vsize * sizeof(int) );
}
A :: A( istream & is ) {
size_t vsize;
is.read((char*)&vsize, sizeof(vsize));
v.resize( vsize );
is.read((char*)&v[0], vsize * sizeof(int));
}
You're using the address of the vector. What you need/want is the address of the data being held by the vector. Writing, for example, would be something like:
size_t size = example.val.size();
file.write((char *)&size, sizeof(size));
file.write((char *)&example.val[0], sizeof(example.val[0]) * size);
I would write in network byte order to ensure the file can be written and read on any platform. So:
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iomanip>
#include <vector>
#include <arpa/inet.h>
int main(void) {
std::vector<int32_t> v = std::vector<int32_t>();
v.push_back(111);
v.push_back(222);
v.push_back(333);
{
std::ofstream ofs;
ofs.open("vecdmp.bin", std::ios::out | std::ios::binary);
uint32_t sz = htonl(v.size());
ofs.write((const char*)&sz, sizeof(uint32_t));
for (uint32_t i = 0, end_i = v.size(); i < end_i; ++i) {
int32_t val = htonl(v[i]);
ofs.write((const char*)&val, sizeof(int32_t));
}
ofs.close();
}
{
std::ifstream ifs;
ifs.open("vecdmp.bin", std::ios::in | std::ios::binary);
uint32_t sz = 0;
ifs.read((char*)&sz, sizeof(uint32_t));
sz = ntohl(sz);
for (uint32_t i = 0; i < sz; ++i) {
int32_t val = 0;
ifs.read((char*)&val, sizeof(int32_t));
val = ntohl(val);
std::cout << i << '=' << val << '\n';
}
}
return 0;
}
Read the other answers to see how you should read/write a binary structure.
I add this one because I believe your motivations for using a binary format are mistaken. A binary format won't be easier than an ASCII one; usually it's the other way around.
You have many options to save/read data for long term use (ORM, databases, structured formats, configuration files, etc). The flat binary file is usually the worst and the hardest to maintain except for very simple structures.

How to output array of doubles to hard drive?

I would like to know how to output an array of doubles to the hard drive.
edit:
For further clarification: I would like to output it to a file on the hard drive (I/O functions), preferably in a file format that can be quickly translated back into an array of doubles in another program. It would also be nice if it was stored in a standard 4 byte configuration so that I can look at it through a hex viewer and see the actual values.
Hey... so you want to do it in a single write/read. Well, it's not too hard. The following code should work fine; it may need some extra error checking, but the trial case was successful:
#include <cstdlib>
#include <string>
#include <fstream>
#include <iostream>
bool saveArray( const double* pdata, size_t length, const std::string& file_path )
{
std::ofstream os(file_path.c_str(), std::ios::binary | std::ios::out);
if ( !os.is_open() )
return false;
os.write(reinterpret_cast<const char*>(pdata), std::streamsize(length*sizeof(double)));
os.close();
return true;
}
bool loadArray( double* pdata, size_t length, const std::string& file_path)
{
std::ifstream is(file_path.c_str(), std::ios::binary | std::ios::in);
if ( !is.is_open() )
return false;
is.read(reinterpret_cast<char*>(pdata), std::streamsize(length*sizeof(double)));
is.close();
return true;
}
int main()
{
double* pDbl = new double[1000];
int i;
for (i=0 ; i<1000 ; i++)
pDbl[i] = double(rand());
saveArray(pDbl,1000,"test.txt");
double* pDblFromFile = new double[1000];
loadArray(pDblFromFile, 1000, "test.txt");
for (i=0 ; i<1000 ; i++)
{
if ( pDbl[i] != pDblFromFile[i] )
{
std::cout << "error, loaded data not the same!\n";
break;
}
}
if ( i==1000 )
std::cout << "success!\n";
delete [] pDbl;
delete [] pDblFromFile;
return 0;
}
Just make sure you allocate appropriate buffers! But that's a whole other topic.
Use std::copy() with the stream iterators. This way if you change 'data' into another type the alterations to code would be trivial.
#include <algorithm>
#include <iterator>
#include <fstream>
int main()
{
double data[1000] = {/*Init Array */};
{
// Write data to a file.
std::ofstream outfile("data");
std::copy(data,
data+1000,
std::ostream_iterator<double>(outfile," ")
);
}
{
// Read data from a file
std::ifstream infile("data");
std::copy(std::istream_iterator<double>(infile),
std::istream_iterator<double>(),
data // Assuming data is large enough.
);
}
}
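One caveat with text output like the above: by default a stream prints only six significant digits, so doubles written this way generally do not read back as exactly the same values. A possible remedy, sketched here with the hypothetical helpers save_doubles_text and load_doubles_text, is to raise the precision to std::numeric_limits<double>::max_digits10 before writing.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iomanip>
#include <limits>
#include <string>
#include <vector>

// Write `count` doubles as whitespace-separated text with enough significant
// digits (max_digits10) for the values to survive a round trip.
bool save_doubles_text(const std::string& path, const double* data, std::size_t count)
{
    assert(data != nullptr || count == 0);
    std::ofstream out(path);
    out << std::setprecision(std::numeric_limits<double>::max_digits10);
    for (std::size_t i = 0; i < count; ++i)
        out << data[i] << ' ';
    out.flush();
    return static_cast<bool>(out);
}

// Read whitespace-separated doubles until extraction fails or the file ends.
std::vector<double> load_doubles_text(const std::string& path)
{
    std::vector<double> values;
    std::ifstream in(path);
    double value;
    while (in >> value)
        values.push_back(value);
    return values;
}
```

The file stays readable in a text editor, at the cost of being larger and slower to parse than a raw binary dump.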
You can use iostream .read() and .write().
It works (very roughly!) like this:
double d[2048];
fill(d, d+2048, 0);
ofstream outfile ("save.bin", ios::binary);
outfile.write(reinterpret_cast<char*>(&d), sizeof(d));
ifstream infile ("save.bin", ios::binary);
infile.read(reinterpret_cast<char*>(&d), sizeof(d));
Note that this is not portable between CPU architectures. Some may have different sizes of double. Some may store the bytes in a different order. It shouldn't be used for data files that move between machines or data that is sent over the network.
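If the file does have to move between machines, one option is to pick a byte order for the file and convert explicitly. The sketch below writes each double as 8 bytes, least significant byte first. It assumes a 64-bit IEEE 754 double whose in-memory byte order matches that of 64-bit integers (true on mainstream platforms, but still an assumption), and encode_double_le / decode_double_le are names made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t), "sketch assumes a 64-bit double");

// Store `value` into out[0..7] as its 64-bit pattern, least significant byte first.
void encode_double_le(double value, unsigned char* out)
{
    assert(out != nullptr);
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof(bits)); // copy the bit pattern without aliasing issues
    for (int i = 0; i < 8; ++i)
        out[i] = static_cast<unsigned char>((bits >> (8 * i)) & 0xFF);
}

// Rebuild a double from 8 bytes produced by encode_double_le.
double decode_double_le(const unsigned char* in)
{
    assert(in != nullptr);
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i)
        bits |= static_cast<std::uint64_t>(in[i]) << (8 * i);
    double value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Each encoded 8-byte buffer can then be passed to ofstream::write, and the reader calls decode_double_le on every 8 bytes it reads, regardless of the byte order of the machine it runs on.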
#include <fstream.h>
void saveArray(double* array, int length);
int main()
{
double array[] = { 15.25, 15.2516, 84.168, 84356};
saveArray(array, 4);
return 0;
}
void saveArray(double* array, int length)
{
ofstream output("output.txt");
for(int i=0;i<length;i++)
{
output<<array[i]<<endl;
}
}
Here is a way to output an array of doubles to a text file, one per line. Hope this helps.
EDIT
Replace the single include line at the top with these two and it will compile in VS. You can also use multithreading to avoid blocking the system while saving data.
#include <fstream>
using namespace std;
Now I feel old. I asked this question a long time ago (except about ints).
comp.lang.c++ link
#include <iostream>
#include <fstream>
using namespace std;
int main () {
double theArray[]=...;
int arrayLength=...;
ofstream myfile;
myfile.open ("example.txt");
for(int i=0; i<arrayLength; i++) {
myfile << theArray[i]<<"\n";
}
myfile.close();
return 0;
}
adapted from http://www.cplusplus.com/doc/tutorial/files/
Just set theArray and arrayLength to whatever your code requires.