Issue: Read/write an array of structures to a binary file in C++

I want to write my array of structures to a binary file.
My structure:
typedef struct student {
    char name[15];
    vector<int> grade;
} arr_stu;
I can write and read back my data if I write and read in the same program, but if I create a separate program that only reads the binary file, it does not work because the vector grade is invalid:
size = 0;
unable to read from memory
Program to write array structure to file
int main()
{
    arr_stu stu[100];
    for (size_t i = 0; i < 100; i++)
    {
        strcpy(stu[i].name, randomName());
        for (size_t j = 0; j < 10; j++)
        {
            stu[i].grade.push_back(randomGrade());
        }
    }
    ofstream outbal("class", ios::out | ios::binary);
    if (!outbal) {
        cout << "Cannot open file.\n";
        return 1;
    }
    outbal.write((char *)&stu, sizeof(stu));
    outbal.close();
}
Program to read the array of structures from the file
int main() {
    arr_stu stu[100];
    ifstream inbal("class", ios::in | ios::binary);
    if (!inbal) {
        cout << "Cannot open file.\n";
        return 1;
    }
    inbal.read((char *)&stu, sizeof(stu));
    for (size_t idx = 0; idx < 100; idx++)
    {
        cout << "Name : " << stu[idx].name << endl;
        for (size_t index = 0; index < 10; index++)
        {
            cout << endl << "test: " << stu[idx].grade[index] << endl;
        }
    }
    inbal.close();
    return 0;
}
To me it seems like the use of vector poses the problem.
I think the reason it works when the two are combined into one program is that the vector's data is still present in memory, so it is still accessible.
Any suggestions?

You cannot serialize a vector like that. The write and read functions access the memory at the given address directly. Since vector is a complex class type, only part of its data is stored sequentially at its base address. Other parts (heap-allocated memory, etc.) are located elsewhere. The simplest solution is to write the length of the vector to the file, followed by each of the values. You have to loop over the vector elements to accomplish that.
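A minimal sketch of that approach for the struct in the question (assuming the arr_stu definition above and the stu[100] array that the write program fills) might look like this:
// Sketch: serialize each record field by field instead of dumping raw structs.
ofstream outbal("class", ios::out | ios::binary);
for (size_t i = 0; i < 100; i++)
{
    outbal.write(stu[i].name, sizeof(stu[i].name));   // fixed-size name buffer
    size_t count = stu[i].grade.size();
    outbal.write((char *)&count, sizeof(count));      // number of grades
    for (size_t j = 0; j < count; j++)
    {
        int g = stu[i].grade[j];
        outbal.write((char *)&g, sizeof(g));          // each grade value
    }
}
outbal.close();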

outbal.write((char *)&stu, sizeof(stu));
The sizeof is a compile-time constant. In other words, it never changes. If the vector contained 1, 10, 1000, or 1,000,000 items, you're writing the same number of bytes to the file. So this way of writing to the file is totally wrong.
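You can see this for yourself with a few lines; the exact number printed is implementation dependent, but it does not change with the contents:
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    std::cout << sizeof(v) << '\n';   // e.g. 24 on a typical 64-bit implementation
    v.resize(1000000);
    std::cout << sizeof(v) << '\n';   // same value: sizeof never sees the heap-allocated elements
    return 0;
}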
The struct that you're writing is non-POD, because the vector is a non-POD type. This means you can't just treat it as a set of bytes that you can copy from or to. If you want further proof, open the file you created in any editor. Can you see the data from the vector in that file? What you will see is, more than likely, gibberish.
To write the data to the file, you have to properly serialize it, meaning you write the data itself, not the struct object. Write the data in such a way that when you read it back, you can recreate the struct. Ultimately, this means you have to:
Write the name to the file, and possibly the number of bytes the name consists of.
Write the number of items in the vector
Write each vector item to the file.
If not this, then use some format in which the name and the vector's data can be distinctly identified, so that the code that reads the file back can parse it correctly and recreate the struct.
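A matching read-back sketch (reversing the same steps: fixed-size name, then the count, then each grade; the arr_stu definition from the question is assumed) could look like this:
// Sketch: read back exactly what the write loop produced, record by record.
ifstream inbal("class", ios::in | ios::binary);
arr_stu stu[100];
for (size_t i = 0; i < 100; i++)
{
    inbal.read(stu[i].name, sizeof(stu[i].name));     // fixed-size name buffer
    size_t count = 0;
    inbal.read((char *)&count, sizeof(count));        // number of grades
    stu[i].grade.resize(count);
    for (size_t j = 0; j < count; j++)
    {
        int g = 0;
        inbal.read((char *)&g, sizeof(g));            // each grade value
        stu[i].grade[j] = g;
    }
}
inbal.close();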

What is the format of the binary file? Basically, you have to define the format, and then convert each element to and from that format. You can never just dump the bits of an internal representation to disk and expect to be able to read it back. (The fact that you need a reinterpret_cast to call ostream::write on your object should tell you something.)

Related

Allocating memory for an array

I am trying to read numbers from a file and store them in an array using dynamic memory. When I try to print a member of the array, it shows the address instead of the actual contents.
// CLASS METHOD IMPLEMENTATIONS
#include "DataHousing.h"

// CONSTRUCTORS
DataHousing::DataHousing() {
}

void DataHousing::FillArray() {
    int tempIn = 0;
    int count = 0;
    // attempt to open the file with read permission
    ifstream inputHandle("NumFile500.txt", ios::in);
    // count how many numbers are in each file
    if (inputHandle.is_open() == true) {
        while (!inputHandle.eof()) {
            inputHandle >> tempIn;
            count++;
        }
        // allocate memory for array
        int* pFileContents = new int[count];
        // fill array
        while (!inputHandle.eof()) {
            for (int i = 0; i < count; i++) {
                inputHandle >> pFileContents[i];
            }
        }
        cout << &pFileContents[2];
    }
    else {
        cout << "error";
    }
}
This is my first time attempting anything like this and I am pretty stuck. What am i doing wrong here?
The unary & operator retrieves an address, so it is quite natural that an address is what gets printed.
To display the contents, remove the & in cout << &pFileContents[2];.
Also the counting part of your code
while (!inputHandle.eof()) {
inputHandle >> tempIn;
count++;
}
has two mistakes.
Firstly, you are incrementing count without checking whether the last read was successful.
Secondly, you are trying to read again from an ifstream that has already reached EOF. You have to clear the EOF flag and seek back to the beginning of the file.
In conclusion, the counting part should be:
while (inputHandle >> tempIn) {
    count++;
}
inputHandle.clear();
inputHandle.seekg(0, ios_base::beg);
I see that you're trying to print the required value using:
cout << &pFileContents[2];
Since pFileContents is an array, pFileContents[2] will access the element at index 2 (the third element).
But since you've prepended & to it, the expression is going to print the address of that element instead of its value.
In order to print the value of the element, just use:
cout << pFileContents[2];
Notice the difference in the latter code: we haven't used & right after cout <<.
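Putting the two fixes together (count with a checked read, clear and rewind the stream before filling the array, and print the value rather than its address), a corrected FillArray could look roughly like this; the file name and class layout are taken from the question, and whether pFileContents should become a class member or be freed at the end is left open:
void DataHousing::FillArray() {
    int tempIn = 0;
    int count = 0;
    ifstream inputHandle("NumFile500.txt", ios::in);
    if (inputHandle.is_open()) {
        // count how many numbers are in the file
        while (inputHandle >> tempIn) {
            count++;
        }
        // rewind: clear the EOF/fail flags and seek back to the start
        inputHandle.clear();
        inputHandle.seekg(0, ios_base::beg);
        // allocate memory for the array and fill it
        int* pFileContents = new int[count];
        for (int i = 0; i < count; i++) {
            inputHandle >> pFileContents[i];
        }
        cout << pFileContents[2];   // value at index 2, not its address
        delete[] pFileContents;     // or keep the pointer as a class member instead
    }
    else {
        cout << "error";
    }
}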

Can't write string on file C++

I'm trying to emulate a 32-bit shift register, but I can't write either the input or the output to the file. The program simply runs and closes, and there is nothing written in the file. I tried some solutions here, but nothing has worked so far. The input is a string of 32 bits like this:
"00000000000000000000000000000011".
The output should look like this:
"00000000000000000000000000001100".
Shifting two bits to the left. I haven't finished the shift yet; I'm trying to understand why it doesn't show anything.
The weird thing about this is that I've created 32-bit multiplexers the same way and they work fine.
void deslocador::shift(std::vector<std::string> entrada)
{
    std::ofstream shifter("shift_left2.tv", std::ofstream::out);
    std::vector<std::string> saida;
    if (shifter.is_open())
    {
        for (unsigned int i = 0; i < 3; i++)
        {
            saida[i] = entrada[i];
            saida[i].erase(saida[i].begin());
            saida[i].erase(saida[i].begin());
            saida[i] += "00";
            shifter << entrada[i] << "_" << saida[i] << std::endl;
        }
    }
    else
    {
        std::cout << "Couldn't open file!";
    }
    shifter.close();
}
std::vector<std::string> saida;
This instantiates a new vector. Like all new vectors, it is completely empty, and contains no values.
for(unsigned int i = 0; i < 3; i++)
{
saida[i] = entrada[i];
This assigns values to saida[0] through saida[2]. Unfortunately, as we've just discovered, the saida vector is completely empty and contains nothing. This attempts to assign new values to nonexistent elements of the vector, so this is undefined behavior, and pretty much a guaranteed crash.
Additionally, no attempt is made to verify whether the entrada vector contains at least three values; this would be yet another source of undefined behavior and a crash.
It is unclear what the intent of the shown code is, so it's not possible to offer any suggestion of possible ways to fix it. The description of the code does not match its contents. It is unclear what relationship exists between "32 bits" and a vector of strings that may or may not have three values in it.
The only thing that can be determined is that you get no output because your program crashes because of undefined behavior. Before you can assign a value to the ith element in the vector, the ith element must already exist. It doesn't, in the shown code. This results in undefined behavior and a crash.
There are various ways of placing new values in a vector. A vector can be resize()d, or new values can be push_back()ed into a vector. It is unclear what should be done in this case, so for additional information and examples, see your C++ book so you can learn more about how either approach (and other approaches) work, so you can decide how you want to do what you are trying to do.
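For example (a generic sketch, not tied to the question's variable names):
#include <string>
#include <vector>

int main() {
    std::vector<std::string> v;   // starts empty: v[0] does not exist yet
    v.resize(3);                  // now v[0], v[1], v[2] exist and may be assigned
    v[0] = "alpha";
    v.push_back("delta");         // appends a 4th element; no prior resize needed
    return 0;
}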
Here is what I did to make it work:
void deslocador::shift(std::vector<std::string> entrada)
{
    std::ofstream shifter("shift_left2.tv", std::ofstream::out);
    std::vector<std::string> saida;
    saida.resize(entrada.size());
    if (shifter.is_open())
    {
        for (unsigned int i = 0; i < entrada.size(); i++)
        {
            saida[i] = entrada[i];
            saida[i].erase(saida[i].begin());
            saida[i].erase(saida[i].begin());
            saida[i] += "00";
            shifter << entrada[i] << "_" << saida[i] << std::endl;
        }
    }
    else
    {
        std::cout << "Couldn't open file!";
    }
    shifter.close();
}

Writing/reading large vectors of data to binary file in c++

I have a c++ program that computes populations within a given radius by reading gridded population data from an ascii file into a large 8640x3432-element vector of doubles. Reading the ascii data into the vector takes ~30 seconds (looping over each column and each row), while the rest of the program only takes a few seconds. I was asked to speed up this process by writing the population data to a binary file, which would supposedly read in faster.
The ascii data file has a few header rows that give some data specs like the number of columns and rows, followed by population data for each grid cell, which is formatted as 3432 rows of 8640 numbers, separated by spaces. The population data numbers are mixed formats and can be just 0, a decimal value (0.000685648), or a value in scientific notation (2.687768e-05).
I found a few examples of reading/writing structs containing vectors to binary, and tried to implement something similar, but am running into problems. When I both write and read the vector to/from the binary file in the same program, it seems to work and gives me all the correct values, but then it ends with either a "segment fault: 11" or a memory allocation error that a "pointer being freed was not allocated". And if I try to just read the data in from the previously written binary file (without re-writing it in the same program run), then it gives me the header variables just fine but gives me a segfault before giving me the vector data.
Any advice on what I might have done wrong, or on a better way to do this would be greatly appreciated! I am compiling and running on a mac, and I don't have boost or other non-standard libraries at present. (Note: I am extremely new at coding and am having to learn by jumping in the deep end, so I may be missing a lot of basic concepts and terminology -- sorry!).
Here is the code I came up with:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fstream>
#include <iostream>
#include <vector>

using namespace std;

//Define struct for population file data and initialize one struct variable for reading in ascii (A) and one for reading in binary (B)
struct popFileData
{
    int nRows, nCol;
    vector< vector<double> > popCount; //this will end up having 3432x8640 elements
} popDataA, popDataB;

int main() {
    string gridFname = "sample";
    double dum;
    vector<double> tempVector;

    //open ascii population grid file to stream
    ifstream gridFile;
    gridFile.open(gridFname + ".asc");
    int i = 0, j = 0;
    if (gridFile.is_open())
    {
        //read in header data from file
        string fileLine;
        gridFile >> fileLine >> popDataA.nCol;
        gridFile >> fileLine >> popDataA.nRows;
        popDataA.popCount.clear();
        //read in vector data, point-by-point
        for (i = 0; i < popDataA.nRows; i++)
        {
            tempVector.clear();
            for (j = 0; j < popDataA.nCol; j++)
            {
                gridFile >> dum;
                tempVector.push_back(dum);
            }
            popDataA.popCount.push_back(tempVector);
        }
        //close ascii grid file
        gridFile.close();
    }
    else
    {
        cout << "Population file read failed!" << endl;
    }

    //create/open binary file
    ofstream ofs(gridFname + ".bin", ios::trunc | ios::binary);
    if (ofs.is_open())
    {
        //write struct to binary file then close binary file
        ofs.write((char *)&popDataA, sizeof(popDataA));
        ofs.close();
    }
    else cout << "error writing to binary file" << endl;

    //read data from binary file into popDataB struct
    ifstream ifs(gridFname + ".bin", ios::binary);
    if (ifs.is_open())
    {
        ifs.read((char *)&popDataB, sizeof(popDataB));
        ifs.close();
    }
    else cout << "error reading from binary file" << endl;

    //compare results of reading in from the ascii file and reading in from the binary file
    cout << "File Header Values:\n";
    cout << "Columns (ascii vs binary): " << popDataA.nCol << " vs. " << popDataB.nCol << endl;
    cout << "Rows (ascii vs binary):" << popDataA.nRows << " vs." << popDataB.nRows << endl;
    cout << "Spot Check Vector Values: " << endl;
    cout << "Index 0,0: " << popDataA.popCount[0][0] << " vs. " << popDataB.popCount[0][0] << endl;
    cout << "Index 3431,8639: " << popDataA.popCount[3431][8639] << " vs. " << popDataB.popCount[3431][8639] << endl;
    cout << "Index 1600,4320: " << popDataA.popCount[1600][4320] << " vs. " << popDataB.popCount[1600][4320] << endl;
    return 0;
}
Here is the output when I both write and read the binary file in the same run:
File Header Values:
Columns (ascii vs binary): 8640 vs. 8640
Rows (ascii vs binary):3432 vs.3432
Spot Check Vector Values:
Index 0,0: 0 vs. 0
Index 3431,8639: 0 vs. 0
Index 1600,4320: 25.2184 vs. 25.2184
a.out(11402,0x7fff77c25310) malloc: *** error for object 0x7fde9821c000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
And here is the output I get if I just try to read from the pre-existing binary file:
File Header Values:
Columns (binary): 8640
Rows (binary):3432
Spot Check Vector Values:
Segmentation fault: 11
Thanks in advance for any help!
When you write popDataA to the file, you are writing the binary representation of the vector of vectors. However this really is quite a small object, consisting of a pointer to the actual data (itself a series of vectors, in this case) and some size information.
When it's read back in to popDataB, it kinda works! But only because the raw pointer that was in popDataA is now in popDataB, and it points to the same stuff in memory. Things go crazy at the end, because when the memory for the vectors is freed, the code tries to free the data referenced by popDataA twice (once for popDataA, and once again for popDataB.)
The short version is, it's not a reasonable thing to write a vector to a file in this fashion.
So what to do? The best approach is to first decide on your data representation. It will, like the ASCII format, specify what value gets written where, and will include information about the matrix size, so that you know how large a vector you will need to allocate when reading them in.
In semi-pseudo code, writing will look something like:
int nrow = ...;
int ncol = ...;
ofs.write((char *)&nrow, sizeof(nrow));
ofs.write((char *)&ncol, sizeof(ncol));
for (int i = 0; i < nrow; ++i) {
    for (int j = 0; j < ncol; ++j) {
        double val = data[i][j];
        ofs.write((char *)&val, sizeof(val));
    }
}
And reading will be the reverse:
ifs.read((char *)&nrow, sizeof(nrow));
ifs.read((char *)&ncol, sizeof(ncol));
// allocate data-structure of size nrow x ncol
// ...
for (int i = 0; i < nrow; ++i) {
    for (int j = 0; j < ncol; ++j) {
        double val;
        ifs.read((char *)&val, sizeof(val));
        data[i][j] = val;
    }
}
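If the in-memory structure is the vector< vector<double> > from the question, the allocation step in that read sketch could be, for instance:
// Size the nested vector up front so data[i][j] is valid inside the read loop.
std::vector< std::vector<double> > data(nrow, std::vector<double>(ncol));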
All that said though, you should consider not writing things into a binary file like this. These sorts of ad hoc binary formats tend to live on, long past their anticipated utility, and tend to suffer from:
Lack of documentation
Lack of extensibility
Format changes without versioning information
Issues when using saved data across different machines, including endianness problems, different default sizes for integers, etc.
Instead, I would strongly recommend using a third-party library. For scientific data, HDF5 and netcdf4 are good choices which address all of the above issues for you, and come with tools that can inspect the data without knowing anything about your particular program.
Lighter-weight options include the Boost serialization library and Google's protocol buffers, but these address only some of the issues listed above.

Array of Strings to Output File

I am writing a C++ program in which I have dynamically created an array of strings. I have written functions to output both the number of items in the string array and the array itself. The next thing I wanted to do is store the elements of the array in a text file, but when I open the file I have written to, only the last element of the array shows up. Here is a sample of what I am doing:
int num_elem = ReadNumElem(); // my function that gets the number of elements in the array of strings
string *MyStringArray = ReadNames(num_elem); // my function that reads a file and outputs the necessary strings into the array
for (int i = 0; i < num_elem; ++i) {
    ofstream ofs("C:\\Test\\MyStrings.txt");
    ofs << MyStringArray[i] << endl; // I also tried replacing endl with a "\n"
}
I am new to C++, so I apologize if this is too simple, but I have been searching for some time now, and I can't seem to find a solution. The first two functions are not relevant, I only need to know how to output the data into a text file so that all the data shows up, not just the final element in the array. Thanks!
You are opening the file on every iteration of the loop and overwriting its contents.
Try:
ofstream ofs("C:\\Test\\MyStrings.txt");
for (int i = 0; i < num_elem; ++i) {
    ofs << MyStringArray[i] << endl; // I also tried replacing endl with a "\n"
}
ofs.close();
You need to declare the file outside of the loop

Several ifstreams access-violation

I am trying to implement an external merge sort, and I want to open 2048 ifstreams and read data into per-file buffers.
ifstream *file;
file = (ifstream *)malloc(2048 * sizeof(ifstream));
for (short i = 0; i < 2048; i++) {
    itoa(i, fileName + 5, 10);
    file[i].open(fileName, ios::in | ios::binary); // Access violation Error
    if (!file[i]) {
        cout << i << ".Bad open file" << endl;
    }
    if (!file[i].read((char*)perfile[i], 128*4)) {
        cout << i << ". Bad read source file" << endl;
    }
}
But, it crashes with
Unhandled exception at 0x58f3a5fd (msvcp100d.dll) in sorting.exe: 0xC0000005: Access violation reading location 0xcdcdcdfd.
Is it possible to use this many open ifstreams?
Or maybe it is a very bad idea to have 2048 ifstreams open, and there is a better way to implement this algorithm?
Arrays of non-POD objects are allocated with new, not with malloc, otherwise the constructors aren't run.
Your code is getting uninitialized memory and "interpreting" it as ifstreams, which obviously results in a crash (because the constructors haven't been run, not even the virtual table pointers are in place).
You can either allocate all your objects on the stack:
ifstream file[2048];
or allocate them on the heap if stack occupation is a concern:
ifstream *file=new ifstream[2048];
// ...
delete[] file; // frees the array
(although you should use a smart pointer here to avoid memory leaks in case of exceptions; a sketch appears after these options)
or, better, use a vector of ifstream (requires header <vector>):
vector<ifstream> file(2048);
which does not require explicit deallocation of its elements.
(in theory, you could use malloc and then use placement new, but I wouldn't recommend it at all)
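As a sketch of the smart-pointer variant mentioned above (assuming C++11 or later; the file name is only illustrative):
#include <fstream>
#include <memory>

int main() {
    // Heap-allocated array of ifstreams; the array (and every stream in it) is
    // destroyed automatically when 'file' goes out of scope, even if an
    // exception is thrown, so no explicit delete[] is needed.
    std::unique_ptr<std::ifstream[]> file(new std::ifstream[2048]);
    file[0].open("part0.bin", std::ios::in | std::ios::binary);
    return 0;
}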
... besides, opening 2048 files at the same time doesn't feel like a great idea...
This is C++. ifstream is non-POD, so you can't just malloc it: the instances need to be constructed.
ifstream file[2048];
for (short i = 0; i < 2048; i++) {
    itoa(i, fileName + 5, 10);
    file[i].open(fileName, ios::in | ios::binary);
    if (!file[i]) {
        cout << i << ".Bad open file" << endl;
    }
    if (!file[i].read((char*)perfile[i], 128*4)) {
        cout << i << ". Bad read source file" << endl;
    }
}
Besides that, opening 2048 files doesn't sound like a good plan, but you can figure that out later
The value 0xcdcdcdcd is used by VS in debug mode to represent uninitialized memory (also keep an eye out for 0xbaadf00d).
You are using malloc which is of C heritage and does not call constructors, it simply gives you a pointer to a chunk of data. An ifstream is not a POD (Plain Old Data) type; it needs you to call its constructor in order to initialize properly. This is C++; use new and delete.
Better yet, don't use either; just construct the thing on the stack and let it handle dynamic memory allocation as it was meant to be used.
Of course, this doesn't even touch on the horrible idea to open 2048 files, but you should probably learn that one the hard way...
You may not be able to open 2048 files at once; there is typically an operating-system limit on the number of simultaneously open files.
As far as I can see, you don't really need an array of 2048 separate ifstreams here at all. You only need one ifstream at any given time, so each iteration you close one file and open another. Destroying an ifstream closes the file automatically, so you can do something like this:
for (short i = 0; i < 2048; i++) {
    itoa(i, fileName + 5, 10);
    ifstream file(fileName, ios::in | ios::binary);
    if (!file) {
        cout << i << ".Bad open file" << endl;
    }
    if (!file.read((char*)perfile[i], 128*4)) {
        cout << i << ". Bad read source file" << endl;
    }
}