Several ifstreams access-violation - c++

I am trying to implement an external merge sort (wiki), and I want to open 2048 ifstreams and read data into per-file buffers.
ifstream *file;
file = (ifstream *)malloc(2048 * sizeof(ifstream));
for (short i = 0; i < 2048; i++) {
itoa(i, fileName + 5, 10);
file[i].open(fileName, ios::in | ios::binary); // Access violation Error
if (!file[i]) {
cout << i << ".Bad open file" << endl;
}
if (!file[i].read((char*)perfile[i], 128*4)) {
cout << i << ". Bad read source file" << endl;
}
}
But, it crashes with
Unhandled exception at 0x58f3a5fd (msvcp100d.dll) in sorting.exe: 0xC0000005: Access violation reading location 0xcdcdcdfd.
Is it possible to have that many ifstreams open at once?
Or is it a very bad idea to have 2048 open ifstreams, and is there a better way to implement this algorithm?

Arrays of non-POD objects must be allocated with new, not with malloc; otherwise the constructors aren't run.
Your code takes uninitialized memory and "interprets" it as ifstreams, which results in a crash: because the constructors have not been run, not even the virtual table pointers are in place.
You can either allocate all your objects on the stack:
ifstream file[2048];
or allocate them on the heap if stack usage is a concern:
ifstream *file=new ifstream[2048];
// ...
delete[] file; // frees the array
(although you should use a smart pointer here to avoid memory leaks in case of exceptions)
or, better, use a vector of ifstream (requires the <vector> header and C++11 or later, since streams are movable but not copyable):
vector<ifstream> file(2048);
which does not require explicit deallocation of its elements.
(in theory, you could use malloc and then use placement new, but I wouldn't recommend it at all)
... besides, opening 2048 files at the same time doesn't feel like a great idea...
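A minimal sketch of the vector approach, assuming C++11 or later. The helper name openChunks and the prefix-plus-number file naming are hypothetical; they stand in for the itoa-based fileName from the question.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: opens files named prefix0, prefix1, ... and returns the streams.
// Every element is a properly constructed ifstream, unlike memory obtained from malloc.
std::vector<std::ifstream> openChunks(const std::string& prefix, std::size_t count) {
    std::vector<std::ifstream> files(count);  // default-constructs `count` streams
    for (std::size_t i = 0; i < count; ++i) {
        files[i].open(prefix + std::to_string(i), std::ios::in | std::ios::binary);
        // A failed open leaves files[i] in a failed state; the caller can test !files[i].
    }
    return files;  // the streams are moved out, and closed when the vector is destroyed
}
```

Whether all of the opens succeed still depends on the open-file limits of the platform, so the caller should check each stream.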

This is C++. ifstream is non-POD, so you can't just malloc it: the instances need to be constructed.
ifstream file[2048];
for (short i = 0; i < 2048; i++) {
itoa(i, fileName + 5, 10);
file[i].open(fileName, ios::in | ios::binary); // the streams are constructed now, so this no longer crashes
if (!file[i]) {
cout << i << ".Bad open file" << endl;
}
if (!file[i].read((char*)perfile[i], 128*4)) {
cout << i << ". Bad read source file" << endl;
}
}
Besides that, opening 2048 files doesn't sound like a good plan, but you can figure that out later

The value 0xcdcdcdcd is used by VS in debug mode to represent uninitialized memory (also keep an eye out for 0xbaadf00d).
You are using malloc which is of C heritage and does not call constructors, it simply gives you a pointer to a chunk of data. An ifstream is not a POD (Plain Old Data) type; it needs you to call its constructor in order to initialize properly. This is C++; use new and delete.
Better yet, don't use either; just construct the thing on the stack and let it handle dynamic memory allocation as it was meant to be used.
Of course, this doesn't even touch on the horrible idea to open 2048 files, but you should probably learn that one the hard way...

You may not be able to open 2048 files at once: both the operating system and the C runtime library limit how many files a process can have open simultaneously.

As far as I can see, you don't really need an array of 2048 separate ifstreams here at all. You only need one ifstream at any given time, so each iteration you close one file and open another. Destroying an ifstream closes the file automatically, so you can do something like this:
for (short i = 0; i < 2048; i++) {
itoa(i, fileName + 5, 10);
ifstream file(fileName, ios::in | ios::binary);
if (!file) {
cout << i << ".Bad open file" << endl;
}
if (!file.read((char*)perfile[i], 128*4)) {
cout << i << ". Bad read source file" << endl;
}
}

Related

Potential memory leaks in reading binary files

I would like to ask if this part of the code might suffer from memory leaks (I'm quite sure it does, but how severely?).
The "input" variable is a pointer to double, i.e. double* input. The reason I didn't use float (more compatible in this case) is because I wanted to maintain compatibility with other parts of the code.
else if (filetype == "BinaryFile")
{
char* memblock;
std::ifstream file(filename1, std::ios::binary | std::ios::in);
file.seekg(0, std::ios::end);
int size = file.tellg();
file.seekg(0, std::ios::beg);
std::cout << "Size=" << size << " [in bytes]"
<< "\n";
std::cout << "There are overall " << grid_points << "^3 = " << std::setprecision(10) << pow(grid_points, 3) << " values of the field1, written as float type.\n";
memblock = new char[size];
file.seekg(0, std::ios::beg);
file.read(memblock, size);
file.close();
float* values = (float*)memblock; //reinterpret as float, because the file was saved as float
for (int i = 0; i < grid_points * grid_points * grid_points; i++) {
input1[i] = (double)values[i]; //cast to double, since input1 is an array of doubles
}
file.close();
delete[] memblock;
}
The files that I need to work on are quite big, coming from cosmological simulations; for example one file is 4GB and the other could be 20 GB. I'm using the supercomputer infrastructure for that reason.
This kind of reading works for files that have 512^3 float values (e.g. density evaluated on points in a cube of side 512), but memory leaks happen for a file with 1024^3 entries.
I had thought I should delete[] the "values" array, but when I do that, I get even worse memory leaks, crashing my program even in the case where previously all was calculated correctly (512^3).
How could I improve on this code? I would have used the std::vector container but I had to use the FFTW library.
EDIT:
Following the suggestions in the comments, I have rewritten the reading part of the code as:
std::ifstream file(filename1,std::ios::binary);
std::vector<float> buf(pow(grid_points,3));
file.read(reinterpret_cast<char*>(buf.data()), buf.size()*sizeof(float));
std::copy_n(buf.begin(),pow(grid_points,3),input1);
Where I explicitly make use of the knowledge of how many elements there will be in the input1 array. No memory leaks occur now.
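If holding a float buffer for the whole file next to input1 takes too much memory, the conversion can also be done in chunks. Below is a minimal sketch under that assumption; the function name readFloatsAsDoubles and the chunk size are hypothetical. Note also that `int size = file.tellg();` in the first version cannot represent a 4 GiB file size when int is 32 bits wide, which may be what went wrong for 1024^3 values.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Reads `count` floats from a binary file into `out` (room for at least `count` doubles),
// converting in fixed-size chunks so the temporary buffer stays small.
// Returns false if the file cannot be opened or holds fewer than `count` floats.
bool readFloatsAsDoubles(const std::string& path, double* out, std::size_t count) {
    std::ifstream file(path, std::ios::binary);
    if (!file) return false;

    const std::size_t chunk = std::size_t(1) << 20;  // floats per read; an arbitrary choice
    std::vector<float> buf(std::min(chunk, count));
    std::size_t done = 0;
    while (done < count) {
        const std::size_t n = std::min(chunk, count - done);
        file.read(reinterpret_cast<char*>(buf.data()),
                  static_cast<std::streamsize>(n * sizeof(float)));
        if (!file) return false;  // short read or I/O error
        // Copying float elements into a double array converts each value.
        std::copy(buf.begin(), buf.begin() + static_cast<std::ptrdiff_t>(n), out + done);
        done += n;
    }
    return true;
}
```

Since `out` is a plain pointer, this works with an array allocated for FFTW just as well as with a std::vector's data().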

Writing/reading large vectors of data to binary file in c++

I have a c++ program that computes populations within a given radius by reading gridded population data from an ascii file into a large 8640x3432-element vector of doubles. Reading the ascii data into the vector takes ~30 seconds (looping over each column and each row), while the rest of the program only takes a few seconds. I was asked to speed up this process by writing the population data to a binary file, which would supposedly read in faster.
The ascii data file has a few header rows that give some data specs like the number of columns and rows, followed by population data for each grid cell, which is formatted as 3432 rows of 8640 numbers, separated by spaces. The population data numbers are mixed formats and can be just 0, a decimal value (0.000685648), or a value in scientific notation (2.687768e-05).
I found a few examples of reading/writing structs containing vectors to binary, and tried to implement something similar, but am running into problems. When I both write and read the vector to/from the binary file in the same program, it seems to work and gives me all the correct values, but then it ends with either a "segment fault: 11" or a memory allocation error that a "pointer being freed was not allocated". And if I try to just read the data in from the previously written binary file (without re-writing it in the same program run), then it gives me the header variables just fine but gives me a segfault before giving me the vector data.
Any advice on what I might have done wrong, or on a better way to do this would be greatly appreciated! I am compiling and running on a mac, and I don't have boost or other non-standard libraries at present. (Note: I am extremely new at coding and am having to learn by jumping in the deep end, so I may be missing a lot of basic concepts and terminology -- sorry!).
Here is the code I came up with:
# include <stdio.h>
# include <stdlib.h>
# include <string.h>
# include <fstream>
# include <iostream>
# include <vector>
# include <string.h>
using namespace std;
//Define struct for population file data and initialize one struct variable for reading in ascii (A) and one for reading in binary (B)
struct popFileData
{
int nRows, nCol;
vector< vector<double> > popCount; //this will end up having 3432x8640 elements
} popDataA, popDataB;
int main() {
string gridFname = "sample";
double dum;
vector<double> tempVector;
//open ascii population grid file to stream
ifstream gridFile;
gridFile.open(gridFname + ".asc");
int i = 0, j = 0;
if (gridFile.is_open())
{
//read in header data from file
string fileLine;
gridFile >> fileLine >> popDataA.nCol;
gridFile >> fileLine >> popDataA.nRows;
popDataA.popCount.clear();
//read in vector data, point-by-point
for (i = 0; i < popDataA.nRows; i++)
{
tempVector.clear();
for (j = 0; j<popDataA.nCol; j++)
{
gridFile >> dum;
tempVector.push_back(dum);
}
popDataA.popCount.push_back(tempVector);
}
//close ascii grid file
gridFile.close();
}
else
{
cout << "Population file read failed!" << endl;
}
//create/open binary file
ofstream ofs(gridFname + ".bin", ios::trunc | ios::binary);
if (ofs.is_open())
{
//write struct to binary file then close binary file
ofs.write((char *)&popDataA, sizeof(popDataA));
ofs.close();
}
else cout << "error writing to binary file" << endl;
//read data from binary file into popDataB struct
ifstream ifs(gridFname + ".bin", ios::binary);
if (ifs.is_open())
{
ifs.read((char *)&popDataB, sizeof(popDataB));
ifs.close();
}
else cout << "error reading from binary file" << endl;
//compare results of reading in from the ascii file and reading in from the binary file
cout << "File Header Values:\n";
cout << "Columns (ascii vs binary): " << popDataA.nCol << " vs. " << popDataB.nCol << endl;
cout << "Rows (ascii vs binary):" << popDataA.nRows << " vs." << popDataB.nRows << endl;
cout << "Spot Check Vector Values: " << endl;
cout << "Index 0,0: " << popDataA.popCount[0][0] << " vs. " << popDataB.popCount[0][0] << endl;
cout << "Index 3431,8639: " << popDataA.popCount[3431][8639] << " vs. " << popDataB.popCount[3431][8639] << endl;
cout << "Index 1600,4320: " << popDataA.popCount[1600][4320] << " vs. " << popDataB.popCount[1600][4320] << endl;
return 0;
}
Here is the output when I both write and read the binary file in the same run:
File Header Values:
Columns (ascii vs binary): 8640 vs. 8640
Rows (ascii vs binary):3432 vs.3432
Spot Check Vector Values:
Index 0,0: 0 vs. 0
Index 3431,8639: 0 vs. 0
Index 1600,4320: 25.2184 vs. 25.2184
a.out(11402,0x7fff77c25310) malloc: *** error for object 0x7fde9821c000: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
And here is the output I get if I just try to read from the pre-existing binary file:
File Header Values:
Columns (binary): 8640
Rows (binary):3432
Spot Check Vector Values:
Segmentation fault: 11
Thanks in advance for any help!
When you write popDataA to the file, you are writing the binary representation of the vector of vectors. However this really is quite a small object, consisting of a pointer to the actual data (itself a series of vectors, in this case) and some size information.
When it's read back in to popDataB, it kinda works! But only because the raw pointer that was in popDataA is now in popDataB, and it points to the same stuff in memory. Things go crazy at the end, because when the memory for the vectors is freed, the code tries to free the data referenced by popDataA twice (once for popDataA, and once again for popDataB.)
The short version is, it's not a reasonable thing to write a vector to a file in this fashion.
So what to do? The best approach is to first decide on your data representation. It will, like the ASCII format, specify what value gets written where, and will include information about the matrix size, so that you know how large a vector you will need to allocate when reading them in.
In semi-pseudo code, writing will look something like:
int nrow=...;
int ncol=...;
ofs.write((char *)&nrow,sizeof(nrow));
ofs.write((char *)&ncol,sizeof(ncol));
for (int i=0;i<nrow;++i) {
for (int j=0;j<ncol;++j) {
double val=data[i][j];
ofs.write((char *)&val,sizeof(val));
}
}
And reading will be the reverse:
ifs.read((char *)&nrow,sizeof(nrow));
ifs.read((char *)&ncol,sizeof(ncol));
// allocate data-structure of size nrow x ncol
// ...
for (int i=0;i<nrow;++i) {
for (int j=0;j<ncol;++j) {
double val;
ifs.read((char *)&val,sizeof(val));
data[i][j]=val;
}
}
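The semi-pseudo code above can be filled out into something compilable. This is only a sketch under some assumptions: the Grid alias and the names writeGrid and readGrid are made up for illustration, every row is assumed to have the same length, and the file is read on the same kind of machine that wrote it (no endianness handling). Each row is written with a single call because a vector<double> stores its elements contiguously.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

using Grid = std::vector<std::vector<double>>;

// Writes the dimensions as fixed-width integers, then each row as one contiguous block.
bool writeGrid(const std::string& path, const Grid& grid) {
    std::ofstream ofs(path, std::ios::binary | std::ios::trunc);
    if (!ofs) return false;
    const std::int32_t nrow = static_cast<std::int32_t>(grid.size());
    const std::int32_t ncol = nrow ? static_cast<std::int32_t>(grid[0].size()) : 0;
    ofs.write(reinterpret_cast<const char*>(&nrow), sizeof nrow);
    ofs.write(reinterpret_cast<const char*>(&ncol), sizeof ncol);
    for (const auto& row : grid)  // assumes every row holds exactly ncol values
        ofs.write(reinterpret_cast<const char*>(row.data()),
                  static_cast<std::streamsize>(ncol * sizeof(double)));
    return static_cast<bool>(ofs);
}

// Reads the dimensions, allocates the grid, then fills it row by row.
bool readGrid(const std::string& path, Grid& grid) {
    std::ifstream ifs(path, std::ios::binary);
    std::int32_t nrow = 0, ncol = 0;
    ifs.read(reinterpret_cast<char*>(&nrow), sizeof nrow);
    ifs.read(reinterpret_cast<char*>(&ncol), sizeof ncol);
    if (!ifs || nrow < 0 || ncol < 0) return false;
    grid.assign(static_cast<std::size_t>(nrow),
                std::vector<double>(static_cast<std::size_t>(ncol)));
    for (auto& row : grid)
        ifs.read(reinterpret_cast<char*>(row.data()),
                 static_cast<std::streamsize>(ncol * sizeof(double)));
    return static_cast<bool>(ifs);
}
```

Writing whole rows rather than single values keeps the number of I/O calls down, which matters for a grid of 3432 x 8640 cells.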
All that said though, you should consider not writing things into a binary file like this. These sorts of ad hoc binary formats tend to live on, long past their anticipated utility, and tend to suffer from:
Lack of documentation
Lack of extensibility
Format changes without versioning information
Issues when using saved data across different machines, including endianness problems, different default sizes for integers, etc.
Instead, I would strongly recommend using a third-party library. For scientific data, HDF5 and netcdf4 are good choices which address all of the above issues for you, and come with tools that can inspect the data without knowing anything about your particular program.
Lighter-weight options include the Boost serialization library and Google's protocol buffers, but these address only some of the issues listed above.

Issue Read/Write array of structure in binary file in c++

I want to write my array of structures to a binary file.
My structure
typedef struct student{
char name[15];
vector<int> grade;
}arr_stu;
I can write and read back my data if I write and read in the same program, but if I create another program that only reads the data from the binary file, it does not work because the vector grade is empty:
size = 0;
unable to read from memory
Program to write array structure to file
int main()
{
arr_stu stu[100];
for (size_t i = 0; i < 100; i++)
{
strcpy(stu[i].name, randomName());
for (size_t j = 0; j < 10; j++)
{
stu[i].grade.push_back(randomGrade());
}
}
ofstream outbal("class", ios::out | ios::binary);
if (!outbal) {
cout << "Cannot open file.\n";
return 1;
}
outbal.write((char *)&stu, sizeof(stu));
outbal.close();
}
Program to read array structure from file
int main(){
feature_struc stu[100];
ifstream inbal("class", ios::in | ios::binary);
if (!inbal) {
cout << "Cannot open file.\n";
return 1;
}
inbal.read((char *)&stu, sizeof(stu));
for (size_t idx = 0; idx < 100; idx++)
{
cout << "Name : " << stu[idx].name << endl;
for (size_t index = 0; index < 10; index++)
{
cout << endl << "test: " << stu[idx].grade[index] << endl;
}
}
inbal.close();
return 0;
}
To me it seems that the use of vector is what causes the problem.
I think the reason it works when the two are combined in one program is that the vector's data is still in memory, so it remains accessible.
Any suggestions?
You cannot serialize a vector like that. The write and read functions access the memory at the given address directly. Since vector is a complex class type only parts of its data content are stored sequentially at its base address. Other parts (heap allocated memory etc) are located elsewhere. The simplest solution would be to write the length of the vector to the file followed by each of the values. You have to loop over the vector elements to accomplish that.
outbal.write((char *)&stu, sizeof(stu));
The sizeof is a compile-time constant. In other words, it never changes. If the vector contained 1, 10, 1000, or 1,000,000 items, you're writing the same number of bytes to the file. So this way of writing to the file is totally wrong.
The struct that you're writing is non-POD due to the vector being a non-POD type. This means you can't just treat it as a set of bytes that you can copy from or to. If you want further proof, open the file you created in any editor. Can you see the data from the vector in that file? What you will see is more than likely, gibberish.
To write the data to the file, you have to properly serialize the data, meaning you have to write the data to a file, not the struct itself. You write the data in a way so that when you read the data back, you can recreate the struct. Ultimately, this means you have to
Write the name to the file, and possibly the number of bytes the name consists of.
Write the number of items in the vector
Write each vector item to the file.
If not this, then some way where you can distinctly figure out the name and the vector's data from the file so that your code to read the data parses the file correctly and recreates the struct.
What is the format of the binary file? Basically, you have to define the format, and then convert each element to and from that format. You can never just dump the bits of an internal representation to disk and expect to be able to read it back. (The fact that you need a reinterpret_cast to call ostream::write on your object should tell you something.)

saving/writing time to txt file in C++

I want to save the time into an existing txt file so that I know when a particular record was added. I have this code to get the current time:
time_t Now1;
struct tm * timeinfo;
char time1[20];
time(&Now1);
timeinfo = localtime(&Now1);
strftime(time1, 20, "%d/%m/%Y %H:%M", timeinfo);
Passports.Record_Added_On = time1;
cout << "\n\nRecord Added On: " << Passports.Record_Added_On;
code of reading the txt file:
fs = new fstream(Passports_FILE_NAME, ios::in | ios::out | ios::binary);
if (!fs)
{
cout << "\n Can't open or create '" << Passports_FILE_NAME << "' file" << "\n";
system("pause");
break;
}
recs_num = -1;
while (fs->read((char *)&Passports, sizeof(Passports)))
{
recs_num++;
if (Passports.ID == id && !Passports.Deleted)
break;
}
if (fs->eof()) //if (the record is not in the file || it's there but it's Deleted)
{
cout << "\nThe specific passports record does not exists in the file.";
closeFile(fs);
cout << "\n\nExit\n";
return 0;
}
It worked fine, even when I displayed the record. However, when I closed the program and opened it again,
it showed weird characters like this, or sometimes crashed. Can anyone help me with this and explain the reason behind it?
You cannot initialize non-POD objects (such as std::string) by reading raw bytes into their memory. This leads to crashes because the object's constructor never ran, so there is no guarantee that all of its members are properly initialized. Suppose the class has a member that is a pointer to memory allocated for your data. You save the object to a file, then run your application again and load it. The pointer gets the same numeric value, but the memory it pointed to no longer exists: memory has to be allocated through the operating system, and in the new run no such allocation was made. So the next time you use this badly initialized object, it crashes because it tries to work with memory that does not belong to your application.
To fix this crash, you should parse the file contents properly and then fill in your Passports record accordingly.
For example:
int int_value;
std::string string_value;
*fs >> int_value >> string_value; // fs is a pointer to an fstream in the question's code
after that you can initialize your object:
Passports.ID = int_value;
Passports.Name = string_value;
and so on. You should save the data to the file the same way (this is called serialization; I recommend you read more on the topic).

Delete/Modify an element in a struct array

I have a program that stores a "friends" info into a struct array and writes it to a file. No problem there. But how would I be able to modify and/or delete a specific element in that struct array? Did some reading and it says I can't, unless I shift everything over by one after deleting it.
So I'm assuming I need to read it in, then remove it, and shift all other elements over by one and write it again...but how would I do this? Tried to include only the necessary code I have so far.
For modifying it, I would guess I'd read it in, then ask for the specific element # I want to change, then set all the values in that element to null, then allow the user to input new info? How would this code look?
struct FriendList
{
char screenname[32];
char country[32];
char city[32];
char interests[32];
short age;
};
int main()
{
FriendList friends[num_friends];
const int num_friends = 2;
// Gets user input and puts it into struct array and writes to file
case 3:
{ // Getting info and putting in struct elements
for (index = 0; index < num_friends; index++)
{
// Create Friend Records
cout << "Enter Screename " << endl;
cin.ignore();
cin.getline(friends[index].screenname, 32);
cout << "Country: " << endl;
cin >> friends[index].country;
cout << "City: " << endl;
cin >> friends[index].city;
cout << "Age: " << endl;
cin >> friends[index].age;
}
counting += index;
fstream infile;
infile.open("friends.dat", ios::out | ios::binary |ios::app);
if(infile.fail())
{ cout << "File not found!\n\t";
// exit
}
// Writing struct to file
infile.write((char*)&friends, sizeof(friends));
infile.close();
break;
}
// Delete a friend ???
case 5:
{ // Reading in file contents into struct friends
// Then????
fstream outfile;
outfile.open("friends.dat", ios::in | ios::binary);
outfile.read((char*)&friends, sizeof(friends));
break;
}
Yes, you can modify a member of the struct. But because you don't clear the memory first, you will see garbage in friends.dat.
At the top of main, you had better add a memset():
memset(&friends, 0, sizeof(friends));
You also use ios::app. I guess friends holds the full data set, so shouldn't you remove ios::app?
BTW, these days most C++ programmers don't use a binary file for a case like this. :)
Modifying is relatively easy: just read the right entry, update it in memory, and write it back. To delete, I suggest the following:
Read all the entries after the one you need to delete
Write those entries at the offset of the deleted entry
Truncate the file to the new length
This is the trivial approach.
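The three steps could be sketched as follows. This assumes C++17, because standard streams cannot truncate a file and the sketch uses std::filesystem::resize_file for that step; the function name deleteRecord is made up, and FriendList is the struct from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <system_error>
#include <vector>

struct FriendList {
    char screenname[32];
    char country[32];
    char city[32];
    char interests[32];
    short age;
};

// Removes record number `index` (0-based) from a file of fixed-size FriendList records.
bool deleteRecord(const std::string& path, std::size_t index) {
    namespace fs = std::filesystem;
    std::error_code ec;
    const std::uintmax_t bytes = fs::file_size(path, ec);
    if (ec) return false;
    const std::size_t total = static_cast<std::size_t>(bytes / sizeof(FriendList));
    if (index >= total) return false;

    std::fstream f(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!f) return false;

    // 1. Read all the entries after the one to delete.
    std::vector<FriendList> tail(total - index - 1);
    const std::streamsize tailBytes =
        static_cast<std::streamsize>(tail.size() * sizeof(FriendList));
    f.seekg(static_cast<std::streamoff>((index + 1) * sizeof(FriendList)));
    f.read(reinterpret_cast<char*>(tail.data()), tailBytes);

    // 2. Write them back, starting at the offset of the deleted entry.
    f.seekp(static_cast<std::streamoff>(index * sizeof(FriendList)));
    f.write(reinterpret_cast<const char*>(tail.data()), tailBytes);
    if (!f) return false;
    f.close();

    // 3. Truncate the file so that it is one record shorter.
    fs::resize_file(path, (total - 1) * sizeof(FriendList), ec);
    return !ec;
}
```

Copying the struct as raw bytes is fine here because FriendList contains only character arrays and a short, with no pointers or class-type members.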
It sounds like you want either a std::deque or a std::vector depending on usage.
If you are deleting items infrequently, then use the std::vector instead of a fixed array:
std::vector<FriendList> friends;
to add new friends:
friends.push_back(newFriend);
accessing a friend by index is the same as accessing an array:
friends[index]
to delete an entry in the vector, use erase() (not remove()!):
friends.erase(friends.begin() + index)
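Tied back to the file, a minimal sketch might load every record into a vector, erase one, and rewrite the file. The function names loadFriends, saveFriends and eraseFriend are hypothetical; FriendList is the struct from the question, and the file is assumed to contain nothing but such fixed-size records.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

struct FriendList {
    char screenname[32];
    char country[32];
    char city[32];
    char interests[32];
    short age;
};

// Loads every fixed-size record from the file into a vector.
std::vector<FriendList> loadFriends(const std::string& path) {
    std::vector<FriendList> friends;
    std::ifstream in(path, std::ios::binary);
    FriendList f;
    while (in.read(reinterpret_cast<char*>(&f), sizeof f))
        friends.push_back(f);
    return friends;
}

// Overwrites the file with the current contents of the vector.
bool saveFriends(const std::string& path, const std::vector<FriendList>& friends) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(friends.data()),
              static_cast<std::streamsize>(friends.size() * sizeof(FriendList)));
    return static_cast<bool>(out);
}

// Deletes the friend at `index`, if there is one, and rewrites the file.
bool eraseFriend(const std::string& path, std::size_t index) {
    std::vector<FriendList> friends = loadFriends(path);
    if (index >= friends.size()) return false;
    friends.erase(friends.begin() + static_cast<std::ptrdiff_t>(index));
    return saveFriends(path, friends);
}
```

Rewriting the whole file is simpler than shifting records in place, at the cost of reading and writing everything on each deletion.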
You could write a delete function that takes the friend after the one you want to delete, copies its info into the current struct, and continues until there are no more friends.