Retrieving data from a binary file gives nonsense characters - C++

I am trying to retrieve some data from a binary file to put them in a linked list, here's my code to write to the file:
void Pila::memorizzafile()
{
int contatore = 0;
puntarec temp = puntatesta;
ofstream miofile;
miofile.open("data.dat" , ios::binary | ios::out);
if(!miofile) cerr << "errore";
else
{
while(temp)
{
temp->elem.writes(miofile);
contatore++;
temp = temp->next;
}
//I go back at the beginning of the file to write how many elements I have
miofile.seekp(0, ios::beg);
miofile.write((const char *)&contatore , sizeof(int));
miofile.close();
}
}
And the function writes:
void Fiche::writes(ofstream &miofile)
{
//Valore.
miofile.write((const char *)&Valore,sizeof(int));
//Materiale, I write the dimension of the string.
int buff = strlen(Materiale);
miofile.write((const char *)&buff,sizeof(int));
//Writing the string
miofile.write(Materiale,buff*sizeof(char));
//Dimension of Forma
buff = strlen(Forma);
miofile.write((const char*)&buff,sizeof(int));
//The string itself
miofile.write(Forma,buff*sizeof(char));
//Dimension of Colore.
buff = strlen(Colore);
miofile.write((const char*)&buff,sizeof(int));
//The string
miofile.write(Colore,buff*sizeof(char));
}
Now for the reading part, I am trying to make a constructor which should be able to read directly from the file, here it is:
Pila::Pila(char * nomefile)
{
puntatesta = 0;
int contatore = 0;
ifstream miofile;
miofile.open(nomefile , ios::binary | ios::in);
if(!miofile) cerr << "errore";
else
{
//I read how many records are stored in the file
miofile.read((char*)&contatore,sizeof(int));
Fiche temp;
for(int i = 0; i < contatore; i++)
{
temp.reads(miofile);
push(temp);
}
miofile.close();
}
}
And the reading function:
void Fiche::reads(ifstream &miofile)
{
//I read the Valore
miofile.read((char*)&Valore,sizeof(int));
//I create a temporary char *
char * buffer;
int dim = 0;
//I read how long will be the string
miofile.read((char*)&dim,sizeof(int));
buffer = new char[dim];
miofile.read(buffer,dim);
//I use the set function I created to copy the buffer to the actual member char*
setMateriale(buffer);
delete [] buffer;
//Now it pretty much repeats itself for the other stuff
miofile.read((char*)&dim,sizeof(int));
buffer = new char[dim];
miofile.read(buffer,dim);
setForma(buffer);
delete [] buffer;
//And again.
miofile.read((char*)&dim,sizeof(int));
buffer = new char[dim];
miofile.read(buffer,dim);
setColore(buffer);
delete [] buffer;
}
The code doesn't give me any error, but on the screen I see random characters that are not even remotely close to what I wrote to my file. Could anyone help me out, please?
EDIT:
As requested here's an example of input&output:
Fiche A("boh" , 4 , "no" , "gaia");
Fiche B("Marasco" , 3 , "boh" , "nonnt");
Fiche C("Valori" , 6 , "asd" , "hey");
Fiche D("TipO" , 7 , "lol" , "nonloso");
Pila pila;
pila.push(A);
pila.push(B);
pila.push(C);
pila.push(D);
pila.stampa();
pila.memorizzafile();
And:
Pila pila("data.dat");
pila.stampa();

This is probably your error:
//I go back at the beginning of the file to write how many elements I have
miofile.seekp(0, ios::beg);
miofile.write((const char *)&contatore , sizeof(int));
miofile.close();
By seeking back to the beginning and then writing, you are overwriting part of the first object.
I think your best bet is to run through the list and count the elements first. Write the count, then proceed to write all the elements. It will probably be faster anyway (but you can time it to make sure).
I think you are using way too many C structures to hold things.
Also, I would advise against a binary format unless you are saving huge amounts of information. A text format (for your data) is probably going to be just as good and will be human readable, so you can look at the file and see what is wrong.
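The count-first layout suggested above could look roughly like the sketch below. Record, saveAll, loadAll, writeString and readString are hypothetical names standing in for Fiche and Pila, which are not shown in full. The sketch also reads each string into a std::string rather than a raw char buffer: reads() fills a buffer of exactly dim bytes with no terminating '\0', so if setMateriale relies on strlen or strcpy (an assumption, since it is not shown), that alone could also produce random characters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for Fiche: one int field and one string field.
struct Record {
    std::int32_t valore;
    std::string materiale;
};

// Write a length-prefixed string: 4-byte length, then the raw bytes.
void writeString(std::ostream& out, const std::string& s) {
    const std::int32_t len = static_cast<std::int32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(s.data(), len);
}

// Read a length-prefixed string. std::string tracks its own length,
// so no terminating '\0' has to be appended by hand.
std::string readString(std::istream& in) {
    std::int32_t len = 0;
    in.read(reinterpret_cast<char*>(&len), sizeof len);
    if (!in || len < 0) return std::string();
    std::string s(static_cast<std::size_t>(len), '\0');
    in.read(&s[0], len);
    return s;
}

// Write the record count first, then every record: no seeking back needed.
void saveAll(std::ostream& out, const std::vector<Record>& records) {
    const std::int32_t count = static_cast<std::int32_t>(records.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    for (const Record& r : records) {
        out.write(reinterpret_cast<const char*>(&r.valore), sizeof r.valore);
        writeString(out, r.materiale);
    }
}

// Read the count, then that many records; stop early if the stream fails.
std::vector<Record> loadAll(std::istream& in) {
    std::vector<Record> records;
    std::int32_t count = 0;
    in.read(reinterpret_cast<char*>(&count), sizeof count);
    for (std::int32_t i = 0; in && i < count; ++i) {
        Record r;
        in.read(reinterpret_cast<char*>(&r.valore), sizeof r.valore);
        r.materiale = readString(in);
        if (in) records.push_back(r);
    }
    return records;
}
```

Taking std::ostream and std::istream rather than file streams means the same functions work with an std::ofstream opened with ios::binary or with an in-memory std::stringstream.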


How to check whether ifstream is end of file in C++

I need to read all blocks of one large file (about 10GB) sequentially. The file contains many floats with a few strings, like this (items are separated by '\n'):
6.292611
-1.078219E-266
-2.305673E+065
sod;eiwo
4.899747e-237
1.673940e+089
-4.515213
I read MAX_NUM_PER_FILE items each time, process them and write them to another file, but I don't know when the ifstream has ended.
Here is my code:
ifstream file_input(path_input); //my file is a text file, but i tried both text and binary mode, both failed.
if(file_input)
{
file_input.seekg(0,file_input.end);
unsigned long long length = file_input.tellg(); //get file size
file_input.seekg(0,file_input.beg);
char * buffer = new char [MAX_NUM_PER_FILE+MAX_NUM_PER_LINE];
int i=1,j;
char c,tmp[3];
while(file_input.tellg()<length)
{
file_input.read(buffer,MAX_NUM_PER_FILE);
j=MAX_NUM_PER_FILE;
while(file_input.get(c)&&c!='\n')
buffer[j++]=c; //get a complete item
//process with buffer...
itoa(i++,tmp,10); //int2char
string out_name="out"+string(tmp)+".txt";
ofstream file_output(out_name);
file_output.write(buffer,j);
file_output.close();
}
file_input.close();
delete[] buffer;
}
My code goes wrong: length is bigger than the real file size. I have tried file_input.good() and !file_input.eof(), but they didn't work. getline(file_input,s) works, but it is much slower than read. I want to use read, but I don't know how to check whether the ifstream has reached end-of-file.
I am working on Windows 7 with VS2010.
I have searched, but could not find an answer. The question "How to open a file using ifstream and keep reading it until the end" does not answer my question.
Update: problem solved
Hi everyone, I have figured out that it was my fault. Both while(file_input.tellg()<length) and while(file_input.peek()!=EOF) work fine! while(file_input.peek()!=EOF) is recommended.
The extra items written after the end of the file were items left over in the buffer from the previous write.
Here is the correct code:
ifstream file_input(path_input);
if(file_input)
{
//file_input.seekg(0,file_input.end);
//unsigned long long length = file_input.tellg(); //get file size
//file_input.seekg(0,file_input.beg);
char * buffer = new char [MAX_NUM_PER_FILE+MAX_NUM_PER_LINE];
int i=1,j;
char c,tmp[3];
while(file_input.peek()!=EOF)
{
memset(buffer,0,sizeof(char)*(MAX_NUM_PER_FILE+MAX_NUM_PER_LINE)); //clear first!
file_input.read(buffer,MAX_NUM_PER_FILE);
j=MAX_NUM_PER_FILE;
while(file_input.get(c)&&c!='\n')
buffer[j++]=c;
itoa(i++,tmp,10);//int2char
string out_name="out"+string(tmp)+".txt";
ofstream file_output(out_name);
file_output.write(buffer,strlen(buffer)); //use the correct buffer size instead of j
file_output.close();
}
file_input.close();
delete[] buffer;
}
while( file_input.peek() != EOF )
{
// code
}
Basically peek() will read the next char without extracting it.
So you can simply compare it to EOF.
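As a variation on the loop above, gcount() reports how many bytes the last read() actually delivered, so the final short block can be sized exactly without clearing the buffer or calling strlen. The following is only a sketch under that idea; readBlocks and blockSize are hypothetical names, and unlike the code in the question it keeps the '\n' that ends each block.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read `in` in blocks of about `blockSize` bytes, extending each block to the
// next '\n' so that no item is cut in half. Returns the blocks as strings.
std::vector<std::string> readBlocks(std::istream& in, std::size_t blockSize) {
    assert(blockSize > 0);
    std::vector<std::string> blocks;
    std::vector<char> buffer(blockSize);
    while (in.peek() != std::istream::traits_type::eof()) {
        in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
        // gcount() is the number of bytes the last read() actually delivered,
        // which is smaller than blockSize for the final block.
        std::string block(buffer.data(), static_cast<std::size_t>(in.gcount()));
        // After a full read that stopped mid-item, finish the current item.
        if (in && block.back() != '\n') {
            char c;
            while (in.get(c)) {
                block += c;
                if (c == '\n') break;
            }
        }
        blocks.push_back(block);
    }
    return blocks;
}
```

Each returned block can then be written out with file_output.write(block.data(), block.size()).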

Reading from a large text file into a structure array in Qt?

I have to read a text file into an array of structures. I have already written a program, but it takes too much time because there are about 13 lakh (1.3 million) structures in the file.
Please suggest the best and fastest way to do this in C++.
Here is my code:
std::ifstream input_counter("D:\\cont.txt");
/**********************************************************/
int counter = 0;
while( getline(input_counter,line) )
{
ReadCont( line,&contract[counter]); // function to read data to structure
counter++;
line.clear();
}
input_counter.close();
Keep your 'parsing' as simple as possible: where you know a field's format, apply that knowledge. For instance,
ReadCont("|PE|1|0|0|0|0|1|1||2|0||2|0||3|0|....", ...)
should apply a fast char-to-integer conversion, something like
void ReadCont(const char *line, Contract &c) {
// line[0] is the leading '|', so the record tag starts at line[1]
if (line[1] == 'P' && line[2] == 'E' && line[3] == '|') {
line += 4;
for (int field = 0; field < K_FIELDS_PE; ++field) {
c.int_field[field] = *line++ - '0'; // assumes a single-digit field
assert(*line == '|');
++line;
}
}
}
Well, mind the details, but you get the idea...
I would use Qt entirely in this case.
struct MyStruct {
int Col1;
int Col2;
int Col3;
int Col4;
// blabla ...
};
QByteArray Data;
QFile f("D:\\cont.txt");
if (f.open(QIODevice::ReadOnly)) {
Data = f.readAll();
f.close();
}
MyStruct* DataPointer = reinterpret_cast<MyStruct*>(Data.data());
// Accessing data
DataPointer[0] = ...
DataPointer[1] = ...
Now you have your data and you can access it as array.
If your data is not binary and you have to parse it first, you will need a conversion routine. For example, if you read a CSV file with 4 columns:
QVector<MyStruct> MyArray;
QString StringData(Data);
QStringList Lines = StringData.split("\n"); // or whatever new line character is
for (int i = 0; i < Lines.count(); i++) {
QString Line = Lines.at(i);
QStringList Parts = Line.split("\t"); // or whatever separator character is
if (Parts.count() >= 4) {
MyStruct t;
t.Col1 = Parts.at(0).toInt();
t.Col2 = Parts.at(1).toInt();
t.Col3 = Parts.at(2).toInt();
t.Col4 = Parts.at(3).toInt();
MyArray.append(t);
} else {
// Malformed input, do something
}
}
Now your data is parsed and in MyArray vector.
As user2617519 says, this can be made faster by multithreading. I see that you are reading each line and parsing it. Put these lines in a queue. Then let different threads pop them off the queue and parse the data into structures.
An easier way to do this (without the complication of multithreading) is to split the input data file into multiple files and run an equal number of processes to parse them. The data can then be merged later.
QFile::readAll() may cause a memory problem and std::getline() is slow (as is ::fgets()).
I faced a similar problem where I needed to parse very large delimited text files in a QTableView. Using a custom model, I parsed the file to find the offsets to the start of each line. Then, when data is needed for display in the table, I read the line and parse it on demand. This results in a lot of parsing, but it is actually fast enough that there is no noticeable lag in scrolling or update speed.
It also has the added benefit of low memory usage as I do not read the file contents into memory. With this strategy nearly any size file is possible.
Parsing code:
m_fp = ::fopen(path.c_str(), "rb"); // open in binary mode for faster parsing
if (m_fp != NULL)
{
// read the file to get the row pointers
char buf[BUF_SIZE+1];
long pos = 0;
m_data.push_back(RowData(pos));
int nr = 0;
while ((nr = ::fread(buf, 1, BUF_SIZE, m_fp)))
{
buf[nr] = 0; // null-terminate the last line of data
// find new lines in the buffer
char *c = buf;
while ((c = ::strchr(c, '\n')) != NULL)
{
m_data.push_back(RowData(pos + c-buf+1));
c++;
}
pos += nr;
}
// squeeze any extra memory not needed in the collection
m_data.squeeze();
}
RowData and m_data are specific to my implementation, but they are simply used to cache information about a row in the file (such as the file position and number of columns).
The other performance strategy I employed was to use QByteArray to parse each line, instead of QString. Unless you need unicode data, this will save time and memory:
// optimized line reading procedure
QByteArray str;
char buf[BUF_SIZE+1];
::fseek(m_fp, rd.offset, SEEK_SET);
int nr = 0;
while ((nr = ::fread(buf, 1, BUF_SIZE, m_fp)))
{
buf[nr] = 0; // null-terminate the string
// find new lines in the buffer
char *c = ::strchr(buf, '\n');
if (c != NULL)
{
*c = 0;
str += buf;
break;
}
str += buf;
}
return str.split(',');
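If you need to split each line on more than a single character, ::strtok() accepts several delimiter characters. Note that it treats its argument as a set of single-character delimiters; to split on a multi-character string, search for it with ::strstr() instead.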

Split a File and put it back together in c++

I want to copy a file by reading blocks of data, sending them and then putting them back together again. Sending is not part of the problem, so I left it out of the code. It should work with any type of file and an arbitrary piece_length.
This is just a preliminary stage. In the end, data blocks should not be chosen sequentially but at random. There could be some time between receiving one block of data and the next.
I know the example only makes sense if size % piece_length != 0.
I'm getting corrupted files of the same size as the original file at the other end.
Does anyone see the problem?
int main ()
{
string file = "path/test.txt";
string file2 = "path2/test.txt";
std::ifstream infile (file.c_str() ,std::ifstream::binary);
//get size of file
infile.seekg (0,infile.end);
long size = infile.tellg();
infile.seekg (0);
size_t piece_length = 5;
for (int i = 0; i < ((size / piece_length) + 1); i++)
{
if ( i != (size / piece_length))
{
std::ifstream infile (file.c_str() ,std::ifstream::binary);
infile.seekg((i * piece_length) , infile.beg);
char* buffer = new char[piece_length];
infile.read(buffer, piece_length);
infile.close();
std::ofstream outfile (file2.c_str() ,std::ofstream::binary);
outfile.seekp((i * piece_length), outfile.beg);
outfile.write(buffer, piece_length);
outfile.close();
}
else
{
std::ifstream infile (file.c_str() ,std::ifstream::binary);
infile.seekg((i * piece_length) , infile.beg);
char* buffer = new char[size % piece_length];
infile.read(buffer, size % piece_length);
infile.close();
std::ofstream outfile (file2.c_str() ,std::ofstream::binary);
outfile.seekp((i * piece_length), outfile.beg);
outfile.write(buffer, size % piece_length);
outfile.close();
}
}
return 0;
}
To answer your specific question, you need to open outfile with ios::in | ios::out in the flags, otherwise it defaults to write-only mode and destroys what was already in the file. See this answer for more details: Write to the middle of an existing binary file c++
You may want to consider the following though:
If you are just writing parts to the end of the file, just use ios::app (append). You don't even need to seek.
You don't need to keep reopening infile or even outfile; just reuse them.
You can also reuse buffer. Please remember to delete[] it, or better yet use a std::vector.
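Putting these suggestions together, a piece-wise copy might look like the sketch below. copyInPieces is a hypothetical name. Because ios::in | ios::out does not create a missing file, the sketch first creates (or empties) the destination once and then reopens it in read/write mode so that earlier pieces are not destroyed. The pieces are written sequentially here, but since every write seeks to its own offset, the same loop body should also work for pieces arriving in another order.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Copy `src` to `dst` in pieces of `pieceLength` bytes, writing each piece at
// its own offset. Returns false if a file cannot be opened or a write fails.
bool copyInPieces(const std::string& src, const std::string& dst,
                  std::size_t pieceLength) {
    assert(pieceLength > 0);
    std::ifstream in(src.c_str(), std::ios::binary);
    if (!in) return false;

    // Create (or empty) the destination once; in|out alone would fail
    // if the file did not exist yet.
    {
        std::ofstream create(dst.c_str(), std::ios::binary | std::ios::trunc);
        if (!create) return false;
    }

    // in|out keeps the existing contents, so earlier pieces survive.
    std::fstream out(dst.c_str(),
                     std::ios::in | std::ios::out | std::ios::binary);
    if (!out) return false;

    std::vector<char> buffer(pieceLength);  // one buffer, reused for all pieces
    std::streamoff offset = 0;
    while (in.read(buffer.data(), static_cast<std::streamsize>(buffer.size()))
           || in.gcount() > 0) {
        const std::streamsize got = in.gcount();  // short for the last piece
        out.seekp(offset);
        out.write(buffer.data(), got);
        offset += got;
    }
    return static_cast<bool>(out);
}
```

Using gcount() for the final piece removes the need for the separate size % piece_length branch.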

Can't write an integer into a binary file C++

This is basically the part of the code that I used to store the entire file, and it works well... but when I tried to store an integer bigger than 120 or so, the program writes what looks like a bunch of garbage and not the integer that I want. Any tips? I am a college student and don't have a clue what's happening.
int* temp;
temp = (int*) malloc (sizeof(int));
*temp = atoi( it->valor[i].c_str() );
//Writes the integer in 4 bytes
fwrite(temp, sizeof (int), 1, arq);
if( ferror(arq) ){
printf("\n\n Error \n\n");
exit(1);
}
free(temp);
I've already checked the atoi part and it really returns the number that I want to write.
I changed and added some code and it works fine:
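#include <cstdio>   // fopen, fwrite, ferror, fclose, printf
#include <cstdlib>  // malloc, free, atoi, exit
int main()
{
int* temp;
FILE *file;
file = fopen("file.bin" , "rb+"); // rb+ opens an existing file for reading
// and writing binary data
if( file == NULL ){ // rb+ fails if the file does not exist
printf("\n\n Error \n\n");
exit(1);
}
temp = (int*) malloc (sizeof(int));
*temp = atoi( "1013" ); // replace "1013" with your string
//Writes the integer in sizeof(int) bytes (4 on most platforms)
fwrite(temp, sizeof (int), 1, file);
if( ferror(file) ){
printf("\n\n Error \n\n");
exit(1);
}
free(temp);
fclose(file);
}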
Make sure you are opening the file with the correct parameters, and that the string you give to atoi(str) is correct.
I checked the binary file using hex editor, after inputting the number 1013.
int i = atoi("123");
std::ofstream file("filename", std::ios::binary);
file.write(reinterpret_cast<const char*>(&i), sizeof(i));
Do not use pointers here.
Never use malloc / free in C++.
Use C++ file streams, not C streams.
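A round trip is an easy way to check that the value on disk is the one you meant to write. The sketch below follows the stream-based advice above; writeInt and readInt are hypothetical helper names. Both streams are opened with std::ios::binary: on Windows, a stream opened in text mode expands every byte equal to 0x0A into 0x0D 0x0A, which is one possible way for an integer to end up looking like garbage in the file.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Write one 32-bit integer in binary mode; returns false on failure.
bool writeInt(const std::string& path, std::int32_t value) {
    std::ofstream file(path.c_str(), std::ios::binary);
    file.write(reinterpret_cast<const char*>(&value), sizeof value);
    return static_cast<bool>(file);
}

// Read the integer back; returns false if the file is missing or too short.
bool readInt(const std::string& path, std::int32_t& value) {
    std::ifstream file(path.c_str(), std::ios::binary);
    file.read(reinterpret_cast<char*>(&value), sizeof value);
    return static_cast<bool>(file);
}
```

The file holds the integer in the machine's native byte order, so it is only portable between machines that share that order.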

Access violation error with the new command

I am working on an assignment for my GUI programming class, in which we are to make a windows program that displays the contents of a file in hexadecimal. I have a class that holds the text and creates the hex in string format.
I'm attempting to create an array of character arrays to store each line for output. However, when I use new to create the array of character pointers, I get an access violation error.
I've done some searching, but haven't had any luck finding the answer.
The class has these member variables:
char* fileText;
char** Lines;
int numChars;
int numLines;
bool fileCopied;
My constructor:
Text::Text(char* fileName){ //load and copy file.
fileText = NULL;
Lines = NULL;
fileCopied = ExtractText(fileName);
if ( fileCopied ) {
CreateHex();
}//endif
}//end constructor
ExtractText loads the file given to the constructor, and copies it into a large string.
bool Text::ExtractText(char fileName[]){
char buffer = '\0'; //buffer for text transfer
numChars = 0; //initialize numLines
ifstream fin( fileName, ios::in|ios::out ); //load file stream
if ( !fin ) { //return false if the file fails to load
return false;
}//endif
while ( !fin.eof() ) { //count the lines in the file
fin.get(buffer);
numChars++;
}//endwh
fileText = new char[numLines]; //create an array of strings, one for each line in the file.
fin.clear(); //clear the eof flag
fin.seekg(0, ios::beg); //move the get pointer back to the start of the file.
for ( int i = 0; i < numChars; i++ ) { //copy the text from the file into the string array.
fin.get(fileText[i]);
}//endfr
fileText[numChars-1] = '\0';
fin.close();
numLines = (numChars % 16 == 0) ? (numChars/16) : (numChars/16 + 1);
return true;
}//end fun ExtractText
Then comes the problem code. In the CreateHex function, the first line is where I try to create the array of character pointers.
void Text::CreateHex(){
Lines = new char*[numLines];
As soon as the program runs that line of code, I get the access violation. I'm not really sure what the problem is, because I've used that exact same method before in a previous program. The only difference was the name of the pointer. I'm using Borland C++ 5.02, if that makes any difference. It's not my first choice of compiler, but it's what our teacher wants us to use.
When you execute the line
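fileText = new char[numLines];
The variable numLines has not yet been assigned. A member variable with no initializer is not set to 0 automatically (unless the object has static storage duration), so numLines holds an indeterminate value and fileText gets an array of unpredictable, possibly zero, size. The copy loop then writes numChars characters into it, which can run past the end of the allocation and corrupt the heap; that would explain why the next new fails. Allocate with numChars instead.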
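One way to sidestep this kind of sizing mistake is to let std::string own the text, so there is no separate count to keep in sync with the allocation. This is only a sketch: extractText and hexLineCount are hypothetical free functions rather than the members of Text, and it assumes a compiler with a standard <sstream>, which may not hold for an older toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Read the whole file into `text`; returns false if it cannot be opened.
bool extractText(const std::string& fileName, std::string& text) {
    std::ifstream fin(fileName.c_str(), std::ios::binary);
    if (!fin) return false;
    std::ostringstream contents;
    contents << fin.rdbuf();  // copy everything in one go
    text = contents.str();    // the string knows its own size
    return true;
}

// Number of 16-byte rows needed for a hex dump of `numChars` bytes.
std::size_t hexLineCount(std::size_t numChars) {
    return (numChars + 15) / 16;
}
```

With this, numChars is simply text.size(), and the line count can be derived from it at any time instead of being stored before it is known.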