I have a file with 16,900,000 lines; each line contains 10 numbers (a mix of ints and floats). I'm trying to read this file line by line, modify each line slightly, and write the results to a series of new files. The code below works on a laptop running Windows Vista, but when I run it on a desktop running Windows 7, the output file does not contain all the data from the input file. The number of lines in the output file varies from 2,500 to 40,000.
I've commented out all of the processing and the writing to files, and now just print a progress message every 100th line; even so, the last message printed doesn't correspond to the last line of the file.
// skipping code prior to the loop
// only including minimal code that reproduces the problem
ifstream infile((srcdir + filename).c_str());
string line;
int lcount = 0;
while (getline(infile, line)) {
    if (line.find("#") == string::npos) {
        lcount++;
        if (lcount % 100 == 0) {
            printf("Generating tiles for %s: %d lines processed\n", filename.c_str(), lcount);
        }
    }
}
Questions:
Is there a maximum buffer size that I might be overflowing?
Can anyone see a problem with my code?
Is there any reason this would work fine on Windows Vista, but not on Windows 7?
There is very little information in your question, but here are my guesses:
I don't think lcount is being incremented in the correct place in the code. It gets incremented only if '#' is not in the line (I'm guessing those are some sort of comment), but a line starting with (or containing) '#' has to be counted too. So your code should be:
while (getline(infile, line)) {
    lcount++; // Increment on every line.
    if (line.find("#") == string::npos) {
        if (lcount % 100 == 0) {
            printf("Generating tiles for %s: %d lines processed\n", filename.c_str(), lcount);
        }
    }
}
On the other hand, you should enable ifstream exceptions in order to see what's going on; see here for how to do it.
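A minimal sketch of enabling stream exceptions, assuming the same getline loop as above; only badbit is enabled so that a normal end-of-file does not throw:

#include <fstream>
#include <iostream>
#include <string>

int main() {
    std::ifstream infile("input.txt");        // placeholder path
    infile.exceptions(std::ifstream::badbit); // throw on hard read errors

    std::string line;
    try {
        while (std::getline(infile, line)) {
            // process the line here
        }
    } catch (const std::ios_base::failure& e) {
        std::cerr << "Stream error while reading: " << e.what() << '\n';
    }
    return 0;
}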
While trying to create a minimal full program that would reproduce the problem, I found the culprit.
I'm trying to secure the executable with the keylock dongle that my company is using. When I removed the check for the dongle, the entire file was read.
I am reading from a file into a 2-D array in C++. I used a do-while loop together with eof(). The problem I was facing was that the program continued to read empty entries after the characters in the file were exhausted. I realized that it was because of the getline function, so I used the get function instead. The program compiles and runs successfully, but upon reaching the point where the characters in the file are to be read (i.e., in my console), the program stops suddenly and returns error code -1073741795 (0xC000001D). I would like to know why, but my main concern is how to read from the file so that, once the characters in the file are exhausted, the program stops reading from it (so that no empty entries are read).
Here is the sample code:
int k = 0;
do {
    name_input.getline(sample[k], num_char);
    k++;
} while (!name_input.eof());
num = k - 1;
PS: The input file is of the form:
Mr. Clark Smith
John B. Doe
Joshua Clement Johnson
with each name on its own line in a .txt file (Notepad); after the 3rd name there is empty space, which my program still reads.
Using eof() to control a loop is almost always wrong (see the link in the comments above). This is how to write the loop:
int k = 0;
while (name_input.getline(sample[k], num_char)) {
    k++;
}
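If sample is a fixed-size 2-D char array, as the question suggests, it is also worth guarding against overflowing it. A sketch, where max_names and num_char are assumed to be the array's dimensions:

int k = 0;
// stop when either the array is full or the stream has no more lines
while (k < max_names && name_input.getline(sample[k], num_char)) {
    k++;
}
int num = k;   // number of names actually read; no empty trailing entry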
I am currently stuck on a problem that I have no idea how to fix. It relates to a previous question that I asked here before, but I will reiterate it, because I have since found the cause yet still have no idea how to fix it.
My program accesses a text file that is updated constantly, every millisecond, 24/7. It grabs the data line by line and does a comparison on each line. If anything is "amiss" (as defined by me), I log that data into a .csv file. The program can be run at timed intervals (user defined).
My problem is that this program works perfectly fine on my computer, yet it doesn't on my client's computer. I have debugged the program and these are my findings. Below is my code, reduced as much as possible to ease the explanation.
int result;
char ReadLogLine[100000] = "";
FILE *readLOG_fp;
CString LogPathName;

LogPathName = Source_Folder + "\\serco.log"; // Source_Folder is found in a .ini file. Value is C:\\phython25\\xyratex\\serco_logs
readLOG_fp = fopen(LogPathName, "r+t");

while ((result = fscanf(readLOG_fp, "%[^\n]\n", ReadLogLine)) != EOF) // Loops through the file till it reaches the end of file
{
    Sort_Array(); // Here is a function to sort the different lines that I grabbed from the text file
    Comp_State(); // I compare the lines here and store them into an array to be printed out
}
fclose(readLOG_fp);

GenerateCSV(); // This is my function to generate the csv and print everything out
In Sort_Array(), I sort the lines that I grab from the text file as they could be of different nature. For example,
CStringArray LineType_A, LineType_B, LineType_C, LineType_D;

if (strcmp(ReadLogLine, "Example line a") == 0)   // compare the contents, not the pointers
{
    LineType_A.Add(ReadLogLine);
}
else if (strcmp(ReadLogLine, "Example line b") == 0)
{
    LineType_B.Add(ReadLogLine);
}
and so on.
In Comp_State(), I compare the different values within each LineType array to see if there are any differences. If a value is different, I store it into a separate array to print. A simple example would be:
CStringArray PrintCSV_Array;

for (int i = 0; i <= LineType_A.GetUpperBound(); i++)
{
    if (LineType_A.GetAt(0) == LineType_A.GetAt(1))
    {
        LineType_A.RemoveAt(0);
    }
    else
    {
        LineType_A.RemoveAt(0);
        PrintCSV_Array.Add(LineType_A.GetAt(0));
    }
}
This way I don't end up with an infinite amount of data in the array.
As for the GenerateCSV function, it is just a normal function where I create a .csv file and print whatever is in PrintCSV_Array.
Now to the problem. On my client's computer, nothing gets printed out to the CSV. I debugged the program and found that it keeps failing here:
while ((result = fscanf(readLOG_fp, "%[^\n]\n", ReadLogLine)) != EOF) // Loops through the file till it reaches the end of file
{
    Sort_Array(); // Here is a function to sort the different lines that I grabbed from the text file
    Comp_State(); // I compare the lines here and store them into an array to be printed out
}
fclose(readLOG_fp);
It reaches the while loop fine, as I did some error checking there in the actual program, but it breaks out of the loop immediately, suggesting to me that it reached EOF for some reason. When that happens, the program never enters the Sort_Array and Comp_State functions, giving me an empty PrintCSV_Array and nothing to print out.
Things that I have checked:
I definitely have access to the text file.
My thought was that because the text file is updated every millisecond, it may have been opened by the other program for writing, thus not giving me access, OR the text file is always in an fopen state and therefore no data is saved for me to read. I tested this, and the file does have data added to it, as I can see the KBs adding up in front of my eyes.
I tried copying the text file and pasting it somewhere else for my program to read; this way I definitely have full access to it, and once I am done with it, I'll delete it. This gave me nothing to print as well.
Am I right to deduce that it is always giving me EOF, and that this is where the problem is?
while ((result = fscanf(readLOG_fp, "%[^\n]\n", ReadLogLine)) != EOF) // Loops through the file till it reaches the end of file
If yes, how do I fix this? What other ways are there to make it read every line? I have seriously exhausted all my ideas on this problem and need some help with it.
Thanks everyone for your help.
The error is very obvious ... you might have overlooked it.
You never check whether fopen actually succeeded in opening the file.
FILE *readLOG_fp;
CString LogPathName;

LogPathName = Source_Folder + "\\serco.log";
readLOG_fp = fopen(LogPathName.GetBuffer(), "r+t"); // fopen needs a mode argument as well
if (readLOG_fp == NULL)
    cout << "Error: opening file\n";
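If the open does fail, perror (or strerror(errno)) will tell you why. A minimal standalone sketch, assuming the path from the question's .ini value and plain C stdio rather than CString:

#include <cstdio>   // fopen, fclose, perror

int main() {
    const char* path = "C:\\phython25\\xyratex\\serco_logs\\serco.log";
    FILE* readLOG_fp = fopen(path, "r+t");
    if (readLOG_fp == NULL) {
        perror("fopen failed for serco.log");   // prints the errno description
        return 1;
    }
    // ... read the file as in the question ...
    fclose(readLOG_fp);
    return 0;
}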
I need to store each line of a text document into a vector. However, with any text file I try, the output is always 2 lines: the first one is empty and the second one always outputs "DONE". I'm on Windows 7 x64, using VC++ 2013.
I have been trying to solve this for many hours and have tried many different approaches, but the result stays the same. I suspect that "DONE" is the return value of getline(); however, I don't understand why my code is not working like it should.
#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main() {
    ifstream hFile("test.txt");
    vector<string> lines;
    string line;
    while (std::getline(hFile, line))
        lines.push_back(line);
    cout << lines[1];
    hFile.close();
    getchar();
    return 0;
}
EDIT: It works fine when executing the program from the compilation folder but not in the debug console of VC++...
The program looks mostly correct. The only problem is that your code assumes there are at least two lines in your file: if there are fewer lines, e.g. just one, or the file couldn't be opened, the statement
cout << lines[1];
results in undefined behavior. Did you mean to print each line of the file rather than just the second line?
From the description of the behavior, I would suspect that your file either contains the string DONE or you are actually executing a different program!
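If the goal was to print every line, a minimal sketch reusing the lines vector from your code:

for (const std::string& l : lines)
    std::cout << l << '\n';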
Be careful: this line proves nothing about the number of lines read:
cout << lines[1];
Use lines.size() to count the lines that were read. In fact, for a file with only one line, it is undefined behavior to access the second item.
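A sketch of guarding that access, again reusing the lines vector from the question:

std::cout << "read " << lines.size() << " line(s)\n";
if (lines.size() > 1)
    std::cout << lines[1] << '\n';   // safe: the second line exists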
I try to read a file into a map but the program stops in the middle of the file.
The file consists of millions of lines; each line contains a STRING composed of numbers and an INT.
e.g. 1230981237120313 123.
#include <map>
#include <iostream>
#include <fstream>
#include <string>
using namespace std;

int main()
{
    ifstream mapfile("filename.txt", ifstream::in);
    int itemp;
    string stemp;
    map<string, int> mapping;
    while (mapfile >> stemp >> itemp)
    {
        mapping[stemp] = itemp;
    }
    return 0;
}
When it deals with small files with hundreds of lines, it is OK. But when it gets past about 90 million lines, it stops without reporting any error, just ending with a "Press any key to continue...".
I've done some analysis, and I can confirm the program stops after reading a line from the file, at the point where it needs to do mapping[stemp] = itemp. Every time it stops it happens at a different line, but always around 90 million.
Could anyone tell me why this could happen?
Any help will be highly appreciated.
It is always advisable not to read the entire file into memory at once, as file sizes can vary from a few KB to many MB.
It's better to read in fixed-size chunks of a few thousand bytes (say 4096): each time you read a chunk from the file, do your processing on it, and close the file when you are done.
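A sketch of that chunked approach with std::ifstream, using an assumed 4096-byte buffer and a placeholder file name:

#include <fstream>
#include <vector>

int main() {
    std::ifstream in("filename.txt", std::ifstream::binary);
    std::vector<char> buffer(4096);              // fixed-size chunk
    while (in) {
        in.read(&buffer[0], static_cast<std::streamsize>(buffer.size())); // read up to one chunk
        std::streamsize got = in.gcount();       // bytes actually read this pass
        if (got == 0)
            break;
        // process buffer[0 .. got - 1] here
    }
    return 0;
}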
This is a sub-problem of a bigger problem I have posted before. Following is a code snippet from a C++ package I am trying to work with. The intent is to write the values in the float array prob_estimates to the output file. For some seemingly random lines, only some of the values of the array are written. When can that happen? How should I debug it?
int j;
predict_label = predict_probability(model_, x, prob_estimates);
fprintf(output, "%g", predict_label);
for (j = 0; j < model_->nr_class; j++) {
    fprintf(output, " %g", prob_estimates[j]);
    fflush(output);
}
fprintf(output, "\n");
I also want to point out that this seems to happen only when the input is fairly large. This is part of a bigger loop which runs once per line of an input file (with about 200,000 lines). The prob_estimates array has 500 values per line. For some 20-odd lines, the output file contains fewer than 500 values.
I ran this a couple of times on a subset (with 20,000 lines) and everything seemed fine.
Update: I tried checking the return value of fprintf after each attempted write, and it turns out it returns -1 for a lot of lines when trying to write to the output.
fprintf encountered error at 19th value of line 2109359. Returned -1
fprintf encountered error at 373th value of line 2109359. Returned -1
fprintf encountered error at 229th value of line 2109360. Returned -1
fprintf encountered error at 87th value of line 2109361. Returned -1
fprintf encountered error at 439th value of line 2109361. Returned -1
This is when I modified the above code as follows:
for (j = 0; j < model_->nr_class; j++) {
    int e = fprintf(output, " %g", prob_estimates[j]);
    if (e < 0) {
        printf("fprintf encountered error at %dth value of line %d. Returned %d", j, count, e);
    }
}
Here count is a variable that counts the lines; it is incremented at the top of the outer loop (not shown here).
What can I do to figure out why fprintf returns -1?
A few things you could do:
print everything to the console as well, to see whether the problem is in the file output or somewhere else
print model_->nr_class to make sure the number of values is what you expect
Check the output file only after it is closed. Although you fflush the output, it might be that other places update the file and don't fflush it; fclose would. I suggest that instead of flushing the file on every line, you open it in append mode at the beginning of the function and close it at the end, as in the sketch after this list.
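A sketch of that last suggestion, with a placeholder output path:

#include <stdio.h>

void write_predictions(void) {
    FILE* output = fopen("predictions.csv", "a");   // placeholder path, append mode
    if (output == NULL)
        return;                                     // handle the error as appropriate
    /* ... all the fprintf(output, ...) calls for this run ... */
    fclose(output);                                 // flushes buffers and releases the handle
}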
Hope this helps
Now that you've found that fprintf is returning an error, you need to check the value of errno after the failing call to find out what the actual cause of the error is.
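A sketch of logging the cause, reusing the same variables (j, count, model_, output, prob_estimates) as the loop above; it needs <cerrno> and <cstring>:

for (j = 0; j < model_->nr_class; j++) {
    errno = 0;                                   // clear any stale error code
    int e = fprintf(output, " %g", prob_estimates[j]);
    if (e < 0) {
        printf("fprintf failed at value %d of line %d: %s\n",
               j, count, strerror(errno));       // strerror describes the cause
    }
}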