I try to read a file into a map but the program stops in the middle of the file.
The file consists millionss of lines, each line is a STRING composed of numbers and an INT.
e.g. 1230981237120313 123.
#include<map>
#include<iostream>
#include<fstream>
void main ()
{
ifstream mapfile("filename.txt",ifstream::in);
int itemp;
string stemp;
map<string,int> mapping;
while(mapfile>>stemp>>itemp)
{
mapping[stemp]=itemp;
}
}
When it deals with small files with hundreds of lines, it is ok. But when it reaches more then 90 million lines, it stops without reporting any error and just stops with a "Press any key to continue...".
I've done some analysis and I can make sure the program stops after reading the line in the file and when it needs to do mapping[stemp]=itemp . And every time it stops, it happens at different lines but always around 90 million.
Could anyone tell me why this could happen?
Any help will be highly appreciated.
It is always advisable ***not to read entire file*** at once in memory as file size could vary from small kbs to bigger Mbs.
It's better to read in chunks say few fixed thousand bytes(say 4092) every time you read from file do your processing and close it.
Related
I am reading from a file into a 2-D array in c++. I used a do while loop together with the eof(). The problem I was facing was, the program continues to read empty spaces when the characters in the file are exhausted. I realized that it was because of the getline function, so I used the get function instead. The program compiles and runs successfully but upon reaching the point where the characters in the file is to be read,(i.e in my console), the programs stops suddenly and returns with this error code -1073741795 (0xc00000ID). I would like to know why though, my major concern is how I will be able to read from the file and after the characters in the file is exhausted, the program stops reading from the file (so that no empty spaces are read).
Here is the sample code:
int k=0;
do{
name_input.getline(sample[k],num_char);
k++;
} while (!name_input.eof());
num=k-1;
PS: The input file is in the form;
Mr. Clark Smith
John B. Doe
Joshua Clement Johnson
each on one line
in a .txt file (notepad), after the 3rd name is empty space which my program still reads.
Using eof() is almost always wrong (see link in comments above). This is how to write the loop
int k = 0;
while (name_input.getline(sample[k],num_char)) {
k++;
}
I have a file with 16,900,000 lines, each line contains 10 numbers (mix of int and floats). I'm trying to read this file line by line, modify each line slightly, and write to a series of new files. The code below works on a laptop running Windows Visa, but when I run it on a desktop running Windows 7, the output file does not contain all the data from the input file. The number of lines in the output file varies from 2500 to 40,000.
I've commented out all of the processing, and writing to files, and just write every 100th line to cout, the last line to print isn't the last line of the file.
// skipping code prior to the loop
// only including minimal code that reproduces the problem
ifstream infile((srcdir+filename).c_str());
string line;
int lcount=0;
while(getline(infile,line)){
if(line.find("#")==string::npos){
lcount++;
if(lcount%100==0){
printf("Generating tiles for %s: %d lines processed\n",filename.c_str(),lcount);
}
}
}
Questions:
Is there a maximum buffer size that I might be overflowing?
Can anyone see a problem with my code?
Is there any reason this would work fine on Windows Vista, but not on Windows 7?
In your question there is very little information. But I will put here my guesses:
I don't think lcount is in being incremented at the correct place in code. It get incremented only if '#' is not in the line (I'm guessing this is some sort of comment). But a line starting (o containing) '#' has to be counted too. So your code should be:
while(getline(infile,line)){
lcount++; // Increment every time.
if(line.find("#")==string::npos){
if(lcount%100==0){
printf("Generating tiles for %s: %d lines processed\n",filename.c_str(),lcount);
}
}
}
On the other hand, you should activate isftream exceptions in order to see whats going on, see here how to doit.
While trying to create a minimal full program that would reproduce the problem, I found the culprit.
I'm trying to secure the executable with the keylock dongle that my company is using. When I removed the check for the dongle, the entire file was read.
I am currently stuck in this problem that I do not have any idea to fix. It is regarding a previous question that I have asked here before. But I will reiterate again as I found out the problem but have no idea to fix it.
My program accesses a text file that is updated constantly every millisecond 24/7. It grabs the data line by line and does comparison on each of the line. If any thing is "amiss"(defined by me), then I log that data into a .csv file. This program can be run at timed intervals(user defined).
My problem is that this program works perfectly fine on my computer but yet it doesnt on my clients computer. I have debug the program and these are my findings. Below is my code that I have reduced as much possible to ease the explanation process
int result;
char ReadLogLine[100000] = "";
FILE *readLOG_fp;
CString LogPathName;
LogPathName = Source_Folder + "\\serco.log"; //Source_Folder is found in a .ini file. Value is C:\\phython25\\xyratex\\serco_logs
readLOG_fp = fopen(LogPathName, "r+t");
while ((result = fscanf(readLOG_fp, "%[^\n]\n", ReadLogLine)) != EOF) // Loops through the file till it reaches the end of file
{
Sort_Array(); // Here is a function to sort the different lines that I grabbed from the text file
Comp_State(); // I compare the lines here and store them into an array to be printed out
}
fclose(readLOG_fp);
GenerateCSV(); // This is my function to generate the csv and print everything out
In Sort_Array(), I sort the lines that I grab from the text file as they could be of different nature. For example,
CStringArray LineType_A, LineType_B, LineTypeC, LineTypeD;
if (ReadLogLine == "Example line a")
{
LineType_A.add(ReadLogLine);
}
else if (ReadLogLine == "Example line b")
{
LineType_B.add(ReadLogLine);
}
and so on.
In CompState(), I compare the different values within each LineType array to see if there are any difference. If it is different, then I store them into a seperate array to print. A simple example would be.
CStringArray PrintCSV_Array;
for (int i = 0; i <= LineType_A.GetUpperBound(); i++)
{
if (LineType_A.GetAt(0) == LineType_A.GetAt(1))
{
LineType_A.RemoveAt(0);
}
else
{
LineType_A.RemoveAt(0);
PrintCSV_Array.Add(LineType_A.GetAt(0);
}
}
This way I dont have an infinite amount of data in the array.
Now to the GenerateCSV function, it is just a normal function where I create a .csv file and print whatever I have in the PrintCSV_Array.
Now to the problem. In my client's computer, it seems to not print anything out to the CSV. I debugged the program and found out that it keeps failing here.
while ((result = fscanf(readLOG_fp, "%[^\n]\n", ReadLogLine)) != EOF) // Loops through the file till it reaches the end of file
{
Sort_Array(); // Here is a function to sort the different lines that I grabbed from the text file
Comp_State(); // I compare the lines here and store them into an array to be printed out
}
fclose(readLOG_fp);
It goes into the while loop fine as I did some error checking there in the actual program. The moment it goes into the while loop it breaks out of it suggesting to me it reach EOF for some reason. When that happens, the program has no chance to go into both the Sort_Array and Comp_State functions thus giving me a blank PrintCSV_Array and nothing to print out.
Things that I have checked is that
I definitely have access to the text file.
My thoughts were because the text file is updated every
millisecond, it may have been opened by the other program to write
into it and thus not giving me access OR the text file is always in
an fopen state therefore not saving any data in for me to read. I
tested this out and the program has value added in as I see the KB's
adding up in front of my eyes.
I tried to copy the text file and paste it somewhere else for my
program to read, this way I definitely have full access to it and
once I am done with it, Ill delete it. This gave me nothing to print
aswell.
Am I right to deduce that it is always giving me EOF thus this is
having problems.
while ((result = fscanf(readLOG_fp, "%[^\n]\n", ReadLogLine)) != EOF) // Loops through the file till it reaches the end of file
If yes, How do I fix this?? What other ways can I make it read every line. I have seriously exhausted all my ideas on this problem and need some help in this.
Thanks everyone for your help.
Error is very obvious ... you might have over looked it..
You forgot to open the file.
FILE *readLOG_fp;
CString LogPathName;
LogPathName = Source_Folder + "\\serco.log";
readLOG_fp = fopen(LogPathName.GetBuffer());
if(readLOG_fp==NULL)
cout<<"Error: opening file\n";
I need to store each lines of a text document into a vector. However any file text I try, the output is always 2 lines. First one is empty and second one always output: "DONE". I'm on Windows7 X64, using VC++2013.
I have been trying to solve this for many hours. I tried many different approach but the result stay the same. I suspect that "DONE" is the return value from getline() however I don't understand with my code is not working like it should.
int main() {
ifstream hFile("test.txt");
vector<string> lines;
string line;
while (std::getline(hFile, line))
lines.push_back(line);
cout << lines[1];
hFile.close();
getchar();
return 0;
}
EDIT: It works fine when executing the program from the compilation folder but not in the debug console of VC++...
The program looks mostly correct. The only problem is that your code assumes that there are, at least, two lines in your file: if there are few lines, e.g., just one or the files couldn't be opened, the statement
cout << lines[1];
result in undefined behavior. Did you mean to print each line of the file rather than just the second line?
From the description of the behavior I would suspect that you file either contains the string DONE or you are actually executing a different program!
Be careful, it proves nothing about the count of lines:
cout << lines[1];
Use line.size() to count the read lines. In fact for a file with one line, it's undefined behavior to access second item.
I am trying to write a function in my program that loads a huge text-file of 216,555 words and put them as strings into a set. This works properly, but as expected, it will hang for a few micro seconds while looping through the file. But there is something funky going on with my loop, and it's not behaving properly. Please take the time to read, I am sure there's a valid reason for this, but I have no idea what to search for.
The code, which is working by the way, is this:
ifstream dictionary;
dictionary.open("Dictionary.txt");
if(dictionary.fail())
{
cout<<"Could not find Dictionary.txt. Exiting."<<endl;
exit(0);
}
int i = 0;
int progress = 216555/50;
cout<<"Loading dictionary..."<<endl;
cout<<"< >"<<endl;
cout<<"<";
for(string line; getline(dictionary, line);)
{
usleep(1); //Explanation below (not the hangtime)
i++;
if(i%progress == 0)
cout<<"=";
words.insert(line);
}
The for-loop gets every string from the file, and inserts them in the map.
This is a console-application, and I want the user to see the progress. It's not much of a delay, but I wanted to do it anyway. If you don't understand the code, I'll try to explain.
When the program starts, it first prints out "Loading Dictionary...", and then a "<" and a ">" separated by 50 spaces. Then on the next line: "<" followed by a "=" for every 433 word it loops through (216555/50). The purpose of this is so the user can see the progress. The wanted output half way through the loop would be like this:
Loading dictionary...
< >
<=========================
My problem is:
The correct output is shown, but not at the expected time. It prints out the full progress bar, but that only after it has hanged and are done with the loop. How is that possible? The correct number of '=' are shown, but they all pop out at the same time, AFTER it hangs for some microseconds. I added the usleep(1) to make the hangtime a bit longer, but the same thing happened. The if-statement clearly works, or the '=' would've never showed at all, but it feels like my program is stacking the cout-calls for after the entire loop.
The weirdest thing of all, is that the last cout<<"<"; before the for-loop starts, is also shown at the same time as the rest of its line; after the loop is finished. Why and how?
You never flush the stream, so the output just goes into a buffer.
Change cout<<"="; to cout<<"="<<std::flush;
You need to flush the output stream.
cout << "=" << std::flush;
The program is "stacking the cout-calls". It's called output buffering, and every major operating system does it.
When in interactive mode (as your program is intended to be used), output is buffered by line; that is, it will be forced to the terminal when a newline is seen. You can also have block-buffered (fixed number of bytes between forced outputs; used when piping output) and unbuffered.
C++ provides the std::flush stream modifier to force an output at any point. It can be used like this:
cout << "=" << std::flush;
This will slow down your program a bit; the point of buffering is for efficiency. As you'll only be doing it about 51 times, though, the slowdown should be negligible.