I have written a program that processes text files one at a time and extracts relevant information. My program works well with some of the text files and not others. There is no obvious difference between the files that run seamlessly through my program and those that don't.
As far as the problematic files are concerned:
the program opens the file
it reads in and processes a good chunk of the lines one at a time as it should
But then it reaches a problem line and gives the error message:
"Debug Assertion Failed File:
f:/dd/vctools/crt_bld/self_x86/src/isctype.c
Line: 56
Expression: (unsigned)(c+1) <= 256"
When I enter debugger mode, the problem seems to arise from the "while(tokenScanner)" loop in my code below. I pulled up the content of the problem line being processed, compared it across a couple of problem files, and found that the Assertion Failure message pops up at </li>, where the last token being processed is ">". It's not clear to me why this is a problem. In the original text file this token is contiguous with <li, in the form </li><li, so the scanner is having trouble halfway through this string.
Any thoughts on why this is and how I can fix this? Any advice would be much appreciated!
Here is the relevant portion of my code:
#include <string>
#include <iostream>
#include <fstream> //to get data from files
#include "filelib.h"
#include "console.h"
#include "tokenScanner.h"
#include "vector.h"
#include "ctype.h"
#include "math.h"
using namespace std;
/*Prototype Function*/
void evaluate(string expression);
Vector<string> myVectorOfTokens; //will store the tokens
Vector<string> myFileNames;
/*Main Program*/
int main() {
/*STEP1 : Creating a vector of the list of file names
to iterate over for processing*/
ifstream infile; //declaring variable to refer to file list
string catchFile = promptUserForFile(infile, "Input file:");
string line; //corresponds to the lines in the master file containing the list files
while(getline(infile, line)){
myFileNames.add(line);
}
/* STEP 2: Iterating over the file names contained in the vector*/
int countFileOpened=0; //keeps track of number of opened files
for (int i=1; i< myFileNames.size(); i++){
myVectorOfTokens.clear(); //resetting the vector of tokens for each new file
string fileName;
string line2;
ifstream inFile;
fileName= myFileNames[i];
inFile.open(fileName.c_str()); //open file convert c_str
if (inFile){
while(getline(inFile, line2)){
evaluate(line2);
}
}
inFile.close();
countFileOpened++;
}
return 0;
}
/*Function for Extracting the Biographer Name*/
void evaluate(string line){
/*Creating a Vector of Tokens From the Text*/
TokenScanner scanner(line); //the constructor
while (scanner.hasMoreTokens()){
string token=scanner.nextToken();
myVectorOfTokens.add(token);
}
}
while(!inFile.eof())
is just wrong (in almost any case)
while(getline(inFile, line2))
evaluate(line2);
is better
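A separate point about the assertion itself: the expression (unsigned)(c+1) <= 256 is the debug C runtime's range check for the character classification functions (isspace, isalnum and friends). It typically fires when one of them receives a negative value, which happens when a plain char holds a byte above 127, for example an accented letter or a "smart quote" in the input file. Whether that is what happens inside TokenScanner cannot be confirmed from the post, so treat the following as a sketch of the usual remedy, with hypothetical helper names, rather than a confirmed fix:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: classify a char safely by converting it to
// unsigned char first, so bytes above 127 never become negative ints.
bool isSpaceSafe(char c) {
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: count alphanumeric characters without ever
// handing a negative value to std::isalnum.
std::size_t countAlphanumeric(const std::string& text) {
    std::size_t count = 0;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (std::isalnum(static_cast<unsigned char>(text[i]))) {
            ++count;
        }
    }
    return count;
}
```

If the classification calls live inside a library you cannot change, it may be worth checking the problem lines for non-ASCII bytes just before </li> and filtering them out before the line reaches the scanner.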
I have a question: how can I split one .txt file into three files based on keywords, using C++? Each keyword has its own sentences, so each new file should contain one keyword together with its sentences. I have managed to print the text to the console, but I can't separate it by keyword.
The file contains many sentences, and every sentence starts with one of the words error, warning, or information. How do I separate the sentences by their first word and write them to three separate files?
Can you help me, please?
I've tried this code, and it failed.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(){
ifstream myFile;
string data,output,buffer, line;
bool isData = false;
myFile.open("try.txt");
while(getline(myFile, buffer)){
if (buffer == "Error"){
getline(myFile,buffer);
cout<< buffer <<endl;
}
}
cin.get();
return 0;
}
I have a text file full of names:
smartgem
marshbraid
seamore
stagstriker
meadowbreath
hydrabrow
startrack
wheatrage
caskreaver
seaash
I want to code a random name generator that will copy a specific line from the .txt file and return it.
When reading from a file, you must start at the beginning and read forward. My best advice would be to read in all of the names, store them in a container such as a std::vector, and access them randomly that way, provided you don't have stringent concerns about efficiency.
You cannot pick a random string from the end of the file without first reading up to that name in the file.
You may also want to look at fseek(), which lets you "jump" to a location within the input stream. You could randomly generate an offset and pass it to fseek(). The offset will usually land in the middle of a line, so discard input up to the next newline before reading a name.
http://www.cplusplus.com/reference/cstdio/fseek/
You cannot do that unless you do one of two things:
Generate an index for that file, containing the address of each line, then you can go straight to that address and read it. This index can be stored in many different ways, the easiest one being on a separate file, this way the original file can still be considered a text file, or;
Structure the file so that each line starts at a fixed distance in bytes from the previous one, so you can go to the line you want by seeking to (desired index * size). This does not mean the text on each line needs to have the same length; you can pad the end of each line with null terminators (character '\0'). In this case it is no longer advisable to treat the file as a text file; treat it as a binary file instead.
You can write a separate program that will generate this index or generate the structured file for your main program to use.
All this of course, considering you want the program to run and read the line without having to load the entire file in memory first. If your program will constantly read lines from the file, you should probably just load the entire file into a std::vector<std::string> and then read the lines at will from there.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cstdlib>
#include <ctime>
using namespace std;
int main()
{
string filePath = "test.txt";
vector<std::string> qNames;
ifstream openFile(filePath.data());
if (openFile.is_open())
{
string line;
while (getline(openFile, line))
{
qNames.push_back(line);
}
openFile.close();
}
if (!qNames.empty())
{
srand((unsigned int)time(NULL));
for (int i = 0; i < 10; i++)
{
int num = rand();
int linePos = num % qNames.size();
cout << qNames.at(linePos) << endl;
}
}
return 0;
}
I have a data set with headers and data below those headers. How do I get c++ to read the first line of actual data (which starts on the 3rd row) and keep reading until the file ends?
I know you have to use a while loop and '++' on some declared variable, but I'm not sure how to.
Here is a screenshot of the data file: (image not included)
Read the header lines into a dummy variable before your while loop. If the data starts on the third row, that means two dummy reads.
How to read line by line or a whole text file at once?
#include <fstream>
#include <string>
int main()
{
std::ifstream file("Read.txt");
std::string str;
std::getline(file, str); // dummy read: skip the first header line
std::getline(file, str); // dummy read: skip the second header line
while (std::getline(file, str)) // keep reading till end of file
{
// Process str
}
}
I want to write code that finds words in a file and replaces them.
I open the file, then I find the word. I have a problem with replacing the word.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
string contain_of_file,a="car";
string::size_type position;
ifstream NewFile;
NewFile.open("plik1.txt");
while(NewFile.good())
{
getline(NewFile, contain_of_file);
position=contain_of_file.find("Zuzia");
if(position!=string::npos)
{
NewFile<<contain_of_file.replace(position,5, a );
}
}
NewFile.close();
cin.get();
return 0;
}
How can I improve my code?
lose the using namespace std;
don't declare the variables before needed;
I think the English word you were looking for was content -- but I am not an English-native speaker;
getline returns the stream, which evaluates to true in a boolean context only while reads succeed, so the separate NewFile.good() check is unnecessary;
No need to close NewFile explicitly;
I would change the casing on the NewFile variable;
I don't think you can write to an ifstream, and you ought to manage how you are going to replace the contents of the file...
My version would be like:
#include <iostream>
#include <fstream>
#include <string>
#include <cstdio>
int main() {
std::rename("plik1.txt", "plik1.txt~");
std::ifstream old_file("plik1.txt~");
std::ofstream new_file("plik1.txt");
for( std::string contents_of_file; std::getline(old_file, contents_of_file); ) {
std::string::size_type position = contents_of_file.find("Zuzia");
if( position != std::string::npos )
contents_of_file.replace(position, 5, "car");
new_file << contents_of_file << '\n';
}
return 0;
}
There are at least two issues with your code:
1. Overwriting text in a file.
2. Writing to an ifstream (the i is for input, not output).
The File object
Imagine a file as many little boxes that contain characters. The boxes are glued front to back in an endless line.
You can take letters out of boxes and put into other boxes, but since they are glued, you can't put new boxes between existing boxes.
Replacing Text
You can replace text in a file as long as the replacement text is the same length as the original text. If the replacement text is longer, you overwrite the text that follows. If the replacement text is shorter, you leave residual text in the file. Neither outcome is good.
To replace (overwrite) the text, open the file as fstream and use the ios::in and ios::out modes.
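A minimal sketch of such an in-place overwrite, assuming the file is small enough to read into memory to locate the target and that only the first occurrence should be replaced. The function name overwriteSameLength is made up for this illustration:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Overwrites the first occurrence of `target` with `replacement` in place.
// Refuses to run unless both strings have the same length, because the
// file cannot grow or shrink in the middle.
bool overwriteSameLength(const std::string& path, const std::string& target,
                         const std::string& replacement) {
    if (target.empty() || target.size() != replacement.size()) {
        return false;
    }
    std::fstream file(path.c_str(),
                      std::ios::in | std::ios::out | std::ios::binary);
    if (!file) {
        return false;
    }
    // Read everything once to find where the target starts.
    std::string contents((std::istreambuf_iterator<char>(file)),
                         std::istreambuf_iterator<char>());
    std::string::size_type pos = contents.find(target);
    if (pos == std::string::npos) {
        return false;
    }
    file.clear();
    file.seekp(static_cast<std::streamoff>(pos));  // move the write position
    file.write(replacement.data(),
               static_cast<std::streamsize>(replacement.size()));
    file.flush();
    return file.good();
}
```

Replacing "Zuzia" with "car" would be rejected by this function, since the lengths differ; that case needs a second file.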
Input versus Output
The common technique for replacing text is to open the original file for input and a new file for output.
Copy the existing data, up to your target text, to the new file.
Copy the replacement text to the new file.
Copy any remaining text to the new file.
Close all files.
I've never used dirent.h before. I was using istringstream to read through text files (singular), but have needed to try to revise the program to read in multiple text files in a directory. This is where I tried implementing dirent, but it's not working.
Maybe I can't use it with the stringstream? Please advise.
I've taken out the fluffy stuff that I'm doing with the words for readability. This was working perfectly for one file, until I added the dirent.h stuff.
#include <cstdlib>
#include <iostream>
#include <string>
#include <sstream> // for istringstream
#include <fstream>
#include <stdio.h>
#include <dirent.h>
void main(){
string fileName;
istringstream strLine;
const string Punctuation = "-,.;:?\"'!##$%^&*[]{}|";
const char *commonWords[] = {"AND","IS","OR","ARE","THE","A","AN",""};
string line, word;
int currentLine = 0;
int hashValue = 0;
//// these variables were added to new code //////
struct dirent *pent = NULL;
DIR *pdir = NULL; // pointer to the directory
pdir = opendir("documents");
//////////////////////////////////////////////////
while(pent = readdir(pdir)){
// read in values line by line, then word by word
while(getline(cin,line)){
++currentLine;
strLine.clear();
strLine.str(line);
while(strLine >> word){
// insert the words into a table
}
} // end getline
//print the words in the table
closedir(pdir);
}
You should be using int main() and not void main().
You should be error checking the call to opendir().
You will need to open a file instead of using cin to read the contents of the file. And, of course, you will need to ensure that it is closed appropriately (which might be by doing nothing and letting a destructor do its stuff).
Note that the file name will be a combination of the directory name ("documents") and the file name returned by readdir().
Note too that you should probably check for directories (or, at least, for "." and "..", the current and parent directories).
The book "Ruminations on C++" by Andrew Koenig and Barbara Moo has a chapter that discusses how to wrap the opendir() family of functions in C++ to make them behave better for a C++ program.
Heather asks:
What do I put in getline() instead of cin?
At the moment the code reads from standard input, a.k.a. cin. That means that if you launch your program with ./a.out < program.cpp, it will read your program.cpp file, regardless of what it finds in the directory. So, you need to create a new input file stream based on the file you've found with readdir():
while (pent = readdir(pdir))
{
...create name from "documents" and pent->d_name
...check that name is not a directory
...open the file for reading (only) and check that it succeeded
...use a variable such as fin for the file stream
// read in values line by line, then word by word
while (getline(fin, line))
{
...processing of lines as before...
}
}
You probably can get away with just opening the directories since the first read operation (via getline()) will fail (but you should probably arrange to skip the . and .. directory entries based on their name). If fin is a local variable in the loop, then when the outer loop cycles around, fin will be destroyed, which should close the file.
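Putting the outline together, a sketch might look like the following. It assumes a POSIX system (dirent.h and stat are not part of standard C++), and the helper names listRegularFiles and countWords are invented here; counting words stands in for whatever per-word processing the real program does.

```cpp
#include <dirent.h>
#include <sys/stat.h>

#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Returns "directory/name" for every regular file in `directory`.
// Subdirectories, including "." and "..", are skipped by the stat check.
std::vector<std::string> listRegularFiles(const std::string& directory) {
    std::vector<std::string> paths;
    DIR* dir = opendir(directory.c_str());
    if (dir == NULL) {
        return paths;  // directory missing or unreadable
    }
    while (struct dirent* entry = readdir(dir)) {
        std::string path = directory + "/" + entry->d_name;
        struct stat info;
        if (stat(path.c_str(), &info) == 0 && S_ISREG(info.st_mode)) {
            paths.push_back(path);
        }
    }
    closedir(dir);  // close once, after the loop has finished
    return paths;
}

// Reads one file line by line, then word by word, and counts the words.
std::size_t countWords(const std::string& path) {
    std::ifstream fin(path.c_str());  // closed by its destructor
    std::size_t count = 0;
    std::string line, word;
    while (std::getline(fin, line)) {
        std::istringstream words(line);
        while (words >> word) {
            ++count;
        }
    }
    return count;
}
```

Note that closedir() is called after the readdir() loop, not inside it; closing the directory inside the loop, as in the question's code, would invalidate the handle before the next readdir() call.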