Can't iterate through all the words in the file.txt - c++

I have a text file that references two other text files, i.e. main.txt contains eg1.txt and eg2.txt. I have to access their contents, find the occurrences of every word, and return a string with each word and the documents it was present in (0 being eg1.txt and 1 being eg2.txt). My program compiles, but I can't get past the first word I encounter. It gives the right result (word: 0 1), since the word is present in both files and in the first position, but it doesn't return the other words. Could someone please help me find the error? Thank you
string func(string filename) {
map<string, set<int> > invInd;
string line, word;
int fileNum = 0;
ifstream list (filename, ifstream::in);
while (!list.eof()) {
string fileName;
getline(list, fileName);
ifstream input_file(fileName, ifstream::in); //function to iterate through file
if (input_file.is_open()) {
while (getline(input_file, line)) {
stringstream ss(line);
while (ss >> word) {
if (invInd.find(word) != invInd.end()) {
set<int>&s_ref = invInd[word];
s_ref.insert(fileNum);
}
else {
set<int> s;
s.insert(fileNum);
invInd.insert(make_pair<string, set<int> >(string(word) , s));
}
}
}
input_file.close();
}
fileNum++;
}

Basically, your function works. It is a little bit complicated, but it works.
After removing some syntax errors, the main problem is that you do not return anything from your function. There is also no output statement.
Let me show you the corrected function, which shows some output.
#include <string>
#include <map>
#include <iostream>
#include <fstream>
#include <set>
#include <sstream>
#include <utility>
using namespace std;
void func(string filename) {
map<string, set<int> > invInd;
string line, word;
int fileNum = 0;
ifstream list(filename, ifstream::in);
while (!list.eof()) {
string fileName;
getline(list, fileName);
ifstream input_file(fileName, ifstream::in); //function to iterate through file
if (input_file.is_open()) {
while (getline(input_file, line)) {
stringstream ss(line);
while (ss >> word) {
if (invInd.find(word) != invInd.end()) {
set<int>& s_ref = invInd[word];
s_ref.insert(fileNum);
}
else {
set<int> s;
s.insert(fileNum);
invInd.insert(make_pair(string(word), s));
}
}
}
input_file.close();
}
fileNum++;
}
// Show the output
for (const auto& [word, fileNumbers] : invInd) {
std::cout << word << " : ";
for (const int fileNumber : fileNumbers) std::cout << fileNumber << ' ';
std::cout << '\n';
}
return;
}
int main() {
func("files.txt");
}
This works, I tested it. But maybe you want to return the findings to your main function. Then you should write:
#include <string>
#include <map>
#include <iostream>
#include <fstream>
#include <set>
#include <sstream>
#include <utility>
using namespace std;
map<string, set<int> > func(string filename) {
map<string, set<int> > invInd;
string line, word;
int fileNum = 0;
ifstream list(filename, ifstream::in);
while (!list.eof()) {
string fileName;
getline(list, fileName);
ifstream input_file(fileName, ifstream::in); //function to iterate through file
if (input_file.is_open()) {
while (getline(input_file, line)) {
stringstream ss(line);
while (ss >> word) {
if (invInd.find(word) != invInd.end()) {
set<int>& s_ref = invInd[word];
s_ref.insert(fileNum);
}
else {
set<int> s;
s.insert(fileNum);
invInd.insert(make_pair(string(word), s));
}
}
}
input_file.close();
}
fileNum++;
}
return invInd;
}
int main() {
map<string, set<int>> data = func("files.txt");
// Show the output
for (const auto& [word, fileNumbers] : data) {
std::cout << word << " : ";
for (const int fileNumber : fileNumbers) std::cout << fileNumber << ' ';
std::cout << '\n';
}
}
Please enable C++17 in your compiler; the structured bindings in the output loop require it.
And please see below a brushed-up solution: a little cleaner and more compact, with comments and better variable names.
#include <string>
#include <map>
#include <iostream>
#include <fstream>
#include <set>
#include <sstream>
#include <utility>
using WordFileIndicator = std::map<std::string, std::set<int>>;
WordFileIndicator getWordsWithFiles(const std::string& fileNameForFileLists) {
// Here we will store the resulting output
WordFileIndicator wordFileIndicator{};
// Open the file and check, if it could be opened
if (std::ifstream istreamForFileList{ fileNameForFileLists }; istreamForFileList) {
// File number Reference
int fileNumber{};
// Read all filenames from the list of filenames
for (std::string fileName{}; std::getline(istreamForFileList, fileName) and not fileName.empty();) {
// Open the files to read their content. Check, if the file could be opened
if (std::ifstream ifs{ fileName }; ifs) {
// Add word and associated file number to set
for (std::string word{}; ifs >> word; )
wordFileIndicator[word].insert(fileNumber);
}
else std::cerr << "\n*** Error: Could not open '" << fileName << "'\n\n";
// Continue with next file
++fileNumber;
}
}
else std::cerr << "\n*** Error: Could not open '" << fileNameForFileLists << "'\n\n";
return wordFileIndicator;
}
// Some test code
int main() {
// Get result. All words and in which files they exist
WordFileIndicator data = getWordsWithFiles("files.txt");
// Show the output
for (const auto& [word, fileNumbers] : data) {
std::cout << word << " : ";
for (const int fileNumber : fileNumbers) std::cout << fileNumber << ' ';
std::cout << '\n';
}
}
A solution using std::unordered_map and std::unordered_set would probably be faster for large inputs, although the words would no longer come out sorted.

Please make sure your code is composed of many small functions. This improves readability, makes it easier to reason about what the code does, and lets parts of the code be reused in other contexts.
Here is a demo of how it can look and why it is better to have small functions:
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <unordered_map>
#include <vector>
struct FileData
{
std::filesystem::path path;
int index;
};
bool operator==(const FileData& a, const FileData& b)
{
return a.index == b.index && a.path == b.path;
}
bool operator!=(const FileData& a, const FileData& b)
{
return !(a == b);
}
using WordLocations = std::unordered_map<std::string, std::vector<FileData>>;
template<typename T>
void mergeWordsFrom(WordLocations& loc, const FileData& fileData, T b, T e)
{
for (; b != e; ++b)
{
auto& v = loc[*b];
if (v.empty() || v.back() != fileData)
v.push_back(fileData);
}
}
void mergeWordsFrom(WordLocations& loc, const FileData& fileData, std::istream& in)
{
return mergeWordsFrom(loc, fileData, std::istream_iterator<std::string>{in}, {});
}
void mergeWordsFrom(WordLocations& loc, const FileData& fileData)
{
std::ifstream f{fileData.path};
return mergeWordsFrom(loc, fileData, f);
}
template<typename T>
WordLocations wordLocationsFromFileList(T b, T e)
{
WordLocations loc;
FileData fileData{{}, 0};
for (; b != e; ++b)
{
fileData.path = *b;
mergeWordsFrom(loc, fileData);
++fileData.index; // increment afterwards so the first file gets index 0, as in the question
}
return loc;
}
WordLocations wordLocationsFromFileList(std::istream& in)
{
return wordLocationsFromFileList(std::istream_iterator<std::filesystem::path>{in}, {});
}
WordLocations wordLocationsFromFileList(const std::filesystem::path& p)
{
std::ifstream f{p};
f.exceptions(std::ifstream::badbit);
return wordLocationsFromFileList(f);
}
void printLocations(std::ostream& out, const WordLocations& locations)
{
for (auto& [word, filesData] : locations)
{
out << std::setw(10) << word << ": ";
for (auto& file : filesData)
{
out << std::setw(3) << file.index << ':' << file.path << ", ";
}
out << '\n';
}
}
int main()
{
auto locations = wordLocationsFromFileList("files.txt");
printLocations(std::cout, locations);
}
https://wandbox.org/permlink/nBbqYV986EsqvN3t

Related

Find random word in c++ string

How would I pick a random word in a C++ string? Also is there a better way to handle importing the words? Just started learning C++ this week.
Current Code:
#include <iostream>
#include <fstream>
#include <string>
class WordList {
private:
std::ifstream m_file;
std::string m_wl;
public:
WordList(const char* f) {
m_file.open(f);
std::string str;
while (std::getline(m_file, str)) {
m_wl += str;
m_wl.push_back('\n');
}
}
bool isWord(std::string s) {
if (m_wl.find(s) != std::string::npos) {
return true;
} else {
return false;
}
}
};
int main() {
std::cout << "Generating" << std::endl;
WordList list("sgb-words.txt");
std::cout << list.isWord("hello") << std::endl;
}

Skip integers while reading text from file into array

I am trying to create a program where I could read string data from a file and store it into an array, but I want to skip any integers or non-letters and not read them into the array. Any ideas on how to do that?
This is my code:
#include <iostream>
#include <fstream>
#include <iomanip>
#include <cstdlib>
#include <algorithm>
#include <cctype>
#include <string>
using namespace std;
void loadData();
int main()
{
loadData();
return 0;
}
void loadData()
{
const int SIZE = 100;
string fileName;
std::string wordArray[SIZE];
cout << "Please enter the name of the text file you want to process followed by '.txt': " << endl;
cin >> fileName;
ifstream dataFile;
dataFile.open(fileName, ios::in);
if (dataFile.fail())
{
cerr << fileName << " could not be opened." << endl; //error message if file opening fails
exit(-1);
}
while (!dataFile.eof())
{
for (int i = 0; i < SIZE; i++)
{
dataFile >> wordArray[i];
for (std::string& s : wordArray) //this for loop transforms all the words in the text file into lowercase
std::transform(s.begin(), s.end(), s.begin(),
[](unsigned char c) { return std::tolower(c); });
cout << wordArray[i] << endl;
}
}
}
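Use copy_if into a fresh string, after the tolower transform you already have (the source and destination ranges of copy_if must not overlap, and the predicate must not modify its argument):
for (std::string& s : wordArray) {
std::string letters; // needs #include <iterator> for std::back_inserter
std::copy_if(s.begin(), s.end(), std::back_inserter(letters),
[](unsigned char c) { return std::isalpha(c) != 0; });
s = letters; // keep only the letters
}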
Note that this may not be the most efficient code.
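An alternative that filters the string in place is the erase-remove idiom. The following is only a sketch; the helper name `lettersOnlyLower` is made up for this example, and it assumes single-byte characters as classified by `std::isalpha`.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: return `s` with every non-letter removed and the
// remaining letters converted to lowercase
std::string lettersOnlyLower(std::string s) {
    // erase-remove idiom: remove_if shifts the kept characters to the front,
    // erase then cuts off the leftover tail
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char c) { return !std::isalpha(c); }),
            s.end());
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}
```

A word that consists only of digits comes back empty, so the caller can skip empty results instead of storing them in the array.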
This is a scenario where regexes can come in handy.
They do require bidirectional iterators though, so you need to read the whole file into a string before extracting words.
#include <iostream>
#include <iterator>
#include <fstream>
#include <regex>
#include <string>
std::string read_whole_file(const std::string& file_name) {
std::ifstream file(file_name);
return {std::istreambuf_iterator<char>(file),
std::istreambuf_iterator<char>()};
}
int main()
{
// ...
auto file_data = read_whole_file(filename);
std::regex word_regex("[A-Za-z]+"); // letters only; \w would also match digits and '_'
auto words_begin =
std::sregex_iterator(file_data.begin(), file_data.end(), word_regex);
auto words_end = std::sregex_iterator();
for (auto i = words_begin; i != words_end; ++i) {
std::cout << "found word: " << i->str() << '\n';
}
}

strings in if statement

I am trying to write code that lists all the words used in a text file without repeats. I managed to list all the words, but I always get repeats: the if statement on line 17 always evaluates to 0. I have no idea why; the words are stored properly in the vector. Any suggestions?
#include <iostream>
#include <fstream>
#include <vector>
using namespace std;
class reading {
public:
string word;
vector<string> words;
};
int checkifexist(string word) {
reading readingobject;
bool exist = false;
for (int i = 0; i < readingobject.words.size(); i++) {
if (word == readingobject.words[i]) {
exist = true;
break;
}
}
return exist;
}
int main() {
reading readingobject;
ifstream inFile;
inFile.open("Book.txt");
if (inFile.fail()) {
cout << "file didn't open" << endl;
exit(1);
}
readingobject.word.resize(1);
while (!inFile.eof()) {
inFile >> readingobject.word;
if (checkifexist(readingobject.word) == 1)
continue;
cout << readingobject.word << endl;
readingobject.words.push_back(readingobject.word);
}
return 0;
}
Inside of checkifexist(), you are creating a new reading object, whose words vector is empty, so there is nothing for the loop to do, and the function returns 0.
Instead, you need to pass the reading object from main() in as an input parameter, e.g.:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
class reading {
public:
vector<string> words;
};
bool checkifexist(const reading &readingobject, const string &word)
{
for (size_t i = 0; i < readingobject.words.size(); ++i) {
if (word == readingobject.words[i]) {
return true;
}
}
return false;
/* alternatively (needs #include <algorithm>):
return (std::find(readingobject.words.begin(), readingobject.words.end(), word) != readingobject.words.end());
*/
}
int main()
{
reading readingobject;
string word;
ifstream inFile;
inFile.open("Book.txt");
if (!inFile) {
cout << "file didn't open" << endl;
return 1;
}
while (inFile >> word) {
if (checkifexist(readingobject, word))
continue;
cout << word << endl;
readingobject.words.push_back(word);
}
return 0;
}
Alternatively, when it comes to tracking unique elements, you can use a std::set instead of a std::vector, e.g.:
#include <iostream>
#include <fstream>
#include <set>
using namespace std;
class reading {
public:
set<string> words;
};
int main()
{
reading readingobject;
string word;
ifstream inFile;
inFile.open("Book.txt");
if (!inFile) {
cout << "file didn't open" << endl;
return 1;
}
while (inFile >> word) {
if (readingobject.words.insert(word).second)
cout << word << endl;
}
return 0;
}

How to skip blank spaces when reading in a file c++

Here is the codeshare link of the exact input file: https://codeshare.io/5DBkgY
OK, as you can see, there are 2 blank lines (or tabs) between 8 and ROD. How would I skip them and continue with the program? I am trying to put each line into one of 3 vectors (so keys, lamp, and rod into one vector, etc.). Here is my code (but it does not skip the blank lines):
#include <string>
#include <iostream>
#include <sstream>
#include <vector>
#include <fstream>
using namespace std;
int main() {
ifstream objFile;
string inputName;
string outputName;
string header;
cout << "Enter image file name: ";
cin >> inputName;
objFile.open(inputName);
string name;
vector<string> name2;
string description;
vector<string> description2;
string initialLocation;
vector<string> initialLocation2;
string line;
if(objFile) {
while(!objFile.eof()){
getline(objFile, line);
name = line;
name2.push_back(name);
getline(objFile, line);
description = line;
description2.push_back(description);
getline(objFile, line);
initialLocation = line;
initialLocation2.push_back(initialLocation);
} // end of while
} else {
cout << "not working" << endl;
}
for (std::vector<string>::const_iterator i = name2.begin(); i != name2.end(); ++i)
std::cout << *i << ' ';
for (std::vector<string>::const_iterator i = description2.begin(); i != description2.end(); ++i)
std::cout << *i << ' ';
for (std::vector<string>::const_iterator i = initialLocation2.begin(); i != initialLocation2.end(); ++i)
std::cout << *i << ' ';
} // end of main
#include <cstddef> // std::size_t
#include <cstdlib> // EXIT_FAILURE
#include <cctype> // std::isspace()
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
bool is_empty(std::string const &str)
{
for (auto const &ch : str)
if (!std::isspace(static_cast<char unsigned>(ch)))
return false;
return true;
}
int main()
{
std::cout << "Enter image file name: ";
std::string filename;
std::getline(std::cin, filename); // at least on Windows paths containing whitespace
// are valid.
std::ifstream obj_file{ filename }; // define variables as close to where they're used
// as possible and use the ctors for initialization.
if (!obj_file.is_open()) { // *)
std::cerr << "Couldn't open \"" << filename << "\" for reading :(\n\n";
return EXIT_FAILURE;
}
std::vector<std::string> name;
std::vector<std::string> description;
std::vector<std::string> initial_location;
std::string line;
std::vector<std::string> *destinations[] = { &name, &description, &initial_location };
for (std::size_t i{}; std::getline(obj_file, line); ++i) {
if (is_empty(line)) { // if line only consists of whitespace
--i;
continue; // skip it.
}
destinations[i % std::size(destinations)]->push_back(line);
}
for (auto const &s : name)
std::cout << s << '\n';
for (auto const &s : description)
std::cout << s << '\n';
for (auto const &s : initial_location)
std::cout << s << '\n';
}
... initial_locations look like integers, though.
*) Better early exit if something bad happens. Instead of
if (obj_file) {
// do stuff
}
else {
// exit
}
-->
if(!obj_file)
// exit
// do stuff
makes your code easier to read and takes away one level of indentation for the most part.

trying to add words from text file to a vector but keep getting thrown 'std::out_of_range'

I'm trying to add words from this text file but keep getting an out-of-range error thrown. I think the error lies somewhere in the loops, but I haven't been able to figure out why it isn't working. Help would be greatly appreciated.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
struct WordCount{
string word;
int count;
};
int main () {
vector<WordCount> eggsHam;
ifstream readFile ("NewTextDocument.txt");
int counter = 0;
int holder;
string lineRead;
WordCount word;
if(readFile.is_open()){
//add all the words into a vector
while (getline(readFile, lineRead)){
holder = counter;
for(int i = 0; i < lineRead.length(); ++i) {
if (lineRead.at(i) != ' ') {
++counter;
}
if (lineRead.at(i) != ' ') {
for (int k = 0; k < (counter - holder); ++k) {
word.word.at(k) = lineRead.at(holder + k);
}
eggsHam.push_back(word);
++counter;
}
}
}
readFile.close();
}
else cout << "Unable to open file";
return 0;
}
Your code is way too complicated. To read all words (= space-separated thingies) into a std::vector<std::string>, simply do:
#include <cstdlib>
#include <vector>
#include <string>
#include <iterator>
#include <fstream>
#include <iostream>
int main()
{
char const *filename = "test.txt";
std::ifstream is{ filename };
if (!is.is_open()) {
std::cerr << "Couldn't open \"" << filename << "\" for reading :(\n\n";
return EXIT_FAILURE;
}
std::vector<std::string> words{ std::istream_iterator<std::string>{ is },
std::istream_iterator<std::string>{} };
for (auto const &w : words)
std::cout << w << '\n';
}