Skip integers while reading text from file into array - c++

I am trying to create a program where I could read string data from a file and store it into an array, but I want to skip any integers or non-letters and not read them into the array. Any ideas on how to do that?
This is my code:
#include <iostream>
#include <stream>
#include <iomanip>
#include <cstdlib>
#include <algorithm>
#include <cctype>
#include <string>
using namespace std;
void loadData();
int main()
{
loadData();
return 0;
}
void loadData()
{
const int SIZE = 100;
string fileName;
std::string wordArray[SIZE];
cout << "Please enter the name of the text file you want to process followed by '.txt': " << endl;
cin >> fileName;
ifstream dataFile;
dataFile.open(fileName, ios::in);
if (dataFile.fail())
{
cerr << fileName << " could not be opened." << endl; //error message if file opening fails
exit(-1);
}
while (!dataFile.eof())
{
for (int i = 0; i < SIZE; i++)
{
dataFile >> wordArray[i];
for (std::string& s : wordArray) //this for loop transforms all the words in the text file into lowercase
std::transform(s.begin(), s.end(), s.begin(),
[](unsigned char c) { return std::tolower(c); });
cout << wordArray[i] << endl;
}
}
}

Use copy_if:
for (std::string& s : wordArray)
std::copy_if(s.begin(), s.end(), s.begin(),
[](char& c) { c = std::tolower(c); return std::isalpha(c); });
Note that this may not be the most efficient code.

This is a scenario where regexes can come in handy.
They do require forward iterators though, so you need to read in the whole file at once before extracting words.
#include <iostream>
#include <iterator>
#include <fstream>
#include <regex>
std::string read_whole_file(const std::string& file_name) {
std::ifstream file(file_name);
return {std::istreambuf_iterator<char>(file),
std::istreambuf_iterator<char>()};
}
int main()
{
// ...
auto file_data = read_whole_file(filename);
std::regex word_regex("(\\w+)");
auto words_begin =
std::sregex_iterator(file_data.begin(), file_data.end(), word_regex);
auto words_end = std::sregex_iterator();
for (auto i = words_begin; i != words_end; ++i) {
std::cout << "found word" << i->str() << '\n';
}
}

Related

Can't iterate through all the words in thr file.txt

I have a txt file which contains two txt file references ei. main.txt contains eg1.txt and eg2.txt and i have to access the content in them and find the occurences of every word and return a string with the word and the documents it was preasent in(0 being eg1.txt and 1 being eg2.txt). My program compiles but I can't get past the first word I encounter. It gives the right result (word: 0 1) since the word is preasent in both the files and in the first position but it doesn't return the other words. Could someone please help me find the error? Thank you
string func(string filename) {
map<string, set<int> > invInd;
string line, word;
int fileNum = 0;
ifstream list (filename, ifstream::in);
while (!list.eof()) {
string fileName;
getline(list, fileName);
ifstream input_file(fileName, ifstream::in); //function to iterate through file
if (input_file.is_open()) {
while (getline(input_file, line)) {
stringstream ss(line);
while (ss >> word) {
if (invInd.find(word) != invInd.end()) {
set<int>&s_ref = invInd[word];
s_ref.insert(fileNum);
}
else {
set<int> s;
s.insert(fileNum);
invInd.insert(make_pair<string, set<int> >(string(word) , s));
}
}
}
input_file.close();
}
fileNum++;
}
Basically your function works. It is a little bit complicated, but i works.
After removing some syntax errors, the main problem is, that you do return nothing from you function. There is also no output statement.
Let me show you you the corrected function which shows some output.
#include <string>
#include <map>
#include <iostream>
#include <fstream>
#include <set>
#include <sstream>
#include <utility>
using namespace std;
void func(string filename) {
map<string, set<int> > invInd;
string line, word;
int fileNum = 0;
ifstream list(filename, ifstream::in);
while (!list.eof()) {
string fileName;
getline(list, fileName);
ifstream input_file(fileName, ifstream::in); //function to iterate through file
if (input_file.is_open()) {
while (getline(input_file, line)) {
stringstream ss(line);
while (ss >> word) {
if (invInd.find(word) != invInd.end()) {
set<int>& s_ref = invInd[word];
s_ref.insert(fileNum);
}
else {
set<int> s;
s.insert(fileNum);
invInd.insert(make_pair(string(word), s));
}
}
}
input_file.close();
}
fileNum++;
}
// Show the output
for (const auto& [word, fileNumbers] : invInd) {
std::cout << word << " : ";
for (const int fileNumber : fileNumbers) std::cout << fileNumber << ' ';
std::cout << '\n';
}
return;
}
int main() {
func("files.txt");
}
This works, I tested it. But maybe you want to return the findings to your main function. Then you should write:
#include <string>
#include <map>
#include <iostream>
#include <fstream>
#include <set>
#include <sstream>
#include <utility>
using namespace std;
map<string, set<int> > func(string filename) {
map<string, set<int> > invInd;
string line, word;
int fileNum = 0;
ifstream list(filename, ifstream::in);
while (!list.eof()) {
string fileName;
getline(list, fileName);
ifstream input_file(fileName, ifstream::in); //function to iterate through file
if (input_file.is_open()) {
while (getline(input_file, line)) {
stringstream ss(line);
while (ss >> word) {
if (invInd.find(word) != invInd.end()) {
set<int>& s_ref = invInd[word];
s_ref.insert(fileNum);
}
else {
set<int> s;
s.insert(fileNum);
invInd.insert(make_pair(string(word), s));
}
}
}
input_file.close();
}
fileNum++;
}
return invInd;
}
int main() {
map<string, set<int>> data = func("files.txt");
// Show the output
for (const auto& [word, fileNumbers] : data) {
std::cout << word << " : ";
for (const int fileNumber : fileNumbers) std::cout << fileNumber << ' ';
std::cout << '\n';
}
}
Please enable C++17 in your compiler.
And please see below a brushed up solution. A little bit cleaner and compacter, with comments and better variable names.
#include <string>
#include <map>
#include <iostream>
#include <fstream>
#include <set>
#include <sstream>
#include <utility>
using WordFileIndicator = std::map<std::string, std::set<int>>;
WordFileIndicator getWordsWithFiles(const std::string& fileNameForFileLists) {
// Here will stor the resulting output
WordFileIndicator wordFileIndicator{};
// Open the file and check, if it could be opened
if (std::ifstream istreamForFileList{ fileNameForFileLists }; istreamForFileList) {
// File number Reference
int fileNumber{};
// Read all filenames from the list of filenames
for (std::string fileName{}; std::getline(istreamForFileList, fileName) and not fileName.empty();) {
// Open the files to read their content. Check, if the file could be opened
if (std::ifstream ifs{ fileName }; ifs) {
// Add word and associated file number to set
for (std::string word{}; ifs >> word; )
wordFileIndicator[word].insert(fileNumber);
}
else std::cerr << "\n*** Error: Could not open '" << fileName << "'\n\n";
// Continue with next file
++fileNumber;
}
}
else std::cerr << "\n*** Error: Could not open '" << fileNameForFileLists << "'\n\n";
return wordFileIndicator;
}
// Some test code
int main() {
// Get result. All words and in which file they exists
WordFileIndicator data = getWordsWithFiles("files.txt");
// Show the output
for (const auto& [word, fileNumbers] : data) {
std::cout << word << " : ";
for (const int fileNumber : fileNumbers) std::cout << fileNumber << ' ';
std::cout << '\n';
}
}
There would be a much faster solution by using std::unordered_map and std::unordered_set
Please make sure your code is composed from many small functions. This improves readability, it easier to reason what code does, in such form parts of code can be reused in alternative context.
Here is demo how it can looks like and why it is better to have small functions:
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <string>
#include <unordered_map>
#include <vector>
struct FileData
{
std::filesystem::path path;
int index;
};
bool operator==(const FileData& a, const FileData& b)
{
return a.index == b.index && a.path == b.path;
}
bool operator!=(const FileData& a, const FileData& b)
{
return !(a == b);
}
using WordLocations = std::unordered_map<std::string, std::vector<FileData>>;
template<typename T>
void mergeWordsFrom(WordLocations& loc, const FileData& fileData, T b, T e)
{
for (; b != e; ++b)
{
auto& v = loc[*b];
if (v.empty() || v.back() != fileData)
v.push_back(fileData);
}
}
void mergeWordsFrom(WordLocations& loc, const FileData& fileData, std::istream& in)
{
return mergeWordsFrom(loc, fileData, std::istream_iterator<std::string>{in}, {});
}
void mergeWordsFrom(WordLocations& loc, const FileData& fileData)
{
std::ifstream f{fileData.path};
return mergeWordsFrom(loc, fileData, f);
}
template<typename T>
WordLocations wordLocationsFromFileList(T b, T e)
{
WordLocations loc;
FileData fileData{{}, 0};
for (; b != e; ++b)
{
++fileData.index;
fileData.path = *b;
mergeWordsFrom(loc, fileData);
}
return loc;
}
WordLocations wordLocationsFromFileList(std::istream& in)
{
return wordLocationsFromFileList(std::istream_iterator<std::filesystem::path>{in}, {});
}
WordLocations wordLocationsFromFileList(const std::filesystem::path& p)
{
std::ifstream f{p};
f.exceptions(std::ifstream::badbit);
return wordLocationsFromFileList(f);
}
void printLocations(std::ostream& out, const WordLocations& locations)
{
for (auto& [word, filesData] : locations)
{
out << std::setw(10) << word << ": ";
for (auto& file : filesData)
{
out << std::setw(3) << file.index << ':' << file.path << ", ";
}
out << '\n';
}
}
int main()
{
auto locations = wordLocationsFromFileList("files.txt");
printLocations(std::cout, locations);
}
https://wandbox.org/permlink/nBbqYV986EsqvN3t

How can I use Russian language in std::wstring?

I am making a input stream from file to a vector of wstrings. There might be russian charcters.
If they were there, than after outputting the wstring to the console I am getting the empty line. If the charcters are from English alphabet or from punctuation marks, than every thing is alright. How to fix that?(I use linux)
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
void read_text(std::vector<std::wstring> &words)
{
std::wifstream input("input.txt");
if( !(input.is_open()) )
{
std::cout << "File was not open!" << std::endl;
throw -1;
}
std::wstring input_string;
input >> input_string;
while(input)
{
words.push_back(input_string);
input >> input_string;
}
input.close();
}
int main()
{
setlocale(LC_ALL, "ru_RU.UTF-8");
std::vector<std::wstring> words;
try { read_text(words); }
catch(int i) { return i; }
for (auto i : words)
{
std::wcout << i << std::endl;
}
return 0;
}

How to skip blank spaces when reading in a file c++

Here is the codeshare link of the exact input file: https://codeshare.io/5DBkgY
Ok, as you can see, ​there are 2 blank lines, (or tabs) between 8 and ROD. How would I skip that and continue with the program? I am trying to put each line into 3 vectors (so keys, lamp, and rod into one vector etc). Here is my code (but it does not skip the blank line).:
#include <string>
#include <iostream>
#include <sstream>
#include <vector>
#include <fstream>
using namespace std;
int main() {
ifstream objFile;
string inputName;
string outputName;
string header;
cout << "Enter image file name: ";
cin >> inputName;
objFile.open(inputName);
string name;
vector<string> name2;
string description;
vector<string> description2;
string initialLocation;
vector<string> initialLocation2;
string line;
if(objFile) {
while(!objFile.eof()){
getline(objFile, line);
name = line;
name2.push_back(name);
getline(objFile, line);
description = line;
description2.push_back(description);
getline(objFile, line);
initialLocation = line;
initialLocation2.push_back(initialLocation);
} else {
cout << "not working" << endl;
}
for (std::vector<string>::const_iterator i = name2.begin(); i != name2.end(); ++i)
std::cout << *i << ' ';
for (std::vector<string>::const_iterator i = description2.begin(); i != description2.end(); ++i)
std::cout << *i << ' ';
for (std::vector<string>::const_iterator i = initialLocation2.begin(); i != initialLocation2.end(); ++i)
std::cout << *i << ' ';
#include <cstddef> // std::size_t
#include <cctype> // std::isspace()
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
bool is_empty(std::string const &str)
{
for (auto const &ch : str)
if (!std::isspace(static_cast<char unsigned>(ch)))
return false;
return true;
}
int main()
{
std::cout << "Enter image file name: ";
std::string filename;
std::getline(std::cin, filename); // at least on Windows paths containing whitespace
// are valid.
std::ifstream obj_file{ filename }; // define variables as close to where they're used
// as possible and use the ctors for initialization.
if (!obj_file.is_open()) { // *)
std::cerr << "Couldn't open \"" << filename << "\" for reading :(\n\n";
return EXIT_FAILURE;
}
std::vector<std::string> name;
std::vector<std::string> description;
std::vector<std::string> initial_location;
std::string line;
std::vector<std::string> *destinations[] = { &name, &description, &initial_location };
for (std::size_t i{}; std::getline(obj_file, line); ++i) {
if (is_empty(line)) { // if line only consists of whitespace
--i;
continue; // skip it.
}
destinations[i % std::size(destinations)]->push_back(line);
}
for (auto const &s : name)
std::cout << s << '\n';
for (auto const &s : description)
std::cout << s << '\n';
for (auto const &s : initial_location)
std::cout << s << '\n';
}
... initial_locations look like integers, though.
*) Better early exit if something bad happens. Instead of
if (obj_file) {
// do stuff
}
else {
// exit
}
-->
if(!obj_file)
// exit
// do stuff
makes your code easier to read and takes away one level of indentation for the most parts.

trying to add words from text file to a vector but keep getting thrown 'std::out_of_range'

trying to add words from this text file but keep getting thrown an out of range error. I think the error lies somehwere in the loops but havent been able to figure out why it isnt working. Help would be greatly appreciated
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std;
struct WordCount{
string word;
int count;
};
int main () {
vector<WordCount> eggsHam;
ifstream readFile ("NewTextDocument.txt");
int counter = 0;
int holder;
string lineRead;
WordCount word;
if(readFile.is_open()){
//add all the words into a vector
while (getline(readFile, lineRead)){
holder = counter;
for(int i = 0; i < lineRead.length(); ++i) {
if (lineRead.at(i) != ' ') {
++counter;
}
if (lineRead.at(i) != ' ') {
for (int k = 0; k < (counter - holder); ++k) {
word.word.at(k) = lineRead.at(holder + k);
}
eggsHam.push_back(word);
++counter;
}
}
}
readFile.close();
}
else cout << "Unable to open file";
return 0;
}
Your code is way to complicated. To read all words (=space-seperated thingies) into a std::vector<std::string> simply do:
#include <cstdlib>
#include <vector>
#include <string>
#include <iterator>
#include <fstream>
#include <iostream>
int main()
{
char const *filename = "test.txt";
std::ifstream is{ filename };
if (!is.is_open()) {
std::cerr << "Couldn't open \"" << filename << "\" for reading :(\n\n";
return EXIT_FAILURE;
}
std::vector<std::string> words{ std::istream_iterator<std::string>{ is },
std::istream_iterator<std::string>{} };
for (auto const &w : words)
std::cout << w << '\n';
}

Read and print a csv file with more than 2 column in c++ using multimap

I'm a beginner in c++ and required to write a c++ program to read and print a csv file like this.
DateTime,value1,value2
12/07/16 13:00,3.60,50000
14/07/16 20:00,4.55,3000
May I know how can I proceed with the programming?
I manage to get the date only via a simple multimap code.
I spent some time to make almost (read notice at the end) exact solution for you.
I assume that your program is a console application that receives the original csv-file name as a command line argument.
So see the following code and make required changes if you like:
#include <iostream>
#include <fstream>
#include <sstream>
#include <vector>
#include <map>
#include <string>
std::vector<std::string> getLineFromCSV(std::istream& str, std::map<int, int>& widthMap)
{
std::vector<std::string> result;
std::string line;
std::getline(str, line);
std::stringstream lineStream(line);
std::string cell;
int cellCnt = 0;
while (std::getline(lineStream, cell, ','))
{
result.push_back(cell);
int width = cell.length();
if (width > widthMap[cellCnt])
widthMap[cellCnt] = width;
cellCnt++;
}
return result;
}
int main(int argc, char * argv[])
{
std::vector<std::vector<std::string>> result; // table with data
std::map<int, int> columnWidths; // map to store maximum length (value) of a string in the column (key)
std::ifstream inpfile;
// check file name in the argv[1]
if (argc > 1)
{
inpfile.open(argv[1]);
if (!inpfile.is_open())
{
std::cout << "File " << argv[1] << " cannot be read!" << std::endl;
return 1;
}
}
else
{
std::cout << "Run progran as: " << argv[0] << " input_file.csv" << std::endl;
return 2;
}
// read from file stream line by line
while (inpfile.good())
{
result.push_back(getLineFromCSV(inpfile, columnWidths));
}
// close the file
inpfile.close();
// output the results
std::cout << "Content of the file:" << std::endl;
for (std::vector<std::vector<std::string>>::iterator i = result.begin(); i != result.end(); i++)
{
int rawLen = i->size();
for (int j = 0; j < rawLen; j++)
{
std::cout.width(columnWidths[j]);
std::cout << (*i)[j] << " | ";
}
std::cout << std::endl;
}
return 0;
}
NOTE: Your task is just to replace a vector of vectors (type std::vector<std::vector<std::string>> that are used for result) to a multimap (I hope you understand what should be a key in your solution)
Of course, there are lots of possible solutions for that task (if you open this question and look through the answers you will understand this).
First of all, I propose to consider the following example and to try make your task in the simplest way:
#include <iostream>
#include <sstream>
#include <vector>
#include <string>
using namespace std;
int main()
{
string str = "12/07/16 13:00,3.60,50000";
stringstream ss(str);
vector<string> singleRow;
char ch;
string s = "";
while (ss >> ch)
{
s += ch;
if (ss.peek() == ',' || ss.peek() == EOF )
{
ss.ignore();
singleRow.push_back(s);
s.clear();
}
}
for (vector<string>::iterator i = singleRow.begin(); i != singleRow.end(); i++)
cout << *i << endl;
return 0;
}
I think it can be useful for you.