I would like to store a dictionary in a vector of lists. Each list contains all words that start with the same letter of the alphabet (e.g. ananas, apple).
My problem is that I cannot read any words starting with "z" in my const char* array into the list.
Could someone explain to me why this happens and how to fix it? Is there a way to do this with const char*? Thank you!
#include <iostream>
#include <list>
#include <vector>
#include <iterator>
#include <algorithm>
#include <string>
#include <fstream>
std::pair<bool, std::vector<std::list<std::string>> > loadwithList()
{
const char* prefix = "abcdefghijklmnopqrstuvwxyz";
std::vector<std::list<std::string>> dictionary2;
std::ifstream infile("/Users/User/Desktop/Speller/Dictionaries/large", std::ios::in);
if (infile.is_open())
{
std::list<std::string> data;
std::string line;
while (std::getline(infile, line))
{
if (line.starts_with(*prefix) && *prefix != '\0')
{
data.push_front(line);
}
else
{
dictionary2.push_back(data);
data.clear();
prefix++;
}
}
infile.close();
return std::make_pair(true, dictionary2);
}
std::cout << "Cant find file\n";
return std::make_pair(false, dictionary2);
}
int main()
{
auto [loaded, dictionary2] = loadwithList();
if (!loaded) return 1;
}
The answer has already been given and the problems have been explained.
Basically, you would need a doubly nested loop. The outer loop would read word by word, and the inner loop would check for a match against each of the characters in "prefix". That is a lot of looping...
It is also not efficient. It would be better to use a std::map to store the data in the first place. If you really need a std::vector of std::lists, the data can be moved over afterwards. We take care to store only lowercase alphabetic characters as the keys of the std::map.
For test purposes I loaded a list of words from here. There are roughly 450,000 words in this list.
I used this for my demo program.
Please see one possible solution below:
#include <iostream>
#include <fstream>
#include <map>
#include <list>
#include <vector>
#include <utility>
#include <string>
#include <cctype>
std::pair<bool, std::vector<std::list<std::string>> > loadwithList() {
std::vector<std::list<std::string>> data{};
bool resultOK{};
// Open File and check, if it could be opened
if (std::ifstream ifs{ "r:\\words.txt" }; ifs) {
// Here we will store the dictionary
std::map<char, std::list<std::string>> dictionary{};
// Fill dictionary. Read complete file and sort according to first character
for (std::string line{}; std::getline(ifs, line); )
// Store only words that start with an alpha letter
if (not line.empty() and std::isalpha(static_cast<unsigned char>(line.front())))
// Use lower case start character as map key. Add all words starting with that character
dictionary[static_cast<char>(std::tolower(static_cast<unsigned char>(line.front())))].push_back(line);
// Reserve space for resulting vector
data.reserve(dictionary.size());
// Move result to vector
for (auto& [letter, words] : dictionary)
data.push_back(std::move(words));
// All good
resultOK = true;
}
else
std::cerr << "\n\n*** Error: Could not open source file\n\n";
// And give back the result
return { resultOK , data };
}
int main() {
auto [result, data] = loadwithList();
if ( result)
for (const std::list<std::string>&wordList : data)
std::cout << (char)std::tolower(wordList.front().front()) << " has " << wordList.size() << "\twords\n";
}
You lose the first word of each letter after 'a'. When you reach the first word of the next letter, the if (line.starts_with(*prefix) && *prefix != '\0') fails. Only then do you advance to the next letter, but the loop also moves on to the next word, so the word that triggered the change is never stored.
You lose the whole letter 'z' because the if (line.starts_with(*prefix) && *prefix != '\0') still succeeds for the last line in your file. The while (std::getline(infile, line)) then terminates, and the final dictionary2.push_back(data); is never executed.
Related
I'm trying to read every odd line into pair.first and every even line to pair.second and put it into a vector of pairs. I'd like to have a player and his overall in one pair in a vector.
Here's my code :
while(getline(players, player) && players >> overall)
{
pick_players.push_back(make_pair(player, overall));
}
Unfortunately, it reads only the first two lines, and the rest of the vector output is just zeros. player is a string, overall is an integer, and players is my fstream object.
It's not a good idea to interleave std::getline and operator>> for reading an input stream. Instead, you'd better stick to using std::getline twice here.
Something like this would do:
Loop the input stream until you can't read two lines (use std::getline for this).
Push back each reading into a std::vector<std::pair<std::string, int>>.
If there's a parsing error on the overall line, just continue looping.
#include <charconv> // from_chars
#include <fmt/core.h>
#include <fmt/ranges.h>
#include <sstream> // istringstream
#include <string> // string, getline
#include <system_error> // errc
#include <utility> // pair
#include <vector>
int main() {
std::istringstream iss{
"player1\n"
"1\n"
"player2\n"
"2ddd\n"
"player3\n"
"3 \n"
};
std::vector<std::pair<std::string, int>> players{};
for (std::string player{}, overall_str{};
getline(iss, player) and getline(iss, overall_str);) {
int overall{};
auto [ptr, ec] = std::from_chars(
overall_str.data(), overall_str.data() + overall_str.size(), overall);
if (ec == std::errc{}) {
players.push_back({player, overall});
} else {
fmt::print("Error parsing overall line: {}\n", overall_str);
continue;
}
}
fmt::print("{}\n", players);
}
// Outputs:
//
// [("player1", 1), ("player2", 2), ("player3", 3)]
You can strengthen the parsing of the overall line by:
trimming the input string, and
checking std::from_chars used all of the input string to convert to a number.
#include <boost/algorithm/string.hpp>
...
boost::algorithm::trim(overall_str);
...
if (ec == std::errc{} and ptr == overall_str.data() + overall_str.size()) {
// Outputs:
//
// Error parsing overall line: 2ddd
// [("player1", 1), ("player3", 3)]
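The two strengthening steps (trim, then check that std::from_chars consumed the whole string) can also be written with the standard library only. This is just a sketch under that assumption; parseOverall is a hypothetical helper name that is not part of the answer above.

```cpp
#include <charconv>     // from_chars
#include <optional>
#include <string_view>
#include <system_error> // errc

// Hypothetical helper: returns the number only if the whole trimmed line is
// a valid int; otherwise returns std::nullopt.
std::optional<int> parseOverall(std::string_view text)
{
    // Step 1: trim leading and trailing whitespace.
    constexpr std::string_view whitespace = " \t\r\n";
    const auto first = text.find_first_not_of(whitespace);
    if (first == std::string_view::npos) return std::nullopt; // blank line
    const auto last = text.find_last_not_of(whitespace);
    text = text.substr(first, last - first + 1);

    // Step 2: convert and check that every character was consumed.
    int value{};
    const char* const end = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(text.data(), end, value);
    if (ec != std::errc{} || ptr != end) return std::nullopt;
    return value;
}
```

With this helper, a line such as "3 " would be accepted and "2ddd" rejected, which matches the second output shown above.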
I am trying to erase a string from a text file. To do this, I want to read the file into a vector, search for the position of this string, and then use vector::erase to remove it. After the string has been erased from the vector, I can write the vector into a new file.
So far, I have done all of that except finding the position of the string. I've found all sorts of solutions using <algorithm>'s std::find, but those answers only check whether the string exists, not where it is.
Here is an example of how the text file is set up. Each entry is a string followed by an integer followed by .txt, without spaces. Each entry is on its own line.
file123.txt
Bob56.txt'
Foo8854.txt
In this case, the vector would be "file123.txt", "bob56.txt", "Foo8854.txt".
This is the code I have made already:
std::vector<std::string> FileInVector;
std::string str;
int StringPosition;
std::fstream FileNames;
FileNames.open("FileName Permanent Storage.txt");
while (std::getline(FileNames, str)) {
if(str.size() > 0) {
FileInVector.push_back(str); // Reads file, and this puts values into the vector
}
}
//This is where it would find the position of the string: "bob56.txt" as an example
FileInVector.erase(StringPosition); // Removes the string from the vector
remove("FileName Permanent Storage.txt"); // Deletes old file
std::ofstream outFile("FileName Permanent Storage.txt"); // Creates new file
for (const auto &e : FileInVector) outFile << e << "\n"; // Writes vector without string into the new file
Below is the working example. There is no need to store the string into a vector or search for the position of the string inside the vector because we can directly check if the read line is equal to the string to be searched for, as shown.
main.cpp
#include <iostream>
#include <fstream>
#include <string>
int main()
{
std::string line, stringtobeSearched = "Foo8854.txt";
std::ifstream inFile("input.txt");
std::ofstream outFile("output.txt");
if(inFile)
{
while(getline(inFile, line, '\n'))
{
std::cout<<line<<std::endl;
//if the line read is not same as string searched for then write it into the output.txt file
if(line != stringtobeSearched)
{
outFile << line << "\n";
}
//if the line read is same as string searched for then don't write it into the output.txt file
else
{
std::cout<<"string found "<<std::endl;//just printing it on screen
}
}
}
else
{
std::cout<<"file could not be read"<<std::endl;
}
inFile.close();
outFile.close();
return 0;
}
input.txt
file123.txt
Bob56.txt'
Foo8854.txt
file113.txt
Bob56.txt'
Foo8854.txt
file223.txt
Bob96.txt'
Foo8814.txt
output.txt
file123.txt
Bob56.txt'
file113.txt
Bob56.txt'
file223.txt
Bob96.txt'
Foo8814.txt
std::find returns an iterator to the found element and std::vector::erase accepts an iterator too. std::distance can be used to compute the index if needed.
Small example:
#include <vector>
#include <string>
#include <algorithm>
#include <iostream>
void print(const auto& vec){
for(const auto& e:vec){
std::cout<<e<<' ';
}
std::cout<<'\n';
}
int main(){
std::vector<std::string> vec{"a","b","c","d"};
auto it = std::find(vec.begin(),vec.end(),"c");
if(it!=vec.end())//If found
{
std::cout<<"Index "<<std::distance(vec.begin(),it)<<'\n';
vec.erase(it,it+1);
print(vec);
}
}
Output:
Index 2
a b d
That said, there is a simple solution that needs only O(1) memory (in terms of loaded lines): read the lines one by one and immediately write out only those that do not match the string.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> b{"uyv","uky","u6t"};
std::vector<std::string> cb{"uyv"};
// std::search returns an iterator to the first occurrence of the sequence cb inside b
auto heil = std::search(b.begin(), b.end(), cb.begin(), cb.end());
// Erase only if the sequence was found; erasing end() is undefined behavior
if (heil != b.end())
b.erase(heil);
for (const auto& c : b)
std::cout << c << std::endl;
}
I'm doing research in linguistics and I need some help.
I have a list of names in a text file (names.txt)
and I need to find out how many times all the words that are in this file occur in another text file (data.txt).
So far I found a manual way by writing each word from the names.txt file in a string by hand. Is there a shorter way to solve this?
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
ifstream names("names.txt");
ifstream data("data.txt");
int wcount = 0;
string token;
string word("Jhon"); //here I write names which are supposed to be taken
string word1("James"); //from names.txt automatically
string word2("Rick");
string word3("Morty");
string word4("Alice");
string word5("Tina");
string word6("Timmy");
// ...
while (data >> token) //here I check if those words exist in data.txt
if ((word == token) || (word1== token)|| (word2 == token) || (word3== token)|| (word4 == token) || (word5== token) || (word6==token))
wcount++;
cout << wcount << endl;
return 0;
}
Use a std::vector<std::string> to hold the dictionary and std::find to look up words. Some people might argue that the find algorithm of std::set is faster than that of std::vector but you need a really huge number of elements before this algorithm outperforms the gain you get from std::vector being in consecutive memory.
#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
int main()
{
std::ifstream names("names.txt");
std::ifstream data("data.txt");
std::vector<std::string> words = { "Jhon", "James", "Rick", "Morty", "Alice", "Tina", "Timmy" };
int wcount = 0;
std::string token;
while (data >> token) //here I check if those words exist in data.txt
if (std::find(std::begin(words), std::end(words), token) != std::end(words))
++wcount;
std::cout << wcount << '\n';
}
I am working on a project for school and I am stuck on what I believe is just a small part, but I can't figure it out.
Here is what I have so far:
#include <iostream>
#include <fstream>
#include <string>
#include <locale>
#include <vector>
#include <algorithm>
#include <set>
using namespace std;
int main(int argc, char* argv[])
{
set<string> setwords;
ifstream infile;
infile.open("words.txt"); //reads file "words.txt"
string word = argv[1]; // input from command line
transform(word.begin(), word.end(), word.begin(), tolower); // transforms word to lower case.
sort(word.begin(), word.end()); // sorts the word
vector<string> str; // vector to hold all variations of the word
do {
str.push_back(word);
}
while (next_permutation(word.begin(), word.end())); // pushes all permutations of "word" to vector str
if (!infile.eof())
{
string items;
infile >> items;
setwords.insert(items); //stores set of words from file
}
system("PAUSE");
return 0;
}
Now I need to compare the words from the file and the permutations stored in vector str
and print out the ones that are real words.
I know I need to use the find method of the set class. I am just not sure how to go about that. I was trying something like this with no luck, but my thought process is probably wrong.
for (unsigned int i = 0; i < str.size(); i++)
if (setwords.find(word) == str[i])
cout << str[i] << endl;
If you guys could help or point me in the right direction I would greatly appreciate it.
First, I'd like to say that this is a well-asked question. I appreciate new users that take the time to articulate their problem in detail.
The problem is that the find() method of a std::set<> returns an iterator object pointing to the value that it finds, or the end() of the container if it can't. When you compare it with str[i] (a string) it can't find a suitable overload of operator==() that takes both the iterator and a string.
Instead of making a full-on comparison with the string, you can instead compare the return value with end() to determine if it found the string:
if (setwords.find(str[i]) != setwords.end())
// ^^^^^^ ^^^^^^^^^^^^^^
If the expression evaluates to true, then the string was successfully found inside the set.
There's also another potential problem in your code that I'd like to address. Using if (!infile.eof()) is the wrong way to condition your input. You should instead make the extraction part of the condition, like this:
for (std::string item; infile >> item; )
{
setwords.insert(item);
}
Here's another way, using std::istream_iterator<>:
setwords.insert(std::istream_iterator<std::string>(infile),
std::istream_iterator<std::string>());
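Putting the two points together, the lookup of every permutation in the set might be sketched as follows. findAnagrams is a hypothetical helper name, and the sketch assumes the dictionary words are stored in lowercase.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: returns every permutation of `word` that is present
// in `dictionary`, in lexicographic order.
std::vector<std::string> findAnagrams(std::string word, const std::set<std::string>& dictionary)
{
    // Lowercase the input; the cast to unsigned char avoids undefined
    // behavior for negative char values.
    std::transform(word.begin(), word.end(), word.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    // next_permutation only visits all permutations when it starts sorted.
    std::sort(word.begin(), word.end());

    std::vector<std::string> matches;
    do {
        // find() returns end() when the permutation is not a known word.
        if (dictionary.find(word) != dictionary.end())
            matches.push_back(word);
    } while (std::next_permutation(word.begin(), word.end()));
    return matches;
}
```

Checking each permutation as it is generated also avoids storing all of them in a vector first.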
You actually are really close to having it right.
The set::find method doesn't return the value if it is found in the set, but rather an iterator object that points to the value. So your if statement is comparing the current string to the returned iterator object instead of the value that the iterator points to.
To get the value that an iterator points to, you just have to dereference it like you would a pointer, by prefixing it with an asterisk. This means that you probably intended your if statement to look like this:
if (*(setwords.find(word)) == str[i])
This would work for cases where the value was found in the set, but would be problematic for cases where the value was not found. If the value is not found, an iterator that points to the position after the last item in the set is returned - and you shouldn't try to dereference such an iterator (because it doesn't point to a valid object).
The way these checks are usually conducted is by comparing the returned iterator with the iterator that points to the end of the set (e.g., set::end, in this case). If the iterators do not match, that means the item was found.
if (setwords.find(word) != setwords.end())
cout << word << endl;
I think you need to write something like this:
for (unsigned int i = 0; i < str.size(); i++)
if (setwords.find(str[i]) != setwords.end())
cout << str[i] << endl;
But I think you don't need to store all permutations. You can store each word from the file together with its letters sorted, and compare the sorted form with the sorted input word.
Here is a simpler solution:
#include <iostream>
#include <fstream>
#include <string>
#include <cctype>
#include <cstdlib>
#include <algorithm>
#include <map>
using namespace std;
int main(int argc, char* argv[])
{
if (argc < 2) return 1; // a word must be given on the command line
map<string, string> mapwords;
ifstream infile;
infile.open("words.txt"); //reads file "words.txt"
string word = argv[1]; // input from command line
transform(word.begin(), word.end(), word.begin(),
[](unsigned char c) { return static_cast<char>(tolower(c)); }); // transforms word to lower case.
sort(word.begin(), word.end()); // sorts the letters of the word
string item;
while (infile >> item) // reads every word in the file
{
string sorted_item = item;
sort(sorted_item.begin(), sorted_item.end()); // sorts the letters of the word
mapwords.insert(make_pair(sorted_item, item)); // maps the sorted letters to the first word found with them
}
map<string, string>::iterator i = mapwords.find(word);
if(i != mapwords.end())
cout << i->second << endl;
system("PAUSE");
return 0;
}
As per request of the fantastic fellas over at the C++ chat lounge, what is a good way to break down a file (which in my case contains a string with roughly 100 lines, and about 10 words in each line) and insert all these words into a std::set?
The easiest way to construct any container from a source that holds a series of that element, is to use the constructor that takes a pair of iterators. Use istream_iterator to iterate over a stream.
#include <set>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
using namespace std;
int main()
{
//I create an iterator that retrieves `string` objects from `cin`
auto begin = istream_iterator<string>(cin);
//I create an iterator that represents the end of a stream
auto end = istream_iterator<string>();
//and iterate over the file, and copy those elements into my `set`
set<string> myset(begin, end);
//this line copies the elements in the set to `cout`
//I have this to verify that I did it all right
copy(myset.begin(), myset.end(), ostream_iterator<string>(cout, "\n"));
return 0;
}
http://ideone.com/iz1q0
Assuming you've read your file into a string, boost::split will do the trick:
#include <set>
#include <boost/foreach.hpp>
#include <boost/algorithm/string.hpp>
std::string astring = "abc 123 abc 123\ndef 456 def 456"; // your string
std::set<std::string> tokens; // this will receive the words
boost::split(tokens, astring, boost::is_any_of("\n ")); // split on space & newline
// Print the individual words
BOOST_FOREACH(std::string token, tokens){
std::cout << "\n" << token << std::endl;
}
Lists or Vectors can be used instead of a Set if necessary.
Also note this is almost a dupe of:
Split a string in C++?
#include <set>
#include <iostream>
#include <string>
int main()
{
std::string temp, mystring;
std::set<std::string> myset;
while(std::getline(std::cin, temp))
mystring += temp + ' ';
temp = "";
for (size_t i = 0; i < mystring.length(); i++)
{
if (mystring.at(i) == ' ' || mystring.at(i) == '\n' || mystring.at(i) == '\t')
{
if (!temp.empty()) // skip empty tokens produced by consecutive whitespace
myset.insert(temp);
temp = "";
}
else
{
temp.push_back(mystring.at(i));
}
}
if (!temp.empty()) // add the last word, if there is one
myset.insert(temp);
for (std::set<std::string>::iterator i = myset.begin(); i != myset.end(); i++)
{
std::cout << *i << std::endl;
}
return 0;
}
Let's start at the top. First off, you need a few variables to work with. temp is just a placeholder for the string while you build it from each character in the string you want to parse. mystring is the string you are looking to split up and myset is where you will be sticking the split strings.
So then we read the file (input through < piping) and insert the contents into mystring.
Now we want to iterate down the length of the string, searching for spaces, newlines, or tabs to split the string up with. If we find one of those characters, then we need to insert the string into the set, and empty our placeholder string, otherwise, we add the character to the placeholder, which will build up the string. Once we finish, we need to add the last string to the set.
Finally, we iterate down the set, and print each string, which is simply for verification, but could be useful otherwise.
Edit: A significant improvement on my code provided by Loki Astari in a comment which I thought should be integrated into the answer:
#include <set>
#include <iostream>
#include <string>
int main()
{
std::set<std::string> myset;
std::string word;
while(std::cin >> word)
{
myset.insert(std::move(word));
}
for(std::set<std::string>::const_iterator it=myset.begin(); it!=myset.end(); ++it)
std::cout << *it << '\n';
}