Program stops parsing when the last character of the string is punctuation - C++

So I'm trying to read all words from a file, and get rid of the punctuation as I do that. Here is the logic that is stripping the punctuation:
Edit: The program actually stops running altogether; I just want to make that clear.
ifstream file("text.txt");
string str;
string::iterator cur;
for (file >> str; !file.eof(); file >> str) {
    for (cur = str.begin(); cur != str.end(); cur++) {
        if (!(isalnum(*cur))) {
            cur = str.erase(cur);
        }
    }
    cout << str << endl;
    ...
}
Say I have a text file that reads:
This is a program. It has trouble with (non alphanumeric chars)
But it's my own and I love it...
When I cout my string (with endl) right after this bit of logic, I'll get
This
is
a
program
It
has
trouble
with
non
alphanumeric
and that's all folks.
Is there something wrong with my iterator logic?
How could I fix this?
Thank you.

The main logical problem I see with the iterators is that for non-alphanumeric characters the iterator gets advanced twice: erase already moves it to the next character, and then the cur++ from the for loop advances it again, so the character after every non-alphanumeric one is skipped. Worse, when the non-alphanumeric character is the last one in the string, erase returns end(), and the subsequent cur++ moves the iterator past end(), which is undefined behavior. That is why the program dies outright when a word ends in punctuation.
So probably something along the lines of:
string next;
string::iterator cur;
cur = next.begin();
while (cur != next.end()) {
    if (!(isalnum(*cur))) {
        cur = next.erase(cur);
    } else {
        cur++;
    }
}
This just removes the non-alphanumeric characters. If you need to tokenize your input, you will have to implement a bit more, i.e. remember whether you're inside a word (have read at least one alphanumeric character) or not, and act accordingly.
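Packaged as a self-contained helper, the corrected loop looks like this (a sketch; the function name is mine, and the unsigned char cast avoids undefined behavior when isalnum is handed a negative char):

```cpp
#include <cctype>
#include <string>

// Erase non-alphanumeric characters with the corrected iterator loop:
// advance only when nothing was erased, because erase() already moves
// the iterator to the next character.
std::string stripPunct(std::string word) {
    auto cur = word.begin();
    while (cur != word.end()) {
        if (!std::isalnum(static_cast<unsigned char>(*cur)))
            cur = word.erase(cur);
        else
            ++cur;
    }
    return word;
}
```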

How about just not copying the punctuation when building the transformed list? OK, probably overkill.
#include <iostream>
#include <fstream>
#include <iterator>
#include <vector>
#include <algorithm>
#include <cctype>
#include <cstdlib>
using namespace std;
// takes the file being processed as only command line param
int main(int argc, char *argv[])
{
    if (argc != 2)
        return EXIT_FAILURE;
    ifstream inf(argv[1]);
    vector<string> res;
    std::transform(istream_iterator<string>(inf),
                   istream_iterator<string>(),
                   back_inserter(res),
                   [](const string& s) {
                       string tmp;
                       copy_if(s.begin(), s.end(), back_inserter(tmp),
                               // cast to unsigned char: passing a negative
                               // char to isalnum is undefined behavior
                               [](char c) { return std::isalnum(static_cast<unsigned char>(c)); });
                       return tmp;
                   });
    // optional dump to output
    copy(res.begin(), res.end(), ostream_iterator<string>(cout, "\n"));
    return EXIT_SUCCESS;
}
Input
All the world's a stage,
And all the men and women merely players:
They have their exits and their entrances;
And one man in his time plays many parts,
His acts being seven ages. At first, the infant,
Mewling and puking in the nurse's arms.
Output
All
the
worlds
a
stage
And
all
the
men
and
women
merely
players
They
have
their
exits
and
their
entrances
And
one
man
in
his
time
plays
many
parts
His
acts
being
seven
ages
At
first
the
infant
Mewling
and
puking
in
the
nurses
arms

You should be using ispunct to test for a punctuation character. If you also want to filter out control characters, use iscntrl.
Once you've filtered out the punctuation you can split on spaces and newlines to get the words.
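A keep/drop predicate combining those two classifiers might look like this (a sketch; the function name is an assumption):

```cpp
#include <cctype>

// True if the character should be dropped: punctuation or a control
// character. Cast to unsigned char first; passing a negative char to
// ispunct/iscntrl is undefined behavior.
bool dropChar(char c) {
    unsigned char uc = static_cast<unsigned char>(c);
    return std::ispunct(uc) || std::iscntrl(uc);
}
```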

Related

My program returning the set_intersection value of two text files containing 479k words each is really slow. Is it my code?

I wrote a program to compare two text files containing all of the words in the dictionary (one forwards and one backwards). The idea is that when the text file containing all of the backwards words is compared with the forwards words, any matches will indicate that those words can be spelled both forwards and backwards and will return all palindromes as well as any words that spell both a word backwards and forwards.
The program works and I've tested it on three different file sizes. The first set contains only two words, just for testing purposes. The second contains 10,000 English words (in each text file), and the third contains all English words (~479k words). When I run the program calling on the first set of text files, the result is almost instantaneous. When I run the program calling on the set of text files containing 10k words, it takes a few hours. However, when I run the program containing the largest files (479k words), it ran for a day and returned only about 30 words, when it should have returned thousands. It didn't even finish and was nowhere near finishing (and this was on a fairly decent gaming PC).
I have a feeling it has to do with my code. It must be inefficient.
There are two things that I've noticed:
When I run: cout << "token: " << *it << std::endl; it runs endlessly on a loop forever and never stops. Could this be eating up processing power?
I commented out sorting because all my data is already sorted. I noticed that the second I did this, the program running 10,000 word text files sped up.
However, even after doing these things there seemed to be no real change in speed in the program calling on the largest text files. Any advice? I'm kinda new at this. Thanks~
*Please let me know if you'd like a copy of the text files and I'd happily upload them. Thanks
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <iterator>
#include <algorithm>
#include <boost/tokenizer.hpp>
typedef boost::char_separator<char> separator_type;
using namespace std;
using namespace boost;
int main()
{
    fstream file1; //fstream variable files
    fstream file2; //fstream variable files
    string dictionary1;
    string dictionary2;
    string words1;
    string words2;
    dictionary1 = "Dictionary.txt";
    // dictionary1 = "Dictionarytenthousand.txt";
    // dictionary1 = "Twoworddictionary.txt"; //this dictionary contains only two words separated by a comma as a test
    dictionary2 = "Backwardsdictionary.txt";
    // dictionary2 = "Backwardsdictionarytenthousand.txt";
    // dictionary2 = "Backwardstwoworddictionary.txt"; //this dictionary contains only two words separated by a comma as a test
    file1.open(dictionary1.c_str()); //opening Dictionary.txt
    file2.open(dictionary2.c_str()); //opening Backwardsdictionary.txt
    if (!file1)
    {
        cout << "Unable to open file1"; //terminate with error
        exit(1);
    }
    if (!file2)
    {
        cout << "Unable to open file2"; //terminate with error
        exit(1);
    }
    while (getline(file1, words1))
    {
        while (getline(file2, words2))
        {
            boost::tokenizer<separator_type> tokenizer1(words1, separator_type(",")); //separates string in Twoworddictionary.txt into individual words (comma as delimiter)
            auto it = tokenizer1.begin();
            while (it != tokenizer1.end())
            {
                std::cout << "token: " << *it << std::endl; //test to see if tokenizer works before program continues
                vector<string> words1Vec; // vector to store Twoworddictionary.txt strings in
                words1Vec.push_back(*it++); // adds elements dynamically onto the end of the vector
                boost::tokenizer<separator_type> tokenizer2(words2, separator_type(",")); //separates string in Backwardstwoworddictionary.txt into individual words (comma as delimiter)
                auto it2 = tokenizer2.begin();
                while (it2 != tokenizer2.end())
                {
                    std::cout << "token: " << *it2 << std::endl; //test to see if tokenizer works before program continues
                    vector<string> words2Vec; //vector to store Backwardstwoworddictionary.txt strings in
                    words2Vec.push_back(*it2++); //adds elements dynamically onto the end of the vector
                    vector<string> matchingwords(words1Vec.size() + words2Vec.size()); //vector to store elements from both dictionary text files (and ultimately to store the intersection of both, i.e. the matching words)
                    //sort(words1Vec.begin(), words1Vec.end()); //set intersection requires its inputs to be sorted
                    //sort(words2Vec.begin(), words2Vec.end()); //set intersection requires its inputs to be sorted
                    vector<string>::iterator it3 = set_intersection(words1Vec.begin(), words1Vec.end(), words2Vec.begin(), words2Vec.end(), matchingwords.begin()); //finds the matching words from both dictionaries
                    matchingwords.erase(it3, matchingwords.end());
                    for (vector<string>::iterator it4 = matchingwords.begin(); it4 < matchingwords.end(); ++it4) cout << *it4 << endl; // returns matching words
                }
            }
        }
    }
    file1.close();
    file2.close();
    return 0;
}
Stop using namespace. Type the extra stuff.
Have code do one thing. Your code isn't doing what you claim it does, probably because you are doing 4 things at once and getting confused.
Then glue the code together.
Getline supports arbitrary delimiters. Use it with ','.
Write code that converts a file into a vector of strings.
std::vector<std::string> getWords(std::string filename);
then test it works. You are doing this wrong in your code posted above, in that you are making length 1 vectors and tossing them.
That will remove about half of your code.
Next, for set_intersection, use std::back_inserter and an empty vector as your output. Like (blah begin, blah end, foo begin, foo end, std::back_inserter(vec3)). It will call push_back with each result.
In pseudo code:
std::vec<std::string> loadWords(std::string filename)
    auto file = open(filename)
    std::vec<std::string> retval
    while (std::readline(file, str, ','))
        retval.push_back(str)
    return retval

std::vec<string> intersect(std::string file1, std::string file2)
    auto v1 = loadWords(file1)
    auto v2 = loadWords(file2)
    std::vec<string> v3;
    std::set_intersect(begin(v1), end(v1), begin(v2), end(v2), std::back_inserter(v3))
    return v3
and done.
Also stop it with the C++03 loops.
for (auto& elem : vec)
    std::cout << elem << '\n';
is far clearer and less error prone than manually futzing with iterators.
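Filled in as real C++, the pseudocode above might look like this sketch (names follow the pseudocode; the comma-separated format and pre-sorted input are taken from the question):

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read comma-separated words from a file into a vector.
std::vector<std::string> loadWords(const std::string& filename) {
    std::ifstream file(filename);
    std::vector<std::string> words;
    std::string word;
    while (std::getline(file, word, ','))
        words.push_back(word);
    return words;
}

// Intersect two word lists; both files are assumed to be already sorted.
std::vector<std::string> intersect(const std::string& file1, const std::string& file2) {
    auto v1 = loadWords(file1);
    auto v2 = loadWords(file2);
    std::vector<std::string> v3;
    std::set_intersection(v1.begin(), v1.end(), v2.begin(), v2.end(),
                          std::back_inserter(v3));
    return v3;
}
```

Each file is read exactly once, so the whole job is two linear reads plus one linear merge, instead of re-tokenizing file2 for every line of file1.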

I have made a program in C++ to separate words from a line by spacebar and display those words as an array. What's wrong in my code?

Please help me find the bug in this program. It separates a line into words at spaces and displays them as a list.
If the first char of a word is in lower case, it is converted to uppercase.
#include <iostream>
#include <string>
using namespace std;
int main()
{
    char line[30] = "Hi there buddy", List[10][20];
    unsigned int i = 0, List_pos = 0, no;
    int first = 0, last;
    while (i != sizeof(line) + 1)
    {
        if (line[i] == ' ' or i == sizeof(line))
        {
            last = i;
            no = 0;
            for (int j = first; j < last; ++j)
            {
                if (no == 0)
                    List[List_pos][no] = toupper(line[j]);
                else
                    List[List_pos][no] = line[j];
                ++no;
            }
            ++List_pos;
            first = last + 1;
        }
        ++i;
    }
    for (unsigned int a = 0; a < List_pos; ++a)
        cout << "\nList[" << a + 1 << "]=" << List[a];
    return 0;
}
Expected Output:
List[1]=Hi
List[2]=There
List[3]=Buddy
Actual Output:
List[1]=Hi
List[2]=ThereiXŚm
List[3]=Buddy
I suggest you use a string, as you already included <string>, and the List array is not really necessary in this situation. Try making a single for loop where you separate your line into words; in my opinion, when you work with arrays you should use for loops. In your for loop, as you go through the line, you can add an if statement which determines whether you're at the end of a word or not. I think the problem in your code is the interaction of the multiple loops, but I am not sure of it.
I provide you code which works. Just adapt it to your display requirements and you will be fine:
#include <iostream>
#include <string>
using namespace std;
int main()
{
    string line = "Hi there buddy";
    for (string::size_type i = 0; i < line.size(); i++) {
        if (line[i] == ' ') {
            if (i + 1 < line.size())
                line[i + 1] = toupper(line[i + 1]);
            cout << '\n';
        } else {
            cout << line[i];
        }
    }
    return 0;
}
Challenged by the comment from PaulMcKenzie, I implemented a C++ solution with 3 statements:
Define a std::string, with the words to work on
Define a std::regex that finds words only. Whitespaces and other delimiters are ignored
Use the std::transform to transform the input string into output lines
std::transform has 4 parameters.
With what the transformation should begin. In this case, we use the std::sregex_token_iterator. This will look for the regex (so, for the word) and return the first word. That's the begin.
With what the transformation should end. We use the empty std::sregex_token_iterator. That means: Do until all matches (all words) have been read.
The destination. For this we will use the std::ostream_iterator. This will send all transformed results (what the lambda returns) to the given output stream (in our case std::cout). And it will add a delimiter, here a newline ("\n").
The transformation function. Implemented as a lambda. Here we get the word from the std::sregex_token_iterator and transform it into a new word according to what we want: a word with a capitalized first letter. We add a little bit of text to the output line, as wished by the OP.
Please check:
#include <string>
#include <iostream>
#include <regex>
#include <iterator>
#include <algorithm>
int main()
{
    // 1. This is the string to convert
    std::string line("Hi there buddy");
    // 2. We want to search for complete words
    std::regex word("(\\w+)");
    // 3. Transform the input string to output lines
    std::transform(
        std::sregex_token_iterator(line.begin(), line.end(), word, 1),
        std::sregex_token_iterator(),
        std::ostream_iterator<std::string>(std::cout, "\n"),
        [i = 1](std::string w) mutable {
            return std::string("List[") + std::to_string(i++) + "]=" + static_cast<char>(::toupper(w[0])) + &w[1];
        }
    );
    return 0;
}
This will give us the following output:
List[1]=Hi
List[2]=There
List[3]=Buddy
Please get a feeling for the capabilities of C++
Found a solution for your next problem (when the user inputs a sentence, only the first word is displayed). When you input a space, cin just thinks you are done. You need to use getline() to get the whole sentence:
getline(cin, line);
Instead of
cin >> line;

How do I make an alphabetized list of all distinct words in a file with the number of times each word was used?

I am writing a program using Microsoft Visual C++. In the program I must read in a text file and print out an alphabetized list of all distinct words in that file with the number of times each word was used.
I have looked up different ways to alphabetize a string but they do not work with the way I have my string initialized.
// What is inside my text file
Any experienced programmer engaged in writing programs for use by others knows
that, once his program is working correctly, good output is a must. Few people
really care how much time and trouble a programmer has spent in designing and
debugging a program. Most people see only the results. Often, by the time a
programmer has finished tackling a difficult problem, any output may look
great. The programmer knows what it means and how to interpret it. However,
the same cannot be said for others, or even for the programmer himself six
months hence.
string lines;
getline(input, lines); // Stores what is in file into the string
I expect an alphabetized list of words with the number of times each word was used. So far, I do not know how to begin this process.
It's rather simple: std::map automatically sorts based on the key in the key/value pair. Here the key/value pair represents word/count, which is what you need. You just need to do some filtering for special characters and such.
EDIT: std::stringstream is a nice way of splitting a std::string on whitespace, as that is the default delimiter. Therefore, using stream >> word you will get whitespace-separated words. However, this might not be enough due to punctuation. For example: Often, has a comma which we need to filter out. Therefore, I used std::replace_if, which replaces puncts and digits with whitespaces.
Now a new problem arises. In your example, you have: "must.Few" which will be returned as one word. After replacing . with a space we have "must Few". So I'm using another stringstream on the filtered "word" to make sure I have only words in the final result.
In the second loop you will notice if(word == "") continue;, this can happen if the string is not trimmed. If you look at the code you will find out that we aren't trimming after replacing puncts and digits. That is, "Often," will be "Often " with trailing whitespace. The trailing whitespace causes the second loop to extract an empty word. This is why I added the condition to ignore it. You can trim the filtered result and then you wouldn't need this check.
Finally, I have added ignorecase boolean to check if you wish to ignore the case of the word or not. If you wish to do so, the program will simply convert the word to lowercase and then add it to the map. Otherwise, it will add the word the same way it found it. By default, ignorecase = true, if you wish to consider case, just call the function differently: count_words(input, false);.
Edit 2: In case you're wondering, the statement counts[word] will automatically create key/value pair in the std::map IF there isn't any key matching word. So when we call ++: if the word isn't in the map, it will create the pair, and increment value by 1 so you will have newly added word. If it exists already in the map, this will increment the existing value by 1 and hence it acts as a counter.
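The auto-insertion behavior described in Edit 2 is easy to see in isolation (a minimal sketch; the tally helper is mine, not part of the program below):

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// counts[w] creates the entry (value-initialized to 0) if the key is
// absent, so ++counts[w] works as a counter from the first occurrence.
std::map<std::string, std::size_t> tally(const std::vector<std::string>& words) {
    std::map<std::string, std::size_t> counts;
    for (const auto& w : words)
        ++counts[w];
    return counts;
}
```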
The program:
#include <iostream>
#include <map>
#include <sstream>
#include <cctype>
#include <string>
#include <iomanip>
#include <algorithm>
std::string to_lower(const std::string& str) {
    std::string ret;
    for (char c : str)
        ret.push_back(tolower(static_cast<unsigned char>(c)));
    return ret;
}
std::map<std::string, size_t> count_words(const std::string& str, bool ignorecase = true) {
    std::map<std::string, size_t> counts;
    std::stringstream stream(str);
    while (stream.good()) {
        // wordW may have multiple words connected by special chars/digits
        std::string wordW;
        stream >> wordW;
        // filter special chars and digits (cast to unsigned char: passing a
        // negative char to ispunct/isdigit is undefined behavior)
        std::replace_if(wordW.begin(), wordW.end(),
                        [](const char& c) {
                            unsigned char uc = static_cast<unsigned char>(c);
                            return std::ispunct(uc) || std::isdigit(uc);
                        }, ' ');
        // now wordW may have multiple words separated by whitespaces, extract them
        std::stringstream word_stream(wordW);
        while (word_stream.good()) {
            std::string word;
            word_stream >> word;
            // ignore empty words
            if (word == "") continue;
            // add to count.
            ignorecase ? counts[to_lower(word)]++ : counts[word]++;
        }
    }
    return counts;
}
void print_counts(const std::map<std::string, size_t>& counts) {
    for (auto pair : counts)
        std::cout << std::setw(15) << pair.first << " : " << pair.second << std::endl;
}
int main() {
    std::string input = "Any experienced programmer engaged in writing programs for use by others knows \
that, once his program is working correctly, good output is a must.Few people \
really care how much time and trouble a programmer has spent in designing and \
debugging a program.Most people see only the results.Often, by the time a \
programmer has finished tackling a difficult problem, any output may look \
great.The programmer knows what it means and how to interpret it.However, \
the same cannot be said for others, or even for the programmer himself six \
months hence.";
    auto counts = count_words(input);
    print_counts(counts);
    return 0;
}
I have tested this with Visual Studio 2017 and here is part of the output:
a : 5
and : 3
any : 2
be : 1
by : 2
cannot : 1
care : 1
correctly : 1
debugging : 1
designing : 1
As others have already noted, a std::map handles the counting you care about quite easily.
Iostreams already tokenize an input stream into words. In this case, though, we want the stream to "think" of only letters as characters that can make up words. A stream uses a locale to make that sort of decision, so to change how it's done, we need to define a locale whose ctype facet classifies characters as we see fit.
struct alpha_only : std::ctype<char> {
    alpha_only() : std::ctype<char>(get_table()) {}
    static std::ctype_base::mask const* get_table() {
        // everything is white space
        static std::vector<std::ctype_base::mask>
            rc(std::ctype<char>::table_size, std::ctype_base::space);
        // except lower- and upper-case letters, which are classified accordingly
        // (note: std::fill's range is half-open, so go one past 'z' and 'Z'):
        std::fill(&rc['a'], &rc['z'] + 1, std::ctype_base::lower);
        std::fill(&rc['A'], &rc['Z'] + 1, std::ctype_base::upper);
        return &rc[0];
    }
};
With that in place, we tell the stream to use our ctype facet, then simply read words from the file and count them in the map:
std::cin.imbue(std::locale(std::locale(), new alpha_only));
std::map<std::string, std::size_t> counts;
std::string word;
while (std::cin >> word)
    ++counts[to_lower(word)];
...and when we're done with that, we can print out the results:
for (auto w : counts)
    std::cout << w.first << ": " << w.second << "\n";
I'd probably start by inserting all of those words into an array of strings. Then take the first index of the array and compare it with all of the other indexes; each time you find a match, add 1 to a counter. After you've gone through the array, display the word you were searching for and how many matches there were, then move on to the next element and repeat. Alternatively, you could keep a parallel array of integers that holds the number of matches, do all the comparisons in one pass, and do all the displays at the end.
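The scan-and-count step described above might be sketched like this (quadratic overall if done for every element, but straightforward; the function name is mine):

```cpp
#include <string>
#include <vector>

// Count how many entries of the array match the given word by scanning
// every element, as described above.
int countMatches(const std::vector<std::string>& words, const std::string& target) {
    int count = 0;
    for (const auto& w : words)
        if (w == target)
            ++count;
    return count;
}
```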
EDIT:
Everyone's answer seems more elegant because of the map's inherent sorting. My answer functions more as a parser, that later sorts the tokens. Therefore my answer is only useful to the extent of a tokenizer or lexer, whereas Everyone's answer is only good for sorted data.
You first probably want to read in the text file. You will want to use a streambuf iterator to read in the file (found here).
You will now have a string called content, which is the content of your file. Next you will want to iterate, or loop, over the contents of this string. To do that you'll want to use an iterator. There should be a string outside of the loop that stores the current word. You will iterate over the content string, and each time you hit a letter character, you will add that character to your current word string. Then, once you hit a non-letter character (a space, newline, or punctuation mark), you will take the current word string, if it is non-empty, and push it back into the wordString vector. (Note: this means that non-letter characters are dropped and any of them ends a word.)
Now that we have a vector of all of our words in strings, we can use std::sort, to sort the vector in alphabetical order.(Note: capitalized words take precedence over lowercase words, and therefore will be sorted first.) Then we will iterate over our vector of stringWords and convert them into Word objects (this is a little heavy-weight), that will store their appearances and the word string. We will push these Word objects into a Word vector, but if we discover a repeat word string, instead of adding it into the Word vector, we'll grab the previous entry and increment its appearance count.
Finally, once this is all done, we can iterate over our Word object vector and output the word followed by its appearances.
Full Code:
#include <vector>
#include <fstream>
#include <iostream>
#include <streambuf>
#include <algorithm>
#include <string>
class Word //define word object
{
public:
    Word() { appearances = 1; }
    ~Word() {}
    int appearances;
    std::string mWord;
};
bool isLetter(const char x)
{
    return ((x >= 'a' && x <= 'z') || (x >= 'A' && x <= 'Z'));
}
int main()
{
    std::string srcFile = "myTextFile.txt"; //what file are we reading
    std::ifstream ifs(srcFile);
    std::string content((std::istreambuf_iterator<char>(ifs)),
                        (std::istreambuf_iterator<char>())); //read in the file
    std::vector<std::string> wordStringV; //create a vector of word strings
    std::string current = ""; //define our current word
    for (auto it = content.begin(); it != content.end(); ++it) //iterate over our input
    {
        const char currentChar = *it; //make life easier
        if (isLetter(currentChar))
        {
            current += currentChar;
        }
        else if (!current.empty()) //any non-letter ends the current word
        {
            wordStringV.push_back(current);
            current = "";
        }
    }
    if (!current.empty()) //don't lose a trailing word at end of input
        wordStringV.push_back(current);
    std::sort(wordStringV.begin(), wordStringV.end(), std::less<std::string>());
    std::vector<Word> wordVector;
    for (auto it = wordStringV.begin(); it != wordStringV.end(); ++it) //iterate over wordString vector
    {
        std::vector<Word>::iterator wordIt;
        //see if the current word string has appeared before...
        for (wordIt = wordVector.begin(); wordIt != wordVector.end(); ++wordIt)
        {
            if ((*wordIt).mWord == *it)
                break;
        }
        if (wordIt == wordVector.end()) //...if not create a new Word obj
        {
            Word theWord;
            theWord.mWord = *it;
            wordVector.push_back(theWord);
        }
        else //...otherwise increment the appearances.
        {
            ++((*wordIt).appearances);
        }
    }
    //print the words out
    for (auto it = wordVector.begin(); it != wordVector.end(); ++it)
    {
        Word theWord = *it;
        std::cout << theWord.mWord << " " << theWord.appearances << "\n";
    }
    return 0;
}
Side Notes
Compiled with g++ version 4.2.1 with target x86_64-apple-darwin, using the compiler flag -std=c++11.
If you don't like iterators you can instead do
for (std::size_t i = 0; i < content.size(); ++i)
{
    const char currentChar = content[i];
}
It's important to note that if you are capitalization agnostic simply use std::tolower on the current += *it; statement (ie: current += std::tolower(*it);).
Also, you seem like a beginner and this answer might have been too heavyweight, but you're asking for a basic parser and that is no easy task. I recommend starting by parsing simpler strings like math equations. Maybe make a calculator app.

How to make sure the words being read in from the file are how I want them to be C++

If I had to read in a word from a document (one word at a time), and then pass that word into a function until I reach the end of the file, how would I do this?
What also must be kept in mind is that a word is any consecutive string of letters and the apostrophe ( so can't or rojas' is one word). Something like bad-day should be two separate words, and something like to-be-husband should be 3 separate words. I also need to ignore periods ., semi-colons ;, and pretty much anything that isn't part of a word. I have been reading it in using file >> s; and then removing stuff from the string but it has gotten very complicated. Is there a way to store into s only alphabet characters+apostrophes and stop at the end of a word (when a space occurs)?
while (!file.eof()) {
    string s;
    file >> s; //this is how I am currently reading it in
    passToFunction(s);
}
Yes, there is a way: simply write the code to do it. Read one character at a time, and collect the characters in the string, until you get a non-alphabetic, non-apostrophe character. You've now read one word. Wait until you read the next character that's a letter or an apostrophe, and then take it from the top.
One other thing:
while (!file.eof())
This is always a bug, and a wrong thing to do. Just thought I'd mention this. I suppose that fixing this is going to be your first order of business, before writing the rest of your code.
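The character-at-a-time approach described above might be sketched as follows (a sketch, not the only way; the function name is mine, and apostrophes count as word characters per the question):

```cpp
#include <cctype>
#include <string>
#include <vector>

// Collect runs of letters and apostrophes; any other character ends the
// current word, so "bad-day" yields two words and "can't" stays one.
std::vector<std::string> extractWords(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalpha(c) || c == '\'') {
            current += static_cast<char>(c);
        } else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())  // flush the last word at end of input
        words.push_back(current);
    return words;
}
```

Each completed word can then be handed to passToFunction as it is produced, which also sidesteps the eof() problem, since the loop is driven by the characters actually read.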
OnlyLetterNumAndApp facet for a stream
#include <locale>
#include <string>
#include <fstream>
#include <iostream>
// This facet treats letters/numbers and apostrophe as alpha.
// Everything else is treated like a space.
//
// This makes reading words with operator>> very easy to use
// when you want to ignore all the other characters.
class OnlyLetterNumAndApp : public std::ctype<char>
{
public:
    typedef std::ctype<char> base;
    typedef base::char_type char_type;
    OnlyLetterNumAndApp(std::locale const& l)
        : base(table)
    {
        std::ctype<char> const& defaultCType = std::use_facet<std::ctype<char> >(l);
        for (int loop = 0; loop < 256; ++loop) {
            table[loop] = (defaultCType.is(base::alnum, loop) || loop == '\'')
                              ? base::alpha
                              : base::space;
        }
    }
private:
    base::mask table[256];
};
Usage
int main()
{
    std::ifstream file;
    file.imbue(std::locale(std::locale(), new OnlyLetterNumAndApp(std::locale())));
    file.open("test.txt");
    std::string word;
    while (file >> word) {
        std::cout << word << "\n";
    }
}
Test File
> cat test.txt
This is %%% a test djkhfdkjfd
try another $gh line's
bad-people.Do bad things
Result
> ./a.out
This
is
a
test
djkhfdkjfd
try
another
gh
line's
bad
people
Do
bad
things

Cannot get second while to loop properly

I'm making a function that removes elements from a string. However, I can't seem to get both of my loops to work together. The first while loop works flawlessly. I looked into it and I believe it might be because when find_last_of doesn't find anything, it still returns a value (which is throwing off my loop). I haven't been able to figure out how to fix it. Thank you.
#include <iostream>
#include <string>
using namespace std;
string foo(string word) {
    string compare = "!##$";
    string alphabet = "abcdefghijklmnopqrstuvxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    while (word.find_first_of(compare) < word.find_first_of(alphabet)) {
        int position = word.find_first_of(compare);
        word = word.substr(++position);
    }
    while (word.find_last_of(compare) > word.find_last_of(alphabet)) {
        int size = word.length();
        word = word.substr(0, --size);
    }
    return word;
}
int main() {
    cout << foo("!!hi!!");
    return 0;
}
I wrote it like this so compound words would not be affected. Desired result: "hi"
It's not entirely clear what you're trying to do, but how about replacing the second loop with this:
string::size_type p = word.find_last_not_of(compare);
if (p != string::npos)
    word = word.substr(0, ++p);
It's not clear if you just want to trim certain characters from the front and back of word or if you want to remove every one of a certain set of characters from word no matter where they are. Based on the first sentence of your question, I'll assume you want to do the latter: remove all characters in compare from word.
A better strategy would be to more directly examine each character to see if it needs to be removed, and if so, do so, all in one pass through word. Since compare is quite short, something like this is probably good enough:
// Rewrite word by removing all characters in compare (and then erasing the
// leftover space, if any, at the end). See std::remove_if() docs.
word.erase(std::remove_if(word.begin(),
                          word.end(),
                          // Returns true if a character is to be removed.
                          [&](const char ch) {
                              return compare.find(ch) != compare.npos;
                          }),
           word.end());
BTW, I'm not sure why there is both a compare and alphabet string in your example. It seems you would only need to define one or the other, and not both. A character is either one to keep or one to remove.
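Wrapped up as a runnable helper, that remove_if approach looks like this (a sketch; the function name is mine):

```cpp
#include <algorithm>
#include <string>

// Strip every character of `unwanted` from `word`, wherever it appears,
// using the erase-remove idiom in a single pass.
std::string stripChars(std::string word, const std::string& unwanted) {
    word.erase(std::remove_if(word.begin(), word.end(),
                              [&](char ch) {
                                  return unwanted.find(ch) != std::string::npos;
                              }),
               word.end());
    return word;
}
```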