Trying to remove specific characters from a string in C++

I need help removing some non-alphabetical characters from strings. I've been using a method that looks for ASCII characters that don't lie within the a-z and A-Z ranges. It removes some characters, such as "!" and "?", but it doesn't remove a double quote (") or ")" from the end of words.
for(j=0;j<word.size();j++){
if(word[j]<'A' || word[j]>'Z'
word[j]<'a' || word[j]>'z'){
word.erase(j,1);
j--;
wordsave.push_back(word)}}
This code gets its data from a text file containing a random story and saves each modified word to a vector called "wordsave". word is just the string that temporarily holds the word read from the file.
I know each word goes through the whole program because a cout at the end of the program prints it.
What could be the problem behind this code, that makes it skip out on removing some characters?

Here is a method using a well-established idiom, aptly called the erase/remove idiom. It's more efficient than multiple random erases. It takes advantage of a few Standard Library functions, and doesn't require an unnecessary copy.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
int main() {
std::string str("a678b##$c");
std::cout << str << '\n';
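str.erase(std::remove_if(str.begin(), str.end(),
// Take unsigned char: passing a negative char value to std::isalpha is undefined behavior.
[](unsigned char c) { return !std::isalpha(c); }),
str.end());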
std::cout << str << '\n';
}
Output:
a678b##$c
abc
You could pull this code out into its own function, and iterate over your vector, calling the function for each element.
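As a rough sketch of that suggestion, the erase/remove call can be wrapped in a helper and applied to every element of the vector. The names strip_non_alpha and strip_all are made up for this illustration; only wordsave comes from the question.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: erase every non-alphabetic character from s in place.
void strip_non_alpha(std::string& s) {
    s.erase(std::remove_if(s.begin(), s.end(),
                           // unsigned char avoids undefined behavior in std::isalpha
                           [](unsigned char c) { return !std::isalpha(c); }),
            s.end());
}

// Hypothetical helper: apply strip_non_alpha to each word in a vector,
// for example the vector called wordsave in the question.
void strip_all(std::vector<std::string>& words) {
    for (std::string& w : words) {
        strip_non_alpha(w);
    }
}
```

Calling strip_all(wordsave) once, after all the words have been read, keeps the cleaning step separate from the file-reading loop.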

Related

C++ C-string has (apparently) no content to display

First of all, I am nothing but new to both programming and Stack Overflow.
I am self-studying with Schaum's outline for Programming with C++ and I have some issues with problem 8.24 (solutions are given to almost every problem in the book, but I want to know why my code in particular isn't working as expected).
You are supposed to be given a c-string and return the given string, but with all its tokens in reverse order (but keeping the natural order of the token itself).
That is, given "Enter a sentence" it would show on screen "sentence a Enter".
My code is the following:
#include <iostream>
#include <cstring>
using namespace std;
int main()
{
char line1[100];
cout << "Enter a sentence (enter \".\" to terminate input):\n";
cin.getline(line1,100,'.');
char line2[strlen(line1) + 1]; // we add 1 for the empty char that ends every c string
int char_count = strlen(line1); //strlen() does not include the empty char
char* p = strtok(line1," ");
while (p)
{
char_count -= strlen(p); // we subtract p's length to start adding its chars
for (int i = 0; i <= strlen(p); i++)
line2[char_count + i] = p[i]; // we then add the chars themselves
if ((char_count - 1) > 0)
line2[--char_count] = ' '; // a blank space is needed between the different tokens
p = strtok(NULL, " ");
}
cout << "\n" << line2 << "\n";
}
Unfortunately, the code is wrong in many ways. The most obvious thing is the obscurity of the word reversal process (and the fact it is mixed with word iteration).
According to the commenters, you are not using C++. In C++ it would be rather straightforward:
#include <algorithm>
#include <iostream>
#include <string>
void reverse_words(std::string& s) {
/* starting position of the word */
size_t last_pos = 0;
do {
/* find the end of current word */
size_t end_pos = std::min( s.find(' ', last_pos + 1), s.size() );
/* reverse one word inplace */
std::reverse(s.begin() + last_pos, s.begin() + end_pos);
/* advance to the beginning of the next word */
last_pos = end_pos + 1;
} while (last_pos < s.size());
std::reverse(s.begin(), s.end());
}
int main()
{
std::string s = "This is a sentence";
reverse_words(s);
std::cout << s << std::endl;
}
Hopefully, you can see the essence of the method: sequentially find start and finish of each word, reverse letter order in this word and then finally reverse the entire string.
Now, getting back to the C-string question. You can replace std::string::find call with strtok and write your version of std::reverse specialized for C strings (the reversal of the entire string or its part is simpler than reversing the word order and this is also the recommended exercise).
Start from a simpler program which prints out pairs of integers (start_pos and end_pos for each word) using strtok. Then write a reverse procedure and test it also. Finally, combine this word iteration with reverse. I personally think this is the only way to be sure your implementation is correct - being sure in each of its parts and being able to test each part individually.
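A minimal C-string sketch of the same two-step idea (reverse each word, then reverse the whole buffer) might look like the following. The helper name reverse_range is invented for this example, and the sketch scans for spaces by hand instead of calling strtok, because strtok overwrites each delimiter with a null character, which would get in the way of reversing the whole buffer afterward.

```cpp
#include <cstring>

// Hypothetical helper: reverse the characters in [first, last) in place.
void reverse_range(char* first, char* last) {
    while (first < last) {
        --last;
        char tmp = *first;
        *first = *last;
        *last = tmp;
        ++first;
    }
}

// Reverse the order of space-separated words in a C string, in place.
void reverse_words(char* s) {
    char* start = s;
    while (*start != '\0') {
        // find the end of the current word
        char* end = start;
        while (*end != '\0' && *end != ' ') {
            ++end;
        }
        // reverse the letters of this word
        reverse_range(start, end);
        // step over the space, if there is one
        start = (*end == ' ') ? end + 1 : end;
    }
    // reversing the whole buffer restores each word and flips their order
    reverse_range(s, s + std::strlen(s));
}
```

Given char line[] = "Enter a sentence";, calling reverse_words(line) leaves "sentence a Enter" in the buffer. Each helper can be tested on its own before the two are combined.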
A lot of improvements have been added to C++ since that book was originally written, and we can do it in a lot cleaner and safer way now. We'll break the problem into two parts:
A function to convert a string into a list of tokens
The main function, which reads the string; reverses it; and prints it.
These functions will be tokenize, which returns a vector of string_view, and main. A string_view is just a class that stores a pointer and a size to some other string. It's efficient because it won't make a copy of the string or allocate any memory. In this case, it's the right tool for the job because we're going to be breaking up an existing string.
#include <string_view>
#include <string>
#include <iostream>
#include <vector>
#include <algorithm>
auto tokenize(std::string_view line) {
std::vector<std::string_view> tokens;
for (size_t token_size = line.find(' ');
token_size != line.npos;
token_size = line.find(' '))
{
tokens.push_back(line.substr(0, token_size));
line.remove_prefix(token_size + 1);
}
tokens.push_back(line);
return tokens;
}
int main() {
std::string line;
std::getline(std::cin, line);
auto tokens = tokenize(line);
std::reverse(tokens.begin(), tokens.end());
for(auto token : tokens) {
std::cout << token << ' ';
}
std::cout << std::endl;
}
Explaining tokenize
Tokenize takes a string_view as input, and returns a list of the tokens. line.find(' ') will look for a space. If it finds one, it'll return the position of the space; otherwise, it'll return line.npos (which is basically the biggest possible size).
For every token we find, we
get the token via line.substr(0, token_size)
add the token to the vector via tokens.push_back
Then, we update the line by removing the first token and the corresponding space. This is line.remove_prefix(token_size + 1);
Once there are no more spaces, we'll add the remainder of the line to the vector using tokens.push_back(line);, and then we'll return the vector of tokens.
Explaining main
We can get the line via std::getline(std::cin, line);, which will read a line from cin and put it in the variable we give it (line). After that, we can read all the tokens in the line using the tokenize function we wrote. We'll reverse the vector of tokens via std::reverse, and then we'll print out all the tokens.
Thanks to each of you.
Seeing your answers I have learnt quite a lot about good programming (both regarding syntax and original ways to solve the problem itself, as Viktor's).
I apologise if I have not given the proper feedback, but again I am (still) unfamiliar with Stack's customs and "policies".

How do I count individual number of characters in a word from a text file?

I have an assignment to read a text from a file and remove all of the words which have an odd number of characters. I don't really know where to start because I'm still learning the basics and could use some help. The biggest question I have is how to count the characters in individual words. How would I separate the counting from word to word in a text file? Thank you for the help.
//Sorry for the lack of clarity, English is my second language and it can sometimes be difficult to specify what I need.
I have an assignment to write a c++ program where it:
(1)Reads a text file and determines which words are even and which are odd
(2)Then it takes even words and reverses the order of characters in each even word.
So for example I have a moderate size text. It picks out even words and reverses their character order.
So far I have written this code and I don't know if it is any good to continue with because I don't know how to reverse the order of characters.
#include <iostream>
#include <string>
#include <string.h>
#include <cstdlib>
#include <fstream>
using namespace std;
int main()
{
string word;
ifstream f("text.txt");
if (f.is_open())
{
while (f >> word)
{
if (word.length() % 2 == 0)
cout << word << endl;
}
f.close();
}
else
cout << "file is not open" << '\n';
}
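One possible way to handle step (2), sketched under the assumption that "even" means a word with an even number of characters, is std::reverse from <algorithm>. The function name reverse_even_words is made up for this illustration, and it takes any std::istream so it works with an ifstream like f above.

```cpp
#include <algorithm>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read whitespace-separated words from a stream and
// reverse the characters of every word whose length is even.
std::vector<std::string> reverse_even_words(std::istream& in) {
    std::vector<std::string> words;
    std::string word;
    while (in >> word) {            // operator>> yields one word at a time
        if (word.length() % 2 == 0) {
            std::reverse(word.begin(), word.end());
        }
        words.push_back(word);      // odd-length words are kept unchanged
    }
    return words;
}
```

Because operator>> already splits the input on whitespace, word.length() is the character count of a single word, so no separate counting loop is needed.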

Find the different characters present in a string

Is there any way to find all the unique characters present in a string without finding every occurrence of each character?
For example, given string a="mississippi", the output should be {i,m,p,s}. Are there any built-in functions to do that in C++?
You can do that using std::sort, std::unique, and std::string::erase.
Note: the original string will be modified (make a copy of it first if you don't want that).
std::string str = "mississippi";
std::sort(std::begin(str), std::end(str));
auto last = std::unique(std::begin(str), std::end(str));
str.erase(last, std::end(str));
Make a set of characters and put all items from string to it, then you will have set with "alphabet" of your string.
E.g.:
#include <string>
#include <iostream>
#include <set>
int main(void)
{
std::string a = "mississippi";
std::set<char> alphabet;
alphabet.insert(a.begin(), a.end());
std::cout << "Set of chars has " << alphabet.size() << " items." << std::endl;
for (char c : alphabet) // named c so it does not shadow the string a
{
std::cout << c << std::endl;
}
}
Original string is not modified in that example and there is no need to pre-sort.
Sounds uncommon enough that it's not part of the STL.
I'd simply try iterating through the string and creating or incrementing a count for each character in a hash map (a hash set alone cannot hold counts). Then, grab all the keys to determine the unique values.
Good Luck
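The counting approach in this answer could be sketched with std::unordered_map, which is an assumption here since the answer does not name a container. The function name unique_chars is also invented for the example. The keys are sorted at the end only so the result has a predictable order.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: count each character, then return the distinct
// characters (the map's keys) in sorted order.
std::vector<char> unique_chars(const std::string& s) {
    std::unordered_map<char, int> counts;
    for (char c : s) {
        ++counts[c];                // creates the entry at 0 if missing
    }
    std::vector<char> keys;
    for (const auto& entry : counts) {
        keys.push_back(entry.first);
    }
    std::sort(keys.begin(), keys.end());
    return keys;
}
```

For "mississippi" this returns i, m, p, s. If the counts are not needed, the std::set version above does the same job with less code.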

Unusual/Wrong Behavior of C++ Program

My code is intended to tell the user whether the string entered is a keyword in c++.
I am reading the keywords from a file into a set and then checking if the user supplied string is in it.
#include <iostream>
#include <string>
#include <set>
#include <algorithm>
#include <fstream>
using namespace std;
int main()
{
set<string> key;
fstream fs;
string b;
fs.open("keywords.txt",fstream::in);
while(getline(fs,b))
key.insert(b);
b.clear();
for(auto x:key)
cout << x << endl;
cout << "Enter String user\nPress exit to terminate\n";
while(getline(cin,b))
{
if(b == "exit")
break;
if(key.find(b) != key.end())
cout << "This is a keyword\n";
else
cout << "This is not a keyword\n";
b.clear();
}
fs.close();
}
The keywords.txt file is just a list of keywords and can be obtained from here
The problem is that my program reads all keywords correctly but for some of them such as false,public it cannot find them in the set.
i.e. when I enter false as user input
it says, "This is not a keyword."
Considering your input file, I think you have some keyword names with trailing spaces.
"catch "
"false "
You can trim the strings before inserting in the set to remove spaces, using boost::trim or your own trim (see this question for instance.)
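If Boost is not available, a hand-written trim can be fairly short. This is only a sketch: the name trim and the choice of which characters count as whitespace are assumptions made for the example.

```cpp
#include <string>

// Hypothetical helper: return a copy of s without leading or trailing
// spaces, tabs, carriage returns, or newlines.
std::string trim(const std::string& s) {
    const char* whitespace = " \t\r\n";
    const std::size_t first = s.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return "";                  // s is empty or all whitespace
    }
    const std::size_t last = s.find_last_not_of(whitespace);
    return s.substr(first, last - first + 1);
}
```

In the question's loop, key.insert(trim(b)) would then store "false" rather than "false ". Including '\r' also covers files saved with Windows line endings.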
(If you want some advice as for your code:
You can use std::ifstream like this for input file streams:
std::ifstream file( "keywords.txt" );
You do not need to call .close() at the end of the scope; it will be done automatically thanks to RAII.
You should not reuse the same std::string object for every purpose; you can declare new string objects close to where they are used. You should give them better names, like "line" instead of "b". Doing this, you don't need to call ".clear()" on your strings.
Every line has just one word, so you could use while(fs>>b); the >> operator skips whitespace (from moldbinlo & wangxf comments)
)

How do I split a user-defined sentence into words in C++ using substr and find?

I used this function but it is wrong.
for (int i=0; i<sen.length(); i++) {
if (sen.find (' ') != string::npos) {
string new = sen.substr(0,i);
}
cout << "Substrings:" << new << endl;
}
Thank you! Any kind of help is appreciated!
new is a keyword in C++, so first step is to not use that as a variable name.
After that, you need to put your output statement in the "if" block, so that it can actually be allowed to access the substring. Scoping is critical in C++.
First: this cannot compile because new is a language keyword.
Then you have a loop running through every character in the string so you shouldn't need to use std::string::find. I would use std::string::find, but then the loop condition should be different.
This doesn't use substr and find, so if this is homework and you have to use that then this won't be a good answer... but I do believe it's the better way to do what you're asking in C++. It's untested but should work fine.
//Create stringstream and insert your whole sentence into it.
std::stringstream ss;
ss << sen;
//Read out words one by one into a string - stringstream will tokenize them
//by the ASCII space character for you.
std::string myWord;
while (ss >> myWord)
std::cout << myWord << std::endl; //You can save it however you like here.
If it is homework you should tag it as such so people stick to the assignment and know how much to help and/or not help you so they don't give it away :)
No need to iterate over the string, find already does this. It starts to search from the beginning by default, so once we found a space, we need to start the next search from this found space:
std::vector<std::string> words;
//find first space
size_t start = 0, end = sen.find(' ');
//as long as there are spaces
while(end != std::string::npos)
{
//get word
words.push_back(sen.substr(start, end-start));
//search next space (of course only after already found space)
start = end + 1;
end = sen.find(' ', start);
}
//last word
words.push_back(sen.substr(start));
Of course this doesn't handle duplicate spaces, starting or trailing spaces and other special cases. You would actually be better off using a stringstream:
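#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
std::istringstream stream(sen);
// The extra parentheses around the first argument stop the compiler from parsing this line as a function declaration.
std::vector<std::string> words((std::istream_iterator<std::string>(stream)),
std::istream_iterator<std::string>());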
You can then just put these out however you like or just do it directly in the loops without using a vector:
for(std::vector<std::string>::const_iterator iter=
words.begin(); iter!=words.end(); ++iter)
std::cout << "found word: " << *iter << '\n';
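If the assignment really does require find and substr, the loop above can be adjusted to cope with repeated, leading, and trailing spaces. The following is a sketch under that assumption; the function name split_words is made up, and it relies on std::string::find_first_not_of to skip runs of spaces.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split sen on spaces using find and substr,
// skipping leading, trailing, and repeated spaces.
std::vector<std::string> split_words(const std::string& sen) {
    std::vector<std::string> words;
    // start of the first word, or npos if there is none
    std::size_t start = sen.find_first_not_of(' ');
    while (start != std::string::npos) {
        // end of the current word (npos means it runs to the end)
        const std::size_t end = sen.find(' ', start);
        // substr clamps the count, so an npos end takes the rest of the string
        words.push_back(sen.substr(start, end - start));
        // start of the next word, skipping any run of spaces
        start = sen.find_first_not_of(' ', end);
    }
    return words;
}
```

With this version, "  hello   world " yields just the two words hello and world, with no empty strings in between.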