How do I divide a sentence into words? [duplicate] - C++

This question already has answers here:
How do I iterate over the words of a string?
(84 answers)
Closed 5 years ago.
How do I divide a sentence into words in C++? For example, the input from cin
He said, "that's not a good idea".
should be divided into
He
Said
That
s
not
a
good
idea
To test whether a character is a letter, use the condition (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z').

You can split the string by spaces, then check each word for characters outside A-Z and a-z and erase any you find. Here's a sketch:
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
std::vector<std::string> splitBySpace(std::string input);
std::vector<std::string> checker(std::vector<std::string> rawVector);
int main() {
//input
std::string input ("Hi, My nam'e is (something)");
std::vector<std::string> result = checker(splitBySpace(input));
return 0;
}
//function to split the string by spaces (into words)
std::vector<std::string> splitBySpace(std::string input) {
std::stringstream ss(input);
std::vector<std::string> elems;
while (ss >> input) {
elems.push_back(input);
}
return elems;
}
//function to strip every character other than A-Z and a-z from each word
std::vector<std::string> checker(std::vector<std::string> rawVector) {
std::vector<std::string> outputVector;
for (auto iter = rawVector.begin(); iter != rawVector.end(); ++iter) {
std::string temp = *iter;
std::size_t index = 0;
while (index < temp.size()) {
if ((temp[index] < 'A' || temp[index] > 'Z') && (temp[index] < 'a' || temp[index] > 'z')) {
//erase shifts the remaining characters left, so stay on the same index
temp.erase(index, 1);
} else {
++index;
}
}
outputVector.push_back(temp);
}
return outputVector;
}
In this example, result is a vector holding the words of the sentence.
NOTE: use std::vector<std::string>::iterator iter instead of auto iter if you are not using C++11 or later.
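Note that splitting on spaces and then erasing punctuation turns that's into thats, whereas the question asks for that and s as separate words. A minimal sketch that instead treats every non-letter as a separator, reusing the letter test from the question, might look like this (the name split_letters is made up for illustration):

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the maximal runs of ASCII letters in `text`.
std::vector<std::string> split_letters(const std::string& text) {
    std::vector<std::string> words;
    std::string current;
    for (char ch : text) {
        const bool is_letter = (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
        if (is_letter) {
            current += ch;              // still inside a word
        } else if (!current.empty()) {
            words.push_back(current);   // a non-letter ends the current word
            current.clear();
        }
    }
    if (!current.empty()) {
        words.push_back(current);       // the input may end in the middle of a word
    }
    return words;
}
```

Called on a line read with std::getline(std::cin, line), this should yield He, said, that, s, not, a, good, idea for the input in the question. The capitalisation shown there would need an extra step.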


How to convert an input string into an array in C++? [duplicate]

This question already has answers here:
C++ function split string into words
(1 answer)
taking input of a string word by word
(3 answers)
Right way to split an std::string into a vector<string>
(12 answers)
Closed last year.
myStr = input("Enter something - ")
# say I enter "Hi there"
arrayStr = myStr.split()
print(arrayStr)
# Output: ['Hi', 'there']
What is the exact C++ equivalent of this code? (My aim is to further iterate over the array and perform comparisons with other arrays).
One way of doing this would be using std::vector and std::istringstream as shown below:
#include <iostream>
#include <string>
#include<sstream>
#include <vector>
int main()
{
std::string input, temp;
//take input from user
std::getline(std::cin, input);
//create a vector that will hold the individual words
std::vector<std::string> vectorOfString;
std::istringstream ss(input);
//go word by word
while(ss >> temp)
{
vectorOfString.emplace_back(temp);
}
//iterate over all elements of the vector and print them out
for(const std::string& element: vectorOfString)
{
std::cout<<element<<std::endl;
}
return 0;
}
You can use std::string_view (C++17) to avoid copying the input string, which saves memory: each element is literally a view onto a word inside the original string. For example:
#include <iostream>
#include <string_view>
#include <vector>
inline bool is_delimiter(const char c)
{
// order by frequency in your input for optimal performance
return (c == ' ') || (c == ',') || (c == '.') || (c == '\n') || (c == '!') || (c == '?');
}
auto split_view(const char* line)
{
const char* word_start_pos = line;
const char* p = line;
std::size_t letter_count{ 0 };
std::vector<std::string_view> words;
// while parsing hasn't seen the terminating 0
while(*p != '\0')
{
// if it is a character from a word then start counting the letters in the word
if (!is_delimiter(*p))
{
letter_count++;
}
else
{
//delimiter reached and word detected
if (letter_count > 0)
{
//add another string view to the characters in the input string
// this will call the constructor of string_view with arguments const char* and size
words.emplace_back(word_start_pos, letter_count);
// skip to the next word
word_start_pos += letter_count;
}
// skip delimiters for as long as you encounter them
word_start_pos++;
letter_count = 0ul;
}
// move on to the next character
++p;
}
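// the last word has no delimiter after it if the input ends mid-word
if (letter_count > 0)
{
words.emplace_back(word_start_pos, letter_count);
}
return words;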
}
int main()
{
auto words = split_view("the quick brown fox is fast. And the lazy dog is asleep!");
for (const auto& word : words)
{
std::cout << word << "\n";
}
return 0;
}
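The same view-based idea can also be sketched with the search members of std::string_view instead of a hand-written character loop. This is only an illustration under assumptions: the name split_views and the default delimiter set are invented here, and the returned views are valid only while the original text is alive.

```cpp
#include <string_view>
#include <vector>

// Hypothetical helper: the views point into `text`, so `text` must outlive them.
std::vector<std::string_view> split_views(std::string_view text,
                                          std::string_view delims = " ,.\n!?") {
    std::vector<std::string_view> words;
    std::size_t start = text.find_first_not_of(delims);
    while (start != std::string_view::npos) {
        const std::size_t end = text.find_first_of(delims, start);
        // substr clamps the count, so end == npos takes the rest of the text
        words.push_back(text.substr(start, end - start));
        if (end == std::string_view::npos) break;
        start = text.find_first_not_of(delims, end);
    }
    return words;
}
```

Because the last word is pushed even when no delimiter follows it, an input such as "lazy dog" gives both lazy and dog.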
#include <string>
#include <sstream>
#include <vector>
#include <iterator>
template <typename Out>
void split(const std::string &s, char delim, Out result) {
std::istringstream iss(s);
std::string item;
while (std::getline(iss, item, delim)) {
*result++ = item;
}
}
std::vector<std::string> split(const std::string &s, char delim) {
std::vector<std::string> elems;
split(s, delim, std::back_inserter(elems));
return elems;
}
std::vector<std::string> x = split("one:two::three", ':');
Here x is the resulting vector with 4 elements: "one", "two", an empty string (from the two consecutive colons) and "three".
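Splitting "one:two::three" on ':' with std::getline produces an empty token between the two consecutive colons. If that is not wanted, one option is to drop empty items while reading. A sketch under that assumption (split_nonempty is a name invented here, not part of the answer above):

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of split() that skips empty tokens.
std::vector<std::string> split_nonempty(const std::string& s, char delim) {
    std::vector<std::string> elems;
    std::istringstream iss(s);
    std::string item;
    while (std::getline(iss, item, delim)) {
        if (!item.empty()) {
            elems.push_back(item);   // keep only tokens that contain something
        }
    }
    return elems;
}
```

With this variant, split_nonempty("one:two::three", ':') holds three elements instead of four.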
Basically @AnoopRana's solution, but using STL algorithms and removing punctuation marks from the words:
#include <cctype> // ispunct
#include <algorithm> // copy, transform
#include <iostream> // cout
#include <iterator> // istream_iterator, ostream_iterator
#include <sstream> // istringstream
#include <string>
#include <vector>
int main() {
const std::string s{"In the beginning, there was simply the event and its consequences."};
std::vector<std::string> ws{};
std::istringstream iss{s};
std::transform(std::istream_iterator<std::string>{iss}, {},
std::back_inserter(ws), [](std::string w) {
w.erase(std::remove_if(std::begin(w), std::end(w),
[](unsigned char c) { return std::ispunct(c); }),
std::end(w));
return w;
});
std::copy(std::cbegin(ws), std::cend(ws), std::ostream_iterator<std::string>{std::cout, "\n"});
}
// Outputs:
//
// In
// the
// beginning
// there
// was
// simply
// the
// event
// and
// its
// consequences

Split strings into tokens with delimiter (/ and -) in c++ [duplicate]

This question already has answers here:
Right way to split an std::string into a vector<string>
(12 answers)
Closed 11 months ago.
The community reviewed whether to reopen this question 11 months ago and left it closed:
Original close reason(s) were not resolved
I have some text (meaningful text or arithmetical expression) and I want to split it into words.
If I had a single delimiter, I'd use:
std::stringstream stringStream(inputString);
std::string word;
while(std::getline(stringStream, word, delimiter))
{
wordVector.push_back(word);
}
How can I break the string into tokens with several delimiters?
Assuming one of the delimiters is newline, the following reads the line and further splits it by the delimiters. For this example I've chosen the delimiters space, apostrophe, and semi-colon.
std::stringstream stringStream(inputString);
std::string line;
while(std::getline(stringStream, line))
{
std::size_t prev = 0, pos;
while ((pos = line.find_first_of(" ';", prev)) != std::string::npos)
{
if (pos > prev)
wordVector.push_back(line.substr(prev, pos-prev));
prev = pos+1;
}
if (prev < line.length())
wordVector.push_back(line.substr(prev, std::string::npos));
}
If you have boost, you could use:
#include <boost/algorithm/string.hpp>
std::string inputString("One!Two,Three:Four");
std::string delimiters("|,:");
std::vector<std::string> parts;
boost::split(parts, inputString, boost::is_any_of(delimiters));
Using std::regex
A std::regex can do string splitting in a few lines:
std::regex re("[\\|,:]");
std::sregex_token_iterator first{input.begin(), input.end(), re, -1}, last;//the '-1' is what makes the regex split (-1 := what was not matched)
std::vector<std::string> tokens{first, last};
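The snippet above leaves input undefined and keeps empty tokens, for example when the text starts with a delimiter. A self-contained sketch that wraps the same token-iterator idiom and skips empty tokens could look like this (regex_split and its parameters are names chosen for illustration):

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical wrapper around the std::sregex_token_iterator idiom.
std::vector<std::string> regex_split(const std::string& input, const std::string& pattern) {
    const std::regex re(pattern);
    // -1 selects the text between matches, i.e. the tokens
    std::sregex_token_iterator first(input.begin(), input.end(), re, -1), last;
    std::vector<std::string> tokens;
    for (; first != last; ++first) {
        if (first->length() > 0) {      // skip empty tokens, e.g. from a leading delimiter
            tokens.push_back(first->str());
        }
    }
    return tokens;
}
```

For instance, regex_split("One|Two,Three:Four", "[|,:]+") should give One, Two, Three and Four. The + in the pattern treats a run of delimiters as a single separator.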
I don't know why nobody pointed out the manual way, but here it is:
const std::string delims(";,:. \n\t");
inline bool isDelim(char c) {
return delims.find(c) != std::string::npos;
}
and in function:
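std::stringstream stringStream(inputString);
std::string word; int c; // int, so that EOF can be told apart from every char
while (stringStream) {
word.clear();
// Read word: stop at a delimiter or at the end of the input
while ((c = stringStream.get()) != EOF && !isDelim(static_cast<char>(c)))
word.push_back(static_cast<char>(c));
if (c != EOF)
stringStream.unget();
// skip the empty word produced by a leading delimiter
if (!word.empty())
wordVector.push_back(word);
// Read delims
while ((c = stringStream.get()) != EOF && isDelim(static_cast<char>(c)));
if (c != EOF)
stringStream.unget();
}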
This way you can do something useful with the delims if you want.
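One way to "do something useful with the delims" is to keep them as tokens, which suits the arithmetic expressions mentioned in the question. The following is a rough sketch, not the answer's own code: it works on the string directly rather than through a stringstream, and split_keep_delims is a made-up name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at any character in `delims`,
// keeping each delimiter as a one-character token of its own.
std::vector<std::string> split_keep_delims(const std::string& text, const std::string& delims) {
    std::vector<std::string> tokens;
    std::string word;
    for (char c : text) {
        if (delims.find(c) == std::string::npos) {
            word.push_back(c);                 // ordinary character: extend the word
            continue;
        }
        if (!word.empty()) {
            tokens.push_back(word);            // the delimiter ends the current word
            word.clear();
        }
        tokens.push_back(std::string(1, c));   // record the delimiter itself
    }
    if (!word.empty()) {
        tokens.push_back(word);                // trailing word without a delimiter
    }
    return tokens;
}
```

For example, split_keep_delims("12+3*4", "+*") should give the tokens 12, +, 3, * and 4, in that order.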
And here, ages later, a solution using C++20:
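constexpr std::string_view words{"Hello-_-C++-_-20-_-!"};
constexpr std::string_view delimiters{"-_-"};
for (const auto word : std::views::split(words, delimiters)) {
// each element is a subrange of words, so build a string_view from its iterators
std::cout << std::quoted(std::string_view(word.begin(), word.end())) << ' ';
}
// outputs: "Hello" "C++" "20" "!"
Required headers:
#include <iomanip>
#include <iostream>
#include <ranges>
#include <string_view>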
Reference: https://en.cppreference.com/w/cpp/ranges/split_view
If you are interested in how to do it yourself without using boost:
Assume the delimiter string may be very long, say of length M. Checking one character of your string against every delimiter costs O(M), so doing that in a loop over all characters of the original string, say of length N, is O(M*N).
I would use a dictionary (like a map from delimiter to boolean, but here I use a simple boolean array that holds true at the index equal to the character code of each delimiter).
Now checking whether a character is a delimiter while iterating over the string is O(1), which gives O(N) overall (plus O(M) once to fill the array).
Here is my sample code:
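#include <iostream>
#include <string>
#include <vector>
using namespace std;
const int dictSize = 256;
vector<string> tokenizeMyString(const string &s, const string &del)
{
//not static: the table must start empty on every call
bool dict[dictSize] = { false };
vector<string> res;
for (size_t i = 0; i < del.size(); ++i) {
//index with unsigned char so that chars above 127 cannot give a negative index
dict[static_cast<unsigned char>(del[i])] = true;
}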
string token("");
for (auto &i : s) {
if (dict[static_cast<unsigned char>(i)]) {
if (!token.empty()) {
res.push_back(token);
token.clear();
}
}
else {
token += i;
}
}
if (!token.empty()) {
res.push_back(token);
}
return res;
}
int main()
{
string delString = "MyDog:Odie, MyCat:Garfield MyNumber:1001001";
//the delimiters are " " (space) and "," (comma)
vector<string> res = tokenizeMyString(delString, " ,");
for (auto &i : res) {
cout << "token: " << i << endl;
}
return 0;
}
Note: tokenizeMyString builds the vector locally and returns it by value. Compilers typically elide or move that copy (RVO, return value optimization), so returning by value is cheap here.
Using Eric Niebler's range-v3 library:
https://godbolt.org/z/ZnxfSa
#include <string>
#include <iostream>
#include "range/v3/all.hpp"
int main()
{
std::string s = "user1:192.168.0.1|user2:192.168.0.2|user3:192.168.0.3";
auto words = s
| ranges::view::split('|')
| ranges::view::transform([](auto w){
return w | ranges::view::split(':');
});
ranges::for_each(words, [](auto i){ std::cout << i << "\n"; });
}

C++ Extracting the integer in a string with multiple delimiters

I am trying to extract the integers from a string. What could be wrong here?
I only get the first value. How can I get it working even with zeros in the string?
string str="91,43,3,23,0;6,9,0-4,29,24";
std::stringstream ss(str);
int x;
while(ss >> x)
{
cout<<"GOT->"<<x<<endl;
char c;
ss >> c; //Discard a non space char.
if(c != ',' || c != '-' || c != ';')
{
ss.unget();
}
}
Look very closely at this line:
if(c != ',' || c != '-' || c != ';')
Note that this condition is always true, because c cannot equal all three characters at once, so you always unget the punctuation character. The next read then fails because it finds punctuation where a number is expected. Changing the ||'s to &&'s should fix the problem.
Of course, your code assumes that str is formatted in a very particular way and might break when given a differently-formatted str value. Just be aware of that.
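For illustration, the loop from the question with && in place of || can be wrapped in a small function. This is a sketch that assumes the same input format as the question, numbers separated by a single ',', ';' or '-'. The name extract_ints is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the integers from a string such as "1,2;3-4".
std::vector<int> extract_ints(const std::string& str) {
    std::istringstream ss(str);
    std::vector<int> values;
    int x;
    while (ss >> x) {
        values.push_back(x);
        char c;
        // Read the next non-space character; stop if the input is exhausted.
        if (!(ss >> c)) break;
        // Put the character back only if it is NOT one of the separators.
        if (c != ',' && c != '-' && c != ';') ss.unget();
    }
    return values;
}
```

Because each '-' is consumed as a separator, "0-4" is read as 0 followed by 4, not as 0 and -4.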
You can get this done with boost::split.
#include <boost/algorithm/string.hpp>
#include <iostream>
#include <string>
#include <vector>
int main() {
std::string inputString = "91,43,3,23,0;6,9,0-4,29,24";
std::string delimiters("|,:-;");
std::vector<std::string> parts;
boost::split(parts, inputString, boost::is_any_of(delimiters));
for(int i = 0; i<parts.size();i++ ) {
std::cout <<parts[i] << " ";
}
return 0;
}
Output (just the integers): 91 43 3 23 0 6 9 0 4 29 24
This copies the string into a char array, dropping the delimiters , ; and -. Note that the digits then run together, so the boundaries between the numbers are lost.
#include <iostream>
#include <string>
using namespace std;
int main(){
string str = "91,43,3,23,0;6,9,0-4,29,24";
char a[99] = {}; // zero-initialised, so the result is always null-terminated
int j = 0;
for(size_t i = 0; i < str.length() && j < 98; i++){
if (str[i]!=',' && str[i]!=';' && str[i]!='-'){
a[j] = str[i];
j++;
}
}
cout << a << endl;
return 0;
}
Hope this will help you.
This suits my purpose: I can extract the integers and also add more delimiters if necessary. It works with differently formatted strings as well.
(I don't have the boost library, hence I prefer this method.)
int main()
{
string str="2,3,4;0,1,3-4,289,24,21,45;2";
//string str=";2;0,1,3-4,289,24;21,45;2"; //input2
std::stringstream ss(str);
int x=0;
if( str.length() != 0 )
{
while( !ss.eof() )
{
if( ss.peek()!= ',' && ss.peek()!=';' && ss.peek()!='-') /*Delimiters*/
{
ss>>x;
cout<<"val="<<x<<endl;
/* TODO:store integers do processing */
}
ss.get();
}
}
}
You can also try:
vector<int> SplitNumbersFromString(const string& input, const vector<char>& delimiters)
{
string buff{""};
vector<int> output;
for (auto n : input)
{
if (none_of(delimiters.begin(), delimiters.end(), [n](const char& c){ return c == n; }))
{
buff += n;
}
else
{
if (buff != "")
{
output.push_back(stoi(buff));
buff = "";
}
}
}
if (buff != "") output.push_back(stoi(buff));
return output;
}
vector<char> delimiters = { ',', '-', ';' };
vector<int> numbers = SplitNumbersFromString("91,43,3,23,0;6,9,0-4,29,24", delimiters);

C++ - Counting the number of vowels from a file

I'm having trouble implementing a feature that counts and displays the number of vowels from a file.
Here is the code I have so far.
#include <iostream>
#include <fstream>
#include <string>
#include <cassert>
#include <cstdio>
using namespace std;
int main(void)
{int i;
string inputFileName;
string s;
ifstream fileIn;
char ch;
cout<<"Enter name of file of characters :";
cin>>inputFileName;
fileIn.open(inputFileName.data());
assert(fileIn.is_open() );
i=0;
while (!(fileIn.eof()))
{
????????????
}
cout<<s;
cout<<"The number of vowels in the string is "<<s.?()<<endl;
return 0;
}
Note the question marks in the code.
Questions: How should I go about counting the vowels? Do I have to convert the text to lowercase and invoke system controls (if possible)?
Also, as for printing the number of vowels in the end, which string variable should I use, (see s.?)?
Thanks
auto isvowel = [](char c){ return c == 'A' || c == 'a' ||
c == 'E' || c == 'e' ||
c == 'I' || c == 'i' ||
c == 'O' || c == 'o' ||
c == 'U' || c == 'u'; };
std::ifstream f("file.txt");
auto numVowels = std::count_if(std::istreambuf_iterator<char>(f),
std::istreambuf_iterator<char>(),
isvowel);
You can use std::count_if from <algorithm> to achieve this:
std::string vowels = "AEIOUaeiou";
size_t count = std::count_if
(
std::istreambuf_iterator<char>(in),
std::istreambuf_iterator<char>(),
[=]( char x)
{
return vowels.find(x) != std::string::npos ;
}
);
Or
size_t count = 0;
std::string vowels = "AEIOUaeiou";
char x ;
while ( in >> x )
{
count += vowels.find(x) != std::string::npos ;
}
Also read Why is iostream::eof inside a loop condition considered wrong?
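Putting these pieces together, a minimal sketch of a reusable function might look like the following. It reads from any std::istream, so no eof() loop is needed. The name count_vowels and the string overload are assumptions added for illustration, and only the ten ASCII vowels are counted.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Counts the vowels in everything that can still be read from `in`.
std::size_t count_vowels(std::istream& in) {
    const std::string vowels = "AEIOUaeiou";
    const auto n = std::count_if(std::istreambuf_iterator<char>(in),
                                 std::istreambuf_iterator<char>(),
                                 [&vowels](char c) { return vowels.find(c) != std::string::npos; });
    return static_cast<std::size_t>(n);
}

// Convenience overload for in-memory text (handy for trying things out).
std::size_t count_vowels(const std::string& text) {
    std::istringstream in(text);
    return count_vowels(in);
}
```

With a file, the call would be along the lines of std::ifstream fileIn(inputFileName); followed by count_vowels(fileIn), after checking that the file opened.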

Getting the words from a sentence and storing them in a vector of strings

Alright, guys ...
Here's my set that has all the letters. I'm defining a word as consisting of consecutive letters from the set.
const char LETTERS_ARR[] = {"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"};
const std::set<char> LETTERS_SET(LETTERS_ARR, LETTERS_ARR + sizeof(LETTERS_ARR)/sizeof(char));
I was hoping that this function would take in a string representing a sentence and return a vector of strings that are the individual words in the sentence.
std::vector<std::string> get_sntnc_wrds(std::string S) {
std::vector<std::string> retvec;
std::string::iterator it = S.begin();
while (it != S.end()) {
if (LETTERS_SET.count(*it) == 1) {
std::string str(1,*it);
int k(0);
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1) == 1))) {
str.push_back(*(it + (++k)));
}
retvec.push_back(str);
it += k;
}
else {
++it;
}
}
return retvec;
}
For instance, the following call should return a vector of the strings "Yo", "dawg", etc.
std::string mystring("Yo, dawg, I heard you life functions, so we put a function inside your function so you can derive while you derive.");
std::vector<std::string> mystringvec = get_sntnc_wrds(mystring);
But everything isn't going as planned. I tried running my code and it was putting the entire sentence into the first and only element of the vector. My function is very messy code and perhaps you can help me come up with a simpler version. I don't expect you to be able to trace my thought process in my pitiful attempt at writing that function.
Try this instead:
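#include <vector>
#include <cctype>
#include <string>
#include <algorithm>
using std::find_if;
using std::string;
using std::vector;
// true if the argument is whitespace, false otherwise
bool space(char c)
{
// cast first: passing a negative char to isspace is undefined behaviour
return std::isspace(static_cast<unsigned char>(c)) != 0;
}
// false if the argument is whitespace, true otherwise
bool not_space(char c)
{
return !space(c);
}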
vector<string> split(const string& str)
{
typedef string::const_iterator iter;
vector<string> ret;
iter i = str.begin();
while (i != str.end())
{
// ignore leading blanks
i = find_if(i, str.end(), not_space);
// find end of next word
iter j = find_if(i, str.end(), space);
// copy the characters in [i, j)
if (i != str.end())
ret.push_back(string(i, j));
i = j;
}
return ret;
}
The split function will return a vector of strings, each element containing one word.
This code is taken from the Accelerated C++ book, so it's not mine, but it works. There are other superb examples of using containers and algorithms for solving every-day problems in this book. I could even get a one-liner to show the contents of a file at the output console. Highly recommended.
It's just a bracketing issue. My advice is to (almost) never put in more brackets than necessary; it only confuses things.
while (it+k+1 != S.end() && LETTERS_SET.count(*(it+k+1)) == 1) {
Your code compares the character with 1, not the return value of count.
Also, although count does return an integer, in this context I would simplify further and treat the return value as a boolean:
while (it+k+1 != S.end() && LETTERS_SET.count(*(it+k+1))) {
You should use a string stream with std::copy, like so:
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <vector>
int main() {
std::string sentence = "And I feel fine...";
std::istringstream iss(sentence);
std::vector<std::string> split;
std::copy(std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>(),
std::back_inserter(split));
// This is to print the vector
for(auto iter = split.begin();
iter != split.end();
++iter)
{
std::cout << *iter << "\n";
}
}
I would use a simpler approach based on the member functions of class std::string. For example:
const char LETTERS[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
std::string s( "This12 34is 56a78 test." );
std::vector<std::string> v;
for ( std::string::size_type first = s.find_first_of( LETTERS, 0 );
first != std::string::npos;
first = s.find_first_of( LETTERS, first ) )
{
std::string::size_type last = s.find_first_not_of( LETTERS, first );
v.push_back(
std::string( s, first, last == std::string::npos ? std::string::npos : last - first ) );
first = last;
}
for ( const std::string &s : v ) std::cout << s << ' ';
std::cout << std::endl;
You made two mistakes here, which I have corrected in the following code.
First, the inner loop condition should be
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1)) == 1))
Second, the iterator should advance past the whole word with
it += (k+1);
The complete code is
std::vector<std::string> get_sntnc_wrds(std::string S) {
std::vector<std::string> retvec;
std::string::iterator it = S.begin();
while (it != S.end()) {
if (LETTERS_SET.count(*it) == 1) {
std::string str(1,*it);
int k(0);
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1)) == 1)) {
str.push_back(*(it + (++k)));
}
retvec.push_back(str);
it += (k+1);
}
else {
++it;
}
}
return retvec;
}
The output has been tested.
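Since the question also asks for a simpler version, here is one possible sketch that keeps the letter-set definition of a word but lets std::find_if locate the word boundaries. The names LETTERS and get_sentence_words are made up for this example. The set is built from a std::string so that the terminating '\0' of the array is not included by accident.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// The letters that may appear in a word, as in the question.
const std::string LETTERS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const std::set<char> LETTERS_SET(LETTERS.begin(), LETTERS.end());

// Hypothetical simplified version: returns the maximal runs of letters in `s`.
std::vector<std::string> get_sentence_words(const std::string& s) {
    const auto is_letter = [](char c) { return LETTERS_SET.count(c) == 1; };
    const auto not_letter = [](char c) { return LETTERS_SET.count(c) == 0; };
    std::vector<std::string> words;
    auto it = std::find_if(s.begin(), s.end(), is_letter);        // start of the first word
    while (it != s.end()) {
        const auto word_end = std::find_if(it, s.end(), not_letter);
        words.emplace_back(it, word_end);                         // copy the range [it, word_end)
        it = std::find_if(word_end, s.end(), is_letter);          // start of the next word
    }
    return words;
}
```

There is no index arithmetic on the iterator here, so neither the misplaced bracket nor the off-by-one advance can occur.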