Extracting all words separated by non-alphabetical characters


Given a string such as:
std::string word = "Hello World!it's#.-/sunday";
I want to extract all the words, which are separated by non-alphabetical characters, and store them in a container such as a vector. For this input the result would be:
Hello, World, it, s, sunday
My initial try was to use isalpha() along with an index count to substr() accordingly like so:
std::string word = "Hello World!it's#.-/sunday";
int count = 0, index = 0;
std::string temp;
for (; count < word.length();count++){
if (!isalpha(word[count])){
temp = word.substr(index,count);
index = count;
}
}
But this does not work as I thought it would as the index does not update fast enough resulting in words mixed with non-alphabetical characters. Is there perhaps a function or a better way to extract said words?

You can try the following code. It constructs the words char by char.
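#include <cctype>
#include <iostream>
#include <string>
#include <vector>
int main()
{
    std::string phrase = "Hello World!it's#.-/sunday";
    std::vector<std::string> words;
    std::string tmp;
    for (const char c : phrase)
    {
        // Cast to unsigned char: passing a negative char to isalpha is undefined.
        if (std::isalpha(static_cast<unsigned char>(c)))
        {
            tmp.push_back(c);
        }
        else if (!tmp.empty())
        {
            // A separator ends the current word; runs of separators are skipped.
            words.push_back(tmp);
            tmp.clear();
        }
    }
    if (!tmp.empty())
    {
        words.push_back(tmp); // last word, if the phrase ends with a letter
    }
    for (const auto &w : words)
    {
        std::cout << w << '\n';
    }
    return 0;
}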

One easy way is to replace every non-alphabetical character with a space and
then split on spaces.
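The replace-then-split idea could be sketched roughly as follows. This is only an illustration, not the answerer's code: the function name split_alpha is made up here, and plain ASCII input is assumed.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the runs of alphabetic characters in `text`.
std::vector<std::string> split_alpha(std::string text)
{
    // Step 1: turn every non-alphabetical character into a space.
    for (char &c : text)
    {
        if (!std::isalpha(static_cast<unsigned char>(c)))
        {
            c = ' ';
        }
    }
    // Step 2: operator>> skips runs of whitespace, so no empty words appear.
    std::istringstream iss(text);
    std::vector<std::string> words;
    std::string word;
    while (iss >> word)
    {
        words.push_back(word);
    }
    return words;
}
```

Called with the string from the question, split_alpha should give the five words Hello, World, it, s and sunday.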

Related

C++ reverse a string but printing numbers first

I was given a project in class and almost have it finished. I am required to take a string of numbers and letters and return that string with the numbers printed first, followed by the letters in reverse order (e.g. abc123 should return 123cba). As of now my code returns a string with the numbers first and the letters in their original order (e.g. abc123 returns 123abc). I would be able to do this with two loops; however, the assignment asks that my code only iterate through the initial string one time. Here is the code I have so far...
#include <iostream>
#include <string>
#include "QueType.h"
#include "StackType.h"
using namespace std;
int main ()
{
QueType<char> myQueue;
StackType<char> myStack;
string myString="hello there123";
char curchar;
string numbers, letters;
for (int i = 0; i < myString.length(); i++) {
if (isdigit(myString.at(i))) {
myQueue.Enqueue(myString.at(i));
myQueue.Dequeue(curchar);
numbers += curchar;
//cout<<numbers<<endl;
}
else if (islower(myString.at(i))) {
myStack.Push(myString.at(i));
curchar = myStack.Peek();
myStack.Pop();
letters += curchar;
//cout<<curchar<<endl;
}
}
cout<<(myString = numbers + letters)<<endl;
}
In my code, I have two .h files that set up a stack and a queue. With the given string, the code loops through the string looking to see if it sees a letter or number. With a number the spot in the string is then saved to a queue, and with a letter it is saved to the stack.
The only other way i can think of reversing the order of the letters is in the if else statement instead of having char = myStack.Peek() every loop, change it to char += myStack.Peek() however I get weird lettering when that happens.

Since you already have the string of letters, you can simply reverse it.
// in-place version:
void reverse_str(std::string& in)
{
    std::reverse(in.begin(), in.end());
}
// copy version:
std::string reverse_str(std::string in)
{
    std::reverse(in.begin(), in.end());
    return in;
}
In your case the in-place version is the best match.
In other cases (e.g. when you want to preserve the original string) the copy version is preferred. Define only one of the two: with both overloads declared, the call reverse_str(letters) below is ambiguous.
Here is an example to make it as clear as possible (std::reverse needs <algorithm>):
int main()
{
    std::string inputstr = "123abc";
    std::string numbers{};
    std::string letters{};
    for(auto c : inputstr)
    {
        if(isdigit(c))
            numbers += c;
        else
            letters += c;
    }
    reverse_str(letters); // using the in-place version
    std::cout << numbers + letters;
}

Here's my take. It only loops through the string once. I don't have your types, so I'm just using the std versions.
std::string output;
output.reserve( myString.size() );
std::stack<char> stack;
for ( char c : myString ) {
if ( std::isdigit( c ) ) // if it's a number, just add it to the output
output.push_back( c );
else // otherwise, add the character to the stack
stack.push( c );
}
// string is done being processed, so use the stack to get the
// other characters in reverse order
while ( !stack.empty() ) {
output.push_back( stack.top() );
stack.pop();
}
std::cout << output;
working example: https://godbolt.org/z/eMazcGsMf
Note: wasn't sure from your description how to handle characters other than letters and numbers, so treated them the same as letters.

One way to do this is as follows:
Version 1
#include <iostream>
#include <string>
int main() {
std::string s = "abc123";
std::string output;
output.resize(s.size());
int i = output.length() - 1;
int j = 0;
for(char &c: s)
{
if(!std::isdigit(c))
{
output.at(i) = c;
--i;
}
else
{
output.at(j) = c;
++j;
}
}
std::cout<<output<<std::endl;
}
You can also use iterators in the above program to obtain the desired result as shown in version 2.
Version 2
#include <iostream>
#include <string>
int main() {
std::string s = "abfsc13423";
std::string output;
output.resize(s.size());
std::string::reverse_iterator iter = output.rbegin();
std::string::iterator begin = output.begin();
for(char &c: s)
{
if(!std::isdigit(c))
{
*iter = c;
++iter;
}
else
{
*begin = c;
++begin;
}
}
std::cout<<output<<std::endl;
}

C++: English to Pig Latin

I am receiving an error from the following code when I try to dynamically allocate the array (seen after my attempt at incrementing through each letter in the user's array using the bool flag). This is the error:
main.cpp: In function ‘Word* splitSentene(std::string, int&)’:
main.cpp:81:32: error: cannot convert ‘std::string* {aka std::basic_string*}’ to ‘Word*’ in assignment
words = new string[i];
I am trying to count how many words the user inputs and dynamically allocate an array for the string of words. This is my code thus far:
#include <iostream>
#include <cctype>
#include <string>
using namespace std;
struct Word
{
string english; // English sentence
string piglatin; // Pig latin sentence
};
// PT 1. Function prototype
Word * splitSentence(const string words, int &size){};
int main()
{
string userSentence;
int size;
// Get the users sentence to convert to pig latin
cout << "Please enter a string to convert to pig latin:\n";
getline(cin, userSentence);
// Directs to Word * splitSentence function
Word* tempptr = splitSentence(userSentence, size);
delete [] tempptr;
return 0;
}
//PT 1. Analyze the sentence
Word * splitSentene(const string words, int &size)
{
bool flag = true;
int num = 0;
for (int i = 0; i < words.length() + 1; i++)
{
//test for white space, then when you hit the first alphabetical character after a space,
//increment up the size of the array
if (isspace(words[i]))
flag = true;
if (isalpha(words[i]));
{
if (flag == true)
{
flag = false;
cout << words[i++];
}
}
// Dynamically allocate the array for the words
Word *sentence = nullptr;
sentence = new string[i];
}
}
Here are the pt 1 instructions for further clarification:
PT. 1) Write a function that takes in an English sentence as one string. This function should first calculate how many “words” are in the sentence (words being substrings separated by whitespace). It should then allocate a dynamic array of size equal to the number of words. The array contains Word structures (i.e. array of type Word). The function would then store each word of that sentence to the english field of the corresponding structure. The function should then return this array to the calling function using the return statement, along with the array size using a reference parameter.
This function should also remove all capitalization and special characters other than letters. Implement the function with the following prototype:
Word * splitSentence(const string words, int &size);
This is my first post here, so I will appreciate any input on how to dynamically allocate the array and format it (if I have successfully coded how to count the words in the sentence the user inputs). If more information needs to be provided, let me know!

The compiler error is because you are trying to assign a string[] array to a Word* pointer. You need to allocate a Word[] array instead.
You also have other errors in your code:
You have an erroneous {} at the end of the declaration of splitSentence().
You misspelled splitSentence() as splitSentene() in its definition.
You have an erroneous ; on if (isalpha(words[i]));
You are not returning the array that you allocate.
In fact, you are not even following the instructions properly at all. You are not "calculating how many words are in the sentence" BEFORE allocating the array (you tried, but you are doing the allocation in the wrong place), and you are not filling the array at all, let alone "removing all capitalization and special characters other than letters".
Try something more like this:
#include <iostream>
#include <cctype>
#include <string>
using namespace std;
struct Word
{
string english; // English sentence
string piglatin; // Pig latin sentence
};
// PT 1. Function prototype
Word* splitSentence(const string words, int &size);
int main()
{
string userSentence;
int size;
// Get the users sentence to convert to pig latin
cout << "Please enter a string to convert to pig latin:\n";
getline(cin, userSentence);
// Directs to Word * splitSentence function
Word* tempptr = splitSentence(userSentence, size);
delete [] tempptr;
return 0;
}
//PT 1. Analyze the sentence
Word* splitSentence(const string words, int &size)
{
bool flag = true;
int num = 0;
char ch;
for (int i = 0; i < words.length(); ++i)
{
ch = words[i];
if (isalpha(ch))
{
if (flag)
{
flag = false;
++num;
}
}
else if (isspace(ch))
{
flag = true;
}
}
Word *sentence = new Word[num];
int index = -1;
flag = true;
num = 0;
for (int i = 0; i < words.length(); ++i)
{
ch = words[i];
if (isalpha(ch))
{
if (flag)
{
flag = false;
++num;
++index;
}
if (isupper(ch))
{
ch = tolower(ch);
}
sentence[index].english += ch;
}
else if (isspace(ch))
{
flag = true;
}
}
size = num;
return sentence;
}
That being said, splitSentence() would be much easier to implement if you could use std::istringstream, std::vector and other C++ idioms instead of C idioms, e.g.:
#include <iostream>
#include <sstream>
#include <cctype>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
struct Word
{
string english; // English sentence
string piglatin; // Pig latin sentence
};
// PT 1. Function prototype
Word* splitSentence(const string words, int &size);
int main()
{
string userSentence;
int size;
// Get the users sentence to convert to pig latin
cout << "Please enter a string to convert to pig latin:\n";
getline(cin, userSentence);
// Directs to Word * splitSentence function
Word* tempptr = splitSentence(userSentence, size);
delete [] tempptr;
return 0;
}
//PT 1. Analyze the sentence
Word* splitSentence(const string words, int &size)
{
istringstream iss(words);
vector<string> vec;
string s;
while (iss >> s)
{
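// remove_if only shifts the kept characters forward; erase() drops the leftover tail
s.erase(remove_if(s.begin(), s.end(),
[](unsigned char ch){ return !isalpha(ch); }), s.end());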
if (!s.empty())
{
transform(s.begin(), s.end(), s.begin(),
[](unsigned char ch){ return tolower(ch); });
vec.push_back(s);
}
}
Word *sentence = new Word[vec.size()];
transform(vec.begin(), vec.end(), sentence,
[](const string &s){
Word w;
w.english = s;
return w;
}
);
size = vec.size();
return sentence;
}

How to split a sentence of any length into words and store them into variables c++

I need some help writing a function that splits a sentence into words; it should work on sentences of different lengths.
Here is the sample code:
void spilt_sentence(string sentence)
{}
int main()
{
std::string sentence1= "Hello everyone";
std::string sentence2= "Hello I am doing stuff";
split_sentence(sentence1);
split_sentence(sentence2);
return 0;
}
I saw someone use std::istringstream to get every word before each space, but I don't really know how it works. It gives me an error when I put std::istringstream ss(sentence); in the code. Also, I am using C++98 and I compile my program with Cygwin. Any leads? Thank you.
Edit: The function will create a number of variables depending on how many words are there in the sentence.
Edit: I am actually working on a LinkedList program and what I am trying to do here is split sentence into words and then generate new nodes containing each word.
Here is the actual code (note: I modified it a little bit so it's not exactly the same as my actual one. Also I am not using struct for Node) and let's say sentence 1 is "Hello everyone" and sentence 2 is "Hello I am doing stuff".
The expected output will be:
linkedlist1:
"hello"<->"everyone"
linkedlist2:
"hello"<->"I"<->"am"<->"doing"<->"stuff"
inside LinkedList.cpp:
void LinkedList::add(std::string sentence)
{
//breaks down the sentence into words
std::istringstream ss(sentence);
do
{
std::string word;
ss >> word;
//store them in nodes in a linkedlist
Node* new_tail = new Node(word);
if (size == 0)
{
head = new_tail;
tail = new_tail;
}
else
{
new_tail->set_previous(tail);
tail->set_next(new_tail);
tail = new_tail;
}
new_tail = NULL;
size++;
}
while(ss);
}
[FIXED] An error message pops up when I compile it, saying std::istringstream ss has default settings but the type is incomplete. What should I do?

Here is a function that uses streams. It stores the words in a vector; it does not work with plain arrays as written, but you can adapt it if you need that.
Here is the code with a usage example:
#include <algorithm>
#include <cctype>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
void split_sentence(const string& str, vector<string>& cont)
{
    // Split at whitespace: each extracted token is appended to cont.
    istringstream iss(str);
    copy(istream_iterator<string>(iss),
         istream_iterator<string>(),
         back_inserter(cont));
    // Remove punctuation marks from every word, wherever they occur.
    for(string::size_type i = 0; i < cont.size(); i++){
        string cleaned;
        for(string::size_type j = 0; j < cont[i].length(); j++){
            if(!ispunct(static_cast<unsigned char>(cont[i][j]))){
                cleaned += cont[i][j];
            }
        }
        cont[i] = cleaned;
    }
}
int main(){
string sentence = "this is a test sentence for stackoverflow!";
vector<string> words;
split_sentence(sentence, words);
for(int i = 0, sz = words.size(); i < sz; i++){
cout<<words.at(i) << endl;
}
return 0;
}
And this is the output
this
is
a
test
sentence
for
stackoverflow
If you also want to keep the punctuation marks, remove the nested for loop from the function.
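For the linked-list loop in the question, a common pattern is to use the extraction itself as the loop condition. The do/while form in the question runs its body once more after the last successful read, which stores an extra empty word. An "incomplete type" error for std::istringstream typically means that <sstream> was not included. The sketch below is written to compile as C++98; the name split_words is hypothetical, and a vector stands in for the question's list nodes.

```cpp
#include <sstream>  // defines std::istringstream
#include <string>
#include <vector>

// Hypothetical helper, kept C++98-compatible (no auto, no range-for).
std::vector<std::string> split_words(const std::string &sentence)
{
    std::istringstream ss(sentence);
    std::vector<std::string> words;
    std::string word;
    // The body only runs after a successful read, so no empty
    // trailing word is stored when the stream runs out of input.
    while (ss >> word)
    {
        words.push_back(word); // in the question, a new Node would be created here
    }
    return words;
}
```

In LinkedList::add the same while (ss >> word) condition could wrap the node-creation code unchanged.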

Using isalpha function with string pointers

Hey, I'm quite new to programming and I'm having trouble using the isalpha function in my programme. This is part of the code for a palindrome class. What I'm trying to do is remove all the non-alphabetic characters from the input. So if the user inputs "Hi, How are you" I need to first count the size of the array of just the letters, and then in my removeNonLetters function I need to get rid of the non-alphabetical characters. Can someone please help me with this? Thank you so much!
#include <iostream>
#include <string>
#include <stdio.h>
#include <algorithm>
#include <cctype>
#include <cstring>
#include <ctype.h>
using namespace std;
class palindrome
{
private:
int only_letters_size;
string input_phrase;
string* only_letters;
public:
string inputPhrase();
string removeNonLetters();
string* new_Array;
int size_new_Array;
};
string palindrome::inputPhrase()
{
cout << "Input phrase: "; //asks the user for the input
getline(cin,input_phrase);
size_new_Array = input_phrase.length(); //creating a dynamic array to store the input phrase
new_Array = new string[size_new_Array];
int i;
for (i=0; i<size_new_Array; i++)
{
new_Array[i]=input_phrase[i];
}
only_letters_size = 0;
while(new_Array[i])
{
if (isalpha(new_Array[i])) //PROBLEM OCCURS HERE
{
only_letters_size=only_letters_size+1;
}
}
cout << only_letters_size << endl;
return new_Array;
}
string palindrome::removeNonLetters()
{
int j=0;
int str_length = new_Array.length(); //string length
only_letters = new string[only_letters_size];
for (int i=0;i<size_new_Array;i++) //PROBLEM OCCURS HERE AS WELL
{
if (isalpha(new_Array[i]))//a command that checks for characters
{
only_letters[j] = new_Array[i];//word without non alphabetical characters is stored to new variable
j++;
}
}
cout << only_letters << endl;
return only_letters;
}

I've found the best way to determine if a string is a palindrome is to walk toward the center from both sides. In your case I would just opt to skip non-alpha characters like so.
bool is_palindrome(string mystring)
{
int start = 0, end = mystring.length() - 1;
while (start < end)
{
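// Skip over non-alpha characters; the start < end checks keep both
// indices inside the string even if it contains no letters at all
while (start < end && !isalpha(mystring[start]))
{
start++;
}
while (start < end && !isalpha(mystring[end]))
{
end--;
}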
if (tolower(mystring[start]) != tolower(mystring[end]))
{
return false;
}
else
{
start++;
end--;
}
}
return true;
}
If you must save the input first and remove nonalpha characters, I would do it like this.
string remove_non_alpha(string mystring)
{
string ret_string = "";
for (int i = 0; i < mystring.length(); i++)
{
if (isalpha(mystring[i]))
{
ret_string += tolower(mystring[i]);
}
}
return ret_string;
}
And then feed the result into the above function.
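A compact sketch of the clean-then-check route might look like this. It is not the answerer's code: the names letters_only and is_clean_palindrome are invented for illustration, and instead of the two-index walk it compares the first half of the cleaned string with its reverse using std::equal.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: keeps only the letters of `s`, lower-cased.
std::string letters_only(const std::string &s)
{
    std::string out;
    for (char c : s)
    {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (std::isalpha(uc))
        {
            out += static_cast<char>(std::tolower(uc));
        }
    }
    return out;
}

// Hypothetical helper: a cleaned palindrome reads the same from both ends,
// so its first half must match the string read backwards.
bool is_clean_palindrome(const std::string &phrase)
{
    const std::string cleaned = letters_only(phrase);
    return std::equal(cleaned.begin(),
                      cleaned.begin() + cleaned.size() / 2,
                      cleaned.rbegin());
}
```

With this sketch, "A man, a plan, a canal: Panama" is accepted, while "Hi, How are you" is rejected.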

Sorry for being harsh, but you're doing far too much copying around. You can achieve all this with one single loop after retrieving your data, and all on one single string object (unless you want to keep the original input for some other purpose):
getline(cin,input_phrase);
std::string::iterator pos = input_phrase.begin();
for(char c : input_phrase)
{
if(isalpha(c))
{
*pos++ = tolower(c);
}
}
input_phrase.erase(pos, input_phrase.end());
After that, your string is ready to use...
Explanation:
std::string::iterator pos = input_phrase.begin();
An iterator is something similar to a pointer into the internal data of the string. We use it to remember the position to move the next alphabetic character to, skipping the non-alpha ones.
for(char c : input_phrase)
Simply iterating over all characters...
if(isalpha(c))
The essential check, is the current character an alpha one?
*pos++ = tolower(c);
If so, convert it to lower case immediately. Assign it to the current string position, and advance the "pointer" (iterator!).
input_phrase.erase(pos, input_phrase.end());
Finally, drop the remaining part of the string, which is occupied by surplus characters. You might notice that it can contain characters you wanted to keep, but those have already been copied to a position further to the left...

How to extract words out of a string and store them in different array in c++

How to split a string and store the words in a separate array without using strtok or istringstream and find the greatest word?? I am only a beginner so I should accomplish this using basic functions in string.h like strlen, strcpy etc. only. Is it possible to do so?? I've tried to do this and I am posting what I have done. Please correct my mistakes.
#include<iostream.h>
#include<stdio.h>
#include<string.h>
void count(char n[])
{
char a[50], b[50];
for(int i=0; n[i]!= '\0'; i++)
{
static int j=0;
for(j=0;n[j]!=' ';j++)
{
a[j]=n[j];
}
static int x=0;
if(strlen(a)>x)
{
strcpy(b,a);
x=strlen(a);
}
}
cout<<"Greatest word is:"<<b;
}
int main( int, char** )
{
char n[100];
gets(n);
count(n);
}

The code in your example looks like it's written in C. Functions like strlen and strcpy originate in C (although they are also part of the C++ standard library, for compatibility, via the header cstring).
You should start learning C++ using the Standard Library and things will get much easier. Things like splitting strings and finding the greatest element can be done using a few lines of code if you use the functions in the standard library, e.g:
// The text
std::string text = "foo bar foobar";
// Wrap text in stream.
std::istringstream iss{text};
// Read tokens from stream into vector (split at whitespace).
std::vector<std::string> words{std::istream_iterator<std::string>{iss}, std::istream_iterator<std::string>{}};
// Get the greatest word.
auto greatestWord = *std::max_element(std::begin(words), std::end(words), [] (const std::string& lhs, const std::string& rhs) { return lhs.size() < rhs.size(); });
Edit:
If you really want to dig down into the nitty-gritty using only functions from std::string, here's how you can split the text into words (I leave finding the greatest word to you, which shouldn't be too hard):
// Use vector to store words.
std::vector<std::string> words;
std::string text = "foo bar foobar";
std::string::size_type beg = 0, end;
do {
end = text.find(' ', beg);
if (end == std::string::npos) {
end = text.size();
}
words.emplace_back(text.substr(beg, end - beg));
beg = end + 1;
} while (beg < text.size());

I would write two functions. The first one skips blank characters for example
const char * SkipSpaces( const char *p )
{
while ( *p == ' ' || *p == '\t' ) ++p;
return ( p );
}
And the second one copies non blank characters
const char * CopyWord( char *s1, const char *s2 )
{
while ( *s2 != ' ' && *s2 != '\t' && *s2 != '\0' ) *s1++ = *s2++;
*s1 = '\0';
return ( s2 );
}
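One way the two functions might be combined to find the longest word is sketched below. The driver LongestWord is a made-up name, and it assumes that the input and every word fit into 100-character buffers, like the array in the question; nothing here checks those bounds.

```cpp
#include <cstring>

// Skips blanks and tabs; returns a pointer to the first other character.
const char *SkipSpaces(const char *p)
{
    while (*p == ' ' || *p == '\t') ++p;
    return p;
}

// Copies one word from s2 into s1; returns a pointer just past that word.
const char *CopyWord(char *s1, const char *s2)
{
    while (*s2 != ' ' && *s2 != '\t' && *s2 != '\0') *s1++ = *s2++;
    *s1 = '\0';
    return s2;
}

// Hypothetical driver: writes the longest word of `text` into `best`.
// Assumption: `text` is shorter than 100 characters and `best` can hold it.
void LongestWord(const char *text, char *best)
{
    char word[100];
    best[0] = '\0';
    const char *p = SkipSpaces(text);
    while (*p != '\0')
    {
        p = CopyWord(word, p);              // word is freshly terminated each time
        if (std::strlen(word) > std::strlen(best))
        {
            std::strcpy(best, word);        // keep the first longest word on ties
        }
        p = SkipSpaces(p);
    }
}
```

Because CopyWord terminates the buffer on every call, no leftovers from a previous, longer word remain in word.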

Try reading each word into a small array (presumably no word is longer than 35 characters). A word is the text between two successive spaces. Pass that array to strlen() and compare the result with the longest word so far: if the previous word was longer, drop the new word, otherwise keep it.
Do not forget to reset the word array with '\0' (the null character) after every word you capture, or this will happen:
Say the 1st word in the array was 'happen' and the 2nd was 'to'. If you don't reset it, the array will contain after the 1st capture:
happen
and after the 2nd capture:
*to*ppen

Try this. Here ctr will be the number of elements in the vector of individual words of the sentence. You can split the sentence at any character you want by changing the function call in main.
#include<iostream>
#include<string>
#include<vector>
using namespace std;
void split(string s, char ch){
    vector<string> vec;
    string tempStr;
    int ctr{};
    for(char c : s){
        if(c == ch){
            // Delimiter reached: store the word collected so far.
            vec.push_back(tempStr);
            ctr++;
            tempStr.clear();
        }
        else{
            tempStr += c;
        }
    }
    // Store the last word, which is not followed by a delimiter.
    vec.push_back(tempStr);
    ctr++;
    for(const string &S : vec)
        cout<<S<<endl;
}
int main(){
string s;
getline(cin, s);
split(s, ' ');
return 0;
}
