Interview Question : Trim multiple consecutive spaces from a string - c++

This is an interview question.
I am looking for an optimal solution that collapses runs of consecutive spaces in a string into a single space. The operation should be done in place.
input = "I    Like    StackOverflow a      lot"
output = "I Like StackOverflow a lot"
String functions are not allowed, as this is an interview question. I am looking for an algorithmic solution to the problem.

Does using <algorithm> qualify as "algorithmic solution"?
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
struct BothAre
{
char c;
BothAre(char r) : c(r) {}
bool operator()(char l, char r) const
{
return r == c && l == c;
}
};
int main()
{
std::string str = "I Like StackOverflow a lot";
std::string::iterator i = std::unique(str.begin(), str.end(), BothAre(' '));
std::copy(str.begin(), i, std::ostream_iterator<char>(std::cout, ""));
std::cout << '\n';
}
test run: https://ideone.com/ITqxB

A C++0x solution using a lambda instead of a regular function object. Compare to Cubbi's solution.
#include <string>
#include <algorithm>
int main()
{
std::string str = "I Like StackOverflow a lot";
str.erase(std::unique(str.begin(), str.end(),
[](char a, char b) { return a == ' ' && b == ' '; } ), str.end() );
}

Keep two indices: The next available spot to put a letter in (say, i), and the current index you're examining (say, j).
Just loop over all the characters with j, and whenever you see a letter, copy it to index i, then increment i. If you see a space that was not preceded by a space, also copy the space.
I think this would work in-place...

I'd just go with this:
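#include <stdio.h>
int main(int argc, char* argv[])
{
    char *f, *b, arr[] = "  This  is    a   test.  ";
    f = b = arr;
    do
    {
        // skip a space while the next character is also a space
        while(*f == ' ' && *(f+1) == ' ') f++;
    } while ((*b++ = *f++)); // copy one char; stops after copying the terminator
    printf("%s", arr);
    return 0;
}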

I'd propose a little state machine (just a simple switch statement). Because if the interviewer is anything like me, the first enhancement they'll ask you to do is to fully trim any leading or trailing spaces, so that:
"   leading   and  trailing   "
gets transformed to:
"leading and trailing"
instead of:
" leading and trailing "
This is a really simple modification to a state-machine design, and to me the state-machine logic seems easier to understand in general than a straightforward coded loop, even if it takes a few more lines of code.
And if you argue that the modifications to the straightforward loop wouldn't be too bad (which can be reasonably argued), then I (as the interviewer) would throw in that I also want leading zeros from numbers to be trimmed.
On the other hand, a lot of interviewers might actually dislike a state-machine solution as being 'non-optimal'. I guess it depends on what you're trying to optimize.
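The answer gives no code, so here is one possible shape for such a state machine, as a sketch only. The function name `trim_and_collapse`, the two state names, and the choice of `std::string` are assumptions made for illustration. It collapses inner runs of spaces and also drops leading and trailing ones.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: collapse inner runs of spaces and drop
// leading/trailing spaces, in place, with a two-state machine.
void trim_and_collapse(std::string& s) {
    enum class State { Skipping, InWord };
    State state = State::Skipping;   // before the first word
    std::size_t write = 0;           // next free slot in s
    for (std::size_t read = 0; read < s.size(); ++read) {
        const char c = s[read];
        switch (state) {
        case State::Skipping:
            if (c != ' ') {
                // A word starts: emit one separator unless it is the first word.
                if (write != 0) s[write++] = ' ';
                s[write++] = c;
                state = State::InWord;
            }
            break;
        case State::InWord:
            if (c == ' ') {
                state = State::Skipping;  // defer the separator until a word follows
            } else {
                s[write++] = c;
            }
            break;
        }
    }
    s.resize(write);
}
```

Deferring the separator until the next word actually starts is what removes trailing spaces for free. A further rule, such as the leading-zero trimming mentioned above, would become one more state rather than extra flags.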

Here it is using only stdio:
#include <stdio.h>
int main(void){
char str[] = "I Like StackOverflow a lot";
int i, j = 0, lastSpace = 0;
for(i = 0;str[i]; i++){
if(!lastSpace || str[i] != ' '){
str[j] = str[i];
j++;
}
lastSpace = (str[i] == ' ');
}
str[j] = 0;
puts(str);
return 0;
}

Trimming multiple spaces also means a space should always be followed by a non space character.
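char str[] = "I    Like    StackOverflow a      lot";
int length = strlen(str);   // measure once, before the buffer is modified
int pack = 0;               // index of the last character kept so far
for (int iter = 1; iter < length; iter++)
{
    if (str[pack] == ' ' && str[iter] == ' ')
        continue;           // skip a space that follows a kept space
    str[++pack] = str[iter];
}
if (length > 0)
    str[++pack] = '\0';     // terminate with a character, not the pointer constant NULL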

int j = 0;
char str[] = "I    Like    StackOverflow a      lot";
int length = strlen(str);
char str2[38]; // separate output buffer, so this version is not in place
for (int i = 0; i < length; i++)
{
    // drop a space when the next character is also a space
    if (str[i] == ' ' && str[i+1] == ' ')
        continue;
    str2[j] = str[i];
    j++;
}
str2[j] = '\0';
cout<<str2;

void trimspaces(char * str){
int i = 0;
while(str[i]!='\0'){
if(str[i]==' '){
for(int j = i + 1; j<strlen(str);j++){
if(str[j]!=' '){
memmove(str + i + 1, str + j, strlen(str)-j+1);
break;
}
}
}
i++;
}
}

Functional variant in Haskell:
import Data.List (intercalate)
trimSpaces :: String -> String
trimSpaces = intercalate " " . words
The algorithm is as follows:
break the string up into a list of words, which were delimited by white space
concatenate the list, inserting one space between each pair of elements
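For comparison with the C++ answers, the same two steps can be sketched in C++ with a string stream. This is a rough counterpart, not an in-place solution, and the name `trim_spaces` is only borrowed from the Haskell version for illustration.

```cpp
#include <sstream>
#include <string>

// Rough C++ counterpart of `intercalate " " . words`:
// split on whitespace, then join the words with single spaces.
std::string trim_spaces(const std::string& input) {
    std::istringstream in(input);
    std::string word;
    std::string result;
    while (in >> word) {                 // operator>> skips any run of whitespace
        if (!result.empty()) result += ' ';
        result += word;
    }
    return result;
}
```

Like the Haskell version, this also drops leading and trailing whitespace and treats tabs and newlines as separators, which goes beyond what the original question asks for.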

This is a very simple implementation that removes extra whitespace.
#include <iostream>
#include <string>
std::string trimExtraWhiteSpaces(std::string &str);
int main(){
std::string str = " Apple is a fruit and I like it . ";
str = trimExtraWhiteSpaces(str);
std::cout<<str;
}
std::string trimExtraWhiteSpaces(std::string &str){
std::string s;
bool first = true;
bool space = false;
std::string::iterator iter;
for(iter = str.begin(); iter != str.end(); ++iter){
if(*iter == ' '){
if(first == false){
space = true;
}
}else{
if(*iter != ',' && *iter != '.'){
if(space){
s.push_back(' ');
}
}
s.push_back(*iter);
space = false;
first = false;
}
}
return s;
}

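std::string tripString(std::string str) {
    std::string result = "";
    if (str.empty())
        return result; // otherwise str.length()-1 below would wrap around
    if (str[0] != ' ')
        result += str[0];
    for (unsigned i = 1; i < str.length()-1; i++) {
        // keep a space only when the previous character is not a space
        if (str[i] == ' ' && str[i-1] != ' ')
            result += ' ';
        else if (str[i] != ' ')
            result += str[i];
    }
    // the length check stops a one-character string from being added twice
    if (str.length() > 1 && str[str.length()-1] != ' ')
        result += str[str.length()-1];
    return result;
}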
This may be an implementation of the accepted idea.

Related

Fastest way to count words of string

How could I make this algorithm faster, and shorten this code, which counts the words of a given string?
int number_of_words(std::string &s) {
int count = 0;
for (int i = 0; i < s.length(); i++) {
// skip spaces
while (s[i] == ' ' && i < s.length())
i++;
if (i == s.length())
break;
// word found
count++;
// inside word
while (s[i] != ' ' && i < s.length())
i++;
}
return count;
}
Your code is quite alright, speed-wise. But if you want to make your code shorter, you may use find_first_not_of() and find_first_of standard functions, like I did in following code that solves your task.
I made an assumption that all your words are separated by only spaces. If other separators are needed you may pass something like " \r\n\t" instead of ' ' in both lines of my code.
One small optimization that can be made in your code: after the first while loop we are known to be on a non-space character, so a ++i; line can be added for free before the second loop, saving one check inside it. After the second while loop we are on a space character (or at the end of the string), and the for loop's own i++ already skips it, so no extra increment is needed there; adding one would skip a character that might start the next word. The speed gain is tiny.
#include <iostream>
#include <string>
int number_of_words(std::string const & s) {
ptrdiff_t cnt = 0, pos = -1;
while (true) {
if ((pos = s.find_first_not_of(' ', pos + 1)) == s.npos) break;
++cnt;
if ((pos = s.find_first_of(' ', pos + 1)) == s.npos) break;
}
return cnt;
}
int main() {
std::cout << number_of_words(" abc def ghi ") << std::endl;
}
Output:
3
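The answer above mentions passing a separator set such as " \r\n\t" instead of ' '. A minimal sketch of that variant follows; the function name `count_words_delims` and the `delims` parameter with its default are assumptions for illustration, not part of the original answer.

```cpp
#include <string>

// Variant of the loop above that takes a set of separator characters.
// `delims` and its default value are illustrative assumptions.
int count_words_delims(const std::string& s,
                       const std::string& delims = " \r\n\t") {
    int count = 0;
    std::string::size_type pos = 0;
    // Jump to the start of the next word, if there is one.
    while ((pos = s.find_first_not_of(delims, pos)) != std::string::npos) {
        ++count;
        // Jump to the first separator after that word.
        pos = s.find_first_of(delims, pos);
        if (pos == std::string::npos) break;  // the word ran to the end
    }
    return count;
}
```

Using `std::string::size_type` for the position avoids the signed/unsigned comparison against `npos` that the `ptrdiff_t` version relies on.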

How to fill an array with a sentence?

For instance, I have the sentence "I am Piet". I want to split this sentence into an array so that arr[0] is "I", arr[1] is "am" and arr[2] is "Piet". Below is the code I have written. The problem is that the whole sentence ends up in every element of the array.
#include <iostream>
#include <string>
using namespace std;
// function to populate my array
void populateMyArray(string*myArray, string sentence, int size)
{
for (int i = 0; i < size; i++)
{
*myArray = sentence;
myArray++;
}
}
// function to count words in the sentence
int countWords(string x)
{
int Num = 0;
char prev = ' ';
for (unsigned int i = 0; i < x.size(); i++) {
if (x[i] != ' ' && prev == ' ') Num++;
prev = x[i];
}
return Num;
}
int main()
{
string sentence1;
cout << "Please enter a line of text:\n";
getline(cin, sentence1);
int nWords1 = countWords(sentence1);
string *arr1 = new string[nWords1];
populateMyArray(arr1, sentence1, nWords1); //populate array1
for (int i = 0; i < nWords1; i++)
{
cout << "sentence one: " << arr1[i] << "\n";
}
system("PAUSE");
}
You can use a vector to store the words: walk the string, build up the current word, and push it into the vector each time you reach a space.
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    string s;
    getline(cin,s);
    vector<string> ss;
    string temp = "";
    s +=" "; // sentinel so the last word is flushed too
    for(size_t i = 0 ; i < s.size();i ++){
        if(s[i] != ' ')
            temp += s[i];
        else if(!temp.empty()){ // skip empty tokens produced by repeated spaces
            ss.push_back(temp);
            temp = "";
        }
    }
    for(size_t i = 0 ; i < ss.size();i ++)
        cout << ss[i] <<" ";
}
Instead of using an array, use a std::vector instead. This way you don't have to worry about variable word sizes or overflowing anything in case a word or sentence is too long. Rather you can just do something like this:
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
int main() {
// Get all words on one line
std::cout << "Enter words: " << std::flush;
std::string sentence;
getline(std::cin, sentence);
// Parse words into a vector
std::vector<std::string> words;
std::string word;
std::istringstream iss(sentence);
while( iss >> word ) {
words.push_back(word);
}
// Test it out.
for(auto const& w : words) {
std::cout << w << std::endl;
}
}
For an example sentence of I like cats and dogs equally you will have: words[0] = I, words[1] = like and so on.
If I understood correctly you are trying to split the input sentence into words.
You could do it like this:
void populateMyArray(string *myArray, string sentence, int size)
{
int firstCharIndex = -1;
char prev = ' ';
for (unsigned int i = 0; i < sentence.size(); i++) {
// Find the first character index of current word
if (sentence[i] != ' ' && prev == ' ') {
firstCharIndex = i;
}
// Check if it's the end of current word
// and get substring from first to last index of current word
else if (sentence[i] == ' ' && prev != ' ') {
*myArray = sentence.substr(firstCharIndex, i - firstCharIndex);
myArray++;
}
prev = sentence[i];
}
// For the last word
if (firstCharIndex != -1 && sentence[sentence.size() - 1] != ' ') {
*myArray = sentence.substr(firstCharIndex, sentence.size() - firstCharIndex);
}
}
How to think like a programmer.
The first thing we need is definitions of the beginning of a word and the end of a word. You might think that the beginning of a word is a non-space preceded by a space and the end of a word is a non-space followed by a space. But those definitions are wrong because they ignore the possibility of words at the start or end of the string. The correct definition of the beginning of a word is a non-space at the start of the string or a non-space preceded by a space. Similarly the end of a word is a non-space at the end of the string or a non-space followed by a space.
Now that we have the definitions, we capture them in two functions. It's very important to break complex problems down into smaller pieces, and the way to do that is by writing functions (or classes).
bool beginning_of_word(string str, int index)
{
return str[index] != ' ' && (index == 0 || str[index - 1] == ' ');
}
bool end_of_word(string str, int index)
{
return str[index] != ' ' && (index == str.size() - 1 || str[index + 1] == ' ');
}
Now we're getting closer, but we still need the idea of finding the next start of word, or the next end of word, so we can loop through the sentence finding each word one at a time. Here are two functions for finding the next start and next end of word. They start from a given index and find the next index that is the start or end of a word. If no such index is found they return -1.
int next_beginning_of_word(string str, int index)
{
++index;
while (index < str.size())
{
if (beginning_of_word(str, index))
return index; // index is a start of word so return it
++index;
}
return -1; // no next word found
}
int next_end_of_word(string str, int index)
{
++index;
while (index < str.size())
{
if (end_of_word(str, index))
return index; // index is an end of word so return it
++index;
}
return -1; // no next word found
}
Now that we have a way of looping through the words in a sentence, we're ready to write the main loop. We use substr to break the words out of the sentence; substr takes two parameters, the index of the start of the word and the length of the word. We can get the length of the word by subtracting the start from the end and adding one.
int populateMyArray(string* array, string sentence)
{
// find the first word
int start = next_beginning_of_word(sentence, -1);
int end = next_end_of_word(sentence, -1);
int count = 0;
while (start >= 0) // did we find it?
{
// add to array
array[count] = sentence.substr(start, end - start + 1);
++count;
// find the next word
start = next_beginning_of_word(sentence, start);
end = next_end_of_word(sentence, end);
}
return count;
}
Now for extra credit we can rewrite countWords using next_beginning_of_word
int countWords(string sentence)
{
int start = next_beginning_of_word(sentence, -1);
int count = 0;
while (start >= 0)
{
++count;
start = next_beginning_of_word(sentence, start);
}
return count;
}
Notice the similarity of the countWords and the populateMyArray functions, the loops are very similar. That should give you confidence.
This is the way programmers work, when you have a problem that is too complex for you to handle, break it down into smaller pieces.

Counting words in a string

I am learning C++ on my own. I have written this program to count the number of words in a string. I know it's not the best way to do this, but this was what I could think of.
I am using spaces to count the number of words. Here is the problem.
countWords(""); // ok, 'x.empty()' identifies it as an empty string.
countWords(" "); // 'x.empty()' fails, function returns 1.
P.S. I want this program to not count symbols like "!" and "?" as words. Here is my code:
#include <iostream>
#include <string>
int countWords(std::string x);
int main() {
std::cout << countWords("Hello world!");
}
int countWords(std::string x) {
if(x.empty()) return 0; // if the string is empty
int Num = 1;
for(unsigned int i = 0; i < x.size(); i++) {
// if there is a space in the start
if(x[0] == ' ') continue;
// second condition makes sure that i don't count 2 spaces as 2 words
else if(x[i] == ' ' && x[i - 1] != ' ') Num++;
}
return Num;
}
Your function can be reduced to this:
int countWords(std::string x) {
int Num = 0;
char prev = ' ';
for(unsigned int i = 0; i < x.size(); i++) {
if(x[i] != ' ' && prev == ' ') Num++;
prev = x[i];
}
return Num;
}
Edit: To follow up comment:
Here is a simple way to replace other characters with ' ', though there might be a built-in method for this:
void replace(std::string &s, char replacer, std::set<char> &replacies)
{
for (int i=0; i < s.size(); i++)
if (replacies.count(s[i])) s[i] = replacer;
}
The problem with your code is that you are counting the words that are followed by a ' ' character. I believe you start with Num = 1 because otherwise the last word would not be counted. However, that only holds when the string you're analysing does not end with ' '. Otherwise you will count one word too many. The easiest way to fix this is to add
if(x.back() == ' ')
Num--;
right before returning the answer.
Your solution is insufficient. It will fail when applied with:
Leading spaces
Trailing spaces
Only spaces
Other forms of whitespace
You need to rethink how your algorithm should work as you simply need a more sophisticated method to cover all the use cases.
Or you could avoid reinventing the wheel and use what the standard library already provides, e.g.:
int countWords(const std::string& s) {
std::istringstream iss{s};
return std::distance(std::istream_iterator<std::string>{iss},
std::istream_iterator<std::string>{});
}
Here std::istringstream and std::istream_iterator are used to tokenize the string, and std::distance is used to get the number of tokens extracted.
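As a sketch, here is the same technique with the headers it needs, so that it can be checked against the four problem cases listed above. The name `count_words_stream` and the `std::ptrdiff_t` return type are choices made for this illustration.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Count whitespace-separated tokens by walking an istream_iterator range.
std::ptrdiff_t count_words_stream(const std::string& s) {
    std::istringstream iss{s};
    return std::distance(std::istream_iterator<std::string>{iss},
                         std::istream_iterator<std::string>{});
}
```

Leading spaces, trailing spaces, a string of only spaces, and tabs or newlines are all handled because operator>> skips every kind of whitespace before reading a token. Punctuation attached to a word stays part of that word.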
I found that a string stream works best:
#include <sstream>
#include <string>
int Count(const std::string &str)
{
    std::stringstream ss(str);
    std::string word; // a std::string instead of a fixed char buffer, so long words cannot overflow
    int Words = 0;
    while(ss >> word)
        Words++;
    return Words;
}
Input: " Hello my dear friend "
Output: 4
It will not fail even if applied with:
Leading spaces
Trailing spaces
Only spaces
Other forms of whitespace
So I tried on my own, after reading some useful comments. Here is my solution. I have checked my program for worst case scenario. If any of you, can find any cases for which this program doesn't work, let me know, so that I can work and improve it.
And just to be clear, we don't want symbols like, "," , "!" , "?", "." , "\n" to be counted as words. But obviously, "I" should be counted as word, as we consider it in the language. I have made sure of all this by replacing them with spaces. Let me know if I missed something.
#include <iostream>
#include <string>
void replace(std::string& str, char x, char y);
int countWords(std::string x);
int main(){
std::cout<<countWords(" \n \t Hello, world ! ");
}
void replace(std::string& str, char x, char y){
for(unsigned int i=0;i<str.size();i++){
if(str[i]==x) str[i]=y;
}
}
int countWords(std::string x){
replace(x,',',' ');
replace(x,'.',' ');
replace(x,'!',' ');
replace(x,'?',' ');
replace(x,'(',' ');
replace(x,')',' ');
replace(x,'\n',' ');
replace(x,'\t',' ');
replace(x,'"',' ');
if(x.empty()) return 0;
int Num=1;
for(unsigned int i=1;i<x.size();i++){
if(x[i]==' ' && x[i-1]!=' ') Num++;
}
if(x.back() == ' ') Num--;
return Num;
}
This is simple and fast on my machine. It iterates over the string, using a bool to track whether it's inside a word or not, and whitespace characters as word delimiters. I tested with the isspace() library function but this switch statement was slightly faster.
int countwords(const std::string &str)
{
int count = 0;
bool in_word = false;
for (char ch : str) {
switch (ch) {
case '\t': case '\n': case '\v': case '\f': case '\r': case ' ':
in_word = false;
break;
default:
if (!in_word) {
in_word = true;
++count;
}
break;
}
}
return count;
}
This is easy to extend or modify for different word delimiters. Here is a version that considers any non-alphabetical character as a delimiter. Changing the !isalpha() call to isspace() will give the same results as the code above.
int countwords(const std::string &str)
{
int count = 0;
bool in_word = false;
for (char ch : str) {
if (!isalpha(static_cast<unsigned char>(ch))) { // non-alpha chars are word delimiters
in_word = false;
} else if (!in_word) {
in_word = true;
++count;
}
}
return count;
}
int countwords(std::string x)
{
int i, count = 0;
for (i = 0; i < x.size(); i++)
if (x[i] == ' ')
count++; //just count empty spaces
count++; //count++ is same as count+1,so there will be count+1 words in string
if (x.size() == 0)
count = 0;
return count;
}
Add the following lines to the code
int Num;
if(x[0] == ' ') Num = 0;
else Num = 1;
This stops a blank at the start of the string from being counted as a word.
#include <iostream>
#include <string>
int countWords(std::string x);
int main() {
std::cout << countWords("Hello world!");
}
int countWords(std::string x) {
if(x.empty()) return 0; // if the string is empty
int Num;
if(x[0] == ' ') Num = 0;
else Num = 1;
for(unsigned int i = 1; i < x.size(); i++) {
// a word starts wherever a non-space follows a space;
// starting at i = 1 keeps x[i - 1] in bounds
if(x[i] != ' ' && x[i - 1] == ' ') Num++;
}
return Num;
}

C++ function to count all the words in a string

I was asked this during an interview and apparently it's an easy question but it wasn't and still isn't obvious to me.
Given a string, count all the words in it. Doesn't matter if they are repeated. Just the total count like in a text files word count. Words are anything separated by a space and punctuation doesn't matter, as long as it's part of a word.
For example:
A very, very, very, very, very big dog ate my homework!!!! ==> 11 words
My "algorithm" just goes through looking for spaces and incrementing a counter until I hit a null. Since I didn't get the job and was asked to leave after that, I guess my solution wasn't good? Does anyone have a more clever solution? Am I missing something?
Assuming words are white space separated:
unsigned int countWordsInString(std::string const& str)
{
std::stringstream stream(str);
return std::distance(std::istream_iterator<std::string>(stream), std::istream_iterator<std::string>());
}
Note: There may be more than one space between words, and words may also be separated by other white space characters such as tab, new line or carriage return. So counting spaces is not enough.
The stream input operator >>, when used to read a string from a stream, reads one white-space-separated word. So they were probably looking for you to use this to identify words.
std::stringstream stream(str);
std::string oneWord;
stream >> oneWord; // Reads one space separated word.
We can use this to count words in a string.
std::stringstream stream(str);
std::string oneWord;
unsigned int count = 0;
while(stream >> oneWord) { ++count;}
// count now has the number of words in the string.
Getting complicated:
Streams can be treated much like any other container, and there are iterators to loop through them: std::istream_iterator. When you use the ++ operator on an istream_iterator it just reads the next value from the stream using operator >>. In this case we are reading std::string, so it reads a white-space-separated word.
std::stringstream stream(str);
std::string oneWord;
unsigned int count = 0;
std::istream_iterator<std::string> loop = std::istream_iterator<std::string>(stream);
std::istream_iterator<std::string> end = std::istream_iterator<std::string>();
for(;loop != end; ++count, ++loop) { *loop; }
Using std::distance just wraps all of the above in a tidy package: it finds the distance between two iterators by doing ++ on the first until it reaches the second.
To avoid copying the string we can be sneaky:
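unsigned int countWordsInString(std::string const& str)
{
    std::stringstream stream;
    // sneaky way to use the string as the buffer to avoid copy.
    // pubsetbuf takes a non-const char*, hence the const_cast; whether a
    // string buffer actually adopts the storage is implementation-defined.
    stream.rdbuf()->pubsetbuf(const_cast<char*>(str.c_str()), str.length());
    return std::distance(std::istream_iterator<std::string>(stream), std::istream_iterator<std::string>());
}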
Note: we still copy each word out of the original into a temporary. But the cost of that is minimal.
A less clever, more obvious-to-all-of-the-programmers-on-your-team method of doing it.
#include <cctype>
int CountWords(const char* str)
{
if (str == NULL)
return error_condition; // let the requirements define this...
bool inSpaces = true;
int numWords = 0;
while (*str != '\0')
{
if (std::isspace(*str))
{
inSpaces = true;
}
else if (inSpaces)
{
numWords++;
inSpaces = false;
}
++str;
}
return numWords;
}
You can use the std::count or std::count_if to do that. Below a simple example with std::count:
//Count the number of words on string
#include <iostream>
#include <string>
#include <algorithm> //count and count_if is declared here
int main () {
std::string sTEST("Text to verify how many words it has.");
std::cout << std::count(sTEST.cbegin(), sTEST.cend(), ' ')+1;
return 0;
}
UPDATE: Due to the observation made by Aydin Özcan (Nov 16), I made a change to this solution. Now the words may have more than one space between them. :)
//Count the number of words on string
#include <string>
#include <iostream>
int main () {
std::string T("Text to verify : How many words does it have?");
size_t NWords = T.empty() || T.back() == ' ' ? 0 : 1;
for (size_t s = T.size(); s > 0; --s)
if (T[s] == ' ' && T[s-1] != ' ') ++NWords;
std::cout << NWords;
return 0;
}
Another boost based solution that may work (untested):
vector<string> result;
split(result, "aaaa bbbb cccc", is_any_of(" \t\n\v\f\r"), token_compress_on);
More information can be found in the Boost String Algorithms Library
This can be done without manually looking at every character or copying the string.
#include <boost/iterator/transform_iterator.hpp>
#include <cctype>
boost::transform_iterator
< int (*)(int), std::string::const_iterator, bool const& >
pen( str.begin(), std::isalnum ), end( str.end(), std::isalnum );
size_t word_cnt = 0;
while ( pen != end ) {
word_cnt += * pen;
pen = std::mismatch( pen+1, end, pen ).first;
}
return word_cnt;
I took the liberty of using isalnum instead of isspace.
This is not something I would do at a job interview. (It's not like it compiled the first time.)
Or, for all the Boost haters ;v)
if ( str.empty() ) return 0;
size_t word_cnt = std::isalnum( * str.begin() );
for ( std::string::const_iterator pen = str.begin(); ++ pen != str.end(); ) {
word_cnt += std::isalnum( pen[ 0 ] ) && ! std::isalnum( pen[ -1 ] );
}
return word_cnt;
An O(N) solution that is also very simple to understand and implement:
(I haven't checked for an empty string input. But I am sure you can do that easily.)
#include <iostream>
#include <string>
using namespace std;
int countNumberOfWords(string sentence){
int numberOfWords = 0;
size_t i;
if (isalpha(sentence[0])) {
numberOfWords++;
}
for (i = 1; i < sentence.length(); i++) {
if ((isalpha(sentence[i])) && (!isalpha(sentence[i-1]))) {
numberOfWords++;
}
}
return numberOfWords;
}
int main()
{
string sentence;
cout<<"Enter the sentence : ";
getline(cin, sentence);
int numberOfWords = countNumberOfWords(sentence);
cout<<"The number of words in the sentence is : "<<numberOfWords<<endl;
return 0;
}
Here is a single pass, branchless (almost), locale-aware algorithm which handles cases with more than one space between words:
If the string is empty return 0
let transitions = number of adjacent char pairs (c1, c2) where c1 == ' ' and c2 != ' '
if the sentence starts with a space, return transitions else return transitions + 1
Here is an example with string = "A very, very, very, very, very big dog ate my homework!!!!"
i  | 0123456789...
c1 | A very, very, very, very, very big dog ate my homework!!!!
c2 |  very, very, very, very, very big dog ate my homework!!!!
   |  x     x     x     x     x    x   x   x   x  x
Explanation
Let `i` be the loop counter, with c1 = string[i] and c2 = string[i+1].
When i=0: c1='A' and c2=' ', the condition `c1 == ' '` and `c2 != ' '` is not met
When i=1: c1=' ' and c2='v', the condition is met
... and so on for the remaining characters. The 10 marked transitions, plus 1 because the string does not start with a space, give 11 words.
Here are 2 solutions I came up with
Naive solution
size_t count_words_naive(const std::string_view& s)
{
if (s.size() == 0) return 0;
size_t count = 0;
bool isspace1, isspace2 = true;
for (auto c : s) {
isspace1 = std::exchange(isspace2, isspace(c));
count += (isspace1 && !isspace2);
}
return count;
}
If you think carefully, you will be able to reduce this set of operations into an inner product (just for fun, I don't recommend this as this is arguably much less readable).
Inner product solution
size_t count_words_using_inner_prod(const std::string_view& s)
{
if (s.size() == 0) return 0;
auto starts_with_space = isspace(s.front());
auto num_transitions = std::inner_product(
s.begin()+1, s.end(), s.begin(), 0, std::plus<>(),
[](char c2, char c1) { return isspace(c1) && !isspace(c2); });
return num_transitions + !starts_with_space;
}
I think this will help.
The complexity is O(n).
#include <iostream>
#include <string>
#include <ctype.h>
using namespace std;
int main()
{
    int count = 0, size;
    string sent;
    getline(cin, sent);
    size = sent.size();
    // check if the char is alphabetic and the next char is not
    for (int i = 0; i < size - 1; ++i) {
        if (isalpha(sent[i]) && !isalpha(sent[i+1])) {
            ++count;
        }
    }
    // the last word of the sentence is not counted above, so count it here
    if (size > 0 && isalpha(sent[size - 1])) ++count;
    cout << count << endl;
    return 0;
}
A very concise O(N) approach:
bool is_letter(char c) { return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z'; }
int count_words(const string& s) {
int i = 0, N = s.size(), count = 0;
while(i < N) {
while(i < N && !is_letter(s[i])) i++;
if(i == N) break;
while(i < N && is_letter(s[i])) i++;
count++;
}
return count;
}
A divide-and-conquer approach, complexity is also O(N):
int DC(const string& A, int low, int high) {
if(low > high) return 0;
int mid = low + (high - low) / 2;
int count_left = DC(A, low, mid-1);
int count_right = DC(A, mid+1, high);
if(!is_letter(A[mid]))
return count_left + count_right;
else {
if(mid == low && mid == high) return 1;
if(mid-1 < low) {
if(is_letter(A[mid+1])) return count_right;
else return count_right+1;
} else if(mid+1 > high) {
if(is_letter(A[mid-1])) return count_left;
else return count_left+1;
}
else {
if(!is_letter(A[mid-1]) && !is_letter(A[mid+1]))
return count_left + count_right + 1;
else if(is_letter(A[mid-1]) && is_letter(A[mid+1]))
return count_left + count_right - 1;
else
return count_left + count_right;
}
}
}
int count_words_divide_n_conquer(const string& s) {
return DC(s, 0, s.size()-1);
}
Efficient version based on map-reduce approach
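#include <cctype>
#include <functional>
#include <iostream>
#include <numeric>
#include <string_view>
using namespace std::literals; // needed for the "..."sv suffix used in main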
std::size_t CountWords(std::string_view s) {
if (s.empty())
return 0;
std::size_t wc = (!std::isspace(s.front()) ? 1 : 0);
wc += std::transform_reduce(
s.begin(),
s.end() - 1,
s.begin() + 1,
std::size_t(0),
std::plus<std::size_t>(),
[](char left, char right) {
return std::isspace(left) && !std::isspace(right);
});
return wc;
}
int main() {
std::cout << CountWords(" pretty little octopus "sv) << std::endl;
return 0;
}