reverse words in a sentence using istringstream and istream_iterator

I am trying to solve the problem using C++ constructs. A reference to each word in the sentence is taken and the word is reversed, but the changes are not seen in the original sentence.
class Solution {
public:
string reverseWords(string s) {
istringstream ss(s);
for(auto w = istream_iterator<string>(ss); w != istream_iterator<string>(); w++)
{
/* changes of the below 2 lines are not reflected in the main sentence*/
string &str = const_cast<string&>(*w);
reverse(str.begin(),str.end());
}
reverse(s.begin(),s.end());
return s;
}
};

I don't think it is possible to use streams without copying the word, since the stream always extracts each word into a separate string. In your attempt you are modifying such a copy, which is why the original string is returned unchanged. I would just use iterators (treat this as pseudo-code; it may not compile):
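auto last = s.begin();
for (auto cur = s.begin(); cur != s.end(); ++cur) {
    if (!isalpha(static_cast<unsigned char>(*cur))) {
        reverse(last, cur); // reverse the word that ends just before cur
        last = cur + 1;     // the next word starts after the separator
    }
}
reverse(last, s.end());     // the last word has no trailing separator
return s;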
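
For anyone who wants something that compiles, here is a minimal sketch of the iterator approach. The names reverse_each_word and reverse_word_order are made up for illustration, and the sketch assumes that any non-alphabetic character separates words. The second function shows the extra whole-string reverse from the question, which turns "each word reversed" into "word order reversed".

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical helper: reverses every run of alphabetic characters in place,
// leaving separators (spaces, punctuation) where they are.
std::string reverse_each_word(std::string s) {
    auto word_start = s.begin();
    for (auto cur = s.begin(); cur != s.end(); ++cur) {
        if (!std::isalpha(static_cast<unsigned char>(*cur))) {
            std::reverse(word_start, cur);  // word ends just before the separator
            word_start = cur + 1;           // next word starts after it
        }
    }
    std::reverse(word_start, s.end());      // last word has no trailing separator
    return s;
}

// Hypothetical helper: reverses the order of the words by reversing each
// word first and then reversing the whole string.
std::string reverse_word_order(std::string s) {
    s = reverse_each_word(std::move(s));
    std::reverse(s.begin(), s.end());
    return s;
}
```

Because the string is taken by value and modified directly, no per-word copies are made, which is the difference from the istream_iterator attempt.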

Related

Function to separate each word from a string and put them into a vector, without using auto keyword?

I'm really stuck here. I can't edit the main function, and inside it there is a function call whose only parameter is the string. How can I make this function put each word from the string into a vector, without using the auto keyword? I realize that this code is probably really wrong, but it's my best attempt at what it should look like.
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
vector<string> extract_words(const char * sentence[])
{
string word = "";
vector<string> list;
for (int i = 0; i < sentence.size(); ++i)
{
while (sentence[i] != ' ')
{
word = word + sentence[i];
}
list.push_back(word);
}
}
int main()
{
sentence = "Help me please" /*In the actual code a function call is here that gets input sentence.*/
if (sentence.length() > 0)
{
words = extract_words(sentence);
}
}

Do you know how to read "words" from std::cin?
Then you can put that string in a std::istringstream which works like std::cin but for "reading" strings instead.
Use the stream extract operator >> in a loop to get all the words one by one, and add them to the vector.
Perhaps something like:
std::vector<std::string> get_all_words(std::string const& string)
{
std::vector<std::string> words;
std::istringstream in(string);
std::string word;
while (in >> word)
{
words.push_back(word);
}
return words;
}
With a little more knowledge of C++ and its standard classes and functions, you can actually make the function a lot shorter:
std::vector<std::string> get_all_words(std::string const& string)
{
std::istringstream in(string);
return std::vector<std::string>(std::istream_iterator<std::string>(in),
std::istream_iterator<std::string>());
}

I recommend making the argument to the function a const std::string& instead of const char * sentence[]. A std::string has many member functions, like find_first_of, find_first_not_of and substr and more that could help a lot.
Here's an example using those mentioned:
std::vector<std::string> extract_words(const std::string& sentence)
{
/* Control char's, "whitespaces", that we don't want in our words:
\a audible bell
\b backspace
\f form feed
\n line feed
\r carriage return
\t horizontal tab
\v vertical tab
*/
static const char whitespaces[] = " \t\n\r\a\b\f\v";
std::vector<std::string> list;
std::size_t begin = 0;
while(true)
{
// Skip whitespaces by finding the first non-whitespace, starting at
// "begin":
begin = sentence.find_first_not_of(whitespaces, begin);
// If no non-whitespace char was found, break out:
if(begin == std::string::npos) break;
// Search for a whitespace starting at "begin + 1":
std::size_t end = sentence.find_first_of(whitespaces, begin + 1);
// Store the result by creating a substring from "begin" with the
// length "end - begin":
list.push_back(sentence.substr(begin, end - begin));
// If no whitespace was found, break out:
if(end == std::string::npos) break;
// Set "begin" to the char after the found whitespace before the loop
// makes another lap:
begin = end + 1;
}
return list;
}
With the added restriction "no breaks", this could be a variant. It does exactly the same as the above, but without using break:
std::vector<std::string> extract_words(const std::string& sentence)
{
static const char whitespaces[] = " \t\n\r\a\b\f\v";
std::vector<std::string> list;
std::size_t begin = 0;
bool loop = true;
while(loop)
{
begin = sentence.find_first_not_of(whitespaces, begin);
if(begin == std::string::npos) {
loop = false;
} else {
std::size_t end = sentence.find_first_of(whitespaces, begin + 1);
list.push_back(sentence.substr(begin, end - begin));
if(end == std::string::npos) {
loop = false;
} else {
begin = end + 1;
}
}
}
return list;
}

Starting loop at specific index of a std::string?

I wrote the following function:
std::regex r("");
for (std::sregex_iterator i = words_begin; i != words_end; ++i) {}
It starts looking for regex matches from the beginning of the given string (str). But how can I tell it to exclude everything before a specific index?
For example, I want it to deal only with what comes after index 4 (not including it).
Note: I am calling this code from another function, so I tried passing something like str + 4 as the string parameter, but I got an error saying it's not an l-value.

If I understand your question correctly you can pass a parameter to the function with the position where you'd like to start the search, and use it to set the iterator:
void func_str(const std::string& str, int pos)
{
std::regex r("\\{[^}]*\\}");
auto words_begin =
std::sregex_iterator(str.begin() + pos, str.end(), r);
//...
}
int main()
{
std::string str = "somestring";
func_str(str, 4);
}
Or pass the iterators themselves, one to the position you'd like to start the search and one to the end of the string:
void func_str(std::string::iterator it_begin, std::string::iterator it_end)
{
std::regex r("\\{[^}]*\\}");
auto words_begin =
std::sregex_iterator(it_begin, it_end, r);
//...
}
int main()
{
std::string str = "somestring";
func_str(str.begin() + 4, str.end());
}
As @bruno correctly stated, you can pass str.substr(4), not str + 4, as the argument instead of the original string. The upside is that you would not have to change anything in the function. The downside, as @Marek also correctly pointed out, is that it creates an unnecessary copy of the string to be searched, so passing a position or begin and end iterators is less expensive.
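
As a rough, compilable sketch of the "pass a position" option: the helper name matches_from is hypothetical, and the pattern is the one used in the snippets above. It collects the matches instead of eliding the loop body, and it guards against a position past the end of the string.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects every {...} match that starts at or after pos.
std::vector<std::string> matches_from(const std::string& str, std::size_t pos) {
    std::vector<std::string> found;
    if (pos > str.size()) {
        return found;  // nothing to search; avoids forming an invalid iterator
    }
    static const std::regex r("\\{[^}]*\\}");
    auto first = str.begin() + static_cast<std::string::difference_type>(pos);
    // The default-constructed sregex_iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(first, str.end(), r), end; it != end; ++it) {
        found.push_back(it->str());
    }
    return found;
}
```

Note that when the search starts from an offset iterator, the position() reported by each match is measured from that starting iterator, not from the beginning of str.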

I suggest checking std::smatch::position() to determine whether the match is to be taken or discarded:
#include <iostream>
#include <regex>
#include <string>
int main() {
std::regex r("\\{[^}]*\\}");
std::string str("{1}, {2} and {3}");
auto words_begin =
std::sregex_iterator(str.begin(), str.end(), r);
auto words_end = std::sregex_iterator();
for (std::sregex_iterator i = words_begin; i != words_end; ++i) {
std::smatch m = *i;
if (m.position() > 4) {
std::cout << m.str() << std::endl;
}
}
return 0;
}
Adjust the if condition as you need.
Here, the first {1} match is discarded since its position is less than or equal to 4.

string::replace not working correctly 100% of the time?

I'm trying to replace every space character with '%20' in a string, and I'm thinking of using the built in replace function for the string class.
Currently, I have:
void replaceSpace(string& s)
{
int len = s.length();
string str = "%20";
for(int i = 0; i < len; i++) {
if(s[i] == ' ') {
s.replace(i, 1, str);
}
}
}
When I pass in the string "_a_b_c_e_f_g__", where the underscores represent space, my output is "%20a%20b%20c%20e_f_g__". Again, underscores represent space.
Why is it that the spaces near the beginning of the string are replaced, but the spaces towards the end aren't?

You are making s longer with each replacement, but you are not updating len which is used in the loop condition.
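
A minimal sketch of that fix, assuming the same in-place replace approach as the question: the loop re-reads s.size() on every iteration instead of caching the length, and it steps over the text it just inserted. The name replace_spaces is only illustrative.

```cpp
#include <cstddef>
#include <string>

// Sketch of the forward fix: the loop condition uses the current size of s,
// so it keeps going as the string grows.
void replace_spaces(std::string& s) {
    const std::string repl = "%20";
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] == ' ') {
            s.replace(i, 1, repl);
            i += repl.size() - 1;  // land on the last inserted char; ++i moves past it
        }
    }
}
```

Skipping the inserted characters is not strictly needed here because "%20" contains no space, but it avoids rescanning and stays correct if the replacement text ever does contain the searched character.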

Modifying the string that you are just scanning is like cutting the branch under your feet. It may work if you are careful, but in this case you aren't.
Namely, you take the string len at the beginning but with each replacement your string gets longer and you are pushing the replacement places further away (so you never reach all of them).
The correct way to cut this branch is from its end (tip) towards the trunk - this way you always have a safe footing:
void replaceSpace(string& s)
{
int len = s.length();
string str = "%20";
for(int i = len - 1; i >= 0; i--) {
if(s[i] == ' ') {
s.replace(i, 1, str);
}
}
}

You're growing the string but only looping to its initial size.
Looping over a collection while modifying it is very prone to error.
Here's a solution that doesn't:
void replace(string& s)
{
string s1;
std::for_each(s.begin(),
s.end(),
[&](char c) {
if (c == ' ') s1 += "%20";
else s1 += c;
});
s.swap(s1);
}

As others have already mentioned, the problem is you're using the initial string length in your loop, but the string gets bigger along the way. Your loop never reaches the end of the string.
You have a number of ways to fix this. You can correct your solution and make sure you go to the end of the string as it is now, not as it was before you started looping.
Or you can use @molbdnilo's way, which creates a copy of the string along the way.
Or you can use something like this:
std::string input = " a b c e f g ";
std::string::size_type pos = 0;
while ((pos = input.find(' ', pos)) != std::string::npos)
{
input.replace(pos, 1, "%20");
pos += 3; // skip past the inserted "%20" so it is not rescanned
}

Here's a function that can make it easier for you:
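string replace_char_str(string str, const string& find_str, const string& replace_str)
{
    if (find_str.empty()) return str; // nothing to search for
    size_t pos = 0;
    while ((pos = str.find(find_str, pos)) != std::string::npos)
    {
        // Replace the whole match, not just one character
        str.replace(pos, find_str.length(), replace_str);
        // Continue after the inserted text so it is never rescanned
        pos += replace_str.length();
    }
    return str;
}
So when you want to replace the spaces, try it like this: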
string new_str = replace_char_str(yourstring, " ", "%20");
Hope this helps you ! :)

Remove whitespace, convert case, in string except in quotes

I am using C++03 without Boost.
Suppose I have a string such as: The day is "Mon day"
I want to process this to
THEDAYISMon day
That is, convert to upper case what is not in the quote, and remove whitespace that isn't in the quote.
The string may not contain quotes, but if it does, there will only be 2.
I tried using STL algorithms but I get stuck on how to remember if it's in a quote or not between elements.
Of course I can do it with good old for loops, but I was wondering if there is a fancy C++ way.
Thanks.
This is what I have using a for loop
while (getline(is, str))
{
// remove whitespace and convert case except in quotes
temp.clear();
bool bInQuote = false;
for (string::const_iterator it = str.begin(), end_it = str.end(); it != end_it; ++it)
{
char c = *it;
if (c == '\"')
{
bInQuote = (! bInQuote);
}
else
{
if (! ::isspace(c))
{
temp.push_back(bInQuote ? c : ::toupper(c));
}
}
}
swap(str, temp);
}

You can do something with STL algorithms like the following:
#include <iostream>
#include <string>
#include <algorithm>
#include <cctype>
using namespace std;
struct convert {
void operator()(char& c) { c = toupper((unsigned char)c); }
};
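bool isSpace(char c)
{
// Cast first: passing a negative char to std::isspace is undefined behavior
return std::isspace(static_cast<unsigned char>(c)) != 0;
}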
int main() {
string input = "The day is \"Mon Day\" You know";
cout << "original string: " << input <<endl;
// Use string::size_type so the comparison with string::npos is reliable
string::size_type firstQuote = input.find("\"");
string::size_type secondQuote = input.find_last_of("\"");
string firstPart="";
string secondPart="";
string quotePart="";
if (firstQuote != string::npos)
{
firstPart = input.substr(0,firstQuote);
if (secondQuote != string::npos)
{
secondPart = input.substr(secondQuote+1);
quotePart = input.substr(firstQuote+1, secondQuote-firstQuote-1);
//drop those quotes
}
std::for_each(firstPart.begin(), firstPart.end(), convert());
firstPart.erase(remove_if(firstPart.begin(),
firstPart.end(), isSpace),firstPart.end());
std::for_each(secondPart.begin(), secondPart.end(), convert());
secondPart.erase(remove_if(secondPart.begin(),
secondPart.end(), isSpace),secondPart.end());
input = firstPart + quotePart + secondPart;
}
else //does not contains quote
{
std::for_each(input.begin(), input.end(), convert());
input.erase(remove_if(input.begin(),
input.end(), isSpace),input.end());
}
cout << "transformed string: " << input << endl;
return 0;
}
It gave the following output:
original string: The day is "Mon Day" You know
transformed string: THEDAYISMon DayYOUKNOW
With the test case you have shown:
original string: The day is "Mon Day"
transformed string: THEDAYISMon Day

Just for laughs, use a custom iterator, std::copy and a std::back_insert_iterator, and an operator++ that knows to skip whitespace and set a flag on a quote character:
CustomStringIt& CustomStringIt::operator++ ()
{
if(index_<originalString_.size())
++index_;
if(!inQuotes_ && isspace(originalString_[index_]))
return ++(*this);
if('\"'==originalString_[index_])
{
inQuotes_ = !inQuotes_;
return ++(*this);
}
return *this;
}
char CustomStringIt::operator* () const
{
char c = originalString_[index_];
return inQuotes_ ? c : std::toupper(c) ;
}

You can use stringstream and getline with the \" character as the delimiter instead of newline.
Split your string into 3 cases: the part of the string before the first quote, the part in quotes, and the part after the second quote.
You would process the first and third parts before adding to your output, but add the second part without processing.
If your string contains no quotes, the entire string will be contained in the first part. The second and third parts will just be empty.
while (getline (is, str)) {
string processed;
stringstream line(str);
string beforeFirstQuote;
string inQuotes;
string afterSecondQuote;
getline(line, beforeFirstQuote, '\"');
Process(beforeFirstQuote, processed);
getline(line, inQuotes, '\"');
processed += inQuotes;
getline(line, afterSecondQuote, '\"');
Process(afterSecondQuote, processed);
}
void Process(const string& input, string& output) {
for (string::const_iterator it = input.begin(), end_it = input.end(); it != end_it; ++it)
{
char c = *it;
if (! ::isspace(c))
{
output.push_back(::toupper(c));
}
}
}
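
Put together as a self-contained sketch, the three-part split could look like the following. The names append_processed and normalize_line are hypothetical, and the code assumes, as the question states, that a line contains either no quotes or exactly two. It sticks to C++03 constructs.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Appends input to output with whitespace removed and letters upper-cased.
void append_processed(const std::string& input, std::string& output) {
    for (std::string::const_iterator it = input.begin(); it != input.end(); ++it) {
        const unsigned char c = static_cast<unsigned char>(*it);
        if (!std::isspace(c)) {
            output.push_back(static_cast<char>(std::toupper(c)));
        }
    }
}

// Hypothetical wrapper: handles one line that contains zero or two quotes.
std::string normalize_line(const std::string& str) {
    std::istringstream line(str);
    std::string before, quoted, after, processed;
    std::getline(line, before, '"');  // text up to the first quote (or the whole line)
    append_processed(before, processed);
    std::getline(line, quoted, '"');  // text between the quotes, kept verbatim
    processed += quoted;
    std::getline(line, after);        // whatever follows the second quote
    append_processed(after, processed);
    return processed;
}
```

When the line has no quotes, the first getline consumes everything and the later reads fail, leaving the other two parts empty, which matches the behaviour described above.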

Selective iterator

FYI: no boost, yes it has this, I want to reinvent the wheel ;)
Is there some form of a selective iterator (possible) in C++? What I want is to separate strings like this:
some:word{or other
to a form like this:
some : word { or other
I can do that with two loops and find_first_of(":") and ("{") but this seems (very) inefficient to me. I thought that maybe there would be a way to create/define/write an iterator that would iterate over all these values with for_each. I fear this will have me writing a full-fledged custom way-too-complex iterator class for a std::string.
So I thought maybe this would do:
std::vector<size_t> list;
size_t index = mystring.find(":");
while( index != std::string::npos )
{
list.push_back(index);
index = mystring.find(":", list.back());
}
std::for_each(list.begin(), list.end(), addSpaces(mystring));
This looks messy to me, and I'm quite sure a more elegant way of doing this exists. But I can't think of it. Anyone have a bright idea? Thanks
PS: I did not test the code posted, just a quick write-up of what I would try
UPDATE: After taking all your answers into account, I came up with this, and it works to my liking :). It does assume the last char is a newline or similar; otherwise a trailing {, }, or : won't get processed.
void tokenize( string &line )
{
char oneBack = ' ';
char twoBack = ' ';
char current = ' ';
size_t length = line.size();
for( size_t index = 0; index<length; ++index )
{
twoBack = oneBack;
oneBack = current;
current = line.at( index );
if( isSpecial(oneBack) )
{
if( !isspace(twoBack) ) // insert before
{
line.insert(index-1, " ");
++index;
++length;
}
if( !isspace(current) ) // insert after
{
line.insert(index, " ");
++index;
++length;
}
}
}
}
Comments are welcome as always :)

That's relatively easy using the std::istream_iterator.
What you need to do is define your own class (say Term). Then define how to read a single "word" (term) from the stream using the operator >>.
I don't know what your exact definition of a word is, so I am using the following definition:
Any consecutive sequence of alphanumeric characters is a term.
Any single non-whitespace character that is not alphanumeric is also a term.
Try this:
#include <string>
#include <sstream>
#include <iostream>
#include <iterator>
#include <algorithm>
#include <cctype>
class Term
{
public:
// This cast operator is not required but makes it easy to use
// a Term anywhere that a string can normally be used.
operator std::string const&() const {return value;}
private:
// A term is just a string
// And we friend the operator >> to make sure we can read it.
friend std::istream& operator>>(std::istream& inStr,Term& dst);
std::string value;
};
Now all we have to do is define an operator >> that reads a word according to the rules:
// This function could be a lot neater using some boost regular expressions.
// I just do it manually to show it can be done without boost (as requested)
std::istream& operator>>(std::istream& inStr,Term& dst)
{
// Note the >> operator drops all preceding white space.
// So we get the first non white space
char first;
inStr >> first;
// If the stream is in any bad state then stop processing.
if (inStr)
{
if(std::isalnum(first))
{
// Alpha Numeric so read a sequence of characters
dst.value = first;
// This is ugly. And needs re-factoring.
while((first = inStr.get(), inStr) && std::isalnum(first))
{
dst.value += first;
}
// Take into account the special case of EOF.
// And bad stream states.
if (inStr)
{
// The last character read is not part of the word,
// so put it back for use by the next call to read from the stream.
inStr.putback(first);
}
else
{
// We know that we have a word, so clear any errors (such as EOF) to make sure
// it is used. Let the next attempt to read a word (term) fail at the outer if.
inStr.clear();
}
}
else
{
// It was not alpha numeric so it is a one character word.
dst.value = first;
}
}
return inStr;
}
So now we can use it in standard algorithms by just employing the istream_iterator
int main()
{
std::string data = "some:word{or other";
std::stringstream dataStream(data);
std::copy( // Read the stream one Term at a time.
std::istream_iterator<Term>(dataStream),
std::istream_iterator<Term>(),
// Note the ostream_iterator is using a std::string
// This works because a Term can be converted into a string.
std::ostream_iterator<std::string>(std::cout, "\n")
);
}
The output:
> ./a.exe
some
:
word
{
or
other

std::string const str = "some:word{or other";
std::string result;
result.reserve(str.size());
for (std::string::const_iterator it = str.begin(), end = str.end();
it != end; ++it)
{
if (isalnum(*it) || isspace(*it)) // existing spaces are copied as-is, not padded
{
result.push_back(*it);
}
else
{
result.push_back(' '); result.push_back(*it); result.push_back(' ');
}
}
Insert version for speed-up
std::string str = "some:word{or other";
for (std::string::iterator it = str.begin(), end = str.end(); it != end; ++it)
{
if (!isalnum(*it) && !isspace(*it)) // do not pad existing spaces
{
it = str.insert(it, ' ') + 2;
it = str.insert(it, ' ');
end = str.end();
}
}
Note that std::string::insert inserts BEFORE the iterator passed and returns an iterator to the newly inserted character. Assigning is important since the buffer may have been reallocated at another memory location (the iterators are invalidated by the insertion). Also note that you can't keep end for the whole loop, each time you insert you need to recompute it.

"a more elegant way of doing this exists."
I do not know how Boost implements that, but the traditional way is to feed the input string character by character into a finite state machine (FSM) that detects where tokens (words, symbols) start and end.
"I can do that with two loops and find_first_of(":") and ("{")"
One loop with std::find_first_of() should suffice.
Though I'm still a huge fan of FSMs for such parsing tasks.
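
To make the FSM idea concrete, here is a small sketch with just two states. The names split_tokens and spaced are invented for this example, and the token rules are an assumption: alphanumeric runs form words, whitespace only separates, and any other character is a token of its own.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Minimal two-state machine: we are either inside a word or between tokens.
std::vector<std::string> split_tokens(const std::string& line) {
    enum State { Between, InWord };
    State state = Between;
    std::vector<std::string> tokens;
    for (std::string::size_type i = 0; i < line.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isalnum(c)) {
            if (state == Between) {
                tokens.push_back(std::string());  // a new word starts here
                state = InWord;
            }
            tokens.back() += line[i];
        } else {
            state = Between;                      // any other char ends the word
            if (!std::isspace(c)) {
                tokens.push_back(std::string(1, line[i]));  // symbol is its own token
            }
        }
    }
    return tokens;
}

// Joins the tokens with single spaces, e.g. for printing.
std::string spaced(const std::string& line) {
    const std::vector<std::string> tokens = split_tokens(line);
    std::string out;
    for (std::vector<std::string>::size_type i = 0; i < tokens.size(); ++i) {
        if (i != 0) out += ' ';
        out += tokens[i];
    }
    return out;
}
```

The string is scanned exactly once and nothing is inserted into it while iterating, so there are no invalidated iterators or shifting indices to track.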

How about something like:
std::string::const_iterator it, end = mystring.end();
for(it = mystring.begin(); it != end; ++it) {
if ( !isalnum( *it ))
list.push_back(it);
}
This way, you'll only iterate once through the string, and isalnum from ctype.h seems to do what you want. Of course, the code above is very simplistic and incomplete and only suggests a solution.

Are you looking to tokenize the input string, à la strtok?
If so, here is a tokenizing function that you can use. It takes an input string and a string of delimiters (each char in the string is a possible delimiter), and it returns a vector of tokens. Each token is a pair holding the delimiter used in that case and the delimited string:
#include <cstdlib>
#include <vector>
#include <string>
#include <functional>
#include <iostream>
#include <algorithm>
using namespace std;
// FUNCTION : tokenize(const string& str, const string& delims, bool bCaseSensitive)
// PARAMETERS : str String to be tokenized.
// delims String of individual delimiter characters -- each character in the string is a delimiter
// DESCRIPTION : Tokenizes a string, much in the same way as strtok does. The input string is not modified. The
// function is called once to tokenize a string, and all the tokens are returned at once.
// RETURNS : Returns a vector of tokens. Each element in the vector is one token. The delimiter character is
// not included in the token's string. For a non-empty input that does not start with a delimiter, the number of elements in the vector is N+1, where N is the number
// of times a delimiter character is found in the string. If one token is an empty string (as with the
// string "string1##string3", where the delimiter character is '#'), then that element in the vector
// holds an empty string.
// NOTES : bCaseSensitive is currently unused.
//
typedef pair<char,string> token; // first = delimiter, second = data
inline vector<token> tokenize(const string& str, const string& delims, bool bCaseSensitive=false) // tokenizes a string, returns a vector of tokens
{
bCaseSensitive;
// prologue
vector<token> vRet;
// tokenize input string
for( string::const_iterator itA = str.begin(), it=itA; it != str.end(); it = find_first_of(++it,str.end(),delims.begin(),delims.end()) )
{
// prologue
// find end of token
string::const_iterator itEnd = find_first_of(it+1,str.end(),delims.begin(),delims.end());
// add string to output
if( it == itA ) vRet.push_back(make_pair(0,string(it,itEnd)));
else vRet.push_back(make_pair(*it,string(it+1,itEnd)));
// epilogue
}
// epilogue
return vRet;
}
using namespace std;
int main()
{
string input = "some:word{or other";
typedef vector<token> tokens;
tokens toks = tokenize(input.c_str(), " :{");
cout << "Input: '" << input << "' # Tokens: " << toks.size() << "\n";
for( tokens::iterator it = toks.begin(); it != toks.end(); ++it )
{
cout << " Token : '" << it->second << "', Delimiter: '" << it->first << "'\n";
}
return 0;
}
