Extract all numbers from stringstream - c++

I want to read a string and extract all the numbers in it.
Input: 5a3 1f a0aaaa f1fg3
Output: 53 1 0 13
I tried this code:
string s;
getline(cin, s);
stringstream str_strm(s);
int found;
string temp;
while (!str_strm.eof()) {
str_strm >> temp;
if (stringstream(temp) >> found)
{
cout << found << endl;
}
}
but once it has found the 5 (from the example), it stops and moves on to the next whitespace-separated string, so the remaining digits of that token are lost. How can I extract all numbers?

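Here's a possible solution: a while loop splits the input on spaces, and then the digits are extracted from each sub-string.
#include <algorithm>
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main()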
{
stringstream ss("5a3 1f a0aaaa f1fg3");
string str;
while (getline(ss, str, ' ') ){
str.erase(std::remove_if(str.begin(), str.end(), [](unsigned char c) { return !std::isdigit(c); }), str.end());
cout << str << " ";
}
}

You could read each space-separated word and then remove the non-digits, like this:
std::string word;
while (std::cin >> word)
{
word.erase(std::remove_if(word.begin(), word.end(),
[](unsigned char c) { return not std::isdigit(c); }),
word.end());
std::cout << word << " ";
}
For the input of 5a3 1f a0aaaa f1fg3, it prints 53 1 0 13.
This admittedly odd-looking way of removing elements from a range is a common idiom, known as erase-remove.
You could even avoid the loop entirely if you have the input on a single line:
std::string word;
std::getline(std::cin, word);
word.erase(std::remove_if(word.begin(), word.end(),
[](unsigned char c) { return not std::isdigit(c)
and not std::isspace(c); }),
word.end());
std::cout << word;
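If the digit groups are needed as separate values rather than as printed text, the same erase-remove step can fill a container. This is only a sketch: the function name extractDigitGroups is made up for illustration, and skipping words that contain no digit at all is an assumption about the desired behaviour.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the digits of each whitespace-separated word.
std::vector<std::string> extractDigitGroups(const std::string& line)
{
    std::vector<std::string> groups;
    std::istringstream words(line);
    std::string word;
    while (words >> word)
    {
        // erase-remove: drop every character that is not a decimal digit
        word.erase(std::remove_if(word.begin(), word.end(),
                       [](unsigned char c) { return !std::isdigit(c); }),
                   word.end());
        if (!word.empty())   // assumption: words without digits are ignored
            groups.push_back(word);
    }
    return groups;
}
```

Calling extractDigitGroups("5a3 1f a0aaaa f1fg3") yields the strings 53, 1, 0 and 13. If they are known to fit into an int, std::stoi can convert them afterwards.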

Here is a very simple example. (There is an even simpler solution at the bottom of this post.)
It uses modern C++ elements and algorithms, and needs only a few lines of code.
#include <iostream>
#include <string>
#include <regex>
#include <iterator>
#include <algorithm>
#include <sstream>
int main() {
// Read a string from the console
if (std::string line{}; std::getline(std::cin, line)) {
// Put the complete line into a std::istringstream
std::istringstream iss{line};
// Print result
std::transform(std::istream_iterator<std::string>(iss), {}, std::ostream_iterator<std::string>(std::cout, " "),
[](const std::string& s) { return std::regex_replace(s, std::regex{ R"([^\d])" }, ""); });
}
return 0;
}
So, what's going on here. Let us look at it statement by statement. So, first:
if (std::string line{}; std::getline(std::cin, line)) {
This is an if statement with an initializer (available since C++17). If you look up if in the C++ reference, you can see that we can now have an additional initialization statement as the first part of the if. Why are we using that? Because it is an additional measure for scoping. The variable "line" is only used within the scope of the if statement and is not needed outside it. Functionally, it is the same as writing:
std::string line{};
if (std::getline(std::cin, line)) {
But then "line" would also be visible outside of the if statement. Because we want to avoid polluting the enclosing scope, we choose this method.
Next is std::getline. This reads a complete line from the input stream, here the console (std::cin), and puts it into the string. std::getline returns a reference to the stream. The stream has a conversion to bool that reports whether it is still in a good state, that is, whether no failure (such as reaching end of file before anything could be read) occurred. So the if statement checks whether the input operation worked. By the way: all I/O operations should be checked for success or failure.
Good, now we have the complete line of user input in our variable "line".
With
std::istringstream iss{line};
we put the string into a std::istringstream. We do this because we want to make use of the C++ iostream library. The std::istringstream behaves like any other stream, for example std::cin, and you can extract whitespace-separated values from it, as in std::cin >> v1 >> v2. The disadvantage of that approach is that you need to know the number of values in advance, or use a dynamically growing container and a loop.
And this brings us to the next construct that I want to explain. You may have heard about iterators. Iterators are like pointers, and a pair of them can denote a range of elements. If you have a std::vector or any other container, you can iterate over all of its elements with the begin() and end() iterators without knowing how many elements it contains.
For input streams we have something similar: the std::istream_iterator. This iterator iterates over the elements in the std::istringstream by repeatedly calling the extractor operator >>, and returns values of the type given in its template parameter; here, in our case, a std::string. You may now ask: until when? Where is the end? If you look at the description of constructor number 1 of std::istream_iterator, you will see that the default constructor "constructs the end-of-stream iterator", and a default-constructed iterator can be created with the empty braced initializer {}. So {} is the end iterator.
If we want to read all std::strings from the std::istringstream, then we read between std::istream_iterator<std::string>(iss) and {}, that is, every string that is in the std::istringstream.
Good. Next, there is a similar thing for output, the std::ostream_iterator. It calls the inserter operator << for every element that is written to it. We can specify the stream to which it should send the data, here std::cout, and additionally a separator string, which is appended after each value.
OK, next: std::transform. As its name says, it transforms the elements in a range, between a begin and an end iterator, into another range. So it takes the elements read from the std::istringstream as shown above and sends them to the std::ostream_iterator. In short: we read the source value, transform it, then write it.
But how do we transform? For the transformation we pass a simple lambda that calls std::regex_replace. This is a standard function that replaces parts of a string with other string data. What gets replaced is specified by a std::regex, a pattern written in a small meta language that matches specified parts of a string. In our case we use [^\d], which means "not a digit". Online regex testers and tutorials are a good way to try out and learn about such patterns.
Taken together, this explains the solution above.
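The same pipeline can also write into a container instead of std::cout, which makes the individual pieces easier to inspect. The following is a sketch rather than part of the original answer: the function name stripNonDigitsPerWord is hypothetical, and the pattern is spelled [^0-9] instead of [^\d], which is an equivalent choice for ASCII input.

```cpp
#include <algorithm>
#include <iterator>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: one output string per input word, non-digits removed.
std::vector<std::string> stripNonDigitsPerWord(const std::string& line)
{
    static const std::regex nonDigit{ "[^0-9]" };   // built once, reused on every call
    std::istringstream iss{ line };
    std::vector<std::string> result;
    // Read every word, transform it, and append the result to the vector
    std::transform(std::istream_iterator<std::string>(iss),
                   std::istream_iterator<std::string>(),
                   std::back_inserter(result),
                   [](const std::string& s) { return std::regex_replace(s, nonDigit, ""); });
    return result;
}
```

Note that a word without any digit produces an empty string here, because std::transform writes exactly one output element per input element.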
All this can be further optimized to 2 statements:
#include <iostream>
#include <string>
#include <regex>
int main() {
// Read a string from the console
if (std::string line{}; std::getline(std::cin, line)) {
// Remove unnecessary characters
std::cout << std::regex_replace(line, std::regex{ R"([^\d ])" }, "") << "\n";
}
return 0;
}
I cannot think of a simpler solution.
In case of questions, please ask.

You can use get from istream to get each character, including whitespace, and then isdigit to check for a digit character...
#include <iostream>
#include <cctype>
int main()
{
char ch;
std::cin.get(ch);
while (!std::cin.eof())
{
if (std::isdigit(static_cast<unsigned char>(ch)) || ch == ' ' || ch == '\n') // cast avoids undefined behaviour for negative char values
{
std::cout << ch;
}
std::cin.get(ch);
}
return 0;
}
However, you can avoid std::cin.eof() in the while condition by testing the result of get directly, as follows...
#include <iostream>
#include <cctype>
int main()
{
char ch;
while (std::cin.get(ch))
{
if (std::isdigit(static_cast<unsigned char>(ch)) || ch == ' ' || ch == '\n') // cast avoids undefined behaviour for negative char values
{
std::cout << ch;
}
}
return 0;
}
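To try the character-by-character filter without typing at the console, the loop can be written against generic streams. This is a sketch under that assumption; the name keepDigitsAndSpaces and the string-based wrapper are made up for illustration.

```cpp
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copies digits, spaces and newlines from in to out; everything else is dropped.
void keepDigitsAndSpaces(std::istream& in, std::ostream& out)
{
    char ch;
    while (in.get(ch))   // the stream converts to false once get fails
    {
        if (std::isdigit(static_cast<unsigned char>(ch)) || ch == ' ' || ch == '\n')
            out << ch;
    }
}

// Hypothetical convenience wrapper for running the filter on a string.
std::string keepDigitsAndSpaces(const std::string& text)
{
    std::istringstream in(text);
    std::ostringstream out;
    keepDigitsAndSpaces(in, out);
    return out.str();
}
```

Passing std::cin and std::cout to the two-argument version gives the same behaviour as the program above.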

Regular expression pattern matching can be used to find all the digits in the input string.
Here is an example program to find the digits:
// C++ program to find all digits in a string
#include <iostream>
#include <regex>
#include <string>
using namespace std;
int main() {
string inputString;
cout << "Enter the input string: ";
getline(cin, inputString);
cout << "Digits found: ";
// Define the regular expression matcher and pattern
smatch matcher;
regex pattern("[[:digit:]]");
while (regex_search(inputString, matcher, pattern)) {
// Show the match
cout << matcher.str(0);
// Continue searching the rest of the string
inputString = matcher.suffix().str();
}
return 0;
}
Output:
Enter the input string: sdfh354 eutyt;ljkn756897490uiotureu 587689jkgf 90
Digits found: 35475689749058768990
Here is another approach to finding the numbers in the string, without regular expression pattern matching:
#include <cctype>
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main() {
string rawInput;
cout <<"Enter input string: ";
getline(cin, rawInput);
// Get all words from the input string
stringstream allWords(rawInput);
// Find and print digits in each word
string word;
while(allWords >> word) {
for(int i = 0; word[i]; i++) {
// Print only the numbers in the word
if(isdigit(word[i])) {
cout<<word[i];
}
}
cout<<" ";
}
cout<<"\n";
return 0;
}
Output:
Enter input string: ghjg45 jsdfj 897897 343yut45 90
45 897897 34345 90
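If consecutive digits should be kept together as one number instead of being printed one digit at a time, the pattern can match runs of digits and std::sregex_iterator can walk over all matches. This sketch uses a hypothetical function name, findDigitRuns. Be aware that it behaves differently from the programs above: digits separated by letters inside one word (as in 5a3) become separate runs.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every maximal run of decimal digits in text.
std::vector<std::string> findDigitRuns(const std::string& text)
{
    static const std::regex digits("[0-9]+");
    std::vector<std::string> runs;
    // A default-constructed sregex_iterator marks the end of the matches
    for (std::sregex_iterator it(text.begin(), text.end(), digits), end; it != end; ++it)
        runs.push_back(it->str());
    return runs;
}
```

For the first sample input above, sdfh354 eutyt;ljkn756897490uiotureu 587689jkgf 90, this returns 354, 756897490, 587689 and 90.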

How can I extract all numbers?
When you KNOW that the input numbers are all hex values ... (and how many)
stringstream ss ("5a3 1f a0aaaa f1fg3");
for (int i=0; i<4; ++i)
{
int k;
ss >> hex >> k;
cout << k << endl;
}
with output
1443
31
10529450
3871
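When the number of values is not known, the extraction itself can serve as the loop condition. The following is a sketch with a hypothetical function name, readHexValues. It assumes every value fits into an int, and it stops at the first character that is not a hex digit or whitespace (the g in f1fg3), so anything after that point is not read.

```cpp
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated hex values until extraction fails.
std::vector<int> readHexValues(const std::string& text)
{
    std::istringstream ss(text);
    ss >> std::hex;              // all following integer extractions use base 16
    std::vector<int> values;
    int k;
    while (ss >> k)              // false at end of input or at a non-hex character
        values.push_back(k);
    return values;
}
```

For the example string this gives the same four values as the fixed-count loop: 1443, 31, 10529450 and 3871.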

Related

How do I parse this file in cpp?

I want to parse a file with the following content:
2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13
Here, 2 is the number of lines, 300 is the weight.
Each line has a string and a number(price). Same with 3 and 400.
I need to store 130, 456 in an array.
Currently, I am reading the file and each line is processed as std::string. I need help to progress further.
Code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
//void processString(string line);
void process2(string line);
int main(int argc, char ** argv) {
cout << "You have entered " << argc <<
" arguments:" << "\n";
for (int i = 1; i < argc; ++i)
cout << argv[i] << "\n";
//2, 4 are the file names
//Reading file - market price file
string line;
ifstream myfile(argv[2]);
if (myfile.is_open()) {
while (getline(myfile, line)) {
// cout << line << '\n';
}
myfile.close();
} else cout << "Unable to open market price file";
//Reading file - price list file
string line_;
ifstream myfile2(argv[4]);
int c = 1;
if (myfile2.is_open()) {
while (getline(myfile2, line_)) {
// processString(line_);
process2(line_);
}
myfile2.close();
} else cout << "Unable to open price lists file";
//processString(line_);
return 0;
}
void process2(string line) {
string word = "";
for (auto x: line) {
if (x == ' ') {
word += " ";
} else {
word = word + x;
}
}
cout << word << endl;
}
Is there a split function like in Java, so I can split and store everything as tokens?
You have 2 questions in your post:
How do I parse this file in cpp?
Is there a split function like in Java, so I can split and store everything as tokens?
I will answer both questions and show a demo example.
Let's start with splitting a string into tokens. There are several possibilities. We start with the easy ones.
Since the tokens in your string are delimited by whitespace, we can take advantage of the extractor operator (>>). It reads data from an input stream up to the next whitespace and converts what it has read into the type of the given variable. As you know, this operation can be chained.
Then for the example string
const std::string line{ "Token1 Token2 Token3 Token4" };
you can simply put that into a std::istringstream and then extract the variables from the stream:
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
The disadvantage is that you need to write a lot of code and you have to know the number of elements in the string.
We can overcome this problem by using a vector as the target data store and filling it with its range constructor. The vector's range constructor takes a begin and an end iterator and copies the data in between.
As the iterator we use std::istream_iterator. In simple terms, it calls the extractor operator (>>) until all data is consumed, however many items there are.
This will then look like the below:
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
This may look complicated, but it is not. We define a variable "token" of type std::vector and use its range constructor.
We can define the std::vector without a template argument, because the compiler can deduce it from the given constructor arguments. This feature is called CTAD ("class template argument deduction", C++17 required).
Additionally, you can see that I do not spell out the end iterator explicitly.
It is constructed from the empty braced initializer {} with the correct type, because the std::vector range constructor requires both arguments to have the same type, so the type is deduced from the first argument.
There is an additional solution. It is the most powerful one and hence maybe a little bit too complicated in the beginning.
With it we can avoid the std::istringstream and directly convert the string into tokens using std::sregex_token_iterator. It is very simple to use, and the result is a one-liner for splitting the original string:
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
So, modern C++ has built-in functionality that is designed exactly for the purpose of tokenizing strings. It is called std::sregex_token_iterator. What is this thing?
As its name says, it is an iterator. It iterates over a string (hence the 's' in its name) and returns the split-up tokens. Either the tokens themselves are matched against a regular expression, or, alternatively, the delimiter is matched and everything in between is treated as a token and returned. This is controlled via the last argument of its constructor.
Let's have a look at this constructor:
token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
The first parameter is where it should start in the source string, and the second parameter is the end position up to which the iterator should work. The third parameter is the regex itself, and the last parameter selects what is returned:
0, if you want the text that matches the regex (1, 2, ... select the corresponding capture group instead)
-1, if you want everything that does not match the regex, that is, the text between the matches
Please read up on regexes on the net. There are tons of pages available.
Please see a demo for all 3 solutions here:
#include <iostream>
#include <string>
#include <vector>
#include <regex>
#include <sstream>
#include <iterator>
#include <algorithm>
/// Split string into tokens
int main() {
// White space separated tokens in a string
const std::string line{ "Token1 Token2 Token3 Token4" };
// Solution 1: Use extractor operator ----------------------------------
// Here, we will store the result
std::string subString1{}, subString2{}, subString3{}, subString4{};
// Put the line into an istringstream for easier extraction
std::istringstream iss1(line);
iss1 >> subString1 >> subString2 >> subString3 >> subString4;
// Show result
std::cout << "\nSolution 1: Use extractor operator\n- Data: -\n" << subString1 << "\n"
<< subString2 << "\n" << subString3 << "\n" << subString4 << "\n";
// Solution 2: Use istream_iterator ----------------------------------
std::istringstream iss2(line);
std::vector token(std::istream_iterator<std::string>(iss2), {});
// Show result
std::cout << "\nSolution 2: Use istream_iterator\n- Data: -\n";
std::copy(token.begin(), token.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
// Solution 3: Use std::sregex_token_iterator ----------------------------------
const std::regex re(" ");
std::vector<std::string> token2(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});
// Show result
std::cout << "\nSolution 3: Use sregex_token_iterator\n- Data: -\n";
std::copy(token2.begin(), token2.end(), std::ostream_iterator<std::string>(std::cout, "\n"));
return 0;
}
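The effect of the last constructor argument is easiest to see when the same helper is called with both values. This is a small sketch; the function name tokenize and the sample strings are made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects what std::sregex_token_iterator yields.
// submatch 0 returns the matches themselves, -1 returns the text between matches.
std::vector<std::string> tokenize(const std::string& text, const std::regex& re, int submatch)
{
    return std::vector<std::string>(
        std::sregex_token_iterator(text.begin(), text.end(), re, submatch),
        std::sregex_token_iterator());   // default-constructed: end iterator
}
```

With the text a1 b22 c333, the regex [0-9]+ and submatch 0 give 1, 22 and 333, while a single-space regex and submatch -1 give a1, b22 and c333.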
So, now the answer on how you could read your text file.
It is essential to create the correct data structures. Then, overload the inserter and extractor operators and put the above functionality in them.
Please see the below demo example. Of course there are many other possible solutions:
#include <string>
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <iterator>
struct ItemAndPrice {
// Data
std::string item{};
unsigned int price{};
// Extractor
friend std::istream& operator >> (std::istream& is, ItemAndPrice& iap) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
// Read the item and price from that line and check, if that worked
if (std::istringstream iss(line); !(iss >> iap.item >> iap.price))
// There was an error while reading item and price. Set failbit of the input stream
// (setstate changes the stream state; setf would only change formatting flags)
is.setstate(std::ios::failbit);
}
return is;
}
// Inserter
friend std::ostream& operator << (std::ostream& os, const ItemAndPrice& iap) {
// Simple output of our internal data
return os << iap.item << " " << iap.price;
}
};
struct MarketPrice {
// Data
std::vector<ItemAndPrice> marketPriceData{};
size_t numberOfElements() const { return marketPriceData.size(); }
unsigned int weight{};
// Extractor
friend std::istream& operator >> (std::istream& is, MarketPrice& mp) {
// Read a complete line from the stream and check, if that worked
if (std::string line{}; std::getline(is, line)) {
size_t numberOfEntries{};
// Read the number of following entries and the weight from that line and check, if that worked
if (std::istringstream iss(line); (iss >> numberOfEntries >> mp.weight)) {
mp.marketPriceData.clear();
// Now copy the numberOfEntries next lines into our vector
std::copy_n(std::istream_iterator<ItemAndPrice>(is), numberOfEntries, std::back_inserter(mp.marketPriceData));
}
else {
// There was an error while reading the number of following entries and the weight. Set failbit of the input stream
// (setstate changes the stream state; setf would only change formatting flags)
is.setstate(std::ios::failbit);
}
}
return is;
};
// Inserter
friend std::ostream& operator << (std::ostream& os, const MarketPrice& mp) {
// Simple output of our internal data
os << "\nNumber of Elements: " << mp.numberOfElements() << " Weight: " << mp.weight << "\n";
// Now copy all market price data to output stream
if (os) std::copy(mp.marketPriceData.begin(), mp.marketPriceData.end(), std::ostream_iterator<ItemAndPrice>(os, "\n"));
return os;
}
};
// For this example I do not use argv, argc, or file streams,
// because on Stack Overflow I do not have access to your files.
// So, I put the file data in an istringstream. For the example below,
// it makes no difference whether we read from a file stream or a string stream
std::istringstream sourceFile{R"(2 300
abc12 130
bcd22 456
3 400
abfg12 230
bcpd22 46
abfrg2 13)"};
int main() {
// Here we will store all the resulting data
// So, read the complete source file, parse the data and store result in vector
std::vector mp(std::istream_iterator<MarketPrice>(sourceFile), {});
// Now, all data are in mp. You may work with that now
// Show result on display
std::copy(mp.begin(), mp.end(), std::ostream_iterator<MarketPrice>(std::cout, "\n"));
return 0;
}

Splitting sentences and placing in vector

I was given a code from my professor that takes multiple lines of input. I am currently changing the code for our current assignment and I came across an issue. The code is meant to take strings of input and separate them into sentences from periods and put those strings into a vector.
vector<string> words;
string getInput() {
string s = ""; // string to return
bool cont = true; // loop control.. continue is true
while (cont){ // while continue
string l; // string to hold a line
cin >> l; // get line
char lastChar = l.at(l.size()-1);
if(lastChar=='.') {
l = l.substr(0, l.size()-1);
if(l.size()>0){
words.push_back(s);
s = "";
}
}
if (lastChar==';') { // use ';' to stop input
l = l.substr(0, l.size()-1);
if (l.size()>0)
s = s + " " + l;
cont = false; // set loop control to stop
}
else
s = s + " " + l; // add line to string to return
// add a blank space to prevent
// making a new word from last
// word in string and first word
// in line
}
return s;
}
int main()
{
cout << "Input something: ";
string s = getInput();
cout << "Your input: " << s << "\n" << endl;
for(int i=0; i<words.size(); i++){
cout << words[i] << "\n";
}
}
The code puts strings into a vector but takes the last word of the sentence and attaches it to the next string and I cannot seem to understand why.
This line
s = s + " " + l;
will always execute, except for the end of input, even if the last character is '.'. You are most likely missing an else between the two if-s.
You have:
string l; // string to hold a line
cin >> l; // get line
The last line does not read a line: cin >> l reads a single whitespace-delimited word, so it only yields a whole line when that line contains no whitespace. To read a line of text, use:
std::getline(std::cin, l);
It's hard to tell whether that is what is tripping your code up, since you haven't posted any sample input.
I would at least consider doing this job somewhat differently. Right now, you're reading a word at a time, then putting the words back together until you get to a period.
One possible alternative would be to use std::getline to read input until you get to a period, and put the whole string into the vector at once. Code to do the job this way could look something like this:
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
#include <iterator>
int main() {
std::vector<std::string> s;
std::string temp;
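while (std::getline(std::cin, temp, '.'))
// Skip whitespace-only pieces (e.g. the newline after the last period);
// otherwise substr below would be called with npos and throw std::out_of_range
if (temp.find_first_not_of(" \t\n") != std::string::npos)
s.push_back(temp);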
std::transform(s.begin(), s.end(),
std::ostream_iterator<std::string>(std::cout, ".\n"),
[](std::string const &s) { return s.substr(s.find_first_not_of(" \t\n")); });
}
This does behave differently in one circumstance--if you have a period somewhere other than at the end of a word, the original code will ignore that period (won't treat it as the end of a sentence) but this will. The obvious place this would make a difference would be if the input contained a number with a decimal point (e.g., 1.234), which this would break at the decimal point, so it would treat the 1 as the end of one sentence, and the 234 as the beginning of another. If, however, you don't need to deal with that type of input, this can simplify the code considerably.
If the sentences might contain decimal points, then I'd probably write the code more like this:
#include <iostream>
#include <string>
#include <algorithm>
#include <vector>
#include <iterator>
class sentence {
std::string data;
public:
friend std::istream &operator>>(std::istream &is, sentence &s) {
std::string temp, word;
while (is >> word) {
temp += word + ' ';
if (word.back() == '.')
break;
}
s.data = temp;
return is;
}
operator std::string() const { return data; }
};
int main() {
std::copy(std::istream_iterator<sentence>(std::cin),
std::istream_iterator<sentence>(),
std::ostream_iterator<std::string>(std::cout, "\n"));
}
Although somewhat longer and more complex, at least to me it still seems (considerably) simpler than the code in the question. I guess it's different in one way--it detects the end of the input by...detecting the end of the input, rather than depending on the input to contain a special delimiter to mark the end of the input. If you're running it interactively, you'll typically need to use a special key combination to signal the end of input (e.g., Ctrl+D on Linux/Unix, or F6 on Windows).
In any case, it's probably worth considering a fundamental difference between this code and the code in the question: this defines a sentence as a type, where the original code just leaves everything as strings, and manipulates strings. This defines an operator>> for a sentence, that reads a sentence from a stream as we want it read. This gives us a type we can manipulate as an object. Since it's like a string in other ways, we provide a conversion to string so once you're done reading one from a stream, you can just treat it as a string. Having done that, we can (for example) use a standard algorithm to read sentences from standard input, and write them to standard output, with a new-line after each to separate them.

How to change each word in a string vector to upper case

I want to read a sequence of words and store them in a vector, then change each word in the vector to uppercase and print the output eight words to a line. I think my code is either slow or running infinitely, as I can't seem to get any output.
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main() {
string word;
vector<string> text;
while (getline(cin, word)) {
text.push_back(word);
}
for (auto index = text.begin(); index != text.end(); ++index) {
for ( auto it = word.begin(); it != word.end(); ++it)
*it = toupper(*it);
/*cout<< index << " " << endl;*/
}
for (decltype(text.size()) i = 0; i != 8; i++)
cout << text[i] << endl;
return 0;
}
At least as far as I can tell, the idea here is to ignore the existing line structure and write out 8 words per line, regardless of line breaks in the input data. Assuming that's correct, I'd start by just reading words from the input, paying no attention to the existing line breaks.
From there, it's a matter of capitalizing the words, writing them out, and (if you're at a multiple of 8) a new-line.
I would also use standard algorithms for most of the work, instead of writing my own loops to do the parts such as reading and writing the data. Since the pattern is basically just reading a word, modifying it, then writing out the result, it fits nicely with the std::transform algorithm.
Code to do that could look something like this:
#include <string>
#include <iostream>
#include <algorithm>
#include <iterator> // std::istream_iterator, std::ostream_iterator
#include <cctype> // toupper
std::string to_upper(std::string in) {
for (auto &ch : in)
ch = toupper((unsigned char) ch);
return in;
}
int main() {
int count = 0;
std::transform(
std::istream_iterator<std::string>(std::cin),
std::istream_iterator<std::string>(),
std::ostream_iterator<std::string>(std::cout),
[&](std::string const &in) {
char sep = (++count % 8 == 0) ? '\n' : ' ';
return to_upper(in) + sep;
});
}
We could implement capitalizing each string as a second lambda, nested inside the first, but IMO, that starts to become a bit unreadable. Likewise, we could use std::transform to implement the upper-case transformation inside of to_upper.
I'll rewrite my answer here:
Your outer for loop defines index to cycle through text, but you never use index inside it. The inner loop uses word, but word is still the last one the user entered. You should change the inner loop so that it uses index instead of word, like this:
for ( auto it = index->begin(); it != index->end(); ++it)
This is effectively an infinite loop:
while (getline(cin, word)) {
text.push_back(word);
}
getline(cin, word) reads a line (ending in '\n') from stdin, and puts it into word. It then returns cin itself (which will evaluate to true if the read was successful). You seem to be using it to get a space-delimited word, rather than a whole line, but that's not what it does. Since you put it in the condition of the while, after you enter a line, it will wait for another line.
This loop only breaks when getline fails. For example, by hitting an End of File character. I expect you're using the console and pressing Enter. In that case, you are never causing getline to fail. (If you're feeding a file into stdin, it should work.)
The typical solution to this is to have some way of indicating a stop (such as "Enter an empty line to stop" or "Write \"STOP\" to stop"), and then checking for that before inserting the line into the vector. For you, the solution is to read in a SINGLE line and then break it up into words (for example, using the sstream library).
You can detect whether the program is doing actual work (rather than waiting for more input) by viewing your CPU use. In Windows, this is CTRL+SHIFT+ESC -> Performance, and -> Processes to see your program in particular. You will find that the program isn't actually using the CPU (because it's waiting for more input).
You should try inserting print statements into your program to figure out how far it gets. You will find it never gets past the while loop.
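The suggestion to read a single line and break it up with the sstream library could look roughly like the following. It is a sketch with made-up function names (upperWords, eightPerLine), not the asker's program; the caller would obtain the line with one std::getline call and print the string returned by eightPerLine.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits one line into words and upper-cases each word.
std::vector<std::string> upperWords(const std::string& line)
{
    std::istringstream iss(line);
    std::vector<std::string> words;
    std::string word;
    while (iss >> word)   // ends when the line is exhausted, no EOF key needed
    {
        for (char& ch : word)
            ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
        words.push_back(word);
    }
    return words;
}

// Hypothetical helper: joins the words with spaces, eight words per output line.
std::string eightPerLine(const std::vector<std::string>& words)
{
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i)
    {
        out += words[i];
        out += ((i + 1) % 8 == 0 || i + 1 == words.size()) ? '\n' : ' ';
    }
    return out;
}
```

For the line a b c d e f g h i, eightPerLine(upperWords(line)) produces A B C D E F G H on the first line and I on the second.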
Short Answer
for (string &str : vec)
{
transform(str.begin(), str.end(), str.begin(), [](unsigned char c) { return std::toupper(c); });
}
Complete working code as example:
#include <iostream>
#include <string>
#include <vector>
#include <cctype>
#include <algorithm>
using namespace std;
int main()
{
vector<string> vec;
string str;
while (cin >> str)
{
vec.push_back(str);
}
for (string &str : vec)
{
transform(str.begin(), str.end(), str.begin(), [](unsigned char c)
{ return toupper(c); });
}
for (auto str : vec)
{
cout << str << endl;
}
return 0;
}

When parsing a string using a string stream, it extracts a new line character

Description of the program : The program must read in a variable amount of words until a sentinel value is specified ("#" in this case). It stores the words in a vector array.
Problem : I use a getline to read in the string and parse the string with a stringstream. My problem is that the stringstream is not swallowing the new line character at the end of each line and is instead extracting it.
Some solutions I have thought of are to cut off the last character by taking a substring, or to check whether the next extracted word is a newline character, but I feel there is a better, more efficient solution, such as changing the conditions of my loops.
I have included a minimized version of the overall code that reproduces the problem.
#include <iostream>
#include <sstream>
#include <string>
using namespace std;
int main()
{
const int MAX_LIST_SIZE = 1000;
string str;
string list[MAX_LIST_SIZE];
int numWords = 0;
// program starts here
getline(cin, str); // read input
stringstream parse(str); // use stringstream to parse input
while(str != "#") // read in until sentinel value
{
while(!parse.fail()) // until all words are extracted from the line
{
parse >> list[numWords]; // store words
numWords++;
}
getline(cin,str); // get next line
parse.clear();
parse.str(str);
}
// print number of words
cout << "Number of words : " << numWords << endl;
}
And a set of test input data that will produce the problem
Input:
apples oranges mangos
bananas
pineapples strawberries
Output:
Number of words : 9
Expected Output:
Number of words : 6
I would appreciate any suggestions on how to deal with this problem in an efficient manner.
Your logic for parsing the stream isn't quite correct. fail() only becomes true after a >> operation fails, so you do one extra increment for each line. For example:
while(!parse.fail())
{
parse >> list[numWords]; // fails
numWords++; // increment numWords anyway
} // THEN check !fail(), but we incremented already!
All of these operations have return values that you should check as you go to avoid this problem:
while (getline(cin, str) && str != "#") { // stops at the sentinel, or if no more lines in cin
stringstream parse(str);
while (parse >> list[numWords]) { // fails if no more words
++numWords; // *only* increment if we got one!
}
}
Even better would be to not use an array at all for the list of words:
std::vector<std::string> words;
Which can be used in the inner loop:
std::string temp;
while (parse >> temp) {
words.push_back(temp);
}
The increment of numWords happens one more time than you intend at the end of each line. Use a std::vector<std::string> for your list; then you can use list.size() instead of counting yourself.
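Putting the checked reads, the sentinel and the vector together could look like this. It is a sketch: the function name readWordsUntilSentinel is made up, and it takes the stream as a parameter so that it can be tried with a std::istringstream as well as with std::cin.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects all words of all lines up to the sentinel line.
std::vector<std::string> readWordsUntilSentinel(std::istream& in,
                                                const std::string& sentinel = "#")
{
    std::vector<std::string> words;
    std::string line;
    while (std::getline(in, line) && line != sentinel)   // stop at "#" or end of input
    {
        std::istringstream parse(line);
        std::string word;
        while (parse >> word)          // only true if a word was actually extracted
            words.push_back(word);
    }
    return words;
}
```

With the three input lines from the question followed by a line containing only #, the returned vector has size 6.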

Converting characters in strings to uppercase not working

I have this C++ code(Which I will explain below):
#include <iostream>
#include <string>
#include <vector>
#include <cctype>
using namespace std;
int main()
{
// part 1
cout << "Write many words separated by either newlines or spaces:"<< endl;
string word;
vector<string> v;
while(cin >> word){
if(word == "quit"){
break;
}
else{
v.push_back(word);
}
}
//part 2
for(string x:v){
for(char &j:x){
j = toupper(j);
}
}
//part 3
for(string x:v){
cout << x << endl;
}
return 0;
}
What I am trying to do is get a sequence of strings and convert each character in the string to uppercase and output the strings back.
I want to use vectors for this as I am studying it.
In the part 1, I get strings from the standard input and store them in a string vector. I write "quit" to break out of the loop and begin capitalising the letters in each string.
The problem is with part 2, obviously. What I am trying to do there is this:
1- Get a string as we loop.
2- Once we have a string, get a character in that string and transform it into uppercase. Do this for all the characters.
3- Do this for all the strings.
When I compile it, I get all correct except the strings being capitalised.
I am really confused D:
for(string x:v){
for(char &j:x){
j = toupper(j);
}
}
You take every character out of the string by reference, but you take the string by value. Try
for (string& x : v){
// […]
}
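Note that a terser range-based for loop, for (x : v), which would have bound x by reference automatically, was proposed for C++1z but was not adopted into the standard. Spelling out the reference, optionally with auto, remains the way to do it:
for (auto& x : v) { // binds each element by reference
// […]
}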
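The difference between the two loop headers can be checked with a pair of small functions. Both names (upperCopiesOnly, upperAll) are made up for this sketch; the first reproduces the loop from the question, the second uses a reference.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Loop from the question: x is a copy, so the vector is left unchanged.
void upperCopiesOnly(std::vector<std::string>& v)
{
    for (std::string x : v)
        for (char& j : x)
            j = static_cast<char>(std::toupper(static_cast<unsigned char>(j)));
}

// Fixed loop: x refers to the element stored in the vector.
void upperAll(std::vector<std::string>& v)
{
    for (std::string& x : v)
        for (char& j : x)
            j = static_cast<char>(std::toupper(static_cast<unsigned char>(j)));
}
```

After upperCopiesOnly(v) a vector holding ab and cd still holds ab and cd; after upperAll(v) it holds AB and CD.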