Ignore spaces in vector C++ - c++

I'm trying to split a string into individual words and store them in a vector in C++. I would like to know how to ignore the extra spaces if the user puts more than one space between words in the string.
How would I do that?
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
int main(){
cout<<"Sentence: ";
string sentence;
getline(cin,sentence);
vector<string> my;
int start=0;
unsigned int end=sentence.size();
unsigned int temp=0;
while(temp<end){
int te=sentence.find(" ",start);
temp=te;
my.push_back(sentence.substr(start, temp-start));
start=temp+1;
}
unsigned int i;
for(i=0 ; i<my.size() ; i++){
cout<<my[i]<<endl;
}
return 0;
}

Four things:
When reading input from a stream into a string using the overloaded >> operator, it automatically separates on whitespace. I.e. it reads "words".
There exists an input stream that uses a string as the input, std::istringstream.
You can use iterators with streams, like e.g. std::istream_iterator.
std::vector has a constructor taking a pair of iterators.
That means your code could simply be
std::string line;
std::getline(std::cin, line);
std::istringstream istr(line);
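// Brace-initialized iterators avoid the "most vexing parse": with plain
// parentheses this line would declare a function named words instead.
std::vector<std::string> words(std::istream_iterator<std::string>{istr},
std::istream_iterator<std::string>{});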
After this, the vector words will contain all the "words" from the input line.
You can easily print the "words" using std::ostream_iterator and std::copy:
std::copy(begin(words), end(words),
std::ostream_iterator<std::string>(std::cout, "\n"));

The easiest way is to use a std::istringstream as follows:
std::string sentence;
std::getline(std::cin,sentence);
std::istringstream iss(sentence);
std::vector<std::string> my;
std::string word;
while(iss >> word) {
my.push_back(word);
}
Any whitespace will be skipped automatically.

You can create the vector directly using std::istream_iterator, which skips whitespace:
#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <iterator>
int main() {
std::string str = "Hello World Lorem Ipsum The Quick Brown Fox";
std::istringstream iss(str);
std::vector<std::string> vec {std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>() };
for (const auto& el : vec) {
std::cout << el << '\n';
}
}

Here is a function which divides a given sentence into words.
#include <string>
#include <vector>
#include <sstream>
#include <utility>
std::vector<std::string> divideSentence(const std::string& sentence) {
std::stringstream stream(sentence);
std::vector<std::string> words;
std::string word;
while(stream >> word) {
words.push_back(std::move(word));
}
return words;
}
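For comparison, the manual find/substr loop from the question can also be made to skip repeated spaces. The following is only a sketch: the function name splitOnSpaces is made up for illustration, and it treats only the space character as a separator (not tabs or newlines).

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): splits a sentence on
// runs of one or more spaces and never stores empty words.
std::vector<std::string> splitOnSpaces(const std::string& sentence) {
    std::vector<std::string> words;
    // Jump to the first character that is not a space.
    std::string::size_type start = sentence.find_first_not_of(' ');
    while (start != std::string::npos) {
        // The word ends at the next space, or at the end of the string.
        std::string::size_type end = sentence.find(' ', start);
        // If end is npos, substr simply takes the rest of the string.
        words.push_back(sentence.substr(start, end - start));
        // Skip the whole run of spaces before the next word.
        start = sentence.find_first_not_of(' ', end);
    }
    return words;
}
```

Calling splitOnSpaces on a string with leading, trailing, or repeated spaces yields only the non-empty words, because find_first_not_of skips each run of spaces in a single step.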

Reducing double, triple etc. spaces in string is a problem you'll encounter again and again. I've always used the following very simple algorithm:
Pseudocode:
while "  " in string:          # two consecutive spaces
    string = string.replace("  ", " ")   # replace two spaces with one
After the while loop, you know your string only has single spaces since multiple consecutive spaces were compressed to singles.
Most languages allow you to search for a substring in a string and most languages have the ability to run string.replace() so it's a useful trick.
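The same idea can be written in C++ with std::string::find and std::string::replace. This is a minimal sketch; the function name collapseSpaces is an assumption made for illustration.

```cpp
#include <string>

// Hypothetical helper (not from the original post): collapses every run of
// consecutive spaces into a single space.
std::string collapseSpaces(std::string text) {
    std::string::size_type pos = 0;
    // Look for the two-character sequence "  " starting at pos.
    while ((pos = text.find("  ", pos)) != std::string::npos) {
        // Replace the two spaces at pos with one. pos is not advanced, so a
        // run of three or more spaces keeps shrinking until one remains.
        text.replace(pos, 2, " ");
    }
    return text;
}
```

A leading or trailing run of spaces is reduced to one space, not removed, so trimming is still a separate step if it is needed.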

Related

Reading all the words in a text file in C++

I have a large .txt file and I want to read all of the words inside it and print them on the screen. The first thing I did was to use std::getline() in this way:
std::vector<std::string> words;
std::string line;
while(std::getline(std::cin,line)){
words.push_back(line);
}
and then I printed out all the words present in the vector words. The .txt file is passed from command line as ./a.out < myTxt.txt.
The problem is that each component of the vector is a whole line, and so I am not reading each word.
The problem, I guess, is the spaces between words: how can I tell the code to ignore them? More specifically, is there any function that I can use in order to read each word from a .txt file?
UPDATE:
I'm trying to skip punctuation such as commas and periods, but also ? ! ( ). I used find_first_of(), but my program doesn't work. Also, I don't know how to specify which characters I don't want to be read, i.e. ., ?, !, and so on.
std::vector<std::string> my_vec;
std::string line;
while(std::cin>>line){
std::size_t pos = line.find_first_of("!");
std::string line = line.substr(pos);
my_vec.push_back(line);
}
The '>>' operator for std::string does exactly what you need.
std::vector<std::string> words;
std::string line;
while (std::cin >> line) {
words.push_back(line);
}
If you need to remove some noisy characters, e.g. ',' or '.', you can replace them with a space character first.
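#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <cctype>
int main() {
std::vector<std::string> words;
std::string line;
while (getline(std::cin, line)) {
// Replace every non-alphanumeric character with a space.
// Taking unsigned char avoids undefined behavior in std::isalnum for negative char values.
std::transform(line.begin(), line.end(), line.begin(),
[](unsigned char c) -> char { return std::isalnum(c) ? static_cast<char>(c) : ' '; });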
std::stringstream linestream(line);
std::string w;
while (linestream >> w) {
std::cout << w << "\n";
words.push_back(w);
}
}
}
The getline function, as it sounds, only returns a whole line. You can split each line on spaces after reading it, or you can read word by word using operator>>:
string word;
while (cin >> word){
cout << word << "\n";
words.push_back(word);
}
Use operator>> instead of std::getline(). The operator will read individual whitespace-separated substrings for you.
#include <iostream>
#include <string>
#include <vector>
std::vector<std::string> my_vec;
std::string s;
while (std::cin >> s){
// use s as needed...
}
However, you may still end up receiving strings that contain punctuation without any surrounding whitespace, such as hello,world, so you will have to split those strings manually as needed, for example:
#include <iostream>
#include <string>
#include <vector>
#include <cctype>
std::vector<std::string> my_vec;
std::string s;
while (std::cin >> s){
std::string::size_type start = 0, pos;
while ((pos = s.find_first_of(".,?!()", start)) != std::string::npos){
if (pos > start) // skip empty pieces, e.g. when the string starts with punctuation
my_vec.push_back(s.substr(start, pos-start));
start = s.find_first_not_of(".,?!() \t\f\r\n\v", pos+1);
}
if (start == 0)
my_vec.push_back(s);
else if (start != std::string::npos)
my_vec.push_back(s.substr(start));
}

Reading csv file to vector of doubles

I am trying to read this csv file, let's call it "file.csv", and I'm trying to put it into a vector of double.
This csv contains numbers like:
755673.8431514322,
684085.6737614165,
76023.8121728658,
...
I tried using stringstream, and it successfully reads these numbers into the vector, but the values are not what I expected. Instead, the numbers I get are
7556373, 684085, 76023.8
How can I read the whole digits without throwing any of it away?
This is my code
vector<long double> mainVector;
int main()
{
ifstream data;
data.open("file.csv");
while (data.good())
{
string line;
stringstream s;
long double db;
getline(data, line, ',');
s << line;
s >> db;
mainVector.push_back(db);
}
}
How to read the whole digits without throwing any of it.
As @user4581301 mentioned in the comments, I guess you are missing std::setprecision() when outputting.
However, you do not need std::stringstream to do the job. Convert line (which is a string) directly to long double using std::stold and place it into the vector as follows.
That being said, using std::stold also guards against bad input reaching the vector: it throws a std::invalid_argument exception if the conversion from string to long double fails. (Credits to @user4581301)
#include <iostream>
#include <fstream>
#include <vector> // std::vector
#include <string> // std::stold
#include <iomanip> // std::setprecision
int main()
{
std::vector<long double> mainVector;
std::ifstream data("file.csv");
if(data.is_open())
{
std::string line;
while(std::getline(data, line, ','))
mainVector.emplace_back(std::stold(line));
}
for(const auto ele: mainVector)
std::cout << std::setprecision(16) << ele << std::endl;
// ^^^^^^^^^^^^^^^^^^^^
return 0;
}
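One caveat, sketched below under some assumptions: if the file ends with a trailing comma followed by a newline (as the sample data suggests), the last token contains only whitespace and std::stold throws std::invalid_argument on it. The helper names parseNumbers and formatWithPrecision are made up for illustration. The sketch skips blank tokens, ignores unparsable ones, and shows that the precision setting only affects printing, not the stored value.

```cpp
#include <iomanip>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: reads comma-separated numbers from any input stream.
// Blank tokens (e.g. the newline after a trailing comma) are skipped, and
// tokens that std::stold cannot convert are ignored.
std::vector<long double> parseNumbers(std::istream& in) {
    std::vector<long double> values;
    std::string token;
    while (std::getline(in, token, ',')) {
        // Skip tokens that contain nothing but whitespace.
        if (token.find_first_not_of(" \t\r\n") == std::string::npos)
            continue;
        try {
            // std::stold skips leading whitespace such as the '\n' left over
            // from the previous line.
            values.push_back(std::stold(token));
        } catch (const std::invalid_argument&) {
            // Not a number: ignored in this sketch.
        } catch (const std::out_of_range&) {
            // Too large for long double: ignored in this sketch.
        }
    }
    return values;
}

// Hypothetical helper: formats a value with the given number of significant
// digits, the same way std::cout << std::setprecision(digits) would.
std::string formatWithPrecision(long double value, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}
```

With the default precision of 6 significant digits, 76023.8121728658 is printed as 76023.8, which matches the kind of truncated output described in the question. Passing a larger digit count to formatWithPrecision prints more of the stored value.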

Split string according to character-combination/ at `\n`

What is the right way to split a string like below at a specific character-combination into a string vector?
string myString = "This is \n a test. Let's go on. \n Yeah.";
split at "\n" to get this result:
vector<string> myVector = {
"This is ",
" a test. Let's go on. ",
" Yeah."
}
I was using boost algorithm library but now I'd like to achieve this all without using an external library like boost.
#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
std::vector<std::string> result;
boost::split(result, "This is \n a test. Let's go on. \n Yeah.",
boost::is_any_of("\n"), boost::token_compress_on);
How about something like this:
#include <iostream>
#include <sstream>
#include <vector>
#include <string>
#include <iterator>
class line : public std::string {};
std::istream &operator>>(std::istream &iss, line &line)
{
std::getline(iss, line, '\n');
return iss;
}
int main()
{
std::istringstream iss("This is \n a test. Let's go on. \n Yeah.");
std::vector<std::string> v(std::istream_iterator<line>{iss}, std::istream_iterator<line>{});
// test
for (auto const &s : v)
std::cout << s << std::endl;
return 0;
}
Basically, make a new string type, line, and use a stream iterator to read whole lines straight into the vector's range constructor.
Working demo: https://ideone.com/4qdfY2
Solution 1: Remove "\n" from the string.
To just remove "\n", you can use the erase-remove idiom.
#include <iostream>
#include <string>
#include <algorithm>
int main()
{
std::string myString = "This is \n a test. Let's go on. \n Yeah.";
myString.erase(std::remove(myString.begin(), myString.end(), '\n'),
myString.end());
std::cout << myString<< std::endl;
}
Output:
This is  a test. Let's go on.  Yeah.
Solution 2: Remove "\n" from the string and save each piece split at \n to a vector. (inefficient)
Replace every occurrence of \n with some other character that doesn't exist in the string (here I have chosen ;). Then parse with the help of std::stringstream and std::getline as follows.
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
#include <sstream>
int main()
{
std::string myString = "This is \n a test. Let's go on. \n Yeah.";
std::replace(myString.begin(), myString.end(), '\n', ';');
std::stringstream ssMyString(myString);
std::string each_split;
std::vector<std::string> vec;
while(std::getline(ssMyString, each_split, ';')) vec.emplace_back(each_split);
for(const auto& it: vec) std::cout << it << "\n";
}
Output:
This is
a test. Let's go on.
Yeah.
Solution 3: Remove "\n" from the string and save each piece split at \n to a vector.
Loop through the string and find the positions (using std::string::find) where \n occurs (the end position). Push back the substrings (std::string::substr) using the starting position and the number of characters between the start and end positions. Update the start and end positions each time, so that the lookup does not start again from the beginning of the input string.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <cstddef>
int main()
{
std::string myString = "This is \n a test. Let's go on. \n Yeah.";
std::vector<std::string> vec;
std::size_t start_pos = 0;
std::size_t end_pos = 0;
while ((end_pos = myString.find("\n", end_pos)) != std::string::npos)
{
vec.emplace_back(myString.substr(start_pos, end_pos - start_pos));
start_pos = end_pos + 1;
end_pos = start_pos; // resume the search right after this "\n" so consecutive newlines are not skipped
}
vec.emplace_back(myString.substr(start_pos, myString.size() - start_pos)); // last substring
for(const auto& it: vec) std::cout << it << "\n";
}
Output:
This is
a test. Let's go on.
Yeah.
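Since the question asks about splitting at a character combination, the find/substr approach can be generalized to a delimiter of any length. This is a sketch only; the function name splitOn is an assumption and not part of any library.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits text at every occurrence of delimiter, which
// may be longer than one character. Empty pieces between adjacent delimiters
// are kept.
std::vector<std::string> splitOn(const std::string& text,
                                 const std::string& delimiter) {
    std::vector<std::string> parts;
    if (delimiter.empty()) {
        // An empty delimiter would match everywhere; return the text as is.
        parts.push_back(text);
        return parts;
    }
    std::string::size_type start = 0;
    std::string::size_type end;
    while ((end = text.find(delimiter, start)) != std::string::npos) {
        parts.push_back(text.substr(start, end - start));
        // Continue right after the delimiter that was just found.
        start = end + delimiter.size();
    }
    // Whatever follows the last delimiter (possibly an empty string).
    parts.push_back(text.substr(start));
    return parts;
}
```

For the string from the question and the delimiter "\n", this gives the three pieces "This is ", " a test. Let's go on. " and " Yeah.", with the surrounding spaces preserved.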

c++ found an example on splitting strings trying to figure out why change to it changes result

I'm learning about splitting strings for a program in class, and I came across this example.
#include <string>
#include <sstream>
#include <iostream>
int main()
{
std::string str = "23454323 ABCD EFGH";
std::istringstream iss(str);
std::string word;
while(iss >> word)
{
std::cout << word << '\n';
}
}
I modified it so that the user enters the string instead, but if I enter the string stored in str, I get 23454323 and none of the rest of the string.
#include <string>
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
string str;
cout<<"Enter a postfix with a space between each object:";
cin>>str;
istringstream iss(str);
string word;
while(iss >> word)
{
cout << word << '\n';
}
}
Ok, thanks for the help everyone got it!
You need to modify your input code a little for this to work. Use:
getline(cin, str);
instead of:
cin >> str;
The latter will stop reading a string on whitespace characters.
Because you use the same input operator as for istringstream when you input from cin and it always breaks on whitespace.
That means you only read a single word from the user. You want to use std::getline.
Just as iss >> word reads a single space-separated word from iss, so cin >> str just reads the first word from cin.
To read a whole line, use getline(cin, str).
(Also, get out of the habit of dumping namespace std into the global namespace. It will cause problems as your programs grow.)
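The difference can be seen without typing anything by feeding the same text to both kinds of read. In this sketch the helper names readToken and readLine are made up, and a std::istringstream stands in for cin.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token, which is what
// `cin >> str` does.
std::string readToken(std::istream& in) {
    std::string token;
    in >> token;
    return token;
}

// Hypothetical helper: reads everything up to the next newline, which is
// what `getline(cin, str)` does.
std::string readLine(std::istream& in) {
    std::string line;
    std::getline(in, line);
    return line;
}
```

Given a stream holding "23454323 ABCD EFGH", readToken returns only "23454323", while readLine returns the whole line, which can then be split with an istringstream as in the original example.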

String into vector

I have a string which I then want to store in a vector
string a = "N\nT\n";
with the text after each newline in a different cell.
std::string ss (".V/\n.F/\n.R/\n");
for(int i = 0; i< ss.size(); i++)
{
test1.push_back(ss);
}
I want to store the string in the vector test1.
Is this the best way?
Your code won't work; it'll store the string ss.size() times in the vector.
You might want to use a string stream to split the string:
std::stringstream stream(ss);
std::string line;
while (std::getline(stream, line)) {
test1.push_back(line);
}
Note that the newline character will be discarded. If you want to keep it, push_back(line + "\n");.
Boost::split will do this for you. See usage details here:
http://www.boost.org/doc/libs/1_49_0/doc/html/string_algo/usage.html#id3184031
If the newline can be discarded then you could use std::copy():
#include <iostream>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>
int main()
{
std::string ss(".V/\n.F/\n.R/\n");
std::istringstream in(ss);
std::vector<std::string> test1;
std::copy(std::istream_iterator<std::string>(in),
std::istream_iterator<std::string>(),
std::back_inserter(test1));
std::for_each(test1.begin(),
test1.end(),
[](const std::string& s)
{
std::cout << s << "\n";
});
return 0;
}
Output:
.V/
.F/
.R/
This certainly isn't the best way, because it doesn't work. This just pushes ss.size() instances of the std::string in the vector.
You can use the find and substr methods to partition the string and push them in the array. (not gonna write the actual code though, might be a good exercise).
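For readers who want to check their attempt at that exercise, here is one possible sketch. The function name splitLines is an assumption, and it treats a trailing newline as a line terminator, so no empty final element is produced.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits text at each '\n' using find and substr.
// The newline characters themselves are discarded.
std::vector<std::string> splitLines(const std::string& text) {
    std::vector<std::string> lines;
    std::string::size_type start = 0;
    while (start < text.size()) {
        std::string::size_type end = text.find('\n', start);
        if (end == std::string::npos)
            end = text.size();  // last line without a trailing newline
        lines.push_back(text.substr(start, end - start));
        start = end + 1;        // continue after the newline
    }
    return lines;
}
```

For ".V/\n.F/\n.R/\n" this yields the three elements ".V/", ".F/" and ".R/". An empty line between two newlines is kept as an empty string.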