I have seen some very popular questions here on StackOverflow about splitting a string in C++, but every time they split the string on the space delimiter. Instead, I want to split a std::string on the ; delimiter.
This code is taken from an answer on StackOverflow, but I don't know how to adapt it for ; instead of space.
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
int main() {
using namespace std;
string sentence = "And I feel fine...";
istringstream iss(sentence);
copy(istream_iterator<string>(iss),
istream_iterator<string>(),
ostream_iterator<string>(cout, "\n"));
}
Can you help me?
Here is one of the answers from Split a string in C++? that uses any delimiter.
I use this to split string by a delim. The first puts the results in a pre-constructed vector, the second returns a new vector.
std::vector<std::string> &split(const std::string &s, char delim, std::vector<std::string> &elems) {
std::stringstream ss(s);
std::string item;
while (std::getline(ss, item, delim)) {
elems.push_back(item);
}
return elems;
}
std::vector<std::string> split(const std::string &s, char delim) {
std::vector<std::string> elems;
split(s, delim, elems);
return elems;
}
Note that this solution does not skip empty tokens, so the following will find 4 items, one of which is empty:
std::vector<std::string> x = split("one:two::three", ':');
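For the semicolon case from the question, the same std::getline loop can be used with ';' as the delimiter. A minimal sketch, assuming empty tokens should be dropped; `split_skip_empty` is a hypothetical name, not part of the quoted answer:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split `s` on `delim`, dropping empty tokens.
std::vector<std::string> split_skip_empty(const std::string& s, char delim) {
    std::vector<std::string> elems;
    std::istringstream ss(s);
    std::string item;
    while (std::getline(ss, item, delim)) {
        if (!item.empty()) {  // skip the empty token between two adjacent delimiters
            elems.push_back(item);
        }
    }
    return elems;
}
```

Calling `split_skip_empty("one;two;;three", ';')` gives the three tokens one, two and three, whereas the version without the `empty()` check would also return the empty token between the two semicolons.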
I'm trying to split a string into individual words using a vector in C++. I would like to know how to ignore extra spaces if the user puts more than one space between words in the string.
How would I do that?
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
int main(){
cout<<"Sentence: ";
string sentence;
getline(cin,sentence);
vector<string> my;
int start=0;
unsigned int end=sentence.size();
unsigned int temp=0;
while(temp<end){
int te=sentence.find(" ",start);
temp=te;
my.push_back(sentence.substr(start, temp-start));
start=temp+1;
}
unsigned int i;
for(i=0 ; i<my.size() ; i++){
cout<<my[i]<<endl;
}
return 0;
}
Four things:
When reading input from a stream into a string using the overloaded >> operator, it automatically separates on whitespace, i.e. it reads "words".
There exists an input stream that uses a string as the input, std::istringstream.
You can use iterators with streams, like e.g. std::istream_iterator.
std::vector has a constructor taking a pair of iterators.
That means your code could simply be
std::string line;
std::getline(std::cin, line);
std::istringstream istr(line);
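// Braces avoid the "most vexing parse": with parentheses, this line
// would declare a function named words instead of constructing a vector.
std::vector<std::string> words{std::istream_iterator<std::string>(istr),
std::istream_iterator<std::string>()};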
After this, the vector words will contain all the "words" from the input line.
You can easily print the "words" using std::ostream_iterator and std::copy:
std::copy(begin(words), end(words),
std::ostream_iterator<std::string>(std::cout, "\n"));
The easiest way is to use a std::istringstream as follows:
std::string sentence;
std::getline(std::cin,sentence);
std::istringstream iss(sentence);
std::vector<std::string> my;
std::string word;
while(iss >> word) {
my.push_back(word);
}
Any whitespace, including runs of several spaces, is skipped automatically.
You can create the vector directly using the std::istream_iterator which skips white spaces:
#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <iterator>
int main() {
std::string str = "Hello World Lorem Ipsum The Quick Brown Fox";
std::istringstream iss(str);
std::vector<std::string> vec {std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>() };
for (const auto& el : vec) {
std::cout << el << '\n';
}
}
Here is a function which divides a given sentence into words.
#include <string>
#include <vector>
#include <sstream>
#include <utility>
std::vector<std::string> divideSentence(const std::string& sentence) {
std::stringstream stream(sentence);
std::vector<std::string> words;
std::string word;
while(stream >> word) {
words.push_back(std::move(word));
}
return words;
}
Reducing double, triple, etc. spaces in a string is a problem you'll encounter again and again. I've always used the following very simple algorithm:
Pseudocode (the search string is two spaces, the replacement is one):
while "  " in string:
string = string.replace("  ", " ")
After the while loop, you know your string only has single spaces since multiple consecutive spaces were compressed to singles.
Most languages allow you to search for a substring in a string and most languages have the ability to run string.replace() so it's a useful trick.
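In C++ the same trick can be written with std::string::find and std::string::replace. This is only a sketch of the pseudocode above; `collapse_spaces` is a hypothetical name, and it handles the space character only, not tabs or newlines:

```cpp
#include <string>

// Replace each pair of spaces with a single space until no pair is left.
std::string collapse_spaces(std::string s) {
    std::string::size_type pos = 0;
    // Searching again from `pos` is enough: everything before it
    // is already free of double spaces.
    while ((pos = s.find("  ", pos)) != std::string::npos) {
        s.replace(pos, 2, " ");
    }
    return s;
}
```

Leading and trailing spaces are reduced to one space rather than removed, so trim separately if that matters.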
I'm trying to split some data I receive. The data looks like this:
0010|chocolate|cookie;458|strawberry|cream;823|peanut|butter;09910|chocolate|icecream
So first I need to separate each food section (separated by ";") and then get the ID of only the food sections that contain "chocolate". The problem is that the data is not static, so I can't predict how many food sections with "chocolate" will appear.
Here is the code where I split the food sections and count how many sections are in the data:
#include <string>
#include <sstream>
#include <vector>
#include <iostream>
#include <fstream>
using namespace std;
vector<string> &split(const string &s, char delim, vector<string> &elems)
{
stringstream ss(s);
string item;
while (getline(ss, item, delim))
{
elems.push_back(item);
}
return elems;
}
vector<string> split(const string &s, char delim)
{
vector<string> elems;
split(s, delim, elems);
return elems;
}
const char* data = "0010|chocolate|cookie;458|strawberry|cream;823|peanut|butter;09910|chocolate|icecream";
int main()
{
vector<string> food = split(data, ';');
cout << "number of food sections is : " << food.size();
return 0;
}
It works, but now I want it to read ALL the sections and list the ones that contain "chocolate", like:
0010|chocolate|cookie
09910|chocolate|icecream
Then I want to list only the IDs of the sections that contain chocolate, which is probably possible with the same split vector I use.
0010
09910
It just depends how rich your data is. Ultimately you have to throw a recursive descent parser at it. But this seems simpler.
Can a semicolon be escaped? If not, go through the data, and each time you hit a semicolon, store the index in a growing vector. That gives you the record starts. Then step through the records. Create a temporary string which consists of the record up to the semicolon, then search it for the string "chocolate". If it matches, the ID is the first field in your record, i.e. everything up to the first | character.
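A rough sketch of that approach, assuming semicolons are never escaped. The name `ids_containing` is made up for illustration, and the sketch handles each record as soon as its semicolon is found instead of storing the indices first:

```cpp
#include <string>
#include <vector>

// Walk the ';'-separated records of `data`; for each record that contains
// `needle`, keep the text before the first '|' (the ID field).
std::vector<std::string> ids_containing(const std::string& data,
                                        const std::string& needle) {
    std::vector<std::string> ids;
    std::string::size_type start = 0;
    while (start <= data.size()) {
        std::string::size_type end = data.find(';', start);
        if (end == std::string::npos) {
            end = data.size();  // last record has no trailing ';'
        }
        const std::string record = data.substr(start, end - start);
        if (record.find(needle) != std::string::npos) {
            // If the record has no '|', the whole record is taken as the ID.
            ids.push_back(record.substr(0, record.find('|')));
        }
        start = end + 1;
    }
    return ids;
}
```

Note that this is a plain substring search, so a record whose ID or other field merely contains the text "chocolate" (for example "chocolatey") would also match.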
Try using a function to find a word inside a string delimited by delim, like this one:
bool find(string vfood, string s, char delim)
{
std::istringstream to_find(vfood);
for (std::string word; std::getline(to_find, word, delim); ) if (word == s) return true;
return false;
}
And then you can find whatever you want within each string of 'food'
vector<string> food_with_chocolate;
for (string &s : food)
{
if (find(s, "chocolate", '|')) food_with_chocolate.push_back(s);
}
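To get from the matching sections to just their IDs, one option is to read the first '|'-delimited field of each string. A small sketch; `first_field` and `ids_of` are hypothetical helpers, not part of the answer above:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Return the text before the first `delim`, e.g. the ID of "0010|chocolate|cookie".
std::string first_field(const std::string& record, char delim) {
    std::istringstream in(record);
    std::string field;
    std::getline(in, field, delim);  // leaves `field` empty if `record` is empty
    return field;
}

// Collect the first field of every record.
std::vector<std::string> ids_of(const std::vector<std::string>& records, char delim) {
    std::vector<std::string> ids;
    for (const std::string& r : records) {
        ids.push_back(first_field(r, delim));
    }
    return ids;
}
```

With the vector from the loop above, `ids_of(food_with_chocolate, '|')` would hold the IDs of the chocolate sections.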
I have a function that prompts the user for input. If they input more than the number of words I want(3), then an error should be printed. How do I approach this? I found out how to check if the input is < 3, but not > 3.
struct Info
{
std::string cmd;
std::string name;
std::string location;
};
Info* get_string()
{
std::string raw_input;
std::getline(std::cin, raw_input);
std::istringstream input(raw_input);
std::string cmd;
std::string name;
std::string location;
input>>cmd;
input>>name;
input>>location;
Info* inputs = new Info{cmd, name, location};
return inputs;
}
The function I have automatically takes 3 strings and stores them in my struct, which I check later to see if any part of the struct is empty (for example: "Run" "Joe" ""), but what if they enter in 4 strings? Thank you
You can split the input string into words using a space delimiter and then check the number of words. You can use the function below to split your input. After this you can check the size of the vector.
#include <vector>
#include <string>
#include <iostream>
#include <sstream>
using namespace std;
vector<std::string> split(const string &s, char delim) {
stringstream ss(s);
string item;
vector<string> res;
while (getline(ss, item, delim)) {
if(item.length()==0)continue;
res.push_back(item);
}
return res;
}
int main()
{
string theString;
// Read the whole line; cin >> theString would stop at the first space.
getline(cin, theString);
vector<string> res=split(theString, ' ');
if(res.size()>3)
{
//show error
}
return 0;
}
The problem with this and with Ferdinand's idea is that in order to test if a 4th string exists, you have to "ask" for it. If it exists, you can error, but if it doesn't then it sits there waiting for input and the user wonders what is going wrong.
Thus I'm going to modify your code slightly. It's fairly straightforward. If the user enters a space in the last "word", then you know that there is an issue and can deal with it as you wish.
// Replace input >> location; with the below
// Skip the whitespace left after name, then read up to the line break, including spaces
getline(input >> std::ws, location);
// Check if there is a space (i.e. 2+ words)
if(location.find(" ") != string::npos){
// If so, fail
}
Resources for learning:
http://www.cplusplus.com/reference/string/string/find/
http://www.cplusplus.com/reference/string/string/getline/
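Since the whole line has already been read with std::getline, extracting from the std::istringstream never waits for more input, so trying to read a fourth word is safe there. A minimal sketch under that assumption; `Info` mirrors the struct from the question, and `parse_exactly_three` is a hypothetical name:

```cpp
#include <sstream>
#include <string>

struct Info {
    std::string cmd;
    std::string name;
    std::string location;
};

// Returns true and fills `out` only when `line` holds exactly three words.
bool parse_exactly_three(const std::string& line, Info& out) {
    std::istringstream input(line);
    Info parsed;
    if (!(input >> parsed.cmd >> parsed.name >> parsed.location)) {
        return false;  // fewer than three words
    }
    std::string extra;
    if (input >> extra) {
        return false;  // a fourth word is present
    }
    out = parsed;
    return true;
}
```

Returning a bool and filling a caller-provided struct also avoids the `new Info` allocation from the question, though that is a design choice rather than a requirement.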
I have a set of strings separated by #. I want to split them and insert into unordered_set.
For example.
abc#def#ghi
xyz#mno#pqr
I use boost::split, passing the unordered set, but every time I get a new result set. I want to append the next result to the same set.
std::string str1 = "abc#def#ghi";
std::string str2 = "xyz#mno#pqr";
std::unordered_set<std::string> result;
boost::split(result, str1, boost::is_any_of("#"));
boost::split(result, str2, boost::is_any_of("#"));
If I check the result set, I only get xyz, mno, pqr. I want abc, def and ghi to be kept as well. How can I achieve this?
Note: I don't want to use any additional container.
I'd do: (see it Live On Coliru)
#include <sstream>
#include <unordered_set>
#include <iostream>
int main()
{
std::unordered_set<std::string> result;
std::istringstream iss("abc#def#ghi");
std::string tok;
while (std::getline(iss, tok, '#'))
result.insert(tok);
iss.str("xyz#mno#pqr");
iss.clear();
while (std::getline(iss, tok, '#'))
result.insert(tok);
for (auto& s : result)
std::cout << s << "\n";
}
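The two loops above could be folded into one helper that appends to whatever the set already holds. A sketch using only the standard library; `append_tokens` is a hypothetical name:

```cpp
#include <sstream>
#include <string>
#include <unordered_set>

// Insert every `delim`-separated token of `s` into `out`,
// keeping the elements `out` already contains.
void append_tokens(std::unordered_set<std::string>& out,
                   const std::string& s, char delim) {
    std::istringstream iss(s);
    std::string tok;
    while (std::getline(iss, tok, delim)) {
        out.insert(tok);
    }
}
```

Calling `append_tokens(result, "abc#def#ghi", '#')` and then `append_tokens(result, "xyz#mno#pqr", '#')` leaves all six tokens in `result`, with no intermediate container.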
This is because boost::split clears the destination container before writing into it.
I'd use boost::tokenizer for what you want.
#include <boost/tokenizer.hpp>
// ....
typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
boost::char_separator<char> sep("#");
std::string str1 = "abc#def#ghi";
std::string str2 = "xyz#mno#pqr";
std::unordered_set<std::string> result;
tokenizer t1(str1, sep), t2(str2, sep);
std::copy(t1.begin(), t1.end(), std::inserter(result, result.end()) );
std::copy(t2.begin(), t2.end(), std::inserter(result, result.end()) );
How do you split a string into tokens in C++?
This works nicely for me :). It puts the results in elems, and delim can be any char.
std::vector<std::string> &split(const std::string &s, char delim, std::vector<std::string> &elems) {
std::stringstream ss(s);
std::string item;
while(std::getline(ss, item, delim)) {
elems.push_back(item);
}
return elems;
}
With this Mingw distro that includes Boost:
#include <iostream>
#include <string>
#include <vector>
#include <iterator>
#include <ostream>
#include <algorithm>
#include <boost/algorithm/string.hpp>
using namespace std;
using namespace boost;
int main() {
vector<string> v;
split(v, "1=2&3=4&5=6", is_any_of("=&"));
copy(v.begin(), v.end(), ostream_iterator<string>(cout, "\n"));
}
You can use the C function strtok:
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
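Keep in mind that strtok writes null characters into the buffer it scans and keeps internal state between calls. If that is a concern, a similar multi-delimiter split can be sketched with std::string::find_first_of and find_first_not_of; `tokenize` is a hypothetical name, and like strtok it skips empty tokens:

```cpp
#include <string>
#include <vector>

// Split `s` on any character in `delims`, skipping empty tokens.
std::vector<std::string> tokenize(const std::string& s, const std::string& delims) {
    std::vector<std::string> tokens;
    // Start of the first token: first character that is not a delimiter.
    std::string::size_type start = s.find_first_not_of(delims);
    while (start != std::string::npos) {
        // End of the token: next delimiter, or npos for the last token.
        std::string::size_type end = s.find_first_of(delims, start);
        tokens.push_back(s.substr(start, end - start));
        start = s.find_first_not_of(delims, end);
    }
    return tokens;
}
```

For the sample above, `tokenize("- This, a sample string.", " ,.-")` produces the same four tokens as the strtok loop: This, a, sample, string.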
The Boost Tokenizer will also do the job:
#include<iostream>
#include<boost/tokenizer.hpp>
#include<string>
int main(){
using namespace std;
using namespace boost;
string s = "This is, a test";
tokenizer<> tok(s);
for(tokenizer<>::iterator beg=tok.begin(); beg!=tok.end();++beg){
cout << *beg << "\n";
}
}
Try using stringstream:
std::string line("A line of tokens");
std::stringstream lineStream(line);
std::string token;
while(lineStream >> token)
{
}
Check out my answer to your last question:
C++ Reading file Tokens
See also boost::split from String Algo library
string str1("hello abc-*-ABC-*-aBc goodbye");
vector<string> tokens;
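// token_compress_on merges adjacent delimiters; without it, "-*-" yields empty tokens
boost::split(tokens, str1, boost::is_any_of("-*"), boost::token_compress_on);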
// tokens == { "hello abc","ABC","aBc goodbye" }
It depends on how complex the token delimiter is and whether there is more than one. For easy problems, just use std::istringstream and std::getline. For more complex tasks, or if you want to iterate over the tokens in an STL-compliant way, use Boost's Tokenizer. Another possibility (although messier than either of these two) is to set up a while loop that calls std::string::find and updates the position of the last found token to be the start point for searching for the next. But this is probably the most bug-prone of the three options.
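For completeness, the std::string::find loop could look roughly like this. It is only a sketch for a single-character delimiter; `split_find` is a hypothetical name:

```cpp
#include <string>
#include <vector>

// Split `s` on `delim` with a manual find loop, keeping empty tokens.
std::vector<std::string> split_find(const std::string& s, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = s.find(delim, start)) != std::string::npos) {
        tokens.push_back(s.substr(start, pos - start));
        start = pos + 1;  // continue just past the delimiter
    }
    tokens.push_back(s.substr(start));  // text after the last delimiter
    return tokens;
}
```

Unlike the std::getline loop, this version returns one empty token for an empty input and keeps a trailing empty token when the input ends with the delimiter, which is the kind of edge case that makes the manual loop easy to get wrong.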