How to split string read from text file into array using c++ - c++

I want to split the strings on each line of my text file into an array, similar to the split() function in python. my desired syntax is a loop that enters every split-string into the next index of an array,
so for example if my string:
"ab,cd,ef,gh,ij"
, every time I encounter a comma then I would:
datafile >> arr1[i]
and my array would end up:
arr1 = [ab,cd,ef,gh,ij]
a mock code without reading a text file is provided below
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <string.h>
#include <string>
using namespace std;
int main(){
char str[] = "ab,cd,ef,gh,ij"; //" ex str in place of file contents/fstream sFile;"
const int NUM = 5;
string sArr[NUM];//empty array
char *token = strtok(str, ",");
for (int i=0; i < NUM; i++)
while((token!=NULL)){
("%s\n", token) >> sArr[i];
token = strtok(NULL, ",");
}
cout >> sArr;
return 0;
}

In C++ you can read a file line by line and directly get a std::string.
You will found below an example I made with a split() proposal as you requested, and a main() example of reading a file:
Example
data file:
ab,cd,ef,gh
ij,kl,mn
c++ code:
#include <fstream>
#include <iostream>
#include <vector>
std::vector<std::string> split(const std::string & s, char c);
int main()
{
std::string file_path("data.txt"); // I assumed you have that kind of file
std::ifstream in_s(file_path);
std::vector <std::vector<std::string>> content;
if(in_s)
{
std::string line;
std::vector <std::string> vec;
while(getline(in_s, line))
{
for(const std::string & str : split(line, ','))
vec.push_back(str);
content.push_back(vec);
vec.clear();
}
in_s.close();
}
else
std::cout << "Could not open: " + file_path << std::endl;
for(const std::vector<std::string> & str_vec : content)
{
for(unsigned int i = 0; i < str_vec.size(); ++i)
std::cout << str_vec[i] << ((i == str_vec.size()-1) ? ("") : (" : "));
std::cout << std::endl;
}
return 0;
}
std::vector<std::string> split(const std::string & s, char c)
{
std::vector<std::string> splitted;
std::string word;
for(char ch : s)
{
if((ch == c) && (!word.empty()))
{
splitted.push_back(word);
word.clear();
}
else
word += ch;
}
if(!word.empty())
splitted.push_back(word);
return splitted;
}
output:
ab : cd : ef : gh
ij : kl : mn
I hope it will help.

So, a few things to fix. Firstly, arrays and NUM are kind of limiting - you have to fix up NUM whenever you change the input string, so C++ provides std::vector which can resize itself to however many strings it finds. Secondly, you want to call strtok until it returns nullptr once, and you can do that with one loop. With both your for and NUM you call strtok too many times - even after it has returned nullptr. Next, to put the token into a std::string, you would assign using my_string = token; rather than ("%s\n", token) >> my_string - which is a broken mix of printf() formatting and C++ streaming notation. Lastly, to print the elements you've extracted, you can use another loop. All these changes are illustrated below.
char str[] = "ab,cd,ef,gh,ij";
std::vector<std::string> strings;
char* token = strtok(str, ",");
while ((token != nullptr))
{
strings.push_back(token);
token = strtok(NULL, ",");
}
for (const auto& s : strings)
cout >> s >> '\n';

Your code is overly complicated and wrong.
You probably want this:
#include <iostream>
#include <string>
#include <string.h>
using namespace std;
int main() {
char str[] = "ab,cd,ef,gh,ij"; //" ex str in place of file contents/fstream sFile;"
const int NUM = 5;
string sArr[NUM];//empty array
char *token = strtok(str, ",");
int max = 0;
while ((token != NULL)) {
sArr[max++] = token;
token = strtok(NULL, ",");
}
for (int i = 0; i < max; i++)
cout << sArr[i] << "\n";
return 0;
}
This code is still poor and no bound checking is done.
But anyway, you should rather do it the C++ way as suggested in the other answers.

Use boost::split
#include <boost/algorithm/string.hpp>
[...]
std::vector<std::string> strings;
std::string val("ab,cd,ef,gh,ij");
boost::split(strings, val, boost::is_any_of(","));

You could do something like this
std::string str = "ab,cd,ef,gh,ij";
std::vector<std::string> TokenList;
std::string::size_type lastPos = 0;
std::string::size_type pos = str.find_first_of(',', lastPos);
while(pos != std::string::npos)
{
std::string temp(str, lastPos, pos - lastPos);
TokenList.push_back(temp);
lastPos = pos + 1;
pos = str.find_first_of(',', lastPos);
}
if(lastPos != str.size())
{
std::string temp(str, lastPos, str.size());
TokenList.push_back(temp);
}
for(int i = 0; i < TokenList.size(); i++)
std::cout << TokenList.at(i) << std::endl;

Related

istream_iterator deal with string with pipe [duplicate]

This question already has answers here:
How do I iterate over the words of a string?
(84 answers)
Closed 4 years ago.
If I have a std::string containing a comma-separated list of numbers, what's the simplest way to parse out the numbers and put them in an integer array?
I don't want to generalise this out into parsing anything else. Just a simple string of comma separated integer numbers such as "1,1,1,1,2,1,1,1,0".
Input one number at a time, and check whether the following character is ,. If so, discard it.
#include <vector>
#include <string>
#include <sstream>
#include <iostream>
int main()
{
std::string str = "1,2,3,4,5,6";
std::vector<int> vect;
std::stringstream ss(str);
for (int i; ss >> i;) {
vect.push_back(i);
if (ss.peek() == ',')
ss.ignore();
}
for (std::size_t i = 0; i < vect.size(); i++)
std::cout << vect[i] << std::endl;
}
Something less verbose, std and takes anything separated by a comma.
stringstream ss( "1,1,1,1, or something else ,1,1,1,0" );
vector<string> result;
while( ss.good() )
{
string substr;
getline( ss, substr, ',' );
result.push_back( substr );
}
Yet another, rather different, approach: use a special locale that treats commas as white space:
#include <locale>
#include <vector>
struct csv_reader: std::ctype<char> {
csv_reader(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table() {
static std::vector<std::ctype_base::mask> rc(table_size, std::ctype_base::mask());
rc[','] = std::ctype_base::space;
rc['\n'] = std::ctype_base::space;
rc[' '] = std::ctype_base::space;
return &rc[0];
}
};
To use this, you imbue() a stream with a locale that includes this facet. Once you've done that, you can read numbers as if the commas weren't there at all. Just for example, we'll read comma-delimited numbers from input, and write then out one-per line on standard output:
#include <algorithm>
#include <iterator>
#include <iostream>
int main() {
std::cin.imbue(std::locale(std::locale(), new csv_reader()));
std::copy(std::istream_iterator<int>(std::cin),
std::istream_iterator<int>(),
std::ostream_iterator<int>(std::cout, "\n"));
return 0;
}
The C++ String Toolkit Library (Strtk) has the following solution to your problem:
#include <string>
#include <deque>
#include <vector>
#include "strtk.hpp"
int main()
{
std::string int_string = "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15";
std::vector<int> int_list;
strtk::parse(int_string,",",int_list);
std::string double_string = "123.456|789.012|345.678|901.234|567.890";
std::deque<double> double_list;
strtk::parse(double_string,"|",double_list);
return 0;
}
More examples can be found Here
Alternative solution using generic algorithms and Boost.Tokenizer:
struct ToInt
{
int operator()(string const &str) { return atoi(str.c_str()); }
};
string values = "1,2,3,4,5,9,8,7,6";
vector<int> ints;
tokenizer<> tok(values);
transform(tok.begin(), tok.end(), back_inserter(ints), ToInt());
Lots of pretty terrible answers here so I'll add mine (including test program):
#include <string>
#include <iostream>
#include <cstddef>
template<typename StringFunction>
void splitString(const std::string &str, char delimiter, StringFunction f) {
std::size_t from = 0;
for (std::size_t i = 0; i < str.size(); ++i) {
if (str[i] == delimiter) {
f(str, from, i);
from = i + 1;
}
}
if (from <= str.size())
f(str, from, str.size());
}
int main(int argc, char* argv[]) {
if (argc != 2)
return 1;
splitString(argv[1], ',', [](const std::string &s, std::size_t from, std::size_t to) {
std::cout << "`" << s.substr(from, to - from) << "`\n";
});
return 0;
}
Nice properties:
No dependencies (e.g. boost)
Not an insane one-liner
Easy to understand (I hope)
Handles spaces perfectly fine
Doesn't allocate splits if you don't want to, e.g. you can process them with a lambda as shown.
Doesn't add characters one at a time - should be fast.
If using C++17 you could change it to use a std::stringview and then it won't do any allocations and should be extremely fast.
Some design choices you may wish to change:
Empty entries are not ignored.
An empty string will call f() once.
Example inputs and outputs:
"" -> {""}
"," -> {"", ""}
"1," -> {"1", ""}
"1" -> {"1"}
" " -> {" "}
"1, 2," -> {"1", " 2", ""}
" ,, " -> {" ", "", " "}
You could also use the following function.
void tokenize(const string& str, vector<string>& tokens, const string& delimiters = ",")
{
// Skip delimiters at beginning.
string::size_type lastPos = str.find_first_not_of(delimiters, 0);
// Find first non-delimiter.
string::size_type pos = str.find_first_of(delimiters, lastPos);
while (string::npos != pos || string::npos != lastPos) {
// Found a token, add it to the vector.
tokens.push_back(str.substr(lastPos, pos - lastPos));
// Skip delimiters.
lastPos = str.find_first_not_of(delimiters, pos);
// Find next non-delimiter.
pos = str.find_first_of(delimiters, lastPos);
}
}
std::string input="1,1,1,1,2,1,1,1,0";
std::vector<long> output;
for(std::string::size_type p0=0,p1=input.find(',');
p1!=std::string::npos || p0!=std::string::npos;
(p0=(p1==std::string::npos)?p1:++p1),p1=input.find(',',p0) )
output.push_back( strtol(input.c_str()+p0,NULL,0) );
It would be a good idea to check for conversion errors in strtol(), of course. Maybe the code may benefit from some other error checks as well.
I'm surprised no one has proposed a solution using std::regex yet:
#include <string>
#include <algorithm>
#include <vector>
#include <regex>
void parse_csint( const std::string& str, std::vector<int>& result ) {
typedef std::regex_iterator<std::string::const_iterator> re_iterator;
typedef re_iterator::value_type re_iterated;
std::regex re("(\\d+)");
re_iterator rit( str.begin(), str.end(), re );
re_iterator rend;
std::transform( rit, rend, std::back_inserter(result),
[]( const re_iterated& it ){ return std::stoi(it[1]); } );
}
This function inserts all integers at the back of the input vector. You can tweak the regular expression to include negative integers, or floating point numbers, etc.
#include <sstream>
#include <vector>
const char *input = "1,1,1,1,2,1,1,1,0";
int main() {
std::stringstream ss(input);
std::vector<int> output;
int i;
while (ss >> i) {
output.push_back(i);
ss.ignore(1);
}
}
Bad input (for instance consecutive separators) will mess this up, but you did say simple.
string exp = "token1 token2 token3";
char delimiter = ' ';
vector<string> str;
string acc = "";
for(int i = 0; i < exp.size(); i++)
{
if(exp[i] == delimiter)
{
str.push_back(acc);
acc = "";
}
else
acc += exp[i];
}
bool GetList (const std::string& src, std::vector<int>& res)
{
using boost::lexical_cast;
using boost::bad_lexical_cast;
bool success = true;
typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
boost::char_separator<char> sepa(",");
tokenizer tokens(src, sepa);
for (tokenizer::iterator tok_iter = tokens.begin();
tok_iter != tokens.end(); ++tok_iter) {
try {
res.push_back(lexical_cast<int>(*tok_iter));
}
catch (bad_lexical_cast &) {
success = false;
}
}
return success;
}
I cannot yet comment (getting started on the site) but added a more generic version of Jerry Coffin's fantastic ctype's derived class to his post.
Thanks Jerry for the super idea.
(Because it must be peer-reviewed, adding it here too temporarily)
struct SeparatorReader: std::ctype<char>
{
template<typename T>
SeparatorReader(const T &seps): std::ctype<char>(get_table(seps), true) {}
template<typename T>
std::ctype_base::mask const *get_table(const T &seps) {
auto &&rc = new std::ctype_base::mask[std::ctype<char>::table_size]();
for(auto &&sep: seps)
rc[static_cast<unsigned char>(sep)] = std::ctype_base::space;
return &rc[0];
}
};
This is the simplest way, which I used a lot. It works for any one-character delimiter.
#include<bits/stdc++.h>
using namespace std;
int main() {
string str;
cin >> str;
int temp;
vector<int> result;
char ch;
stringstream ss(str);
do
{
ss>>temp;
result.push_back(temp);
}while(ss>>ch);
for(int i=0 ; i < result.size() ; i++)
cout<<result[i]<<endl;
return 0;
}
simple structure, easily adaptable, easy maintenance.
std::string stringIn = "my,csv,,is 10233478,separated,by commas";
std::vector<std::string> commaSeparated(1);
int commaCounter = 0;
for (int i=0; i<stringIn.size(); i++) {
if (stringIn[i] == ",") {
commaSeparated.push_back("");
commaCounter++;
} else {
commaSeparated.at(commaCounter) += stringIn[i];
}
}
in the end you will have a vector of strings with every element in the sentence separated by spaces. empty strings are saved as separate items.
Simple Copy/Paste function, based on the boost tokenizer.
void strToIntArray(std::string string, int* array, int array_len) {
boost::tokenizer<> tok(string);
int i = 0;
for(boost::tokenizer<>::iterator beg=tok.begin(); beg!=tok.end();++beg){
if(i < array_len)
array[i] = atoi(beg->c_str());
i++;
}
void ExplodeString( const std::string& string, const char separator, std::list<int>& result ) {
if( string.size() ) {
std::string::const_iterator last = string.begin();
for( std::string::const_iterator i=string.begin(); i!=string.end(); ++i ) {
if( *i == separator ) {
const std::string str(last,i);
int id = atoi(str.c_str());
result.push_back(id);
last = i;
++ last;
}
}
if( last != string.end() ) result.push_back( atoi(&*last) );
}
}
#include <sstream>
#include <vector>
#include <algorithm>
#include <iterator>
const char *input = ",,29870,1,abc,2,1,1,1,0";
int main()
{
std::stringstream ss(input);
std::vector<int> output;
int i;
while ( !ss.eof() )
{
int c = ss.peek() ;
if ( c < '0' || c > '9' )
{
ss.ignore(1);
continue;
}
if (ss >> i)
{
output.push_back(i);
}
}
std::copy(output.begin(), output.end(), std::ostream_iterator<int> (std::cout, " ") );
return 0;
}

Is there any inbuilt function available two get string between two delimiter string in C/C++?

Is there any inbuilt function available to get strings between two delimiter string in C++?
Input string
(23567)=(58765)+(67888)+(65678)
Expected Output
23567
58765
67888
65678
include <iostream>
#include <stdexcept>
#include <string>
#include <sstream>
#include <vector>
std::vector<std::string> tokenize(const std::string& input)
{
std::vector<std::string> result;
std::istringstream stream(input);
std::string thingie; // please choose a better name, my inspiration is absent today
while(std::getline(stream, thingie, '('))
{
if(std::getline(stream, thingie, ')'))
result.push_back(thingie);
else
throw std::runtime_error("expected \')\' to match \'(\'.");
}
return result;
}
void rtc()
{
ifstream myfile(test.txt);
if(myfile.is_open())
while (!myfile.eof())
{
getline(myfile,line);
auto tokens = tokenize(line);
for(auto&& item : tokens)
std::cout << item << '\n';
}
Error C4430 missing type specifier int assumed note:c++ does not support default int
ErrorC2440initializing cannot convertfrom std::vector<_ty>to int
Error C2059syntac error empty declaration
Error C2143syntax error missing;before&&
Error C2059syntax error:')'
Use std::getline:
#include <iostream>
#include <stdexcept>
#include <string>
#include <sstream>
#include <vector>
std::vector<std::string> tokenize(const std::string& input)
{
std::vector<std::string> result;
std::istringstream stream(input);
std::string thingie; // please choose a better name, my inspiration is absent today
while(std::getline(stream, thingie, '('))
{
if(std::getline(stream, thingie, ')'))
result.push_back(thingie);
else
throw std::runtime_error("expected \')\' to match \'(\'.");
}
return result;
}
int main()
{
std::string test = "(23567)=(58765)+(67888)+(65678)";
auto tokens = tokenize(test);
for(auto&& item : tokens)
std::cout << item << '\n';
}
Live example here.
For those not entirely convinced by the awesome robustness of this solution, I specialized this for double inputs between the parentheses, and used boost::lexical_cast to verify the input:
#include <iostream>
#include <stdexcept>
#include <string>
#include <sstream>
#include <vector>
#include <boost/lexical_cast.hpp>
std::vector<double> tokenize(const std::string& input)
{
std::vector<double> result;
std::istringstream stream(input);
std::string thingie; // please choose a better name, my inspiration is absent today
while(std::getline(stream, thingie, '('))
{
if(std::getline(stream, thingie, ')'))
{
try
{
result.push_back(boost::lexical_cast<double>(thingie));
}
catch(...)
{
throw std::runtime_error("This wasn't just a number, was it?");
}
}
else
throw std::runtime_error("expected \')\' to match \'(\'.");
}
return result;
}
int main()
{
std::string test = "(23567)=(58765)+(67888)+(65678)";
auto tokens = tokenize(test);
for(auto&& item : tokens)
std::cout << item << '\n';
test = "(2h567)=(58765)+(67888)+(65678)";
tokens = tokenize(test);
}
Live example here. Now go cry about how bad strtok really is, or how bad/unportable the general <regex> implementations are currently. Also, for those who doubt boost::lexical_cast performance-wise, please see the results for yourself.
strpbrk can be used to find the start of each token
or strcspn can be used to count the characters until the next token
then strspn can be used to find the length of each token.
const char tokenChars[] = "0123456789";
char token = input; // say input is "(23567)=(58765)+(67888)+(65678)"
while( 0 != (token = strpbrk( token, tokenChars )) ) // find token
{
size_t tokenLen = strspn( token, token_chars ); // find length of token
// print out tokenLen characters of token here!
token+= tokenLen; // go to end of token
}
http://www.cplusplus.com/reference/cstring/strspn/
http://www.cplusplus.com/reference/cstring/strcspn/
http://www.cplusplus.com/reference/cstring/strpbrk/
Here's the answer if you wanna use pointers:
char test[32] = "(23567)=(58765)+(67888)+(65678)";
char *output = NULL;
char *pos = (char *)test;
int length = 0;
while (*pos != '\0') {
if(*pos == '(' || *pos == ')' || *pos == '+' || *pos == '=') {
*pos = '\0';
if (length > 0) {
output = new char[length + 1];
strncpy_s(output, length + 1, pos - length, length + 1);
length = 0;
cout << output << endl;
delete [] output;
output = NULL;
}
} else {
length++;
}
pos++;
}
While some of the commentators may hate it I like this:
for (p = std::strtok(input, "+"); p != NULL; p = std::strtok(NULL, "+"))
{
// do more stuff
}
This won't work off the bat - the delimiters need expanding - it demonstrates the ease of use.
const char input[] = "(2av67q)=(ble ble)+(67888)+(qpa)";
int s = 0;
for(int i = 0; input[i]; i++)
{
if ( input[i] == ')' )
{
cout << endl;
s = 0;
}
else if ( input[i] == '(' )
{
s = 1;
continue;
}
else
{
if ( s == 1 )
{
cout << input[i];
}
}
}
result:
2av67q
ble ble
67888
qpa
Here is a solution using a regular expression:
std::vector<std::string> get_numbers(std::string const& s)
{
static std::regex regex(R"(^\((\d+)\)=\((\d+)\)(?:\+\((\d+)\))+$)",
std::regex_constants::ECMAScript
| std::regex_constants::optimize);
std::vector<std::string> results;
std::sregex_iterator matches(s.cbegin(), s.cend(), regex);
for (auto first = matches->cbegin(), last = matches->cend();
last != first;
++first)
{
results.push_back(first->str());
}
return results;
}

Printing input string words in reverse order

Using if and while/do-while, my job is to print following user's inputs (string value) in reverse order.
For example:
input string value : "You are American"
output in reverse order : "American are You"
Is there any way to do this?
I have tried
string a;
cout << "enter a string: ";
getline(cin, a);
a = string ( a.rbegin(), a.rend() );
cout << a << endl;
return 0;
...but this would reverse the order of the words and spelling while spelling is not what I'm going for.
I also should be adding in if and while statements but do not have a clue how.
The algorithm is:
Reverse the whole string
Reverse the individual words
#include<iostream>
#include<algorithm>
using namespace std;
string reverseWords(string a)
{
reverse(a.begin(), a.end());
int s = 0;
int i = 0;
while(i < a.length())
{
if(a[i] == ' ')
{
reverse(a.begin() + s, a.begin() + i);
s = i + 1;
}
i++;
}
if(a[a.length() - 1] != ' ')
{
reverse(a.begin() + s, a.end());
}
return a;
}
Here is a C-based approach that will compile with a C++ compiler, which uses the stack to minimize creation of char * strings. With minimal work, this can be adapted to use C++ classes, as well as trivially replacing the various for loops with a do-while or while block.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LINE_LENGTH 1000
#define MAX_WORD_LENGTH 80
void rev(char *str)
{
size_t str_length = strlen(str);
int str_idx;
char word_buffer[MAX_WORD_LENGTH] = {0};
int word_buffer_idx = 0;
for (str_idx = str_length - 1; str_idx >= 0; str_idx--)
word_buffer[word_buffer_idx++] = str[str_idx];
memcpy(str, word_buffer, word_buffer_idx);
str[word_buffer_idx] = '\0';
}
int main(int argc, char **argv)
{
char *line = NULL;
size_t line_length;
int line_idx;
char word_buffer[MAX_WORD_LENGTH] = {0};
int word_buffer_idx;
/* set up line buffer - we cast the result of malloc() because we're using C++ */
line = (char *) malloc (MAX_LINE_LENGTH + 1);
if (!line) {
fprintf(stderr, "ERROR: Could not allocate space for line buffer!\n");
return EXIT_FAILURE;
}
/* read in a line of characters from standard input */
getline(&line, &line_length, stdin);
/* replace newline with NUL character to correctly terminate 'line' */
for (line_idx = 0; line_idx < (int) line_length; line_idx++) {
if (line[line_idx] == '\n') {
line[line_idx] = '\0';
line_length = line_idx;
break;
}
}
/* put the reverse of a word into a buffer, else print the reverse of the word buffer if we encounter a space */
for (line_idx = line_length - 1, word_buffer_idx = 0; line_idx >= -1; line_idx--) {
if (line_idx == -1)
word_buffer[word_buffer_idx] = '\0', rev(word_buffer), fprintf(stdout, "%s\n", word_buffer);
else if (line[line_idx] == ' ')
word_buffer[word_buffer_idx] = '\0', rev(word_buffer), fprintf(stdout, "%s ", word_buffer), word_buffer_idx = 0;
else
word_buffer[word_buffer_idx++] = line[line_idx];
}
/* cleanup memory, to avoid leaks */
free(line);
return EXIT_SUCCESS;
}
To compile with a C++ compiler, and then use:
$ g++ -Wall test.c -o test
$ ./test
foo bar baz
baz bar foo
This example unpacks the input string one word at a time,
and builds an output string by concatenating in reverse order.
`
#include <iostream>
#include <sstream>
using namespace std;
int main()
{
string inp_str("I am British");
string out_str("");
string word_str;
istringstream iss( inp_str );
while (iss >> word_str) {
out_str = word_str + " " + out_str;
} // while (my_iss >> my_word)
cout << out_str << endl;
return 0;
} // main
`
This uses exactly one each of if and while.
#include <string>
#include <iostream>
#include <sstream>
void backwards(std::istream& in, std::ostream& out)
{
std::string word;
if (in >> word) // Read the frontmost word
{
backwards(in, out); // Output the rest of the input backwards...
out << word << " "; // ... and output the frontmost word at the back
}
}
int main()
{
std::string line;
while (getline(std::cin, line))
{
std::istringstream input(line);
backwards(input, std::cout);
std::cout << std::endl;
}
}
You might try this solution in getting a vector of string's using the ' ' (single space) character as a delimiter.
The next step would be to iterate over this vector backwards to generate the reverse string.
Here's what it might look like (split is the string splitting function from that post):
Edit 2: If you don't like vectors for whatever reason, you can use arrays (note that pointers can act as arrays). This example allocates a fixed size array on the heap, you may want to change this to say, double the size when the current word amount has reached a certain value.
Solution using an array instead of a vector:
#include <iostream>
#include <string>
using namespace std;
int getWords(string input, string ** output)
{
*output = new string[256]; // Assumes there will be a max of 256 words (can make this more dynamic if you want)
string currentWord;
int currentWordIndex = 0;
for(int i = 0; i <= input.length(); i++)
{
if(i == input.length() || input[i] == ' ') // We've found a space, so we've reached a new word
{
if(currentWord.length() > 0)
{
(*output)[currentWordIndex] = currentWord;
currentWordIndex++;
}
currentWord.clear();
}
else
{
currentWord.push_back(input[i]); // Add this character to the current word
}
}
return currentWordIndex; // returns the number of words
}
int main ()
{
std::string original, reverse;
std::getline(std::cin, original); // Get the input string
string * arrWords;
int size = getWords(original, &arrWords); // pass in the address of the arrWords array
int index = size - 1;
while(index >= 0)
{
reverse.append(arrWords[index]);
reverse.append(" ");
index--;
}
std::cout << reverse << std::endl;
return 0;
}
Edit: Added includes, main function, while loop format
#include <vector>
#include <string>
#include <iostream>
#include <sstream>
// From the post
std::vector<std::string> &split(const std::string &s, char delim, std::vector<std::string> &elems)
{
std::stringstream ss(s);
std::string item;
while(std::getline(ss, item, delim)) {
elems.push_back(item);
}
return elems;
}
std::vector<std::string> split(const std::string &s, char delim) {
std::vector<std::string> elems;
return split(s, delim, elems);
}
int main ()
{
std::string original, reverse;
std::cout << "Input a string: " << std::endl;
std::getline(std::cin, original); // Get the input string
std::vector<std::string> words = split(original, ' ');
std::vector<std::string>::reverse_iterator rit = words.rbegin();
while(rit != words.rend())
{
reverse.append(*rit);
reverse.append(" "); // add a space
rit++;
}
std::cout << reverse << std::endl;
return 0;
}
This code here uses string libraries to detect the blanks in the input stream and rewrite the output sentence accordingly
The algorithm is
1. Get the input stream using getline function to capture the spacecs. Initialize pos1 to zero.
2. Look for the first space in the input stream
3. If no space is found, the input stream is the output
4. Else, get the position of the first blank after pos1, i.e. pos2.
5. Save the sub-string bewteen pos1 and pos2 at the beginning of the output sentence; newSentence.
6. Pos1 is now at the first char after the blank.
7. Repeat 4, 5 and 6 untill no spaces left.
8. Add the last sub-string to at the beginning of the newSentence. –
#include <iostream>
#include <string>
using namespace std;
int main ()
{
string sentence;
string newSentence;
string::size_type pos1;
string::size_type pos2;
string::size_type len;
cout << "This sentence rewrites a sentence backward word by word\n"
"Hello world => world Hello"<<endl;
getline(cin, sentence);
pos1 = 0;
len = sentence.length();
pos2 = sentence.find(' ',pos1);
while (pos2 != string::npos)
{
newSentence = sentence.substr(pos1, pos2-pos1+1) + newSentence;
pos1 = pos2 + 1;
pos2 = sentence.find(' ',pos1);
}
newSentence = sentence.substr(pos1, len-pos1+1) + " " + newSentence;
cout << endl << newSentence <<endl;
return 0;
}

C++ split string by line

I need to split string by line.
I used to do in the following way:
int doSegment(char *sentence, int segNum)
{
assert(pSegmenter != NULL);
Logger &log = Logger::getLogger();
char delims[] = "\n";
char *line = NULL;
if (sentence != NULL)
{
line = strtok(sentence, delims);
while(line != NULL)
{
cout << line << endl;
line = strtok(NULL, delims);
}
}
else
{
log.error("....");
}
return 0;
}
I input "we are one.\nyes we are." and invoke the doSegment method. But when i debugging, i found the sentence parameter is "we are one.\\nyes we are", and the split failed. Can somebody tell me why this happened and what should i do. Is there anyway else i can use to split string in C++. thanks !
I'd like to use std::getline or std::string::find to go through the string.
below code demonstrates getline function
int doSegment(char *sentence)
{
std::stringstream ss(sentence);
std::string to;
if (sentence != NULL)
{
while(std::getline(ss,to,'\n')){
cout << to <<endl;
}
}
return 0;
}
You can call std::string::find in a loop and the use std::string::substr.
std::vector<std::string> split_string(const std::string& str,
const std::string& delimiter)
{
std::vector<std::string> strings;
std::string::size_type pos = 0;
std::string::size_type prev = 0;
while ((pos = str.find(delimiter, prev)) != std::string::npos)
{
strings.push_back(str.substr(prev, pos - prev));
prev = pos + delimiter.size();
}
// To get the last substring (or only, if delimiter is not found)
strings.push_back(str.substr(prev));
return strings;
}
See example here.
#include <sstream>
#include <string>
#include <vector>
std::vector<std::string> split_string_by_newline(const std::string& str)
{
auto result = std::vector<std::string>{};
auto ss = std::stringstream{str};
for (std::string line; std::getline(ss, line, '\n');)
result.push_back(line);
return result;
}
#include <iostream>
#include <string>
#include <regex>
#include <algorithm>
#include <iterator>
using namespace std;
vector<string> splitter(string in_pattern, string& content){
vector<string> split_content;
regex pattern(in_pattern);
copy( sregex_token_iterator(content.begin(), content.end(), pattern, -1),
sregex_token_iterator(),back_inserter(split_content));
return split_content;
}
int main()
{
string sentence = "This is the first line\n";
sentence += "This is the second line\n";
sentence += "This is the third line\n";
vector<string> lines = splitter(R"(\n)", sentence);
for (string line: lines){cout << line << endl;}
}
We have a string with multiple lines
we split those into an array (vector)
We print out those elements in a for loop
Using the library range-v3:
#include <range/v3/all.hpp>
#include <string>
#include <string_view>
#include <vector>
std::vector<std::string> split_string_by_newline(const std::string_view str) {
return str | ranges::views::split('\n')
| ranges::to<std::vector<std::string>>();
}
Using C++23 ranges:
#include <ranges>
#include <string>
#include <string_view>
#include <vector>
std::vector<std::string> split_string_by_newline(const std::string_view str) {
return str | std::ranges::views::split('\n')
| std::ranges::to<std::vector<std::string>>();
}
This fairly inefficient way just loops through the string until it encounters an \n newline escape character. It then creates a substring and adds it to a vector.
std::vector<std::string> Loader::StringToLines(std::string string)
{
std::vector<std::string> result;
std::string temp;
int markbegin = 0;
int markend = 0;
for (int i = 0; i < string.length(); ++i) {
if (string[i] == '\n') {
markend = i;
result.push_back(string.substr(markbegin, markend - markbegin));
markbegin = (i + 1);
}
}
return result;
}

Tokenize a String in C++

I have a string currentLine="12 23 45"
I need to extract 12, 23, 45 from this string without using Boost libraries. Since i am using string, strtok fails for me. I have tried a number of things still no success.
Here is my last attempt
while(!inputFile.eof())
while(getline(inputFile,currentLine))
{
int countVar=0;
int inputArray[10];
char* tokStr;
tokStr=(char*)strtok(currentLine.c_str()," ");
while(tokstr!=NULL)
{
inputArray[countVar]=(int)tokstr;
countVar++;
tokstr=strtok(NULL," ");
}
}
}
the one without strtok
string currentLine;
while(!inputFile.eof())
while(getline(inputFile,currentLine))
{
cout<<atoi(currentLine.c_str())<<" "<<endl;
int b=0,c=0;
for(int i=1;i<currentLine.length();i++)
{
bool lockOpen=false;
if((currentLine[i]==' ') && (lockOpen==false))
{
b=i;
lockOpen=true;
continue;
}
if((currentLine[i]==' ') && (lockOpen==true))
{
c=i;
break;
}
}
cout<<b<<"b is"<<" "<<c;
}
Try this:
#include <sstream>
std::string str = "12 34 56";
int a,b,c;
std::istringstream stream(str);
stream >> a >> b >> c;
Read a lot about c++ streams here: http://www.cplusplus.com/reference/iostream/
std::istringstream istr(your_string);
std::vector<int> numbers;
int number;
while (istr >> number)
numbers.push_back(number);
Or, simpler (though not really shorter):
std::vector<int> numbers;
std::copy(
std::istream_iterator<int>(istr),
std::istream_iterator<int>(),
std::back_inserter(numbers));
(Requires the standard headers <sstream>, <algorithm> and <iterator>.)
You can also opt for Boost tokenizer ......
#include <iostream>
#include <string>
#include <boost/foreach.hpp>
#include <boost/tokenizer.hpp>
using namespace std;
using namespace boost;
int main(int argc, char** argv)
{
string str= "India, gold was dear";
char_separator<char> sep(", ");
tokenizer< char_separator<char> > tokens(str, sep);
BOOST_FOREACH(string t, tokens)
{
cout << t << "." << endl;
}
}
stringstream and boost::tokenizer are two possibilities. Here is a more explicit solution using string::find and string::substr.
std::list<std::string>
tokenize(
std::string const& str,
char const token[])
{
std::list<std::string> results;
std::string::size_type j = 0;
while (j < str.length())
{
std::string::size_type k = str.find(token, j);
if (k == std::string::npos)
k = str.length();
results.push_back(str.substr(j, k-j));
j = k + 1;
}
return results;
}
Hope this helps. You can easily turn this into an algorithm that writes the tokens to arbitrary containers or takes a function handle that processes the tokens.