This may be a simple question, but I am writing some simple C++ code to parse a string using a delimiter, and I want the delimiter to be multiple spaces (actually, one or more spaces). Is it possible to do it that way? My sample code is:
#include <stdio.h>
#include <iostream>
#include <vector>
#include <string>
#include <fstream>
#include <stdlib.h>
#include <cstring>
#include <sstream>
using namespace std;
int main()
{
string str="HELLO THIS IS 888and777";
char buf[1000];
getline(buf, 1000);
string str(buf);
stringstream stream(buf);
string toStr;
getline(stream, toStr,' ');//here the delimiter is six spaces
string str1=tostr;
getline(stream, toStr,' ');//here the delimiter is two spaces
string str2=tostr;
getline(stream, toStr,' ');//here the delimiter is three spaces
string str3=tostr;
cout<<str1<<"\t"<<str2<<"\t"<<str3<<endl;
return 0;
}
But I can't use a delimiter of multiple characters. Any ideas, please?
I get the following errors:
error: invalid conversion from ‘void*’ to ‘char**’
error: cannot convert ‘std::string’ to ‘size_t*’ for argument ‘2’ to ‘__ssize_t getline(char**, size_t*, FILE*)’
The delimiter used by std::getline() is a single character. Accepting a string would require a non-trivial algorithm to guarantee suitable performance. In addition, a character literal written as 'x' is normally expected to denote an individual char.
For the example I think the easiest approach is to simply tokenize the string directly:
#include <tuple>
#include <utility>
#include <string>
#include <iostream>
std::pair<std::string, std::string::size_type>
get_token(std::string const& value, std::string::size_type pos, std::string const& delimiter)
{
if (pos == value.npos) {
return std::make_pair(std::string(), pos);
}
std::string::size_type end(value.find(delimiter, pos));
return end == value.npos
? std::make_pair(value.substr(pos), end)
: std::make_pair(value.substr(pos, end - pos), end + delimiter.size());
}
int main()
{
std::string str("HELLO THIS IS 888and777");
std::string str1, str2, str3;
std::string::size_type pos(0);
std::tie(str1, pos) = get_token(str, pos, " ");
std::tie(str2, pos) = get_token(str, pos, " ");
std::tie(str3, pos) = get_token(str, pos, " ");
std::cout << "str1='" << str1 << "' str2='" << str2 << "' str3='" << str3 << "'\n";
}
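Since the question asks for "one or more spaces" rather than a fixed number, a regular expression is another option. The following is only a sketch, assuming C++11 and <regex> are available; the function name split_on_space_runs is made up for illustration. It uses std::sregex_token_iterator with submatch index -1, which yields the text between matches, and drops the empty token that appears when the input starts with spaces.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split `text` on runs of one or more spaces.
std::vector<std::string> split_on_space_runs(const std::string& text)
{
    static const std::regex one_or_more_spaces(" +");
    std::vector<std::string> tokens;

    // Index -1 selects the pieces *between* delimiter matches.
    std::sregex_token_iterator it(text.begin(), text.end(), one_or_more_spaces, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        if (it->length() > 0) {          // skip the empty token before leading spaces
            tokens.push_back(it->str());
        }
    }
    return tokens;
}
```

Calling split_on_space_runs("HELLO      THIS  IS   888and777") should give the four tokens HELLO, THIS, IS and 888and777, regardless of how many spaces separate them.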
I am creating a function that splits a sentence into words, and believe the way to do this is to use str.substr, starting at str[0] and then using str.find to find the index of the first " " character. Then update the starting position parameter of str.find to start at the index of that " " character, until the end of str.length().
I am using two variables to mark the beginning position and end position of the word, and update the beginning position variable with the ending position of the last. But it is not updating as desired in the loop as I currently have it, and cannot figure out why.
#include <iostream>
#include <string>
using namespace std;
void splitInWords(string str);
int main() {
string testString("This is a test string");
splitInWords(testString);
return 0;
}
void splitInWords(string str) {
int i;
int beginWord, endWord, tempWord;
string wordDelim = " ";
string testWord;
beginWord = 0;
for (i = 0; i < str.length(); i += 1) {
endWord = str.find(wordDelim, beginWord);
testWord = str.substr(beginWord, endWord);
beginWord = endWord;
cout << testWord << " ";
}
}
It is easier to use a string stream.
#include <vector>
#include <string>
#include <sstream>
using namespace std;
vector<string> split(const string& s, char delimiter)
{
vector<string> tokens;
string token;
istringstream tokenStream(s);
while (getline(tokenStream, token, delimiter))
{
tokens.push_back(token);
}
return tokens;
}
int main() {
string testString("This is a test string");
vector<string> result=split(testString,' ');
return 0;
}
You can write it using the existing C++ libraries:
#include <string>
#include <vector>
#include <iterator>
#include <sstream>
int main()
{
std::string testString("This is a test string");
std::istringstream wordStream(testString);
std::vector<std::string> result(std::istream_iterator<std::string>{wordStream},
std::istream_iterator<std::string>{});
}
A couple of issues:
The second parameter of the substr() method is a length (not a position).
// Here you are using `endWord` which is a position in the string.
// This only works when beginWord is 0
// for all other values you are providing an incorrect len.
testWord = str.substr(beginWord, endWord);
The find() method starts searching at the position given by its second parameter.
// If str[beginWord] contains one of the delimiter characters
// Then it will return beginWord
// i.e. you are not moving forward.
endWord = str.find(wordDelim, beginWord);
// So you end up stuck on the first space.
Assuming you fix the above, you would still be adding a space at the front of each word.
// You need to actively search and remove the spaces
// before reading the words.
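Putting the three points together, a corrected find/substr loop might look like the sketch below. It is not the asker's final code: the name split_in_words and the choice to return a vector instead of printing are assumptions made for illustration. It passes a length to substr(), skips delimiters before each word so the search always moves forward, and never includes the leading space.

```cpp
#include <string>
#include <vector>

// Hypothetical corrected version of the question's loop.
std::vector<std::string> split_in_words(const std::string& str)
{
    const std::string word_delim = " ";
    std::vector<std::string> words;

    // Skip any delimiters in front of the first word.
    std::string::size_type begin_word = str.find_first_not_of(word_delim);
    while (begin_word != std::string::npos) {
        const std::string::size_type end_word = str.find(word_delim, begin_word);

        // substr() takes a length, so convert the end position into one.
        const std::string::size_type length =
            (end_word == std::string::npos) ? std::string::npos : end_word - begin_word;
        words.push_back(str.substr(begin_word, length));

        // Move past the delimiter(s); yields npos when nothing is left.
        begin_word = str.find_first_not_of(word_delim, end_word);
    }
    return words;
}
```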
Nice things you could do:
Here:
void splitInWords(string str) {
You are passing the parameter by value. This means you are making a copy. A better technique would be to pass by const reference (you are not modifying the original or the copy).
void splitInWords(string const& str) {
An Alternative
You can use the stream functionality.
void split(std::istream& stream)
{
std::string word;
stream >> word; // This drops leading space.
// Then reads characters into `word`
// until a "white space" character is
// found.
// Note: it empties `word` before adding any characters.
}
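To read every word rather than just the first, the same extraction can be put in a loop. This is a minimal sketch; read_words is a hypothetical name, and it relies on the stream's default whitespace skipping.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collect every whitespace-separated word from a stream.
std::vector<std::string> read_words(std::istream& stream)
{
    std::vector<std::string> words;
    std::string word;
    // operator>> returns the stream, which converts to false once
    // no further word can be extracted.
    while (stream >> word) {
        words.push_back(word);
    }
    return words;
}
```

To split a string, wrap it in a std::istringstream first and pass that to read_words.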
What is the right way to split a string like below at a specific character-combination into a string vector?
string myString = "This is \n a test. Let's go on. \n Yeah.";
split at "\n" to get this result:
vector<string> myVector = {
"This is ",
" a test. Let's go on. ",
" Yeah."
}
I was using boost algorithm library but now I'd like to achieve this all without using an external library like boost.
#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
std::vector<std::string> result;
boost::split(result, "This is \n a test. Let's go on. \n Yeah.",
boost::is_any_of("\n"), boost::token_compress_on);
How about something like this:
#include <iostream>
#include <sstream>
#include <vector>
#include <string>
#include <iterator>
class line : public std::string {};
std::istream &operator>>(std::istream &iss, line &line)
{
std::getline(iss, line, '\n');
return iss;
}
int main()
{
std::istringstream iss("This is \n a test. Let's go on. \n Yeah.");
std::vector<std::string> v(std::istream_iterator<line>{iss}, std::istream_iterator<line>{});
// test
for (auto const &s : v)
std::cout << s << std::endl;
return 0;
}
Basically, this makes a new string type, line, and uses stream iterators to read whole lines straight into the vector's range constructor.
Working demo: https://ideone.com/4qdfY2
Solution 1: Just remove "\n" from the string.
To simply remove "\n", you can use the erase-remove idiom.
#include <iostream>
#include <string>
#include <algorithm>
int main()
{
std::string myString = "This is \n a test. Let's go on. \n Yeah.";
myString.erase(std::remove(myString.begin(), myString.end(), '\n'),
myString.end());
std::cout << myString<< std::endl;
}
Output:
This is  a test. Let's go on.  Yeah.
Solution 2: Remove "\n" from the string and save each piece split at \n to a vector. (inefficient)
Replace every occurrence of \n with some other character that doesn't exist in the string (here I have chosen ;). Then parse with the help of std::stringstream and std::getline as follows.
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
#include <sstream>
int main()
{
std::string myString = "This is \n a test. Let's go on. \n Yeah.";
std::replace(myString.begin(), myString.end(), '\n', ';');
std::stringstream ssMyString(myString);
std::string each_split;
std::vector<std::string> vec;
while(std::getline(ssMyString, each_split, ';')) vec.emplace_back(each_split);
for(const auto& it: vec) std::cout << it << "\n";
}
Output:
This is
a test. Let's go on.
Yeah.
Solution 3: Remove "\n" from the string and save each piece split at \n to a vector.
Loop through the string and use std::string::find to locate each \n (the end position). Push back the substrings (std::string::substr) using the starting position and the number of characters between the start and end positions. Update the start and end positions each time, so that the lookup does not start again from the beginning of the input string.
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
#include <cstddef>
int main()
{
std::string myString = "This is \n a test. Let's go on. \n Yeah.";
std::vector<std::string> vec;
std::size_t start_pos = 0;
std::size_t end_pos = 0;
while ((end_pos = myString.find("\n", end_pos)) != std::string::npos)
{
vec.emplace_back(myString.substr(start_pos, end_pos - start_pos));
start_pos = end_pos + 1;
end_pos = start_pos; // resume right after the delimiter, so consecutive "\n" are not skipped
}
vec.emplace_back(myString.substr(start_pos, myString.size() - start_pos)); // last substring
for(const auto& it: vec) std::cout << it << "\n";
}
Output:
This is
a test. Let's go on.
Yeah.
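The find/substr loop of Solution 3 generalizes to a delimiter of any length, which is what "a specific character-combination" suggests. The sketch below is one way to write it; split_by is a hypothetical name, and treating an empty delimiter as "no split" is an arbitrary choice made here to avoid an endless loop.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split `text` at every occurrence of `delimiter`,
// which may be longer than one character. Empty pieces are kept.
std::vector<std::string> split_by(const std::string& text, const std::string& delimiter)
{
    std::vector<std::string> parts;
    if (delimiter.empty()) {             // assumption: an empty delimiter means "do not split"
        parts.push_back(text);
        return parts;
    }

    std::string::size_type start = 0;
    std::string::size_type found;
    while ((found = text.find(delimiter, start)) != std::string::npos) {
        parts.push_back(text.substr(start, found - start));
        start = found + delimiter.size(); // continue right after the delimiter
    }
    parts.push_back(text.substr(start));  // remainder after the last delimiter
    return parts;
}
```

With the string from the question, split_by(myString, "\n") should return the three pieces "This is ", " a test. Let's go on. " and " Yeah.", surrounding spaces included.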
How can I find the position of a character in a string? Ex. If I input "abc*ab" I would like to create a new string with just "abc". Can you help me with my problem?
C++ standard string provides a find method:
s.find(c)
returns the position of the first instance of character c in string s, or std::string::npos if the character is not present at all. You can also pass the starting index for the search; i.e.
s.find(c, x0)
will return the first index of character c but starting the search from position x0.
std::find returns an iterator to the first element it finds that compares equal to what you're looking for (or the second argument if it doesn't find anything, in this case the end iterator.) You can construct a std::string using iterators.
#include <iostream>
#include <string>
#include <algorithm>
int main()
{
std::string s = "abc*ab";
std::string s2(s.begin(), std::find(s.begin(), s.end(), '*'));
std::cout << s2;
return 0;
}
If you are working with std::string type, then it is very easy to find the position of a character, by using std::find algorithm like so:
#include <string>
#include <algorithm>
#include <iostream>
using namespace std;
int main()
{
string first_string = "abc*ab";
string truncated_string = string( first_string.cbegin(), find( first_string.cbegin(), first_string.cend(), '*' ) );
cout << truncated_string << endl;
}
Note: if your character is found multiple times in your std::string, then the find algorithm will return the position of the first occurrence.
Elaborating on existing answers, you can use string.find() and string.substr():
#include <iostream>
#include <string>
int main() {
std::string s = "abc*ab";
size_t index = s.find("*");
if (index != std::string::npos) {
std::string prefix = s.substr(0, index);
std::cout << prefix << "\n"; // => abc
}
}
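If returning the whole string is acceptable when the marker is missing, the npos check can be folded away, because substr() treats a count of std::string::npos as "to the end of the string". A small sketch, with prefix_before as a made-up name:

```cpp
#include <string>

// Hypothetical helper: everything before the first `marker`,
// or the whole string when `marker` does not occur.
std::string prefix_before(const std::string& s, char marker)
{
    // find() returns npos when nothing is found, and substr(0, npos)
    // copies the entire string.
    return s.substr(0, s.find(marker));
}
```

For the input from the question, prefix_before("abc*ab", '*') yields "abc".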
I've been trying to make a program that parses a text file and feeds 6 pieces of information into an array of objects. The problem for me is that I'm having issues figuring out how to process the text file. I was told that the first step I needed to do was to write some code that counted how many letters long each entry was. The txt file is in this format:
"thing1","thing2","thing3","thing4","thing5","thing6"
This is the current version of my code:
#include<iostream>
#include<string>
#include<fstream>
#include<cstring>
using namespace std;
int main()
{
ifstream myFile("Book List.txt");
while(myFile.good())
{
string line;
getline(myFile, line);
char *sArr = new char[line.length() + 1];
strcpy(sArr, line.c_str());
char *sPtr;
sPtr = strtok(sArr, " ");
while(sPtr != NULL)
{
cout << strlen(sPtr) << " ";
sPtr = strtok(NULL, " ");
}
cout << endl;
}
myFile.close();
return 0;
}
So there are two things making it hard for me right now.
1) How do I deal with the delimiters?
2) How do I deal with "skipping" the first quotation mark in each line?
Read in a string instead of a c-style string. This means that you can use the handy std methods.
The std::string::find() method should help you out with finding each thing that you want to parse.
http://www.cplusplus.com/reference/string/string/find/
You can use this to find all the commas, which will give you the starts of all the things.
Then you can use std::string::substr() to cut up the string into each piece.
http://www.cplusplus.com/reference/string/string/substr/
You can manage to get rid of the quotation marks by passing in 1 more than the start and 1 less than the length of the thing, you can also use
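As a rough sketch of the find()/substr() approach described above: the function below walks a line comma by comma and strips one pair of surrounding quotation marks from each field. The name parse_quoted_fields is hypothetical, and the code assumes that fields never contain commas or escaped quotes, which the sample format suggests but does not guarantee.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split a line such as "thing1","thing2" at commas
// and drop the surrounding quotation marks of each field.
std::vector<std::string> parse_quoted_fields(const std::string& line)
{
    std::vector<std::string> fields;
    if (line.empty()) {
        return fields;
    }

    std::string::size_type start = 0;
    while (start <= line.size()) {
        std::string::size_type comma = line.find(',', start);
        if (comma == std::string::npos) {
            comma = line.size();          // last field runs to the end of the line
        }
        std::string field = line.substr(start, comma - start);

        // Skip the opening quote and stop before the closing one.
        if (field.size() >= 2 && field.front() == '"' && field.back() == '"') {
            field = field.substr(1, field.size() - 2);
        }
        fields.push_back(field);
        start = comma + 1;
    }
    return fields;
}
```

Calling size() on each returned field then gives the per-entry lengths the exercise asks for.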
If you have to use strtok then this code snippet should give enough to modify your program to parse your data:
#include <cstdio>
#include <cstring>
int main ()
{
char str[] ="\"thing1\",\"thing2\",\"thing3\",\"thing4\",\"thing5\"";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str,"\",");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, ",\"");
}
return 0;
}
If you do not have to use strtok then you should use std::string as others have advised. Using std::string and std::istringstream:
#include <string>
#include <sstream>
#include <vector>
#include <iostream>
int main ()
{
std::string str2( "\"thing1\",\"thing2\",\"thing3\",\"thing4\",\"thing5\"" ) ;
std::istringstream is(str2);
std::string part;
while (getline(is, part, ','))
std::cout << part.substr(1,part.length()-2) << std::endl;
return 0;
}
For starters, don't use strtok if you can avoid it (and you easily can here - and you can even avoid using the find series of functions as well).
If you want to read in the whole line and then parse it:
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <locale>
#include <sstream>
#include <string>
#include <vector>
// defines a new ctype that treats commas as whitespace
struct csv_reader : std::ctype<char>
{
csv_reader() : std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table()
{
static std::vector<std::ctype_base::mask> rc(table_size, std::ctype_base::mask());
rc['\n'] = std::ctype_base::space;
rc[','] = std::ctype_base::space;
return &rc[0];
}
};
int main()
{
std::ifstream fin("yourFile.txt");
std::string line;
std::vector<std::vector<std::string>> values;
while (std::getline(fin, line))
{
std::istringstream iss(line);
// std::locale takes a pointer to the facet and manages its lifetime, so allocate it with new
iss.imbue(std::locale(std::locale(), new csv_reader()));
std::vector<std::string> vec;
std::copy(std::istream_iterator<std::string>(iss), std::istream_iterator<std::string>(), std::back_inserter(vec));
values.push_back(vec);
}
// values now contains a vector for each line that has the strings split by their commas
fin.close();
return 0;
}
That answers your first question. For your second, you can skip all the quotation marks by adding them to the rc mask (also treating them as whitespace) or you can strip them out afterwards (either directly or by using a transform):
// std::transform assigns the lambda's return value to each element, so the lambda must return the cleaned string
std::transform(vec.begin(), vec.end(), vec.begin(), [](std::string s)
{
s.erase(std::remove(s.begin(), s.end(), '"'), s.end());
return s;
});
Best way to split a string in C++? The string can be assumed to be composed of words separated by ;
According to our guidelines, C string functions are not allowed, and Boost is not allowed either, because open source is ruled out over security concerns.
The best solution I have right now is:
string str("denmark;sweden;india;us");
The str above should be stored in a vector as strings. How can we achieve this?
Thanks for any input.
I find std::getline() is often the simplest. The optional delimiter parameter means it's not just for reading "lines":
#include <sstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main() {
vector<string> strings;
istringstream f("denmark;sweden;india;us");
string s;
while (getline(f, s, ';')) {
cout << s << endl;
strings.push_back(s);
}
}
You could use a string stream and read the elements into the vector.
Here are many different examples...
A copy of one of the examples:
std::vector<std::string> split(const std::string& s, char seperator)
{
std::vector<std::string> output;
std::string::size_type prev_pos = 0, pos = 0;
while((pos = s.find(seperator, pos)) != std::string::npos)
{
std::string substring( s.substr(prev_pos, pos-prev_pos) );
output.push_back(substring);
prev_pos = ++pos;
}
output.push_back(s.substr(prev_pos, pos-prev_pos)); // Last word
return output;
}
There are several libraries available solving this problem, but the simplest is probably to use Boost Tokenizer:
#include <iostream>
#include <string>
#include <boost/tokenizer.hpp>
#include <boost/foreach.hpp>
typedef boost::tokenizer<boost::char_separator<char> > tokenizer;
std::string str("denmark;sweden;india;us");
boost::char_separator<char> sep(";");
tokenizer tokens(str, sep);
BOOST_FOREACH(std::string const& token, tokens)
{
std::cout << "<" << token << "> " << "\n";
}