Splitting a string using a single delimiter [duplicate] - c++

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Splitting a string in C++
I'm trying to split a single string object with a delimiter into separate strings and then output the individual strings.
E.g. the input string is firstname,lastname-age-occupation-telephone
The '-' character is the delimiter, and I need to output the parts separately using only the string class functions.
What would be the best way to do this? I'm having a hard time understanding .find, .substr and similar functions.
Thanks!

I think string streams and getline make for easy-to-read code:
#include <string>
#include <sstream>
#include <iostream>
std::string s = "firstname,lastname-age-occupation-telephone";
std::istringstream iss(s);
for (std::string item; std::getline(iss, item, '-'); )
{
std::cout << "Found token: " << item << std::endl;
}
Here's using only string member functions:
for (std::string::size_type pos, cur = 0;
(pos = s.find('-', cur)) != s.npos || cur != s.npos; cur = pos)
{
std::cout << "Found token: " << s.substr(cur, pos - cur) << std::endl;
if (pos != s.npos) ++pos; // gobble up the delimiter
}
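The member-function loop above can also be wrapped in a function that returns the tokens instead of printing them. This is only a minimal sketch: the name split_on and the choice of std::vector for the result are assumptions, not part of the original answer.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `s` on a single character using only
// std::string::find and std::string::substr.
std::vector<std::string> split_on(const std::string& s, char delim)
{
    std::vector<std::string> tokens;
    std::string::size_type cur = 0;
    for (;;) {
        const std::string::size_type pos = s.find(delim, cur);
        // When pos is npos, substr takes everything up to the end of s.
        tokens.push_back(s.substr(cur, pos == std::string::npos ? pos : pos - cur));
        if (pos == std::string::npos)
            break;
        cur = pos + 1; // skip the delimiter
    }
    return tokens;
}
```

Calling split_on(s, '-') on the string from the question should give four tokens, the first being "firstname,lastname". Adjacent delimiters produce empty tokens rather than being merged.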

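I'd do something like this:
// Assumes delim is a single character.
std::string::size_type posEnd; // declared outside the loop so the while condition can see it
do
{
posEnd = myString.find(delim);
// Your current token is [0, posEnd). Do whatever you want with it,
// e.g. get it as a string (substr clamps the count when posEnd is npos):
std::string token = myString.substr(0, posEnd);
if (posEnd != std::string::npos)
myString.erase(0, posEnd + 1); // drop the token and the delimiter
} while (posEnd != std::string::npos);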

Related

How to delete plain "\n" from a string in C++? [duplicate]

This question already has answers here:
How to remove all substrings from a string
(5 answers)
Closed 4 years ago.
As above, I'm trying to delete a two-character substring (not a newline character, just the plain text \n) from a line.
What I am doing right now is line.replace(line.find("\\n"), 3, ""); because I wanted to escape it, but I get a debug error saying that abort() has been called. Moreover, I am not sure about the size 3, because the first slash shouldn't be treated as a literal character.
I guess this is exactly what you're looking for:
std::string str = "This is \\n a string \\n containing \\n a lot of \\n stuff.";
const std::string to_erase = "\\n";
// Search for the substring in string
std::size_t pos = str.find(to_erase);
while (pos != std::string::npos) {
// If found then erase it from string
str.erase(pos, to_erase.length());
pos = str.find(to_erase);
}
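Note that you are most likely hitting abort() because find returned std::string::npos and you passed it to std::string::replace as the position: that throws std::out_of_range, and an uncaught exception ends in abort(). The length should also be 2, not 3; an overlong count does not throw, but it can remove one character too many.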
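The erase loop above can be packaged so that the position is always checked before it is used. This is a minimal sketch; the name erase_all is hypothetical, and unlike the loop above it resumes the search at the removal point instead of rescanning from the start.

```cpp
#include <string>

// Hypothetical helper: removes every occurrence of `what` from `s`.
std::string erase_all(std::string s, const std::string& what)
{
    if (what.empty())
        return s; // nothing to search for; also avoids an endless loop
    std::string::size_type pos = 0;
    while ((pos = s.find(what, pos)) != std::string::npos)
        s.erase(pos, what.length()); // pos was just found, so it is in range
    return s;
}
```

Because the search continues from the removal point, an occurrence that only comes into existence when the text on both sides of a removal is joined is left in place. Restart the search from 0 if you want those removed as well.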
#include <iostream>
#include <string>
int main()
{
std::string head = "Hi //nEveryone. //nLets //nsee //nif //nthis //nworks";
const std::string sub = "//n";
// Erase every occurrence, resuming the search where the last one was removed.
// (Stepping a string iterator while modifying the string would invalidate the iterator.)
for (std::size_t found = head.find (sub); found != std::string::npos; found = head.find (sub, found))
head.erase (found, sub.length ());
std::cout << "after everything = " << head << std::endl;
}
I got this output:
after everything = Hi Everyone. Lets see if this works

Best and efficient way to extract a First word from string without space when only a "GOOD" string is available in string?

What is the best and most efficient way to extract the first word (without spaces) from a std::string, but only when the string contains "GOOD"?
Example
std::string temp = " THIS IS MY STACK OVER FLOW SECOND QUESTION IS IT GOOD";
Output: THIS
My logic below fails when the input is only "GOOD GOOD GOOD GOOD"
#define SerachGOOD "GOOD"
#define SerachBAD "BAD "
#define firstpos 0
using namespace std;
void removeSpaces(std::string & input)
{
input.erase(std::remove(input.begin(),input.end(),' '),input.end());
cout<<input<<endl;
}
void GetFirstiWord_IF_GOOD(std::string temp)
{
if (temp.find(SerachGOOD) != std::string::npos)
{
std::string FirstWord = temp.substr(firstpos, temp.find(SerachGOOD));
removeSpaces(FirstWord);
cout<<FirstWord<<endl;
}
}
Your code is not trying to extract the first word at all. It takes everything that precedes the first "GOOD", removes all spaces from it, and prints what is left.
Thus, this input:
" THIS IS MY STACK OVER FLOW SECOND QUESTION IS IT GOOD"
Would output this:
"THISISMYSTACKOVERFLOWSECONDQUESTIONISIT"
Which is not what you want.
Try something more like this instead:
#include <iostream>
#include <string>
#define SearchGOOD "GOOD"
void GetFirstWord_IF_GOOD(const std::string &temp)
{
if (temp.find(SearchGOOD) != std::string::npos)
{
std::string::size_type start_pos = temp.find_first_not_of(" \t");
std::string::size_type end_pos = temp.find_first_of(" \t", start_pos + 1);
if (end_pos == std::string::npos)
end_pos = temp.size();
std::string FirstWord = temp.substr(start_pos, end_pos - start_pos);
std::cout << FirstWord << std::endl;
}
}
Or simpler, just let the STL do the parsing for you:
#include <iostream>
#include <string>
#include <sstream>
#define SearchGOOD "GOOD"
void GetFirstWord_IF_GOOD(const std::string &temp)
{
if (temp.find(SearchGOOD) != std::string::npos)
{
std::string FirstWord;
std::istringstream iss(temp);
iss >> FirstWord;
std::cout << FirstWord << std::endl;
}
}
Use a std::stringstream to read the first word from the string.
std::string GetFirstiWord_IF_GOOD(std::string temp)
{
std::string FirstWord;
if (temp.find(SerachGOOD) != std::string::npos)
{
std::stringstream ss(temp);
ss >> FirstWord;
}
return FirstWord;
}
I noticed a slight anomaly in your first string: it has a leading space, which will throw off a simple parse (your second example would appear to work correctly, but a different string would give a completely different outcome).
e.g.: "THIS IS MY STACK OVER FLOW SECOND QUESTION IS IT GOOD"
vs: " THIS IS MY STACK OVER FLOW SECOND QUESTION IS IT GOOD"
To get around this issue, skip the leading whitespace first, then simply take everything up to the first space. So: find, position, and split.
void GetFirstiWord_IF_GOOD(std::string temp)
{
while (isspace(temp[0])) {
temp.erase(0,1);
}
if (temp.find(SearchGOOD) != std::string::npos)
{
temp = temp.substr(0, temp.find_first_of(' '));
cout << temp << endl;
}
}
I think your remove string function is causing your false success in the first string, but I am having troubles running your code on my system. If you drop the code above into your existing code it should work as you are expecting.
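For comparison, here is a minimal sketch that returns the word instead of printing it and does not modify its argument. The function name and the decision to return an empty string when the marker is absent are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical helper: returns the first whitespace-delimited word of `text`
// if `text` contains `marker`, otherwise an empty string.
std::string first_word_if_contains(const std::string& text, const std::string& marker)
{
    if (text.find(marker) == std::string::npos)
        return std::string();
    const std::string::size_type start = text.find_first_not_of(" \t");
    if (start == std::string::npos)
        return std::string(); // text is blank
    const std::string::size_type end = text.find_first_of(" \t", start);
    // substr clamps the count, so end == npos means "up to the end of text".
    return text.substr(start, end == std::string::npos ? end : end - start);
}
```

With this version the input "GOOD GOOD GOOD GOOD" from the question should yield "GOOD", and the example with the leading space should yield "THIS".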

Getting a word or sub string from main string when char '\' from RHS is found and then erase rest

Suppose I have a string as below
input = " \\PATH\MYFILES This is my sting "
output = MYFILES
Starting from the right-hand side, when the first '\' character is found, get the word that follows it (i.e. MYFILES) and erase the rest.
Below is the approach I tried, but it is bad because there is a runtime error: ABORTED TERMINATED WITH A CORE.
Please suggest the cleanest and/or shortest way to get only a single word (i.e. MYFILES) from the above string.
I have been searching and trying for the last two days but with no luck. Please help.
Note: The input string in the above example is not hardcoded; its contents change dynamically, but the '\' character is always present.
std::regex const r{R"~(.*[^\\]\\([^\\])+).*)~"} ;
std::string s(R"(" //PATH//MYFILES This is my sting "));
std::smatch m;
int main()
{
if(std::regex_match(s,m,r))
{
std::cout<<m[1]<<endl;
}
}
}
To erase part of a string, you have to find where that part begins and ends. Finding something inside a std::string is very easy because the class has six built-in methods for this (std::string::find_first_of, std::string::find_last_of, etc.). Here is a small example of how your problem can be solved:
#include <iostream>
#include <string>
int main() {
std::string input { " \\PATH\\MYFILES This is my sting " };
auto pos = input.find_last_of('\\');
if(pos != std::string::npos) {
input.erase(0, pos + 1);
pos = input.find_first_of(' ');
if(pos != std::string::npos)
input.erase(pos);
}
std::cout << input << std::endl;
}
Note: watch out for escape sequences, a single backslash is written as "\\" inside a string literal.
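The same idea can be written without modifying the input, as a minimal sketch. The function name is hypothetical, and it assumes the word ends at the first space after the last backslash.

```cpp
#include <string>

// Hypothetical helper: returns the word that follows the last backslash in
// `input`, or an empty string if there is no backslash.
std::string word_after_last_backslash(const std::string& input)
{
    const std::string::size_type slash = input.rfind('\\');
    if (slash == std::string::npos)
        return std::string();
    const std::string::size_type start = slash + 1;
    const std::string::size_type end = input.find(' ', start);
    // substr clamps the count, so end == npos means "up to the end of input".
    return input.substr(start, end == std::string::npos ? end : end - start);
}
```

A raw string literal such as R"( \\PATH\MYFILES This is my sting )" is one way to write the test input without doubling every backslash.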

Strtok and Char* [duplicate]

This question already has answers here:
C's strtok() and read only string literals
(5 answers)
Closed 8 years ago.
I have some simple code where I am trying to go through a char* and split it into separate words. Here is the code I have.
#include <iostream>
#include <stdio.h>
int main ()
{
char * string1 = "- This is a test string";
char * character_pointer;
std::cout << "Splitting stringinto tokens:" << string1 << std::endl;
character_pointer = strtok (string1," ");
while (character_pointer != NULL)
{
printf ("%s\n", character_pointer);
character_pointer = strtok (NULL, " ");
}
return 0;
}
I am getting an error that will not allow me to do this.
So my question is, how do I go through and find each word in a char*. For my actual program I am working on, one of my libraries returns a paragraph of words as a const char* and I need to stem each word using a stemming algorithm (I know how to do this, I just do not know how to send each individual word to the stemmer). If someone could just solve how to get the example code to work, I will be able to figure it out. All of the examples online use a char[] for string1 instead of a char* and I cannot do that.
This is the simplest (code-wise) way I know to split a string in C++:
std::string string1 = "- This is a test string";
std::string word;
std::istringstream iss(string1);
// by default this splits on any whitespace
while(iss >> word) {
std::cout << word << '\n';
}
or like this if you want to specify a delimiter.
while(std::getline(iss, word, ' ')) {
std::cout << word << '\n';
}
Here's a corrected version, try it out:
#include <iostream>
#include <stdio.h>
#include <cstring>
int main ()
{
char string1[] = "- This is a test string";
char * character_pointer;
std::cout << "Splitting stringinto tokens:" << string1 << std::endl;
character_pointer = strtok (string1," ");
while (character_pointer != NULL)
{
printf ("%s\n", character_pointer);
character_pointer = strtok (NULL, " ");
}
return 0;
}
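The question mentions a library that hands back a const char*. strtok writes into the buffer it is given, so it cannot be used on that pointer directly. One option is to let a string stream work on a copy. This is a minimal sketch; the name words_of is hypothetical, and it assumes the words are separated by whitespace.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a read-only C string into whitespace-separated
// words without modifying the original buffer.
std::vector<std::string> words_of(const char* text)
{
    std::vector<std::string> words;
    if (text == nullptr)
        return words;
    std::istringstream iss(text); // the stream holds its own copy of the characters
    for (std::string word; iss >> word; )
        words.push_back(word);
    return words;
}
```

Each element of the returned vector can then be handed to the stemmer one at a time.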
There are different ways you could do this in C++.
If space is your delimiter, then you can get the tokens this way:
std::string text = "- This is a test string";
std::istringstream ss(text);
std::vector<std::string> tokens;
std::copy(std::istream_iterator<std::string>(ss),
std::istream_iterator<std::string>(),
std::back_inserter<std::vector<std::string>>(tokens));
You can also tokenize the string in C++ using regular expressions.
std::string text = "- This is a test string";
std::regex pattern("\\s+");
std::sregex_token_iterator it(std::begin(text), std::end(text), pattern, -1);
std::sregex_token_iterator end;
for(; it != end; ++it)
{
std::cout << it->str() << std::endl;
}
Forget about strtok. To get exactly what you seem to be
aiming for:
std::string const source = "- This is a test string";
std::vector<std::string> tokens;
std::string::const_iterator start = source.begin();
std::string::const_iterator end = source.end();
std::string::const_iterator next = std::find( start, end, ' ' );
while ( next != end ) {
tokens.push_back( std::string( start, next ) );
start = next + 1;
next = std::find( start, end, ' ' );
}
tokens.push_back( std::string( start, next ) );
Of course, this can be modified as much as you want: you can use
std::find_first_of if you want more than one separator, or
std::search if you want a multi-character separator, or even
std::find_if for an arbitrary test (with a lambda, if you have
C++11). And in most of the cases where you're parsing, you can
just pass around two iterators, rather than having to construct
a substring; you only need to construct a substring when you
want to save the extracted token somewhere.
Once you get used to using iterators and the standard
algorithms, you'll find it a lot more flexible than strtok,
and it doesn't have all of the drawbacks which the internal
state implies.
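As an illustration of the std::find_first_of variant mentioned above, here is a minimal sketch that splits at any of several separator characters. The function name split_any is hypothetical.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: splits `source` at any character found in `separators`.
std::vector<std::string> split_any(const std::string& source, const std::string& separators)
{
    std::vector<std::string> tokens;
    std::string::const_iterator start = source.begin();
    const std::string::const_iterator end = source.end();
    for (;;) {
        const std::string::const_iterator next =
            std::find_first_of(start, end, separators.begin(), separators.end());
        tokens.push_back(std::string(start, next));
        if (next == end)
            break;
        start = next + 1; // step over the separator
    }
    return tokens;
}
```

Like the loop above, this keeps empty tokens when two separators are adjacent.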

CString Parsing Carriage Returns

Let's say I have a string that has multiple carriage returns in it, i.e:
394968686
100630382
395950966
335666021
I'm still pretty amateur hour with C++; would anyone be willing to show me how to go about parsing through each "line" in the string, so I can do something with it later (add the desired line to a list)? I'm guessing using Find("\n") in a loop?
Thanks guys.
while (!str.IsEmpty())
{
CString one_line = str.SpanExcluding(_T("\r\n"));
// do something with one_line
str = str.Right(str.GetLength() - one_line.GetLength()).TrimLeft(_T("\r\n"));
}
Blank lines will be eliminated with this code, but that's easily corrected if necessary.
You could try it using stringstream. Note that getline has an overload that takes whatever delimiter you want.
string line;
stringstream ss;
ss << yourstring;
while ( getline(ss, line, '\n') )
{
cout << line << endl;
}
Alternatively you could use the boost library's tokenizer class.
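If the text may contain Windows-style "\r\n" line endings, getline with '\n' leaves a trailing '\r' on each line. The following is a minimal sketch that strips it. The name split_lines is hypothetical, and it assumes the text has already been copied into a std::string.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` into lines, accepting both "\n" and "\r\n".
std::vector<std::string> split_lines(const std::string& text)
{
    std::vector<std::string> lines;
    std::istringstream in(text);
    for (std::string line; std::getline(in, line); ) {
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1); // drop the '\r' left over from "\r\n"
        lines.push_back(line);
    }
    return lines;
}
```

Unlike the operator>> approach, this keeps blank lines and any spaces inside a line.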
You can use the stringstream class in C++.
#include <iostream>
#include <sstream>
#include <vector>
using namespace std;
int main()
{
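// The lines are separated by explicit newline characters.
string str = "394968686\n"
"100630382\n"
"395950966\n"
"335666021";
stringstream ss(str);
vector<string> v;
string token;
// read one whitespace-delimited token at a time (here: one per line)
while (ss >> token)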
{
// insert current line into a std::vector
v.push_back(token);
// print out current line
cout << token << endl;
}
}
Output of the program above:
394968686
100630382
395950966
335666021
Note that operator>> splits on any whitespace, so no whitespace is included in a parsed token and a line that contains spaces is read as several tokens.
If your string is stored in a c-style char* or std::string then you can simply search for \n.
std::string s;
size_t pos = s.find('\n');
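You can use string::substr() to get each substring and store it in a list. A sketch (list stands for whatever container you collect into):
std::string s = " .... ";
std::string::size_type begin = 0;
for (std::string::size_type pos;
(pos = s.find('\n', begin)) != std::string::npos; // search from begin, not from 0
begin = pos + 1)
{
list.push_back(s.substr(begin, pos - begin)); // the second argument is a length
}
if (begin < s.size())
list.push_back(s.substr(begin)); // last line without a trailing '\n'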