C++ Get String between two delimiter String - c++

Is there any built-in function available to get the string between two delimiter strings in C/C++?
My input looks like
_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_
And my output should be
_0_192.168.1.18_
Thanks in advance...

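You can do it like this:
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
// find() takes the delimiter as a string and returns string::npos if it is absent
size_t first = str.find("STARTDELIMITER");
size_t last = str.find("STOPDELIMITER");
// note: the result still begins with STARTDELIMITER, because first points at its start
string strNew = str.substr(first, last - first);
This assumes that STOPDELIMITER occurs only once, at the end.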
EDIT:
As the delimiter can occur multiple times, change the statement that finds STOPDELIMITER to:
size_t last = str.rfind("STOPDELIMITER");
Use rfind rather than find_last_of here: find_last_of searches for any single character of its argument, not for the whole string.
This will get you the text between the first STARTDELIMITER and the last STOPDELIMITER, even if they are repeated multiple times.

I have no idea how the top answer received as many votes as it did when the question clearly asks how to get a string between two delimiter strings, not between a pair of characters.
To do that, you need to account for the length of the start delimiter, since it is not just a single character.
Case 1: Both delimiters are unique:
Given a string _STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_ from which you want to extract _0_192.168.1.18_, you could modify the top answer like so to get the desired effect. This is the simplest solution that does not introduce extra dependencies (e.g. Boost):
#include <iostream>
#include <string>
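std::string get_str_between_two_str(const std::string &s,
const std::string &start_delim,
const std::string &stop_delim)
{
// std::string::size_type (unlike unsigned) can always hold std::string::npos
std::string::size_type first_delim_pos = s.find(start_delim);
if (first_delim_pos == std::string::npos)
return ""; // start delimiter not found
std::string::size_type end_pos_of_first_delim = first_delim_pos + start_delim.length();
std::string::size_type last_delim_pos = s.find(stop_delim);
// substr takes (position, length), so pass the distance between the delimiters
return s.substr(end_pos_of_first_delim,
last_delim_pos - end_pos_of_first_delim);
}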
int main() {
// Want to extract _0_192.168.1.18_
std::string s = "_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_";
std::string s2 = "ABC123_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_XYZ345";
std::string start_delim = "_STARTDELIMITER";
std::string stop_delim = "STOPDELIMITER_";
std::cout << get_str_between_two_str(s, start_delim, stop_delim) << std::endl;
std::cout << get_str_between_two_str(s2, start_delim, stop_delim) << std::endl;
return 0;
}
Will print _0_192.168.1.18_ twice.
The second argument to std::string::substr is a length, not an end position, so it has to be computed as last - (first + start_delim.length()). This ensures that the inner string is still extracted correctly when the start delimiter is not located at the very beginning of the string, as demonstrated by the second case above.
Case 2: Unique first delimiter, non-unique second delimiter:
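Say you want to get the string between a unique start delimiter and the first occurrence of a non-unique stop delimiter after it. You could modify the above function get_str_between_two_str so that the search for the stop delimiter starts just past the start delimiter, using the second argument of find. (find_first_of is not suitable here: it matches any single character of its argument rather than the whole delimiter string.)
std::string get_str_between_two_str(const std::string &s,
const std::string &start_delim,
const std::string &stop_delim)
{
std::string::size_type first_delim_pos = s.find(start_delim);
if (first_delim_pos == std::string::npos)
return ""; // start delimiter not found
std::string::size_type end_pos_of_first_delim = first_delim_pos + start_delim.length();
// search for the stop delimiter only after the end of the start delimiter
std::string::size_type last_delim_pos = s.find(stop_delim, end_pos_of_first_delim);
return s.substr(end_pos_of_first_delim,
last_delim_pos - end_pos_of_first_delim);
}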
If instead you want to capture everything between the first unique delimiter and the last occurrence of the second delimiter, as the asker commented above, use rfind to locate the stop delimiter (find_last_of would again match single characters, not the whole delimiter).
Case 3: Non-unique first delimiter, unique second delimiter:
Very similar to case 2, just reverse the logic between the first delimiter and second delimiter.
Case 4: Both delimiters are not unique:
Again, very similar to case 2: make a container to capture all strings between any of the two delimiters. Loop through the string, and each time the second delimiter is encountered, add the string in between to the container and move the search position for the first delimiter to the second delimiter's position. Repeat until std::string::npos is reached.
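The loop described for case 4 could be sketched roughly as follows. The function name get_all_between is hypothetical, and the sketch assumes that the search resumes directly after each stop delimiter and that an unterminated start delimiter ends the scan; adjust both if your input needs different rules.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collects every substring enclosed by a
// start_delim/stop_delim pair, scanning from left to right.
std::vector<std::string> get_all_between(const std::string &s,
                                         const std::string &start_delim,
                                         const std::string &stop_delim)
{
    std::vector<std::string> found;
    if (start_delim.empty() || stop_delim.empty())
        return found;  // an empty delimiter would match at every position

    std::string::size_type pos = 0;
    while (true) {
        std::string::size_type start = s.find(start_delim, pos);
        if (start == std::string::npos)
            break;                      // no further start delimiter
        start += start_delim.length();  // first character after the delimiter
        std::string::size_type stop = s.find(stop_delim, start);
        if (stop == std::string::npos)
            break;                      // unterminated pair: stop scanning
        found.push_back(s.substr(start, stop - start));
        pos = stop + stop_delim.length();  // resume after the stop delimiter
    }
    return found;
}
```

Calling get_all_between("<a>x</a><a>yz</a>", "<a>", "</a>") would give a vector holding "x" and "yz".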

To get a string between 2 delimiter strings without white spaces.
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string startDEL = "STARTDELIMITER";
string stopDEL = "STOPDELIMITER";
size_t firstLim = str.find(startDEL);
size_t lastLim = str.find(stopDEL);
// substr takes (position, length), so pass the distance between the two delimiters
string strNew = str.substr(firstLim, lastLim - firstLim);
// strNew still begins with the start delimiter, so skip past it
strNew = strNew.substr(startDEL.size());
I tried combining the two substring functions but it started printing the STOPDELIMITER
Hope that helps

I hope you won't mind me answering by pointing to another question :)
I would use boost::split or boost::iter_split.
http://www.boost.org/doc/libs/1_54_0/doc/html/string_algo/usage.html#idp166856528
For example code see this SO question:
How to avoid empty tokens when splitting with boost::iter_split?

Let's say you need to get the sixth field (brand, index 5 when counting from zero) from the output below:
zoneid:zonename:state:zonepath:uuid:brand:ip-type:r/w:file-mac-profile
A single "str.find" call does not help here, because the field is in the middle of the line, but you can use 'strtok', e.g.
char *brand;
brand = strtok( line, ":" ); // first field (zoneid); note that strtok modifies line
for (int i=0;i<5;i++) { // step over five more fields to reach brand
brand = strtok( NULL, ":" );
}
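If modifying the input buffer is not acceptable, the same field lookup can be sketched with std::getline on a string stream. The helper name nth_field is hypothetical; unlike strtok, this version keeps empty fields, so indices stay stable when a field is blank.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns the field at zero-based `index`,
// or an empty string when the line has fewer fields than that.
std::string nth_field(const std::string &line, char delim, std::size_t index)
{
    std::istringstream in(line);
    std::string field;
    for (std::size_t i = 0; std::getline(in, field, delim); ++i) {
        if (i == index)
            return field;
    }
    return "";
}
```

With the line shown above, nth_field(line, ':', 5) would return "brand".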

This is a late answer, but this might work too:
string strgOrg= "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string strg= strgOrg;
strg.replace(strg.find("STARTDELIMITER"), 14, "");
strg.replace(strg.find("STOPDELIMITER"), 13, "");
Hope it works for others.

void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
std::string::size_type start = oStr.find(sStr1);
if (start != std::string::npos)
{
start += sStr1.length(); // first character after the start delimiter
// look for the stop delimiter only after the start delimiter
std::string::size_type stop = oStr.find(sStr2, start);
if (stop != std::string::npos)
rStr = oStr.substr(start, stop - start);
else
rStr = "error";
}
else
rStr = "error";
}
Or, if you have access to C++14 (needed for the s string literals in the example below; <regex> itself has been available since C++11), the following:
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
using namespace std::literals::string_literals;
auto start = sStr1;
auto end = sStr2;
std::regex base_regex(start + "(.*)" + end);
auto example = oStr;
std::smatch base_match;
std::string matched;
if (std::regex_search(example, base_match, base_regex)) {
if (base_match.size() == 2) {
matched = base_match[1].str();
}
rStr = matched;
}
}
Example:
string strout;
getBtwString("it's_12345bb2","it's","bb2",strout);
getBtwString("it's_12345bb2"s,"it's"s,"bb2"s,strout); // second solution
Headers:
#include <regex> // second solution
#include <string>
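One caveat with the regex variant is that the delimiters are pasted into the pattern as-is, so a delimiter containing characters such as "." or "[" is interpreted as regex syntax, and the greedy (.*) runs to the last stop delimiter. A minimal sketch that escapes the delimiters and matches non-greedily is shown below; the names escape_regex and between_regex are hypothetical, and the escape list assumes the default ECMAScript grammar.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: backslash-escapes characters that are special in the
// default ECMAScript grammar, so that a delimiter is matched literally.
std::string escape_regex(const std::string &text)
{
    static const std::string special = "\\^$.|?*+()[]{}";
    std::string out;
    for (char c : text) {
        if (special.find(c) != std::string::npos)
            out += '\\';
        out += c;
    }
    return out;
}

// Returns the text between the first start_delim and the nearest following
// stop_delim, or an empty string when there is no such pair.
std::string between_regex(const std::string &s,
                          const std::string &start_delim,
                          const std::string &stop_delim)
{
    // [\s\S]*? matches any characters (newlines included), as few as possible
    const std::regex re(escape_regex(start_delim) + "([\\s\\S]*?)" +
                        escape_regex(stop_delim));
    std::smatch m;
    if (std::regex_search(s, m, re))
        return m[1].str();
    return "";
}
```

For example, between_regex("it's_12345bb2", "it's", "bb2") would return "_12345".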

Related

Suggestions that could improve my string splitting function

I'm new to C++ and I'm trying to write some basic functions to get the hang of it. I decided to write a custom function that splits a string into tokens every time a specific delimiter is reached.
I've made it work successfully, but since I'm new, I'd like to hear from more experienced programmers whether there is a better way to go about it. This is my code:
vector<string> split(string const str, string const separator=" ") {
int str_len = str.length();
int sep_len = separator.length();
int current_index {0};
vector<string> strings {};
for(int i {0}; i < str_len; ++i) {
if(str.substr(i, sep_len) == separator) {
strings.push_back(str.substr(current_index, i-current_index));
current_index = i + sep_len;
}
}
strings.push_back(str.substr(current_index, str_len-current_index));
return strings;
}
One thing I will say is, I don't like how I had to put
strings.push_back(str.substr(current_index, str_len-current_index));
this after the entire iteration to get the final part of the string. I just can't think of any different methods.
Use std::string::find() to find separators in the string, which is probably much more efficient than your loop that checks for each possible position if the substring at that position matches the separator. Once you have that, you can make use of the fact that if the separator is not found, find() returns std::string::npos, which is the largest possible value of std::string::size_type, so just pass this to substr() to get everything from the current position to the end of the string. This way you can avoid the second push_back().
vector<string> split(string const &str, string const &separator=" ") {
string::size_type current_index {};
vector<string> strings;
while (true) {
auto separator_index = str.find(separator, current_index);
strings.push_back(str.substr(current_index, separator_index - current_index));
if (separator_index == str.npos)
break;
else
current_index = separator_index + separator.size();
}
return strings;
}
Note: ensure you pass the input parameters by reference to avoid unnecessary copies being made.

How to create a function to extract field in string inside of vector? [closed]

I want to make a function like this.
using namespace std;
vector<string> hj;
vector<string> feldseperator(const string& s, char delimiter) {
size_t pos = 0;
string token;
while ((pos = s.find(delimiter)) != string::npos)
{
token = s.substr(0, pos);
cout << token << endl;
hj.push_back(token);
s.erase(); // I WANT TO DELETE THE FIRST FIELD + CHAR
}
return hj;
}
int main()
{
string s = "dog;cat;fish;fax;fox;fast;";
char f = ';';
feldseperator(s, f);
cin.get();
}
Ah, I see, you're trying to split a string on a delimiter. Your problem is that you want to remove the prefix of the string up to the first occurrence of the delimiter, as stated by the comment in your code. So, you could:
1. Use the second parameter of std::string::find, which is the "position of the first character in the string to be considered in the search", and update your code like this:
size_t last_pos = 0 , pos = 0;
while ((pos = s.find(delimiter , last_pos)) != string::npos)
{
token = s.substr(last_pos, pos - last_pos);
last_pos = pos + 1; // pos is the position of the delimiter; the next search should begin at the character after it
..
}
2. Since you already have the position of the delimiter, reassign the string as s = s.substr(pos + 1). But then you'd have to remove the const keyword, which makes option 1 the better choice.
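Written out in full, the approach that passes a start position to find might look like the sketch below. The function name split_fields is hypothetical, and the handling of text after the last delimiter (kept only when non-empty) is an assumption, since the question's input always ends with a delimiter.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits s at each delimiter without modifying s.
// Text after the last delimiter is kept only when it is non-empty.
std::vector<std::string> split_fields(const std::string &s, char delimiter)
{
    std::vector<std::string> fields;
    std::string::size_type last_pos = 0;
    std::string::size_type pos = 0;
    while ((pos = s.find(delimiter, last_pos)) != std::string::npos) {
        fields.push_back(s.substr(last_pos, pos - last_pos));
        last_pos = pos + 1;  // continue with the character after the delimiter
    }
    if (last_pos < s.size())
        fields.push_back(s.substr(last_pos));
    return fields;
}
```

For "dog;cat;fish;fax;fox;fast;" this yields six fields, from "dog" to "fast".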
Try this.
According to std::string::erase (overload 3), it deletes the range [first, last), hence the +1.
using namespace std;
vector<string> hj;
vector<string> feldseperator(const string& s, char delimiter) {
auto copy = s;
size_t pos = 0;
string token;
while ((pos = copy.find(delimiter)) != string::npos)
{
token = copy.substr(0, pos); // read from the shrinking copy, not from the untouched original
cout << token << endl;
hj.push_back(token);
copy.erase(copy.begin(), copy.begin()+pos+1); // I WANT TO DELETE THE FIRST FIELD + CHAR
}
return hj;
}
int main()
{
string s = "dog;cat;fish;fax;fox;fast;";
char f = ';';
feldseperator(s, f);
cin.get();
}
Note that I'm working on a local copy of the string rather than on the const reference itself. You may want to change that as required.
So, as I understand it, you want to split a string that contains substrings delimited by a ";". This process is called tokenizing, because you split a string into smaller tokens.
Modern C++ has built-in functionality that is designed exactly for that purpose. It is called std::sregex_token_iterator. What is this thing?
As its name says, it is an iterator. It iterates over a string (hence the 's' in its name) and returns the split-up tokens. The tokens are matched against a regular expression. Or, negatively, the delimiter is matched and the rest is treated as tokens and returned. This is controlled via the last argument of its constructor.
Let's have a look at this constructor:
hj(std::sregex_token_iterator(s.begin(), s.end(), delimiter, -1), {});
The first parameter is where it should start in the source string, and the second parameter is the end position, up to which the iterator should work. The third parameter is the regex itself; there are plenty of pages about regular expressions on the net. The last parameter selects what is returned:
0 (the default), if you want the parts that match the regex (a positive value n selects the n-th sub-expression of each match)
-1, if you want everything that does not match the regex
So, let's check the one-liner for the field extraction.
std::vector<std::string> hj(std::sregex_token_iterator(s.begin(), s.end(), delimiter, -1), {});
What is that? The first part is obvious: we define a std::vector<std::string> with the name "hj". As with any variable definition, we call a constructor to construct the std::vector<std::string>.
If you look at the std::vector range constructor (no. 5), you will see that we can initialize the vector with another pair of iterators (begin and end) and copy the values from there. The begin iterator is given explicitly, and the end iterator is given with {}, because a default-constructed std::sregex_token_iterator is the end-of-sequence iterator.
That's it, everything is done with a one-liner.
Please see:
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
#include <vector>
int main() {
// The string to split
std::string s("dog;cat;fish;fax;fox;fast;");
// The delimiter
std::regex delimiter(";");
// Tokenize and store result in vector
std::vector<std::string> hj(std::sregex_token_iterator(s.begin(), s.end(), delimiter, -1), {});
std::cin.get();
}
By the way, if you have an existing vector, you can copy the result into that vector:
std::copy(std::sregex_token_iterator(s.begin(), s.end(), delimiter, -1), {}, std::back_inserter(hj));
I hope that you can see the simplicity of that approach.
Of course there are many other possible solutions, and everybody can select whichever they prefer.

C++ read select parts of std::string

I have been trying to extract only "Apple" from the string below, i.e. the text between "," and "/". Is there a way to extract the string between the delimiters? Currently, everything after the "," is extracted.
std::string test = "Hello World, Apple/Banana";
std::size_t found;
found = test.find(",");
std::string str3 = test.substr(found);
std::cout << str3 << std::endl;
One step at a time. First, extract the part after the comma. Then extract the part before the following slash.
Alternatively, substr() also takes an optional 2nd parameter, the maximum number of characters to extract, instead of extracting everything that's left of the string. So, if you compute how many characters you want to extract, you can also do it in a single substr() call.
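The single-call variant could be sketched as follows: find both delimiters, then pass the distance between them as the count. The helper name between_chars is hypothetical, and skipping the spaces after the opening delimiter is an assumption made so that the question's input gives "Apple" rather than " Apple".

```cpp
#include <string>

// Hypothetical helper: text between the first `open` and the next `close`
// after it, with leading spaces skipped; empty if either one is missing.
std::string between_chars(const std::string &text, char open, char close)
{
    std::string::size_type begin = text.find(open);
    if (begin == std::string::npos)
        return "";
    ++begin;  // step past the opening delimiter
    std::string::size_type end = text.find(close, begin);
    if (end == std::string::npos)
        return "";
    // skip spaces that directly follow the opening delimiter
    while (begin < end && text[begin] == ' ')
        ++begin;
    // second argument of substr: number of characters to extract
    return text.substr(begin, end - begin);
}
```

With the string from the question, between_chars(test, ',', '/') would return "Apple".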
The first part is finding where the substring "Apple" begins. You can use find() for that. It returns the location of the first character of the substring. You can then use the std::string constructor that takes the original string, a start position, and a length.
References
String find(),
String constructor
std::string extractedString = std::string(test, test.find("Apple"), sizeof("Apple") - 1);
You can use the second argument of substr to set the length of the extraction:
#include <string>
using namespace std;
int main(int argc, char * argv[]){
string test = "Hello World, Apple/Banana";
size_t f1 = test.find(',');
size_t f2 = test.find('/');
string extracted = test.substr(f1, f2 - f1);
}
This yields ", Apple" (tested with VC2013). If you change f1 to size_t f1 = test.find(',') + 2; it yields "Apple".

c++ How to split string into two strings based on the last '.'

I want to split the string into two separate strings based on the last '.'
For example, abc.text.sample.last should become abc.text.sample.
I tried using boost::split but it gives output as follows:
abc
text
sample
last
Rebuilding the string by adding the '.' characters back is not a good idea, as the sequence matters.
What will be the efficient way to do this?
Something as simple as rfind + substr
size_t pos = str.rfind("."); // or better str.rfind('.') as suggested by @DieterLücking
new_str = str.substr(0, pos);
std::string::find_last_of will give you the position of the last dot character in your string, which you can then use to split the string accordingly.
Make use of std::string::find_last_of and then std::string::substr to achieve the desired result.
Search for the first '.' beginning from the right. Use substr to extract the substring.
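Put together, the rfind + substr approach from these answers could look like the sketch below. The function name split_at_last is hypothetical, and returning the whole input as the first part when there is no separator is an assumption of this sketch.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits str at the last occurrence of sep.
// If sep is absent, the whole input is returned as the first part.
std::pair<std::string, std::string> split_at_last(const std::string &str, char sep)
{
    std::string::size_type pos = str.rfind(sep);
    if (pos == std::string::npos)
        return {str, ""};
    // everything before the separator, and everything after it
    return {str.substr(0, pos), str.substr(pos + 1)};
}
```

For "abc.text.sample.last" and '.', this gives "abc.text.sample" and "last".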
One more possible solution, assuming you can modify the original string:
Take a char pointer and traverse from the end.
Stop when the first '.' is found and replace it with the '\0' null character.
Point the char pointer at the character after that location.
Now you have two strings.
char *second = NULL;
int length = string.length();
for(int i=length-1; i >= 0; i--){
if(string[i]=='.'){
string[i] = '\0';
second = &string[i+1]; // address of the character after the '.'
break;
}
}
I have not included test cases like if '.' is at last, or any other.
If you want to use boost, you could try this:
#include<iostream>
#include<boost/algorithm/string.hpp>
using namespace std;
using namespace boost;
int main(){
string mytext= "abc.text.sample.last";
typedef split_iterator<string::iterator> string_split_iterator;
for(string_split_iterator It=
make_split_iterator(mytext, last_finder(".", is_iequal()));
It!=string_split_iterator();
++It)
{
cout << copy_range<string>(*It) << endl;
}
return 0;
}
Output:
abc.text.sample
last

Parsing a string by a delimiter in C++

Ok, so I need some info parsed and I would like to know what would be the best way to do it.
Here is the string that I need to parse. The delimiter is the "^":
John Doe^Male^20
I need to parse the string into name, gender, and age variables. What would be the best way to do it in C++? I was thinking about looping with the condition while(!string.empty()), assigning all characters up until the '^' to a string, and then erasing what I have already assigned. Is there a better way of doing this?
You can use getline on a C++ stream.
istream& getline(istream& is,string& str,char delimiter='\n')
Change the delimiter to '^'.
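Applied to the record from the question, the getline approach might be sketched like this. The Person struct and the parse_person function are hypothetical names introduced for illustration, and treating a missing field or a non-numeric age as a failure is an assumption.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for this sketch.
struct Person {
    std::string name;
    std::string gender;
    int age = 0;
};

// Parses "name^gender^age". Returns false if a field is missing
// or the age field does not start with a number.
bool parse_person(const std::string &line, Person &out)
{
    std::istringstream in(line);
    std::string age_text;
    if (!std::getline(in, out.name, '^')) return false;
    if (!std::getline(in, out.gender, '^')) return false;
    if (!std::getline(in, age_text, '^')) return false;
    std::istringstream age_in(age_text);
    return static_cast<bool>(age_in >> out.age);
}
```

Calling parse_person("John Doe^Male^20", p) would fill p with the name "John Doe", the gender "Male", and the age 20.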
You have a few options. One good option, if you can use boost, is the split algorithm provided in its string library. You can check out this SO question to see the boost answer in action: How to split a string in c
If you cannot use boost, you can use string::find to get the index of a character:
string str = "John Doe^Male^20";
size_t last = 0;
size_t cPos = 0;
while ((cPos = str.find('^', last)) != string::npos)
{
string sub = str.substr(last, cPos - last);
// Do something with the string
last = cPos + 1;
}
// the text after the final '^' (here "20") is the last field
string sub = str.substr(last);
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] = "This is a sample string";
char * pch;
printf ("Looking for the 's' character in \"%s\"...\n",str);
pch=strchr(str,'s');
while (pch!=NULL)
{
printf ("found at %d\n",(int)(pch-str+1)); // cast: a pointer difference is not an int
pch=strchr(pch+1,'s');
}
return 0;
}
You can do something like this with a character array.
You have a number of choices but I would use strtok(), myself. It would make short work of this.