Parsing a string by a delimiter in C++ - c++

OK, so I need some info parsed and I would like to know the best way to do it.
Here is the string that I need to parse. The delimiter is "^":
John Doe^Male^20
I need to parse the string into name, gender, and age variables. What would be the best way to do it in C++? I was thinking about looping with the condition while(!string.empty()),
assigning all characters up to the '^' to a string, and then erasing what I have already assigned. Is there a better way of doing this?

You can use getline on a C++ stream:
istream& getline(istream& is, string& str, char delim);
Pass '^' as the delimiter instead of relying on the default '\n'.
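A minimal sketch of the getline approach applied to the string from the question. The Person struct and the parsePerson function are hypothetical names introduced only for illustration; the sketch assumes the input always has the form name^gender^age.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type for the three fields in the question.
struct Person {
    std::string name;
    std::string gender;
    int age = 0;
};

// Parses "name^gender^age". Returns false if a field is missing
// or the age is not a number.
bool parsePerson(const std::string& line, Person& out) {
    std::istringstream in(line);
    std::string ageText;
    if (!std::getline(in, out.name, '^')) return false;
    if (!std::getline(in, out.gender, '^')) return false;
    if (!std::getline(in, ageText)) return false;  // rest of the line
    std::istringstream ageIn(ageText);
    return static_cast<bool>(ageIn >> out.age);
}
```

Calling parsePerson("John Doe^Male^20", p) fills p.name, p.gender and p.age without erasing anything from the original string.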

You have a few options. One good option, if you can use Boost, is the split algorithm provided in its string algorithms library. You can check out this SO question to see the Boost answer in action: How to split a string in c
If you cannot use boost, you can use string::find to get the index of a character:
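string str = "John Doe^Male^20";
string::size_type last = 0;
string::size_type cPos;
while ((cPos = str.find('^', last)) != string::npos)
{
    string sub = str.substr(last, cPos - last);
    // Do something with the string
    last = cPos + 1;
}
// The final field has no trailing '^', so pick it up after the loop
string tail = str.substr(last);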

#include <stdio.h>
#include <string.h>
int main ()
{
    char str[] = "This is a sample string";
    char * pch;
    printf ("Looking for the 's' character in \"%s\"...\n",str);
    pch=strchr(str,'s');
    while (pch!=NULL)
    {
        /* pch-str is a ptrdiff_t, so cast it before printing with %d */
        printf ("found at %d\n",(int)(pch-str+1));
        pch=strchr(pch+1,'s');
    }
    return 0;
}
You can do something like this on a char array, searching for '^' instead of 's'.

You have a number of choices but I would use strtok(), myself. It would make short work of this.
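A rough sketch of what the strtok() route could look like from C++. The function name splitWithStrtok is made up for this example. Two caveats worth knowing: strtok overwrites delimiters in the buffer it scans (hence the copy below), and it treats consecutive delimiters as one, so empty fields are silently dropped.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Splits `input` on any character in `delims` using std::strtok.
// strtok writes '\0' into the buffer it scans, so we work on a copy.
std::vector<std::string> splitWithStrtok(const std::string& input, const char* delims) {
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');  // strtok needs a NUL-terminated buffer
    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims)) {
        tokens.emplace_back(tok);
    }
    return tokens;
}
```

For "John Doe^Male^20" with "^" as the delimiter set this yields three tokens. strtok also keeps internal state between calls, so it is not safe to use from several threads at once.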

Related

C++ Most effective way to grab a substring with a value in the middle of a long string

I want to find the most effective way to do something like this:
A big string containing all kinds of data, for example:
plushieid:5637372&plushieposition:12757&plushieowner:null&totalplushies:5637373
I want to make a function that would have the input to be, let's say "plushieposition", and I would have it find and return the string with plushieposition:12757.
The only way I can think of is find the position of plushieposition and then scan for & and delete the rest. But, is there a cleaner way? If not, what would be the best way to do this in code?
I'm having a little bit of trouble understanding string scan practices.
Use std::string::find() to find the starting and stopping positions, and then use std::string::substr() to extract what is between them, eg:
string extract(const string &s, const string &name)
{
    string to_find = name + ":";
    string::size_type start = s.find(to_find);
    if (start == string::npos) return "";
    string::size_type stop = s.find('&', start + to_find.size());
    return s.substr(start, stop - start);
}
string s = "plushieid:5637372&plushieposition:12757&plushieowner:null&totalplushies:5637373";
string found = extract(s, "plushieposition");

c++ How to split string into two strings based on the last '.'

I want to split the string into two separate strings based on the last '.'
For example, abc.text.sample.last should become abc.text.sample.
I tried using boost::split but it gives output as follows:
abc
text
sample
last
Rebuilding the string by re-adding the '.' separators is not a good idea, because the sequence matters.
What would be an efficient way to do this?
Something as simple as rfind + substr
size_t pos = str.rfind("."); // or better str.rfind('.') as suggested by @DieterLücking
new_str = str.substr(0, pos);
std::string::find_last_of will give you the position of the last dot character in your string, which you can then use to split the string accordingly.
Use std::string::find_last_of and then std::string::substr to achieve the desired result.
Search for the first '.' beginning from the right. Use substr to extract the substring.
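The rfind + substr idea from the answers above can be sketched as a small helper that returns both halves. The name splitAtLast and the choice to return a std::pair are assumptions made for this example, as is the behaviour when no separator is present.

```cpp
#include <string>
#include <utility>

// Splits `s` at the last occurrence of `sep`. If `sep` does not occur,
// the whole string is returned as the first part and the second is empty.
std::pair<std::string, std::string> splitAtLast(const std::string& s, char sep) {
    const std::string::size_type pos = s.rfind(sep);
    if (pos == std::string::npos) return {s, std::string()};
    return {s.substr(0, pos), s.substr(pos + 1)};
}
```

For "abc.text.sample.last" and '.', the first part is "abc.text.sample" and the second is "last". A trailing separator simply gives an empty second part.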
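One more possible solution, assuming you can modify the original string:
Take a char pointer and traverse the buffer from the end.
Stop at the first '.' found and replace it with the '\0' null character.
Point the char pointer at the character just after it.
Now you have two strings.
char str[] = "abc.text.sample.last"; // must be a modifiable char buffer
char *second = NULL;
int length = strlen(str);
for(int i=length-1; i >= 0; i--){
    if(str[i]=='.'){
        str[i] = '\0';        // terminates the first part
        second = &str[i+1];   // second part starts right after the dot
        break;
    }
}
I have not handled edge cases, such as the '.' being the last character or not being present at all (second stays NULL in that case).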
If you want to use boost, you could try this:
#include<iostream>
#include<boost/algorithm/string.hpp>
using namespace std;
using namespace boost;
int main(){
    string mytext= "abc.text.sample.last";
    typedef split_iterator<string::iterator> string_split_iterator;
    for(string_split_iterator It=
            make_split_iterator(mytext, last_finder(".", is_iequal()));
        It!=string_split_iterator();
        ++It)
    {
        cout << copy_range<string>(*It) << endl;
    }
    return 0;
}
Output:
abc.text.sample
last

Extract a specific text pattern from a string

I have a string as follows,
"0/41/9/71.94 PC:0x82cc (add)"
The desired output is the text between the brackets ( )
Ex: output = add,
for the string specified above
How is this done using sscanf?
Is there a better way to do it in C++?
With string operations exclusively:
std::string text = "0/41/9/71.94 PC:0x82cc (add)";
auto pos = text.find('(') + 1;
auto opcode = text.substr(pos, text.find(')', pos) - pos);
With sscanf it would look something like this:
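std::string opcode(5, '\0'); // Some suitable maximum size
sscanf(text.c_str(), "%*[^(](%4[^)]", &opcode[0]); // width 4 leaves room for the terminating '\0'
opcode.resize(strlen(opcode.c_str())); // drop the unused trailing '\0' characters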
It's quite easy, and worth trying yourself: think about how you would search an array, then about how you would compare its contents. If I were asked to write this program, I would do it as follows:
int i=0, p=0;
char str[]="0/41/9/71.94 PC:0x82cc (add)", nstr[100];
// Skip ahead to the opening bracket (or the end of the string)
while(str[i]!='\0' && str[i]!='(')
    i++;
if (str[i]=='(')
    i++; // step past the '('
// Copy characters until the closing bracket, the end of the string, or a full buffer
while (str[i]!=')' && str[i]!='\0' && p<99)
{
    nstr[p]=str[i];
    p++;
    i++;
}
nstr[p]='\0';
cout<<"Output = "<<nstr<<"\n";
I know this is long, but it should give you a deeper understanding of parsing or splitting a string. Hope this helps.

C++ Get String between two delimiter String

Is there any built-in function to get the string between two delimiter strings in C/C++?
My input looks like
_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_
And my output should be
_0_192.168.1.18_
Thanks in advance...
You can do it like this:
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
size_t first = str.find("STARTDELIMITER");
size_t last = str.find("STOPDELIMITER");
// Note: starting at first keeps the start delimiter in the result;
// start at first + strlen("STARTDELIMITER") to drop it.
string strNew = str.substr (first,last-first);
This assumes that STOPDELIMITER occurs only once, at the end.
EDIT:
As the delimiter can occur multiple times, change the statement that finds STOPDELIMITER to use rfind (find_last_of would match any single character of the delimiter, not the whole string):
size_t last = str.rfind("STOPDELIMITER");
This will get you the text between the first STARTDELIMITER and the last STOPDELIMITER, even if they are repeated multiple times.
I have no idea how the top answer received so many votes that it did when the question clearly asks how to get a string between two delimiter strings, and not a pair of characters.
If you would like to do so you need to account for the length of the string delimiter, since it will not be just a single character.
Case 1: Both delimiters are unique:
Given a string _STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_ that you want to extract _0_192.168.1.18_ from, you could modify the top answer like so to get the desired effect. This is the simplest solution without introducing extra dependencies (e.g. Boost):
#include <iostream>
#include <string>
std::string get_str_between_two_str(const std::string &s,
                                    const std::string &start_delim,
                                    const std::string &stop_delim)
{
    // size_type rather than unsigned, so std::string::npos is not truncated
    std::string::size_type first_delim_pos = s.find(start_delim);
    std::string::size_type end_pos_of_first_delim = first_delim_pos + start_delim.length();
    std::string::size_type last_delim_pos = s.find(stop_delim);
    return s.substr(end_pos_of_first_delim,
                    last_delim_pos - end_pos_of_first_delim);
}
int main() {
    // Want to extract _0_192.168.1.18_
    std::string s = "_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_";
    std::string s2 = "ABC123_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_XYZ345";
    std::string start_delim = "_STARTDELIMITER";
    std::string stop_delim = "STOPDELIMITER_";
    std::cout << get_str_between_two_str(s, start_delim, stop_delim) << std::endl;
    std::cout << get_str_between_two_str(s2, start_delim, stop_delim) << std::endl;
    return 0;
}
Will print _0_192.168.1.18_ twice.
The second argument to std::string::substr is a length, so it has to be computed as last - (first + start_delim.length()). Subtracting the end position of the first delimiter ensures that the inner string is still extracted correctly when the start delimiter is not located at the very beginning of the string, as demonstrated by the second call above.
Case 2: Unique first delimiter, non-unique second delimiter:
Say you want to get the string between a unique delimiter and the first occurrence of a non-unique delimiter after it. You could modify the above function get_str_between_two_str to start the second search just past the first delimiter. Note that find_first_of is not the right tool here: it matches any single character from its argument rather than the whole delimiter string, so use find with a start position instead:
std::string get_str_between_two_str(const std::string &s,
                                    const std::string &start_delim,
                                    const std::string &stop_delim)
{
    std::string::size_type first_delim_pos = s.find(start_delim);
    std::string::size_type end_pos_of_first_delim = first_delim_pos + start_delim.length();
    std::string::size_type last_delim_pos = s.find(stop_delim, end_pos_of_first_delim);
    return s.substr(end_pos_of_first_delim,
                    last_delim_pos - end_pos_of_first_delim);
}
If instead you want to capture everything between the first unique delimiter and the last occurrence of the second delimiter, like what the asker commented above, use rfind for the second search (again, not find_last_of, which matches single characters).
Case 3: Non-unique first delimiter, unique second delimiter:
Very similar to case 2, just reverse the logic between the first delimiter and second delimiter.
Case 4: Both delimiters are not unique:
Again, very similar to case 2: make a container to capture all strings between any pair of the two delimiters. Loop through the string, adding the string between each start delimiter and the next stop delimiter to the container, then continue the search from the stop delimiter's position. Repeat until std::string::npos is reached.
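Case 4 can be sketched roughly as follows. This is only one possible reading of the description: the name get_all_between is invented for the example, the search resumes just after each stop delimiter, and delimiter pairs are assumed not to nest.

```cpp
#include <string>
#include <vector>

// Collects every substring enclosed by start_delim ... stop_delim, left to right.
// Assumes pairs do not nest; empty delimiters yield an empty result.
std::vector<std::string> get_all_between(const std::string& s,
                                         const std::string& start_delim,
                                         const std::string& stop_delim) {
    std::vector<std::string> found;
    if (start_delim.empty() || stop_delim.empty()) return found;
    std::string::size_type pos = 0;
    while (true) {
        const std::string::size_type start = s.find(start_delim, pos);
        if (start == std::string::npos) break;
        const std::string::size_type begin = start + start_delim.length();
        const std::string::size_type stop = s.find(stop_delim, begin);
        if (stop == std::string::npos) break;  // unmatched start delimiter
        found.push_back(s.substr(begin, stop - begin));
        pos = stop + stop_delim.length();  // resume after this pair
    }
    return found;
}
```

With "[" and "]" as delimiters, the input "[a][bc] x [d" gives "a" and "bc"; the trailing unmatched "[d" is ignored.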
To get a string between two delimiter strings when there is no whitespace around them:
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string startDEL = "STARTDELIMITER";
string stopDEL = "STOPDELIMITER";
string::size_type firstLim = str.find(startDEL);
string::size_type lastLim = str.find(stopDEL);
// substr takes a start and a length, so pass lastLim - firstLim.
// This still includes the start delimiter itself.
string strNew = str.substr (firstLim, lastLim - firstLim);
// Drop the start delimiter so the result begins right after it
strNew = strNew.substr(startDEL.size());
I tried combining the two substr calls into one, but the result then included the STOPDELIMITER.
Hope that helps
I hope you won't mind me answering by pointing to another question :)
I would use boost::split or boost::iter_split.
http://www.boost.org/doc/libs/1_54_0/doc/html/string_algo/usage.html#idp166856528
For example code see this SO question:
How to avoid empty tokens when splitting with boost::iter_split?
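Let's say you need to get the field at index 5 (brand, counting from zero) from the output below:
zoneid:zonename:state:zonepath:uuid:brand:ip-type:r/w:file-mac-profile
A single "str.find" call will not get you there, because the field sits in the middle and has no unique marker, but you can use 'strtok', e.g.
char *brand;
brand = strtok( line, ":" );   // field 0; line must be a modifiable char buffer
for (int i=0;i<5;i++) {        // advance five more times to reach field 5
    brand = strtok( NULL, ":" );
}
Keep in mind that strtok skips empty fields, so the count is off if an earlier field is empty.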
This is a late answer, but this might work too:
string strgOrg= "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string strg= strgOrg;
strg.replace(strg.find("STARTDELIMITER"), 14, "");
strg.replace(strg.find("STOPDELIMITER"), 13, "");
Hope it works for others.
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
    std::string::size_type start = oStr.find(sStr1);
    if (start != std::string::npos)
    {
        start += sStr1.length(); // first character after the start delimiter
        std::string::size_type stop = oStr.find(sStr2, start);
        if (stop != std::string::npos)
            rStr = oStr.substr(start, stop - start);
        else
            rStr = "error";
    }
    else
        rStr = "error";
}
Or, if you have access to C++14 (needed for the s literal suffix used in the example; std::regex itself is C++11), the following:
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
    using namespace std::literals::string_literals;
    auto start = sStr1;
    auto end = sStr2;
    // The delimiters are used as regex text, so escape any regex metacharacters in them
    std::regex base_regex(start + "(.*)" + end);
    auto example = oStr;
    std::smatch base_match;
    std::string matched;
    if (std::regex_search(example, base_match, base_regex)) {
        if (base_match.size() == 2) {
            matched = base_match[1].str();
        }
        rStr = matched;
    }
}
Example:
string strout;
getBtwString("it's_12345bb2","it's","bb2",strout);
getBtwString("it's_12345bb2"s,"it's"s,"bb2"s,strout); // second solution
Headers:
#include <regex> // second solution
#include <string>

C++ split string with space and punctuation chars

I want to split a string in C++ that contains spaces and punctuation.
e.g. str = "This is a dog; A very good one."
I wanna get "This" "is" "a" "dog" "A" "very" "good" "one" 1 by 1.
It's quite easy with only one delimiter using getline but I don't know all the delimiters. It can be any punctuation chars.
Note: I don't wanna use Boost!
Use std::find_if() with a lambda to find the delimiter.
auto it = std::find_if(str.begin(), str.end(), [] (const char element) -> bool {
    const unsigned char c = static_cast<unsigned char>(element); // <cctype> functions need a non-negative value
    return std::isspace(c) || std::ispunct(c); });
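Building on the std::find_if() suggestion, a full tokenizer might look roughly like this. The names isDelimiter and splitWords are made up for the sketch, and it pairs find_if with std::find_if_not (C++11) to skip runs of delimiters; the answer itself only shows the single find_if call.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// True for whitespace or punctuation. The cast to unsigned char avoids
// undefined behaviour in <cctype> functions for negative char values.
bool isDelimiter(char c) {
    const unsigned char uc = static_cast<unsigned char>(c);
    return std::isspace(uc) || std::ispunct(uc);
}

std::vector<std::string> splitWords(const std::string& str) {
    std::vector<std::string> words;
    auto it = str.begin();
    while (it != str.end()) {
        // Skip delimiters, then find the end of the next word.
        it = std::find_if_not(it, str.end(), isDelimiter);
        auto wordEnd = std::find_if(it, str.end(), isDelimiter);
        if (it != wordEnd) words.emplace_back(it, wordEnd);
        it = wordEnd;
    }
    return words;
}
```

For "This is a dog; A very good one." this produces the eight words from the question, with the ';' and '.' dropped. Note that ispunct also treats apostrophes and hyphens as delimiters, so "don't" would be split in two.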
So, starting at the first position, you find the first valid token. You can use
index = str.find_first_not_of (yourDelimiters);
Then you have to find the first delimiter after this, so you can do
delimIndex = str.substr (index).find_first_of (yourDelimiters);
your first word will then be
// since delimIndex will essentially be the length of the word
word = str.substr (index, delimIndex);
Then you truncate your string and repeat. You have to, of course, handle all of the cases where find_first_not_of and find_first_of return npos, which means that character was/was not found, but I think that's enough to get started.
Btw, I'm not claiming that this is the best method, but it works...
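The find_first_not_of / find_first_of steps above can be put together without truncating the string, by passing a start position to each search. This is a sketch of that variation; the function name splitOnAny and the explicit delimiter-set parameter are assumptions for the example.

```cpp
#include <string>
#include <vector>

// Splits `str` on any character contained in `delimiters`,
// skipping runs of consecutive delimiters.
std::vector<std::string> splitOnAny(const std::string& str, const std::string& delimiters) {
    std::vector<std::string> words;
    std::string::size_type index = str.find_first_not_of(delimiters);
    while (index != std::string::npos) {
        const std::string::size_type delimIndex = str.find_first_of(delimiters, index);
        // When no further delimiter exists, take everything to the end.
        const std::string::size_type length =
            (delimIndex == std::string::npos) ? std::string::npos : delimIndex - index;
        words.push_back(str.substr(index, length));
        // Searching from npos simply returns npos, which ends the loop.
        index = str.find_first_not_of(delimiters, delimIndex);
    }
    return words;
}
```

Unlike the find_if version, this one needs the delimiter characters listed up front, for example " ;." for the sentence in the question.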
vmpstr's solution works, but could be a bit tedious.
Some months ago, I wrote a C library that does what you want.
http://wiki.gosub100.com/doku.php?id=librerias:c:cadenas
Documentation has been written in Spanish (sorry).
It doesn't need external dependencies. Try with splitWithChar() function.
Example of use:
#include "string_functions.h"
int main(void){
char yourString[]= "This is a dog; A very good one.";
char* elementsArray[8];
int nElements;
int i;
/*------------------------------------------------------------*/
printf("Character split test:\n");
printf("Base String: %s\n",yourString);
nElements = splitWithChar(yourString, ' ', elementsArray);
printf("Found %d element.\n", nElements);
for (i=0;i<nElements;i++){
printf ("Element %d: %s\n", i, elementsArray[i]);
}
return 0;
}
The original string "yourString" is modified by splitWithChar(), so be careful.
Good luck :)
C++, unlike Java, doesn't provide an elegant built-in way to split a string by a delimiter. You can use the Boost library for that, but if you want to avoid it, a manual loop will suffice.
vector<string> split(string s) {
vector<string> words;
string word = "";
for(char x: s) {
if(x == ' ' or x == ',' or x == '?' or x == ';' or x == '!'
or x == '.') {
if(word.length() > 0) {
words.push_back(word);
word = "";
}
}
else
word.push_back(x);
}
if(word.length() > 0) {
words.push_back(word);
}
return words;