C++ How to get string/char in between 2 words

I have a word:
AD#Andorra
I have a few questions:
How do I check that
AD?Andorra exists?
Here ? is a wildcard; it could be a comma, a hash sign, a dollar sign, or any other character.
Then, once I have confirmed that AD?Andorra exists, how do I get the value of ?
Thanks,
Chen

The problem can be solved generally with a regular expression match. However, for the specific problem you presented, this would work:
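std::string input = getinput();
char at2 = 0;
if (input.size() > 2) {
    at2 = input[2];   // remember the character in the wildcard position
    input[2] = '#';   // normalize it so a plain comparison works
}
if (input == "AD#Andorra") {
    // match, and char of interest is in at2;
} else {
    // doesn't match
}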
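A variant of the same idea that leaves the input untouched and works for other prefixes and suffixes could look roughly like this. The name match_with_wildcard and its parameters are made up for illustration, and it assumes exactly one wildcard character sits between the two fixed parts:

```cpp
#include <string>

// Hypothetical helper: returns true if `input` is `front` + one character + `back`
// and stores that one character in `wildcard`.
bool match_with_wildcard(const std::string &input,
                         const std::string &front,
                         const std::string &back,
                         char &wildcard)
{
    // The input must be exactly one character longer than front + back.
    if (input.size() != front.size() + back.size() + 1)
        return false;
    // Check the fixed prefix.
    if (input.compare(0, front.size(), front) != 0)
        return false;
    // Check the fixed suffix, which starts right after the wildcard.
    if (input.compare(front.size() + 1, back.size(), back) != 0)
        return false;
    wildcard = input[front.size()];
    return true;
}
```

Calling match_with_wildcard(input, "AD", "Andorra", c) answers both questions at once: the return value says whether the pattern exists, and c holds the wildcard character on success.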
If the ? is supposed to represent a string also, then you can do something like this:
bool find_inbetween (std::string input,
std::string &output,
const std::string front = "AD",
const std::string back = "Andorra") {
if ((input.size() < front.size() + back.size())
|| (input.compare(0, front.size(), front) != 0)
|| (input.compare(input.size()-back.size(), back.size(), back) != 0)) {
return false;
}
output = input.substr(front.size(), input.size()-front.size()-back.size());
return true;
}

If you are on C++11 or use Boost (which I strongly recommend!), use regular expressions. Once you gain some understanding of them, text processing becomes much easier.
#include <regex> // or #include <boost/regex.hpp>
//! \return A separating character or 0, if str does not match the pattern
char getSeparator(const char* str)
{
using namespace std; // change to "boost" if not on C++11
static const regex re("^AD(.)Andorra$");
cmatch match;
if (regex_match(str, match, re))
{
return *(match[1].first);
}
return 0;
}

Assuming your character is always the third one (index 2), use the string member function substr:
your_string.substr(2, 1)

If you are using C++11, I recommend using a regex instead of searching the string directly.

Related

C++ Get String between two delimiter String

Is there any built-in function available to get the string between two delimiter strings in C/C++?
My input looks like
_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_
And my output should be
_0_192.168.1.18_
Thanks in advance...
You can do it like this (STARTDELIMITER and STOPDELIMITER stand for your delimiter values):
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
size_t first = str.find(STARTDELIMITER);
size_t last = str.find(STOPDELIMITER);
string strNew = str.substr(first, last - first);
This assumes that STOPDELIMITER occurs only once, at the end.
EDIT:
As the delimiter can occur multiple times, change the statement that finds STOPDELIMITER to:
size_t last = str.rfind(STOPDELIMITER);
This gets you the text between the first STARTDELIMITER and the last STOPDELIMITER, even if they are repeated multiple times.
I have no idea how the top answer received so many votes that it did when the question clearly asks how to get a string between two delimiter strings, and not a pair of characters.
If you would like to do so you need to account for the length of the string delimiter, since it will not be just a single character.
Case 1: Both delimiters are unique:
Given a string _STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_ that you want to extract _0_192.168.1.18_ from, you could modify the top answer like so to get the desired effect. This is the simplest solution without introducing extra dependencies (e.g Boost):
#include <iostream>
#include <string>
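std::string get_str_between_two_str(const std::string &s,
                                    const std::string &start_delim,
                                    const std::string &stop_delim)
{
    // Use std::size_t rather than unsigned so that std::string::npos is preserved.
    const std::size_t first_delim_pos = s.find(start_delim);
    if (first_delim_pos == std::string::npos)
        return "";  // start delimiter not found
    const std::size_t end_pos_of_first_delim = first_delim_pos + start_delim.length();
    const std::size_t last_delim_pos = s.find(stop_delim);
    if (last_delim_pos == std::string::npos || last_delim_pos < end_pos_of_first_delim)
        return "";  // stop delimiter missing or located before the content
    return s.substr(end_pos_of_first_delim,
                    last_delim_pos - end_pos_of_first_delim);
}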
int main() {
// Want to extract _0_192.168.1.18_
std::string s = "_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_";
std::string s2 = "ABC123_STARTDELIMITER_0_192.168.1.18_STOPDELIMITER_XYZ345";
std::string start_delim = "_STARTDELIMITER";
std::string stop_delim = "STOPDELIMITER_";
std::cout << get_str_between_two_str(s, start_delim, stop_delim) << std::endl;
std::cout << get_str_between_two_str(s2, start_delim, stop_delim) << std::endl;
return 0;
}
Will print _0_192.168.1.18_ twice.
The length passed as the second argument to std::string::substr must be last_delim_pos - (first_delim_pos + start_delim.length()), that is, the end position of the start delimiter has to be subtracted. This ensures that the inner string is still extracted correctly when the start delimiter is not located at the very beginning of the string, as the second case above demonstrates.
Case 2: Unique first delimiter, non-unique second delimiter:
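Say you want to get the string between a unique start delimiter and the first occurrence of a non-unique stop delimiter after it. You could modify the above function get_str_between_two_str to begin the search for the stop delimiter right after the start delimiter. Note that find_first_of is not suitable for this: it looks for any single character of its argument rather than the whole delimiter string, so use find with a start position instead:
std::string get_str_between_two_str(const std::string &s,
                                    const std::string &start_delim,
                                    const std::string &stop_delim)
{
    const std::size_t first_delim_pos = s.find(start_delim);
    if (first_delim_pos == std::string::npos)
        return "";  // start delimiter not found
    const std::size_t end_pos_of_first_delim = first_delim_pos + start_delim.length();
    // Search for the stop delimiter only after the start delimiter.
    const std::size_t last_delim_pos = s.find(stop_delim, end_pos_of_first_delim);
    if (last_delim_pos == std::string::npos)
        return "";  // stop delimiter not found
    return s.substr(end_pos_of_first_delim,
                    last_delim_pos - end_pos_of_first_delim);
}
If instead you want to capture everything between the unique start delimiter and the last occurrence of the stop delimiter, as the asker commented above, use rfind (for the same reason, not find_last_of).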
Case 3: Non-unique first delimiter, unique second delimiter:
Very similar to case 2, just reverse the logic between the first delimiter and second delimiter.
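One possible reading of "reverse the logic" is: locate the unique stop delimiter first, then search backwards for the closest start delimiter in front of it. A rough sketch under that assumption (the function name get_between_last_start is invented for this example):

```cpp
#include <string>

// Hypothetical helper for case 3: the stop delimiter is unique, the start
// delimiter may repeat; take the last start delimiter before the stop delimiter.
std::string get_between_last_start(const std::string &s,
                                   const std::string &start_delim,
                                   const std::string &stop_delim)
{
    const std::size_t stop = s.find(stop_delim);
    if (stop == std::string::npos || stop == 0)
        return "";  // no stop delimiter, or nothing in front of it
    // rfind(str, pos) finds the last occurrence that begins at or before pos.
    const std::size_t start = s.rfind(start_delim, stop - 1);
    if (start == std::string::npos)
        return "";  // no start delimiter before the stop delimiter
    const std::size_t begin = start + start_delim.length();
    if (begin > stop)
        return "";  // the two delimiters overlap
    return s.substr(begin, stop - begin);
}
```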
Case 4: Both delimiters are not unique:
Again, this is very similar to case 2. Make a container to hold all strings found between any pair of the two delimiters. Loop through the string; each time the second delimiter is encountered, add the string in between to the container and move the search position for the first delimiter up to the second delimiter's position. Repeat until std::string::npos is reached.
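The loop for case 4 could be sketched as follows. The name get_all_between is hypothetical, and the sketch assumes that the search for the next pair resumes right after each stop delimiter:

```cpp
#include <string>
#include <vector>

// Hypothetical helper for case 4: collects every substring that sits between
// a start delimiter and the next stop delimiter following it.
std::vector<std::string> get_all_between(const std::string &s,
                                         const std::string &start_delim,
                                         const std::string &stop_delim)
{
    std::vector<std::string> found;
    if (start_delim.empty() || stop_delim.empty())
        return found;  // empty delimiters would never advance the search
    std::size_t pos = 0;
    while (true) {
        std::size_t start = s.find(start_delim, pos);
        if (start == std::string::npos)
            break;  // no further start delimiter
        start += start_delim.length();
        const std::size_t stop = s.find(stop_delim, start);
        if (stop == std::string::npos)
            break;  // unterminated last piece is ignored
        found.push_back(s.substr(start, stop - start));
        pos = stop + stop_delim.length();  // continue after this stop delimiter
    }
    return found;
}
```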
To get a string between two delimiter strings when there is no whitespace around them:
string str = "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string startDEL = "STARTDELIMITER";
// this is really only needed for the first delimiter
string stopDEL = "STOPDELIMITER";
size_t firstLim = str.find(startDEL);
size_t lastLim = str.find(stopDEL);
// substr takes a start position and a length, so pass lastLim - firstLim
string strNew = str.substr(firstLim, lastLim - firstLim);
// This doesn't exclude the first delimiter yet, because there is no whitespace to split on
strNew = strNew.substr(startDEL.size());
// this starts your substring after the delimiter
I tried combining the two substr calls, but the result then included STOPDELIMITER.
Hope that helps.
Hope you won't mind that I'm answering by pointing to another question :)
I would use boost::split or boost::iter_split.
http://www.boost.org/doc/libs/1_54_0/doc/html/string_algo/usage.html#idp166856528
For example code see this SO question:
How to avoid empty tokens when splitting with boost::iter_split?
Let's say you need to get the 6th field (brand) from the output below:
zoneid:zonename:state:zonepath:uuid:brand:ip-type:r/w:file-mac-profile
A single str.find call will not get you there directly, because the field is in the middle of the line, but you can use strtok, e.g.
char *brand;
brand = strtok( line, ":" );   // 1st field (zoneid)
for (int i=0;i<5;i++) {
brand = strtok( NULL, ":" );   // advance five more fields to reach brand
}
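If modifying the line with strtok is not acceptable, the same field lookup can be sketched with std::getline on a string stream. The helper name nth_field is invented here; note that, unlike strtok, this keeps empty fields, so "a::c" has an empty second field:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: returns the field at zero-based `index` of a line whose
// fields are separated by `sep`, or an empty string if there are too few fields.
std::string nth_field(const std::string &line, std::size_t index, char sep = ':')
{
    std::istringstream in(line);
    std::string field;
    for (std::size_t i = 0; std::getline(in, field, sep); ++i) {
        if (i == index)
            return field;
    }
    return std::string();
}
```

With the line above, nth_field(line, 5) yields the brand field, because counting starts at zero.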
This is a late answer, but this might work too:
string strgOrg= "STARTDELIMITER_0_192.168.1.18_STOPDELIMITER";
string strg= strgOrg;
strg.replace(strg.find("STARTDELIMITER"), 14, "");
strg.replace(strg.find("STOPDELIMITER"), 13, "");
Hope it works for others.
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
    std::size_t start = oStr.find(sStr1);
    if (start != std::string::npos)
    {
        start += sStr1.length();
        // look for the second delimiter only after the first one
        std::size_t stop = oStr.find(sStr2, start);
        if (stop != std::string::npos)
            rStr = oStr.substr(start, stop - start);
        else
            rStr = "error";
    }
    else
        rStr = "error";
}
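Or, if you have access to C++14 (std::regex itself only needs C++11; the s literal suffix used in the example below needs C++14), the following. Note that the delimiters are inserted into the pattern as they are, so regex metacharacters in them are not escaped: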
void getBtwString(std::string oStr, std::string sStr1, std::string sStr2, std::string &rStr)
{
using namespace std::literals::string_literals;
auto start = sStr1;
auto end = sStr2;
std::regex base_regex(start + "(.*)" + end);
auto example = oStr;
std::smatch base_match;
std::string matched;
if (std::regex_search(example, base_match, base_regex)) {
if (base_match.size() == 2) {
matched = base_match[1].str();
}
rStr = matched;
}
}
Example:
string strout;
getBtwString("it's_12345bb2","it's","bb2",strout);
getBtwString("it's_12345bb2"s,"it's"s,"bb2"s,strout); // second solution
Headers:
#include <regex> // second solution
#include <string>

C++ split string with space and punctuation chars

I want to split a string in C++ that contains spaces and punctuation.
e.g. str = "This is a dog; A very good one."
I want to get "This" "is" "a" "dog" "A" "very" "good" "one" one by one.
It's quite easy with only one delimiter using getline, but I don't know all the delimiters in advance. They can be any punctuation characters.
Note: I don't want to use Boost!
Use std::find_if() with a lambda to find the delimiter.
auto it = std::find_if(str.begin(), str.end(), [] (const char element) -> bool {
const unsigned char c = static_cast<unsigned char>(element); // <cctype> functions need a non-negative value
return std::isspace(c) || std::ispunct(c);});
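Wrapped in a loop, the find_if idea could look something like this. The names is_delim and split_words are made up for the sketch, and it assumes the default C locale's notion of whitespace and punctuation:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// True for characters treated as delimiters: whitespace or punctuation.
static bool is_delim(char ch)
{
    const unsigned char c = static_cast<unsigned char>(ch);
    return std::isspace(c) || std::ispunct(c);
}

// Hypothetical helper: splits `str` into words separated by any run of delimiters.
std::vector<std::string> split_words(const std::string &str)
{
    std::vector<std::string> words;
    std::string::const_iterator it = str.begin();
    while (it != str.end()) {
        // Skip the delimiters in front of the next word.
        it = std::find_if_not(it, str.end(), is_delim);
        // The word ends at the next delimiter (or at the end of the string).
        const std::string::const_iterator end = std::find_if(it, str.end(), is_delim);
        if (it != end)
            words.push_back(std::string(it, end));
        it = end;
    }
    return words;
}
```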
So, starting at the first position, you find the first valid token. You can use
index = str.find_first_not_of (yourDelimiters);
Then you have to find the first delimiter after this, so you can do
delimIndex = str.substr (index).find_first_of (yourDelimiters);
your first word will then be
// since delimIndex will essentially be the length of the word
word = str.substr (index, delimIndex);
Then you truncate your string and repeat. You have to, of course, handle all of the cases where find_first_not_of and find_first_of return npos, which means that character was/was not found, but I think that's enough to get started.
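Put together, and without truncating the string on every step, those calls might be arranged like this. The function name split_on is hypothetical, and the delimiter set is passed in by the caller:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits `str` at any character contained in `delims`.
std::vector<std::string> split_on(const std::string &str, const std::string &delims)
{
    std::vector<std::string> words;
    // Start of the first word: first character that is not a delimiter.
    std::size_t start = str.find_first_not_of(delims);
    while (start != std::string::npos) {
        // End of the word: next delimiter at or after start.
        const std::size_t end = str.find_first_of(delims, start);
        if (end == std::string::npos) {
            words.push_back(str.substr(start));  // last word runs to the end
            break;
        }
        words.push_back(str.substr(start, end - start));
        start = str.find_first_not_of(delims, end);
    }
    return words;
}
```

Both npos cases mentioned above are covered: a string consisting only of delimiters yields no words, and a final word without a trailing delimiter is still returned.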
Btw, I'm not claiming that this is the best method, but it works...
vmpstr's solution works, but could be a bit tedious.
Some months ago, I wrote a C library that does what you want.
http://wiki.gosub100.com/doku.php?id=librerias:c:cadenas
Documentation has been written in Spanish (sorry).
It doesn't need external dependencies. Try with splitWithChar() function.
Example of use:
#include "string_functions.h"
int main(void){
char yourString[]= "This is a dog; A very good one.";
char* elementsArray[8];
int nElements;
int i;
/*------------------------------------------------------------*/
printf("Character split test:\n");
printf("Base String: %s\n",yourString);
nElements = splitWithChar(yourString, ' ', elementsArray);
printf("Found %d element.\n", nElements);
for (i=0;i<nElements;i++){
printf ("Element %d: %s\n", i, elementsArray[i]);
}
return 0;
}
The original string "yourString" is modified by splitWithChar(), so be careful.
Good luck :)
C++, unlike Java, doesn't provide a built-in way to split a string by a delimiter. You can use the Boost library for that, but if you want to avoid it, a manual loop suffices.
vector<string> split(string s) {
vector<string> words;
string word = "";
for(char x: s) {
if(x == ' ' or x == ',' or x == '?' or x == ';' or x == '!'
or x == '.') {
if(word.length() > 0) {
words.push_back(word);
word = "";
}
}
else
word.push_back(x);
}
if(word.length() > 0) {
words.push_back(word);
}
return words;
}

My last regular expression won't work but i cannot figure out the reason why

I have two vectors: one holds my regular expressions and the other holds the strings that are checked against them. Most of them work fine, except for the one shown below: the string is correct and matches the regular expression, but the output says incorrect instead of correct.
INPUT STRING
.C/IATA
CODE IS BELOW
std::string errorMessages [6][6] = {
{
"Correct Corparate Name\n",
},
{
"Incorrect Format for Corporate Name\n",
}
};
std::vector<std::string> el;
split(el,message,boost::is_any_of("\n"));
std::string a = ("");
for(int i = 0; i < el.size(); i++)
{
if(el[i].substr(0,3) == ".C/")
{
DCS_LOG_DEBUG("--------------- Validating .C/ ---------------");
output.push_back("\n--------------- Validating .C/ ---------------\n");
str = el[i].substr(3);
split(st,str,boost::is_any_of("/"));
for (int split_id = 0 ; split_id < splitMask.size() ; split_id++ )
{
boost::regex const string_matcher_id(splitMask[split_id]);
if(boost::regex_match(st[split_id],string_matcher_id))
{
a = errorMessages[0][split_id];
DCS_LOG_DEBUG("" << a )
}
else
{
a = errorMessages[1][split_id];
DCS_LOG_DEBUG("" << a)
}
output.push_back(a);
}
}
else
{
DCS_LOG_DEBUG("Do Nothing");
}
st[split_id] = "IATA"
splitMask[split_id] = "[a-zA-Z]{1,15}" <---
But it still outputs Incorrect format for corporate name
I cannot see why it prints incorrect when it should be correct. Can someone help me here, please?
Your regex and the surrounding logic are OK.
You need to extend your logging and to print the relevant part of splitMask and st right before the call to boost::regex_match to double check that the values are what you believe they are. Print them surrounded in some punctuation and also print the string length to be sure.
As you probably know, boost::regex_match only finds a match if the whole string is a match; therefore, if there is a non-printable character somewhere, or maybe a trailing space character, that will perfectly explain the result you have seen.
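To illustrate the effect, here is a small sketch. It uses std::regex instead of boost::regex purely to keep the example self-contained, and the helper names rtrim and matches_mask are invented; a carriage return left over from splitting on "\n" is just one possible culprit:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: strips trailing spaces, tabs, '\r' and '\n'.
std::string rtrim(const std::string &s)
{
    const std::size_t last = s.find_last_not_of(" \t\r\n");
    return last == std::string::npos ? std::string() : s.substr(0, last + 1);
}

// True only if the whole field matches the mask (regex_match semantics).
bool matches_mask(const std::string &field, const std::string &mask)
{
    return std::regex_match(field, std::regex(mask));
}
```

With the mask from the question, "IATA" matches, while "IATA\r" or "IATA " does not until it has been passed through rtrim.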

Why wouldn't a search from glGetString(GL_EXTENSIONS) work correctly?

I read this page:
http://www.opengl.org/wiki/GlGetString
For example, if the extension
GL_EXT_pixel_transform_color_table is
listed, doing a simple search for
GL_EXT_pixel_transform will return a
positive whether or not it is defined.
How is that possible, since the list is space-separated? Why don't you just put a space after the keyword you're searching for?
For example:
char *exts = (char *)glGetString(GL_EXTENSIONS);
if(!strstr(exts, "GL_EXT_pixel_transform ")){ // notice the space!
// not supported
}
I would like to know why this wouldn't work, because for me it does work.
You can tokenise the returned string using a space as the separator for a more reliable search (if you don't want to use the newer API). E.g. with Boost.Tokenizer:
typedef boost::tokenizer< boost::char_separator<char> > tokenizer;
boost::char_separator<char> sep(" ");
// glGetString returns const GLubyte*, so a reinterpret_cast is needed,
// and the tokenizer expects a container such as std::string
const std::string exts(reinterpret_cast<const char*>(glGetString(GL_EXTENSIONS)));
tokenizer tok(exts, sep);
if (std::find(tok.begin(), tok.end(), "GL_EXT_pixel_transform") != tok.end()) {
// extension found
}
What if the extension you are looking for is listed last? Then it will not be followed by a blank.
I know this is an old question but maybe someone else will find it useful. In case you don't want to use any tokenizing library/class here is a function that scans a string for an exact substring (without the mentioned problem). Also, it almost doesn't use any additional memory (string data is not copied):
bool strstrexact(const char *str, const char *substr, const char *delim, const bool isRecursiveCall = 0)
{
static int substrLen;
if (!isRecursiveCall)
substrLen = strlen(substr);
if (substrLen <= 0)
return FALSE;
const char *occurence = strstr(str, substr);
if (occurence == NULL)
return FALSE;
occurence += substrLen;
if (*occurence == '\0')
return TRUE;
const char *nextDelim;
nextDelim = strstr(occurence, delim);
if (nextDelim == NULL)
return FALSE;
if (nextDelim == occurence)
return TRUE;
return strstrexact(nextDelim, substr, delim, TRUE);
}
It returns TRUE if the substring was found or FALSE if it wasn't. Here's how I used it:
if (strstrexact((const char*) glGetString(GL_EXTENSIONS), "WGL_ARB_pixel_format", " ")) {
// extension is available
} else {
// extension isn't available
}
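If copying the extension string is acceptable, comparing whole tokens is another way to avoid both the prefix problem and the "listed last" problem. A minimal sketch using only the standard library (has_token is a made-up name, and unlike the function above it does copy the string):

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: true if `name` equals one of the whitespace-separated
// tokens in `list`. A null `list` is treated as an empty list.
bool has_token(const char *list, const std::string &name)
{
    if (list == nullptr)
        return false;
    std::istringstream in(list);
    std::string token;
    while (in >> token) {
        if (token == name)
            return true;  // exact match of a complete token
    }
    return false;
}
```

It would be called as has_token(reinterpret_cast<const char*>(glGetString(GL_EXTENSIONS)), "GL_EXT_pixel_transform").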

get atof to continue converting a string to a number after the first non valid ch in a string

I'd like to know if there is a way to get atof to continue converting to a number even if there are invalid characters in the way.
For example, let's say I have the string "444-3-3-33".
I want to convert it to a double a=4443333
(and keep the string as it was).
Happy to get any suggestions or an alternative way.
Thanks!
I can't take credit for this solution, though it's a good one; see this SO post. For those too lazy to click through, the author recommends using a locale that treats all non-digit characters as whitespace. It might be overkill for your problem, but the idea is easily adaptable. Instead of all non-digits, you could treat just "-" as whitespace. Here's his code, not mine. Please, if you like this, give him the upvote.
struct digits_only: std::ctype<char>
{
digits_only(): std::ctype<char>(get_table()) {}
static std::ctype_base::mask const* get_table()
{
static std::vector<std::ctype_base::mask>
rc(std::ctype<char>::table_size,std::ctype_base::space);
std::fill(&rc['0'], &rc['9'] + 1, std::ctype_base::digit); // the end of the range is exclusive, so go one past '9'
return &rc[0];
}
};
bool in_range(int lower, int upper, std::string const &input) {
std::istringstream buffer(input);
buffer.imbue(std::locale(std::locale(), new digits_only()));
int n;
while (buffer>>n)
if (n < lower || upper < n)
return false;
return true;
}
Then just remove the whitespace and pass the string to atof.
Both of the following strip out non-digits for me
bool no_digit(char ch) {return !std::isdigit(ch);}
std::string read_number(const std::string& input)
{
std::string result;
std::remove_copy_if( input.begin()
, input.end()
, std::back_inserter(result)
, &no_digit);
return result;
}
std::string read_number(std::istream& is)
{
std::string result;
for(;;) {
while(is.good() && !std::isdigit(is.peek()))
is.get();
if(!is.good())
return result;
result += is.get();
}
assert(false);
}
You can then read number using string streams:
std::istringstream iss(read_number("444-3-3-33"));
int i;
if( !(iss>>i) ) throw "something went wrong!";
std::cout << i << '\n';
I would recommend sscanf.
[edit]
Upon further review, it seems you'll have to use strstr, as sscanf could have an issue with the embedded '-'.
Further, the page should give you a good start on finding (and removing) your '-' chars.
[/edit]
Copy the 'string number' to a local buffer (a std::string), then strip the unwanted chars out of the number (compacting the string so that no gaps are left, e.g. with std::string::erase or std::string::replace), then call atof on std::string::c_str(). Alternatively you can use C strings, but then this wouldn't be C++.
Alternatively, create a custom version of atof yourself, using the source from, say, stdlibc or basic math.
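The copy-and-strip approach could be sketched like this. The names not_digit and digits_to_double are invented for the example, and it assumes that everything except the digits 0-9 should be dropped (so a decimal point or a sign would be dropped as well):

```cpp
#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <string>

// True for every character that is not a decimal digit.
static bool not_digit(char ch)
{
    return !std::isdigit(static_cast<unsigned char>(ch));
}

// Hypothetical helper: works on a copy, so the caller's string stays as it was.
double digits_to_double(const std::string &input)
{
    std::string digits(input);
    // Erase-remove idiom: compact the digits and cut off the leftover tail.
    digits.erase(std::remove_if(digits.begin(), digits.end(), not_digit),
                 digits.end());
    return std::atof(digits.c_str());  // yields 0.0 when no digits remain
}
```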