I have a substring defined by two iterators (start and end). I need to check if this substring is present in another string.
Is there a standard library algorithm or string member I can use or adapt to do this without creating a whole new string object (std::string(start, end)) just for this purpose?
e.g.
struct Substring
{
std::string::const_iterator start, end;
};
auto found = std::contains(whole.begin(), whole.end(), substring.start, substring.end); // ???
You can use std::search:
bool found =
std::search(hay.begin(), hay.end(), needle.begin(), needle.end()) != hay.end();
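Applied to the iterator pair from the question, this could look roughly like the sketch below. The `Substring` struct is the asker's; the helper name `contains_range` is invented here for illustration, and treating an empty range as always found is an assumption about the desired behaviour.

```cpp
#include <algorithm>
#include <string>

// The asker's iterator pair; both iterators point into some other string.
struct Substring
{
    std::string::const_iterator start, end;
};

// Hypothetical helper: true if the characters in [sub.start, sub.end)
// occur contiguously in `whole`. No temporary std::string is built.
bool contains_range(const std::string& whole, const Substring& sub)
{
    // Assumption: an empty range counts as found in any string.
    if (sub.start == sub.end)
        return true;
    return std::search(whole.begin(), whole.end(), sub.start, sub.end)
           != whole.end();
}
```

The explicit empty-range check is there because std::search returns its first argument for an empty needle, which would compare equal to `whole.end()` when `whole` is itself empty.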
You can use the std::string::find method:
auto found = (whole.find(&*substring.start, 0, substring.end - substring.start)
!= std::string::npos);
The advantage over std::search is that std::string::find works on strings and can be implemented using Boyer-Moore. Sadly, that is not how gcc's libstdc++ implements it.
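If Boyer-Moore is what you are after and a C++17 compiler is available, std::search also accepts a searcher object such as std::boyer_moore_searcher from <functional>. A minimal sketch, assuming C++17; the helper name `contains_bm` is made up for this example:

```cpp
#include <algorithm>
#include <functional>
#include <string>

// Hypothetical helper (requires C++17): Boyer-Moore search for the
// characters in [first, last) inside `whole`, without copying them.
bool contains_bm(const std::string& whole,
                 std::string::const_iterator first,
                 std::string::const_iterator last)
{
    // Assumption: an empty needle counts as found.
    if (first == last)
        return true;
    return std::search(whole.begin(), whole.end(),
                       std::boyer_moore_searcher(first, last))
           != whole.end();
}
```

Building the searcher has a setup cost, so this is more likely to pay off for long haystacks or a needle that is reused; whether it beats the plain overload for your data is something to measure.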
Related
If I want to verify that one string exactly matches any one of the strings in a vector, I use
std::find(vectOfStrings.begin(), vectOfStrings.end(), "<targetString>") != vectOfStrings.end()
If the target string matches any string in the vector, this returns true.
But what if I want to check whether one string matches any one of a vector of regular expressions?
Is there anything in the standard library I can use to make it work like
std::find(vectOfRegExprsns.begin(), vectOfRegExprsns.end(), "<targetString>") != vectOfRegExprsns.end()?
Any suggestions would be highly appreciated.
How about using std::find_if() with a lambda?
std::find_if(
    vectOfRegExprsns.begin(), vectOfRegExprsns.end(),
    // each item is a pattern string; match the target against it, not the reverse
    [&targetString](const std::string& item) { return std::regex_match(targetString, std::regex(item)); });
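If the patterns are reused, it may be worth storing compiled std::regex objects instead of pattern strings, so nothing is recompiled per lookup. A sketch under that assumption; `matches_any` is a hypothetical name, and std::any_of is used here in place of find_if because only a yes/no answer is needed:

```cpp
#include <algorithm>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: true if `target` fully matches at least one of
// the precompiled patterns.
bool matches_any(const std::vector<std::regex>& patterns,
                 const std::string& target)
{
    return std::any_of(patterns.begin(), patterns.end(),
                       [&target](const std::regex& re) {
                           // regex_match requires the whole string to match
                           return std::regex_match(target, re);
                       });
}
```

Swap std::regex_match for std::regex_search if a match anywhere inside the target should count.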
I find the behaviour of std::string::find to be inconsistent with standard C++ containers.
E.g.
std::map<int, int> myMap = {{1, 2}};
auto it = myMap.find(10); // it == myMap.end()
But for a string,
std::string myStr = "hello";
auto it = myStr.find('!'); // it == std::string::npos
Why shouldn't the failed myStr.find('!') return myStr.end() instead of std::string::npos?
Since the std::string is somewhat special when compared with other containers, I am wondering whether there is some real reason behind this.
(Surprisingly, I couldn't find anyone questioning this anywhere).
To begin with, the std::string interface is well known to be bloated and inconsistent; see Herb Sutter's GotW #84 on this topic. Nevertheless, there is a reason for std::string::find returning an index: std::string::substr. This convenience member function operates on indices, e.g.
const std::string src = "abcdefghijk";
std::cout << src.substr(2, 5) << "\n";
You could implement substr such that it accepts iterators into the string, but then we wouldn't need to wait long for loud complaints that std::string is unusable and counterintuitive. So given that std::string::substr accepts indices, how would you find the index of the first occurrence of 'd' in the above input string in order to print out everything starting from that character?
const auto it = src.find('d'); // imagine this returns an iterator
std::cout << src.substr(std::distance(src.cbegin(), it));
This might also not be what you want. Hence we can let std::string::find return an index, and here we are:
const std::string extracted = src.substr(src.find('d'));
If you want to work with iterators, use <algorithm>. It lets you write the above as
auto it = std::find(src.cbegin(), src.cend(), 'd');
std::copy(it, src.cend(), std::ostream_iterator<char>(std::cout));
This is because std::string has two interfaces:
The general iterator based interface found on all containers
The std::string specific index based interface
std::string::find is part of the index based interface, and therefore returns indices.
Use std::find to use the general iterator based interface.
Use std::vector<char> if you don't want the index based interface (don't do this).
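To make the two interfaces concrete, here is a small side-by-side sketch. Both function names are invented for illustration. Note that the index version has to check for npos explicitly, since passing npos to substr throws std::out_of_range, whereas the iterator version yields an empty range when nothing is found.

```cpp
#include <algorithm>
#include <string>

// Index-based interface: find returns a position, substr consumes one.
std::string tail_by_index(const std::string& s, char c)
{
    const std::string::size_type pos = s.find(c);
    // npos means "not found"; return an empty string in that case
    return pos == std::string::npos ? std::string() : s.substr(pos);
}

// Iterator-based interface: std::find returns an iterator, and the
// range constructor consumes a pair of iterators.
std::string tail_by_iterator(const std::string& s, char c)
{
    const auto it = std::find(s.cbegin(), s.cend(), c);
    // if c is absent, it == s.cend() and the result is empty
    return std::string(it, s.cend());
}
```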
I'm working on a multithreading project where, for one segment of the project, I need to find out whether a given character sequence exists within a string. I'm wondering if C++/C have any pre-built functions which can handle this, but I am having trouble figuring out the exact 'definition' to search for.
I know about 'strstr' and 'find'; the issue is that the function needs to be able to find a sequence which is SPLIT across a string.
Given the string 'Hello World', I need a function that returns true if the sequence 'H-W-l' exists. Is there anything prebuilt which can handle this?
As far as I know, subsequence searching as such is not part of either the standard C library or the standard C++ library.
However, you can express subsequence searching as either a regular expression or a "glob". Posix mandates both regex and glob matching functions, while the C++ standard library includes regular expressions since C++11. Both of these techniques require modifying the search string:
Regular expression: HWl ⇒ H.*W.*l. regexec will do a search for the regular expression (unless anchored, which this one is not); in C++, you would want to use std::regex_search rather than std::regex_match.
Glob: HWl ⇒ *H*W*l*. Glob matching is always a complete match, although in all the implementations I know of a trailing * is optimized. This is available as the fnmatch function in the Posix header fnmatch.h. For this application, provide 0 for the flags parameter.
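The regular-expression variant could be sketched in C++11 as follows. The function name is hypothetical, and the sketch assumes the needle contains only letters and digits, so no regex metacharacters need escaping. Also note that `.` does not match line terminators in the default ECMAScript grammar.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper. Assumption: `needle` consists only of letters
// and digits, so it can be pasted into a pattern without escaping.
bool has_subsequence_regex(const std::string& haystack,
                           const std::string& needle)
{
    // Turn "HWl" into "H.*W.*l"
    std::string pattern;
    for (std::size_t i = 0; i < needle.size(); ++i)
    {
        if (i != 0)
            pattern += ".*";
        pattern += needle[i];
    }
    // regex_search looks for the pattern anywhere in the haystack
    return std::regex_search(haystack, std::regex(pattern));
}
```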
If you don't like any of the above, you can use the standard C strchr function in a simple loop:
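bool has_subsequence(const char* haystack, const char* needle) {
    const char* p;
    // Find each needle character in turn; ++p resumes the search just past the
    // previous match, so one haystack character is never matched twice.
    for (p = haystack; *needle && (p = strchr(p, *needle)); ++needle, ++p) {
    }
    return p != NULL;
}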
If I understand correctly, then you're trying to search for chars in a given order but aren't necessarily contiguous. If you're in C++, I don't see why you couldn't use the std::find function under the <algorithm> system header. I would load both into a string and then search as follows:
bool has_noncontig_sequence(const std::string& str, const std::string& subStr)
{
    typedef std::string::const_iterator iter;
    iter start = str.begin();
    // for each character of subStr, find its next occurrence in str
    for (iter i = subStr.begin(); i != subStr.end(); ++i)
    {
        start = std::find(start, str.end(), *i);
        if (start == str.end())
            return false; // character not found in the remainder of str
        ++start; // resume after the match so it is not used twice
    }
    return true;
}
The std::find function will position start over the first correct character in str if it can find it and then search for the next. If it can't, then start will be positioned at the end, indicating failure.
I have a vector of strings, and I need to search it for a particular sequence of characters:
vector<string> users;
users.push_back("user25_5");
users.push_back("user65_6");
users.push_back("user95_9");
I have to search for the number 65 in the vector.
std::find on a vector only compares against the entire string; it does not look for particular characters inside each string.
You can use std::find_if with a suitable functor:
bool has_65(const std::string& s)
{
    // true if the substring "65" occurs anywhere in s
    return s.find("65") != std::string::npos;
}
then
auto it = std::find_if(users.begin(), users.end(), has_65);
For finding strings inside strings, have a look at std::string::find.
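If the text to look for is not known at compile time, a lambda that captures it can replace the fixed predicate. A minimal sketch, assuming C++11; `find_containing` is an invented name:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: iterator to the first element of `v` that
// contains `part` as a substring, or v.end() if there is none.
std::vector<std::string>::const_iterator
find_containing(const std::vector<std::string>& v, const std::string& part)
{
    return std::find_if(v.begin(), v.end(),
                        [&part](const std::string& s) {
                            return s.find(part) != std::string::npos;
                        });
}
```

Bear in mind that this is a plain substring test, so searching for "5" would match every entry in the example vector.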
Is it possible for me to detect if a string is 'all numeric' or not using tr1 regex?
If yes, please help me with a snippet as well, since I am new to regex.
The reason I am looking at tr1 regex for something like this is that I don't want to create a separate function for detecting whether the string is numeric. I want to do it inline in the rest of the client code, but I do not want it to look ugly either. I feel tr1 regex might help. I'm not sure; any advice on this?
If you just want to test whether the string has all numeric characters, you can use std::find_if_not and std::isdigit:
std::find_if_not(s.begin(), s.end(), (int(*)(int))std::isdigit) == s.end()
If you do not have a Standard Library implementation with std::find_if_not, you can easily write it:
template <typename ForwardIt, typename Predicate>
ForwardIt find_if_not(ForwardIt first, ForwardIt last, Predicate pred)
{
for (; first != last; ++first)
if (!pred(*first))
return first;
return first;
}
You can use the string::find_first_not_of member function to test for numeric characters.
if (mystring.find_first_not_of("0123456789") == std::string::npos)
{
std::cout << "numeric only!";
}
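Both approaches above report an empty string as numeric, which may or may not be what you want. A sketch that rejects the empty string, assuming C++11 for std::all_of and lambdas; the function name is made up:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: true only for a non-empty string made up
// entirely of decimal digits.
bool is_all_digits(const std::string& s)
{
    return !s.empty() &&
           std::all_of(s.begin(), s.end(), [](unsigned char ch) {
               // taking unsigned char avoids passing a negative value
               // to std::isdigit, which would be undefined behaviour
               return std::isdigit(ch) != 0;
           });
}
```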
The regular expression for this is rather trivial. Just search for "\\D" (with regex_search rather than regex_match): it matches any character that is not a digit, so if the search finds nothing, the string is all numeric. If you'd like to allow a decimal separator too, you could use "[^\\d\\.]", which translates to "not a digit or dot".
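As a sketch of the "\\D" idea, here written against C++11 std::regex rather than TR1 (that substitution, the function name, and the rejection of empty strings are all assumptions of this example):

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `s` is non-empty and contains no
// non-digit character.
bool is_numeric_regex(const std::string& s)
{
    // "\\D" matches any single character that is not a digit
    static const std::regex non_digit("\\D");
    return !s.empty() && !std::regex_search(s, non_digit);
}
```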
However, how about simply using strtol() to read the number? It gives you a pointer to the first character it could not convert, so if that pointer is at the end of the string, the whole string was a number. On the plus side, you won't even need TR1 for this.
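A rough sketch of the strtol approach; the helper name and the out-parameter are invented for this example. Keep in mind that strtol also accepts leading whitespace and an optional sign, so this is a looser test than "digits only".

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: true if strtol consumes the entire string as a
// base-10 number that fits in a long; the result is stored in `value`.
bool parses_as_long(const std::string& s, long& value)
{
    char* end = nullptr;
    errno = 0;
    value = std::strtol(s.c_str(), &end, 10);
    return end != s.c_str()              // at least one character converted
        && end == s.c_str() + s.size()   // nothing left over
        && errno != ERANGE;              // no overflow or underflow
}
```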