Initialize a map with regex - c++

I'm using a nice, simple std::vector<std::string> initializer that takes an input string and a regex. It is similar to a basic split, except that it collects the group 1 match of every regex hit:
static std::vector<std::string> match(const std::string& str, const std::regex& re) {
return { std::sregex_token_iterator(str.begin(), str.end(), re, 1), std::sregex_token_iterator() };
}
Construction of a vector is done like below:
std::string input = "aaa(item0,param0);bbb(item1,param1);cc(item2,param2);";
std::vector<std::string> myVector = match(input, std::regex(R"(\(([^,]*),)"));
This yields a vector containing item0, item1 and item2, extracted from the input string by the regex.
Now my match function uses the group 1 results of the regex and (I believe) relies on the vector's initialization form:
std::vector<std::string> myVector = { ... };
I'd like to create a similar match function that constructs a std::map<std::string,std::string>. A map can be initialized in the same style:
std::map<std::string,std::string> myMap = { {...}, {...} };
My idea is to modify the regex so that it captures a second group.
I would then like to modify the match function above so that, given the modified regex R"(\(([^,]*),([^)]*))", it builds a map equivalent to this:
std::map<std::string,std::string> myMap = { {"item0", "param0"}, {"item1", "param1"}, {"item2", "param2"} };
What I've tried?
static std::map<std::string, std::string> match(const std::string& str, const std::regex& re) {
return { std::sregex_token_iterator(str.begin(), str.end(), re, {1,2}), std::sregex_token_iterator() };
}
With a vector, this would put both the group 1 and group 2 results into the vector, but it cannot initialize a map.
How can I still do this easily? Is it not possible with sregex_token_iterator?

I don't know exactly what 'easily' means, so here is a simple solution:
#include <iostream>
#include <map>
#include <regex>
#include <string>
static std::map<std::string, std::string> match(const std::string& str, const std::regex& re) {
std::map<std::string, std::string> retVal;
auto token = std::sregex_token_iterator(str.begin(), str.end(), re, {1,2});
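const std::sregex_token_iterator end;
// Tokens arrive as group 1, group 2, group 1, ...; consume them in pairs.
for (; token != end; ++token) {
const std::string key = token->str(); // group 1
if (++token == end) break; // defensive: never dereference the end iterator
retVal.emplace(key, token->str()); // group 2
}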
return retVal;
}
int main() {
std::string input = "aaa(item0,param0);bbb(item1,param1);cc(item2,param2);";
auto myMap = match(input, std::regex(R"(\(([^,]*),([^)]*))"));
for (const auto& item : myMap)
std::cout<<item.first<<'\t'<<item.second<<std::endl;
}
You could also try Boost or a homemade generic algorithm.
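As an alternative sketch (not taken from the answer above), std::sregex_iterator yields a whole std::smatch per hit, so both groups are available together and no token pairing is needed. The name match_pairs is made up for this illustration, and the code assumes the regex defines at least two capture groups.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: builds a map from group 1 (key) and group 2 (value)
// of every match of `re` in `str`. Assumes `re` has at least two groups.
static std::map<std::string, std::string> match_pairs(const std::string& str,
                                                      const std::regex& re) {
    std::map<std::string, std::string> result;
    for (std::sregex_iterator it(str.begin(), str.end(), re), end; it != end; ++it) {
        // (*it)[1] and (*it)[2] are the sub-matches of the current match.
        result.emplace((*it)[1].str(), (*it)[2].str());
    }
    return result;
}
```

Called as match_pairs(input, std::regex(R"(\(([^,]*),([^)]*))")), this should give the same three item/param pairs the question asks for.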

Related

Copying string into set<string> in lowercase?

I have the following small and easy code:
int main(int argc, char *argv[]) {
std::vector<std::string> in;
std::set<std::string> out;
in.push_back("Hi");
in.push_back("Dear");
in.push_back("Buddy");
for (const auto& item : in) {
*** std::transform(item.begin(),item.end(),item.begin(), ::tolower);
*** out.insert(item);
}
return 0;
}
I'd like to copy all items of in into out.
However, they should be converted to lowercase on the way, preferably in place and without an extra temporary variable.
So this is the required content of out at the end:
hi
dear
buddy
Please note, const auto& item is fixed, meaning I can't remove the const requirement (this is part of a bigger library, here is over-simplified for demo).
How should I do this? (If I remove the "const" modifier, it works, but if modifications are not allowed on item, how can I still insert the item into the set, while also transforming it to lowercase?)
Note that you have to copy, since the items in the original in container cannot be moved into the out container. The code below copies each element exactly once (boost::algorithm::to_lower comes from <boost/algorithm/string.hpp>).
...
in.push_back("Hi");
in.push_back("Dear");
in.push_back("Buddy");
std::transform(in.begin(), in.end(), std::inserter(out, out.end()),
[] (std::string str) { boost::algorithm::to_lower(str); return str;}
);
return 0;
You need a lambda function with std::transform, and you must not bind the strings by const reference, otherwise std::transform cannot modify them.
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> in;
std::set<std::string> out;
in.push_back("Hi");
in.push_back("Dear");
in.push_back("Buddy");
for (/*const*/ auto& item : in) // you can't have a const & if you're going to modify it.
{
std::transform(item.begin(), item.end(), item.begin(), [](unsigned char c)
{
return static_cast<char>(::tolower(c)); // unsigned char avoids undefined behavior for negative char values
});
out.insert(item);
}
return 0;
}
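If the const auto& in the loop really has to stay, one possible sketch without Boost is to build the lowercase copy first and move it into the set. The helper names lowered_copy and lowercase_set are invented for this example; a temporary string is still created, but each element is copied only once.

```cpp
#include <algorithm>
#include <cctype>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns a lowercase copy of `s` without modifying `s`.
static std::string lowered_copy(const std::string& s) {
    std::string lowered(s.size(), '\0');
    std::transform(s.begin(), s.end(), lowered.begin(), [](unsigned char c) {
        // std::tolower expects a value representable as unsigned char.
        return static_cast<char>(std::tolower(c));
    });
    return lowered;
}

// Hypothetical helper: fills a set with lowercase copies of the input strings.
static std::set<std::string> lowercase_set(const std::vector<std::string>& in) {
    std::set<std::string> out;
    for (const auto& item : in) {          // the const reference is never modified
        out.insert(lowered_copy(item));    // the temporary is moved into the set
    }
    return out;
}
```

The input vector is left unchanged, which is what the const requirement demands.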

Multithreaded idiomatic find first of substrings in a string using modern C++

It is easy to find a string in a set of strings using set::find, or the first of a set of strings within another set of strings using std::find_first_of. But as far as I can tell, the STL has no equivalent for finding the first of a set of substrings within a single string. For low-latency reasons I use parallel execution. Could you let me know whether this implementation is idiomatic modern C++?
#include <string>
#include <list>
#include <atomic>
#include <execution>
#include <iostream>
class Intent{
const std::list<std::string> m_Context;
const std::string m_Name;
std::atomic_bool m_Found;
public:
Intent(const std::list<std::string> context, const std::string name)
: m_Context(context)
, m_Name(name)
, m_Found(false)
{}
Intent(const Intent & intent) = delete;
Intent & operator=(const Intent & intent) = delete;
Intent(Intent && intent) : m_Context(std::move(intent.m_Context))
, m_Name(std::move(intent.m_Name))
, m_Found(static_cast< bool >(intent.m_Found))
{}
bool find(const std::string & sentence)
{
for_each( std::execution::par
, std::begin(m_Context)
, std::end(m_Context)
, [& m_Found = m_Found, & sentence](const std::string & context_element){
//
// Maybe after launching thread per context_element one of them make intent Found
// so no need to run string::find in the remaining threads.
//
if(!m_Found){
if(sentence.find(context_element) != std::string::npos)
{
m_Found = true;
}
}
}
);
return m_Found;
}
const bool getFound() const {return m_Found;}
const std::string & getName() const {return m_Name;}
};
int main()
{
Intent intent({"hello", "Hi", "Good morning"}, "GREETING");
std::cout << intent.find("Hi my friend.");
}
I think the idiomatic way of doing it would be to use std::find_if. Then you don't need the atomic<bool> either.
// return iterator to found element or end()
auto find(const std::string & sentence)
{
return std::find_if( std::execution::par
, std::begin(m_Context)
, std::end(m_Context)
, [&sentence](const std::string & context_element) {
return sentence.find(context_element) != std::string::npos;
}
);
}
If you really only want a bool you could use std::any_of:
bool find(const std::string & sentence)
{
return std::any_of( std::execution::par
, std::begin(m_Context)
, std::end(m_Context)
, [&sentence](const std::string & context_element) {
return sentence.find(context_element) != std::string::npos;
}
);
}
You may also want to consider a std::vector instead of a std::list: vectors provide random access iterators, while lists only provide bidirectional iterators.
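Putting these suggestions together, a reduced version of the class might look like the sketch below. It is a simplification, not the asker's code: the atomic flag and the deleted copy operations are dropped, and the execution policy is left out so the example builds without a parallel backend. Passing std::execution::par as the first argument of std::any_of (with <execution> included) would restore the parallel form; some toolchains need an extra library such as TBB for that.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

class Intent {
    std::vector<std::string> m_Context;
    std::string m_Name;
public:
    Intent(std::vector<std::string> context, std::string name)
        : m_Context(std::move(context)), m_Name(std::move(name)) {}

    // True if any context element occurs as a substring of `sentence`.
    bool find(const std::string& sentence) const {
        return std::any_of(m_Context.begin(), m_Context.end(),
            [&sentence](const std::string& context_element) {
                return sentence.find(context_element) != std::string::npos;
            });
    }

    const std::string& getName() const { return m_Name; }
};
```

Because find no longer stores its result in a member, it can be const and the class needs no synchronization of its own.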

How to Find a Substring in a String in a Multimap

How do you find a substring within the keys of a multimap? For example, if I enter "Louis", then Louisville, Louisberg, and StLouis should be found.
For what you want to do you will have to search within each and every key, not just the prefix or suffix. I don't think there's any way round that and it cannot be optimised since the search term can occur anywhere within the key.
You can use std::find_if and supply a predicate function for matching elements and iterate your map. I am using std::map in the code below but this could apply for a std::multimap too.
#include <map>
#include <string>
#include <algorithm>
#include <iostream>
int main()
{
std::map<std::string, std::string> myMap{
{"Louis", "AA"}, {"Louisville", "BBB"}, {"Louisberg", "A"},
{"StLouis", "C"}, {"Huntsville", "D"} };
std::string term("Louis");
auto keyContains = [&term](const std::pair<const std::string, std::string>& item) // matches the map's value_type, so no copy is made per element
{
return item.first.find(term) != std::string::npos;
};
auto iter = std::find_if(myMap.begin(), myMap.end(), keyContains);
while (iter != myMap.end())
{
std::cout << iter->first << std::endl;
iter = std::find_if(std::next(iter), myMap.end(), keyContains);
}
}
keyContains is a lambda function. If you're not familiar with lambdas, you can use a functor instead:
struct keyContains
{
keyContains(const std::string& searchTerm) : mSearchTerm(searchTerm) {}
bool operator() (const std::pair<const std::string, std::string>& item) const
{
return item.first.find(mSearchTerm) != std::string::npos;
}
std::string mSearchTerm;
};
Then initialise it like this: keyContains comp("Louis") and pass comp as the predicate.
Hope this helps. It is effectively a for loop that walks the map.
Update:
I just read your comment where you say your search should return 54049 results. That's a lot of records! To do that it is better to match against prefix or suffix. You can use std::map::lower_bound() and std::map::upper_bound().
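A prefix search along those lines could be sketched as follows. The helper name keys_with_prefix is made up for this example. It relies on the keys being sorted, so every key sharing a prefix sits in one contiguous range starting at lower_bound(prefix).

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: collects all keys of `m` that start with `prefix`.
static std::vector<std::string> keys_with_prefix(
        const std::multimap<std::string, std::string>& m,
        const std::string& prefix) {
    std::vector<std::string> keys;
    // Start at the first key not less than the prefix and stop at the
    // first key that no longer begins with it.
    for (auto it = m.lower_bound(prefix);
         it != m.end() && it->first.compare(0, prefix.size(), prefix) == 0;
         ++it) {
        keys.push_back(it->first);
    }
    return keys;
}
```

Note the trade-off: this only visits the matching range instead of every element, but it finds prefix matches only, so a key such as StLouis is not returned for the term Louis.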
Why bother with a multimap for that?
std::string search_term = "Louis";
std::unordered_set<std::string> data = { "Louisville", "Louisberg", "StLouis" };
for (std::string const& each : data) {
if (each.find(search_term) != std::string::npos) {
// contains it
}
}
I believe that would be a good option. If you really have to use a multimap, searching for substrings of keys is awkward, since that is not what multimaps are made for.

C++ - How to convert a function to a template function

I have a C++ function that takes a comma-separated string and splits it into a std::vector<std::string>:
std::vector<std::string> split(const std::string& s, const std::string& delim, const bool keep_empty = true) {
std::vector<std::string> result;
if (delim.empty()) {
result.push_back(s);
return result;
}
std::string::const_iterator substart = s.begin(), subend;
while (true) {
subend = std::search(substart, s.end(), delim.begin(), delim.end());
std::string temp(substart, subend);
if (keep_empty || !temp.empty()) {
result.push_back(temp);
}
if (subend == s.end()) {
break;
}
substart = subend + delim.size();
}
return result;
}
However, I would really like to be able to apply this function to multiple data types. For instance, if I have the input std::string:
1,2,3,4,5,6
then I'd like the output of the function to be a vector of ints. I'm fairly new to C++, but I know there is something called template types, right? Would it be possible to turn this function into a generic template? Or am I misunderstanding how template functions work?
You can declare the template function as:
template<class ReturnType>
std::vector<ReturnType> split(const std::string&, const std::string&, const bool = true);
and then specialize it for every vector type you want to allow:
template<>
std::vector<std::string> split(const std::string& s, const std::string& delim, const bool keep_empty) {
// normal string vector implementation
}
template<>
std::vector<int> split(const std::string& s, const std::string& delim, const bool keep_empty) {
// code for converting string to int
}
// ...
For the string-to-int conversion, std::stoi is one option.
You will then need to call split as:
auto vec = split<int>("1,2,3,4", ",");
You can "templatise" this function. To start, you just need to replace std::vector<std::string> with std::vector<T> and add template<typename T> before the function. But you need to take care of how the strings are put into the resulting vector. In your current implementation you just have
result.push_back(temp);
because result is vector of strings, and temp is string. In the general case though it is not possible, and if you want to use this function with e.g. vector<int> this line will not compile. However this problem is easily solved with another function - template again - which will convert string to whatever type you want to use split with. Let's call this function convert:
template<typename T> T convert(const std::string& s);
Then you need to provide specialisations of this function for any type you need. For instance:
template<> std::string convert(const std::string& s) { return s; }
template<> int convert(const std::string& s) { return std::stoi(s); }
In this way you do not need to specialise the entire function as the other answer suggests, only the part depending on the type. The same should be done for the line
result.push_back(s);
in the case without delimiters.
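Assembled into one piece, the approach described above might look like this sketch. It combines the convert specialisations from this answer with the split body from the question; only std::string and int are handled. One caveat to be aware of: std::stoi throws std::invalid_argument on an empty field, so keep_empty = true is only safe for int when the input has no empty fields.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Conversion hook: one specialisation per supported element type.
template<typename T> T convert(const std::string& s);
template<> std::string convert<std::string>(const std::string& s) { return s; }
template<> int convert<int>(const std::string& s) { return std::stoi(s); }

template<typename T>
std::vector<T> split(const std::string& s, const std::string& delim,
                     const bool keep_empty = true) {
    std::vector<T> result;
    if (delim.empty()) {
        result.push_back(convert<T>(s));   // no delimiter: convert the whole input
        return result;
    }
    std::string::const_iterator substart = s.begin(), subend;
    while (true) {
        subend = std::search(substart, s.end(), delim.begin(), delim.end());
        std::string temp(substart, subend);
        if (keep_empty || !temp.empty()) {
            result.push_back(convert<T>(temp));
        }
        if (subend == s.end()) {
            break;
        }
        substart = subend + delim.size();
    }
    return result;
}
```

A call such as split<int>("1,2,3,4,5,6", ",") then yields a std::vector<int>, while split<std::string> behaves like the original function.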
Your function can be generalized fairly easily to return a vector of an arbitrary type using Boost.LexicalCast. The only hiccup is this condition:
if (delim.empty()) {
result.push_back(s);
return result;
}
This only works right now because both the input and output types are std::string, but obviously cannot work if you're returning a vector containing a type other than std::string. Using boost::lexical_cast to perform such an invalid conversion will result in boost::bad_lexical_cast being thrown. So maybe you want to rethink that part, but otherwise the implementation is straightforward.
#include <algorithm>
#include <string>
#include <vector>
#include <boost/lexical_cast.hpp>
template<typename Result>
std::vector<Result>
split(const std::string& s, const std::string& delim, const bool keep_empty = true)
{
std::vector<Result> result;
if (delim.empty()) {
result.push_back(boost::lexical_cast<Result>(s));
return result;
}
std::string::const_iterator substart = s.begin(), subend;
while (true) {
subend = std::search(substart, s.end(), delim.begin(), delim.end());
std::string temp(substart, subend);
if (keep_empty || !temp.empty()) {
result.push_back(boost::lexical_cast<Result>(temp));
}
if (subend == s.end()) {
break;
}
substart = subend + delim.size();
}
return result;
}
Basically, all I've done is made the result type a template parameter and replaced
result.push_back(x);
with
result.push_back(boost::lexical_cast<Result>(x));
If you cannot use Boost, a std::stringstream can be used instead to convert a string to some other type.
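A minimal stream-based stand-in could look like the sketch below. The name from_string is invented here and this is not a drop-in replacement for boost::lexical_cast: it throws std::invalid_argument rather than boost::bad_lexical_cast, and because operator>> splits on whitespace it rejects strings that contain spaces or are empty when the target type is std::string.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts `s` to Result using stream extraction.
template<typename Result>
Result from_string(const std::string& s) {
    std::istringstream in(s);
    Result value{};
    char leftover;
    // The extraction must succeed, and nothing but whitespace may follow it.
    if (!(in >> value) || (in >> leftover)) {
        throw std::invalid_argument("cannot convert \"" + s + "\"");
    }
    return value;
}
```

Inside split, result.push_back(from_string<Result>(temp)) would then take the place of the boost::lexical_cast call.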

C++ std::find with a custom comparator

This is basically what I want to do:
bool special_compare(const string& s1, const string& s2)
{
// match with wild card
}
std::vector<string> strings;
strings.push_back("Hello");
strings.push_back("World");
// I want this to find "Hello"
find(strings.begin(), strings.end(), "hell*", special_compare);
// And I want this to find "World"
find(strings.begin(), strings.end(), "**rld", special_compare);
But std::find doesn't work like that unfortunately. So using only the STL, how can I do something like this?
Based on your comments, you're probably looking for this:
struct special_compare // std::unary_function was deprecated in C++11 and removed in C++17, so no base class is used
{
explicit special_compare(const std::string &baseline) : baseline(baseline) {}
bool operator() (const std::string &arg) const
{ return somehow_compare(arg, baseline); }
std::string baseline;
};
std::find_if(strings.begin(), strings.end(), special_compare("hell*"));
The function you need is std::find_if, because std::find doesn't take a comparison function.
But std::find_if doesn't take a value, only a unary predicate, so you cannot pass both a value and a comparator as in your example. Compare the two usages:
auto it1 = std::find(strings.begin(), strings.end(), "hell*");
auto it2 = std::find_if(strings.begin(), strings.end(), special_compare);
Hope that helps.
You'll need std::find_if(), which is awkward to use unless you're on a C++11 compiler. With C++11, you don't need to hardcode the search value in a comparator function or implement a functor object; you can put it in a lambda expression:
vector<string> strings;
strings.push_back("Hello");
strings.push_back("World");
find_if(strings.begin(), strings.end(), [](const string& s) {
return matches_wildcard(s, "hell*");
});
Then you write a matches_wildcard() somewhere.
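One way such a matches_wildcard() could be written is sketched below. The semantics are assumptions read off the question's examples rather than anything the answer specifies: '*' matches any run of characters, including an empty one, and all other characters are compared case-insensitively (so that "hell*" finds "Hello").

```cpp
#include <cctype>
#include <string>

// Sketch: '*' matches any (possibly empty) run of characters; every other
// pattern character must equal the text character, ignoring case.
bool matches_wildcard(const std::string& text, const std::string& pattern) {
    auto same = [](unsigned char a, unsigned char b) {
        return std::tolower(a) == std::tolower(b);
    };
    std::size_t t = 0, p = 0;
    std::size_t star = std::string::npos;  // index of the last '*' seen in pattern
    std::size_t mark = 0;                  // text index when that '*' was seen
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;
            mark = t;
        } else if (p < pattern.size() && same(text[t], pattern[p])) {
            ++t;
            ++p;
        } else if (star != std::string::npos) {
            p = star + 1;   // backtrack: let the last '*' absorb one more character
            t = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;  // trailing stars match nothing
    return p == pattern.size();
}
```

It can then be called from the find_if lambda exactly as in the snippet above.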
Since nobody has mentioned std::bind yet, I'll propose this one:
#include <functional>
bool special_compare(const std::string& s, const std::string& pattern)
{
// match with wild card
}
std::vector<std::string> strings;
auto i = find_if(strings.begin(), strings.end(), std::bind(special_compare, std::placeholders::_1, "hell*"));
With C++11 lambdas:
auto found = find_if(strings.begin(), strings.end(), [] (const std::string& s) {
return /* you can use "hell*" here! */;
});
If you can't use C++11 lambdas, you can just make a function object yourself. Make a type and overload operator ().
I wanted an example with a custom class and custom find logic but didn't find any answer like that. So I wrote this answer, which uses a custom comparator function (C++11) to find an object.
class Student {
private:
long long m_id;
// private fields
public:
long long getId() const { return m_id; }
};
Now suppose I want to find the Student object whose m_id matches a given id. I can write std::find_if like this:
// studentList is a std::vector<Student>
long long x_id = 3; // local variable
auto itr = std::find_if(studentList.begin(), studentList.end(),
[x_id](Student& std_val)
{ return std_val.getId() == x_id; }
);
if(itr == studentList.end())
printf("nothing found");