C++: is it possible to use "universal" pointer to vector? - c++

Good day, SO community!
I am new to C++ and I've run into a situation in my project where I have two vectors of the same paired data type:
std::vector<std::pair<int, std::string>> firstDataVector;
std::vector<std::pair<int, std::string>> secondDataVector;
and in one part of the code I need to select and process the vector depending on the external string value. So my question is - is it possible to create a pointer to vector outside of the conditions
if (stringValue.find("firstStringCondition") != std::string::npos)
{
//use firstDataVector
}
if (stringValue.find("secondStringCondition") != std::string::npos)
{
//use secondDataVector
}
some kind of pDataVector pointer, to which could be assigned the existing vectors (because now project has only two of them, but the vectors count might be increased)
I've tried to create a std::vector<std::string> &pDataVector pointer, but it does not work because a reference variable must be initialized. So, to summarize the question: is it possible to have a universal pointer to a vector?

You are trying to create a reference to one of the vectors, and that's certainly possible, but a reference must be bound to its vector at the point where it is declared. You can't defer the initialization.
It's unclear what you want to happen if no match is found in stringValue so I've chosen to throw an exception.
now project has only two of them, but the vectors count might be increased
Create a vector that maps each string you would like to find in stringValue to the vector you'd like to reference when that string is found.
When initializing pDataVector, you can call a functor, like a lambda, that returns the reference.
In the functor, loop over the vector holding the strings you'd like to try to find, and return the referenced vector on the first match you get.
It could look like this:
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>
int main() {
using vpstype = std::vector<std::pair<int, std::string>>;
vpstype firstDataVector{{1, "Hello"}};
vpstype secondDataVector{{2, "World"}};
// A vector of the condition strings you want to check keeping the order
// in which you want to check them.
std::vector<std::pair<std::string, std::reference_wrapper<vpstype>>>
conditions{
{"firstStringCondition", firstDataVector},
{"secondStringCondition", secondDataVector},
// add more mappings here
};
// an example stringValue
std::string stringValue = "ssdfdfsdfsecondStringConditionsdfsfsdf";
// initialize the vpstype reference:
auto& pDataVector = [&]() -> vpstype& {
// loop over all the strings and referenced vpstypes:
for (auto& [cond, vps] : conditions) {
if (stringValue.find(cond) != std::string::npos) return vps;
}
throw std::runtime_error("stringValue doesn't match any condition string");
}();
// and use the result:
for (auto [i, s] : pDataVector) {
std::cout << i << ' ' << s << '\n'; // prints "2 World"
}
}

You can indeed initialize references conditionally. Either use a function or lambda that returns the vector you want to reference, or hard-code it as below.
std::vector<std::pair<int, std::string>> &pDataVector =
(stringValue.find("firstStringCondition") != std::string::npos) ?
firstDataVector : ((stringValue.find("secondStringCondition") != std::string::npos) ?
secondDataVector : thirdDataVector);
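Since the question asks for a pointer, here is a minimal sketch of that variant, assuming it is acceptable to get a null pointer when nothing matches. Unlike a reference, a pointer can be declared first and assigned later. The names DataVector and selectVector are hypothetical and are not part of the original code.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using DataVector = std::vector<std::pair<int, std::string>>;

// Hypothetical helper: returns a pointer to the first vector whose condition
// string occurs in stringValue, or nullptr when nothing matches.
DataVector* selectVector(
    const std::string& stringValue,
    const std::vector<std::pair<std::string, DataVector*>>& conditions) {
    for (const auto& entry : conditions) {
        assert(entry.second != nullptr);  // the table must not hold null pointers
        if (stringValue.find(entry.first) != std::string::npos) {
            return entry.second;
        }
    }
    return nullptr;  // no condition string was found
}
```

A caller could write DataVector* pDataVector = nullptr; up front, assign pDataVector = selectVector(stringValue, conditions); later, and check the result for nullptr before dereferencing it.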


Combining regex and ranges causes memory issues

I wanted to construct a view over all the sub-matches of regex in text. Here are two ways to define such a view:
char const text[] = "The IP addresses are: 192.168.0.25 and 127.0.0.1";
std::regex regex{R"((\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}))"};
auto sub_matches_view =
std::ranges::subrange(
std::cregex_iterator{std::ranges::begin(text), std::ranges::end(text), regex},
std::cregex_iterator{}
) |
std::views::join;
auto sub_matches_sv_view =
std::ranges::subrange(
std::cregex_iterator{std::ranges::begin(text), std::ranges::end(text), regex},
std::cregex_iterator{}
) |
std::views::join |
std::views::transform([](std::csub_match const& sub_match) -> std::string_view { return {sub_match.first, sub_match.second}; });
sub_matches_view's value type is std::csub_match. It is created by first constructing a view of std::cmatch objects (via the regex iterator), and since each std::cmatch is a range of std::csub_match objects, it is flattened with std::views::join.
sub_matches_sv_view's value type is std::string_view. It is identical to sub_matches_view, except it also wraps each element of sub_matches_view in a std::string_view.
Here's a usage example of the above ranges:
for(auto const& sub_match : sub_matches_view) {
std::cout << std::string_view{sub_match.first, sub_match.second} << std::endl; // #1
}
for(auto const& sv : sub_matches_sv_view) {
std::cout << sv << std::endl; // #2
}
Loop #1 works without problems - the printed results are correct. However, loop #2 causes heap-use-after-free issues according to the Address Sanitizer. In fact, just looping over sub_matches_sv_view without accessing the elements at all causes this problem too. Here is the code on Compiler Explorer as well as the output of the Address Sanitizer.
I am out of ideas as to where my mistake is. text and regex never go out of scope, I don't see any iterators that might be accessed outside of their lifetimes. The std::csub_match object holds iterators (.first, .second) into text, so I don't think it needs to remain alive itself after constructing the std::string_view in std::views::transform.
I know there are many other ways to iterate over regex matches, but I am specifically interested in what's causing the memory bugs in my program, I don't need work-arounds for this issue.
The problem is std::regex_iterator and the fact that it is a stashing iterator: it stores the current match inside the iterator object itself.
That type basically looks like this:
class regex_iterator {
vector<match> matches;
public:
auto operator*() const -> vector<match> const& { return matches; }
};
What this means, for instance, is that even though this iterator's reference type is T const&, if you have two copies of the same iterator, they'll actually give you references into different objects.
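The claim about copies can be checked with a small sketch. It assumes the usual implementation in which operator* returns a reference to a match_results member held by the iterator. The function name copiesShareMatchObject is made up for this illustration.

```cpp
#include <cassert>
#include <regex>

// Returns true when an iterator and its copy hand out the same match object.
// Precondition: the regex matches at least once in [first, last).
bool copiesShareMatchObject(const char* first, const char* last,
                            const std::regex& re) {
    std::cregex_iterator original(first, last, re);
    assert(original != std::cregex_iterator{});  // dereferencing end is undefined
    std::cregex_iterator copy = original;
    // operator* refers to a match_results stored inside each iterator,
    // so the two addresses are expected to differ.
    return &*original == &*copy;
}
```

With such a stashing iterator the function is expected to return false: both iterators compare equal, yet each one refers to its own stored match.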
Now, join_view<R>::iterator basically looks like this:
class iterator {
// the iterator into the range we're joining
iterator_t<R> outer;
// an iterator into *outer that we're iterating over
iterator_t<range_reference_t<R>> inner;
};
Which, for regex_iterator, roughly looks like this:
class iterator {
// the regex matches
vector<match> outer;
// the current match
match* inner;
};
Now, what happens when you copy this iterator? The copy's inner still refers to the original's outer! These aren't actually independent in the way that you'd expect. Which means that if the original goes out of scope, we have a dangling iterator!
This is what you're seeing here: transform_view ends up copying the iterator (as it is certainly allowed to do), and now you have a dangling iterator (libc++'s implementation moves instead, which is why it happens to work in this case as 康桓瑋 pointed out). But we can reproduce the same issue without transform as long as we copy the iterator and destroy the original. For instance:
#include <ranges>
#include <regex>
#include <iostream>
#include <optional>
int main() {
std::string_view text = "The IP addresses are: 192.168.0.25 and 127.0.0.1";
std::regex regex{R"((\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}))"};
auto a = std::ranges::subrange(
std::cregex_iterator(std::ranges::begin(text), std::ranges::end(text), regex),
std::cregex_iterator{}
);
auto b = a | std::views::join;
std::optional i = b.begin();
std::cout << std::string_view((*i)->first, (*i)->second) << '\n'; // fine
auto j = *i;
i.reset();
std::cout << std::string_view(j->first, j->second) << '\n'; // boom
}
I'm not sure what a solution to this problem would look like, but the cause is the std::regex_iterator and not the views::join or the views::transform.

How to find the value for a key in unordered map?

I am trying to do the below sample in unordered map in c++
my_dict = {'one': ['alpha','gamma'], 'two': ['beta'], 'three' : ['charlie']}
print(my_dict["one"]) // ['alpha','gamma']
I tried using find operator like below
int main ()
{
std::unordered_map<std::string, std::vector<std::string>> dict;
dict["one"].push_back("alpha");
dict["one"].push_back("beta");
dict["two"].push_back("gamma");
auto it = dict.find("one");
cout<<it->second<<endl; // expected output alphabeta
return 0;
}
But I am not able to retrieve the value for this key dict["one"]. Am i missing anything ?
Any help is highly appreciated.
Thanks
The failure you are encountering is due to it->second being a std::vector object, which cannot be printed to std::cout because it lacks an overload for operator<<(std::ostream&,...).
Unlike languages like Python that do this for you, in C++ you must manually loop through the elements and print each entry.
To fix this, you will need to change this line:
cout<<it->second<<endl; // expected output alphabeta
To instead print each object in the container. This could be something simple like looping through all elements and printing them:
for (const auto& v : it->second) {
std::cout << v << ' '; // Note: this will leave an extra space at the end
}
std::cout << std::endl;
Or you can go more complex if the exact formatting is important.
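As one possible take on the more careful formatting, here is a sketch that joins the values with a separator and no trailing space, and that also handles a missing key (find returns end() in that case, which must not be dereferenced). The names Dict and joinValues are assumptions made for this example.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

using Dict = std::unordered_map<std::string, std::vector<std::string>>;

// Hypothetical helper: joins the values stored under key, placing sep
// between them. Returns an empty string when the key is absent.
std::string joinValues(const Dict& dict, const std::string& key,
                       const std::string& sep) {
    const auto it = dict.find(key);
    if (it == dict.end()) {
        return {};  // key not found: there is nothing to dereference
    }
    std::string joined;
    bool first = true;
    for (const auto& value : it->second) {
        if (!first) joined += sep;  // separator only between elements
        joined += value;
        first = false;
    }
    return joined;
}
```

For the dictionary from the question, joinValues(dict, "one", "") gives "alphabeta" and joinValues(dict, "one", " ") gives "alpha beta".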
@DanielLangr posted a link in the comments to the question that summarizes the possible ways of doing this, and I recommend taking a look if you want anything more complex: How do I print the contents to a Vector?
This is because it->first refers to the key of the dictionary, i.e. "one", and it->second refers to the value, i.e. the vector.
So to print the elements of the vector you also need to specify the indexes of the elements you are printing. The following code will give you the result you want:
int main() {
std::unordered_map <std::string, std::vector<std::string>> dict;
dict["one"].push_back("alpha");
dict["one"].push_back("beta");
dict["two"].push_back("gamma");
auto it = dict.find("one");
cout<<it->second[0]<<it->second[1]<<endl; // expected output alphabeta
return 0;
}

Can we get an iterator that filters a vector from a predicate in C++?

Is it possible to get an iterator over a vector that filters some element with a predicate, i.e. showing a view of the vector?
I think remove_if does something similar but I have not found whether I can use it as I want to or not.
Something like:
auto it = filter(vec.begin(), vec.end(), predicate);
// I can reuse the iterator like:
for (auto i = it; i != vec.end(); i++)
// ...
Edit: (A bit more context to get the best answer) I am doing a lot of queries in an sqlite database of log data in order to print a report.
The performance is not good at the moment because of the number of requests needed. I believe querying the database once and storing the result in a vector of smart pointers (unique_ptr if possible), then querying the vector with pure C++, may be faster.
Using copy_if is a good way to do the queries, but I don't need to copy everything and it might cost too much in the end (not sure about that). I should have mentioned that the data is immutable in my case.
As @Jarod42 mentioned in the comments, one solution would be using ranges:
#include <algorithm>
#include <iostream>
#include <vector>
#include <range/v3/view/filter.hpp>
#include <range/v3/view/transform.hpp>
int main()
{
std::vector<int> numbers = { 1, 2, 3 ,4, 5 };
auto predicate = [](int& n){ return n % 2 == 0; };
auto evenNumbers = numbers | ranges::view::filter(predicate);
auto result = numbers | ranges::view::filter(predicate)
| ranges::view::transform([](int n) { return n * 2; });
for (int n : evenNumbers)
{
std::cout << n << ' ';
}
std::cout << '\n';
for (int n : result)
{
std::cout << n << ' ';
}
}
evenNumbers is a range view adaptor that refers to the numbers range and changes the way it is iterated.
result is a range of the numbers that pass the predicate, each of which then has a function applied to it.
see the compile at compiler-explorer
credit: fluentcpp
Your question
Can we get an iterator that filters a vector from a predicate in C++?
in the sense you asked it, can only be answered with: No, not at the moment (C++17). To meet your requirement, the iterator would have to store the predicate and check it every time its position changes and before every dereference, because other code could modify your std::vector. Standard functionality like begin, end and distance would also be rather complicated.
So you could create your own iterator by deriving from an existing iterator: store the predicate and overload most of the functions so that they take the predicate into account. That is very complicated, a lot of work, and maybe not what you want. But it would be the only way to get exactly the functionality you requested.
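To give an idea of the effort, here is a deliberately minimal sketch of such a predicate-storing iterator. It is an assumption-laden illustration, not a complete solution: it wraps the vector's const_iterator instead of deriving from it, it is read-only and forward-only, and the name FilteredView is invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical read-only view over a vector that skips rejected elements.
// The vector must outlive the view, and the view must outlive its iterators.
template <class T>
class FilteredView {
public:
    using Predicate = std::function<bool(const T&)>;
    using BaseIt = typename std::vector<T>::const_iterator;

    class Iterator {
    public:
        Iterator(BaseIt pos, BaseIt end, const Predicate* pred)
            : pos_(pos), end_(end), pred_(pred) { skipRejected(); }
        const T& operator*() const {
            assert(pos_ != end_);  // the end iterator must not be dereferenced
            return *pos_;
        }
        Iterator& operator++() { ++pos_; skipRejected(); return *this; }
        bool operator!=(const Iterator& other) const { return pos_ != other.pos_; }
    private:
        // Advance until the predicate accepts an element or the end is reached.
        void skipRejected() {
            while (pos_ != end_ && !(*pred_)(*pos_)) ++pos_;
        }
        BaseIt pos_;
        BaseIt end_;
        const Predicate* pred_;
    };

    FilteredView(const std::vector<T>& data, Predicate pred)
        : data_(&data), pred_(std::move(pred)) {}

    Iterator begin() const { return Iterator(data_->begin(), data_->end(), &pred_); }
    Iterator end() const { return Iterator(data_->end(), data_->end(), &pred_); }

private:
    const std::vector<T>* data_;
    Predicate pred_;
};
```

A range-based for loop over FilteredView<int>(numbers, isEven) would then visit only the accepted elements without copying them. Modifying the vector while iterating still invalidates the iterators, as with any vector iterator.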
As for workarounds, there are many other possible solutions. People will show them to you here.
But if I read your statement
"showing a view of the vector"
then life becomes easier. You can easily create a view of a vector by copying it conditionally with std::copy_if, as oblivion has written. That is, in my opinion, the best answer. It is non-destructive. But it is a snapshot and not the original data, so it is read-only, and it does not take into account changes made to the original std::vector after the snapshot was taken.
The second option, a combination of std::remove_if and the vector's erase, will destroy the original data. Or, better said, it will remove the filtered-out data. You could also std::copy_if the unwanted data to a backup area, std::remove_if them, and at the end add them to the vector again.
All these methods are critical if the original data is modified.
Maybe for you the standard std::copy_if is the best way to create a view. You would then return an iterator into the copy and work with that.
#include <iostream>
#include <vector>
#include <algorithm>
#include <iterator>
int main()
{
std::vector<int> testVector{ 1,2,3,4,5,6,7 }; // Test data
std::vector<int> testVectorView{}; // The view
// Create predicate
auto predForEvenNumbers = [](const int& i) -> bool { return (i % 2 == 0); };
// And filter. Take a snapshot
std::copy_if(testVector.begin(), testVector.end(), std::back_inserter(testVectorView), predForEvenNumbers);
// Show example result
std::vector<int>::iterator iter = testVectorView.begin();
std::cout << *iter << '\n';
return 0;
}
Please note: for big std::vectors, this becomes a very expensive solution.

Creating a link between two vectors in c++

This is a conceptual question, so I am not providing the "working code" for this reason.
Imagine one has two std::vector of different types and different number of entities, just for example:
vector <int> A;
vector <string> B;
One has a set of rules by which any member of A can be associated with some (or none) of the members of B.
Is there a way to "store" this connection?
I was thinking that one way to do so is to have a vector <map <int, vector <string> > > or vector <map <int, vector <string*> > >, but this solution seems unreliable to me (for example, if A contains the same number twice), and I assume there are much more elegant solutions out there.
You could implement some database techniques: indices. Place your data into a single vector then create std::map for each way you want to index your data or relate the data.
Rather than 2 vectors, make one vector of structures:
struct Datum
{
int value;
string text;
};
// The database
std::vector<Datum> database;
// An index table by integer
std::map<int, // Key
unsigned int> index_by_value; // Value: index into the database vector
// An index table, by text
std::map<std::string, // Key
unsigned int> index_by_text; // Value: index into the database vector
The index tables give you a quick method to find things in the database, without having to sort the database.
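A rough sketch of how such an index could be built and queried follows. It uses std::multimap rather than std::map, which is an assumption made here so that duplicate values in the database are all kept. The helper names buildIndexByValue and textsFor are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct Datum {
    int value;
    std::string text;
};

// Hypothetical helper: maps each value to the positions where it occurs.
std::multimap<int, std::size_t> buildIndexByValue(
    const std::vector<Datum>& database) {
    std::multimap<int, std::size_t> index;
    for (std::size_t i = 0; i < database.size(); ++i) {
        index.emplace(database[i].value, i);
    }
    return index;
}

// Collects the text of every record whose value equals key.
std::vector<std::string> textsFor(const std::vector<Datum>& database,
                                  const std::multimap<int, std::size_t>& index,
                                  int key) {
    std::vector<std::string> texts;
    const auto range = index.equal_range(key);
    for (auto it = range.first; it != range.second; ++it) {
        assert(it->second < database.size());  // index must match this database
        texts.push_back(database[it->second].text);
    }
    return texts;
}
```

Because the index stores positions, it has to be rebuilt (or updated) whenever elements are inserted into or erased from the database vector.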
A std::multiset of std::pairs would be able to map multiple int*s to zero or more std::string*s:
std::multiset < std::pair<int*, std::vector<std::string*>>> map_A_to_B;
Example:
#include <set>
#include <vector>
#include <string>
#include <utility>
#include <iostream>
int main()
{
std::vector<int> A{3,3,1,5};
std::vector<std::string> B{"three a", "three b", "one", "five", "unknown"};
std::multiset < std::pair<int*, std::vector<std::string*>>> map_A_to_B{
{&A[0],{&B[0],&B[1]}},
{&A[1],{&B[0],&B[1],&B[4]}},
{&A[2],{&B[2]}},
{&A[3],{&B[3]}},
};
for(auto e : map_A_to_B) {
for(auto s : e.second) {
std::cout << *e.first << " linked to " << *s << '\n';
}
std::cout << "------------------------------\n";
}
}
produces:
3 linked to three a
3 linked to three b
------------------------------
3 linked to three a
3 linked to three b
3 linked to unknown
------------------------------
1 linked to one
------------------------------
5 linked to five
------------------------------
Based on your comment, it seems like you want an actual mapping (as in math, from a set A to a set B) that is general (not one-to-one or onto). First you have to conceptually understand what you want. First, you want a mapping between a class A (say int in your example) to B (string). Let's template this:
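template <class From, class To>
bool isMapped(const From& a, const To& b) {
    // Put the code representing the mapping here. As an example, treat a
    // (an int) as a character code and check whether it occurs in b (a string).
    return b.find(static_cast<char>(a)) != To::npos;
}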
Now the mapping of a From value to a To vector is (in math terms) the range in "To" which is reachable (isMapped) from this value:
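template <class From, class To>
std::vector<To> getRange(const From& value, const std::vector<To>& range) {
    std::vector<To> result;
    for (const auto& toValue : range) {
        if (isMapped(value, toValue))
            result.push_back(toValue);
    }
    return result; // returned by value; a reference to a local would dangle
}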
This will return the range the From value is mapped to in the To vector, with duplicates if they appear more than once in the range. Another option (maybe better) would be to iterate over indices instead of values in the range, and return a Boolean vector of the length of range with true in the indices where From is mapped to.
Similarly you would need to define the opposite mapping. Probably you couldn't make this completely general, and maybe even templates won't fit this simply - you would need to give more specifics.
So, to conclude, the mapping from A to B would be a vector with one entry per element of A (the domain), where each entry is a vector with one entry per element of B (the codomain), holding true/false at the relevant indices.
There are of course, more possibilities.
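The true/false table described above could be sketched as follows, assuming A holds ints and B holds strings as in the question. The rule is passed in as a callable. The name buildRelation is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: relation[i][j] is true when a[i] is mapped to b[j].
std::vector<std::vector<bool>> buildRelation(
    const std::vector<int>& a, const std::vector<std::string>& b,
    const std::function<bool(int, const std::string&)>& isMapped) {
    assert(isMapped);  // an empty std::function would throw when called
    std::vector<std::vector<bool>> relation(
        a.size(), std::vector<bool>(b.size(), false));
    for (std::size_t i = 0; i < a.size(); ++i) {
        for (std::size_t j = 0; j < b.size(); ++j) {
            relation[i][j] = isMapped(a[i], b[j]);
        }
    }
    return relation;
}
```

Because the table is addressed by index rather than by value, duplicate numbers in A are not a problem: each occurrence gets its own row.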
You could use Boost to implement a bidirectional map - that would allow you to use either of the values as a key. Here is an example of how to use it. But, in short: (usage only, without definitions)
struct from {}; // tag for boost
typedef bidirectional_map<int, std::string>::type bi_map;
bi_map values;
values.insert(bi_map::value_type(123, "{"));
// ...
// ...
bi_map::iterator it = values.get<from>().find(123);
if (it != values.end()) {
cout << "Char #123 is " << it->second << endl;
// and in the opposite case, where "it" is the result of:
// values.get<to>().find("{")
// it->second would be 123, so you have access to both items
}

Is it possible to iterate over an iterator?

I have a working program that capitalizes strings in a vector, using iterators:
vector<string> v7{ 10, "apples" };
for (auto vIterator= v7.begin(); vIterator!= v7.end(); ++vIterator){
auto word = *vIterator; //here
auto charIterator = word.begin();
*charIterator = toupper(*charIterator);
*vIterator = word; //also here, i guess i could just print `word` instead?
cout << *vIterator << endl;
}
My question is;
On the 2nd line inside the loop, at the comment, I had to save the dereferenced iterator to another string variable before I was able to iterate over it.
Iterating over the pointer like so
*vIterator.begin();
didn't seem to work.
Is this the correct practice, or am i missing something?
I'm new to the C languages, the concept behind pointer-like tools is quite hard to understand even if i can use them, and in this case it just feels like I'm doing it wrong.
Edit: It was a syntax error (*vIterator).begin();
It just didn't make sense why i'd have to save it to another variable before iterating over it, cheers.
Since you are using C++11, look how much simpler your code can become using range-based for loops, like the example below:
std::vector<std::string> v(10, "apples");
for(auto &&word : v) {
word[0] = toupper(word[0]);
}
Now, as for why *vIterator.begin(); didn't seem to work:
The dot operator (i.e., .) has a higher precedence than the dereference operator (i.e., *). Thus, *vIterator.begin() is interpreted as *(vIterator.begin()). The compiler rightfully complains because vIterator hasn't got a member begin().
Think of iterators as if they were pointers. The correct way to access the members of an object via a pointer/iterator pointing to it is either using the arrow operator (i.e., vIterator->begin()) or first dereference the pointer/iterator and then use the dot operator (i.e., (*vIterator).begin()).
So your code via the use of iterators would become:
std::vector<std::string> v(10, "apples");
for(auto it(v.begin()), ite(v.end()); it != ite; ++it) {
*(it->begin()) = toupper(*(it->begin()));
}
The correct way to write *vIterator.begin(); is (*vIterator).begin(); or, more often, vIterator->begin();. Also note that you can also access the first character of a string directly (without having to iterate over it) as word[0].
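Putting the two spellings side by side, here is a small sketch that also guards against empty strings and casts to unsigned char before calling std::toupper. Both are additions that the code in the question does not have. The function name capitalizeFirst is made up for this example.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: upper-cases the first character of every string.
void capitalizeFirst(std::vector<std::string>& words) {
    for (auto it = words.begin(); it != words.end(); ++it) {
        if (it->empty()) continue;  // begin() of an empty string must not be dereferenced
        auto first = it->begin();   // arrow form
        assert(first == (*it).begin());  // dereference-then-dot names the same position
        // std::toupper expects a value representable as unsigned char
        *first = static_cast<char>(
            std::toupper(static_cast<unsigned char>(*first)));
    }
}
```

Working through the iterator modifies the strings in place, so no temporary copy of each word is needed.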
A simple STL-ish way of doing it:
#include <iostream>
#include <vector>
#include <string>
#include <cctype>
#include <algorithm>
using namespace std;
int main()
{
vector<string> v7{ 10, "apples" };
for_each(v7.begin(), v7.end(), [](string& word){word[0] = toupper(word[0]);});
}