c++ variant class member stored by reference - c++

I am experimenting with std::variant, storing one as a member of a class. In the code below, things work fine if the variant is stored by value, but not (for the vector case, and for custom objects too) if it is stored by reference. Why is that?
#include <variant>
#include <vector>
#include <iostream>
template<typename T>
using VectorOrSimple = std::variant<T, std::vector<T>>;
struct Print {
void operator()(int v) { std::cout << "type = int, value = " << v << "\n"; }
void operator()(std::vector<int> v) const { std::cout << "type = vector<int>, size = " << v.size() << "\n"; }
};
class A {
public:
explicit A(const VectorOrSimple<int>& arg) : member(arg) {
print();
}
inline void print() const {
visit(Print{}, member);
}
private:
const VectorOrSimple<int> member; // const VectorOrSimple<int>& member; => does not work
};
int main() {
int simple = 1;
A a1(simple);
a1.print();
std::vector<int> vector(3, 1);
A a2(vector);
a2.print();
}
See http://melpon.org/wandbox/permlink/vhnkAnZhqgoYxU1H for a working version, and http://melpon.org/wandbox/permlink/T5RCx0ImTLi4gk5e for a crashing version with error : "terminate called after throwing an instance of 'std::bad_variant_access'
what(): Unexpected index"
Strangely, when writing a boost::variant version of the code with the member stored as a reference, it works as expected (prints vector size = 3 twice) with gcc7.0 (see here http://melpon.org/wandbox/permlink/eW3Bs1InG383vp6M) and does not work (prints vector size = 3 in constructor and then vector size = 0 on the subsequent print() call, but no crash) with clang 4.0 (see here http://melpon.org/wandbox/permlink/2GRf2y8RproD7XDM).
This is quite confusing. Can someone explain what is going on?
Thanks.

It doesn't work because the statement A a1(simple); creates a temporary variant object: simple is an int, so it has to be converted to a VectorOrSimple<int> to match the constructor parameter.
You then bind that temporary to your const reference member. The temporary is destroyed as soon as the construction of a1 is over, leaving you with a dangling reference. Storing a copy works because the member is then an object of its own, independent of the temporary.
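To make the temporary visible, here is a minimal sketch. The alias IntOrVector and the helper refers_to are made up for illustration; the helper only checks whether the reference parameter is bound to the object the caller passed.

```cpp
#include <cassert>
#include <variant>
#include <vector>

using IntOrVector = std::variant<int, std::vector<int>>;

// True when the reference parameter is bound to the object at `expected`.
bool refers_to(const IntOrVector& v, const void* expected) {
    return static_cast<const void*>(&v) == expected;
}

void demo_temporary() {
    int simple = 1;
    // int -> variant is an implicit conversion, so `v` is bound to a
    // temporary variant that dies at the end of this full expression.
    assert(!refers_to(simple, &simple));

    IntOrVector existing = simple;
    // An existing variant needs no conversion; `v` is bound to it directly.
    assert(refers_to(existing, &existing));
}
```

A reference member initialized from the first kind of argument keeps pointing at that temporary after it is gone.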
A possible solution (if the performance of always copying worries you) is to accept a variant object by-value, and then move it into your local copy, like so:
explicit A(VectorOrSimple<int> arg) : member(std::move(arg)) {
print();
}
This will allow your constructor to be called with either lvalues or rvalues. For lvalues your member will be initialized by moving a copy of the source variant, and for rvalues the contents of the source will just be moved (at most) twice.
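The copy and move counts can be checked with a small sketch. Counted and Holder are hypothetical stand-ins for the variant and for class A, and the counts assume C++17.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type that counts how often it is copied and moved.
struct Counted {
    static inline int copies = 0;
    static inline int moves = 0;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
};

class Holder {
public:
    // Sink parameter: taken by value, then moved into the member.
    explicit Holder(Counted arg) : member(std::move(arg)) {}
    const Counted& get() const { return member; }
private:
    Counted member;
};

void demo_sink_parameter() {
    Counted source;

    Counted::copies = Counted::moves = 0;
    Holder from_lvalue(source);             // copy into arg, move into member
    assert(Counted::copies == 1 && Counted::moves == 1);

    Counted::copies = Counted::moves = 0;
    Holder from_rvalue(std::move(source));  // move into arg, move into member
    assert(Counted::copies == 0 && Counted::moves == 2);
}
```

For a variant holding a std::vector, each of those moves is cheap compared with copying the vector.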

Variants are objects. They contain one of a set of types, but they are not one of those types.
A reference to a variant is a reference to the variant object, not a reference to one of the contained types.
A variant of reference wrappers may be what you want:
template<class...Ts>
using variant_ref=std::variant<std::reference_wrapper<Ts>...>;
template<typename T>
using VectorOrSimple = std::variant<T, std::vector<T>>;
template<typename T>
using VectorOrSimpleRef = variant_ref<T, std::vector<T>>;
template<typename T>
using VectorOrSimpleConstRef = variant_ref<const T, const std::vector<T>>;
Now store a VectorOrSimpleConstRef<int> (not a const&), and take one in the constructor as well.
Also modify Print to take by const& to avoid needlessly copying that std::vector when printing.
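Put together, that could look roughly like the sketch below. The visitor Describe (returning a string instead of printing) and the use of std::cref at the call site are my own choices for illustration, not part of the original code.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <variant>
#include <vector>

template <class... Ts>
using variant_ref = std::variant<std::reference_wrapper<Ts>...>;

template <typename T>
using VectorOrSimpleConstRef = variant_ref<const T, const std::vector<T>>;

// Visitor taking its arguments by const&; each reference_wrapper converts
// implicitly to the reference it wraps, so nothing is copied.
struct Describe {
    std::string operator()(const int& v) const {
        return "int " + std::to_string(v);
    }
    std::string operator()(const std::vector<int>& v) const {
        return "vector<int> of size " + std::to_string(v.size());
    }
};

class A {
public:
    explicit A(VectorOrSimpleConstRef<int> arg) : member(arg) {}
    std::string describe() const { return std::visit(Describe{}, member); }
private:
    VectorOrSimpleConstRef<int> member;  // stored by value; it wraps a reference
};

void demo_variant_ref() {
    int simple = 1;
    std::vector<int> vec(3, 1);

    A a1(std::cref(simple));  // refers to `simple`, which outlives a1
    A a2(std::cref(vec));     // refers to `vec`, which outlives a2
    assert(a1.describe() == "int 1");
    assert(a2.describe() == "vector<int> of size 3");

    vec.push_back(7);         // a2 observes the change: it holds a reference
    assert(a2.describe() == "vector<int> of size 4");
}
```

The referenced objects still have to outlive the A instance; the wrapper only makes explicit what is being referred to.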

Related

C++ Get reference of object inside vector using function

Can you get a reference to an object that is within a vector through a function? I could do this with pointers easily, but you know, we're all obsessed here with "Don't use pointers".
This is a simple example. The absolute limitation is it must be done from a function call (so that function call can return false if not found).
// Example program
#include <iostream>
#include <string>
#include <vector>
class Dev {
public:
Dev(){}
std::string name;
};
void geter(std::vector<Dev> &devs, Dev &a){
a = devs.at(0);
}
int main()
{
Dev d;
d.name = "original name";
std::vector<Dev> devs;
devs.push_back(d);
Dev a;
geter(devs, a);
a.name = "new name";
std::cout << d.name; // still prints "original name"
}
Can you get a reference to an object that is within a vector through a function?
Yes, but you cannot return that reference via a function parameter. A reference can be bound to an element only at the point of initializing the reference. Once you are inside a function, it is too late to initialize the function's parameters, too late to bind a reference. You can return a reference via a function's return value, but not via an output parameter.
References also fail to cover the "not found" possibility, as a reference must be bound to something.
The language feature that allows the functionality you are looking for is called a "pointer".
References are useful if they can be bound during initialization, never need to change what they are bound to, and never need to be in a state of not being bound. The first parameter to your geter function is an example of this.
Pointers are useful if they need to point to different objects during their lifetime, or if they might need to be in an "unbound" state (a.k.a. be null). Think of a pointer as a reference that can be reseated (refer to a different object than it did before) and that can be unseated (refer to no object). The intended functionality of the second parameter to your geter function is an example of this.
I could do this with pointers easily,
Good. You know the right tool for the job. Do it.
but you know, we're all obsessed here with "Don't use pointers".
No, I do not know that. In fact, that is bad advice when stated that broadly. Pointers still have their place in modern C++. The "obsession" you probably are referring to is "don't use owning pointers". That is, don't use a pointer if you have to remember to delete the thing to which the pointer points. If there is no ownership involved (i.e. no responsibility for freeing memory), then there is nothing inherently bad about using pointers. In fact, pointers are often a more appropriate choice than references when "does not exist" is a valid possibility (just remember to check for null, which your logic would call for anyway).
Note: There are other "obsessions" that fall under "don't use pointers", but I don't see another that is relevant here. For the sake of an example: there is also "don't use a pointer when a reference will get the job done." This is good advice, but in this case a reference will not get the job done.
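A minimal sketch of the pointer-returning approach, assuming a lookup by name; the function find_dev is hypothetical and not from the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Dev {
    std::string name;
};

// Hypothetical lookup: returns a non-owning pointer to the first Dev with the
// given name, or nullptr when there is none. The caller must not delete it.
Dev* find_dev(std::vector<Dev>& devs, const std::string& name) {
    for (Dev& d : devs) {
        if (d.name == name) return &d;
    }
    return nullptr;
}

void demo_find_dev() {
    std::vector<Dev> devs;
    devs.push_back(Dev{"original name"});

    if (Dev* found = find_dev(devs, "original name")) {
        found->name = "new name";  // modifies the element inside the vector
    }
    assert(devs[0].name == "new name");
    assert(find_dev(devs, "missing") == nullptr);
}
```

As with any pointer or reference into a vector, the result is only valid until the vector reallocates or the element is erased.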
Nah. Doing it via parameters is silly. It's the year 2020 and you have std::optional and such.
I'd do it as follows. First, some helper code:
#include <type_traits>
#include <vector>
template <typename T> using const_qualified_value_type_impl =
std::conditional_t<std::is_const_v<T>,
typename std::add_const_t<typename T::value_type>,
typename T::value_type>;
template <typename T> using const_qualified_value_type =
const_qualified_value_type_impl<std::remove_reference_t<T>>;
static_assert(std::is_same_v<const_qualified_value_type<std::vector<int>>, int>, "");
static_assert(std::is_same_v<const_qualified_value_type<const std::vector<int>>, const int>, "");
#include <functional>
#include <optional>
template <typename T> class optional_ref
{
std::optional<std::reference_wrapper<T>> val;
public:
template <typename ...Args> constexpr optional_ref(Args &&...args) :
val(std::forward<Args>(args)...) {}
constexpr explicit operator bool() const { return static_cast<bool>(val); }
constexpr auto has_value() const { return val.has_value(); }
constexpr auto &value() const { return val.value().get(); }
constexpr auto &get() const { return val.value().get(); }
constexpr operator T&() const { return val.value().get(); }
};
Now we have an optional_ref type (a very rudimentary one, but still),
and we can use it when creating the get_first getter:
template <typename C>
auto get_first(C && container) -> optional_ref<const_qualified_value_type<C>>
{
auto const begin = container.begin();
static_assert(std::is_reference_v<decltype(*begin)>, "*begin() must return a reference");
if (begin == container.end()) return {};
return *begin;
}
Now a basic test:
#include <cassert>
int main()
{
std::vector<int> vect;
assert(!get_first(vect));
vect.push_back(0);
assert(get_first(vect));
assert(get_first(vect) == 0);
int &first = get_first(vect).value();
++first;
assert(get_first(vect) == 1);
const std::vector<int> cempty;
assert(!get_first(cempty));
const std::vector<int> cnon_empty{0};
assert(get_first(cnon_empty));
assert(get_first(cnon_empty) == 0);
auto &cfirst = get_first(cnon_empty).value();
static_assert(std::is_same_v<decltype(cfirst), const int &>, "");
}
That way:
get_first's return value is bool-convertible in boolean contexts, i.e. you can use it as if it were a bool to check if a reference is valid.
get_first's return value is convertible to a reference to the value stored in the container, and the value type is automatically const-qualified if the container is const-qualified. That typically is what you'd want, although don't take my word for it.
Ideally, optional_ref should be implemented using a pointer, but for demonstration purposes it was quicker to reuse std::optional and std::reference_wrapper.
But the above works (in a rudimentary fashion) under gcc, clang and msvc.
You'd use it as follows:
auto val = get_first(foo);
if (val.has_value())
{
auto &v = val.value();
// use v
}
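For what it's worth, the pointer-based implementation mentioned above could be sketched like this. The names ptr_optional_ref and first_of are invented here, and the getter is limited to std::vector to keep the example short.

```cpp
#include <cassert>
#include <memory>
#include <optional>     // std::bad_optional_access
#include <type_traits>
#include <vector>

// Hypothetical pointer-based variant of optional_ref: null means "no value".
template <typename T>
class ptr_optional_ref {
    T* ptr = nullptr;
public:
    constexpr ptr_optional_ref() noexcept = default;
    constexpr ptr_optional_ref(T& ref) noexcept : ptr(std::addressof(ref)) {}
    ptr_optional_ref(T&&) = delete;  // refuse to refer to a temporary

    constexpr explicit operator bool() const noexcept { return ptr != nullptr; }
    constexpr bool has_value() const noexcept { return ptr != nullptr; }
    constexpr T& value() const {
        if (!ptr) throw std::bad_optional_access{};
        return *ptr;
    }
};

template <typename T>
ptr_optional_ref<T> first_of(std::vector<T>& v) {
    if (v.empty()) return {};
    return v.front();
}

template <typename T>
ptr_optional_ref<const T> first_of(const std::vector<T>& v) {
    if (v.empty()) return {};
    return v.front();
}

void demo_first_of() {
    std::vector<int> vect;
    assert(!first_of(vect));
    vect.push_back(0);
    auto first = first_of(vect);
    assert(first.has_value());
    ++first.value();               // modifies the element in the vector
    assert(vect.front() == 1);

    const std::vector<int> cvect{7};
    static_assert(std::is_same_v<decltype(first_of(cvect).value()), const int&>);
    assert(first_of(cvect).value() == 7);
}
```

Compared with wrapping std::optional<std::reference_wrapper<T>>, this stores a single pointer and needs no variadic constructor.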

Is there a way to return either a new object or reference to existing object from a function?

I am trying to write a function, which can either return a reference to an existing object passed as a first argument (if it is in correct state) or create and return a new object using literal passed as a second argument (default).
It would be even better if a function could take not only literal, but also another existing object as a second (default) argument and return a reference to it.
Below is a trivial implementation, but it does a lot of unneeded work:
If called with lvalue as a second (default) argument, it calls a copy constructor of an argument, that is selected for return. Ideally a reference to an object should be returned.
If called with literal as a second (default) argument, it calls constructor, copy constructor and destructor, even if second (default) argument is not selected for return. It would be better if an object is constructed and returned as rvalue reference without calling copy constructor or destructor.
std::string get_or_default(const std::string& st, const std::string& default_st) {
if (st.empty()) return default_st;
else return st;
}
Is there a way to accomplish this more efficiently, while still keeping it simple for the caller? If I am correct, this requires the function to change its return type based on a run-time decision made inside the function, but I cannot think of a solution that stays simple for the caller.
I'm not 100% sure I understood the combinations of requirements but:
#include <iostream>
#include <string>
#include <type_traits>
// if called with an rvalue (xvalue) as 2:nd arg, move or copy
std::string get_or_default(const std::string& st, std::string&& default_st) {
std::cout << "got temporary\n";
if(st.empty())
return std::move(default_st); // rval, move ctor
// return std::forward<std::string>(default_st); // alternative
else
return st; // lval, copy ctor
}
// lvalue as 2:nd argument, return the reference as-is
const std::string& get_or_default(const std::string& st,
const std::string& default_st) {
std::cout << "got ref\n";
if(st.empty()) return default_st;
else return st;
}
int main() {
std::string lval = "lval";
// get ref or copy ...
decltype(auto) s1 = get_or_default("", "temporary1");
decltype(auto) s2 = get_or_default("", std::string("temporary2"));
decltype(auto) s3 = get_or_default("", lval);
std::cout << std::boolalpha;
std::cout << std::is_reference_v<decltype(s1)> << "\n";
std::cout << std::is_reference_v<decltype(s2)> << "\n";
std::cout << std::is_reference_v<decltype(s3)> << "\n";
}
Output:
got temporary
got temporary
got ref
false
false
true
Edit: Made a slightly more generic version after the OP's testing. It can use a lambda, like
auto empty_check = [](const std::string& s) { return s.empty(); };
to test if the first argument is empty.
template<typename T, typename F>
T get_or_default(const T& st, T&& default_st, F empty) {
if(empty(st)) return std::move(default_st);
// return std::forward<T>(default_st); // alternative
else return st;
}
template<typename T, typename F>
const T& get_or_default(const T& st, const T& default_st, F empty) {
if(empty(st)) return default_st;
else return st;
}
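A quick way to see which of the two generic overloads gets picked is to inspect the result type, as in this sketch. The two templates are repeated from above so that it compiles on its own; the demo function and variable names are mine.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

template <typename T, typename F>
T get_or_default(const T& st, T&& default_st, F empty) {
    if (empty(st)) return std::move(default_st);
    else return st;
}

template <typename T, typename F>
const T& get_or_default(const T& st, const T& default_st, F empty) {
    if (empty(st)) return default_st;
    else return st;
}

void demo_generic_get_or_default() {
    auto empty_check = [](const std::string& s) { return s.empty(); };
    std::string st;                 // empty, so the default is chosen
    std::string fallback = "lval";

    // Rvalue default: T deduces to std::string, the result is a new object.
    decltype(auto) by_value =
        get_or_default(st, std::string("temporary"), empty_check);
    static_assert(std::is_same_v<decltype(by_value), std::string>);
    assert(by_value == "temporary");

    // Lvalue default: only the const T& overload deduces, so the result
    // is a reference to `fallback` itself.
    decltype(auto) by_ref = get_or_default(st, fallback, empty_check);
    static_assert(std::is_same_v<decltype(by_ref), const std::string&>);
    assert(&by_ref == &fallback);
}
```

Note that both arguments have to be of the same type T for deduction to succeed, so a string literal has to be wrapped in std::string(...) here.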
Well, there are a few things here.
To express what you ask for directly, you could use something like std::variant<std::string, std::reference_wrapper<const std::string>> as your function's return type. (A std::variant cannot hold a reference such as std::string& directly, hence the wrapper.)
Or some equivalent from a third-party library, such as an either<> type.
You can also write your own class wrapping a string and a string reference.
(Not real code)
struct StringOrRef {
enum class Type {Value, Ref} type;
union {
std::string value;
std::reference_wrapper<const std::string> ref;
};
...
};
Look up the topic: discriminated unions in C++.
But I think there is a bigger problem with your example!
Please consider the ownership of the data. A std::string owns its data, which is why it copies it. Thus when your function returns a std::string, the caller is sure to have the data and does not need to worry about it for as long as (s)he holds the value.
If you design a function to return a reference to a passed argument, you need to make sure that the returned value is only used within the lifetime of that argument.
So consider:
StringOrRef func(string const& a, string const& b);
...
StringOrRef val;
{ // begin scope:
SomeStruct s = get_default();
val = func("some value", s.get_ref_to_internal_string());
} // end of scope: s is destroyed here
val; // is in scope but may be referencing freed data.
The problem here is the local object s. If its member function get_ref_to_internal_string() returns a string& referring to a string field of that object (which is how it is often implemented), then the reference becomes invalid when s goes out of scope. That is, it refers to freed memory that may have been handed to some other object.
And if you capture that reference in val, val will be referencing invalid data.
You will be lucky if it all ends in an access violation or a signal. At worst your program continues but crashes randomly.
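If you still want the "value or reference" return type, a rough sketch with std::variant and std::reference_wrapper could look like this. The aliases StringRef and StringOrRef and the helper view are assumptions of mine, not an established API.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <variant>

using StringRef = std::reference_wrapper<const std::string>;
// Hypothetical alias: either an owned string or a non-owning reference to one.
using StringOrRef = std::variant<std::string, StringRef>;

// Read access that works for both alternatives.
const std::string& view(const StringOrRef& v) {
    if (const std::string* owned = std::get_if<std::string>(&v)) return *owned;
    return std::get<StringRef>(v).get();
}

// Lvalue default: never copies; the result refers to st or default_st.
StringOrRef get_or_default(const std::string& st, const std::string& default_st) {
    return StringOrRef(std::in_place_type<StringRef>,
                       st.empty() ? default_st : st);
}

// Rvalue default: the default is moved into the result if it is needed.
StringOrRef get_or_default(const std::string& st, std::string&& default_st) {
    if (st.empty()) {
        return StringOrRef(std::in_place_type<std::string>, std::move(default_st));
    }
    return StringOrRef(std::in_place_type<StringRef>, st);
}

void demo_string_or_ref() {
    std::string empty, value = "value", fallback = "fallback";

    StringOrRef a = get_or_default(value, fallback);  // refers to `value`
    assert(&view(a) == &value);

    StringOrRef b = get_or_default(empty, fallback);  // refers to `fallback`
    assert(&view(b) == &fallback);

    StringOrRef c = get_or_default(empty, std::string("temporary"));
    assert(std::holds_alternative<std::string>(c));   // owns its string
    assert(view(c) == "temporary");
}
```

The lifetime caveat still applies: when the result holds a reference, it must not outlive the string it refers to, and passing a temporary as the first argument would leave it dangling.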

Why is the reference to an emplaced instance invalidated?

I have a class that stores some data and also a member that needs to modify some of the parent class data. Consider the following simplified example:
#include <iostream>
#include <vector>
#include <string>
struct Modifier {
std::vector<std::string> &stuff;
Modifier(std::vector<std::string> &ref) : stuff(ref) {}
void DoIt() {
std::cout << "stuff.size = " << stuff.size() << '\n';
}
};
struct Container {
std::vector<std::string> stuff;
Modifier modifier;
std::vector<std::string> BuildStuff(int n) {
return std::vector<std::string>{"foo", std::to_string(n)};
}
Container(int n) : stuff(BuildStuff(n)), modifier(stuff) {}
};
int main()
{
std::vector<Container> containers;
containers.emplace_back(5);
containers.emplace_back(42);
containers[0].modifier.DoIt();
containers[1].modifier.DoIt();
return 0;
}
When I run this, one of the emplaced instances correctly reports size 2, but the other one reports size 0. I assume there's some undefined behaviour happening due to emplacing, but I cannot pinpoint what is the root cause.
Also, is there a more elegant way to represent this scenario?
Live example: http://coliru.stacked-crooked.com/a/e68ae9bf2b7e6b75
When you do the second emplace_back, the vector may undergo a reallocation operation: in order to grow, it allocates a new memory block and moves the objects from the old to the new, and frees the old memory block.
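Your Modifier object ends up with a dangling reference when moved: the new object's reference refers to the same object that the old reference did, i.e. the stuff vector inside the old Container, which is destroyed when the old memory block is freed.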
To fix this you could add a move-constructor to Container, and either add or delete the copy-constructor. The Modifier has to be initialized to refer to the Container it is a member of; but the default copy- and move-constructors will initialize the Modifier to refer to the source being copy/move'd from.
For example:
Container(Container&& o) : stuff(std::move(o.stuff)), modifier(stuff) {}
Container(Container const& o) : stuff(o.stuff), modifier(stuff) {}
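Here is the whole thing as a self-contained sketch. Compared with the question's code I made BuildStuff static, marked the move constructor noexcept and replaced DoIt with a Size accessor so the result can be checked; those are my changes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Modifier {
    std::vector<std::string>& stuff;
    explicit Modifier(std::vector<std::string>& ref) : stuff(ref) {}
    std::size_t Size() const { return stuff.size(); }
};

struct Container {
    std::vector<std::string> stuff;  // declared first, so initialized first
    Modifier modifier;

    static std::vector<std::string> BuildStuff(int n) {
        return std::vector<std::string>{"foo", std::to_string(n)};
    }

    explicit Container(int n) : stuff(BuildStuff(n)), modifier(stuff) {}
    // Both constructors bind modifier to this object's own stuff.
    Container(Container&& o) noexcept
        : stuff(std::move(o.stuff)), modifier(stuff) {}
    Container(Container const& o) : stuff(o.stuff), modifier(stuff) {}
};

void demo_container() {
    std::vector<Container> containers;
    containers.emplace_back(5);
    containers.emplace_back(42);  // may reallocate and move the first element
    for (const Container& c : containers) {
        assert(&c.modifier.stuff == &c.stuff);  // refers to its own container
        assert(c.modifier.Size() == 2);
    }
}
```

Assignment stays unavailable because of the reference member, which is fine for emplace_back but rules out things like erasing from the middle of the vector.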

c++11 parameter pack wrong behaviour with Apple LLVM 7.0.0 but works with GCC-5.1

Going from a previous version of this question, thanks to @Gene, I was now able to reproduce this behaviour using a simpler example.
#include <iostream>
#include <vector>
class Wrapper
{
std::vector<int> const& bc;
public:
Wrapper(std::vector<int> const& bc) : bc(bc) { }
int GetSize() const { return bc.size(); }
};
class Adapter
{
Wrapper wrapper;
public:
Adapter(Wrapper&& w) : wrapper(w) { }
int GetSize() const { return wrapper.GetSize(); }
};
template <class T>
class Mixin : public Adapter
{
public:
//< Replace "Types ... args" with "Types& ... args" and it works even with Apple LLVM
template <class ... Types>
Mixin(Types ... args) : Adapter(T(args...)) { }
};
int main()
{
std::vector<int> data;
data.push_back(5);
data.push_back(42);
Mixin<std::vector<int>> mixin(data);
std::cout << "data: " << data.size() << "\n";
std::cout << "mixin: " << mixin.GetSize() << "\n";
return 0;
}
Result using Apple LLVM, tested with -std=c++11 and -std=c++14:
data: 2
mixin: -597183193
Interestingly, I've also tested this code on ideone, which uses gcc-5.1 with C++14 enabled, and it works as expected!
data: 2
mixin: 2
Why does mixin.GetSize() return a garbage value on Clang and why does it work with GCC-5.1?
@Gene suggested that I'm using Types ... args which creates a temporary copy of the vector (and using Types& ... args makes it work with LLVM), but that copy would contain the same elements (thus also have the same size).
You have a dangling reference, and mixin.GetSize() is yielding undefined behavior:
Inside of Mixin's constructor, T = std::vector<int>, so Adapter(T(args...)) is passing Adapter's constructor a temporary std::vector<int>
Adapter's constructor parameter is a Wrapper&&, but we're passing it a std::vector<int>&&, so we invoke Wrapper's implicit conversion constructor
Wrapper's constructor parameter is a std::vector<int> const&, and we're passing it a std::vector<int>&&; rvalues are allowed to bind to const-lvalue references, so this is syntactically fine and compiles fine, but in effect we're binding Wrapper::bc to a temporary
Once construction is finished, the lifetime of the temporary created in Mixin's constructor ends, and Wrapper::bc becomes a dangling reference; calls to Adapter::GetSize now yield UB
When Mixin's constructor parameters are changed from Types... to Types&..., Adapter(T(args...)) is still passing Adapter's constructor a temporary std::vector<int>; it only appears to work because you are seeing a different manifestation of UB (likely the stack looks a bit different due to one fewer copies of std::vector<int> being made). I.e., both versions of the code are equally broken/wrong!
So, to answer this concretely:
Why does mixin.GetSize() return a garbage value on Clang and why does it work with GCC-5.1?
Because the behavior of undefined behavior is undefined. ;-] Appearing to work is one possible outcome, but the code is still broken and the appearance of being correct is purely superficial.
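One way to have the compiler catch this kind of mistake (my suggestion, not something the code above already does) is to delete the rvalue overload of Wrapper's constructor, in the same spirit as std::cref rejecting temporaries. A sketch:

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

class Wrapper {
    std::vector<int> const& bc;
public:
    Wrapper(std::vector<int> const& bc) : bc(bc) {}
    // Assumed addition: reject rvalues so a temporary cannot be captured.
    Wrapper(std::vector<int> const&&) = delete;
    int GetSize() const { return static_cast<int>(bc.size()); }
};

// Lvalues are still accepted ...
static_assert(std::is_constructible_v<Wrapper, std::vector<int>&>);
// ... but constructing from a temporary vector no longer compiles.
static_assert(!std::is_constructible_v<Wrapper, std::vector<int>>);

void demo_wrapper() {
    std::vector<int> data{5, 42};
    Wrapper w(data);  // fine: data outlives w
    assert(w.GetSize() == 2);
}
```

With this in place, Adapter(T(args...)) in Mixin's constructor becomes a compile error instead of silent undefined behavior.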

Is `std::function` allowed to move its arguments?

While working on this question, I noticed that GCC (v4.7)'s implementation of std::function moves its arguments when they are taken by value. The following code shows this behavior:
#include <functional>
#include <iostream>
struct CopyableMovable
{
CopyableMovable() { std::cout << "default" << '\n'; }
CopyableMovable(CopyableMovable const &) { std::cout << "copy" << '\n'; }
CopyableMovable(CopyableMovable &&) { std::cout << "move" << '\n'; }
};
void foo(CopyableMovable cm)
{ }
int main()
{
typedef std::function<void(CopyableMovable)> byValue;
byValue fooByValue = foo;
CopyableMovable cm;
fooByValue(cm);
}
// outputs: default copy move move
We see here that a copy of cm is performed (which seems reasonable since the byValue's parameter is taken by value), but then there are two moves. Since function is operating on a copy of cm, the fact that it moves its argument can be seen as an unimportant implementation detail. However, this behavior causes some trouble when using function together with bind:
#include <functional>
#include <iostream>
struct MoveTracker
{
bool hasBeenMovedFrom;
MoveTracker()
: hasBeenMovedFrom(false)
{}
MoveTracker(MoveTracker const &)
: hasBeenMovedFrom(false)
{}
MoveTracker(MoveTracker && other)
: hasBeenMovedFrom(false)
{
if (other.hasBeenMovedFrom)
{
std::cout << "already moved!" << '\n';
}
else
{
other.hasBeenMovedFrom = true;
}
}
};
void foo(MoveTracker, MoveTracker) {}
int main()
{
using namespace std::placeholders;
std::function<void(MoveTracker)> func = std::bind(foo, _1, _1);
MoveTracker obj;
func(obj); // prints "already moved!"
}
Is this behavior allowed by the standard? Is std::function allowed to move its arguments? And if so, is it normal that we can convert the wrapper returned by bind into a std::function with by-value parameters, even though this triggers unexpected behavior when dealing with multiple occurrences of placeholders?
std::function is specified to pass the supplied arguments to the wrapped function with std::forward. E.g. for std::function<void(CopyableMovable)>, the function call operator is equivalent to
void operator()(CopyableMovable a)
{
f(std::forward<CopyableMovable>(a));
}
Since std::forward<T> is equivalent to std::move when T is not a reference type, this accounts for one of the moves in your first example. It's possible that the second comes from having to go through the indirection layers inside std::function.
This then also accounts for the problem you are encountering with using std::bind as the wrapped function: std::bind is also specified to forward its parameters, and in this case it is being passed an rvalue reference resulting from the std::forward call inside std::function. The function call operator of your bind expression is thus forwarding an rvalue reference to each of the arguments. Unfortunately, since you've reused the placeholder, it's an rvalue reference to the same object in both cases, so for movable types whichever is constructed first will move the value, and the second parameter will get an empty shell.
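A possible workaround, sketched here under the assumption that copying the argument twice is acceptable, is to wrap the call in a lambda instead of std::bind. The counter doubleMoves replaces the question's printed message so the effect can be checked.

```cpp
#include <cassert>
#include <functional>

struct MoveTracker {
    bool hasBeenMovedFrom = false;
    static inline int doubleMoves = 0;  // moves from an already moved-from object

    MoveTracker() = default;
    MoveTracker(MoveTracker const&) {}
    MoveTracker(MoveTracker&& other) {
        if (other.hasBeenMovedFrom) ++doubleMoves;
        else other.hasBeenMovedFrom = true;
    }
};

void foo(MoveTracker, MoveTracker) {}

void demo_function_forwarding() {
    using namespace std::placeholders;
    MoveTracker obj;

    // bind forwards the same rvalue to both parameters: the second one is
    // move-constructed from an object that has already been moved from.
    std::function<void(MoveTracker)> viaBind = std::bind(foo, _1, _1);
    MoveTracker::doubleMoves = 0;
    viaBind(obj);
    assert(MoveTracker::doubleMoves > 0);

    // A lambda names its parameter, so both uses are lvalues and are copied.
    std::function<void(MoveTracker)> viaLambda =
        [](MoveTracker t) { foo(t, t); };
    MoveTracker::doubleMoves = 0;
    viaLambda(obj);
    assert(MoveTracker::doubleMoves == 0);
}
```

Inside the lambda, t is an lvalue, so foo receives two independent copies and nothing is moved from twice.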