I'm trying to understand structured binding introduced in C++17. The explanation on cppreference is not obvious to me, but it looks like
cv-auto ref-operator [x, y, z] = ...
is roughly equivalent to (setting aside the array case)
cv-auto ref-operator unique_name = ...
#define x unique_name.member_a
#define y unique_name.member_b
#define z unique_name.member_c
The key point here is that x, y, and z are not independently defined variables, but just aliases for members of the return value. And cv-auto ref-operator applies to the return value, not to the aliases (the syntax may be misleading here). For instance, see the cppreference example
float x{};
char y{};
int z{};
std::tuple<float&,char&&,int> tpl(x,std::move(y),z);
const auto& [a,b,c] = tpl;
// a names a structured binding of type float& that refers to x
// b names a structured binding of type char&& that refers to y
// c names a structured binding of type const int that refers to the 3rd element of tpl
If a, b, and c were independently defined variables, with const auto& applying to them, c could not be of type const int.
From a practical point of view, what are the key points this analogy failed to catch?
It might be insightful to consider this from another perspective.
In C++ we already had variables, i.e. objects with a name (int a = 5), and objects that aren't variables and have no name (*new int). Structured bindings are a way to have names for all parts of a variable, while the whole variable has no explicit name. So it's the combination [x, y, z] that together names a variable with three members.
Importantly, they together name an object, so the compiler actually has to lay out that object. Independent variables can be placed independently on the stack, but with structured bindings the compiler cannot do so (except under the normal as-if rule).
So when we consider the combination [x, y, z] as the name of the variable, it's clear that auto const& [x, y, z] makes that combination the name of a const& variable.
We then have to consider what exactly the individual names x, y and z mean. Your question summarizes them as
cv-auto ref-operator unique_name = ...
#define x unique_name.member_a
#define y unique_name.member_b
#define z unique_name.member_c
That's a bit tricky. Where does member_a come from? It suggests that unique_name has a member_a. You already excluded the array case, which uses [0]. Tuples use get<0>(tpl). There might very well be a member_a behind the get<0>, but member_a could be private, and it could also be less const-qualified than what get<0> returns.
But yes, for the most simple case, a simple struct without bitfields, there will indeed be a corresponding member_a.
The C++ reference pages say that () is for value initialisation, {} is for value and aggregate and list initialisation. So, if I just want value initialisation, which one do I use? () or {}? I'm asking because in the book "A Tour of C++" by Bjarne himself, he seems to prefer using {}, even for value initialisation (see for example pages 6 and 7), and so I thought it was good practice to always use {}, even for value initialisation. However, I've been badly bitten by the following bug recently. Consider the following code.
auto p = std::make_shared<int>(3);
auto q{ p };
auto r(p);
Now according to the compiler (Visual Studio 2013), q has type std::initializer_list<std::shared_ptr<int>>, which is not what I intended. What I actually intended for q is actually what r is, which is std::shared_ptr<int>. So in this case, I should not use {} for value initialisation, but use (). Given this, why does Bjarne in his book still seem to prefer to use {} for value initialisation? For example, he uses double d2{2.3} at the bottom of page 6.
To definitively answer my questions, when should I use () and when should I use {}? And is it a matter of syntax correctness or a matter of good programming practice?
Oh and uh, plain English if possible please.
EDIT:
It seems that I've slightly misunderstood value initialisation (see answers below). However, the questions above still stand by and large.
Scott Meyers has a fair amount to say about the difference between the two methods of initialization in his book Effective Modern C++.
He summarizes both approaches like this:
Most developers end up choosing one kind of delimiter as a default, using
the other only when they have to. Braces-by-default folks are
attracted by their unrivaled breadth of applicability, their
prohibition of narrowing conversions, and their immunity to C++’s most
vexing parse. Such folks understand that in some cases (e.g., creation
of a std::vector with a given size and initial element value),
parentheses are required. On the other hand, the go-parentheses-go
crowd embraces parentheses as their default argument delimiter.
They’re attracted to its consistency with the C++98 syntactic
tradition, its avoidance of the auto-deduced-a-std::initializer_list
problem, and the knowledge that their object creation calls won’t be
inadvertently waylaid by std::initializer_list constructors. They
concede that sometimes only braces will do (e.g., when creating a
container with particular values). There’s no consensus that either
approach is better than the other, so my advice is to pick one and
apply it consistently.
This is my opinion.
When using auto as type specifier, it's cleaner to use:
auto q = p; // Type of q is same as type of p
auto r = {p}; // Type of r is std::initializer_list<...>
When using explicit type specifier, it's better to use {} instead of ().
int a{}; // Value initialized to 0
int b(); // Declares a function (the most vexing parse)
One could use
int a = 0; // Value initialized to 0
However, the form
int a{};
can be used to value initialize objects of user defined types too. E.g.
struct Foo
{
int a;
double b;
};
Foo f1 = 0; // Not allowed.
Foo f2{};   // Zero initialized.
First off, there seems to be a terminology mixup. What you have is not value initialisation. Value initialisation happens when you do not provide any explicit initialisation arguments. int x; uses default initialisation, so the value of x will be unspecified. int x{}; uses value initialisation, so x will be 0. int x(); declares a function; that is why {} is preferred for value initialisation.
The code you've shown does not use value initialisation. With auto, the safest thing is to use copy initialisation:
auto q = p;
There is another important difference: The brace initializer requires that the given type can actually hold the given value. In other words, it forbids narrowing of the value, like rounding or truncation.
int a(2.3); // ok? a will hold the value 2, no error, maybe compiler warning
uint8_t c(256); // ok? the compiler should warn about something fishy going on
As compared to the brace initialization
int A{2.3}; // compiler error, because int can NOT hold a floating point value
double B{2.3}; // ok, double can hold this value
uint8_t C{256}; // compiler error, because 8bit is not wide enough for this number
Especially in generic programming with templates you should therefore use brace initialization to avoid nasty surprises when the underlying type does something unexpected to your input values.
{} is value initialization if empty; if non-empty, it is list/aggregate initialization.
From the draft, 7.1.6.4 auto specifier, 7/... Example,
auto x1 = { 1, 2 }; // decltype(x1) is std::initializer_list<int>
The rules are a little too complex to explain here (they're even hard to read from the source!).
Herb Sutter seems to be making an argument in CppCon 2014 (39:25 into the talk) for using auto and brace initializers, like so:
auto x = MyType { initializers };
whenever you want to coerce the type, for left-to-right consistency in definitions:
Type-deduced: auto x = getSomething()
Type-coerced: auto x = MyType { blah }
User-defined literals auto x = "Hello, world."s
Function declaration: auto f() -> MyType { some; commands; }
Named lambda: auto f = [=]() -> MyType { some; commands; };
C++11-style typedef: using AnotherType = SomeTemplate<MyTemplateArg>
Scott Meyers just posted a relevant blog entry, Thoughts on the Vagaries of C++ Initialization. It seems that C++ still has a way to go before achieving a truly uniform initialization syntax.
From cppref
Like a reference, a structured binding is an alias to an existing
object. Unlike a reference, the type of a structured binding does not
have to be a reference type.
For example:
int a[2] = { 1, 2 };
auto [x, y] = a;
x and y are aliases rather than references. My question:
How to implement a type check function like is_alias_v<decltype(x)>?
I do not believe that such a thing is possible.
Fortunately, there is never any need for it.
Use x as if it were a nice juicy int, regardless of its origins. Because, well, that's what it is!
Also don't forget that x and y here don't alias or reference the elements of a, but an "invisible" copy.
An alias is either a type alias (e.g. using Id = int) or an alias template.
What is meant by a
structured binding is an alias to an existing object
is that [x, y] as a whole is an alias (a new name) to an array of two ints (in this example). It has nothing to do with the name of the type of x alone.
If we have some type alias using Id = int, a type trait to check whether an Id is an int would be std::is_same_v<Id, int>. I don't know how to implement a generic is_alias_v<Id>.
In studying OCaml I found this bit of code that I was sure would throw an exception, but instead it returned the value 1.
let x = 1 in
let f y = x in
let x = 2 in
f 0;;
If I think of it sequentially, ok, x takes the value one. Then in the lower context, we say f y = x. Since y isn't defined, I would think right here the compiler should throw an exception. Even if y were defined, I'd think this would perhaps "define f at y" if it acts kind of like Haskell. But I would not expect it to define f for other values.
So I seem to be a little confused about how this is working.
Variables in OCaml don't change value, they are immutable. Your code defines two different things named x. The function f uses the first definition always. When you define a new value with the same name, this has no effect on f.
When you say let f y = x you are defining y, not referring to a previous y. You're giving the name y to the parameter of f, which can then be used in the definition of f (though your code chooses not to use y, which is perfectly fine).
Why are structured bindings defined through a uniquely named variable and all the vague "name is bound to" language?
I personally thought structured bindings worked as follows. Given a struct:
struct Bla
{
int i;
short& s;
double* d;
} bla;
The following:
cv-auto ref-operator [a, b, c] = bla;
is (roughly) equivalent to
cv-auto ref-operator a = bla.i;
cv-auto ref-operator b = bla.s;
cv-auto ref-operator c = bla.d;
And the equivalent expansions for arrays and tuples.
But apparently, that would be too simple and there's all this vague special language used to describe what needs to happen.
So I'm clearly missing something. What is the exact case that can't be handled by a well-defined expansion in the sense of, let's say, fold expressions, which would be a lot simpler to read up on in standardese?
It seems all the other behaviour of the variables defined by a structured binding actually follow the as-if simple expansion "rule" I'd think would be used to define the concept.
Structured bindings exist to allow multiple return values in a language that doesn't allow a function to resolve to more than one value (and without disturbing the C++ ABI). This means that, whatever syntax is used, the compiler must ultimately store the actual return value. Therefore, that syntax needs a way to talk about exactly how you're going to store that value. Since C++ has some flexibility in how things are stored (as references or as values), the structured binding syntax needs to offer the same flexibility.
Hence the auto & or auto&& or auto choice applying to the primary value rather than the subobjects.
Second, we don't want to impact performance with this feature. Which means that the names introduced will never be copies of the subobjects of the main object. They must be either references or the actual subobjects themselves. That way, people aren't concerned about the performance impact of using structured binding; it is pure syntactic sugar.
Third, the system is designed to handle both user-defined objects and arrays/structs with all public members. In the case of user-defined objects, the "name is bound to" a genuine language reference, the result of calling get<I>(value). If you store a const auto& for the object, then value will be a const& to that object, and get will likely return a const&.
For arrays/public structs, the "names are bound to" something which is not a reference. They are treated exactly as if you had typed value[2] or value.member_name. Doing decltype on such names will not return a reference type, unless the unpacked member itself is a reference.
By doing it this way, structured binding remains pure syntactic sugar: it accesses the object in whatever is the most efficient way possible for that object. For user-defined types, that's calling get exactly once per subobject and storing references to the results. For other types, that's using a name that acts like an array/member selector.
It seems all the other behaviour of the variables defined by a structured binding actually follow the as-if simple expansion "rule" I'd think would be used to define the concept.
It kind of does. Except the expansion isn't based on the expression on the right hand side, it's based on the introduced variable. This is actually pretty important:
X foo() {
/* a lot of really expensive work here */
return {a, b, c};
}
auto&& [a, b, c] = foo();
If that expanded into:
// note, this isn't actually auto&&, but for the purposes of this example, let's simplify
auto&& a = foo().a;
auto&& b = foo().b;
auto&& c = foo().c;
It wouldn't just be extremely inefficient, it could also be actively wrong in many cases. For instance, imagine if foo() was implemented as:
X foo() {
X x;
std::cin >> x.a >> x.b >> x.c;
return x;
}
So instead, it expands into:
auto&& e = foo();
auto&& a = e.a;
auto&& b = e.b;
auto&& c = e.c;
which is really the only way to ensure that all of our bindings come from the same object without any extra overhead.
And the equivalent expansions for arrays and tuples. But apparently, that would be too simple and there's all this vague special language used to describe what needs to happen.
There are three cases:
Arrays. Each binding acts as if it's an access into the appropriate index.
Tuple-like. Each binding comes from a call to std::get<I>.
Aggregate-like. Each binding names a member.
That's not too bad? Hypothetically, #1 and #2 could be combined (you could add the tuple machinery to raw arrays), but then it's potentially more efficient not to do so.
A healthy amount of the complexity in the wording (IMO) comes from dealing with the value categories. But you'd need that regardless of the way anything else is specified.
While I'm coding I declare structs or classes because they are based on real-world objects/ideas/concepts.
But often those structs/classes have only a single member, so I was wondering whether it makes any difference if I simply make a typedef instead.
And then I'm not sure if that's correct, because typedefs are not 'objects', in my opinion.
So should I do:
struct Y { int x; };
or just:
typedef int Y;
Does it make any difference? Is my image of structs being objects and typedefs being something else correct?
There's a huge difference. With the typedef all the following are valid:
Y y = 14.3;
y += 7;
y = 1 + y << 3;
std::cout << y;
double d = y;
With the struct none are, unless you choose to expose those operations. The question is, do you have something that is an int, with no restrictions, or an abstraction that is in some way based on an integer value, and has its own constraints or invariant?
There are several aspects you should take into consideration:
Primitive types operations
typedef, unlike struct/class, is just another name for the type, an alias.
Therefore, if you choose to use a typedef for a primitive type, you'll be able to perform all the primitive-type operations (like addition, subtraction and so on):
Y y = 4;
y+=5;
printf("Y is %d", y);
Conversely, if you choose to use a struct/class, you will lose all this functionality and will have to implement these operators yourself.
Maintainability and extensibility
Primitive types, unlike structs, can be neither extended nor inherited from.
Therefore, if you plan to inherit from your type, or to complement it with additional fields, you should prefer a class over a typedef.
class X : public Y { int x; };
Using typeid
As I mentioned, typedef is an alias for the type.
Therefore, if you want your type to get a different id than the primitive type, you definitely should choose struct/class:
typedef int Y;
typeid(Y)==typeid(int)
In this case the expression will be true.
struct Y { int x; };
typeid(Y)==typeid(int)
In this case the expression will be false.
In most cases, it does not really matter one way or the other. Where it does matter is if you need to overload a function based on Y parameters, or if you want to add custom methods to (or override operators on) Y. A typedef is just an alias, so typedef int Y is just a normal int as far as the compiler is concerned, whereas struct Y is its own distinct type.