Default member values or default constructor parameters in structures? - c++

In modern C++ I am allowed to implement a struct with default member values i.e.
struct A
{
int x = 5;
float y = 1.0f;
};
but I can also do create a structure that has no "default member values" but can its constructor can be called with default parameters, as in the example below:
struct B {
int x;
float y;
B(int x_ = 5, float y_ = 1.0f) : x(x_), y(y_) {}
};
What I want to know is whether there is any difference from the clean code or architecture point of view between those? Or maybe there are another, even more important differences? In the first case, I've got less amount of code to write and I believe I can still construct the object like A({2, 3.14f}) even if the constructor is not defined.
Which would be your way to go in a project and why?

There is no "right" way do choose a form of initialization, it depends on the specific constraints/circumstances of a project or even your personal favor. However, I think those issues should be taken into account when deciding pro/contra in-class initializers.
Pro: Readability. When you want to understand what a class is good for, you often start off with its definition. When going through the data members, it's comfortable for your brain to have the initial value of each data member right next to its type, instead of scanning possibly multiple constructors.
Pro: Adding more constructors is much easier when no copy-pasting of the initializers of another existing constructor is required. With in-class member initializers, you are less likely to forget some initialization when adding a constructor.
Pro: Removing a constructor is easy. Let's say you start with a new class, add a bunch of member functions, a constructor and data members. Then you remember what Scott Meyers said in "How Non-Member Functions Improve Encapsulation" and you decide to turn the member functions into free ones. Your type now looks like a dumb struct with some data packed together - and when its constructor doesn't involve any business logic, you might want to turn the type into an aggregate: with in class initializers, you just remove the constructor - done.
Contra: Whenever you go with in class initializers, the concrete type of the data member being initialized must be known. This can be in issue when the class definition in question is in a header (most likely), compile times are an issue (most likely), the data member in question has a heavy-weight definition (sometimes) and/or requires non-trivial dependencies upon construction (sometimes). A simple guideline could be: in a scenario where the Pimpl-idiom is helpful, it's more important than points 1.-3. and does outrule in-class initializers, so go with Pimpl then.
Final note: the core guidelines recommend using in-class initializers in C.48, with the reasoning:
Makes it explicit that the same value is expected to be used in all constructors. Avoids repetition. Avoids maintenance problems. It leads to the shortest and most efficient code.

Related

C++ - Is it possible to use class inheritance without the member initialization list (MIL)?

I have just learned about classes in C++. I know that data members can be initialized by using the member initialization list syntax (MIL), but I think it is not very intuitive to use, and I think it is a very ugly way to assign data members.
Apparently class inheritance in C++ MUST be done through this MIL syntax. I do not understand the rationale behind this and googling did not give me answers. I am also unable to find any counter-example to this rule. Every example I saw online about inheritance requires the MIL.
So my question is: Is it possible to set up an inheritance without the MIL?
If the answer is yes, please explain how.
If the answer is no, please explain why not. (give the rationale behind mandating the MIL for inheritance)
No, that is not possible in general. If the base class has a default constructor, you can omit its initializer in the constructor of the derived class if that is appropriate. Typically though, you will want to invoke a specific constructor, and the member initializer list is the place to do that.
The reason for this is the design of the language: all "direct subobjects" of a class, meaning the base class objects and non-static data members it introduces, are actually initialized even before the body of its constructor is entered. Explicit member initializers must be given for subobjects that cannot be default-initialized, otherwise the program is ill-formed (must be rejected by compilers). For further reference, see for example cppreference on constructors and member initializer lists.
In practice this means that, while a constructor can perform additional work after initialization, constructors with empty bodies are fairly common in C++, but ones with empty member initializer lists are not.

Is aggregate initialization of struct safe during refactor?

If I have a struct
SomeStruct
{
double y;
double x;
};
and I somewhere initialize it like
SomeStruct s{1,2}; //y=1 x=2
then it seems that my code can silently break if I reorder my struct to
SomeStruct
{
double x;
double y;
double z;
};
since now SomeStruct s{1,2} means that x=1, y=2, z=0
Edit:
An argument brought up is that a constructor has the same issue, which is true, but there you can generally see the argument names and orders - even more clearly so if using any modern IDE.
I have not seen anyone ever mention this, but it seems that you can only safely use aggregate initialization like this if you are certain that you will never make changes to the layout of the data. That would be the rare situation, so is there an unspoken rule of "never use aggregate initialization on non homogeneous structs"?
Is aggregate initialization of struct safe during refactor?
Depends on how you define "safe" in this context. And what kind of refactoring is done.
it seems that my code can silently break ... since now SomeStruct s{1,2}means that x=1, y=2, z=0
Which might be perfectly OK. Why shouldn't z be initialized to 0? You can't tell if its broken until you know what it represents.
This is one of the more benign changes that you can do to a class. A different change such as re-ordering the members would more assuredly require changes in dependent code.
it seems that you can only safely use aggregate initialization like this if you are certain that you will never make changes to the layout of the data.
Or, if the only changes are expected to be new members that are fine to be value-initialized. (C has support for designated initializers, and those are even resilient to changes to the order of the members. Unfortunately, we don't have them in C++).
Or, if you don't find it a problem to find and update the aggregate initialization. You should know that this possibility exists whenever you make changes to an aggregate.
But indeed, it is true that constructors provide a sort of "implementation firewall" that decouples the user from changes to the members. Some changes like new members may prompt changes to the constructor, but missing constructor arguments do helpfully break the compilation.
so is there an unspoken rule of "never use aggregate initialization on non homogeneous structs"?
Certainly not without exceptions that you and I have considered. Also, the homo- vs heterogeneity makes no difference.
I would flip the perspective the other way, and say that when designing a public API, you should consider whether the layout of a class should be set in stone, or instead make the members private and provide a constructor. For internal API, it doesn't matter so much if you can quickly go through all uses of the class.
GCC has a warning option that solves the exact situation that you describe: -Wmissing-field-initializers (enabled by -Wextra). It will generate a warning if you provide any member initializers, but not all.

Will using brace-init syntax change construction behavior when an initializer_list constructor is added later?

Suppose I have a class like this:
class Foo
{
public:
Foo(int something) {}
};
And I create it using this syntax:
Foo f{10};
Then later I add a new constructor:
class Foo
{
public:
Foo(int something) {}
Foo(std::initializer_list<int>) {}
};
What happens to the construction of f? My understanding is that it will no longer call the first constructor but instead now call the init list constructor. If so, this seems bad. Why are so many people recommending using the {} syntax over () for object construction when adding an initializer_list constructor later may break things silently?
I can imagine a case where I'm constructing an rvalue using {} syntax (to avoid most vexing parse) but then later someone adds an std::initializer_list constructor to that object. Now the code breaks and I can no longer construct it using an rvalue because I'd have to switch back to () syntax and that would cause most vexing parse. How would one handle this situation?
What happens to the construction of f? My understanding is that it will no longer call the first constructor but instead now call the init list constructor. If so, this seems bad. Why are so many people recommending using the {} syntax over () for object construction when adding an initializer_list constructor later may break things silently?
On one hand, it's unusual to have the initializer-list constructor and the other one both be viable. On the other hand, "universal initialization" got a bit too much hype around the C++11 standard release, and it shouldn't be used without question.
Braces work best for like aggregates and containers, so I prefer to use them when surrounding some things which will be owned/contained. On the other hand, parentheses are good for arguments which merely describe how something new will be generated.
I can imagine a case where I'm constructing an rvalue using {} syntax (to avoid most vexing parse) but then later someone adds an std::initializer_list constructor to that object. Now the code breaks and I can no longer construct it using an rvalue because I'd have to switch back to () syntax and that would cause most vexing parse. How would one handle this situation?
The MVP only happens with ambiguity between a declarator and an expression, and that only happens as long as all the constructors you're trying to call are default constructors. An empty list {} always calls the default constructor, not an initializer-list constructor with an empty list. (This means that it can be used at no risk. "Universal" value-initialization is a real thing.)
If there's any subexpression inside the braces/parens, the MVP problem is already solved.
Retrofitting classes with initializer lists in updated code is something that sounds like it will be a common thing to happen. So people start using {} syntax for existing constructors before the class is updated, and we want to automatically catch any old uses, especially those used in templates where they may be overlooked.
If I had a class like vector that took a size, then arguably using {} syntax is "wrong", but for the transition we want to catch that anyway. Constructing C c1 {val} means take some (one, in this case) values for the collection, and C c2 (arg) means use val as a descriptive piece of metadata for the class.
In order to support both uses, when the type of element happens to be compatible with the descriptive argument, code that used C c2 {arg} will change meaning. There seems to be no way around it in that case if we want to support both forms with different meanings.
So what would I do? If the compiler provides some way to issue a warning, I'd make the initializer list with one argument give a warning. That sounds tricky not to mention compiler specific, so I'd make a general template for that, if it's not already in Boost, and promote its use.
Other than containers, what other situations would have initializer list and single argument constructors with different meanings where the single argument isn't something of a very distinct type from what you'd be using with the list? For non-containers, it might suffice to notice that they won't be confused because the types are different or the list will always have multiple elements. But it's good to think about that and take additional steps if they could be confused in this manner.
For a non-container being enhanced with initializer_list features, it might be sufficient to specifically avoid designing a one-argument constructor that can be mistaken. So, the one-arg constructor would be removed in the updated class, or the initializer list would require other (possibly tag) arguments first. That is, don't do that, under penalty of pie-in-face at the code review.
Even for container-like classes, a class that's not a standard library class could impose that the one-arg constructor form is no longer available. E.g. C c3 (size); would have to be written as C c3 (size, C()); or designed to take an enumeration argument also, which is handy to specify initialized to one value vs. reserved size, so you can argue it's a feature and point out code that begins with a separate call to reserve. So again, don't do that if I can reasonably avoid it.

Unions as Base Class

The standard defines that Unions cannot be used as Base class, but is there any specific reasoning for this? As far as I understand Unions can have constructors, destructors, also member variables, and methods to operate on those varibales. In short a Union can encapsulate a datatype and state which might be accessed through member functions. Thus it in most common terms qualifies for being a class and if it can act as a class then why is it restricted from acting as a base class?
Edit: Though the answers try to explain the reasoning I still do not understand how Union as a Derived class is worst than when Union as just a class. So in hope of getting more concrete answer and reasoning I will push this one for a bounty. No offence to the already posted answers, Thanks for those!
Tony Park gave an answer which is pretty close to the truth. The C++ committee basically didn't think it was worth the effort to make unions a strong part of C++, similarly to the treatment of arrays as legacy stuff we had to inherit from C but didn't really want.
Unions have problems: if we allow non-POD types in unions, how do they get constructed? It can certainly be done, but not necessarily safely, and any consideration would require committee resources. And the final result would be less than satisfactory, because what is really required in a sane language is discriminated unions, and bare C unions could never be elevated to discriminated unions in way compatible with C (that I can imagine, anyhow).
To elaborate on the technical issues: since you can wrap a POD-component only union in a struct without losing anything, there's no advantage allowing unions as bases. With POD-only union components, there's no problem with explicit constructors simply assigning one of the components, nor with using a bitblit (memcpy) for compiler generated copy constructor (or assignment).
Such unions, however, aren't useful enough to bother with except to retain them so existing C code can be considered valid C++. These POD-only unions are broken in C++ because they fail to retain a vital invariant they possess in C: any data type can be used as a component type.
To make unions useful, we must allow constructable types as members. This is significant because it is not acceptable to merely assign a component in a constructor body, either of the union itself, or any enclosing struct: you cannot, for example, assign a string to an uninitialised string component.
It follows one must invent some rules for initialising union component with mem-initialisers, for example:
union X { string a; string b; X(string q) : a(q) {} };
But now the question is: what is the rule? Normally the rule is you must initialise every member and base of a class, if you do not do so explicitly, the default constructor is used for the remainder, and if one type which is not explicitly initialised does not have a default constructor, it's an error [Exception: copy constructors, the default is the member copy constructor].
Clearly this rule can't work for unions: the rule has to be instead: if the union has at least one non-POD member, you must explicitly initialise exactly one member in a constructor. In this case, no default constructor, copy constructor, assignment operator, or destructor will be generated and if any of these members are actually used, they must be explicitly supplied.
So now the question becomes: how would you write, say, a copy constructor? It is, of course quite possible to do and get right if you design your union the way, say, X-Windows event unions are designed: with the discriminant tag in each component, but you will have to use placement operator new to do it, and you will have to break the rule I wrote above which appeared at first glance to be correct!
What about default constructor? If you don't have one of those, you can't declare an uninitialised variable.
There are other cases where you can determine the component externally and use placement new to manage a union externally, but that isn't a copy constructor. The fact is, if you have N components you'd need N constructors, and C++ has a broken idea that constructors use the class name, which leaves you rather short of names and forces you to use phantom types to allow overloading to choose the right constructor .. and you can't do that for the copy constructor since its signature is fixed.
Ok, so are there alternatives? Probably, yes, but they're not so easy to dream up, and harder to convince over 100 people that it's worthwhile to think about in a three day meeting crammed with other issues.
It is a pity the committee did not implement the rule above: unions are mandatory for aligning arbitrary data and external management of the components is not really that hard to do manually, and trivial and completely safe when the code is generated by a suitable algorithm, in other words, the rule is mandatory if you want to use C++ as a compiler target language and still generate readable, portable code. Such unions with constructable members have many uses but the most important one is to represent the stack frame of a function containing nested blocks: each block has local data in a struct, and each struct is a union component, there is no need for any constructors or such, the compiler will just use placement new. The union provides alignment and size, and cast free component access.
[And there is no other conforming way to get the right alignment!]
Therefore the answer to your question is: you're asking the wrong question. There's no advantage to POD-only unions being bases, and they certainly can't be derived classes because then they wouldn't be PODs. To make them useful, some time is required to understand why one should follow the principle used everywhere else in C++: missing bits aren't an error unless you try to use them.
Union is a type that can be used as any one of its members depending on which member has been set - only that member can be later read.
When you derive from a type the derived type inherits the base type - the derived type can be used wherever the base type could be. If you could derive from a union the derived class could be used (not implicitly, but explicitly through naming the member) wherever any of the union members could be used, but among those members only one member could be legally accessed. The problem is the data on which member has been set is not stored in the union.
To avoid this subtle yet dangerous contradiction that in fact subverts a type system deriving from a union is not allowed.
Bjarne Stroustrup said 'there seems little reason for it' in The Annotated C++ Reference Manual.
The title asks why unions can't be a base class, but the question appears to be about unions as a derived class. So, which is it?
There's no technical reason why unions can't be a base class; it's just not allowed. A reasonable interpretation would be to think of the union as a struct whose members happen to potentially overlap in memory, and consider the derived class as a class that inherits from this (rather odd) struct. If you need that functionality, you can usually persuade most compilers to accept an anonymous union as a member of a struct. Here's an example, that's suitable for use as a base class. (And there's an anonymous struct in the union for good measure.)
struct V3 {
union {
struct {
float x,y,z;
};
float f[3];
};
};
The rationale for unions as a derived class is probably simpler: the result wouldn't be a union. Unions would have to be the union of all their members, and all of their bases. That's fair enough, and might open up some interesting template possibilities, but you'd have a number of limitations (all bases and members would have to be POD -- and would you be able to inherit twice, because a derived type is inherently non-POD?), this type of inheritance would be different from the other type the language sports (OK, not that this has stopped C++ before) and it's sort of redundant anyway -- the existing union functionality would do just as well.
Stroustrup says this in the D&E book:
As with void *, programmers should know that unions ... are inherently dangerous, should be avoided wherever possible, and should be handled with special care when actually needed.
(The elision doesn't change the meaning.)
So I imagine the decision is arbitrary, and he just saw no reason to change the union functionality (it works fine as-is with the C subset of C++), and so didn't design any integration with the new C++ features. And when the wind changed, it got stuck that way.
I think you got the answer yourself in your comments on EJP's answer.
I think unions are only included in C++ at all in order to be backwards compatible with C. I guess unions seemed like a good idea in 1970, on systems with tiny memory spaces. By the time C++ came along I imagine unions were already looking less useful.
Given that unions are pretty dangerous anyway, and not terribly useful, the vast new opportunities for creating bugs that inheriting from unions would create probably just didn't seem like a good idea :-)
Here's my guess for C++ 03.
As per $9.5/1, In C++ 03, Unions can not have virtual functions. The whole point of a meaningful derivation is to be able to override behaviors in the derived class. If a union cannot have virtual functions, that means that there is no point in deriving from a union.
Hence the rule.
You can inherit the data layout of a union using the anonymous union feature from C++11.
#include <cstddef>
template <size_t N,typename T>
struct VecData {
union {
struct {
float x;
float y;
float z;
};
float a[N];
};
};
template <size_t N, typename T>
class Vec : public VecData<N,T> {
//methods..
};
In general its almost always better not work with unions directly but enclose them within a struct or class. Then you can base your inheritance off the struct outer layer and use unions within if you need to.

Initializing member variables

I've started to pick up this pattern:
template<typename T>
struct DefaultInitialize
{
DefaultInitialize():m_value(T()){}
// ... conversions, assignments, etc ....
};
So that when I have classes with primitive members, I can set them to be initialized to 0 on construction:
struct Class
{
...
DefaultInitialize<double> m_double;
...
};
The reason I do this is to avoid having to remember to initialize the member in each constructor (if there are multiple constructors). I'm trying to figure out if:
This is a valid pattern?
I am using the right terminology?
This is a valid pattern?
It's a known "valid" pattern, i would say. Boost has a class template called value_initialized that does exactly that, too.
I am using the right terminology?
Well, your template can be optimized to have fewer requirements on the type parameter. As of now, your type T requires a copy constructor, unfortunately. Let's change the initializer to the following
DefaultInitialize():m_value(){}
Then, technically this kind of initialization is called value initialization, starting with C++03. It's a little bit weird, since no kind of value is provided in the first place. Well, this kind of initialization looks like default initialization, but is intended to fill things with zero, but respecting any user defined constructor and executing that instead.
To summarize, what you did was to value initialize an object having type T, then to copy that object to m_value. What my version of above does it to value initialize the member directly.
Seems like a lot of work to avoid having to type m_double(0). I think it's harder to understand at first glance, but it does seem fine as long as everything is implemented properly.
But is it worth it? Do you really want to have to #include "DefaultInitialize.h" everywhere?
To clarify, basically, you're:
Making your compile times longer because of the includes.
Your code base larger because you have to manage the deceptively simple DefaultInitialize class
Increase the time it takes other people to read your code. If you have a member of a class that's a double, that's natural to me, but when I see DefaultInitialize, I have to learn what that is and why it was created
All that because you don't like to type out a constructor. I understand that it seems very nice to not have to do this, but most worth-while classes I've ever written tend to need to have a constructor written anyway.
This is certainly only my opinion, but I think most other people will agree with it. That is: it would be handy to not have to explicitly initialize members to 0, but the alternative (your class) isn't worth it.
Not to mention that in C++0x, you can do this;
class Foo
{
private:
int i = 0; // will be initialized to 0
}
Some compilers don't properly implement value initialization. For example, see Microsoft Connect, Value-initialization in new-expression, reported by Pavel Kuznetsov.
Fernando Cacciola's boost::value_initialized (mentioned already here by litb) offers a workaround to such compiler bugs.
If you're just initializing basic types to zero, you can override new and have it memset allocated memory to zero. May be simpler. There are pros and cons to doing this.