Is aggregate initialization of struct safe during refactor? - c++

If I have a struct
SomeStruct
{
double y;
double x;
};
and I somewhere initialize it like
SomeStruct s{1,2}; //y=1 x=2
then it seems that my code can silently break if I reorder my struct to
SomeStruct
{
double x;
double y;
double z;
};
since now SomeStruct s{1,2} means that x=1, y=2, z=0
Edit:
An argument brought up is that a constructor has the same issue, which is true, but there you can generally see the argument names and orders - even more clearly so if using any modern IDE.
I have not seen anyone ever mention this, but it seems that you can only safely use aggregate initialization like this if you are certain that you will never make changes to the layout of the data. That would be the rare situation, so is there an unspoken rule of "never use aggregate initialization on non homogeneous structs"?

Is aggregate initialization of struct safe during refactor?
Depends on how you define "safe" in this context. And what kind of refactoring is done.
it seems that my code can silently break ... since now SomeStruct s{1,2}means that x=1, y=2, z=0
Which might be perfectly OK. Why shouldn't z be initialized to 0? You can't tell if its broken until you know what it represents.
This is one of the more benign changes that you can do to a class. A different change such as re-ordering the members would more assuredly require changes in dependent code.
it seems that you can only safely use aggregate initialization like this if you are certain that you will never make changes to the layout of the data.
Or, if the only changes are expected to be new members that are fine to be value-initialized. (C has support for designated initializers, and those are even resilient to changes to the order of the members. Unfortunately, we don't have them in C++).
Or, if you don't find it a problem to find and update the aggregate initialization. You should know that this possibility exists whenever you make changes to an aggregate.
But indeed, it is true that constructors provide a sort of "implementation firewall" that decouples the user from changes to the members. Some changes like new members may prompt changes to the constructor, but missing constructor arguments do helpfully break the compilation.
so is there an unspoken rule of "never use aggregate initialization on non homogeneous structs"?
Certainly not without exceptions that you and I have considered. Also, the homo- vs heterogeneity makes no difference.
I would flip the perspective the other way, and say that when designing a public API, you should consider whether the layout of a class should be set in stone, or instead make the members private and provide a constructor. For internal API, it doesn't matter so much if you can quickly go through all uses of the class.
GCC has a warning option that solves the exact situation that you describe: -Wmissing-field-initializers (enabled by -Wextra). It will generate a warning if you provide any member initializers, but not all.

Related

Default member values or default constructor parameters in structures?

In modern C++ I am allowed to implement a struct with default member values i.e.
struct A
{
int x = 5;
float y = 1.0f;
};
but I can also do create a structure that has no "default member values" but can its constructor can be called with default parameters, as in the example below:
struct B {
int x;
float y;
B(int x_ = 5, float y_ = 1.0f) : x(x_), y(y_) {}
};
What I want to know is whether there is any difference from the clean code or architecture point of view between those? Or maybe there are another, even more important differences? In the first case, I've got less amount of code to write and I believe I can still construct the object like A({2, 3.14f}) even if the constructor is not defined.
Which would be your way to go in a project and why?
There is no "right" way do choose a form of initialization, it depends on the specific constraints/circumstances of a project or even your personal favor. However, I think those issues should be taken into account when deciding pro/contra in-class initializers.
Pro: Readability. When you want to understand what a class is good for, you often start off with its definition. When going through the data members, it's comfortable for your brain to have the initial value of each data member right next to its type, instead of scanning possibly multiple constructors.
Pro: Adding more constructors is much easier when no copy-pasting of the initializers of another existing constructor is required. With in-class member initializers, you are less likely to forget some initialization when adding a constructor.
Pro: Removing a constructor is easy. Let's say you start with a new class, add a bunch of member functions, a constructor and data members. Then you remember what Scott Meyers said in "How Non-Member Functions Improve Encapsulation" and you decide to turn the member functions into free ones. Your type now looks like a dumb struct with some data packed together - and when its constructor doesn't involve any business logic, you might want to turn the type into an aggregate: with in class initializers, you just remove the constructor - done.
Contra: Whenever you go with in class initializers, the concrete type of the data member being initialized must be known. This can be in issue when the class definition in question is in a header (most likely), compile times are an issue (most likely), the data member in question has a heavy-weight definition (sometimes) and/or requires non-trivial dependencies upon construction (sometimes). A simple guideline could be: in a scenario where the Pimpl-idiom is helpful, it's more important than points 1.-3. and does outrule in-class initializers, so go with Pimpl then.
Final note: the core guidelines recommend using in-class initializers in C.48, with the reasoning:
Makes it explicit that the same value is expected to be used in all constructors. Avoids repetition. Avoids maintenance problems. It leads to the shortest and most efficient code.

Should constructor initialize all the data members of the class?

I have a situation like this:
class A {
public:
A() : n(0) {}
private:
int n;
int m;
}
There is simply no meaning in the application logic to initialize m in the constructor. However, Eclipse warns me that the constructor leaves m uninitialized. I can't run the code somewhere else now. The warning is:
Member 'm' was not initialized in this constructor
So, does C++ encourage us to initialize all the data members in the constructor or it is just Eclipse's logic?
Should constructor initialize all the data members of the class?
That would be a good practice.
So, does C++ encourage us to initialize all the data members in the constructor?
It's not required by the c++ standard. As long as you initialize all variables before they're used, your program is correct in that regard.
or it is just Eclipse's logic?
Quite likely. Neither g++ nor clang versions that I tested warn about this when all warnings are enabled. The logic may or might not be based on high integrity c++ coding standard
12.4.2 or some other coding standard or style guide.
C++ doesn't require attributes to be initialized in constructor, except in case of const attributes where there value must be defined in initialization list.
However, it is clearly a good practice to initialize every attributes in constructor. I cannot count how many bugs I've met due to a non initialized variable or attributes.
Finally, every object should permanently be in a consistent state, which include both public (accessible) attributes and private attributes as well. Optimization should not be a reason for keeping an object un-consistent.
Fully disagree with all the answers and comments. There is absolutely no need to default initialze a member when it is not needed. This is why C/C++ never initializes built-in types as members or automatic variables - because doing so would impede performance. Of course, it is not a problem when you create your object/variable once (that's why statics are default-initialized), but for something happening in a tight loop default initialization might eat valuable nanoseconds.
The one exception to this rule would, in my view, be pointers (if you happen to have raw pointers in your code). Raw pointers should be NULL-initialized, since having invalid pointer is a direct way to undefined behaviour.
For completeness, the warning comes from the C/C++ Code Analysis. In particular the problem is Potential Programming Problems / Class members should be properly initialized
To change the code analysis settings (in this case I recommend per-project) edit the project properties. You can disable the whole warning, or disable it on just the files that violate the warning condition.
As for comparing CDT with GCC or CLang, this appears to be a case where additional code analysis is being done by CDT compared to what is available from the compilers. Of course that is to be expected as the CDT Code Analysis' remit is greater than that of the compiler.
PS, If you are up for it, you can read the implementation of this particular checker.
As it has been already said, you should always initialize pointers and of course const objects are mandatory.
In my opinion you should not initialize when it is not necessary but it is good to check for all non constructor initialized variables once in a while because they are source of very frequent and hard to find bugs.
I run Cppcheck every few months. This gives me more than one hundred 'false' warnings like "Member variable 'foo::bar' is not initialized in the constructor." but once in a while it discovers some real bugs so it is totally worth it.

Why can I not implement default constructors for structs in D?

Writing code like
struct S
{
this() // compile-time error
{
}
}
gives me an error message saying
default constructor for structs only allowed with #disable and no body.
Why??
This is one of cases much more tricky than one can initially expect.
One of important and useful features D has over C++ is that every single type (including all user types) has some initial non-garbage value that can be evaluated at compile-time. It is used as T.init and has two important use cases:
Template constraints can use T.init value to check if certain operations can be done on given type (quoting Kenji Hara's snippet):
template isSomething(T) {
enum isSomething = is(typeof({
//T t1; // not good if T is nested struct, or has #disable this()
//T t2 = void; auto x = t2; // not good if T is non-mutable type
T t = T.init; // avoid default construct check
...use t...
}));
}
Your variables are always initialized properly unless you explicitly use int i = void syntax. No garbage possible.
Given that, difficult question arises. Should we guarantee that T() and T.init are the same (as many programmers coming from C++ will expect) or allow default construction that may easily destroy that guarantee. As far as I know, decision was made that first approach is safer, despite being surprising.
However, discussions keep popping with various improvements proposed (for example, allowing CTFE-able default constructor). One such thread has appeared very recently.
It stems from the fact that all types in D must have a default value. There are quite a few places where a type's init value gets used, including stuff like default-initializing member variables and default-initializing every value in an array when it's allocated, and init needs to be known at compile for a number of those cases. Having init provides quite a few benefits, but it does get in the way of having a default constructor.
A true default constructor would need to be used in all of the places that init is used (or it wouldn't be the default), but allowing arbitrary code to run in a number of the cases that init is used would be problematic at best. At minimum, you'd probably be forced to make it CTFE-able and possibly pure. And as soon as you start putting restrictions like that on it, pretty soon, you might as well just directly initialize all of the member variables to what you want (which is what happens with init), as you wouldn't be gaining much (if anything) over that, which would make having default constructors pretty useless.
It might be possible to have both init and a default constructor, but then the question comes up as to when one is used over the other, and the default constructor wouldn't really be the default anymore. Not to mention, it could become very confusing to developers as to when the init value was used and when the default constructor was used.
Now, we do have the ability to #disable the init value of a struct (which causes its own set of problems), in which case, it would be illegal to use that struct in any situation that required init. So, it might be possible to then have a default constructor which could run arbitrary code at runtime, but what the exact consequences of that would be, I don't know. However, I'm sure that there are cases where people would want to have a default constructor that would require init and therefore wouldn't work, because it had been #disabled (things like declaring arrays of the type would probably be one of them).
So, as you can see, by doing what D has done with init, it's made the whole question of default constructors much more complicated and problematic than it is in other languages.
The normal way to get something akin to default construction is to use a static opCall. Something like
struct S
{
static S opCall()
{
//Create S with the values that you want and return it.
}
}
Then whenever you use S() - e.g.
auto s = S();
then the static opCall gets called, and you get a value that was created at runtime. However, S.init will still be used any place that it was before (including S s;), and the static opCall will only be used when S() is used explicitly. But if you couple that with #disable this() (which disables the init property), then you get something akin to what I described earlier where we might have default constructors with an #disabled init.
We may or may not end up with default constructors being added to the language eventually, but there are a number of technical problems with adding them due to how init and the language work, and Walter Bright doesn't think that they should be added. So, for default constructors to be added to the language, someone would have to come up with a really compelling design which appropriately resolves all of the issues (including convincing Walter), and I don't expect that to happen, but we'll see.

Unions as Base Class

The standard defines that Unions cannot be used as Base class, but is there any specific reasoning for this? As far as I understand Unions can have constructors, destructors, also member variables, and methods to operate on those varibales. In short a Union can encapsulate a datatype and state which might be accessed through member functions. Thus it in most common terms qualifies for being a class and if it can act as a class then why is it restricted from acting as a base class?
Edit: Though the answers try to explain the reasoning I still do not understand how Union as a Derived class is worst than when Union as just a class. So in hope of getting more concrete answer and reasoning I will push this one for a bounty. No offence to the already posted answers, Thanks for those!
Tony Park gave an answer which is pretty close to the truth. The C++ committee basically didn't think it was worth the effort to make unions a strong part of C++, similarly to the treatment of arrays as legacy stuff we had to inherit from C but didn't really want.
Unions have problems: if we allow non-POD types in unions, how do they get constructed? It can certainly be done, but not necessarily safely, and any consideration would require committee resources. And the final result would be less than satisfactory, because what is really required in a sane language is discriminated unions, and bare C unions could never be elevated to discriminated unions in way compatible with C (that I can imagine, anyhow).
To elaborate on the technical issues: since you can wrap a POD-component only union in a struct without losing anything, there's no advantage allowing unions as bases. With POD-only union components, there's no problem with explicit constructors simply assigning one of the components, nor with using a bitblit (memcpy) for compiler generated copy constructor (or assignment).
Such unions, however, aren't useful enough to bother with except to retain them so existing C code can be considered valid C++. These POD-only unions are broken in C++ because they fail to retain a vital invariant they possess in C: any data type can be used as a component type.
To make unions useful, we must allow constructable types as members. This is significant because it is not acceptable to merely assign a component in a constructor body, either of the union itself, or any enclosing struct: you cannot, for example, assign a string to an uninitialised string component.
It follows one must invent some rules for initialising union component with mem-initialisers, for example:
union X { string a; string b; X(string q) : a(q) {} };
But now the question is: what is the rule? Normally the rule is you must initialise every member and base of a class, if you do not do so explicitly, the default constructor is used for the remainder, and if one type which is not explicitly initialised does not have a default constructor, it's an error [Exception: copy constructors, the default is the member copy constructor].
Clearly this rule can't work for unions: the rule has to be instead: if the union has at least one non-POD member, you must explicitly initialise exactly one member in a constructor. In this case, no default constructor, copy constructor, assignment operator, or destructor will be generated and if any of these members are actually used, they must be explicitly supplied.
So now the question becomes: how would you write, say, a copy constructor? It is, of course quite possible to do and get right if you design your union the way, say, X-Windows event unions are designed: with the discriminant tag in each component, but you will have to use placement operator new to do it, and you will have to break the rule I wrote above which appeared at first glance to be correct!
What about default constructor? If you don't have one of those, you can't declare an uninitialised variable.
There are other cases where you can determine the component externally and use placement new to manage a union externally, but that isn't a copy constructor. The fact is, if you have N components you'd need N constructors, and C++ has a broken idea that constructors use the class name, which leaves you rather short of names and forces you to use phantom types to allow overloading to choose the right constructor .. and you can't do that for the copy constructor since its signature is fixed.
Ok, so are there alternatives? Probably, yes, but they're not so easy to dream up, and harder to convince over 100 people that it's worthwhile to think about in a three day meeting crammed with other issues.
It is a pity the committee did not implement the rule above: unions are mandatory for aligning arbitrary data and external management of the components is not really that hard to do manually, and trivial and completely safe when the code is generated by a suitable algorithm, in other words, the rule is mandatory if you want to use C++ as a compiler target language and still generate readable, portable code. Such unions with constructable members have many uses but the most important one is to represent the stack frame of a function containing nested blocks: each block has local data in a struct, and each struct is a union component, there is no need for any constructors or such, the compiler will just use placement new. The union provides alignment and size, and cast free component access.
[And there is no other conforming way to get the right alignment!]
Therefore the answer to your question is: you're asking the wrong question. There's no advantage to POD-only unions being bases, and they certainly can't be derived classes because then they wouldn't be PODs. To make them useful, some time is required to understand why one should follow the principle used everywhere else in C++: missing bits aren't an error unless you try to use them.
Union is a type that can be used as any one of its members depending on which member has been set - only that member can be later read.
When you derive from a type the derived type inherits the base type - the derived type can be used wherever the base type could be. If you could derive from a union the derived class could be used (not implicitly, but explicitly through naming the member) wherever any of the union members could be used, but among those members only one member could be legally accessed. The problem is the data on which member has been set is not stored in the union.
To avoid this subtle yet dangerous contradiction that in fact subverts a type system deriving from a union is not allowed.
Bjarne Stroustrup said 'there seems little reason for it' in The Annotated C++ Reference Manual.
The title asks why unions can't be a base class, but the question appears to be about unions as a derived class. So, which is it?
There's no technical reason why unions can't be a base class; it's just not allowed. A reasonable interpretation would be to think of the union as a struct whose members happen to potentially overlap in memory, and consider the derived class as a class that inherits from this (rather odd) struct. If you need that functionality, you can usually persuade most compilers to accept an anonymous union as a member of a struct. Here's an example, that's suitable for use as a base class. (And there's an anonymous struct in the union for good measure.)
struct V3 {
union {
struct {
float x,y,z;
};
float f[3];
};
};
The rationale for unions as a derived class is probably simpler: the result wouldn't be a union. Unions would have to be the union of all their members, and all of their bases. That's fair enough, and might open up some interesting template possibilities, but you'd have a number of limitations (all bases and members would have to be POD -- and would you be able to inherit twice, because a derived type is inherently non-POD?), this type of inheritance would be different from the other type the language sports (OK, not that this has stopped C++ before) and it's sort of redundant anyway -- the existing union functionality would do just as well.
Stroustrup says this in the D&E book:
As with void *, programmers should know that unions ... are inherently dangerous, should be avoided wherever possible, and should be handled with special care when actually needed.
(The elision doesn't change the meaning.)
So I imagine the decision is arbitrary, and he just saw no reason to change the union functionality (it works fine as-is with the C subset of C++), and so didn't design any integration with the new C++ features. And when the wind changed, it got stuck that way.
I think you got the answer yourself in your comments on EJP's answer.
I think unions are only included in C++ at all in order to be backwards compatible with C. I guess unions seemed like a good idea in 1970, on systems with tiny memory spaces. By the time C++ came along I imagine unions were already looking less useful.
Given that unions are pretty dangerous anyway, and not terribly useful, the vast new opportunities for creating bugs that inheriting from unions would create probably just didn't seem like a good idea :-)
Here's my guess for C++ 03.
As per $9.5/1, In C++ 03, Unions can not have virtual functions. The whole point of a meaningful derivation is to be able to override behaviors in the derived class. If a union cannot have virtual functions, that means that there is no point in deriving from a union.
Hence the rule.
You can inherit the data layout of a union using the anonymous union feature from C++11.
#include <cstddef>
template <size_t N,typename T>
struct VecData {
union {
struct {
float x;
float y;
float z;
};
float a[N];
};
};
template <size_t N, typename T>
class Vec : public VecData<N,T> {
//methods..
};
In general its almost always better not work with unions directly but enclose them within a struct or class. Then you can base your inheritance off the struct outer layer and use unions within if you need to.

Initializing member variables

I've started to pick up this pattern:
template<typename T>
struct DefaultInitialize
{
DefaultInitialize():m_value(T()){}
// ... conversions, assignments, etc ....
};
So that when I have classes with primitive members, I can set them to be initialized to 0 on construction:
struct Class
{
...
DefaultInitialize<double> m_double;
...
};
The reason I do this is to avoid having to remember to initialize the member in each constructor (if there are multiple constructors). I'm trying to figure out if:
This is a valid pattern?
I am using the right terminology?
This is a valid pattern?
It's a known "valid" pattern, i would say. Boost has a class template called value_initialized that does exactly that, too.
I am using the right terminology?
Well, your template can be optimized to have fewer requirements on the type parameter. As of now, your type T requires a copy constructor, unfortunately. Let's change the initializer to the following
DefaultInitialize():m_value(){}
Then, technically this kind of initialization is called value initialization, starting with C++03. It's a little bit weird, since no kind of value is provided in the first place. Well, this kind of initialization looks like default initialization, but is intended to fill things with zero, but respecting any user defined constructor and executing that instead.
To summarize, what you did was to value initialize an object having type T, then to copy that object to m_value. What my version of above does it to value initialize the member directly.
Seems like a lot of work to avoid having to type m_double(0). I think it's harder to understand at first glance, but it does seem fine as long as everything is implemented properly.
But is it worth it? Do you really want to have to #include "DefaultInitialize.h" everywhere?
To clarify, basically, you're:
Making your compile times longer because of the includes.
Your code base larger because you have to manage the deceptively simple DefaultInitialize class
Increase the time it takes other people to read your code. If you have a member of a class that's a double, that's natural to me, but when I see DefaultInitialize, I have to learn what that is and why it was created
All that because you don't like to type out a constructor. I understand that it seems very nice to not have to do this, but most worth-while classes I've ever written tend to need to have a constructor written anyway.
This is certainly only my opinion, but I think most other people will agree with it. That is: it would be handy to not have to explicitly initialize members to 0, but the alternative (your class) isn't worth it.
Not to mention that in C++0x, you can do this;
class Foo
{
private:
int i = 0; // will be initialized to 0
}
Some compilers don't properly implement value initialization. For example, see Microsoft Connect, Value-initialization in new-expression, reported by Pavel Kuznetsov.
Fernando Cacciola's boost::value_initialized (mentioned already here by litb) offers a workaround to such compiler bugs.
If you're just initializing basic types to zero, you can override new and have it memset allocated memory to zero. May be simpler. There are pros and cons to doing this.