Best practice for deferred initialization of private class members - c++

Is there a best practice for deferred initialization of a private class member M of class C? For example:
class M {
public:
M(int param);
};
class C {
public:
C();
// This works properly without m, and may be called at any time,
// even before startWork was called.
void someSimpleStuff();
// Called a single time, once param is known and work can be started.
void startWork(int param);
// Uses m. Called multiple times.
// Guaranteed to only be called after startWork was called.
void doProcessing();
private:
M m;
};
Objects of class C can't be constructed because M doesn't have a default constructor.
If you can modify M's implementation, it's possible to add an init method to M and make its constructor accept no arguments, which would allow constructing objects of class C.
If not, you can wrap C's member m in a std::unique_ptr and construct it when that becomes possible.
However, both of these solutions are prone to errors that would only be caught at run time. Is there some practice to make sure at compile time that m is only used after it has been initialized?
Restriction: An object of class C is handed to external code which makes use of its public interface, so C's public methods can't be split into multiple classes.

The best practice is to never use deferred initialisation.
In your case, ditch the default constructor for C and replace it with C(int param) : m(param){}. That is, class members get initialised at the point of construction using the member initialiser list.
Using deferred initialisation means your object is potentially in an undefined state, and achieving things like concurrency is harder.
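A minimal sketch of that suggestion, assuming param is already known when the C object is created. The value() accessor on M and the body of doProcessing are hypothetical, added only so the example has something to observe.

```cpp
// Hypothetical stand-in for M: it has no default constructor.
class M {
public:
    explicit M(int param) : param_(param) {}
    int value() const { return param_; }
private:
    int param_;
};

class C {
public:
    // m is built in the member initialiser list, so no C object can
    // exist without a fully constructed M.
    explicit C(int param) : m(param) {}

    void someSimpleStuff() {}
    int doProcessing() const { return m.value() * 2; }  // illustrative body
private:
    M m;
};
```

With this shape there is no window in which doProcessing could see an uninitialised m, at the cost of having to delay creating C until param is available.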

#include <memory>
#include <mutex>
#define ENABLE_THREAD_SAFETY
class M {
public:
M(int param);
};
class C {
public:
C();
// This works properly without m, and may be called at any time,
// even before startWork was called.
void someSimpleStuff();
// Called a single time, once param is known and work can be started
// (presumably this is where mparam gets stored).
void startWork(int param);
// Uses m. Called multiple times.
// Guaranteed to only be called after startWork was called.
void doProcessing();
// Creates the M object on first use and returns a pointer to it.
M* mptr()
{
#ifdef ENABLE_THREAD_SAFETY
std::call_once(create_m_once_flag, [&] {
m = std::make_unique<M>(mparam);
});
#else
if (m == nullptr)
m = std::make_unique<M>(mparam);
#endif
return m.get();
}
private:
int mparam;
std::unique_ptr<M> m;
#ifdef ENABLE_THREAD_SAFETY
std::once_flag create_m_once_flag;
#endif
};
Now all you have to do is stop using m directly and access it through mptr() instead. It will only create the M object once, when it's first used.

I would go with unique_ptr... Where do you see issues with that? When using M, you can easily check:
if(m)
m->foo();
I know that this is not a compile-time check but as far as I know, there is no check possible with current compilers. Code analysis would have to be quite complicated to see something like this because you can initialize m whenever you are willing to in any method or - if public/protected - even in another file. A compile time check would mean that lazy initialization is done at compile time but the very concept of lazy initialization is runtime-based.
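As a sketch of the same run-time check without a heap allocation, and assuming C++17 is available, std::optional can hold m instead of unique_ptr. The value() accessor and the choice to throw std::logic_error are assumptions made for the example, not something the question prescribes.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for M: it has no default constructor.
class M {
public:
    explicit M(int param) : param_(param) {}
    int value() const { return param_; }
private:
    int param_;
};

class C {
public:
    void someSimpleStuff() {}

    // Constructs m in place once param is known.
    void startWork(int param) { m.emplace(param); }

    // Fails loudly at run time if startWork has not been called yet.
    int doProcessing() {
        if (!m) throw std::logic_error("doProcessing called before startWork");
        return m->value();
    }
private:
    std::optional<M> m;  // empty until startWork runs
};
```

This is still a run-time check, as discussed above; it only makes the misuse fail in a defined way instead of dereferencing a null pointer.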

Ok from what I understand of your problem, would this be a solution?
You put the functionality that does not require M into class D. You create D object and use it. Once you need M and you want to do the doProcessing() code, you create object of C, pass D to it and initialize it with param that you now have.
The below code is made just to illustrate the idea. You probably don't need startWork() to be a separate function in this case and its code could be written in the constructor of C
Note: I have made all the functions empty, so I could compile the code to check for syntax errors :)
class M
{
public:
M(int param) {}
};
class D
{
public:
D() {}
// This works properly without m, and maybe called at any time,
// even before startWork was called.
void someSimpleStuff() {}
};
class C
{
public:
C(D& d, int param) : d(d), m(param) { startWork(param); }
// Uses m. Called multiple times.
// Guaranteed to only be called after startWork was called
void doProcessing() {}
private:
// Called single time, once param is known and work can be started.
void startWork(int param) {}
D& d;
M m;
};
int main()
{
D d;
d.someSimpleStuff();
C c(d, 1337);
c.doProcessing();
c.doProcessing();
}

The question is "Is it possible to check at compile time that m is only used after it has been initialized, without splitting the interface of C?"
The answer is no. You must use the type system to ensure that an object M is not used before it is initialized, which implies splitting C's interface. At compile time, compilers only know the types of objects and the values of constant expressions, and C cannot be a literal type. So you must use the type system: you must split C's interface to ensure at compile time that M is only used after initialization.
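To illustrate what using the type system could look like, here is a rough sketch in which startWork returns a second object that owns m. StartedC and value() are hypothetical names, and this is exactly the kind of interface split that the question's restriction rules out.

```cpp
// Hypothetical stand-in for M: it has no default constructor.
class M {
public:
    explicit M(int param) : param_(param) {}
    int value() const { return param_; }
private:
    int param_;
};

// Hypothetical second half of the interface. It can only be obtained from
// C::startWork, so doProcessing cannot be called before m exists.
class StartedC {
public:
    int doProcessing() const { return m_.value(); }
private:
    friend class C;
    explicit StartedC(int param) : m_(param) {}
    M m_;
};

class C {
public:
    void someSimpleStuff() {}
    // Returning the started object turns the call order into a type-level fact.
    StartedC startWork(int param) { return StartedC(param); }
};
```

Code that holds only a C has no doProcessing to call, so the mistake becomes a compile error rather than a run-time failure.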

Related

Using setters in constructor

I'm a Java developer trying to pick up C++. Is it okay to use a setter inside a constructor in order to reuse the sanity checks the setter provides?
For example:
#include <stdexcept>
using namespace std;
class Test {
private:
int foo;
void setFoo(int foo) {
if (foo < 42) {
throw invalid_argument{"Foo < 42."};
}
this->foo = foo;
}
public:
Test(int foo) {
setFoo(foo);
};
};
Yes, it is recommended to do this, basically for the reason you already mentioned.
On the other hand you should ask yourself if you need the setter at all and not directly implement the checks inside the constructor. The reason I am writing this is that setters in general result in mutable state which has many disadvantages as opposed to immutable classes. However sometimes they are required.
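One way to put the check directly into construction while still using an initialization list is a small static helper. This is only a sketch: checked() and the foo() accessor are names made up for the example.

```cpp
#include <stdexcept>

class Test {
public:
    // The check runs as part of initialisation, so foo_ can even be const.
    explicit Test(int foo) : foo_(checked(foo)) {}
    int foo() const { return foo_; }
private:
    // Hypothetical helper: returns the value unchanged or throws.
    static int checked(int foo) {
        if (foo < 42) {
            throw std::invalid_argument{"Foo < 42."};
        }
        return foo;
    }
    const int foo_;
};
```

Because the helper is static, it does not touch the partially constructed object at all.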
Another recommendation: If your class variable is an object and you can modify the constructor of this object, you could put the check into the constructor of this object:
class MyFoo {
public:
MyFoo(int value) {
if (value < 42) {
throw invalid_argument{"Foo < 42."};
}
v = value;
}
private:
int v;
};
This will enable you to use an initialization list in the constructor of your Test class:
Test(int foo) : foo(foo) {}
However, now the check is a property of the class of the variable and no longer one of the owning class.
Yes you can. It's fine as long as your setters are not virtual: while a constructor is running, virtual calls resolve to the class currently being constructed, not to an override in a derived class, because the derived part of the object does not exist yet.
Here is Herb Sutter GOTW on this matter: http://www.gotw.ca/gotw/066.htm
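A small sketch of the virtual-call pitfall; Base, Derived and the log member are hypothetical names used only to make the dispatch visible.

```cpp
#include <string>

struct Base {
    std::string log;
    Base() { set(); }  // resolves to Base::set, even while building a Derived
    virtual ~Base() = default;
    virtual void set() { log += "Base::set;"; }
};

struct Derived : Base {
    void set() override { log += "Derived::set;"; }
};
```

Constructing a Derived records only the Base::set call. The override is reached only by calls made after construction has finished.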
Yes, that's fine as long as it makes sense to have a setter for a particular member variable (have some logic that can't be checked by assignment only for example) . In this example, setFoo could've just taken an unsigned int and the caller would know not to pass negative values. Which in turn could eliminate the check and thus the need for a setter. For more elaborate checks, a setter and usage of that setter in the constructor is just fine.
Short answer: Yes. In fact, your example works.
Long answer: But it is not a good practice. At least, you have to take care.
In general, a set function works with a constructed object. It is supposed that the invariant of the class holds. The functions in a class are implemented considering the invariant is true.
If you want other functions to be used in a constructor, you would have to write some code. For example, to create an empty object.
For example, if you change setFoo in your class in the future (let's say setFoo changes the member foo only if the new value is larger), your example stops working.
This is okay.
The only situation in which you cannot call a member function is when the base classes are not constructed yet.
can member functions be used to initialize member variables in an initialization list?
I know this doesn't fit your situation. It's just for the sake of completeness:
When you are simply setting member values (without checks like yours in setFoo) it is recommended to use initialization lists in the constructor. This prevents members from being "initialized" twice: first with their default value, then with the value that you passed into the constructor.
class Test {
private:
int foo_;
public:
Test(int foo)
: foo_(foo)
{ };
};

How much initialization can/should be in C++ Initialization lists

This is my first post. I believe I am aware of best practices on stackoverflow but probably not 100%. I believe there is no specific post that addresses my question; also I hope it's not too vague.
I am trying to figure out good practices for writing C++ constructors
that do medium-to-heavy-duty work.
Pushing (all?) init work into initialization lists seems a good idea
for two reasons that cross my mind, namely:
Resource Acquisition Is Initialization
As far as I know, the simplest way of guaranteeing that members
are initialized correctly at resource acquisition is to make sure that
what's inside the parentheses of the initialization list is correct
when it is evaluated.
class A
{
public:
A(const B & b, const C & c)
: _c(c)
{
/* _c was allocated and defined at the same time */
/* _b is allocated but its content is undefined */
_b = b;
}
private:
B _b;
C _c;
}
const class members
Using initialization lists is the only correct way of using
const members which can hold actual content.
class A
{
public:
A(int m, int n, int p)
: _m(m) /* correct, _m will be initialized to m */
{
_n = n; /* incorrect, _n is already initialized to an undefined value */
*(const_cast<int*>(&_p)) = p; /* technically valid, but ugly and not particularly RAII-friendly */
}
private:
const int _m, _n, _p;
}
However, some problems seem to come with overusing initialization lists:
Order
Member variables are always initialized in the order they are declared in the class definition, so write them in that order in the constructor initialization list. Writing them in a different order just makes the code confusing because it won't run in the order you see, and that can make it hard to see order-dependent bugs.
http://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#S-discussion
This is important if you initialize a value using a value
initialized previously in the list. For example:
A(int in) : _n(in), _m(_n) {}
If _m is defined before _n, its value at initialization is undefined.
I am ready to apply this rule in my code, but when working
with other people it causes code redundancy and forces reading
two files at once.
That is not acceptable and somewhat error-prone.
Solution — initialize using only data from ctor arguments.
Solution's problem — keeping work in the init list without
inner dependency means repeating operations. For example:
int in_to_n(int in)
{
/* ... */
return n;
}
int modify(int n)
{
/* ... */
return modified;
}
A::A(int in)
: _n(in_to_n(in))
, _n_modified(modify(in_to_n(in)))
{}
For tiny bits of repeated operations I believe compilers
can reuse existing data but I don't think one should rely on that
for significant work (and I don't even think it's done if calling
noninlined separate code).
How much work can you put in the list?
In the previous example, I called functions to compute what the
attributes are to be initialized to. These can be plain/lambda
functions or static/nonstatic methods,
of the current class or of another.
(I don't suggest using nonstatic methods of the current class,
it might even be undefined usage according to the standard, not sure.)
I guess this is not in itself a big problem, but one needs to make
special efforts in clarity to keep the intent of the code clear if
writing big classes that do big work that way.
Also, when trying to apply the solution to the previous problem,
there is only so much independent work you can do when initializing
your instance... This usually gets big if you have a long sequence
of attributes to initialize with inner dependencies.
It's starting to look like just the program, translated into an
initialization list; I guess this is not what C++ is supposed to be
transitioning into?
Multiple inits
One often computes two variables at once. Setting two variables
at once in an init list means either:
using an ugly intermediate attribute
struct InnerAData
{
B b;
C c;
};
/* must be exported with the class definition (ugly) */
class A
{
public:
A(const D & input)
: _inner(work(input))
, _b(_inner.b)
, _c(_inner.c) {}
private:
B _b;
C _c;
InnerAData _inner;
}
This is awful and forces extra useless copies.
or some ugly hack
class A
{
public:
A(const D & input) : _b(work(input)) {}
private:
B _b;
C _c;
B work(const D & input)
{
/* ... work ... */
_c = ...;
}
}
This is even more awful and doesn't even work with const
or non-builtin type attributes.
keeping stuff const
Sometimes it can take most of the ctor to figure out the value
to give to an attribute, so that making sure it is const,
and therefore moving the work to the initialization list,
can seem constrained. I won't give a full example, but think
something like computing data from a default filename, then
computing the full filename from that data, then checking if
the corresponding file exists to set a const boolean, etc.
I guess it's not a fundamental problem, but all that seems
intuitively more legible in the body of the ctor, and moving
it to the init list just to do a correct initialization of
a const field seems overkill. Maybe I'm just imagining things.
So here's the hard part: asking a specific question!
Have you faced similar problems, did you find a better solution,
if not what's the lesson to learn — or is there something I'm
missing here?
I guess my problem is I'm pretty much trying to move all the work
to the init list when I could search for a compromise of what state
is initiated and leave some work for later. I just feel like init list
could play a bigger role in making modern C++ code than it does but
I haven't seen them pushed further than basic usage yet.
Additionally, I'm really not convinced as to why the values are
initialized in that order, and not in the order of the list.
I've been orally told it's because attributes are in order on the stack and
the compiler must guarantee that stack data is never above the SP.
I'm not sure how that's a final answer... pretty sure one could
implement safe arbitrarily reordered initialization lists,
correct me if I'm wrong.
In your code:
class A
{
public:
A(const B & b, const C & c)
: _c(c)
{
/* _c was allocated and defined at the same time */
/* _b is allocated but its content is undefined */
_b = b;
}
private:
B _b;
C _c;
}
the constructor calls B::B() and then B::operator=, which may be a problem if either of these doesn't exist, is expensive, or is not implemented correctly according to the RAII and rule-of-three guidelines. The rule of thumb is to always prefer the initializer list whenever possible.
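The difference can be made visible with a counting type. In this sketch B is a hypothetical stand-in, and Counts, counts(), AssignInBody and InitInList are names invented for the example.

```cpp
// Hypothetical bookkeeping so the example can report which members ran.
struct Counts { int defaults = 0, copies = 0, assigns = 0; };
inline Counts& counts() { static Counts c; return c; }

struct B {
    B() { ++counts().defaults; }
    B(const B&) { ++counts().copies; }
    B& operator=(const B&) { ++counts().assigns; return *this; }
};

struct AssignInBody {
    explicit AssignInBody(const B& b) { _b = b; }  // B::B() first, then operator=
    B _b;
};

struct InitInList {
    explicit InitInList(const B& b) : _b(b) {}     // copy constructor only
    B _b;
};
```

Assigning in the body costs one default construction plus one assignment, while the initializer list costs a single copy construction.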
Since C++11, an alternative is to use delegating constructors:
struct InnerAData;
InnerAData work(const D&);
class A
{
public:
A(const D & input) : A(work(input)) {}
private:
A(const InnerAData&);
private:
const B _b;
const C _c;
};
And (that can be inlined, but then visible in header)
struct InnerAData
{
B b;
C c;
};
A::A(const InnerAData& inner) : _b(inner.b), _c(inner.c) {}
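Put together in one file, the delegating-constructor approach could look like the sketch below. The aliases for B, C and D, the body of work(), and the accessors b() and c() are assumptions chosen only to make the fragment compile on its own.

```cpp
#include <string>

// Hypothetical concrete types standing in for B, C and D.
using B = int;
using C = std::string;
using D = int;

struct InnerAData {
    B b;
    C c;
};

// Computes both values in a single pass over the input.
inline InnerAData work(const D& input) {
    return InnerAData{input * 2, std::to_string(input)};
}

class A {
public:
    // The public constructor delegates to the private one with the computed pair.
    explicit A(const D& input) : A(work(input)) {}
    const B& b() const { return _b; }
    const C& c() const { return _c; }
private:
    explicit A(const InnerAData& inner) : _b(inner.b), _c(inner.c) {}
    const B _b;
    const C _c;
};
```

work() runs once, both members stay const, and no intermediate attribute has to be stored in A.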

Call method only once for EACH object of the class NOT STATIC

I wrote a class where the constructor is private.
I need to assign the given value to the private members ONLY ONCE
in the method construct(int a).
It should be something like a constructor but not a constructor !
Every time this construct(int a) is called after the first time,
I do not need to reassign anything to that specific OBJECT.
How to achieve that without any booleans?
I thought of boost::call_once but it calls construct(int a) once for ENTIRE CLASS! and I need to call this function ONCE for EACH OBJECT.
just like ctor! Any ideas?
UPDATE 1:
The constructor is private. But the class has some members whose values can be assigned from the outside, but only ONCE.
I am trying to achieve some automatisation for checking if a function was called or not already without using bool wasCalled or something like that.
UPDATE 2:
LT::Pointer lut = LT::New();
std::vector<double> points;
....
lut->construct(points);
The second time
lut->construct(points);
is called - error should be given, or just somehow make it impossible.
Direct Answer:
You can devise a wrapper that applies "assign-once" semantics to the wrapped object.
However, you can not make the compiler detect that a value is being set for the second time at compile time, so you should be prepared to make it assert/throw at runtime.
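A rough sketch of such a wrapper, assuming C++17 for std::optional. AssignOnce and its member names are hypothetical, and throwing std::logic_error is just one possible reaction to a second assignment.

```cpp
#include <optional>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper: the first set() stores the value, later calls throw.
template <typename T>
class AssignOnce {
public:
    void set(T value) {
        if (value_) throw std::logic_error("value already assigned");
        value_.emplace(std::move(value));
    }
    bool isSet() const { return value_.has_value(); }
    const T& get() const {
        if (!value_) throw std::logic_error("value not assigned yet");
        return *value_;
    }
private:
    std::optional<T> value_;  // empty until the first set()
};
```

The "was it called" flag has not disappeared; it lives inside std::optional, so each object carries its own state without a hand-written bool member.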
Background/look around
As others have said, this smells very much like a design flaw. Why can't you have the New operation forward constructor parameters (a-la make_shared, make_unique?):
template <typename T, typename... Args>
SmartPointer<T> genericNew(Args&&... args) {
return SmartPointer<T>(new T(std::forward<Args>(args)...));
}
Of course, there could be specialized factory methods that even know how to set private properties after construction. Make the factory methods friends, to prevent others from using the hidden property (setters) after creation by the factory:
struct X {
int a;
X(int i) : a(i) {}
typedef SmartPointer<X> Ptr;
static Ptr New(int a, int init_only) {
Ptr p(new X(a));
p->init_only = init_only;
return p;
}
private:
int init_only;
};
(here I opted to make the New factory method a static member, so it's implicitly a friend)

C++: Is it possible to call an object's function before constructor completes?

In C++, is it possible to call a function of an instance before the constructor of that instance completes?
e.g. if A's constructor instantiates B and B's constructor calls one of A's functions.
Yes, that's possible. However, you are responsible that the function invoked won't try to access any sub-objects which didn't have their constructor called. Usually this is quite error-prone, which is why it should be avoided.
This is very possible
class A;
class B {
public:
B(A* pValue);
};
class A {
public:
A() {
B value(this);
}
void SomeMethod() {}
};
B::B(A* pValue) {
pValue->SomeMethod();
}
It's possible and sometimes practically necessary (although it amplifies the ability to level a city block inadvertently). For example, in C++98, instead of defining an artificial base class for common initialization, one often sees that done by an init function called from each constructor. I'm not talking about two-phase construction, which is just Evil, but about factoring out common initialization.
C++0x provides constructor forwarding (delegating constructors in C++11), which will help to alleviate the problem.
In practice it is dangerous: one has to be extra careful about what is initialized and what is not. Formally, there is some unnecessarily vague wording in the standard which can be construed as saying that the object doesn't really exist until a constructor has completed successfully. However, since that interpretation would make it UB to use e.g. an init function to factor out common initialization, which is a common practice, it can just be disregarded.
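For illustration, factoring out common initialization with delegating constructors might look like this sketch. The Connection class, its members and the default values are hypothetical.

```cpp
#include <string>
#include <utility>

class Connection {
public:
    // Both of these forward to the constructor that does the real work,
    // so no init() function has to run on a half-built object.
    Connection() : Connection("localhost", 80) {}
    explicit Connection(std::string host) : Connection(std::move(host), 80) {}

    Connection(std::string host, int port)
        : host_(std::move(host)), port_(port) {}

    const std::string& host() const { return host_; }
    int port() const { return port_; }
private:
    std::string host_;
    int port_;
};
```

Every member is initialized by the target constructor before the body of a delegating constructor runs.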
Why would you want to do that? No, it cannot be done, because a member function needs an object as one of its parameters. A C++ member function and a C function are different things.
C++ code:
class foo
{
public: // members must be public for main() below to access them
int data;
void DoSomething()
{
data++;
}
};
int main()
{
foo a; //an object
a.data = 0; //set the data member to 0
a.DoSomething(); //the object is doing something with itself and is using 'data'
}
Here is a simple way to do it in C.
typedef void (*pDoSomething) ();
typedef struct __foo
{
int data;
pDoSomething ds; //<--pointer to DoSomething function
}foo;
void DoSomething(foo* this)
{
this->data++; //<-- C++ compiler won't compile this as C++ compiler uses 'this' as one of its keywords.
}
int main()
{
foo a;
a.ds = DoSomething; // you have to set the function.
a.data = 0;
a.ds(&a); //this is the same as C++ a.DoSomething code above.
}
Finally, the answer to your question is the code below.
void DoSomething(foo* this);
int main()
{
DoSomething( ?? ); //WHAT!?? We need to pass something here.
}
See, you need an object to pass to it. The answer is no.

Configuration structs vs setters

I recently came across classes that use a configuration object instead of the usual setter methods for configuration. A small example:
struct AConfiguration { int a, b; };
class A {
int a, b;
public:
A(const AConfiguration& conf) { a = conf.a; b = conf.b; }
};
The upsides:
You can extend your object and easily guarantee reasonable default values for new values without your users ever needing to know about it.
You can check a configuration for consistency (e.g. your class only allows some combinations of values)
You save a lot of code by omitting the setters.
You get a default constructor for A by specifying a default constructor for your configuration struct and using A(const AConfiguration& conf = AConfiguration()).
The downside(s):
You need to know the configuration at construction time and can't change it later on.
Are there more downsides to this that I'm missing? If there aren't: Why isn't this used more frequently?
Whether you pass the data individually or per struct is a question of style and needs to be decided on a case-by-case basis.
The important question is this: is the object ready and usable after construction, and does the compiler enforce that you pass all necessary data to the constructor? Or do you have to remember to call a bunch of setters after construction, whose number might increase at any time without the compiler giving you any hint that you need to adapt your code? So whether this is
A(const AConfiguration& conf) : a(conf.a), b(conf.b) {}
or
A(int a_, int b_) : a(a_), b(b_) {}
doesn't matter all that much. (There's a number of parameters where everyone would prefer the former, but which number this is - and whether such a class is well designed - is debatable.) However, whether I can use the object like this
A a1(Configuration(42,42));
A a2 = Configuration(4711,4711);
A a3(7,7);
or have to do this
A urgh;
urgh.setA(13);
urgh.setB(13);
before I can use the object, does make a huge difference. Especially so, when someone comes along and adds another data field to A.
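A sketch of the configuration-struct style with default values and a consistency check in the constructor, assuming C++11 default member initializers. The defaults and the rule that a must not exceed b are invented for the example.

```cpp
#include <stdexcept>

struct AConfiguration {
    int a = 1;  // default values are assumptions chosen for the example
    int b = 2;
};

class A {
public:
    // A default-constructed configuration doubles as A's default constructor.
    explicit A(const AConfiguration& conf = AConfiguration())
        : a_(conf.a), b_(conf.b) {
        // Hypothetical consistency rule: a must not exceed b.
        if (a_ > b_) throw std::invalid_argument("a must not exceed b");
    }
    int a() const { return a_; }
    int b() const { return b_; }
private:
    const int a_;
    const int b_;
};
```

Once the constructor returns, the object is complete and immutable. A field added to AConfiguration later gets its default automatically, so existing callers keep compiling.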
Using this method makes binary compatibility easier.
When the library version changes and the configuration struct contains a version field, the constructor can distinguish whether an "old" or a "new" configuration is passed and avoid an "access violation"/"segfault" when accessing nonexistent fields.
Moreover, the mangled name of constructor is retained, which would have changed if it changed its signature. This also lets us retain binary compatibility.
Example:
//version 1
struct AConfiguration { int version; int a; AConfiguration(): version(1) {} };
//version 2
struct AConfiguration { int version; int a, b; AConfiguration(): version(2) {} };
class A {
A(const AConfiguration& conf) {
switch (conf.version){
case 1: a = conf.a; b = 0; // No access violation for old callers!
break;
case 2: a = conf.a; b = conf.b; // New callers do have b member
break;
}
}
};
The main upside is that the A object can be immutable. I don't know if having the AConfiguration struct actually gives any benefit over just an a and a b parameter to the constructor.
Using this method makes binary compatibility harder.
If the struct is changed (one new optional field is added), all code using the class might need a recompile. If one new non-virtual setter function is added, no such recompilation is necessary.
I would support the decreased binary compatibility here.
The problem I see comes from the direct access to a struct fields.
struct AConfig1 { int a; int b; };
struct AConfig2 { int a; std::map<int,int> b; };
Since I modified the representation of b, I am screwed, whereas with:
class AConfig1 { public: int getA() const; int getB() const; /* */ };
class AConfig2 { public: int getA() const; int getB(int key = 0) const; /* */ };
The physical layout of the object might have changed, but my getters have not, and neither have the offsets to the functions.
Of course, for binary compatibility, one should check out the PIMPL idiom.
namespace details { class AConfigurationImpl; }
class AConfiguration {
public:
int getA() const;
int getB() const;
private:
details::AConfigurationImpl* m_impl;
};
While you do end up writing more code, you have the guarantee here of backward compatibility of your object as long as you add supplementary methods AFTER the existing ones.
The representation of an instance in memory does not depend on the number of methods, it only depends on:
the presence or absence of virtual methods
the base classes
the attributes
Which is what is VISIBLE (not what is accessible).
And here we guarantee that we won't have any change in the attributes. The definition of AConfigurationImpl might change without any problem and the implementation of the methods might change too.
The extra code means: constructor, copy constructor, assignment operator and destructor, which is a fair amount, and of course the getters and setters. Also note that these methods can no longer be inlined, since their implementations are defined in a source file.
Whether or not it suits you, you're on your own to decide.
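For reference, a compact single-file sketch of the PIMPL idea. It swaps the raw pointer for std::unique_ptr and nests the implementation struct as AConfiguration::Impl, both of which are choices made for this example rather than part of the answer above. The two-argument constructor is also an assumption.

```cpp
#include <memory>

// --- what would live in the header ---
class AConfiguration {
public:
    AConfiguration(int a, int b);
    ~AConfiguration();  // defined where Impl is a complete type
    AConfiguration(const AConfiguration& other);
    AConfiguration& operator=(const AConfiguration& other);
    int getA() const;
    int getB() const;
private:
    struct Impl;  // layout hidden from clients
    std::unique_ptr<Impl> m_impl;
};

// --- what would live in the source file ---
struct AConfiguration::Impl {
    int a;
    int b;
};

AConfiguration::AConfiguration(int a, int b) : m_impl(new Impl{a, b}) {}
AConfiguration::~AConfiguration() = default;
AConfiguration::AConfiguration(const AConfiguration& other)
    : m_impl(new Impl(*other.m_impl)) {}
AConfiguration& AConfiguration::operator=(const AConfiguration& other) {
    *m_impl = *other.m_impl;  // deep copy of the hidden state
    return *this;
}
int AConfiguration::getA() const { return m_impl->a; }
int AConfiguration::getB() const { return m_impl->b; }
```

Clients only ever see one pointer-sized member, so Impl can gain or change fields without altering the layout of AConfiguration itself.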