RAII and uninitialized values - c++

Just a simple question:
if I had a simple vector class:
class Vector
{
public:
float x;
float y;
float z;
};
Doesn't the RAII concept apply here as well, i.e. to provide a constructor that initializes all members to some value (to prevent an uninitialized value from being used)?
EDIT: Or to provide a constructor that explicitly requires the user to initialize the member variables before the object can be instantiated.
i.e.
class Vector
{
public:
float x;
float y;
float z;
public:
Vector( float x_, float y_, float z_ )
: x( x_ ), y( y_ ), z( z_ )
{ /* Code to check pre-condition */ }
};
Should RAII be used to keep a programmer from forgetting to initialize a value before it's used, or is that the developer's responsibility?
Or is that the wrong way of looking at RAII?
I intentionally made this example ridiculously simple. My real question was to answer, for example, a composite class such as:
class VectorField
{
public:
Vector top;
Vector bottom;
Vector back;
// a lot more!
};
As you can see...if I had to write a constructor to initialize every single member, it's quite tedious.
Thoughts?

The "R" in RAII stands for Resource. Not everything is a resource.
Many classes, such as std::vector, are self-initializing. You don't need to worry about those.
POD types are not self initializing, so it makes sense to initialize them to some useful value.

Since the fields in your Vector class are built-in types, in order to ensure that they are initialized you'll have to do that in a constructor:
class Vector
{
public:
float x;
float y;
float z;
Vector() : x(0.0), y( 0.0), z( 0.0) {}
};
Now, if your fields were classes that were properly written, they should automatically initialize (and clean up, if necessary) by themselves.
In a way this is similar and related to RAII in that RAII means that resources (memory, handles, whatever) are acquired and cleaned up automatically by the object.

I wouldn't exactly say RAII applies here. Remember what the letters stand for: resource acquisition is initialization. You have no resources being acquired here, so RAII doesn't apply.
You could provide a default constructor to Vector; that would remove the need for you to explicitly initialize all the members of VectorField. The compiler would insert code to do that for you.
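A minimal sketch of that suggestion, assuming C++11 default member initializers are available. The class names mirror the question; the zero defaults and the is_zero() helper are assumptions made for illustration. Once Vector initializes itself, VectorField needs no hand-written constructor:

```cpp
#include <cassert>

// Each member has a default value, so a Vector never holds garbage.
class Vector
{
public:
    float x = 0.0f;
    float y = 0.0f;
    float z = 0.0f;

    Vector() = default;                     // applies the initializers above
    Vector(float x_, float y_, float z_)    // explicit values when wanted
        : x(x_), y(y_), z(z_) {}
};

// No constructor needed: the compiler-generated one default-constructs
// every Vector member, which zeroes it.
class VectorField
{
public:
    Vector top;
    Vector bottom;
    Vector back;
};

// Hypothetical helper used only to demonstrate the behaviour.
inline bool is_zero(const Vector& v)
{
    return v.x == 0.0f && v.y == 0.0f && v.z == 0.0f;
}

inline void check_field_is_zeroed()
{
    VectorField field;   // no initializer written by the caller
    assert(is_zero(field.top) && is_zero(field.bottom) && is_zero(field.back));
}
```

Adding another Vector member to VectorField then requires no further initialization code.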

You use the RAII pattern when you need to do explicit cleanup, and want that cleanup to occur at the same time as another object is implicitly cleaned up. This can occur for memory allocation/deallocation, critical section entry/exit, database connections, etc. In your example, the "floats" are cleaned up automatically so you don't need to worry about them. However, say you had the following function that you called to obtain vectors:
Vector* getMeAVector() {
Vector *v = new Vector();
// do something
return v;
}
And say it was the caller's responsibility to delete the returned vector. If you called this code the following way:
Vector *v = getMeAVector();
// do some stuff with v
delete v;
You'd have to remember to free the vector. If the "stuff" is a long bit of code, which may throw an exception, or have a bunch of return statements in there, you'd have to free the vector with every exit point. Even if you do it, the person who maintains the code by adding another "return" statement or calling some library that throws an exception may not. Instead, you could write a class like this:
class AutoVector
{
Vector *v_;
public:
AutoVector(Vector *v) : v_(v) {}
~AutoVector() { delete v_; }
};
Then, you could obtain the vector like so:
Vector *v = getMeAVector();
AutoVector av(v);
// do lots of complicated stuff including throwing exceptions, multiple returns, etc.
Then you don't have to worry about deleting the vector any more because when av goes out of scope it will be deleted automatically. You can write a little macro to make the "AutoVector av(v)" syntax a little nicer too, if you want.
This is a bit of a contrived example, but if the surrounding code is complicated, or if it can throw exceptions, or someone comes along and adds a "return" statement in the middle, it's nice that the "AutoVector" will free the memory automatically.
You can do the same thing with an "auto" class that enters a critical section in its ctor and exits in its dtor, etc.
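The critical-section variant could look roughly like this. CriticalSection here is a hypothetical stand-in that only counts enter/leave calls, and AutoLock and guarded_work are illustrative names; with a real std::mutex, the standard std::lock_guard plays the role of AutoLock.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for a real lock; it only counts enter/leave calls.
struct CriticalSection
{
    int depth = 0;
    void enter() { ++depth; }
    void leave() { --depth; }
};

// Enters the critical section in the constructor, leaves in the destructor.
class AutoLock
{
    CriticalSection& cs_;
public:
    explicit AutoLock(CriticalSection& cs) : cs_(cs) { cs_.enter(); }
    ~AutoLock() { cs_.leave(); }

    // Copying would leave the section twice, so forbid it.
    AutoLock(const AutoLock&) = delete;
    AutoLock& operator=(const AutoLock&) = delete;
};

// The section is left on every exit path, including the throw.
inline void guarded_work(CriticalSection& cs, bool fail)
{
    AutoLock lock(cs);
    assert(cs.depth == 1);
    if (fail)
        throw std::runtime_error("work failed");
}
```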

If you don't write a constructor, the compiler will generate a default constructor for you, but it leaves built-in members such as these floats uninitialized. Providing a default constructor yourself and initializing the values there is the best way to do this. I don't think it's too complicated to do that. Don't be too lazy :-)

Related

vector and primitive type initialisation

I've learned that if you declare for example an int in the global scope,
int x; //defaults to 0;
and in the local scope,
void f() {
int x; //undefined
}
However if we use a vector either in the global or local scope:
vector<int> v(3); //initialise v to {0,0,0} using int's default constructor.
We can default initialise int like vector's elements in the local scope by doing this:
int x = int(); //defaults to 0
I think if we use int's default constructor it's allocated in the heap.
Why can't a primitive type be default initialised in the local scope like T x;? Or
In the local scope, why do vectors (dunno about other containers) use the element's default constructor and not leave them uninitialised just like an int declaration?
What are the benefits of current approach on those two types? Why are they initialised in different ways? Is this about performance?
It's like this for "performance" reasons, because the C++ folks wanted the C folks back in the 1980's to not have any reason to complain about "paying for what we don't need." That's one of the tenets of C++, to not pay (run-time) costs for things you don't use. So the old-style POD types are uninitialized by default, though classes and structs with constructors always have one of those constructors called.
If I were specifying it today, I'd say that int x; in local scope would be default-initialized (to 0), and if you wanted to avoid that you could say something like int x = std::noinit;. It's far too late for this now, but I have actually done it in some class types when performance mattered a lot:
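class SuperFast
{
public:
struct no_init_t {};
static const no_init_t no_init; // tag object, defined once below
SuperFast() : x(0), y(0) {}
SuperFast(no_init_t) {} // deliberately leaves x and y uninitialized
private:
int x, y;
};
// goes in exactly one source file
const SuperFast::no_init_t SuperFast::no_init = {};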
This way, default construction will give a valid object, but if you have a serious reason to need to avoid this, you can. You might use this technique if you know you will soon overwrite a whole bunch of these objects anyway--no need to default-construct them:
SuperFast sf(SuperFast::no_init); // look ma, I saved two nanoseconds!
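For comparison, here are the initialization forms from the question side by side, as a small sketch (the function and variable names are made up for illustration). Note that int() creates no heap object: the value-initialized int lives wherever the variable itself lives.

```cpp
#include <cassert>
#include <vector>

int global_counter;             // static storage duration: zero-initialized

inline int value_initialized_local()
{
    int x = int();              // value-initialization: x is 0, on the stack
    return x;
}

inline int brace_initialized_local()
{
    int x{};                    // C++11 spelling of the same thing
    return x;
}

inline std::vector<int> three_zeros()
{
    std::vector<int> v(3);      // three value-initialized ints: {0, 0, 0}
    assert(v.size() == 3);
    return v;
}
```

A plain `int x;` in a function body is deliberately left out: reading it before assigning a value is undefined behaviour, so there is nothing meaningful to test.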

How to allocate an object with a complex constructor?

I think I know C++ reasonably well and I am thinking about implementing something a bit bigger than a "toy" program. I know the difference between stack- and heap-memory and the RAII-idiom.
Lets assume I have a simple class point:
class point {
public:
int x;
int y;
point(int x, int y) : x(x), y(y) {}
};
I would always allocate points on the stack, since the objects are small. Since on 64-bit machines sizeof(point) == sizeof(void*), if I am not wrong, I would go even further and pass points by value by default.
Now lets assume a more complex class battlefield, that I want to use in the class game:
class battlefield {
public:
battlefield(int w, int h, int start_x, int start_y, istream &in) {
// Complex generation of a battlefield from a file/network stream/whatever.
}
};
Since I really like RAII and the automatic cleanup when an object leaves the scope I am tempted to allocate the battlefield on the stack.
game::game(const settings &s) :
battlefield(s.read("w"), s.read("h"), gen_random_int(), gen_random_int(), gen_istream(s.read("level_number"))) {
// ...
}
But I have several problems now:
Since this class does not have a zero-argument constructor, I have to initialize it in the initialisation list of the class I use battlefield in. This is cumbersome since I need an istream from somewhere. This leads to the next problem.
The complex constructors "snowball" at some point. When I use battlefield in the game class and initialize it in the initialisation list of the constructor of game, the constructor of game becomes fairly complex too, and the initialisation of game itself might become cumbersome as well (when I decide to take the istream as an argument of the game constructor).
I need auxiliary functions to fill in complex parameters.
I see two solutions to this problem:
Either I create a simple constructor for battlefield that does not initialize the object. But this approach has the problem that I have a half-initialized object, aka an object that violates the RAII-idiom. Strange things might happen when calling methods on such an object.
game::game(const settings &s) {
random_gen r;
int x = r.random_int();
int y = r.random_int();
ifstream in(s.read("level_number"));
in.open();
this->battlefield.init(s.read("w"), s.read("h"), x, y, in);
// ...
}
Or I allocate battlefield on the heap in the game constructor. But I have to beware of exceptions in the constructor and I have to take care that the destructor deletes the battlefield.
game::game(const settings &s) {
random_gen r;
int x = r.random_int();
int y = r.random_int();
ifstream in(s.read("level_number"));
in.open();
this->battlefield = new battlefield(s.read("w"), s.read("h"), x, y, in);
// ...
}
I hope you can see the problem I am thinking of. Some questions that arise for me are:
Is there a design pattern for this situations I do not know?
What is the best practise in bigger C++ projects? Which objects are allocated on the heap, which ones are allocated on the stack? Why?
What is the general advice regarding the complexity of constructors? Is reading from a file too much for a constructor? (Since this problem mostly arises from the complex constructor.)
You could let your battlefield be constructed from settings:
explicit battlefield(const settings& s);
or alternatively, why not create a factory function for your battlefield?
E.g.
battlefield CreateBattlefield(const settings& s)
{
int w = s.read("w");
int h = s.read("h");
std::istream& in = s.genistream();
return battlefield(w, h, gen_random_int(), gen_random_int(), in);
}
game::game(const settings &s) :
battlefield(CreateBattlefield(s)) {
// ...
}
But this approach has the problem that I have a half-initialized object, aka an object that violates the RAII-idiom.
That is not RAII. The concept is that you use objects to manage resources. When you acquire a resource such as heap memory, a semaphore, or a file handle, you transfer ownership to a resource-managing class. This is what smart pointers in C++ are meant for: use unique_ptr if you want sole ownership of the object, or shared_ptr if you want multiple pointers to share ownership.
Or I allocate battlefield on the heap in the game constructor. But I have to beware of exceptions in the constructor and I have to take care that the destructor deletes the battlefield.
If your constructor throws an exception, then the destructor of the object would not be called and you might end up in a half-cooked object. In this case, you have to remember what allocations you did in the constructor before the exception was thrown and deallocate all those. Again smart pointers will help automatic cleaning of resources. See this faq
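As a rough sketch of the smart-pointer approach, the game class below owns its battlefield through a std::unique_ptr (std::make_unique needs C++14). The settings struct, the live counter and the fail_late flag are simplified assumptions made for the demonstration, not the question's real interfaces.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>

// Simplified stand-in for the question's settings class (assumption).
struct settings
{
    int w;
    int h;
    std::string level_data;
};

class battlefield
{
public:
    static int live;   // counts existing battlefields, for demonstration only

    battlefield(int w, int h, int start_x, int start_y, std::istream& in)
        : w_(w), h_(h), start_x_(start_x), start_y_(start_y)
    {
        assert(w > 0 && h > 0);   // pre-condition check
        in >> name_;              // stands in for the "complex generation"
        ++live;
    }
    ~battlefield() { --live; }
    battlefield(const battlefield&) = delete;
    battlefield& operator=(const battlefield&) = delete;

    int width() const { return w_; }
    int height() const { return h_; }
    int start_x() const { return start_x_; }
    int start_y() const { return start_y_; }
    const std::string& name() const { return name_; }

private:
    int w_, h_, start_x_, start_y_;
    std::string name_;
};
int battlefield::live = 0;

class game
{
public:
    // fail_late is a hypothetical switch that simulates a later failure.
    explicit game(const settings& s, bool fail_late = false)
    {
        std::istringstream in(s.level_data);   // helper objects can be locals
        field_ = std::make_unique<battlefield>(s.w, s.h, 0, 0, in);
        if (fail_late)
            throw std::runtime_error("late failure");
    }
    const battlefield& field() const { return *field_; }

private:
    std::unique_ptr<battlefield> field_;       // owns the battlefield
};
```

If the constructor of game throws after the battlefield exists, the already constructed member field_ is destroyed during unwinding and deletes the battlefield, so game needs neither a try block nor a hand-written destructor.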
Which objects are allocated on the heap, which ones are allocated on the stack? Why?
Try to allocate objects on the stack whenever possible. Your objects then live only within the scope of that block. If that is not possible, go for heap allocation, e.g. when you only know the size at runtime or the object is too big to sit on the stack.

Allocating memory without initializing it in C++

I'm getting acquainted with C++, and I'm having a problem with memory management. In C, whenever I'd want to reserve memory for any number of elements, regardless of type, I would just call malloc() and then initialize by hand (through a loop), to whichever value I wanted. With C++'s new, everything is automagically initialized.
Problem is, I've got a BattlePoint class which goes a little something like this:
class BattlePoint {
public:
BattlePoint(int x, int y) : x(x), y(y) { };
bool operator==(const BattlePoint &right);
virtual ~BattlePoint();
private:
int x, y;
};
As you can see, it takes a few x and y values through the initializer and then sets its own x and y from it. The problem is, this function will be called from a function which will allocate an array of them:
BattleShip::BattleShip(BattlePoint start, enum shipTypeSize size, enum shipOrientation orientation) : size(size), orientation(orientation) {
points = new BattlePoint[size]; // Here be doubts.
}
So, I need my BattleShip's point to hold an array of BattlePoints, each one with different initialization values (such as 0,1; 0,2; 0,3, etcetera).
Question is: how could I allocate my memory uninitialized?
Julian,
P.S.: I haven't done any testing regarding the way new works, I simply read Wikipedia's article on it which says:
In the C++ programming language, as well as in many C++-based
languages, new is a language construct that dynamically allocates
memory on the heap and initialises the memory using the
constructor. Except for a form called the "placement new", new
attempts to allocate enough memory on the heap for the new data. If
successful, it initialises the memory and returns the address to the
newly allocated and initialised memory. However if new cannot allocate
memory on the heap it will throw an exception of type std::bad_alloc.
This removes the need to explicitly check the result of an allocation.
A call to delete, which calls the destructor and returns the memory
allocated by new back to the heap, must be made for every call to new
to avoid a memory leak.
placement new should be the solution, yet it makes no mention on how to do it.
P.S. 2: I know this can be done through stdlib's vector class, but I'm avoiding it on purpose.
You need to use a std::vector. In this case you can push_back whatever you want, e.g.
std::vector<BattlePoint> x;
x.push_back(BattlePoint(1, 2));
If you ever find yourself using new[], delete, or delete[], refactor your program immediately to remove such. They are hideously unsafe in virtually every way imaginable. Instead, use resource-managing classes, such as std::unique_ptr, std::vector, and std::shared_ptr.
Regular new can be useful in some situations involving unique_ptr, but otherwise avoid it. In addition, placement new is usually not worth it. Of course, if you're writing a resource-managing class, then you may have to use them as underlying primitives, but such cases are few and far between.
Edit: My mistake, I didn't see the very last line of your question. Addressing it:
P.S. 2: I know this can be done through stdlib's vector class, but I'm
avoiding it on purpose.
If you have some campaign against the Standard Library, then roll your own vector replacement. But do not go without a vector class. There's a reason that it must be provided by all conforming compilers.
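Applied to the ship from the question, the vector approach might look like the sketch below. BattlePoint is trimmed down and given accessors, and make_ship_points and its layout along y are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Trimmed-down BattlePoint; the accessors are added for the example.
class BattlePoint
{
public:
    BattlePoint(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }
private:
    int x_, y_;
};

// Builds `size` consecutive points starting at `start`, laid out along y.
// No default constructor is needed: each element is constructed exactly
// once, directly with its final values.
inline std::vector<BattlePoint> make_ship_points(BattlePoint start, std::size_t size)
{
    std::vector<BattlePoint> points;
    points.reserve(size);                       // allocates, constructs nothing
    for (std::size_t i = 0; i < size; ++i)
        points.emplace_back(start.x(), start.y() + static_cast<int>(i));
    assert(points.size() == size);
    return points;
}
```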
points = new BattlePoint[size]; // Here be doubts.
P.S. 2: I know this can be done through stdlib's vector class, but I'm avoiding it on purpose.
Most certainly there will be doubts! Use std::vector. Why wouldn't you? There is no reason not to use std::vector, especially if it solves your problem.
std::vector<BattlePoint> bpoints;
bpoints.reserve(size); // there, only alloc'd memory, not initialized it.
bpoints.push_back(some_point); // still need to use push_back to initialize it
I'm sure the question will come - how does std::vector only alloc the memory?!
operator new is the answer. It's the operator that gets called for memory allocation when you use new. new is for construction and initialization, while operator new is for allocation (that's why you can overload it).
BattlePoint* bpoints = static_cast<BattlePoint*>(::operator new(size * sizeof(BattlePoint))); // happens in reserve
new (&bpoints[index]) BattlePoint(some_x, some_y); // happens in push_back
The comp.lang.c++ FAQ has useful things to say on the matter, including attempting to dissuade you from using placement new - but if you really insist, it does have a useful section on placement new and all its pitfalls.
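For completeness, the whole life cycle of manually constructed objects can be sketched as follows. Point and sum_of_y are hypothetical names, and this is a simplified illustration rather than what any particular std::vector implementation does. In particular it is not exception-safe: if a constructor threw halfway through, the elements built so far would have to be destroyed by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Point   // hypothetical element type without a default constructor
{
    Point(int x_, int y_) : x(x_), y(y_) {}
    int x, y;
};

// Allocates raw storage for `count` Points, constructs each in place,
// sums the y values, then destroys and frees everything by hand.
inline int sum_of_y(std::size_t count)
{
    // 1. Allocate only; no constructor runs here.
    void* raw = ::operator new(count * sizeof(Point));
    assert(raw != nullptr);   // operator new throws rather than return null
    Point* points = static_cast<Point*>(raw);

    // 2. Construct each element in the raw storage with placement new.
    for (std::size_t i = 0; i < count; ++i)
        new (points + i) Point(0, static_cast<int>(i) + 1);

    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
        sum += points[i].y;

    // 3. Destroy in reverse order, then release the storage.
    for (std::size_t i = count; i > 0; --i)
        points[i - 1].~Point();
    ::operator delete(raw);
    return sum;
}
```

The explicit destructor calls and the matching ::operator delete are the part that is easy to forget, which is the main argument for leaving this to std::vector.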
To echo the above answers, I would most certainly point you towards std::vector as it is the best possible solution. Managing your own dynamic arrays in C++ is almost never a good idea, and is almost never necessary.
However, to answer the direct question -- in this situation you can create a default constructor and some mutators to get the desired effect:
class BattlePoint {
public:
// default constructor, default initialize to 0,0
BattlePoint() : x(0), y(0) {};
BattlePoint(int x, int y) : x(x), y(y) { };
bool operator==(const BattlePoint &right);
virtual ~BattlePoint();
// mutator functions allow you to modify the classes member values
void set_x(int x_) {x = x_;}
void set_y(int y_) {y = y_;}
private:
int x, y;
};
Then you can initialize this as you are used to in C:
BattlePoint* points = new BattlePoint[100];
for(int x = 0; x < 100; ++x)
{
points[x].set_x(x);
points[x].set_y(x * 2);
}
If you're bothered by basically making the BattlePoint class publicly mutable, you can keep the mutators private and introduce a friend function specifically for initializing the values. This is a slightly more involved concept, so I'll forgo further explanation on this for now, unless it is needed.
Since you asked :)
Create your BattlePoint class again with a default constructor and mutators, however this time leave the mutators private, and declare a friend function to use them:
class BattlePoint {
public:
// default constructor, default initialize to 0,0
BattlePoint() : x(0), y(0) {};
BattlePoint(int x, int y) : x(x), y(y) { };
bool operator==(const BattlePoint &right);
virtual ~BattlePoint();
private:
// mutator functions allow you to modify the classes member values
void set_x(int x_) {x = x_;}
void set_y(int y_) {y = y_;}
int x, y;
friend void do_initialize_x_y(BattlePoint*, int, int);
};
Create a header file that will contain a local function for creating the array of BattlePoint objects. This function will be available to anyone that includes the header, but if named properly then "everyone" should know not to use it.
// BattlePoint_Initialize.h
BattlePoint* create_battle_point_array(size_t count, int* x, int* y);
This function gets defined in the implementation file, along with our friend function that we will "hide" from the outside world:
// BattlePoint_Initialize.cpp
#include "BattlePoint_Initialize.h"
namespace
{
// by putting this function in an anonymous namespace it is only available
// to this compilation unit. This function can only be called from within
// this particular file.
//
// technically, the symbols are still exported, but they are mangled badly
// so someone could call this, but they would have to really try to do it
// not something that could be done "by accident"
void do_initialize_x_y(BattlePoint* bp, int x, int y)
{
bp->set_x(x);
bp->set_y(y);
}
}
// caution, relies on the assumption that count indicates the number of
// BattlePoint objects to be created, as well as the number of valid entries
// in the x and y arrays
BattlePoint* create_battle_point_array(size_t count, int* x, int* y)
{
BattlePoint* bp_array = new BattlePoint[count];
for(size_t curr = 0; curr < count; ++curr)
{
do_initialize_x_y(&bp_array[curr], x[curr], y[curr]);
}
return bp_array;
}
So there you have it. A very convoluted way to meet your basic requirements.
While create_battle_point_array() could in theory be called anywhere, it's actually not capable of modifying an already created BattlePoint object. The do_initialize_x_y() function, by nature of being hidden in an anonymous namespace tucked away behind the initialization code, cannot easily be called from anywhere else in your program. In effect, once a BattlePoint object has been created (and initialized in two steps), it cannot be modified further.

How to store different data types in one list? (C++)

I need to store a list of various properties of an object. Property consists of a name and data, which can be of any datatype.
I know I can make a class "Property", and extend it with different PropertySubClasses which only differ with the datatype they are storing, but it does not feel right.
class Property
{
Property(std::string name);
virtual ~Property();
std::string m_name;
};
class PropertyBoolean : Property
{
PropertyBoolean(std::string name, bool data);
bool m_data;
};
class PropertyFloat : Property
{
PropertyFloat(std::string name, float data);
float m_data;
};
class PropertyVector : Property
{
PropertyVector(std::string name, std::vector<float> data);
std::vector<float> m_data;
};
Now I can store all kinds of properties in a
std::vector<Property*>
and to get the data, I can cast the object to the subclass. Or I can make a pure virtual function that does something with the data inside the function, without the need for casting.
Anyway, it does not feel right to create these different kinds of subclasses that differ only in the data type they store. Is there any other convenient way to achieve similar behavior?
I do not have access to Boost.
C++ is a multi-paradigm language. It shines brightest and is most powerful where paradigms are mixed.
class Property
{
public:
Property(const std::string& name) //note: we don't lightly copy strings in C++
: m_name(name) {}
virtual ~Property() {}
private:
std::string m_name;
};
template< typename T >
class TypedProperty : public Property
{
public:
TypedProperty (const std::string& name, const T& data)
: Property(name), m_data(data) {}
private:
T m_data;
};
typedef std::vector< std::shared_ptr<Property> > property_list_type;
Edit: Why use std::shared_ptr<Property> instead of Property*?
Consider this code:
void f()
{
std::vector<Property*> my_property_list;
for(unsigned int u=0; u<10; ++u)
my_property_list.push_back(new Property("property")); // placeholder name
use_property_list(my_property_list);
for(std::vector<Property*>::iterator it=my_property_list.begin();
it!=my_property_list.end(); ++it)
delete *it;
}
That for loop there attempts to cleanup, deleting all the properties in the vector, just before it goes out of scope and takes all the pointers with it.
Now, while this might seem fine to a novice, if you're even a mildly experienced C++ developer, that code should raise alarm bells as soon as you look at it.
The problem is that the call to use_property_list() might throw an exception. If so, the function f() will be left right away. In order to properly cleanup, the destructors for all automatic objects created in f() will be called. That is, my_property_list will be properly destroyed. std::vector's destructor will then nicely cleanup the data it holds. However, it holds pointers, and how should std::vector know whether these pointers are the last ones referencing their objects?
Since it doesn't know, it won't delete the objects, it will only destroy the pointers when it destroys its content, leaving you with objects on the heap that you don't have any pointers to anymore. This is what's called a "leak".
In order to avoid that, you would need to catch all exceptions, clean up the properties, and then rethrow the exception. But then, ten years from now, someone has to add a new feature to the 10MLoC application this has grown to, and, being in a hurry, adds code which leaves that function prematurely when some condition holds. The code is tested and it works and doesn't crash - only the server it's part of now leaks a few bytes an hour, making it crash due to being out of memory about once a week. Finding that makes for many hours of fine debugging.
Bottom line: Never manage resources manually, always wrap them in objects of a class designed to handle exactly one instance of such a resource. For dynamically allocated objects, those handles are called "smart pointer", and the most used one is shared_ptr.
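Here is how f() might look with the shared_ptr-based list, as a self-contained sketch. The live counter, the accessors and the always-throwing use_property_list() are assumptions added so the cleanup can be observed.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

class Property
{
public:
    static int live;   // counts existing properties, for demonstration only

    explicit Property(const std::string& name) : m_name(name) { ++live; }
    virtual ~Property() { --live; }
    Property(const Property&) = delete;
    Property& operator=(const Property&) = delete;
    const std::string& name() const { return m_name; }
private:
    std::string m_name;
};
int Property::live = 0;

template <typename T>
class TypedProperty : public Property
{
public:
    TypedProperty(const std::string& name, const T& data)
        : Property(name), m_data(data) {}
    const T& data() const { return m_data; }
private:
    T m_data;
};

typedef std::vector<std::shared_ptr<Property> > property_list_type;

// Hypothetical consumer that fails, as use_property_list() might.
inline void use_property_list(const property_list_type& list)
{
    assert(!list.empty());
    throw std::runtime_error("use_property_list failed");
}

// No cleanup loop: the vector's destructor releases every shared_ptr,
// and each shared_ptr deletes its Property, even during stack unwinding.
inline void f()
{
    property_list_type my_property_list;
    my_property_list.push_back(std::make_shared<TypedProperty<int> >("answer", 42));
    my_property_list.push_back(std::make_shared<TypedProperty<bool> >("flag", true));
    use_property_list(my_property_list);
}
```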
A lower-level way is to use a union
class Property {
public:
union {
int int_data;
bool bool_data;
std::string* string_data;
};
enum { INT_PROP, BOOL_PROP, STRING_PROP } data_type;
// ... more smarts ...
};
Dunno why your other solution doesn't feel right, so I don't know if this way would feel better to you.
EDIT: Some more code to give an example of usage.
Property car = collection_of_properties.head();
if (car.data_type == Property::INT_PROP) {
printf("The integer property is %d\n", car.int_data);
} // etc.
I'd probably put that sort of logic into a method of the class where possible. You'd also have members such as this constructor to keep the data and type field in sync:
Property::Property(bool value) {
bool_data = value;
data_type = BOOL_PROP;
}
I suggest boost::variant or boost::any. [Related question]
Write a template class Property<T> that derives from Property with a data member of type T
Another possible solution is to write an intermediate class managing the pointers to Property classes:
class Bla {
private:
Property* mp;
public:
explicit Bla(Property* p) : mp(p) { }
~Bla() { delete mp; }
// The standard copy constructor
// and assignment operator
// aren't sufficient in this case:
// They would only copy the
// pointer mp (shallow copy)
Bla(const Bla& b) : mp(b.mp->clone()) { }
Bla& operator = (Bla b) { // copy'n'swap trick
swap(b);
return *this;
}
void swap(Bla& b) {
using std::swap; // #include <algorithm>
swap(mp, b.mp);
}
Property* operator -> () const {
return mp;
}
Property& operator * () const {
return *mp;
}
};
You have to add a virtual clone method to your classes returning a pointer to a newly created copy of itself:
class StringProperty : public Property {
// ...
public:
// ...
virtual Property* clone() { return new StringProperty(*this); }
// ...
};
Then you'll be able to do this:
std::vector<Bla> v;
v.push_back(Bla(new StringProperty("Name", "Jon Doe")));
// ...
std::vector<Bla>::const_iterator i = v.begin();
(*i)->some_virtual_method();
Leaving the scope of v means that all Blas will be destroyed freeing automatically the pointers they're holding. Due to its overloaded dereferencing and indirection operator the class Bla behaves like an ordinary pointer. In the last line *i returns a reference to a Bla object and using -> means the same as if it was a pointer to a Property object.
A possible drawback of this approach is that you always get a heap operation (a new and a delete) if the intermediate objects must be copied around. This happens for example if you exceed the vector's capacity and all intermediate objects must be copied to a new piece of memory.
In the new standard (i.e. c++0x) you'll be able to use the unique_ptr template: It
can be used inside the standard containers (in contrast to the auto_ptr which must not be used in the standard containers),
offers the usually faster move semantics (it can easily be passed around) and
takes care of the held pointers (it frees them automatically).
I see that there are lots of shots at trying to solve your problem by now, but I have a feeling that you're looking in the wrong end - why do you actually want to do this in the first place? Is there some interesting functionality in the base class that you have omitted to specify?
The fact that you'd be forced to switch on a property type id to do what you want with a specific instance is a code smell, especially when the subclasses have absolutely nothing in common via the base class other than a name (which is the type id in this case).
Starting with C++17 we have std::variant and std::any.
std::variant
An instance of std::variant at any given time either holds a value of one of its alternative types, or in the case of error - no value.
std::any
The class any describes a type-safe container for single values of any copy constructible type.
An object of class any stores an instance of any type that satisfies the constructor requirements or is empty, and this is referred to as the state of the class any object. The stored instance is called the contained object. Two states are equivalent if they are either both empty or if both are not empty and if the contained objects are equivalent.
The non-member any_cast functions provide type-safe access to the contained object.
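A small sketch of the std::variant route for the property list, assuming a C++17 compiler and that the set of value types is known up front. The Property struct, the Describe visitor and describe_all() are illustrative names, not part of the standard library.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// A property is a name plus one value out of a fixed set of types.
using PropertyValue = std::variant<bool, float, std::vector<float>>;

struct Property
{
    std::string name;
    PropertyValue value;
};

// Visitor that describes whichever alternative is currently stored.
struct Describe
{
    std::string operator()(bool b) const { return b ? "bool:true" : "bool:false"; }
    std::string operator()(float) const { return "float"; }
    std::string operator()(const std::vector<float>& v) const
    {
        return "vector of " + std::to_string(v.size());
    }
};

inline std::vector<std::string> describe_all(const std::vector<Property>& props)
{
    std::vector<std::string> out;
    for (const Property& p : props)
        out.push_back(p.name + "=" + std::visit(Describe{}, p.value));
    assert(out.size() == props.size());
    return out;
}
```

Unlike the union with a hand-maintained type tag, the variant tracks the active type itself, and std::visit fails to compile if the visitor does not handle one of the alternatives.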
You can probably do this with the Boost library, or you could create a class with a type code and a void pointer to the data, but it would mean giving up some of the type safety of C++. In other words, if you have a property "foo", whose value is an integer, and give it a string value instead, the compiler will not find the error for you.
I would recommend revisiting your design, and re-evaluating whether or not you really need so much flexibility. Do you really need to be able to handle properties of any type? If you can narrow it down to just a few types, you may be able to come up with a solution using inheritance or templates, without having to "fight the language".

Lazy/multi-stage construction in C++

What's a good existing class/design pattern for multi-stage construction/initialization of an object in C++?
I have a class with some data members which should be initialized in different points in the program's flow, so their initialization has to be delayed. For example one argument can be read from a file and another from the network.
Currently I am using boost::optional for the delayed construction of the data members, but it's bothering me that optional is semantically different than delay-constructed.
What I need resembles features of boost::bind and lambda partial function application, and using these libraries I can probably design multi-stage construction - but I prefer using existing, tested classes. (Or maybe there's another multi-stage construction pattern which I am not familiar with).
The key issue is whether or not you should distinguish completely populated objects from incompletely populated objects at the type level. If you decide not to make a distinction, then just use boost::optional or similar as you are doing: this makes it easy to get coding quickly. OTOH you can't get the compiler to enforce the requirement that a particular function requires a completely populated object; you need to perform run-time checking of fields each time.
Parameter-group Types
If you do distinguish completely populated objects from incompletely populated objects at the type level, you can enforce the requirement that a function be passed a complete object. To do this I would suggest creating a corresponding type XParams for each relevant type X. XParams has boost::optional members and setter functions for each parameter that can be set after initial construction. Then you can force X to have only one (non-copy) constructor, that takes an XParams as its sole argument and checks that each necessary parameter has been set inside that XParams object. (Not sure if this pattern has a name -- anybody like to edit this to fill us in?)
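The parameter-group idea could be sketched like this. The example swaps in std::optional (C++17) for boost::optional, and the name and port fields, the sanity check in setPort() and the endpoint() function are invented for illustration.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Collects parameters as they become available.
class XParams
{
public:
    void setName(const std::string& name) { name_ = name; }
    void setPort(int port)
    {
        assert(port >= 0);   // hypothetical sanity check
        port_ = port;
    }

    bool isComplete() const { return name_.has_value() && port_.has_value(); }

    const std::optional<std::string>& name() const { return name_; }
    const std::optional<int>& port() const { return port_; }

private:
    std::optional<std::string> name_;   // e.g. read from a file
    std::optional<int> port_;           // e.g. received from the network
};

// Can only exist fully populated: the sole constructor validates XParams.
class X
{
public:
    explicit X(const XParams& p)
    {
        if (!p.isComplete())
            throw std::domain_error("XParams is missing a required field");
        name_ = *p.name();
        port_ = *p.port();
    }
    const std::string& name() const { return name_; }
    int port() const { return port_; }

private:
    std::string name_;
    int port_ = 0;
};

// A function that demands a complete object simply takes an X.
inline std::string endpoint(const X& x)
{
    return x.name() + ":" + std::to_string(x.port());
}
```

The run-time check happens once, in the constructor of X; every function taking an X can then rely on all fields being set.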
"Partial Object" Types
This works wonderfully if you don't really have to do anything with the object before it is completely populated (perhaps other than trivial stuff like get the field values back). If you do have to sometimes treat an incompletely populated X like a "full" X, you can instead make X derive from a type XPartial, which contains all the logic, plus protected virtual methods for performing precondition tests that test whether all necessary fields are populated. Then if X ensures that it can only ever be constructed in a completely-populated state, it can override those protected methods with trivial checks that always return true:
class XPartial {
optional<string> name_;
public:
void setName(string x) { name_.reset(x); } // Can add getters and/or ctors
string makeGreeting(string title) {
if (checkMakeGreeting_()) { // Is it safe?
return string("Hello, ") + title + " " + *name_;
} else {
throw domain_error("ZOINKS"); // Or similar
}
}
bool isComplete() const { return checkMakeGreeting_(); } // All tests here
protected:
virtual bool checkMakeGreeting_() const { return !!name_; } // Populated?
};
class X : public XPartial {
X(); // Forbid default-construction; or, you could supply a "full" ctor
public:
explicit X(XPartial const& x) : XPartial(x) { // Avoid implicit conversion
if (!x.isComplete()) throw domain_error("ZOINKS");
}
X& operator=(XPartial const& x) {
if (!x.isComplete()) throw domain_error("ZOINKS");
return static_cast<X&>(XPartial::operator=(x));
}
protected:
virtual bool checkMakeGreeting_() const { return true; } // No checking needed! (const is required to override the base version)
};
Although it might seem the inheritance here is "back to front", doing it this way means that an X can safely be supplied anywhere an XPartial& is asked for, so this approach obeys the Liskov Substitution Principle. This means that a function can use a parameter type of X& to indicate it needs a complete X object, or XPartial& to indicate it can handle partially populated objects -- in which case either an XPartial object or a full X can be passed.
Originally I had isComplete() as protected, but found this didn't work since X's copy ctor and assignment operator must call this function on their XPartial& argument, and they don't have sufficient access. On reflection, it makes more sense to publicly expose this functionality.
I must be missing something here - I do this kind of thing all the time. It's very common to have objects that are big and/or not needed by a class in all circumstances. So create them dynamically!
struct Big {
char a[1000000];
};
class A {
public:
A() : big(0) {}
~A() { delete big; }
void f() {
makebig();
big->a[42] = 66;
}
private:
Big * big;
void makebig() {
if ( ! big ) {
big = new Big;
}
}
};
I don't see the need for anything fancier than that, except that makebig() should probably be const (and maybe inline), and the Big pointer should probably be mutable. And of course A must be able to construct Big, which may in other cases mean caching the contained class's constructor parameters. You will also need to decide on a copying/assignment policy - I'd probably forbid both for this kind of class.
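Those adjustments could look roughly like the sketch below. It assumes C++14 and swaps the raw pointer for std::unique_ptr so the hand-written destructor goes away; the set, get and constructed members are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct Big {
    char a[1000000];
};

class A {
public:
    A() = default;
    // Forbid copying and assignment for this kind of class.
    A(const A&) = delete;
    A& operator=(const A&) = delete;

    void set(std::size_t i, char value) { makebig().a[i] = value; }

    // A const reader can still trigger construction because big_ is mutable.
    char get(std::size_t i) const { return makebig().a[i]; }

    bool constructed() const { return big_ != nullptr; }

private:
    // Builds the Big object on first use and returns it on every later call.
    Big& makebig() const {
        if (!big_) {
            big_ = std::make_unique<Big>();  // value-initialized: all bytes zero
        }
        assert(big_);
        return *big_;
    }

    mutable std::unique_ptr<Big> big_;
};
```

Note that this version is not thread-safe: two threads calling get() on the same fresh object could both try to construct Big.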
I don't know of any patterns to deal with this specific issue. It's a tricky design question, and one somewhat unique to languages like C++. Another issue is that the answer to this question is closely tied to your individual (or corporate) coding style.
I would use pointers for these members, and when they need to be constructed, allocate them at the same time. You can use auto_ptr for these (or std::unique_ptr in C++11 and later, since auto_ptr was deprecated in C++11 and removed in C++17), and check against NULL to see if they are initialized. (I think of pointers as a built-in "optional" type in C/C++/Java; there are other languages where NULL is not a valid pointer.)
One issue as a matter of style is that you may be relying on your constructors to do too much work. When I'm coding OO, I have the constructors do just enough work to get the object in a consistent state. For example, if I have an Image class and I want to read from a file, I could do this:
image = new Image("unicorn.jpeg"); /* I'm not fond of this style */
or, I could do this:
image = new Image(); /* I like this better */
image->read("unicorn.jpeg");
It can get difficult to reason about how a C++ program works if the constructors have a lot of code in them, especially if you ask the question, "what happens if a constructor fails?" This is the main benefit of moving code out of the constructors.
I would have more to say, but I don't know what you're trying to do with delayed construction.
Edit: I remembered that there is a (somewhat perverse) way to call a constructor on an object at any arbitrary time. Here is an example:
class Counter {
public:
Counter(int &cref) : c(cref) { }
void incr(int x) { c += x; }
private:
int &c;
};
void dontTryThisAtHome() {
int i = 0, j = 0;
Counter c(i); // Call constructor first time on c
c.incr(5); // now i = 5
new(&c) Counter(j); // Call the constructor AGAIN on c
c.incr(3); // now j = 3
}
Note that doing something as reckless as this might earn you the scorn of your fellow programmers, unless you've got solid reasons for using this technique. This also doesn't delay the constructor, just lets you call it again later.
Using boost.optional looks like a good solution for some use cases. I haven't played much with it so I can't comment much. One thing I keep in mind when dealing with such functionality is whether I can use overloaded constructors instead of default and copy constructors.
When I need such functionality I would just use a pointer to the type of the necessary field like this:
public:
MyClass() : field_(0) { } // constructor, additional initializers and code omitted
~MyClass() {
if (field_)
delete field_; // free the constructed object only if initialized
}
...
private:
...
field_type* field_;
Next, instead of using the pointer directly, I would access the field through the following method:
private:
...
field_type& field() {
if (!field_)
field_ = new field_type(...);
return *field_; // dereference: the method returns a reference, not a pointer
}
I have omitted const-access semantics.
The easiest way I know is similar to the technique suggested by Dietrich Epp, except it allows you to truly delay the construction of an object until a moment of your choosing.
Basically: reserve the object using malloc instead of new (thereby bypassing the constructor), then call the overloaded new operator when you truly want to construct the object via placement new.
Example:
Object *x = (Object *) malloc(sizeof(Object));
//Use the object member items here. Be careful: no constructors have been called!
//This means you can assign values to ints, structs, etc... but nested objects can wreak havoc!
//Now we want to call the constructor of the object
new(x) Object(params);
//However, you must remember to also manually call the destructor!
x->~Object();
free(x);
//Note: if malloc and new in your development stack happen to
//allocate from the same heap, calling delete x instead of the
//destructor followed by free may appear to work, but the above is
//the correct way of doing it
Personally, the only time I've ever used this syntax was when I had to use a custom C-based allocator for C++ objects. As Dietrich suggests, you should question whether you really, truly must delay the constructor call. The base constructor should perform the bare minimum to get your object into a serviceable state, whilst other overloaded constructors may perform more work as needed.
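If you do go down this road, it may be worth wrapping the placement new and the explicit destructor call in a small helper so they cannot get out of sync. The sketch below is one way to do that, assuming C++11 or later; the Delayed name and its members are hypothetical, and it uses an aligned buffer inside the object instead of malloc, which is a different storage choice from the example above.

```cpp
#include <cassert>
#include <new>
#include <string>
#include <utility>

// Hypothetical helper: reserves storage for a T but constructs it only on request.
template <typename T>
class Delayed {
public:
    Delayed() = default;
    Delayed(const Delayed&) = delete;
    Delayed& operator=(const Delayed&) = delete;
    ~Delayed() { destroy(); }

    // Runs T's constructor in the reserved storage (placement new).
    template <typename... Args>
    T& construct(Args&&... args) {
        destroy();  // never construct over a live object
        ptr_ = ::new (static_cast<void*>(storage_)) T(std::forward<Args>(args)...);
        return *ptr_;
    }

    // Runs T's destructor explicitly; the storage itself stays reserved.
    void destroy() {
        if (ptr_) {
            ptr_->~T();
            ptr_ = nullptr;
        }
    }

    bool constructed() const { return ptr_ != nullptr; }

    T& get() {
        assert(ptr_);  // calling get() before construct() is a bug
        return *ptr_;
    }

private:
    alignas(T) unsigned char storage_[sizeof(T)];  // raw, suitably aligned bytes
    T* ptr_ = nullptr;                             // non-null only while a T is alive
};
```

With Delayed<std::string> d; nothing is constructed until d.construct(3, 'x') is called, and the destructor runs either on d.destroy() or when d goes out of scope. This is essentially what boost::optional does for you already, so treat it as an illustration of the mechanism rather than something to prefer over the library type.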
I don't know if there's a formal pattern for this. In places where I've seen it, we called it "lazy", "demand" or "on demand".