I can initialize a member variable of a class with a constructor, of course, but I can also do it by creating a variable of the class type and then initializing its members through a function. So I would like to know whether using a constructor has anything to do with reducing memory usage or improving run-time performance so that the application is more responsive.
Here is an example that I found on the internet. I am trying to understand the real use of a constructor in a C++ program as well as in real life.
#include <iostream>
using namespace std;
class Line
{
public:
void setLength( double len );
double getLength( void );
Line();
private:
double length;
};
Line::Line(void)
{
cout << "Object is being created" << endl;
}
void Line::setLength( double len )
{
length = len;
}
double Line::getLength( void )
{
return length;
}
int main( )
{
Line line;
line.setLength(6.0);
cout << "Length of line : " << line.getLength() <<endl;
return 0;
}
In real life, in larger nontrivial projects, you forget to call your own initializers, leaving the object in an uninitialized state. But you can't forget to call a constructor.
A constructor provides a syntax that can guarantee that your object is completely initialized when it is created.
Bugs abound when an object requires additional function calls after it is created before it is fully initialized.
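A minimal sketch of that guarantee, reworking the Line class from the question so that it cannot exist without a length. The explicit one-argument constructor and the rejection of negative lengths are assumptions added for illustration; they are not part of the original example.

```cpp
#include <stdexcept>

// Hypothetical rework of the question's Line class: there is no default
// constructor, so a Line cannot be created without a length.
class Line {
public:
    explicit Line(double len) : length_(checked(len)) {}

    void setLength(double len) { length_ = checked(len); }
    double getLength() const { return length_; }

private:
    // Assumed invariant for this sketch: a length is never negative.
    static double checked(double len) {
        if (len < 0.0) throw std::invalid_argument("Line: negative length");
        return len;
    }

    double length_;
};
```

With this version, a bare "Line line;" no longer compiles, and "Line line(6.0);" is usable immediately.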
There are reasons to favor a constructor over a method, and often, these reasons depend on the context. Let's first consider the most obvious reason to favor constructors: immutable objects. An object that's considered to be immutable will never change its state after its initial construction (where it's declared and constructed). Often, such immutable objects keep their member variables private and allow you to create other instances of them based on various functions (e.g., a string's substring routine).
On the other hand, objects which go through various state changes throughout their life (e.g., a GameBoard) may require client calls to modify their state. However, even these objects will be initialized somehow (e.g., think Chess, Checkers, Sudoku, etc...) and should require a basic constructor to ensure that they start from a "sane" initial state.
Methods can initialize objects, but this is not always legal. For instance, a member variable that is a reference must be initialized in the constructor's initializer list, as must const members and members whose types require constructor arguments.
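As a small illustration of that last point, here is a sketch of a class with a reference member and a const member. The class and its names are hypothetical; the point is only that neither member could be set later from an ordinary method.

```cpp
// Hypothetical class for illustration: both members below can only be
// set in the constructor's member initializer list.
class HitCounter {
public:
    HitCounter(int& total, int step) : total_(total), step_(step) {}

    void hit() { total_ += step_; }
    int step() const { return step_; }

private:
    int& total_;      // a reference must be bound when the object is created
    const int step_;  // a const member cannot be assigned to afterwards
};
```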
From a performance perspective, I don't know how I would resolve this aspect of the question... but writing clean and clear code always makes other programmers reading your code efficient and, dare-i-say-it, happy! :-)
One reason is that it reduces the scope for errors. Allowing an object to be constructed in an invalid/nonsensical state means it can be used in that state.
Line line;
// any read of line.length here is undefined behaviour
// What is the benefit of that?
line.setLength(6.0);
Constructors allow you to initialize instances in a valid and desired state, which means they can be used immediately without further manipulations.
Also, regarding terminology, line.setLength(6.0) does not initialize the member; it assigns to a member of an object that has already been constructed.
I've been trying not to initialize memory when I don't need to, and am using malloc arrays to do so:
This is what I've run:
#include <cstdlib>
#include <iostream>
struct test
{
int num = 3;
test() { std::cout << "Init\n"; }
~test() { std::cout << "Destroyed: " << num << "\n"; }
};
int main()
{
test* array = (test*)malloc(3 * sizeof(test));
for (int i = 0; i < 3; i += 1)
{
std::cout << array[i].num << "\n";
array[i].num = i;
//new(array + i) i; placement new is not being used
std::cout << array[i].num << "\n";
}
for (int i = 0; i < 3; i += 1)
{
(array + i)->~test();
}
free(array);
return 0;
}
Which outputs:
0 ->- 0
0 ->- 1
0 ->- 2
Destroyed: 0
Destroyed: 1
Destroyed: 2
Despite not having constructed the array indices. Is this "healthy"? That is to say, can I simply treat the destructor as "just a function"?
(besides the fact that the destructor has implicit knowledge of where the data members are located relative to the pointer I specified)
Just to specify: I'm not looking for warnings on the proper usage of c++. I would simply like to know if there's things I should be wary of when using this no-constructor method.
(footnote: the reason I don't wanna use constructors is because many times, memory simply does not need to be initialized and doing so is slow)
No, this is undefined behaviour. An object's lifetime starts after the call to a constructor is completed, hence if a constructor is never called, the object technically never exists.
This likely "seems" to behave correctly in your example because your struct is trivial (int::~int is a no-op).
You are also leaking memory (destructors destroy the given object, but the original memory allocated via malloc still needs to be freed).
Edit: You might want to look at this question as well, as this is an extremely similar situation, simply using stack allocation instead of malloc. This gives some of the actual quotes from the standard around object lifetime and construction.
I'll add this as well: in the case where you don't use placement new and it clearly is required (e.g. struct contains some container class or a vtable, etc.) you are going to run into real trouble. In this case, omitting the placement-new call is almost certainly going to gain you 0 performance benefit for very fragile code - either way, it's just not a good idea.
Yes, the destructor is nothing more than a function. You can call it at any time. However, calling it without a matching constructor is a bad idea.
So the rule is: If you did not initialize memory as a specific type, you may not interpret and use that memory as an object of that type; otherwise it is undefined behavior. (with char and unsigned char as exceptions).
Let us do a line by line analysis of your code.
test* array = (test*)malloc(3 * sizeof(test));
This line initializes the pointer array (a scalar) with a memory address provided by the system. Note that the memory is not initialized for any kind of type. This means you should not treat this memory as any object (even as scalars like int, let alone your test class type).
Later, you wrote:
std::cout << array[i].num << "\n";
This uses the memory as an object of type test, which violates the rule stated above, leading to undefined behavior.
And later:
(array + i)->~test();
You used the memory as a test object again! Calling the destructor also uses the object! This is also UB.
In your case you are lucky that nothing harmful happens and you get something reasonable. However, the outcome of UB depends entirely on your compiler's implementation. It could even decide to format your disk, and that would still be standard-conforming.
That is to say, can I simply treat the destructor as "just a function"?
No. While it is like other functions in many ways, there are some special features of the destructor. These boil down to a pattern similar to manual memory management. Just as memory allocation and deallocation need to come in pairs, so do construction and destruction. If you skip one, skip the other. If you call one, call the other. If you insist upon manual memory management, the tools for construction and destruction are placement new and explicitly calling the destructor. (Code that uses new and delete combine allocation and construction into one step, while destruction and deallocation are combined into the other.)
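The pairing described above can be sketched as follows: malloc pairs with free, and placement new pairs with the explicit destructor call. The Item type and the function name are hypothetical, and the live counter exists only to make the lifetimes observable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type for illustration; `live` counts objects whose lifetime
// has started and not yet ended.
struct Item {
    static int live;
    int num;
    explicit Item(int n) : num(n) { ++live; }
    ~Item() { --live; }
};
int Item::live = 0;

// Allocates raw storage, constructs `count` objects in it, sums their
// values, then destroys them and releases the storage.
// Returns -1 if the allocation fails.
int sumWithManualLifetime(std::size_t count) {
    void* raw = std::malloc(count * sizeof(Item));   // allocation
    if (raw == nullptr) return -1;
    Item* array = static_cast<Item*>(raw);

    for (std::size_t i = 0; i < count; ++i)
        new (array + i) Item(static_cast<int>(i));   // construction

    int sum = 0;
    for (std::size_t i = 0; i < count; ++i)
        sum += array[i].num;                         // objects exist: safe to use

    for (std::size_t i = count; i > 0; --i)
        array[i - 1].~Item();                        // destruction

    std::free(raw);                                  // deallocation
    return sum;
}
```

Every step that starts something (allocation, construction) has exactly one matching step that ends it, in reverse order.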
Do not skip the constructor for an object that will be used. This is undefined behavior. Furthermore, the less trivial the constructor, the more likely that something will go wildly wrong if you skip it. That is, as you save more, you break more. Skipping the constructor for a used object is not a way to be more efficient — it is a way to write broken code. Inefficient, correct code trumps efficient code that does not work.
One bit of discouragement: this sort of low-level management can become a big investment of time. Only go this route if there is a realistic chance of a performance payback. Do not complicate your code with optimizations simply for the sake of optimizing. Also consider simpler alternatives that might get similar results with less code overhead. Perhaps a constructor that performs no initializations other than somehow flagging the object as not initialized? (Details and feasibility depend on the class involved, hence extend outside the scope of this question.)
One bit of encouragement: If you think about the standard library, you should realize that your goal is achievable. I would present vector::reserve as an example of something that can allocate memory without initializing it.
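A short sketch of the vector::reserve point: reserving capacity allocates storage without constructing any element, and emplace_back later constructs each element in place. The Item type and its counter are hypothetical and only there to make the constructor calls visible.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type for illustration; counts constructor calls that take an int.
struct Item {
    static int constructed;
    int num;
    explicit Item(int n) : num(n) { ++constructed; }
};
int Item::constructed = 0;

// Reserves room for n elements, records how many Items have been constructed
// at that point, then constructs the elements in place.
int constructedRightAfterReserve(std::vector<Item>& v, int n) {
    v.reserve(static_cast<std::size_t>(n));
    const int afterReserve = Item::constructed;  // unchanged: reserve builds no Item
    for (int i = 0; i < n; ++i)
        v.emplace_back(i);                       // constructs element i in reserved storage
    return afterReserve;
}
```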
You currently have UB because you access a field of a non-existent object.
You can leave the field uninitialized by giving the class a constructor that does not touch it; the compiler can then easily skip the initialization, for example:
struct test
{
int num; // no = 3
test() { std::cout << "Init\n"; } // num not initialized
~test() { std::cout << "Destroyed: " << num << "\n"; }
};
For readability, you should probably wrap it in a dedicated class, something like:
struct uninitialized_tag {};
struct uninitializable_int
{
uninitializable_int(uninitialized_tag) {} // No initialization
uninitializable_int(int num) : num(num) {}
int num;
};
I learned long ago that the only reliable way for a static member to be initialized for sure is to do it in a function. Now, what I'm about to do is to start returning static data by non-const reference and I need someone to stop me.
int& dataSlot()
{
static int dataMember = 0;
return dataMember;
}
To my knowledge this is the only way to ensure that the static member is initialized to zero. However, it creates obscure code like this:
dataSlot() = 7; // perfectly normal?
The other way is to put the definition in a translation unit and keep the stuff out of the header file. I have nothing against that per se, but I have no idea what the standard says regarding when and under what circumstances that is safe.
The absolute last thing I wanna end up doing is accidentally accessing uninitialized data and losing control of my program.
(With the usual cautions against indiscriminate use of globals...) Just declare the variable at global scope. It is guaranteed to be zero-initialized before any code runs.
You have to be more cunning when it comes to types with non-trivial constructors, but ints will work fine as globals.
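A minimal sketch of the global-variable alternative. The helper function is hypothetical; the point is that the int needs no explicit initializer to start out as zero.

```cpp
// Namespace-scope object with static storage duration: it is
// zero-initialized before any other initialization takes place,
// even without "= 0".
int dataMember;

// Hypothetical helper: stores a new value and returns the previous one.
int exchangeDataMember(int value) {
    const int previous = dataMember;
    dataMember = value;
    return previous;
}
```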
Returning a non-const reference in itself is fairly harmless, for example it's what vector::at() does, or vector::iterator::operator*.
If you don't like the syntax dataSlot() = 7;, you could define:
void setglobal(int i) {
dataSlot() = i;
}
int getglobal() {
return dataSlot();
}
Or you could define:
int *dataSlot() {
static int dataMember = 0;
return &dataMember;
}
*dataSlot() = 7; // better than dataSlot() = 7?
std::cout << *dataSlot(); // worse than std::cout << dataSlot()?
If you want someone to stop you, they need more information in order to propose an alternative to your use of mutable global state!
It is called a Meyers singleton, and it is almost perfectly safe.
Keep in mind that the object is created when dataSlot() is first called, but it is destroyed when the program exits (at some point while the objects with static storage duration are being destroyed), so you have to take special care. Using this function in destructors is especially dangerous and might cause random crashes.
I learned long ago that the only reliable way for a static member to be initialized for sure is to do it in a function.
No, it isn't. The standard guarantees that:
All objects with static storage (both block and file or class-static scope) with trivial constructors are initialized before any code runs. Any code of the program at all.
All objects with file/global/class-static scope and non-trivial constructors are then initialized before the main function is called. It is guaranteed that if objects A and B are defined in the same translation unit and A is defined before B, then A is initialized before B. However, the order of construction of objects defined in different translation units is unspecified and will often differ between compilations.
Any block-static objects are initialized when their declaration is reached for the first time. Since the C++03 standard does not have any support for threads, this is NOT thread-safe there! (C++11 and later do guarantee that this initialization is thread-safe.)
All objects with static storage (both block and file/global/class-static scoped) are destroyed in the reverse order of their constructors completing, after the main() function exits or the application terminates by calling exit().
Neither of the methods is usable and reliable in all cases!
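The ordering rules above can be observed with a small sketch. All names here are hypothetical. The block-scope static inside eventLog() is constructed on first use, which in this case happens before main() because the constructors of the two globals call it.

```cpp
#include <string>
#include <vector>

// Block-scope static: constructed the first time control reaches it,
// even if that happens before main() starts.
std::vector<std::string>& eventLog() {
    static std::vector<std::string> entries;
    return entries;
}

// Hypothetical type whose constructor is not trivial.
struct Tracer {
    explicit Tracer(const char* name) { eventLog().push_back(name); }
};

// Same translation unit: initialized in order of definition.
Tracer firstTracer("first");
Tracer secondTracer("second");
```

If the two Tracer objects lived in different translation units, the relative order of the two log entries would no longer be guaranteed, but the log itself would still exist when either constructor ran.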
Now, what I'm about to do is to start returning static data by non-const reference and I need someone to stop me.
Nobody is going to stop you. It's legal and a perfectly reasonable thing to do. But make sure you don't fall into the threads trap.
E.g. any reasonable unit-test library for C++ automatically registers all test cases. It does it by having something like:
std::vector<TestCase *> &testCaseList() {
static std::vector<TestCase *> test_cases;
return test_cases;
}
TestCase::TestCase() {
...
testCaseList().push_back(this);
}
Because that's one of only two ways to do it. The other is:
class TestCase; // forward declaration so the pointer below can be declared
TestCase *firstTest = NULL;
class TestCase {
...
TestCase *nextTest;
};
TestCase::TestCase() {
...
nextTest = firstTest;
firstTest = this;
}
this time using the fact that firstTest has a trivial constructor and therefore will be initialized before any of the TestCases, which have non-trivial ones.
dataSlot() = 7; // perfectly normal?
Yes. But if you really want, you can do either:
The old C thing of
#define dataSlot _dataSlot()
in the way the errno "variable" is usually defined,
Or you can wrap it in a struct like
struct dataSlot {
static Type &getSlot() {
static Type slot;
return slot;
}
operator const Type &() const { return getSlot(); }
dataSlot &operator=(const Type &newValue) { getSlot() = newValue; return *this; }
};
(the disadvantage here is that compiler won't look for Type's method if you try to invoke them on dataSlot directly; that's why it needs the operator=)
You could make yourself 2 functions, dataslot() and set_dataslot() which are wrappers round the actual dataslot, a bit like this:
int &_dataslot() { static int val = 0; return val; }
int dataslot() { return _dataslot(); }
void set_dataslot(int n) { _dataslot() = n; }
You probably wouldn't want to inline that lot in a header, but I've found some C++ implementations do rather badly if you try that sort of thing anyway.
I want to open a file in a class constructor. It is possible that the opening could fail, then the object construction could not be completed. How to handle this failure? Throw exception out? If this is possible, how to handle it in a non-throw constructor?
If an object construction fails, throw an exception.
The alternative is awful. You would have to create a flag if the construction succeeded, and check it in every method.
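A minimal sketch of the throwing approach, assuming a hypothetical LineSource class that wraps a std::ifstream. Either the file is open and the object exists, or the constructor throws and there is no object to misuse, so no method needs to check a flag.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical class for illustration: either the file is open and the
// object exists, or the constructor throws and no object exists.
class LineSource {
public:
    explicit LineSource(const std::string& path) : in_(path.c_str()) {
        if (!in_.is_open())
            throw std::runtime_error("cannot open " + path);
    }

    // No "is it valid?" check needed here: construction already succeeded.
    bool next(std::string& line) {
        return static_cast<bool>(std::getline(in_, line));
    }

private:
    std::ifstream in_;
};

// Hypothetical caller that turns the exception into a return value.
bool canOpen(const std::string& path) {
    try {
        LineSource source(path);
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```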
I want to open a file in a class constructor. It is possible that the opening could fail, then the object construction could not be completed. How to handle this failure? Throw exception out?
Yes.
If this is possible, how to handle it in a non-throw constructor?
Your options are:
redesign the app so it doesn't need constructors to be non-throwing - really, do it if possible
add a flag and test for successful construction
you could have each member function that might legitimately be called immediately after the constructor test the flag, ideally throwing if it's set, but otherwise returning an error code
This is ugly, and difficult to keep right if you have a volatile group of developers working on the code.
You can get some compile-time checking of this by having the object polymorphically defer to either of two implementations: a successfully constructed one and an always-error version, but that introduces heap usage and performance costs.
You can move the burden of checking the flag from the called code to the caller by documenting a requirement that they call some "is_valid()" or similar function before using the object: again error prone and ugly, but even more distributed, unenforceable and out of control.
You can make this a little easier and more localised for the caller if you support something like: if (X x) ... (i.e. the object can be evaluated in a boolean context, normally by providing operator bool() const or similar integral conversion), but then you don't have x in scope to query for details of the error. This may be familiar from e.g. if (std::ifstream f(filename)) { ... } else ...;
have the caller provide a stream they're responsible for having opened... (known as Dependency Injection or DI)... in some cases, this doesn't work that well:
you can still have errors when you go to use the stream inside your constructor, what then?
the file itself might be an implementation detail that should be private to your class rather than exposed to the caller: what if you want to remove that requirement later? For example: you might have been reading a lookup table of precalculated results from a file, but have made your calculations so fast there's no need to precalculate - it's painful (sometimes even impractical in an enterprise environment) to remove the file at every point of client usage, and forces a lot more recompilation rather than potentially simply relinking.
force the caller to provide a pointer to a success/failure/error-condition variable which the constructor sets: e.g. bool worked; X x(&worked); if (worked) ...
this burden and verbosity draws attention and hopefully makes the caller much more conscious of the need to consult the variable after constructing the object
force the caller to construct the object via some other function that can use return codes and/or exceptions:
if (X* p = x_factory()) ...
Smart_Ptr_Throws_On_Null_Deref p_x = x_factory();
X x; // never usable; if (init_x(&x)) ...
etc...
In short, C++ is designed to provide elegant solutions to these sorts of issues: in this case exceptions. If you artificially restrict yourself from using them, then don't expect there to be something else that does half as good a job.
(P.S. I like passing variables that will be modified by pointer - as per worked above - I know the FAQ lite discourages it but disagree with the reasoning. Not particularly interested in discussion thereon unless you've something not covered by the FAQ.)
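One of the non-throwing options listed above, constructing through a separate function that reports failure via its return value, might look like the sketch below. The Reader class is hypothetical, and std::unique_ptr (C++11) is swapped in here for the raw pointer used in the x_factory() line, so that the caller cannot leak the object.

```cpp
#include <fstream>
#include <memory>
#include <string>

// Hypothetical class for illustration: the constructor is private and
// cannot fail; the named function reports failure through its return value.
class Reader {
public:
    // Returns a null pointer if the file cannot be opened.
    static std::unique_ptr<Reader> open(const std::string& path) {
        std::unique_ptr<Reader> reader(new Reader);
        reader->in_.open(path.c_str());
        if (!reader->in_.is_open())
            return nullptr;
        return reader;
    }

    bool isOpen() const { return in_.is_open(); }

private:
    Reader() {}
    std::ifstream in_;
};
```

The caller still has to test the result, as in "if (auto r = Reader::open(path)) ...", which is exactly the burden an exception would remove.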
The new C++ standard changes this in so many ways that it's time to revisit this question.
Best choices:
Named optional: Have a minimal private constructor and a named constructor: static std::experimental::optional<T> construct(...). The latter tries to set up member fields, ensures invariant and only calls the private constructor if it'll surely succeed. Private constructor only populates member fields. It's easy to test the optional and it's inexpensive (even the copy can be spared in a good implementation).
Functional style: The good news is, (non-named) constructors are never virtual. Therefore, you can replace them with a static template member function that, apart from the constructor parameters, takes two (or more) lambdas: one to call if construction succeeded, one if it failed. The 'real' constructor is still private and cannot fail. This might sound like overkill, but lambdas are optimized wonderfully by compilers. You might even spare the if of the optional this way.
Good choices:
Exception: If all else fails, use an exception - but note that you can't catch an exception during static initialization. A possible workaround is to have a function's return value initialize the object in this case.
Builder class: If construction is complicated, have a class that does validation and possibly some preprocessing to the point that the operation cannot fail. Let it have a way to return status (yep, error function). I'd personally make it stack-only, so people won't pass it around; then let it have a .build() method that constructs the other class. If builder is friend, constructor can be private. It might even take something only builder can construct so that it's documented that this constructor is only to be called by builder.
Bad choices: (but seen many times)
Flag: Don't mess up your class invariant by having an 'invalid' state. This is exactly why we have optional<>. Think of optional<T> as the type that can be invalid and T as the type that can't. A (member or global) function that works only on valid objects works on T. One that surely returns a valid object returns T. One that might return an invalid object returns optional<T>. One that might invalidate an object takes a non-const optional<T>& or optional<T>*. This way, you won't need to check in each and every function that your object is valid (and those ifs might become a bit expensive), but then don't fail at the constructor, either.
Default construct and setters: This is basically the same as Flag, only that this time you're forced to have a mutable pattern. Forget setters, they unnecessarily complicate your class invariant. Remember to keep your class simple, not construction simple.
Default construct and init() that takes the ctor parameters: This is nothing better than a function that returns an optional<>, but requires two constructions and messes up your invariant.
Take bool& succeed: This was what we were doing before optional<>. The reason optional<> is superior is that you cannot mistakenly (or carelessly!) ignore the succeed flag and continue using the partially constructed object.
Factory that returns a pointer: This is less general as it forces the object to be dynamically allocated. Either you return a given type of managed pointer (and therefore restrict allocation/scoping schema) or return a naked ptr and risk clients leaking. Also, with move semantics this might become less desirable performance-wise (locals, when kept on the stack, are very fast and cache-friendly).
Example:
#include <iostream>
#include <experimental/optional>
#include <cmath>
class C
{
public:
friend std::ostream& operator<<(std::ostream& os, const C& c)
{
return os << c.m_d << " " << c.m_sqrtd;
}
static std::experimental::optional<C> construct(const double d)
{
if (d>=0)
return C(d, sqrt(d));
return std::experimental::nullopt;
}
template<typename Success, typename Failed>
static auto if_construct(const double d, Success success, Failed failed = []{})
{
return d>=0? success( C(d, sqrt(d)) ): failed();
}
/*C(const double d)
: m_d(d), m_sqrtd(d>=0? sqrt(d): throw std::logic_error("C: Negative d"))
{
}*/
private:
C(const double d, const double sqrtd)
: m_d(d), m_sqrtd(sqrtd)
{
}
double m_d;
double m_sqrtd;
};
int main()
{
const double d = 2.0; // -1.0
// method 1. Named optional
if (auto&& COpt = C::construct(d))
{
C& c = *COpt;
std::cout << c << std::endl;
}
else
{
std::cout << "Error in 1." << std::endl;
}
// method 2. Functional style
C::if_construct(d, [&](C c)
{
std::cout << c << std::endl;
},
[]
{
std::cout << "Error in 2." << std::endl;
});
}
My suggestion for this specific situation is that if you don't want a constructor to fail because it can't open a file, then avoid that situation. Pass in an already open file to the constructor if that's what you want, then it can't fail...
One way is to throw an exception. Another is to have a 'bool is_open()' or 'bool is_valid()' function that returns false if something went wrong in the constructor.
Some comments here say it's wrong to open a file in the constructor. I'll point out that ifstream is part of the C++ standard and has the following constructor:
explicit ifstream ( const char * filename, ios_base::openmode mode = ios_base::in );
It doesn't throw an exception, but it has an is_open function:
bool is_open ( );
I want to open a file in a class constructor.
Almost certainly a bad idea. There are very few cases where opening a file during construction is appropriate.
It is possible that the opening could fail, then the object construction could not be completed. How to handle this failure? Throw exception out?
Yep, that'd be the way.
If this is possible, how to handle it in a non-throw constructor?
Make it possible that a fully constructed object of your class can be invalid. This means providing validation routines, using them, etc...ick
A constructor may well open a file (not necessarily a bad idea) and may throw if the file-open fails, or if the input file does not contain compatible data.
It is reasonable behaviour for a constructor to throw an exception, however you will then be limited as to its use.
You will not be able to create static (compilation unit file-level) instances of this class that are constructed before "main()", as an exception should only ever be thrown in the regular flow.
This can extend to later "first-time" lazy evaluation, where something is loaded the first time it is required, for example in a boost::once construct the call_once function should never throw.
You may use it in an IOC (Inversion of Control / Dependency Injection) environment. This is why IOC environments are advantageous.
Be aware that if your constructor throws, your destructor will not be called. So anything you initialised in the constructor prior to that point must be held in an RAII object.
More dangerous by the way can be closing the file in the destructor if this flushes the write buffer. No way at all to handle any error that may occur at this point properly.
You can handle it without an exception by leaving the object in a "failed" state. This is the way you must do it in cases where throwing is not permitted, but of course your code must check for the error.
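The point above about destructors not running after a throwing constructor can be made concrete with a small sketch. Both classes are hypothetical; std::unique_ptr stands in for "an RAII object", and the counters only make the behaviour observable.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource; `live` counts instances currently alive.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

class Holder {
public:
    static int destructorCalls;

    explicit Holder(bool fail) : res_(new Resource) {
        // The member res_ is fully constructed at this point.
        if (fail) throw std::runtime_error("construction failed");
    }

    // Not run when the constructor throws.
    ~Holder() { ++destructorCalls; }

private:
    std::unique_ptr<Resource> res_;  // RAII member: released even if the ctor throws
};
int Holder::destructorCalls = 0;

// Returns true if constructing a Holder threw.
bool constructionFails() {
    try {
        Holder h(true);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Had res_ been a raw Resource* allocated with new, the Resource would have leaked, because ~Holder() never gets a chance to delete it.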
Use a factory.
A factory can be either an entire factory class "Factory<T>" for building your "T" objects (it doesn't have to be a template) or a static public method of "T". You then make the constructor protected and leave the destructor public. That ensures new classes can still derive from "T" but no external code other than them can call the constructor directly.
With factory methods (C++17)
class Foo {
protected:
Foo() noexcept; // Default ctor that can't fail
virtual bool Initialize(..); // Parts of ctor that CAN fail
public:
static std::optional<Foo> Create(...) // 'Stack' or value-semantics version (no 'new')
{
Foo out; // not "Foo out();", which would declare a function
if(out.Initialize(..)) return {out};
return {};
}
// Heap version. Pick one of the two: they cannot coexist as overloads that differ only in return type.
static Foo* /*OR smart ptr*/ Create(...)
{
Foo* out = new Foo();
if(out->Initialize(...)) return out;
delete out;
return nullptr;
}
virtual ~Foo() noexcept; // Keep public to allow normal inheritance
};
Unlike setting 'valid' bits or other hacks, this is relatively clean and extensible. Done right it guarantees no invalid objects ever escape into the wild, and writing derived 'Foo's is still straightforward. And since factory functions are normal functions, you can do a lot of other things with them that constructors can't.
In my humble opinion you should never put any code that can realistically fail into a constructor. That pretty much means anything that does I/O or other 'real work'. Constructors are a special corner case of the language, and they basically lack the ability to do error handling.
What's a good existing class/design pattern for multi-stage construction/initialization of an object in C++?
I have a class with some data members which should be initialized in different points in the program's flow, so their initialization has to be delayed. For example one argument can be read from a file and another from the network.
Currently I am using boost::optional for the delayed construction of the data members, but it's bothering me that optional is semantically different than delay-constructed.
What I need is reminiscent of boost::bind and lambda partial function application, and using these libraries I can probably design multi-stage construction - but I prefer using existing, tested classes. (Or maybe there's another multi-stage construction pattern which I am not familiar with).
The key issue is whether or not you should distinguish completely populated objects from incompletely populated objects at the type level. If you decide not to make a distinction, then just use boost::optional or similar as you are doing: this makes it easy to get coding quickly. OTOH you can't get the compiler to enforce the requirement that a particular function requires a completely populated object; you need to perform run-time checking of fields each time.
Parameter-group Types
If you do distinguish completely populated objects from incompletely populated objects at the type level, you can enforce the requirement that a function be passed a complete object. To do this I would suggest creating a corresponding type XParams for each relevant type X. XParams has boost::optional members and setter functions for each parameter that can be set after initial construction. Then you can force X to have only one (non-copy) constructor, that takes an XParams as its sole argument and checks that each necessary parameter has been set inside that XParams object. (Not sure if this pattern has a name -- anybody like to edit this to fill us in?)
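A rough sketch of the parameter-group idea, with hypothetical names (Connection plays the role of X, ConnectionParams the role of XParams). It assumes C++17 and uses std::optional in place of boost::optional to stay dependency-free.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical parameter group: can be filled in piece by piece,
// e.g. one value from a file and another from the network.
class ConnectionParams {
public:
    void setHost(const std::string& host) { host_ = host; }
    void setPort(int port) { port_ = port; }

    bool isComplete() const { return host_.has_value() && port_.has_value(); }

    // Only meaningful once isComplete() is true.
    const std::string& host() const { return *host_; }
    int port() const { return *port_; }

private:
    std::optional<std::string> host_;
    std::optional<int> port_;
};

// Hypothetical "complete" type: its only constructor demands a fully
// populated parameter group, so every Connection is completely initialized.
class Connection {
public:
    explicit Connection(const ConnectionParams& params)
        : host_(requireComplete(params).host()), port_(params.port()) {}

    const std::string& host() const { return host_; }
    int port() const { return port_; }

private:
    static const ConnectionParams& requireComplete(const ConnectionParams& params) {
        if (!params.isComplete())
            throw std::logic_error("ConnectionParams is not fully populated");
        return params;
    }

    std::string host_;  // declared first, so requireComplete() runs before port_ is read
    int port_;
};
```

A function that takes a Connection& can then rely on every field being set, while code that is still gathering values passes a ConnectionParams around instead.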
"Partial Object" Types
This works wonderfully if you don't really have to do anything with the object before it is completely populated (perhaps other than trivial stuff like get the field values back). If you do have to sometimes treat an incompletely populated X like a "full" X, you can instead make X derive from a type XPartial, which contains all the logic, plus protected virtual methods for performing precondition tests that test whether all necessary fields are populated. Then if X ensures that it can only ever be constructed in a completely-populated state, it can override those protected methods with trivial checks that always return true:
class XPartial {
optional<string> name_;
public:
void setName(string x) { name_.reset(x); } // Can add getters and/or ctors
string makeGreeting(string title) {
if (checkMakeGreeting_()) { // Is it safe?
return string("Hello, ") + title + " " + *name_;
} else {
throw domain_error("ZOINKS"); // Or similar
}
}
bool isComplete() const { return checkMakeGreeting_(); } // All tests here
protected:
virtual bool checkMakeGreeting_() const { return name_; } // Populated?
};
class X : public XPartial {
X(); // Forbid default-construction; or, you could supply a "full" ctor
public:
explicit X(XPartial const& x) : XPartial(x) { // Avoid implicit conversion
if (!x.isComplete()) throw domain_error("ZOINKS");
}
X& operator=(XPartial const& x) {
if (!x.isComplete()) throw domain_error("ZOINKS");
return static_cast<X&>(XPartial::operator=(x));
}
protected:
virtual bool checkMakeGreeting_() const { return true; } // No checking needed! (const is required to override the base version)
};
Although it might seem the inheritance here is "back to front", doing it this way means that an X can safely be supplied anywhere an XPartial& is asked for, so this approach obeys the Liskov Substitution Principle. This means that a function can use a parameter type of X& to indicate it needs a complete X object, or XPartial& to indicate it can handle partially populated objects -- in which case either an XPartial object or a full X can be passed.
Originally I had isComplete() as protected, but found this didn't work since X's copy ctor and assignment operator must call this function on their XPartial& argument, and they don't have sufficient access. On reflection, it makes more sense to publicly expose this functionality.
I must be missing something here - I do this kind of thing all the time. It's very common to have objects that are big and/or not needed by a class in all circumstances. So create them dynamically!
struct Big {
char a[1000000];
};
class A {
public:
A() : big(0) {}
~A() { delete big; }
void f() {
makebig();
big->a[42] = 66;
}
private:
Big * big;
void makebig() {
if ( ! big ) {
big = new Big;
}
}
};
I don't see the need for anything fancier than that, except that makebig() should probably be const (and maybe inline), and the Big pointer should probably be mutable. And of course A must be able to construct Big, which may in other cases mean caching the contained class's constructor parameters. You will also need to decide on a copying/assignment policy - I'd probably forbid both for this kind of class.
I don't know of any patterns to deal with this specific issue. It's a tricky design question, and one somewhat unique to languages like C++. Another issue is that the answer to this question is closely tied to your individual (or corporate) coding style.
I would use pointers for these members, and allocate them at the point where they need to be constructed. You can use auto_ptr for these, and check against NULL to see if they are initialized. (I think of pointers as a built-in "optional" type in C/C++/Java; there are other languages where NULL is not a valid pointer.)
One issue as a matter of style is that you may be relying on your constructors to do too much work. When I'm coding OO, I have the constructors do just enough work to get the object in a consistent state. For example, if I have an Image class and I want to read from a file, I could do this:
image = new Image("unicorn.jpeg"); /* I'm not fond of this style */
or, I could do this:
image = new Image(); /* I like this better */
image->read("unicorn.jpeg");
It can get difficult to reason about how a C++ program works if the constructors have a lot of code in them, especially if you ask the question, "what happens if a constructor fails?" This is the main benefit of moving code out of the constructors.
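To make the "what happens if a constructor fails?" question concrete, here is a small sketch of both styles side by side. This Image class is a hypothetical stand-in: it does no real decoding and simply treats an empty path as a failure. constructionFails() is likewise an invented helper.

```cpp
#include <stdexcept>
#include <string>

class Image {
public:
    // Style 1: the constructor does the work and reports failure by throwing.
    explicit Image(const std::string& path) { read(path); }

    // Style 2: a cheap constructor that cannot fail; the caller reads later.
    Image() {}

    // Stand-in for real file decoding: an empty path is treated as a failure.
    void read(const std::string& path) {
        if (path.empty()) throw std::runtime_error("cannot read image");
        path_ = path;
    }

    bool loaded() const { return !path_.empty(); }

private:
    std::string path_;
};

// Returns true if constructing an Image directly from `path` threw.
bool constructionFails(const std::string& path) {
    try {
        Image img(path);
        return false;
    } catch (const std::runtime_error&) {
        // No Image object exists here, and ~Image() was never run for it.
        return true;
    }
}
```

With the second style, a failed read() leaves behind a valid, unloaded Image that the caller can inspect or retry, at the cost of allowing objects that exist but are not yet useful.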
I would have more to say, but I don't know what you're trying to do with delayed construction.
Edit: I remembered that there is a (somewhat perverse) way to call a constructor on an object at any arbitrary time. Here is an example:
class Counter {
public:
Counter(int &cref) : c(cref) { }
void incr(int x) { c += x; }
private:
int &c;
};
void dontTryThisAtHome() {
int i = 0, j = 0;
Counter c(i); // Call constructor first time on c
c.incr(5); // now i = 5
new(&c) Counter(j); // Call the constructor AGAIN on c
c.incr(3); // now j = 3
}
Note that doing something as reckless as this might earn you the scorn of your fellow programmers, unless you've got solid reasons for using this technique. This also doesn't delay the constructor, just lets you call it again later.
Using boost.optional looks like a good solution for some use cases. I haven't played much with it so I can't comment much. One thing I keep in mind when dealing with such functionality is whether I can use overloaded constructors instead of default and copy constructors.
When I need such functionality I would just use a pointer to the type of the necessary field like this:
public:
MyClass() : field_(0) { } // constructor, additional initializers and code omitted
~MyClass() {
if (field_)
delete field_; // free the constructed object only if initialized
}
...
private:
...
field_type* field_;
next, instead of using the pointer I would access the field through the following method:
private:
...
field_type& field() {
if (!field_)
field_ = new field_type(...);
return *field_; // dereference: the function returns a reference, not a pointer
}
I have omitted const-access semantics
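For comparison, here is a sketch of the same lazy field with the const access filled in, using std::optional (the C++17 standard counterpart of boost.optional) instead of a raw pointer. It assumes a C++17 compiler. The field type std::string, the "default" constructor argument and the initialized() helper are all placeholders chosen for illustration.

```cpp
#include <optional>
#include <string>

class MyClass {
public:
    // Non-const access: construct the field on first use.
    std::string& field() {
        ensure();
        return *field_;
    }

    // Const access: same lazy construction, allowed because field_ is mutable.
    const std::string& field() const {
        ensure();
        return *field_;
    }

    // Hypothetical helper: reports whether the field has been constructed.
    bool initialized() const { return field_.has_value(); }

private:
    void ensure() const {
        if (!field_) field_.emplace("default");  // placeholder constructor argument
    }

    mutable std::optional<std::string> field_;  // empty until first access
};
```

Because the optional owns its storage, no destructor or delete is needed, and the compiler-generated copy operations copy the field correctly if it exists.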
The easiest way I know is similar to the technique suggested by Dietrich Epp, except it allows you to truly delay the construction of an object until a moment of your choosing.
Basically: reserve the storage for the object using malloc instead of new (thereby bypassing the constructor), then use placement new at the point where you truly want to construct the object.
Example:
Object *x = (Object *) malloc(sizeof(Object));
//Use the object member items here. Be careful: no constructors have been called!
//This means you can assign values to ints, structs, etc... but nested objects can wreak havoc!
//Now we want to call the constructor of the object
new(x) Object(params);
//However, you must remember to also manually call the destructor!
x->~Object(); //x is a pointer, so the destructor is called through ->
free(x);
//Note: if malloc and new in your toolchain happen to allocate from the
//same heap, calling delete x instead of the destructor followed by free
//may appear to work, but it is not guaranteed; the above is the correct
//way of doing it
Personally, the only time I've ever used this syntax was when I had to use a custom C-based allocator for C++ objects. As Dietrich suggests, you should question whether you really, truly must delay the constructor call. The base constructor should perform the bare minimum to get your object into a serviceable state, whilst other overloaded constructors may perform more work as needed.
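Put together as something that compiles, the sequence might look like the sketch below. The Object type, its liveCount counter and the three helper functions are hypothetical, introduced only so that the constructor and destructor calls can be observed.

```cpp
#include <cstdlib>
#include <new>

struct Object {
    explicit Object(int v) : value(v) { ++liveCount; }
    ~Object() { --liveCount; }
    int value;
    static int liveCount;  // hypothetical counter, only here to observe ctor/dtor calls
};
int Object::liveCount = 0;

// Reserve raw storage; no constructor runs here.
void* reserveObject() {
    void* raw = std::malloc(sizeof(Object));
    if (!raw) throw std::bad_alloc();
    return raw;
}

// Construct the object later, in the storage reserved earlier.
Object* constructObject(void* raw, int v) {
    return new (raw) Object(v);  // placement new: runs the constructor only
}

// Tear down in reverse order: destructor first, then release the storage.
void destroyObject(Object* obj) {
    obj->~Object();
    std::free(obj);
}
```

Between reserveObject() and constructObject() the memory holds no Object at all, so it should only be handled as raw storage during that window.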
I don't know if there's a formal pattern for this. In places where I've seen it, we called it "lazy", "demand" or "on demand".
Compare the following two pieces of code: the first passes a large object by reference, and the second returns the large object by value. The emphasis on "large object" is there because every unnecessary copy of such an object wastes cycles.
Using a reference to a large object:
void getObjData( LargeObj& a )
{
a.reset() ;
a.fillWithData() ;
}
int main()
{
LargeObj a ;
getObjData( a ) ;
}
Using the large object as a return value:
LargeObj getObjData()
{
LargeObj a ;
a.fillWithData() ;
return a ;
}
int main()
{
LargeObj a = getObjData() ;
}
The first snippet of code does not require copying the large object.
In the second snippet, the object is created inside the function, and so in general, a copy is needed when returning the object. In this case, however, in main() the object is being declared. Will the compiler first create a default-constructed object, then copy the object returned by getObjData(), or will it be as efficient as the first snippet?
I think the second snippet is easier to read but I am afraid it is less efficient.
Edit: Typically, I am thinking of cases where LargeObj is a generic container class that, for the sake of argument, holds thousands of objects. For example,
typedef std::vector<HugeObj> LargeObj ;
so directly modifying/adding methods to LargeObj isn't a directly accessible solution.
The second approach is more idiomatic, and expressive. It is clear when reading the code that the function has no preconditions on the argument (it does not have an argument) and that it will actually create an object inside. The first approach is not so clear for the casual reader. The call implies that the object will be changed (pass by reference) but it is not so clear if there are any preconditions on the passed object.
About the copies: the code you posted is not using the assignment operator, but rather copy construction. The C++ standard permits the return value optimization, and all major compilers implement it. If you are not sure, you can run the following snippet in your compiler:
#include <iostream>
class X
{
public:
X() { std::cout << "X::X()" << std::endl; }
X( X const & ) { std::cout << "X::X( X const & )" << std::endl; }
X& operator=( X const & ) { std::cout << "X::operator=(X const &)" << std::endl; return *this; }
};
X f() {
X tmp;
return tmp;
}
int main() {
X x = f();
}
With g++ you will get a single line, X::X(). The compiler reserves the space on the stack for the x object, then calls the function, which constructs tmp directly over x (in fact, tmp is x). The operations inside f() are applied directly to x, which makes this equivalent to your first code snippet (pass by reference).
If you were not using the copy constructor (had you written: X x; x = f();) then it would create both x and tmp and apply the assignment operator, yielding a three line output: X::X() / X::X() / X::operator=. So it could be a little less efficient in cases.
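The two cases above can also be compared by counting calls instead of reading printed lines. This is only a sketch: the Counts struct, the global tally and the two helper functions are hypothetical additions, and the number of copy-constructor calls is left open because eliding the copy in f() is allowed but not required.

```cpp
struct Counts {
    int defaultCtor = 0;
    int copyCtor = 0;
    int assign = 0;
};
Counts counts;  // hypothetical global tally

class X {
public:
    X() { ++counts.defaultCtor; }
    X(X const&) { ++counts.copyCtor; }
    X& operator=(X const&) { ++counts.assign; return *this; }
};

X f() {
    X tmp;
    return tmp;  // NRVO may elide the copy here, but is not required to
}

// Initialization: X x = f();
Counts viaInitialization() {
    counts = Counts();
    X x = f();
    (void)x;  // silence unused-variable warnings
    return counts;
}

// Default construction followed by assignment: X x; x = f();
Counts viaAssignment() {
    counts = Counts();
    X x;
    x = f();
    return counts;
}
```

viaInitialization() records one default construction and no assignment, while viaAssignment() records two default constructions and one assignment, whatever the compiler decides about the copy.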
Use the second approach. It may seem to be less efficient, but the C++ standard allows the copies to be elided. This optimization is called Named Return Value Optimization and is implemented in most current compilers.
Yes in the second case it will make a copy of the object, possibly twice - once to return the value from the function, and again to assign it to the local copy in main. Some compilers will optimize out the second copy, but in general you can assume at least one copy will happen.
However, you could still use the second approach for clarity even if the data in the object is large without sacrificing performance with the proper use of smart pointers. Check out the suite of smart pointer classes in boost. This way the internal data is only allocated once and never copied, even when the outer object is.
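A minimal sketch of that idea, using std::shared_ptr from C++11 (which has the same semantics as boost::shared_ptr), might look like this. The std::vector<int> payload, its size and the sharesDataWith() helper are placeholders for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

class LargeObj {
public:
    LargeObj() : data_(std::make_shared<std::vector<int>>()) {}

    void fillWithData() { data_->assign(100000, 42); }  // placeholder payload

    std::size_t size() const { return data_->size(); }

    // Hypothetical helper: true when both handles refer to the same buffer.
    bool sharesDataWith(const LargeObj& other) const { return data_ == other.data_; }

private:
    std::shared_ptr<std::vector<int>> data_;  // copied handles share one buffer
};

LargeObj getObjData() {
    LargeObj a;
    a.fillWithData();
    return a;  // at worst this copies a pointer and bumps a reference count
}
```

The trade-off is that copies alias the same data, so a change made through one LargeObj is visible through every copy of it.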
The way to avoid any copying is to provide a special constructor. If you
can re-write your code so it looks like:
LargeObj getObjData()
{
return LargeObj( fillsomehow() );
}
If fillsomehow() returns the data (perhaps a "big string"), then have a constructor that takes a "big string". If you have such a constructor, then the compiler will very likely construct a single object and not make any copies at all to perform the return. Of course, whether this is useful in real life depends on your particular problem.
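Here is one way that could be fleshed out, as a sketch. The body of fillsomehow() and the std::string payload are assumptions; the original only says it returns the data. std::move requires C++11.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for fillsomehow(): produces the "big string".
std::string fillsomehow() {
    return std::string(1000000, 'x');
}

class LargeObj {
public:
    // Takes the data by value and moves it into place, so the buffer is not duplicated.
    explicit LargeObj(std::string data) : data_(std::move(data)) {}

    std::size_t size() const { return data_.size(); }

private:
    std::string data_;
};

LargeObj getObjData() {
    // Returning an unnamed temporary: compilers build it directly in the
    // caller's object, and since C++17 they are required to.
    return LargeObj(fillsomehow());
}
```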
A somewhat idiomatic solution would be:
std::auto_ptr<LargeObj> getObjData()
{
std::auto_ptr<LargeObj> a(new LargeObj);
a->fillWithData();
return a;
}
int main()
{
std::auto_ptr<LargeObj> a(getObjData());
}
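Note that std::auto_ptr was deprecated in C++11 and removed in C++17. On a current compiler the same idea would be written with std::unique_ptr, roughly as sketched below. The body of LargeObj here is a placeholder, since the original does not show it.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Placeholder LargeObj; the real class is not shown in the thread.
class LargeObj {
public:
    void fillWithData() { items_.assign(1000, 7); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<int> items_;
};

std::unique_ptr<LargeObj> getObjData() {
    std::unique_ptr<LargeObj> a(new LargeObj);
    a->fillWithData();
    return a;  // ownership moves to the caller; the LargeObj itself is never copied
}
```

The caller writes std::unique_ptr<LargeObj> a = getObjData(); and the object is deleted automatically when a goes out of scope.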
Alternatively, you can avoid this issue altogether by letting the object get its own data, i.e. by making getObjData() a member function of LargeObj. Depending on what you are actually doing, this may be a good way to go.
Depending on how large the object really is and how often the operation happens, don't get too bogged down in efficiency when it will have no discernible effect either way. Optimization at the expense of clean, readable code should only happen when it is determined to be necessary.
The chances are that some cycles will be wasted when you return by copy. Whether it's worth worrying about depends on how large the object really is, and how often you invoke this code.
But I'd like to point out that if LargeObj is a large and non-trivial class, then in any case its empty constructor should be initializing it to a known state:
LargeObj::LargeObj() :
m_member1(),
m_member2(),
...
{}
That wastes a few cycles too. Re-writing the code as
LargeObj::LargeObj()
{
// (The body of fillWithData should ideally be re-written into
// the initializer list...)
fillWithData() ;
}
int main()
{
LargeObj a ;
}
would probably be a win-win for you: you'd have the LargeObj instances getting initialized into known and useful states, and you'd have fewer wasted cycles.
If you don't always want to use fillWithData() in the constructor, you could pass a flag into the constructor as an argument.
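A sketch of the flag variant follows. The parameter name fill, its default value and the vector payload are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

class LargeObj {
public:
    // Hypothetical flag: pass false to skip the expensive fill.
    explicit LargeObj(bool fill = true) {
        if (fill) fillWithData();
    }

    void fillWithData() { items_.assign(1000, 1); }  // placeholder payload

    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;
};
```

LargeObj a; then fills itself on construction, while LargeObj b(false); starts empty and can be filled later with b.fillWithData().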
UPDATE (from your edit & comment) : Semantically, if it's worthwhile to create a typedef for LargeObj -- i.e., to give it a name, rather than referring to it simply as std::vector<HugeObj> -- then you're already on the road to giving it its own behavioral semantics. You could, for example, define it as
class LargeObj : public std::vector<HugeObj> {
public: // without this, the constructor would be private and unusable by callers
// constructor that fills the object with data
LargeObj() ;
// ... other standard methods ...
};
Only you can determine if this is appropriate for your app. My point is that even though LargeObj is "mostly" a container, you can still give it class behavior if doing so works for your application.
Your first snippet is especially useful when getObjData() is implemented in one DLL and called from another, and the two DLLs are written in different languages or built with different versions of the compiler for the same language. The reason is that binaries built with different compilers often use different heaps. You must allocate and deallocate memory from within the same heap, or you will corrupt memory.
But if you don't do something like that, I would normally simply return a pointer (or smart pointer) to memory your function allocates:
LargeObj* getObjData()
{
LargeObj* ret = new LargeObj;
ret->fillWithData() ;
return ret;
}
...unless I have a specific reason not to.