While browsing the Internet, I came across this post, which includes this:
"(Well written) C++ goes to great
lengths to make stack automatic
objects work "just like" primitives,
as reflected in Stroustrup's advice to
"do as the ints do". This requires a
much greater adherence to the
principles of Object Oriented
development: your class isn't right
until it "works like" an int,
following the "Rule of Three" that
guarantees it can (just like an int)
be created, copied, and correctly
destroyed as a stack automatic."
I've done a little C and C++, but only in passing and never anything serious. I'm curious: what exactly does this mean?
Can someone give an example?
Stack objects are handled automatically by the compiler.
When execution leaves the scope, the object is destroyed.
{
obj a;
} // a is destroyed here
When you do the same with a 'newed' object you get a memory leak :
{
obj* b = new obj;
}
The pointer b goes out of scope, but the object it points to is not destroyed, so we lose the ability to reclaim the memory it owns. And maybe worse, the object never gets the chance to clean up after itself.
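To make the difference visible, here is a minimal sketch that counts live objects. The type Obj, its live counter, and both function names are made up for this illustration; the second function hands the pointer back to the caller only so that the example itself does not leak.

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct Obj {
    static int live;
    Obj() { ++live; }
    Obj(const Obj&) { ++live; }
    ~Obj() { --live; }
};
int Obj::live = 0;

// Live count after a scope holding an automatic object has ended.
int live_after_automatic_scope() {
    {
        Obj a;
        assert(Obj::live == 1);
    }  // a is destroyed here
    return Obj::live;
}

// Live count after the scope that called new has ended, before any delete.
// The address is passed out so the caller can still delete the object.
int live_after_new_scope(Obj*& out) {
    {
        Obj* b = new Obj;
        out = b;  // without this line the object would be unreachable (leaked)
    }  // only the pointer b goes away here, not the object it points to
    return Obj::live;
}
```

Calling live_after_automatic_scope() gives 0, while live_after_new_scope(p) gives 1 until the caller runs delete p.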
In C the following is common :
{
FILE* pF = fopen( ... );
// ... do sth with pF
fclose( pF );
}
In C++ we write this :
{
std::fstream f( ... );
// do sth with f
} // here f gets auto magically destroyed and the destructor frees the file
If we forget to call fclose in the C sample, the file stays open and may be unusable by other programs (e.g. it cannot be deleted).
Another example, demonstrating the object string, which can be constructed, assigned to and which is destroyed on exiting the scope.
{
string v( "bob" );
string k;
k = v;
// k now contains "bob" as well
} // v and k are destroyed here, and any memory they used is freed
In addition to the other answers:
Before C++11, the C++ language had the auto keyword to explicitly declare the storage class of an object. It was completely needless, because automatic is the implied storage class for local variables and the keyword could not be used anywhere else. The opposite of auto is static (both locally and globally).
The following two declarations are equivalent:
int main() {
int a;
auto int b; // pre-C++11 only: C++11 removed this use of auto
}
Because the keyword was utterly useless, it was recycled in C++11 (known as “C++0x” while in development) and given a new meaning: it lets the compiler infer the variable's type from its initializer (like var in C#):
auto a = std::max(1.0, 4.0); // `a` now has type double.
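A quick way to convince yourself of the deduced type is to check it at compile time. This is only a sketch; the function name larger is made up, and it needs a C++11 or later compiler.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>

// Hypothetical helper: returns the larger of two double literals.
double larger() {
    auto a = std::max(1.0, 4.0);
    // std::max returns const double&; auto drops the reference and the const.
    static_assert(std::is_same<decltype(a), double>::value,
                  "a is deduced as double");
    assert(a == 4.0);
    return a;
}
```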
Variables in C++ can either be declared on the stack or the heap. When you declare a variable in C++, it automatically goes onto the stack, unless you explicitly use the new operator (it goes onto the heap).
MyObject x = MyObject(params); // onto the stack
MyObject * y = new MyObject(params); // onto the heap
This makes a big difference in the way the memory is managed. When a variable is declared on the stack, it will be deallocated when it goes out of scope. A variable on the heap will not be destroyed until delete is explicitly called on the object.
Stack automatics are variables allocated on the stack of the current function. The idea behind designing a class that can act as a stack automatic is that it should be possible to fully initialize it with one call and destroy it with another. It is essential that the destructor frees all resources allocated by the object, and that the constructor produces an object which is fully initialized and ready for use. The same goes for the copy operation: it should be easy to make copies of the object, and each copy should be fully functional and independent.
Such a class should be used much like the primitive types int, float, etc. You define them (optionally giving them an initial value), pass them around, and in the end leave the cleanup to the compiler.
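As a rough illustration of the "Rule of Three" mentioned in the quote, here is a sketch of a class that owns a heap array and defines all three operations. The class name IntBuffer and the helper copies_are_independent are invented for this example; in real code std::vector would usually do this job.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class that owns a heap-allocated array of ints.
class IntBuffer {
public:
    explicit IntBuffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // 1. Copy constructor: deep copy, so the two objects share nothing.
    IntBuffer(const IntBuffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    // 2. Copy assignment: copy first, then swap, which also makes
    //    self-assignment safe.
    IntBuffer& operator=(const IntBuffer& other) {
        IntBuffer tmp(other);
        std::swap(size_, tmp.size_);
        std::swap(data_, tmp.data_);
        return *this;
    }

    // 3. Destructor: releases what the constructor acquired.
    ~IntBuffer() { delete[] data_; }

    int& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    int* data_;
};

// Used "like an int": created, copied, assigned, and destroyed on scope exit.
bool copies_are_independent() {
    IntBuffer a(3);
    a[0] = 1;
    IntBuffer b = a;  // copy constructor
    b[0] = 2;
    IntBuffer c(1);
    c = a;            // copy assignment
    c[0] = 3;
    return a[0] == 1 && b[0] == 2 && c[0] == 3 && c.size() == 3;
}
```

Without the user-defined copy operations, the compiler-generated ones would copy only the pointer, and both objects would try to delete[] the same array.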
Correct me if I'm wrong, but I think a copy operation is not mandatory to take full advantage of automatic stack cleanup.
For example, consider a classic MutexGuard object: it doesn't need a copy operation to be useful as a stack automatic, or does it?
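A sketch of such a guard, with copying explicitly disabled, could look like the following (C++11 or later). ToyMutex is a made-up single-threaded stand-in so the example stays deterministic; the standard library's std::lock_guard follows the same non-copyable pattern for real mutexes.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a real mutex; it only records its state.
struct ToyMutex {
    bool locked = false;
    void lock()   { assert(!locked); locked = true; }
    void unlock() { assert(locked);  locked = false; }
};

// Locks in the constructor, unlocks in the destructor, and cannot be copied.
class MutexGuard {
public:
    explicit MutexGuard(ToyMutex& m) : mutex_(m) { mutex_.lock(); }
    ~MutexGuard() { mutex_.unlock(); }

    MutexGuard(const MutexGuard&) = delete;
    MutexGuard& operator=(const MutexGuard&) = delete;

private:
    ToyMutex& mutex_;
};

static_assert(!std::is_copy_constructible<MutexGuard>::value,
              "the guard is not copy constructible");
static_assert(!std::is_copy_assignable<MutexGuard>::value,
              "the guard is not copy assignable");

bool locked_inside_and_released_after() {
    ToyMutex m;
    bool inside = false;
    {
        MutexGuard guard(m);
        inside = m.locked;  // held while guard is alive
    }                       // guard's destructor unlocks here
    return inside && !m.locked;
}
```

So the class still benefits fully from automatic destruction; it just deliberately gives up the "can be copied" part of behaving like an int.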
Related
https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization
There is an example of how RAII works. I always thought that C++ gets this behavior from C. That when you define a variable in a function, that variable becomes invalid when you leave the function. Though maybe the concept has no meaning when there is no object. C does not initialize structs but C++ does. Is that the difference? I am a bit confused.
I always thought that ... That when you define a variable in a function, that variable becomes invalid when you leave the function
You've thought correctly.
You seem to be confused about what RAII is for. It is for management of dynamic resources such as dynamic memory allocations. It relies on language features such as constructors and destructors, which do not exist in C.
The difference is that C++ has constructors and destructors.
C doesn't guarantee anything will be done on scope entrance and exit. If you declare a variable and don't assign anything to it, you can read garbage when you try to read that variable. When you exit a scope, nothing is done to what was on the stack in the scope you just exited.
In C++, trivial types like int behave the same way. With class types (classes, structs, unions), a variable of the type is created with a constructor and destroyed with a destructor. If you declare a variable in a way that calls a non-trivial constructor, then that constructor performs initialization on the variable. If you declare a scoped variable of a type that has a non-trivial destructor, that destructor is run on scope exit to clean up the variable. Use of this construction and destruction mechanism is what is usually meant by RAII in C++.
In C, this programming error can easily happen.
typedef struct
{
int *data;
} Trivial_C;
void
my_c_function(void)
{
Trivial_C t;
t.data=malloc(5*sizeof(int));
// ... do something with t.data ...
} // oops! t does not exist anymore but the allocated
// memory that was known through t.data still exists!
In C++, RAII relies on destructors to do some cleanup
when an object disappears.
struct Trivial_Cpp
{
int *data;
Trivial_Cpp() : data{new int[5]} {} // data is allocated at creation
~Trivial_Cpp() { delete[] data; } // data is released at destruction
};
void
my_cpp_function()
{
Trivial_Cpp t;
// ... do something with t.data ...
} // OK, t does not exist anymore and the destructor has
// been called, so the allocated memory has been released
Of course these code snippets are trivial and largely incomplete.
Moreover, you rarely need to allocate memory by yourself; std::vector
for example will do it perfectly for you (because it uses RAII).
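For comparison, a sketch of the same five-int buffer using std::vector, where no constructor or destructor has to be written by hand. The function name sum_of_five is made up for the example.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical function: allocates five ints, uses them, and releases them
// automatically when data goes out of scope.
int sum_of_five() {
    std::vector<int> data(5);                // five zero-initialized ints
    std::iota(data.begin(), data.end(), 1);  // fill with 1, 2, 3, 4, 5
    assert(data.size() == 5);
    return std::accumulate(data.begin(), data.end(), 0);
}  // the vector's destructor frees the memory here
```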
I always thought that C++ gets this behavior from C.
No, C++ and C are wildly different languages (by now) and differ in many aspects, often surprising.
First, you should understand the different kinds of object lifetime (an object, in C speak, is a thing that uses memory; it has nothing to do with Object Oriented Programming). In C there are three kinds of lifetime:
static: Things that get allocated before the program starts and destroyed after it ends. This covers all global variables and static local variables. Static objects are implicitly initialized to 0 or 0.0, or to NULL for pointer objects.
dynamic: Objects that are created with the help of an allocation function such as malloc, which in turn usually calls the OS via a syscall (e.g. sys_brk) to create space on the heap. This is an implementation detail, though: C is very vague about how exactly dynamic memory is acquired and has no notion of the heap.
When you don't need the memory anymore, you should free the object.
automatic: This is what happens when you simply create a local variable without much thought. The space is usually allocated on the stack, and the object becomes invalid as soon as execution leaves its scope.
const int GLOBAL_VARIABLE; // static lifetime
struct s {
int a;
};
void foo(void)
{
static int local_static_variable; // static lifetime as well.
int x; // automatic lifetime
struct s t; // automatic as well
struct s *p = malloc(sizeof (*p)); // while the pointer p itself is automatic, it points to an object of dynamic lifetime
{
int y; // automatic variable
}
// y out of scope: "freed"
free(p); // object pointed to by p must be freed.
} // x and p go out of scope
C++ with RAII extends this automatic freeing on scope exit to more complex structures and to objects allocated on the heap. For example, you may have a structure that contains pointers to objects, which in turn contain pointers to other objects, and so on. In C, automatic allocation of the first-level structure only allocates memory for the pointers but not for the objects they point to. Nor, if you allocate the memory for those objects using malloc, will it be freed when the primary structure goes out of scope:
struct s {
int *array;
};
void foo(void)
{
struct s s; // structure automatically allocated, but s.array is uninitialized
s.array = calloc(10, sizeof (*s.array)); // allocate space for 10 elements in array
s.array[0] = 0xf00;
free(s.array); // must free space acquired earlier
}
// s goes out of scope.
// if we hadn't free'd s.array, the allocated space would still "be there"
C++ allows you to define constructors and destructors for each object and subobject that will automatically allocate and free all subobjects, so that when s goes out of scope, s.array is deallocated as well.
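Here is a small sketch of that cascade. The types Inner and Outer and the destroyed counter are invented for the illustration; it needs C++11 for std::unique_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical subobject type that counts how many times it was destroyed.
struct Inner {
    static int destroyed;
    ~Inner() { ++destroyed; }
};
int Inner::destroyed = 0;

// Outer holds one Inner directly and owns a second one on the heap.
struct Outer {
    Inner direct;                  // lives inside the Outer object itself
    std::unique_ptr<Inner> owned;  // lives on the heap, owned by Outer
    Outer() : owned(new Inner) {}
    // No destructor written: the members' destructors run automatically.
};

int inners_destroyed_by_one_outer() {
    const int before = Inner::destroyed;
    {
        Outer s;
        assert(Inner::destroyed == before);  // nothing destroyed yet
    }  // s goes out of scope: both Inner objects are destroyed
    return Inner::destroyed - before;
}
```

The function returns 2: one for the member stored inside Outer and one for the heap object released by the unique_ptr.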
Here's the line of code:
A a = static_cast<A>(*(new A)); // ?
It compiles fine on 64bit clang at least.
But where is the memory actually allocated and what happens to variable a?
Besides the fact that no static_cast is needed, the memory allocated with new A simply leaks. You have lost the pointer to it and can never delete the object properly.
But where is the memory actually allocated and what happens to variable a?
Variable a is destroyed as soon as it leaves scope, as usual.
A a = static_cast<A>(*(new A)); // ?
This does the following.
(new A) // allocate a new A on the heap
*(new A) // dereference: the type is now A instead of A*
static_cast<A>(*(new A)) // converts an A to an A; since the types are identical, the cast is redundant
A a = static_cast<A>(*(new A)); // copies the (default) value of the heap-allocated A into a
; // leaks the allocated A, as the last pointer to it disappears
I'm going to answer this question on the assumption that this line of code appears inside a function. If it appears elsewhere, the bit about the "stack" is inaccurate but everything else is still accurate.
This line of code compiles to four operations, which we can write as their own lines of C++ to make things clearer. It makes two allocations, in two different places, and one of them is "leaked".
A a;
{
A *temp = new A;
a = *temp;
}
The first operation allocates space for an object of type A on the "stack", and default-initializes it. This object is accessible through the variable a. It will be automatically destructed and deallocated no later than when the function returns; depending on surrounding context, this might happen earlier, but in no case while the variable a is in scope.
The second operation allocates space for another object of type A, but on the "heap" instead of the "stack". This object is also default-initialized. The new operator returns a pointer to this object, which the compiler stores in a temporary variable. (I gave that variable the name temp because I had to give it some name; in your original code the temporary is not accessible by any means.) This object will only ever be deallocated if, at some point in the future, the pointer returned by new is used in a delete operation.
The third operation, finally, copies the contents of the object on the heap, pointed to by temp, into the object on the stack, accessible via the variable a. (Note: the static_cast<A>(...) that you had written here has no effect whatsoever, because *temp already has the type A. Therefore, I took it out. Also, strictly speaking, the original line copy-constructs a directly from the heap object rather than default-initializing it and assigning afterwards; the split into separate steps is for illustration.)
Finally, the temporary variable holding the pointer to the object on the heap is discarded. The object on the heap is not deallocated when this happens; in fact, it becomes impossible for anything ever to deallocate it. That object is said to have leaked.
You probably wanted to write either
A a;
which allocates an object on the stack and does nothing else, or
// note: C++11 and later; before C++11, boost::shared_ptr<A> a(new A()); is the closest equivalent
auto a = std::make_shared<A>();
which allocates an object on the heap and arranges to reference-count it, so that it probably won't leak. (There are a few other things you might have meant, but those are the most likely.)
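As a sketch of how the smart-pointer variants avoid the leak, the following counts live objects. The type name A comes from the question, but its counting members and the function name are assumptions added for the demonstration; std::unique_ptr is shown as another option when shared ownership is not needed.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the question's A, instrumented with a live-object counter.
struct A {
    static int live;
    A() { ++live; }
    A(const A&) { ++live; }
    ~A() { --live; }
};
int A::live = 0;

int live_after_smart_pointer_scope() {
    {
        auto shared = std::make_shared<A>();  // heap object, reference-counted
        std::unique_ptr<A> unique(new A);     // heap object, single owner
        A a = *shared;                        // stack copy of the heap object
        assert(A::live == 3);
    }  // all three objects are destroyed here, no delete required
    return A::live;
}
```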
For a simple definition of A, it is equivalent to:
A a(*new(A));
An A is dynamically allocated on the heap, a is copy constructed on the stack, and the dynamic allocation is leaked.
For a trivial definition of A the overall effect might as well be:
new A;
A a;
this version implements the leak without the wasteful copy operation or the messy, redundant cast :)
If we have the following code snippet:
MyObject my_object = MyObject(0);
my_object = MyObject(1);
What happens to MyObject(0)? Is it deleted? From what I have read, it should only be deleted when we leave the scope in which it was created, so the answer is probably no. If that is the case, is there any way to explicitly delete it other than using pointers?
MyObject my_object = MyObject(0);
This line creates my_object on the stack using MyObject's constructor that can accept an int.
my_object = MyObject(1);
This line creates a temporary MyObject, again using the same constructor as the first. It is then assigned to my_object by calling the assignment operator. If you didn't provide this operator, the compiler generates one for you that performs a memberwise (shallow) copy. When this statement completes, the temporary MyObject is destroyed and its destructor is called.
When your my_object goes out of scope it is in turn destroyed in the same fashion. At no point do you need to manually delete this because everything is allocated on the stack.
There are two main regions in memory when talking about newly created objects: the stack and the heap. The heap contains all objects created dynamically with new. These objects need to be explicitly deleted with the delete operator. The stack is scope-specific and all objects defined on the stack will be deleted automatically. Since you don't use new, all your objects will be destroyed when their scope ends.
Assuming no compiler optimizations, your code roughly translates to:
{
MyObject my_object;
MyObject tempObject0(0);
my_object = tempObject0;
MyObject tempObject1(1);
my_object = tempObject1;
}//3 objects are destroyed by this point (in theory)
Also note the difference between
MyObject myObject(0);
and
MyObject myObject = MyObject(0);
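The second case creates a temporary object, so it will be less efficient. This all of course assumes no optimizations. Depending on the compiler, it might translate to the same thing (and since C++17 the language guarantees that no temporary is created in this form of initialization).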
The term delete has a special meaning in C++, so the use of deleted is unfortunate.
MyObject my_object = MyObject(0);
This line declares an object of type MyObject with automatic storage duration (i.e., on the stack). This object will be destructed (i.e., its destructor will be executed) when the scope ends. The Standard makes no provision for how the associated memory is reclaimed (see later example).
This object of type MyObject will be constructed using the expression MyObject(0). This constructor will initialize the memory that has been set apart for its exclusive use.
Note: actually, a temporary could be created and the copy constructor then called, but most compilers eschew this intermediate step, thankfully, as the Standard specifically allows them to.
my_object = MyObject(1);
This line assigns a new value, determined by the expression MyObject(1), to the already existing my_object object. To do so, a temporary of type MyObject is created. Then the assignment operator is executed; if not overloaded, it copies the state of the temporary into my_object, erasing whatever state was there before. At the end of the expression, the temporary is destructed (once again, no provision is made for reclaiming the associated memory).
Note: MyObject(0) is not "deleted", as it does not exist, instead the memory it had written its state to is reused to copy the state from MyObject(1).
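To see the sequence of calls for yourself, here is a sketch that logs every constructor, assignment, and destructor call. The events log, the value member, and the trace function are made up for the illustration, and the sequence described assumes the copy in the first line is elided (guaranteed since C++17, and done by mainstream compilers by default before that).

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> events;  // hypothetical global log of what happened

// Instrumented stand-in for the question's MyObject.
struct MyObject {
    int value;
    MyObject(int v) : value(v) {
        events.push_back("ctor " + std::to_string(value));
    }
    MyObject(const MyObject& o) : value(o.value) {
        events.push_back("copy " + std::to_string(value));
    }
    MyObject& operator=(const MyObject& o) {
        value = o.value;
        events.push_back("assign " + std::to_string(value));
        return *this;
    }
    ~MyObject() { events.push_back("dtor " + std::to_string(value)); }
};

std::vector<std::string> trace() {
    events.clear();
    {
        MyObject my_object = MyObject(0);  // constructed in place
        my_object = MyObject(1);           // temporary, assignment, temporary destroyed
        assert(my_object.value == 1);
    }  // my_object destroyed here
    return events;
}
```

With elision, the log reads: ctor 0, ctor 1, assign 1, dtor 1 (the temporary), dtor 1 (my_object, which by then holds 1). There is never a "dtor 0", which matches the note above: the state written by MyObject(0) is simply overwritten.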
As promised, since this seems your worry, a discussion on the memory aspects. This is compiler specific, but most compilers do behave similarly.
Suppose that we have the following function:
void f() {
MyObject my_object = MyObject(0);
{
my_object = MyObject(1);
do_something(my_object);
}
{
my_object = MyObject(2);
do_something(my_object);
}
}
How much space does it require on the stack?
We assume that it performs direct construction on the first line.
We assume the compiler is not smart enough to perform Stack Coloring (Clang does not, for example).
With those assumptions, it requires the space for 3 MyObject.
MyObject my_object = MyObject(0);: my_object needs to live until the end of the function
my_object = MyObject(1);: a temporary needs to be created
my_object = MyObject(2);: a temporary needs to be created
The stack space is reclaimed at the end of the function execution.
If the compiler was smart enough to perform Stack Coloring, then the two temporaries (that are never needed together) could use the same memory spot, thus lowering the space requirement to 2 MyObject.
A smart optimizer could also, possibly, build MyObject(1) and MyObject(2) directly into my_object (if it can prove that the effects would be the same as building a temporary and then copying it), thus lowering the space requirement to 1 MyObject.
Finally, if the definition of do_something is visible, and it does not use its parameter, then under certain conditions the compiler could (in theory) completely bypass the construction of my_object. Such optimizations can be witnessed with simple programs like:
int main() { int i = 0; for (; i < 1000; ++i); return i; }
which are trivially optimized to:
int main() { return 1000; }
(Note how i disappeared)
As you may notice... it's actually very hard to guess what the compiler/optimizer will be able to do. If you really have tight memory requirements, then (perhaps surprisingly), your best bet might be to replace the blocks by functions.
The 1st line creates a temporary object, and this object is used to initialize my_object.
In the 2nd line, a temporary object is created and assigned to my_object.
So there is only one named object, my_object.
We need not think about the temporary objects. It is the compiler's responsibility to handle them.
I am still new to C++. I have found that you can instantiate an instance in C++ with two different ways:
// First way
Foo foo;
foo.do_something();
// Second way
Baz *baz = new Baz();
baz->do_something();
And with both I don't see big difference and can access the attributes. Which is the preferred way in C++? Or if the question is not relevant, when do we use which and what is the difference between the two?
Thank you for your help.
The question is not relevant: there's no preferred way, those just do different things.
C++ has both value and reference semantics. When a function asks for a value, it means you'll pass it a copy of your whole object. When it asks for a reference (or a pointer), you'll only pass it the memory address of that object. Both semantics are convertible, that is, if you get a value, you can get a reference or a pointer to it and then use it, and when you get a reference you can get its value and use it. Take this example:
void foo(int bar) { bar = 4; }
void foo(int* bar) { *bar = 4; }
void test()
{
int someNumber = 3;
foo(someNumber); // calls foo(int)
std::cout << someNumber << std::endl;
// printed 3: someNumber was not modified because of value semantics;
// since we passed a copy of someNumber to foo, the change was not propagated
// to our local version
foo(&someNumber); // calls foo(int*)
std::cout << someNumber << std::endl;
// printed 4: someNumber was modified, because passing a pointer lets people
// change the pointed value
}
It is a very, very common thing to create a reference to a value (i.e. get the pointer of a value), because references are very useful, especially for complex types, where passing a reference notably avoids a possibly costly copy operation.
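The example above uses a pointer; C++ also has true reference parameters, which behave the same way without the explicit & and * at the call site. A sketch with made-up function names:

```cpp
#include <cassert>

void set_by_value(int bar)      { bar = 4; }  // changes only the local copy
void set_by_pointer(int* bar)   { assert(bar != nullptr); *bar = 4; }
void set_by_reference(int& bar) { bar = 4; }  // changes the caller's variable

int after_value()     { int n = 3; set_by_value(n);     return n; }
int after_pointer()   { int n = 3; set_by_pointer(&n);  return n; }
int after_reference() { int n = 3; set_by_reference(n); return n; }
```

after_value() returns 3, while after_pointer() and after_reference() both return 4.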
Now, the instantiation way you'll use depends on what you want to achieve. The first way you've shown uses automatic storage; the second uses the heap.
The main difference is that objects on automatic storage are destroyed with the scope in which they existed (a scope being roughly defined as a pair of matching curly braces). This means that you must not ever return a reference to an object allocated on automatic storage from a regular function, because by the time your function returns, the object will have been destroyed and its memory space may be reused for anything at any later point by your program. (There are also performance benefits for objects allocated on automatic storage because your OS doesn't have to look up a place where it might put your new object.)
Objects on the heap, on the other hand, continue to exist until they are explicitly deleted by a delete statement. There is an OS- and platform-dependent performance overhead to this, since your OS needs to look up your program's memory to find a large enough unoccupied place to create your object at. Since C++ is not garbage-collected, you must instruct your program when it is the time to delete an object on the heap. Failure to do so leads to leaks: objects on the heap that are no longer referenced by any variable, but were not explicitly deleted and therefore will exist until your program exits.
So it's a matter of tradeoff. Either you accept that your values can't outlive your functions, or you accept that you must explicitly delete it yourself at some point. Other than that, both ways of allocating objects are valid and work as expected.
For further reference, automatic storage means that the object is allocated wherever its parent scope was. For instance, if you have a class Foo that contains a std::string, the std::string will exist wherever you allocate your Foo object.
class Foo
{
public:
// in this context, automatic storage refers to wherever Foo will be allocated
std::string a;
};
void foo()
{
// in this context, automatic storage refers to your program's stack
Foo bar; // 'bar' is on the stack, so 'a' is on the stack
Foo* baz = new Foo; // '*baz' is on the heap, so its 'a' is on the heap too
// in both cases 'a' is destroyed when the holding object is destroyed:
// automatically for 'bar', but only by an explicit delete for '*baz'
delete baz;
}
As stated above, you cannot directly leak objects that reside on automatic storage, but you cannot use them once the scope in which they were created is destroyed. For instance:
int* foo()
{
int a; // cannot be leaked: automatically managed by the function scope
return &a; // BAD: a doesn't exist anymore
}
int* foo()
{
int* a = new int; // can be leaked
return a; // NOT AS BAD: now the pointer points to somewhere valid,
// but you eventually need to call `delete a` to release the memory
}
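Two safer ways to hand a result back to the caller are returning by value, or returning a smart pointer that deletes the object for you. This is a sketch with made-up function names; std::unique_ptr needs C++11.

```cpp
#include <cassert>
#include <memory>

// Returning by value: the caller receives its own copy, so nothing dangles.
int make_value() {
    int a = 42;
    return a;
}

// Returning an owning smart pointer: the heap object lives on after the
// function returns and is deleted when the last owner goes away.
std::unique_ptr<int> make_owned() {
    return std::unique_ptr<int>(new int(42));
}

int use_both() {
    int v = make_value();
    std::unique_ptr<int> p = make_owned();
    assert(p != nullptr);
    return v + *p;
}  // p's destructor deletes the int here; no explicit delete is needed
```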
The first way -- "allocating on the stack" -- is generally faster and preferred much of the time. The constructed object is destroyed when the function returns. This is both a blessing -- no memory leaks! -- and a curse, because you can't create an object that lives for a longer time.
The second way -- "allocating on the heap" is slower, and you have to manually delete the objects at some point. But it has the advantage that the objects can live on until you delete them.
The first way allocates the object on the stack (though the class itself may have heap-allocated members). The second way allocates the object on the heap, and must be explicitly delete'd later.
It's not like in languages like Java or C# where objects are always heap-allocated.
They do very different things. The first one allocates an object on the stack, the 2nd on the heap. The stack allocation only lasts for the lifetime of the declaring method; the heap allocation lasts until you delete the object.
The second way is the only way to dynamically allocate objects, but comes with the added complexity that you must remember to return that memory to the operating system (via delete/delete[]) when you are done with it.
The first way will create the object on the stack, and the object will go away when you return from the function it was created in.
The second way will create the object on the heap, and the object will stick around until you call delete baz;.
If the object is just a temporary variable, the first way is better. If it's more permanent data, the second way is better - just remember to call delete when you're finally done with it so you don't build up cruft on your heap.
Hope this helps!
When you create a new object in C++ that lives on the stack, (the way I've mostly seen it) you do this:
CDPlayer player;
When you create an object on the heap you call new:
CDPlayer* player = new CDPlayer();
But when you do this:
CDPlayer player=CDPlayer();
it creates a stack-based object, but what's the difference between that and the top example?
The difference is important with PODs (basically, all built-in types like int, bool, double etc. plus C-like structs and unions built only from other PODs), for which there is a difference between default initialization and value initialization. For PODs, a simple
T obj;
will leave obj uninitialized, while T() value-initializes the object. So
T obj = T();
is a good way to ensure that an object is properly initialized.
This is especially helpful in template code, where T might be either a POD or a non-POD type. When you know that T is not a POD type, T obj; suffices.
Addendum: You can also write
T* ptr = new T; // note the missing ()
(and avoid initialization of the allocated object if T is a POD).
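A sketch of the value-initialized forms in action. The struct Pod and both function names are invented for the example; the uninitialized forms are deliberately left out, because reading their values would be undefined behaviour.

```cpp
#include <cassert>

// Hypothetical POD type for the demonstration.
struct Pod {
    int i;
    double d;
};

// Generic helper: works for PODs and non-PODs alike.
template <typename T>
T make_value_initialized() {
    T obj = T();  // value-initialization: PODs are zeroed
    return obj;
}

bool value_initialization_zeroes() {
    int n = make_value_initialized<int>();
    Pod p = make_value_initialized<Pod>();
    int* heap = new int();  // with the (): value-initialized to 0
    const bool ok = (n == 0) && (p.i == 0) && (p.d == 0.0) && (*heap == 0);
    delete heap;
    return ok;
}
```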
When you create a new object in C++ that lives on the stack, (…) you do this:
CDPlayer player;
Not necessarily on the stack: variables declared in this way have automatic storage. Where they actually go depends. It may be on the stack (in particular when the declaration is inside a method) but it may also be somewhere else.
Consider the case where the declaration is inside a class:
class foo {
int x;
};
Now the storage of x is wherever the class instance is stored. If it’s stored on the heap, then so is x:
foo* pf = new foo(); // pf->x lives on the heap.
foo f; // f.x lives where f lives, which has (once again) automatic storage.
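One way to check this claim is to verify that the member's address falls inside the enclosing object, wherever that object happens to live. This is only a sketch: foo is declared as a struct here so that x is accessible, and the two helper functions are made up.

```cpp
#include <cassert>
#include <memory>

struct foo {
    int x;
};

// True if the bytes of obj.x lie within the bytes of obj itself.
bool member_lives_inside(const foo& obj) {
    const char* begin  = reinterpret_cast<const char*>(&obj);
    const char* member = reinterpret_cast<const char*>(&obj.x);
    return member >= begin && member + sizeof(obj.x) <= begin + sizeof(foo);
}

bool holds_for_stack_and_heap() {
    foo f{};                             // automatic storage
    std::unique_ptr<foo> pf(new foo());  // dynamic storage
    assert(pf != nullptr);
    return member_lives_inside(f) && member_lives_inside(*pf);
}
```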