Basically, I have not been able to find much information about this on the internet, but I understand that the basic class instantiation is:
-> operator new() -> allocates memory from somewhere
-> constructor -> assigns values to "data types"
Now, what I want to know is how C++ allocates the methods/functions of a class, as opposed to its data members. According to my web research, this cannot happen in operator new() because it only allocates raw memory, and I have not been able to figure out how it could be done in the constructor with actual functions (rather than function pointers). Also, because the keyword static exists, I assume that without it a function is somehow allocated as part of the enclosing class. How and where does this happen?
Also, if the functions are included in the memory of the class, does sizeof() give the size of just the class and its data members, or does it also include the associated functions?
While compiling the code, the compiler records the address of the starting point of each function in the generated code. This address can be relative to the starting location of the program or an absolute memory address.
The point is that when the function is called in the code (assuming that scope issues are taken care of), the compiler simply inserts a jump to the address where the function's code is located. Returning to the call site involves some additional bookkeeping.
So when you say space is allocated, it is just the space occupied by the machine code of the function, plus an entry in a compiler table which says that this function is present at this address.
This is pretty much the case with every compiled programming language, not only C++.
As for your other part: sizeof(type) returns the size in bytes of the object representation of the type, which is basically the combined size of its data members plus whatever padding the compiler inserts for alignment. Member functions do not contribute to it.
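To make that concrete, here is a minimal sketch (the struct names are invented for this illustration); adding member functions does not change the result of sizeof:

#include <iostream>

struct NoFunctions {
    int a;
    double b;
};

struct ManyFunctions {          // same data members, plus several member functions
    int a;
    double b;
    void f() {}
    void g() {}
    int sum() const { return a + static_cast<int>(b); }
};

int main() {
    // Typically prints the same value twice (e.g. 16 on many 64-bit targets),
    // because only the data members (and padding) count.
    std::cout << sizeof(NoFunctions) << '\n'
              << sizeof(ManyFunctions) << '\n';
}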
From the link below, Difference between Definition and Declaration says that:
Definition of a variable says where the variable gets stored, i.e., memory for the variable is allocated during the definition of the variable.
And to my knowledge, the declaration of a class looks like:
class stu ;
And the definition of a class looks like:
class stu {
public:
    int x;
};
And so, from the information above, the memory allocation for this class should happen when I write the complete definition of the class. However,
this link says that:
Memory will be allocated when you create an instance of the class.
which means that the memory would be allocated at the moment I write
stu s;
So I would like to know exactly when memory is allocated for this class. In other words, does it happen at compile time or at run time?
In general: The memory holding the values for the members is allocated when they are used, which is - with some exceptions - at runtime. (assuming it is not optimized away by the compiler)
The forward declaration of a class is for the compiler to make the type known.
The definition describes the class:
its member functions are translated into machine code. That code - depending on the target platform - lives in a code section of the executable that is loaded into memory. So the member functions take up memory before any instance is created.
The compiler also stores some information about the memory layout, which is either part of the machine code or also exists somewhere in a data section.
This memory allocation is however about the description of the class, and not what is generally referred to when you talk about memory allocation for a type.
The memory holding the values for the members is allocated when they are used which is generally at runtime. Under certain circumstances, the values of an instance of a type can already be determined at compile-time, which might have the result that those also become part of the data section.
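As a rough sketch of the different places the memory can come from (all names here are invented for illustration):

struct Point { int x; int y; };

Point global_origin{0, 0};        // static storage: reserved in the program image,
                                  // exists before main() runs
constexpr Point unit{1, 1};       // values computed at compile time; may end up
                                  // in a read-only data section

int main() {
    Point local{2, 3};            // automatic storage: allocated on the stack at run time
    Point* dyn = new Point{4, 5}; // dynamic storage: allocated on the heap at run time
    int result = global_origin.x + unit.y + local.x + dyn->y;
    delete dyn;
    return result;
}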
Every time somebody asks a question about delete[] on here, there is always a pretty general "that's how C++ does it, use delete[]" kind of response. Coming from a vanilla C background what I don't understand is why there needs to be a different invocation at all.
With malloc()/free() your options are to get a pointer to a contiguous block of memory and to free a block of contiguous memory. Something in implementation land comes along and knows what size the block you allocated was based on the base address, for when you have to free it.
There is no function free_array(). I've seen some crazy theories on other questions tangentially related to this, such as that calling delete ptr will only free the top of the array, not the whole array. Or the more correct one: that it is simply not defined. And sure, if this were the first version of C++ and you made a weird design choice, that would make sense. But why, with $PRESENT_YEAR's standard of C++, has it not been overloaded???
It seems the only extra bit that C++ adds is going through the array and calling destructors, and I think maybe this is the crux of it: it literally uses a separate function to save us a single runtime length lookup (or a nullptr at the end of the list), in exchange for torturing every new C++ programmer, or any programmer who had a fuzzy day and forgot that there is a different reserved word.
Can someone please clarify once and for all if there is a reason besides "that's what the standard says and nobody questions it"?
Objects in C++ often have destructors that need to run at the end of their lifetime. delete[] makes sure the destructors of each element of the array are called. But doing this has unspecified overhead, while delete does not. This is why there are two forms of delete expressions. One for arrays, which pays the overhead and one for single objects which does not.
In order to only have one version, an implementation would need a mechanism for tracking extra information about every pointer. But one of the founding principles of C++ is that the user shouldn't be forced to pay a cost that they don't absolutely have to.
Always delete what you new and always delete[] what you new[]. But in modern C++, new and new[] are generally not used anymore. Use std::make_unique, std::make_shared, std::vector or other more expressive and safer alternatives.
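For illustration, a minimal sketch of those alternatives (Widget is just a made-up stand-in type):

#include <memory>
#include <vector>

struct Widget { int value = 0; };

int main() {
    // Single object: destroyed automatically, no delete needed.
    auto one = std::make_unique<Widget>();

    // A sequence of objects: prefer a container over new[]/delete[].
    std::vector<Widget> many(10);

    // If an owning pointer to an array is really required (C++14 and later):
    auto arr = std::make_unique<Widget[]>(10);   // the array deleter is picked automatically
    arr[3].value = 42;
}   // everything above is cleaned up here without any delete or delete[]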
Basically, malloc and free allocate memory, and new and delete create and destroy objects. So you have to know what the objects are.
To elaborate on the unspecified overhead that François Andrieux's answer mentions, you can see my answer on this question, in which I examined what a specific implementation does (Visual C++ 2013, 32-bit). Other implementations may or may not do a similar thing.
In the case where new[] was used for an array of objects with a non-trivial destructor, what it did was allocate 4 extra bytes and return the pointer shifted 4 bytes ahead. So when delete[] wants to know how many objects there are, it takes the pointer, shifts it back 4 bytes, and treats the number stored at that address as the number of objects. It then calls the destructor on each object (the size of each object is known from the type of the pointer passed). Finally, in order to release the exact allocation, it passes the address 4 bytes before the passed address to the deallocation function.
On this implementation, passing an array allocated with new[] to a regular delete results in calling a single destructor, of the first element, followed by passing the wrong address to the deallocation function, corrupting the heap. Don't do it!
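To make the scheme easier to picture, here is a rough hand-written sketch of that kind of bookkeeping. It is not the actual code of any compiler, it ignores alignment and exception safety for brevity, and the function names are invented:

#include <cstddef>
#include <cstdlib>
#include <new>

template <typename T>
T* sketch_new_array(std::size_t n) {
    // Over-allocate: room for the element count, then the elements themselves.
    void* raw = std::malloc(sizeof(std::size_t) + n * sizeof(T));
    if (!raw) throw std::bad_alloc{};
    *static_cast<std::size_t*>(raw) = n;              // stash the count up front
    T* first = reinterpret_cast<T*>(static_cast<std::size_t*>(raw) + 1);
    for (std::size_t i = 0; i < n; ++i)
        new (first + i) T();                          // construct each element in place
    return first;                 // the caller only ever sees the shifted pointer
}

template <typename T>
void sketch_delete_array(T* first) {
    std::size_t* header = reinterpret_cast<std::size_t*>(first) - 1;
    for (std::size_t i = *header; i > 0; --i)
        first[i - 1].~T();                            // destroy in reverse order
    std::free(header);            // free from the original, unshifted address
}

A real compiler only needs this extra header for types with non-trivial destructors; for trivially destructible element types it can skip the count entirely.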
Something not mentioned in the other (all good) answers is that the root cause of this is that arrays - inherited from C - have never been a "first-class" thing in C++.
They have primitive C semantics rather than C++ semantics, and therefore lack the C++ compiler and runtime support that would let you, or the compiler and runtime systems, do useful things with pointers to them.
In fact, they're so unsupported by C++ that a pointer to an array of things looks just like a pointer to a single thing. That, in particular, would not happen if arrays were proper parts of the language - even as part of a library, like string or vector.
This wart on the C++ language happened because of this heritage from C. And it remains part of the language - even though we now have std::array for fixed-length arrays and (have always had) std::vector for variable-length arrays - largely for purposes of compatibility: Being able to call out from C++ to operating system APIs and to libraries written in other languages using C-language interop.
And ... because there are truckloads of books and websites and classrooms out there teaching arrays very early in their C++ pedagogy, because of a) being able to write useful/interesting examples early on that do in fact call OS APIs, and of course because of the awesome power of b) "that's the way we've always done it".
Generally, C++ compilers and their associated runtimes build on top of the platform's C runtime. In particular in this case the C memory manager.
The C memory manager allows you to free a block of memory without knowing its size, but there is no standard way to get the size of the block from the runtime and there is no guarantee that the block that was actually allocated is exactly the size you requested. It may well be larger.
Thus the block size stored by the C memory manager can't usefully be used to enable higher-level functionality. If higher-level functionality needs information on the size of the allocation then it must store it itself. (And C++ delete[] does need this for types with destructors, to run them for every element.)
C++ also has an attitude of "you only pay for what you use": storing an extra length field for every allocation (separate from the underlying allocator's bookkeeping) would not fit well with this attitude.
Since the normal way to represent an array of unknown (at compile time) size in C and C++ is with a pointer to its first element, there is no way the compiler can distinguish between a single object allocation and an array allocation based on the type system. So it leaves it up to the programmer to distinguish.
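A minimal illustration of that ambiguity:

int main() {
    int* single = new int(42);    // points at one int
    int* array  = new int[42];    // points at the first of forty-two ints
    // Both variables have exactly the same type, int*.  Nothing in the type
    // system records which one came from new[]; only the programmer knows.
    delete single;
    delete[] array;
}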
The cover story is that delete is required because of C++'s relationship with C.
The new operator can make a dynamically allocated object of almost any object type.
But, due to the C heritage, a pointer to an object type is ambiguous between two abstractions:
being the location of a single object, and
being the base of a dynamic array.
The delete versus delete[] situation just follows from that.
However, that does not ring true, because, in spite of the above observations being true, a single delete operator could be used. It does not logically follow that two operators are required.
Here is an informal proof. The new T invocation (single-object case) could implicitly behave as if it were new T[1]. That is to say, every new could always allocate an array. When no array syntax is mentioned, it could be implicit that an array of [1] will be allocated. Then there would just have to exist a single delete which behaves like today's delete[].
Why isn't that design followed?
I think it boils down to the usual: it's a goat that was sacrificed to the gods of efficiency. When you allocate an array with new [], extra storage is allocated for meta-data to keep track of the number of elements, so that delete [] can know how many elements need to be iterated for destruction. When you allocate a single object with new, no such meta-data is required. The object can be constructed directly in the memory which comes from the underlying allocator without any extra header.
It's a part of "don't pay for what you don't use" in terms of run-time costs. If you're allocating single objects, you don't have to "pay" for any representational overhead in those objects to deal with the possibility that any dynamic object referenced by pointer might be an array. However, you are burdened with the responsibility of encoding that information in the way you allocate the object with the array new and subsequently delete it.
An example might help. When you allocate a C-style array of objects, those objects may have their own destructors that need to be called. The plain delete operator does not do that for an array; it destroys a single object (which may itself be a container), but not a C-style array. You need delete[] for those.
Here is an example:
#include <cstdlib>
#include <iostream>
#include <string>

using std::cerr;
using std::cout;
using std::endl;

class silly_string : private std::string {
public:
    silly_string(const char* const s)
        : std::string(s) {}

    ~silly_string() {
        cout.flush();
        cerr << "Deleting \"" << *this << "\"." << endl;
        // The destructor of the base class is now implicitly invoked.
    }

    friend std::ostream& operator<<(std::ostream&, const silly_string&);
};

std::ostream& operator<<(std::ostream& out, const silly_string& s)
{
    return out << static_cast<const std::string&>(s);
}

int main()
{
    constexpr std::size_t nwords = 2;
    silly_string* const words = new silly_string[nwords]{
        "hello,",
        "world!" };

    cout << words[0] << ' '
         << words[1] << '\n';

    delete[] words;   // plain delete here would destroy only one element

    return EXIT_SUCCESS;
}
That test program explicitly instruments the destructor calls. It’s obviously a contrived example. For one thing, a program does not need to free memory immediately before it terminates and releases all its resources. But it does demonstrate what happens and in what order.
Some compilers, such as clang++, are smart enough to warn you if you leave out the [] in delete[] words;, but if you force it to compile the buggy code anyway, you get undefined behaviour, typically heap corruption.
delete is an operator that destroys array and non-array (pointer-referenced) objects that were created by a new expression.
It can be used either as the delete operator or as the delete[] operator.
The new operator is used for dynamic memory allocation, which puts objects on the heap.
This means the delete operator deallocates memory from the heap.
The pointer to the object is not destroyed; the value or memory block pointed to by the pointer is destroyed.
A delete expression has type void, so it does not return a value.
I have found a problem when using some existing FORTRAN code. Although it had anticipated the need to deallocate arrays before re-allocating, this had never been necessary. I now need it to do this, and it doesn't function correctly.
The current pseudo-code is approximately:
MODULE mA
    TYPE A
        REAL, DIMENSION(:,:,:), ALLOCATABLE :: array
    END TYPE
    TYPE (A), POINTER :: sw(:)
END MODULE
Later, there is the code which allocates the size of 'array', which I'm now calling twice (hitherto only once):
...
IF (ALLOCATED(sw(1)%array)) DEALLOCATE(sw(1)%array, STAT=aviFail)
IF (aviFail.EQ.0) ALLOCATE(sw(1)%array(1,2,3), STAT=aviFail)
...
I've looked at the definition of ALLOCATE, DEALLOCATE and ALLOCATED, and I have found the following:
On the second time through, DEALLOCATE is called, but the STAT value is '1'
In case of failure (i.e. a positive STAT return), DEALLOCATE is meant to leave the original array untouched. It doesn't: it apparently clears it correctly (at least, according to the debugger).
In case of failure and no STAT being defined, DEALLOCATE is meant to terminate the program. It doesn't, but the following ALLOCATE statement fails with STAT value of '1'.
I had also inadvertently called ALLOCATE on the same array twice elsewhere, without DEALLOCATING first. According to the book, this should result in program termination. It not only works, but works correctly and the STAT return from the second ALLOCATE is '0'.
Does Intel FORTRAN handle these things differently, or is FORTRAN not as fussy about fulfilling its specification as C++?
Without seeing more of the implementation, it is difficult to give a detailed & targeted explanation, but I think it's likely to be the implementation of the pointer that is causing your problem. The "book" answers you gave on the behavior of ALLOCATE and DEALLOCATE sound correct, but you described how they behave when working directly with an allocatable array. ALLOCATE and DEALLOCATE may function differently (compiler dependent) when operating on a pointer. At the most basic level, allocating memory through a pointer requires more steps: 1) determine the type/dimension of object to be created for the pointer, 2) create and allocate an unnamed object of that type/dimension in memory, 3) associate the pointer with the new object. Depending on the implementation, compiler, and other factors these extra steps can add complexity to the observed behavior of a program.
Is there a particular reason for using a pointer in this implementation? If not, I would recommend switching to a simpler normal allocatable array to see if the problem persists.
Regarding you being able to ALLOCATE an array twice by mistake without the expected program termination: I think this is also related to your implementation using a pointer. The pointer you are re-allocating is already associated with a location in memory. It is likely that this association changes the manner in which the compiler handles the ALLOCATE statement as it is executed the second time. If the pointer is already associated with a memory position with the dimensions the ALLOCATE statement is asking for, then there is no reason to terminate the program or throw an error; the programmer is getting exactly what he or she asked for.
In closing, the ALLOCATE/DEALLOCATE statements and pointer association/nullification are handled differently by different compilers, so it's not surprising that you're observing behavior not in accordance with "the book." I would recommend taking a look at whether you really need the pointer implementation, and be sure to apply memory-management best practices as you code.
I'm going through a C++ book at the moment and I'm slightly confused about pointing to classes.
Earlier in the book the examples used classes and methods in this way:
Calculator myCalc;
myCalc.launch();
while( myCalc.run() ){
    myCalc.readInput();
    myCalc.writeOutput();
}
However, now it's changed to doing it this way:
Calculator* myCalc = new Calculator;
myCalc -> launch();
while( myCalc -> run() ){
    myCalc -> readInput();
    myCalc -> writeOutput();
}
And I can't seem to find an explanation in there as to WHY it is doing it this way.
Why would I want to point to a class in this way, rather than use the standard way of doing it?
What is the difference? And what circumstances would one or the other be preferable?
Thank you.
First, you are not pointing to the class, but to an instance of the class, also called an object. (Pointing to classes is not possible in C++, one of its flaws if you ask me.)
The difference is the place where the object is allocated. When you're doing:
Calculator myCalc;
The whole object is created on the stack. The stack is the storage for local variables, nested calls and so on, and is often limited to 1 MB or lower. On the other hand, allocations on the stack are faster, as no memory manager call is involved.
When you do:
Calculator *myCalc;
Not much happens, except that a pointer is allocated on the stack. A pointer is usually 4 or 8 bytes in size (32-bit vs. 64-bit architectures) and only holds a memory address. You have to allocate an object and make the pointer point to it by doing something like:
myCalc = new Calculator;
which can also be combined into one line as shown in your example. Here, the object is allocated on the heap, which is approximately as large as your physical memory (leaving swap space and architectural limitations aside), so you can store far more data there. But it is slower, as the memory manager needs to kick in and find a spare place on the heap for your object, or even request more memory from the operating system. Now the pointer myCalc contains the memory address of the object, so it can be used with the * and -> operators.
Also, you cannot pass pointers or references to objects on the stack outside their scope, as the stack gets cleaned up when the scope ends (at the end of a function, for example), and the object then becomes unavailable.
Oh, and I nearly forgot to mention: objects on the heap are not automatically destroyed, so you have to delete them manually like this*:
delete myCalc;
So to sum it up: For small, short living objects which are not to leave their scope, you can use stack based allocation, while for larger, long living objects the heap is usually the better place to go.
*: Well, ideally, not like that. Use a smart pointer, like std::unique_ptr.
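For illustration, a small sketch contrasting the three options (Calculator is reduced to a stub here so the snippet is self-contained):

#include <memory>

struct Calculator {
    void launch() {}
};

void stack_version() {
    Calculator c;        // lives on the stack
    c.launch();
}                        // destroyed automatically at the closing brace

void heap_version() {
    Calculator* c = new Calculator;   // lives on the heap
    c->launch();
    delete c;                         // you must free it yourself
}

void smart_version() {
    auto c = std::make_unique<Calculator>();   // heap, but freed automatically
    c->launch();
}

int main() {
    stack_version();
    heap_version();
    smart_version();
}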
You use the dot (.) when your variable is an instance or reference of the class while you use -> if your variable is a pointer to an instance of a class.
They are both part of the C++ standard, but there is a core difference. In the first way, your object lives on the stack (which is where function call frames and local variables are stored, and which is cleaned up once they are no longer used). When you instead declare your variable as a pointer type, you are only storing a pointer on the stack, and the object itself goes on the heap.
When you use a stack local variable, the memory is automatically taken care of by C++. When the object is on the heap, you have to get the memory with new and free it with delete.
While in the stack example your code uses . to call methods, to call methods through a pointer C++ provides a shortcut: ->, which is equivalent to (*obj).method().
Remember, when you use new, always use delete.
Both are standard. One is not preferred over the other.
The first one is typical of local variables that you declare and use in a narrow scope.
The pointer method allows you to dynamically allocate memory and assign it to a pointer type; that's what the "star" notation means. These can be passed out of a method or assigned to a member variable, living on after a method is exited.
But you have to be aware that you are also responsible for cleaning up that memory when you're done with the object the pointer refers to. If you don't, a long-running application may eventually exhaust its memory through a "memory leak".
Other than the obvious difference in notation/syntax, pointers are generally useful when passing data into a function.
void myFunc(Calculator *c) {
...
}
is usually preferred over
void myFunc(Calculator c) {
...
}
since the second requires a copy of the calculator to be made. A pointer only contains the location of what is being pointed to, so it refers to another spot in memory instead of containing the data itself. Another good use is for strings: imagine reading a text file and calling functions to process the text; each function would make a copy of the string if it were not a pointer. A pointer is either 4 or 8 bytes depending on the machine's architecture, so it can save a lot of time and memory when passing it to functions.
In some cases, though, it may be better to work with a copy. Maybe you just want to return an altered version, like so:
Calculator myFunc(Calculator c) {
...
}
One of the important things about pointers is the new keyword. It is not the only way to create a pointer, but it is the easiest way in C++. You can also use a function called malloc(), but that is more for structs and C, IMO, though I have seen both ways.
Speaking of C, pointers may also be good for arrays. I think you can still only declare the size of a plain array at compile time in C++ too, but I could be mistaken. You could use something like the following, I believe:
Calculator *c = new Calculator[count];   // count elements, chosen at run time
....
Calculator d = c[index];
So now you have an array, which can make things quite ambiguous, IMO.
I think that covers just about all I know and in the example provided I do not think there is any difference between the two snippets you provided.
First of all, you are not pointing to a class, you are pointing to an instance (or object) of that class. In some other languages, classes are actually objects too :-)
The example is just that, an example. Most likely you wouldn't use pointers there.
Now, what IS a pointer? A pointer is just a tiny little thing that points to the real thing. Like the nametag on a doorbell -- it shows your name, but it's not actually you. However, because it is not you, you can actually have multiple buttons with your name on it in different locations.
This is one reason for using pointers: if you have one object, but you want to keep pointers to that object in various places. I mean, the real world has tons of "pointers" to you in all sorts of places; it shouldn't be too difficult to imagine that programs might need similar things inside their data.
Pointers are also used to avoid having to copy the object around, which can be an expensive operation. Passing a pointer to functions is much cheaper. Plus, it allows functions to modify the object (note that technically, C++ "references" are pointers as well, it's just a little less obvious and they are more limited).
In addition, objects allocated with "new" will stay around until they are deallocated with "delete". Thus, they don't depend on scoping -- they don't disappear when the function around them finishes, they only disappear when they are told to get lost.
Plus, how would you make a "bag with fruit"? You allocate a "bag" object. Then you allocate a "fruit" object, and you set a pointer inside the bag object to point to the fruit object, indicating that the bag is supposed to contain that fruit. The fruit might also get a pointer to the bag object, just so code working on the fruit can also get to the bag. You can also allocate another "fruit" object, and establish a chain of pointers: each "fruit" could have a single "next" pointer that points to the "next" fruit, so you can put an arbitrary number of fruits into the bag: the bag contains a pointer to the first fruit, and each fruit contains a pointer to another fruit. So you get a whole chain of fruits.
(This is a simple "container"; there are several such classes that "contain" an arbitrary number of objects.)
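For illustration only, a minimal sketch of that fruit chain (all names invented for this example):

#include <string>

// Each fruit knows about the next one in the chain (or nullptr at the end),
// and the bag only knows about the first fruit.
struct Fruit {
    std::string name;
    Fruit* next;
};

struct Bag {
    Fruit* first;
};

int main() {
    Fruit* banana = new Fruit{"banana", nullptr};
    Fruit* apple  = new Fruit{"apple", banana};   // the apple points to the banana
    Bag bag{apple};                               // the bag points to the apple

    // Walk the chain, then clean up.
    for (Fruit* f = bag.first; f != nullptr; ) {
        Fruit* next = f->next;
        delete f;
        f = next;
    }
}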
It's actually not that simple to come up with descriptions of when or why pointers are used; usually there'll just be situations where you'll need them. It's much easier to see their usefulness when you run into such a situation. Like "why is an umbrella useful" -- once you step into the pouring rain outside, the usefulness of an umbrella will become obvious.
One use would be if the variable myCalc has a very long lifetime. You can create it when you need it with new and remove it when done with delete. Then you don't have to worry about carrying it around at times when it's not needed and it would only take up space. Or you can reinitialise it at will when needed, etc.
Or when you have a very big class, it's common practice to use new to allocate it on the heap rather than the stack. This is a leftover from the days when stack space was scarce and the heap was larger, so heap space was cheaper.
Or, of course, the most common use: allocating a dynamic array. myCalc = new Calculator[x]; creates x new calculators. You can't do this with an ordinary fixed-size array if you don't know beforehand how large x is, i.e. how many objects you're going to create.
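A small sketch of that last case, with a stubbed-out Calculator, alongside the usual std::vector alternative:

#include <cstddef>
#include <vector>

struct Calculator {};   // stub, standing in for the class from the question

void make_calculators(std::size_t x) {   // x is only known at run time
    // The classic form from above:
    Calculator* calcs = new Calculator[x];
    // ... use calcs[0] .. calcs[x - 1] ...
    delete[] calcs;                      // must use the array form of delete

    // The modern alternative, which sizes and cleans up itself:
    std::vector<Calculator> better(x);
}

int main() {
    make_calculators(7);
}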
Can I check if an object (passed by pointer or reference) is dynamically allocated?
Example:
T t;
T* pt = new T();
is_tmp(&t); // false
is_tmp(pt); // true
Context
I perfectly realise this smells like bad design, and as a matter of fact it is, but I am trying to extend code I cannot (or should not) modify (of course I blame code that isn't mine ;) ). It calls a method (which I can override) that will delete the passed object among other things that are only applicable to dynamically allocated objects. Now, I want to check whether I have something that is okay to be deleted or if it is a temporary.
I will never pass a global (or static) variable, so I leave this undefined, here.
Not portably. Under Solaris or Linux on a PC (at least 32-bit Linux), the stack is at the very top of available memory, so you can compare the address passed in to the address of a local variable: if the address passed in is higher than that of the local variable, the object it points to is either a local variable or a temporary, or a part of a local variable or temporary. This technique, however, invokes undefined behavior right and left; it just happens to work on the two platforms I mention (and will probably work on all platforms where the stack is at the top of available memory and grows down).
FWIW: you can also check for statics on these machines. All statics are at the bottom of memory, and the linker inserts a symbol end at the end of them. So declare an external data object (of any type) with this name, and compare the address with it.
With regards to possibly deleting the object, however: just knowing that the object is not on the heap (nor a static) is not enough. The object might be a member of a larger dynamically allocated object.
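For illustration only, a deliberately non-portable sketch of what that comparison might look like (undefined behavior as far as standard C++ is concerned; it merely assumes the layout described above, with the stack at the top of memory growing down):

#include <cstdint>

bool probably_on_stack(const void* p) {
    int local = 0;   // the address of a local variable marks roughly where the stack is
    return reinterpret_cast<std::uintptr_t>(p) >=
           reinterpret_cast<std::uintptr_t>(&local);
}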
In general, as DeadMG said, there's no way you can tell from a pointer where it comes from. However, as a debugging or porting or analyzing measure, you could add a member operator new to your class which tracks dynamic allocations (provided nobody uses the explicit global ::new -- that includes containers, I'm afraid). You could then build up a set<T*> of dynamically allocated memory and search in there.
That's not at all suitable for any sort of serious application, but perhaps this can help you track where things are coming from. You can even add debug messages with line numbers to your operator.
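Here is a debugging-only sketch of that idea; the class name T and the helper names are invented, and it only sees allocations that go through the member operator new (not ::new, not new[], and not objects embedded in larger ones):

#include <cstdlib>
#include <new>
#include <set>

struct T {
    static std::set<void*>& heap_objects() {
        static std::set<void*> s;   // addresses handed out by our operator new
        return s;
    }
    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc{};
        heap_objects().insert(p);
        return p;
    }
    static void operator delete(void* p) noexcept {
        heap_objects().erase(p);
        std::free(p);
    }
    int value = 0;
};

bool is_dynamically_allocated(const T* p) {
    return T::heap_objects().count(const_cast<T*>(p)) != 0;
}

int main() {
    T local;
    T* heap = new T;
    bool on_stack = is_dynamically_allocated(&local);   // false
    bool on_heap  = is_dynamically_allocated(heap);     // true
    delete heap;
    return (on_heap && !on_stack) ? 0 : 1;
}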
No, it's impossible to know. You should fix the bug. At the least, you can use a smart pointer (like shared_ptr) and give it a no-op custom deleter if you don't want the object to be deleted.
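A minimal sketch of that empty-deleter idea, assuming the surrounding code can be changed to accept a shared_ptr (existing_api is a made-up stand-in for the method that takes over the object):

#include <memory>

struct T { };

void existing_api(std::shared_ptr<T> p) { /* may keep or drop its copy */ }

int main() {
    T local;   // not dynamically allocated
    // Wrap it with a deleter that does nothing, so the shared_ptr never tries
    // to delete a stack object when the last reference goes away.
    std::shared_ptr<T> non_owning(&local, [](T*) { /* no-op */ });
    existing_api(non_owning);
}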
If you have access to the dynamic memory allocator code itself, you could scan the internal structure and see if the current pointer is in its allocated list/stack/area or however it is being stored. Quite often they are stored as linked list style structs and it wouldn't be too hard to scan for your var's address.
In my opinion it should be possible, because you can check whether the memory is on the heap or on the stack.
This is going to be highly platform-dependent code.
First you have to get the range of the heap, and then you have to check if the passed memory address is in this range...
(Sounds simple, but the first step is probably tricky :-) )