What types in C++ can be instantiated?
I know that the following each directly create a single instance of Foo:
Foo bar;
Foo *bizz = new Foo();
However, what about with built-in types? Does the following create two instances of int, or is instance the wrong word to use and memory is just being allocated?
int bar2;
int *bizz2 = new int;
What about pointers? Did the above example create an int * instance, or just allocate memory for an int *?
Would using literals like 42 or 3.14 create an instance as well?
I've seen the argument that if you cannot subclass a type, it is not a class, and if it is not a class, it cannot be instantiated. Is this true?
So long as we're talking about C++, the only authoritative source is the ISO standard. That doesn't ever use the word "instantiation" for anything but class and function templates.
It does, however, use the word "instance". For example:
An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block.
Note that in C++ parlance, an int lvalue is also an "object":
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is a region of storage.
Since new clearly creates regions of storage, anything thus created is an object, and, following the precedent of the specification, can be called an instance.
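To make that wording a bit more concrete, here is a minimal sketch. The helper name `two_int_objects` is mine and purely illustrative. It uses `std::is_object` and `std::is_class` from `<type_traits>` to check that `int` and `int*` are object types without being class types, and then creates one `int` object with automatic storage and one with `new`.

```cpp
#include <type_traits>

// int and int* are "object types" in the standard's sense...
static_assert(std::is_object<int>::value, "int is an object type");
static_assert(std::is_object<int*>::value, "int* is an object type");
// ...but they are not class types.
static_assert(!std::is_class<int>::value, "int is not a class type");

// Hypothetical helper: creates two int objects and returns their sum.
int two_int_objects() {
    int bar2 = 1;             // an int object with automatic storage duration
    int* bizz2 = new int(2);  // a pointer object (automatic) plus an int object (dynamic)
    int sum = bar2 + *bizz2;
    delete bizz2;             // ends the lifetime of the dynamically allocated int
    return sum;
}
```

Calling `two_int_objects()` from `main` exercises both kinds of storage; whether you call the results "instances" is then purely a matter of vocabulary.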
As far as I can tell, you're really just asking about terminology here. The only real distinction made by the C++ standard is POD types and non-POD types, where non-POD types have features like user-defined constructors, member functions, private variables, etc., and POD types don't. Basic types like int and float are of course PODs, as are arrays of PODs and C-structs of PODs.
Apart from (and overlapping with) C++, the concept of an "instance" in Object-Oriented Programming usually refers to allocating space for an object in memory, and then initializing it with a constructor. Whether this is done on the stack or the heap, or any other location in memory for that matter, is largely irrelevant.
However, the C++ standard seems to consider all data types "objects." For example, in 3.9 it says:
"The object representation of type T
is the sequence of N unsigned char
objects taken up by the object of type
T, where N equals sizeof(T)..."
So basically, the only distinction made by the C++ standard itself is POD versus non-POD.
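If you want to see which side of that line a type falls on, the following sketch may help. It assumes C++11 or later, where "POD" is defined as "trivial and standard-layout"; the names `PlainPoint`, `WithCtor` and `is_pod_like` are my own, not anything from the standard.

```cpp
#include <type_traits>

struct PlainPoint { int x; int y; };  // C-style struct of PODs

struct WithCtor {
    WithCtor() : x(0) {}              // user-provided constructor makes it non-trivial
    int x;
};

// Since C++11, a POD type is one that is both trivial and standard-layout.
template <class T>
constexpr bool is_pod_like() {
    return std::is_trivial<T>::value && std::is_standard_layout<T>::value;
}

static_assert(is_pod_like<int>(), "int is a POD");
static_assert(is_pod_like<int[4]>(), "an array of PODs is a POD");
static_assert(is_pod_like<PlainPoint>(), "a C-style struct of PODs is a POD");
static_assert(!is_pod_like<WithCtor>(), "a user-provided constructor makes it non-POD");
```

I used the two separate traits rather than `std::is_pod` because the latter is deprecated in C++20.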
In C++, 'instance' and 'instantiate' are associated only with classes.
Note, however, that these are also English words with an ordinary conversational meaning.
'Pointer' is certainly a class of things in the English sense, and a pointer is certainly an instance of that class,
but in C++ speak 'pointer' is not a class, and a pointer is not an instance of a class.
see also - how many angels on pinheads
The concept of an "instance" isn't something that's really intrinsic to C++ -- basically you have "things which have a constructor and things which don't".
So, all types have a size, e.g. an int is commonly 4 bytes, a struct with a couple of ints is going to be 8 and so on. Now, slap a constructor on that struct, and it starts looking (and behaving) like a class. More specifically:
int foo; // <-- 4 bytes, no constructor
struct Foo
{
int foo;
int bar;
}; // <-- 8 bytes, no constructor
struct Foo
{
Foo() : foo(0), bar(0) {}
int foo;
int bar;
}; // <-- 8 bytes, with constructor
Now, any of these types can live on the stack or on the heap. When you create something on the stack, like the "int foo;" above, it goes away when its scope ends (e.g. at the end of the function call). If you create something with "new", it goes on the heap and gets its own place to live in memory until you call delete on it. In both cases the constructor, if there is one, will be called during instantiation.
It is unusual to do "new int", but it's allowed. You can even pass 0 or 1 arguments, as if to a constructor: "new int()" value-initializes the int to 0, whereas plain "new int" leaves its value indeterminate.
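For what it's worth, here is a small sketch of the different spellings of `new` for an `int`. The function names are made up for illustration. The parenthesised form with no argument performs value-initialization, so the `int` is zero; the form with one argument initializes it to that value. The sketch never reads the result of a plain `new int`, since that value is indeterminate.

```cpp
// Hypothetical helper: `new int()` value-initializes, so the int is 0.
int value_initialized_heap_int() {
    int* p = new int();
    int v = *p;
    delete p;
    return v;
}

// Hypothetical helper: one "argument" initializes the int to that value.
int direct_initialized_heap_int() {
    int* p = new int(42);
    int v = *p;
    delete p;
    return v;
}

// Hypothetical helper: plain `new int` leaves the value indeterminate,
// so it is assigned before it is read.
int default_initialized_heap_int() {
    int* p = new int;
    *p = 7;
    int v = *p;
    delete p;
    return v;
}
```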
When you define a value on the stack, it's not usually called "allocating memory" (although it is getting memory on the stack in theory, it's possible that the value lives only in CPU registers).
Literals don't necessarily get an address in program memory; CPU instructions may encode data directly (e.g. put 42 into register B). Probably arbitrary floating point constants have an address.
Related
In C99 you can have something like
struct foo
{
int a;
int data[];
};
And then allocate with foo* f=(foo*)malloc(sizeof(foo)+n*sizeof(int)) to have a struct where the length of the array is n.
Can one do something similar in C++ when the class is a subclass with virtual functions?
Like foo being a subclass of bar, then do something like std::unique_ptr<bar> f= std::unique_ptr<foo>((foo*)malloc(sizeof(foo)+n*sizeof(int)))
I know that that code doesn't work as freeing the memory would be done with delete but allocation was done with malloc
Flexible array members are not actually part of the C++ standard, but rather a compiler extension. However, if you really want to allocate the object with malloc, you would need to use placement new to call the constructor, and manually call the destructor (which should be virtual), like f->~bar(), before calling free. Since malloc returns a pointer to storage of sufficient size for the object, this shouldn't produce undefined behaviour.
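A rough sketch of that approach follows. Everything except `bar` and `foo` is an assumption of mine: the `id()` member, the `destroyed` counter, the `malloc_deleter` type and the `make_foo` helper are only there to make the example self-contained. The deleter runs the virtual destructor and then hands the block back to `free`, so the `unique_ptr` never calls `delete`. The extra bytes are just raw storage after the object; this sketch does not touch them.

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>

struct bar {
    virtual ~bar() = default;
    virtual int id() const = 0;
};

struct foo : bar {
    explicit foo(int v) : value(v) {}
    ~foo() override { ++destroyed; }
    int id() const override { return value; }
    int value;
    static int destroyed;  // counts destructor calls, for demonstration only
};
int foo::destroyed = 0;

// Deleter: run the (virtual) destructor, then release the malloc'd block.
struct malloc_deleter {
    void operator()(bar* p) const {
        // For a polymorphic type this yields the start of the most-derived
        // object, which is the address malloc returned.
        void* block = dynamic_cast<void*>(p);
        p->~bar();
        std::free(block);
    }
};

using bar_ptr = std::unique_ptr<bar, malloc_deleter>;

// Allocates sizeof(foo) + extra bytes and constructs a foo at the front.
bar_ptr make_foo(int v, std::size_t extra) {
    void* block = std::malloc(sizeof(foo) + extra);
    if (!block) throw std::bad_alloc();
    return bar_ptr(new (block) foo(v));  // placement new calls the constructor
}
```

Usage would look like `bar_ptr f = make_foo(7, n * sizeof(int));`, with the destructor and `free` both running when `f` goes out of scope.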
No, it is not possible by standard rules. Variable-length arrays and flexible array members, such as you show in your example, are not allowed in C++ at all. There is no equivalent or alternative either.
Also, malloc cannot be used to create objects in C++ at all. Only new with the correct type given to it can create an object of that type dynamically. Everything else is not allowed and causes undefined behavior if one pretends that an object of the given type was created.
Since C++20 there are some exceptions to the rule above for certain types of objects which may be created implicitly, but still, the size of an object is fixed at compile-time by its type and can not be varied at all.
Overallocating for an object never causes the additional storage to become part of the object, and one is never allowed to access it as if it were.
I have tried to research it.. but I think everything is an object in c++....
like (int, float) are scalar objects..etc.
But when we create a class's instance, the documentation refers to "initialize an object with constructor" in C++. What does it actually mean?
I think everything is an object in c++
By the letter of the standard, an "object" is "a region of storage". You can read the nitty-gritty details under [intro.object].
But in layman's terms yes you are right.
like (int, float) are scalar objects..etc.
Absolutely. An int is an object. A float is an object.
(Of course, int and float themselves are types.)
But when we create a class's instance, the documentation refers to "initialize an object with constructor" in C++.
There's nothing wrong with that. You can initialise an int, and you can initialise a float, and you can initialise an object of class type. For the latter case, one way to do that is using a constructor. That doesn't change anything.
What does it actually mean?
Exactly what it says: performing the steps needed to give some object an initial value.
I'll caution you also that there is a lot of bad "documentation" (notably poor tutorials etc.) for C++ out there on the web, so it's also possible that you came across badly-worded or flat-out incorrect text. Notice that, even in the comments section under your question, some people got this wrong.
everything is an object in c++.... like (int, float) are scalar
objects
This is wrong. int and float are built-in types. The user can define their own types, for example:
struct A {};
Here, A is a user-defined type. It is not an object!
An object is an instance of a type:
A a;
int i;
Here, a and i are objects of type A and int. When you initialize an object of type A, it means you instantiate class A and initialize it by calling one of the class constructors.
Another example:
std::vector<int> v {1,2,3};
Here, the object v of type std::vector<int> is initialized with the values {1,2,3}, by calling the constructor of the class std::vector<int>.
Creating an object in C++ has 2 steps:
1) find some memory to contain the object.
This can be some space in the stack frame of the function for local objects or the data section for global objects. In those cases the compiler deals with that. Or operator new is used to dynamically create an object and allocates some memory. Which is the case if you write "new Foo()" anywhere.
2) "initialize an object with constructor"
This simply means the constructor is called. The address from step 1 is passed as this to the constructor, along with any arguments you specified. If you declared no constructor, then the implicitly generated default constructor is used when possible.
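The two steps can be spelled out by hand, which is roughly what a plain `new Foo()` does for you. This is only a sketch: the `Foo` definition and the function name `two_step_creation` are my own, chosen for illustration.

```cpp
#include <new>

struct Foo {
    Foo() : value(7) {}
    int value;
};

// Hypothetical helper that performs both steps explicitly.
int two_step_creation() {
    void* memory = ::operator new(sizeof(Foo));  // step 1: find some memory
    Foo* obj = new (memory) Foo();               // step 2: run the constructor there
    int seen = obj->value;
    obj->~Foo();                                 // undo step 2: run the destructor
    ::operator delete(memory);                   // undo step 1: release the memory
    return seen;
}
```

For a local or global object the compiler takes care of step 1 itself, and only step 2 shows up in your source as the declaration.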
I know that, in C++, when you write
int i;
you can not make any assumptions about the value that the variable will hold until you effectively assign it a value. However, if you write
int i = int();
then you have the guarantee that i will be 0. So my question is, isn't it actually an inconsistency in the behavior of the language? I mean, if I have defined a class MyClass and write
MyClass myInstance;
I can rest assured that the default constructor without parameters of the class will be called to initialize myInstance (and the compiler will fail if there is none), because that's how the RAII principle goes. However, it seems that when it comes to primitive types, resource acquisition is not initialization anymore. Why is that?
I don't think that changing this behavior inherited from C would break any existing code (is there any code in the world that works on the assumption that no assumption can be made about the value of a variable?), so the main possible reason that comes to my mind is performance, for example when creating big arrays of primitive types; but still, I'd like to know if there is some official explanation to this.
Thanks.
No. It is not inconsistency.
What if your class is defined as:
struct MyClass
{
int x;
float y;
char *z;
};
then this line does NOT do what you think it does:
MyClass myInstance;
Assuming the above is declared inside a function, it is same as:
int x; //assuming declared inside a function
In C++, the types are broadly divided into 3 kinds viz. POD, non-POD, Aggregates — and there is a clear distinction between them. Please read about them and their initialization rules (there are too many topics on them. Search on this site). Also read about static initialization and dynamic initialization.
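As a hedged illustration of those initialization rules: if you do want the members of such a struct zeroed, you have to ask for it, for example with value-initialization or an empty brace initializer (C++11). The helper names below are made up.

```cpp
struct MyClass {
    int x;
    float y;
    char* z;
};

// Value-initialization: with no user-provided constructor, every member is zeroed.
MyClass make_value_initialized() {
    MyClass m = MyClass();
    return m;
}

// Empty braces (C++11): aggregate initialization, every member is zeroed as well.
MyClass make_brace_initialized() {
    MyClass m{};
    return m;
}
```

By contrast, a plain `MyClass myInstance;` inside a function leaves `x`, `y` and `z` with indeterminate values, so the sketch does not read them.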
The real reason, at least initially, was that C++ wanted all
objects which are compatible with C to behave exactly as they
would in C. The reason in C was (and still is) performance;
zero initialization of objects with static lifetime was free
(because the OS must initialize all memory that it gives the
process anyway, for security reasons); zero initialization
otherwise costs runtime. (The performance rationale is less
strong today than it was originally, because compilers are a lot
better at determining that the variable will be initialized
later, and suppressing the zero-initialization in such cases;
but they do still exist; in particular, in cases like:
char buffer[1000];
strcpy( buffer, something );
If zero initialization were required, I don't know of any
compiler which would be able to skip it here, even though it
won't be necessary.)
If you write
int i;
then the initialization or not depends on the context.
Namespace scope → zero-initialized.
Local function scope → uninitialized.
Class member: depends on the constructors, if any.
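The three contexts can be sketched like this. All the names are illustrative assumptions. Note that the local variable is assigned before it is read, because reading it uninitialized would be undefined behaviour.

```cpp
int at_namespace_scope;  // static storage duration: zero-initialized

struct WithCtor {
    WithCtor() : member(5) {}  // the constructor decides the member's value
    int member;
};

// A static local also has static storage duration, so it starts at zero.
int static_local_value() {
    static int counter;
    return counter;
}

// A plain local starts out uninitialized and must be assigned before use.
int assigned_local_value() {
    int i;  // uninitialized here
    i = 3;  // first write
    return i;
}
```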
The lack of initialization for a local variable is just for efficiency. For a very simple function that is called repeatedly at the lowest levels, this can matter. And C and C++ are languages used to construct the bottom levels of things.
When you set a local variable in a function to some value, then every time the function is called, the assignment takes place and the value is loaded into the stack.
For example:
void func()
{
int i = 0; // Every time `func` is called, '0' is loaded into the stack
...
}
This is something that you might want to avoid, in particular since the C and C++ languages are also designed for real-time systems, where every operation matters.
And by the way, when you declare MyClass myInstance, you can indeed rest assured that the default constructor is called, but you can choose whether or not you want to do anything in that constructor.
So the C and C++ languages allow you to make the same choice for primitive-type variables as well.
I'm trying to understand POD types and how they are allocated and initialized on the stack.
Given
class A {
public:
A();
int x;
};
class B {
public:
int x;
};
void func()
{
A a;
B b;
}
Am I correct in saying that b is allocated after a but initialized prior to a? By that I mean
that the space is allocated for a and b in the order that they are declared, but b is initialized
when the space is allocated and a is initialized when it is declared?
I read a very good FAQ about PODs and Aggregates here
What are Aggregates and PODs and how/why are they special?
One of the things he said is:
The lifetime of objects of non-POD class type begins when the constructor has finished and ends when the destructor has finished. For POD classes, the lifetime begins when storage for the object is occupied and finishes when that storage is released or reused.
So I'm trying to understand the details of how PODs are allocated and initialized and how that is
different from non-PODs.
No. a is allocated and initialized first, and b is allocated and initialized second. C++ programs are executed statement by statement. Since the memory is automatic, there is no explicit allocation happening anyway -- it's all taken care of automatically.
(For instance, in typical call-stack implementations used on desktop operating systems, the memory is and has always been there and doesn't need to be allocated at all, just addressed.)
You have zero guarantees of any kind for the order in memory that A and B are allocated.
If A and B both had constructors, a's would be called before b's. But POD types, which you're asking about (and which B is) are not initialized at all with this syntax, so the question is moot.
The question of object initialization in relation to when the storage is allocated doesn't make much sense anyway. For example, most compilers here will allocate space for A and B in a single stack pointer move. Given that there is no way a conforming C++ program can detect such a thing (what does it even mean?), the compiler can do pretty much whatever it wants, though.
These are local variables, they are not "allocated" in the common sense, you can just consider them being there. (How is left to implementation; common way is to use a processor-supported stack. In that case all the storage for all local objects is taken on the stack at function entry).
Initialization always happens in the order of declarations. Here it means A::A() is called for a, then B::B() is called for b.
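A quick way to convince yourself of the declaration-order rule is to log from the constructors. In this sketch the `event_log` helper and the extra class `C` are my additions; `A` and `B` mirror the question, with the constructor of `A` given a body.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: a global list recording which constructors ran.
std::vector<std::string>& event_log() {
    static std::vector<std::string> entries;
    return entries;
}

class A {
public:
    A() { event_log().push_back("A::A"); }
    int x;
};

class B {  // no user-provided constructor: nothing observable runs for `B b;`
public:
    int x;
};

class C {
public:
    C() { event_log().push_back("C::C"); }
};

void func() {
    A a;      // constructor runs first
    B b;      // b.x is left uninitialized
    C c;      // constructor runs after a's
    (void)b;  // silence the unused-variable warning without reading b
}
```

After one call to `func()`, the log holds "A::A" followed by "C::C", in declaration order.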
Constructors build objects from dust.
This is a statement which I have been coming across many times recently.
While initializing a built-in datatype variable, the variable also HAS to be "built from dust". So, are there also constructors for built-in types?
Also, how does the compiler treat a BUILT-IN DATATYPE and a USER-DEFINED CLASS differently when creating instances of each?
I mean details regarding constructors, destructors, etc.
This question on Stack Overflow is about the same thing and has some pretty interesting details, the most interesting one being what Bjarne said!
Do built-in types have default constructors?
Simply put, according to the C++ standard:
12.1 Constructors [class.ctor]
2. A constructor is used to initialize objects of its class type...
so no, built-in datatypes (assuming you're talking about things like ints and floats) do not have constructors because they are not class types. Class types are specified as such:
9 Classes [class]
1. A class is a type. Its name becomes a class-name (9.1) within
its scope.
class-name:
identifier
template-id
Class-specifiers and elaborated-type-specifiers (7.1.5.3) are used
to make class-names. An object of a class consists of a (possibly
empty) sequence of members and base class objects.
class-specifier:
class-head { member-specification (opt) }
class-head:
class-key identifier (opt) base-clause (opt)
class-key nested-name-specifier identifier base-clause (opt)
class-key nested-name-specifier (opt) template-id base-clause (opt)
class-key:
class
struct
union
And since the built-in types are not declared like that, they cannot be class types.
So how are instances of built-in types created? The general process of bringing built-in and class instances into existence is called initialization, to which the C++ standard devotes a huge 8-page section (8.5) that lays it out in excruciating detail. Here are some of the rules you can find in section 8.5.
As already mentioned, built-in data types don't have constructors.
But you still can use construction-like initialization syntax, like in int i(3), or int i = int(). As far as I know that was introduced to language to better support generic programming, i.e. to be able to write
template <class T>
T f() { T t = T(); return t; }
f<int>();
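To show why that syntax matters for generic code, here is a minimal sketch. The name `make_default` is mine. The same `T()` expression yields zero for built-in types and a default-constructed object for class types.

```cpp
#include <string>

// T() value-initializes: zero for built-ins, default constructor for classes.
template <class T>
T make_default() {
    return T();
}
```

So `make_default<int>()` gives 0, `make_default<int*>()` gives a null pointer, and `make_default<std::string>()` gives an empty string, all from the same template body.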
While initializing a built-in datatype variable, the variable also HAS to be "built from dust" . So, are there also constructors for built in types?
Per request, I am rebuilding my answer from dust.
I'm not particularly fond of that "Constructors build objects from dust" phrase. It is a bit misleading.
An object, be it a primitive type, a pointer, or a instance of a big class, occupies a certain known amount of memory. That memory must somehow be set aside for the object. In some circumstances, that set-aside memory is initialized. That initialization is what constructors do. They do not set aside (or allocate) the memory needed to store the object. That step is performed before the constructor is called.
There are times when a variable does not have to be initialized. For example,
int some_function (int some_argument) {
int index;
...
}
Note that index was not assigned a value. On entry to some_function, a chunk of memory is set aside for the variable index. This memory already exists somewhere; it is just set aside, or allocated. Since the memory already exists somewhere, each bit will have some pre-existing value. Even if a variable is not initialized, it will still start out holding whatever value those bits happen to have. The initial value of the variable index might be 42, or 1404197501, or something entirely different.
Some languages provide a default initialization in case the programmer did not specify one. (C and C++ do not.) Sometimes there is nothing wrong with not initializing a variable to a known value. The very next statement might be an assignment statement, for example. The upside of providing a default initialization is that failing to initialize variables is a typical programming mistake. The downside is that this initialization has a cost, albeit typically tiny. That tiny cost can be significant when it occurs in a time-critical, multiply-nested loop. Not providing a default initial value fits the C and C++ philosophy of not providing something the programmer did not ask for.
Some variables, even non-class variables, absolutely do need to be given an initial value. For example, there is no way to assign a value to a variable that is of a reference type except in the declaration statement. The same goes for variables that are declared to be constant.
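A small sketch of those two cases, with a made-up function name. The commented-out forms are the ones the compiler rejects.

```cpp
// Hypothetical helper showing variables that must be initialized at declaration.
int reference_and_const_demo() {
    int target = 1;
    int& ref = target;     // must be bound here; `int& ref;` does not compile
    const int limit = 10;  // must be given a value here; `const int limit;` does not compile
    ref = limit;           // assigns through the reference: target becomes 10
    return target;
}
```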
Some classes have hidden data that absolutely do need to be initialized. Some classes have const or reference data members that absolutely do need to be initialized. These classes need to be initialized, or constructed. Not all classes do need to be initialized. A class or structure that doesn't have any virtual functions, doesn't have an explicitly-provided constructor or destructor, and whose member data are all primitive data types, is called plain old data, or POD. POD classes do not need to be constructed.
Bottom line:
An object, whether it is a primitive type or an instance of a very complex class, is not "built from dust". Dust is, after all, very harmful to computers. They are built from bits.
Setting aside, or allocating, memory for some object and initializing that set-aside memory are two different things.
The memory needed to store an object is allocated, not created. The memory already exists. Because that memory already exists, the bits that comprise the object will have some pre-existing values. You should of course never rely on those preexisting values, but they are there.
The reason for initializing variables, or data members, is to give them a reliable, known value. Sometimes that initialization is just a waste of CPU time. If you didn't ask the compiler to provide such a value, C and C++ assume the omission is intentional.
The constructor for some object does not allocate the memory needed to store the object itself. That step has already been done by the time the constructor is called. What a constructor does do is to initialize that already allocated memory.
The initial response:
A variable of a primitive type does not have to be "built from dust". The memory to store the variable needs to be allocated, but the variable can be left uninitialized. A constructor does not build the object from dust. A constructor does not allocate the memory needed to store the to-be constructed object. That memory has already been allocated by the time the constructor is called. (A constructor might initialize some pointer data member to memory allocated by the constructor, but the bits occupied by that pointer must already exist.)
Some objects such as primitive types and POD classes do not necessarily need to be initialized. Declare a non-static primitive type variable without an initial value and that variable will be uninitialized. The same goes for POD classes. Suppose you know you are going to assign a value to some variable before the value of the variable is accessed. Do you need to provide an initial value? No.
Some languages do give an initial value to every variable. C and C++ do not. If you didn't ask for an initial value, C and C++ are not going to force an initial value on the variable. That initialization has a cost, typically tiny, but it exists.
Built-in data types (fundamental types, arrays, references, pointers, and enums) do not have constructors.
A constructor is a member function. A member function can only be defined for a class type.
C++03 9.3/1:
"Functions declared in the definition of a class, excluding those declared with a friend specifier, are called member functions of that class".
Using a POD type with certain syntax (shown below) might often give the impression that it is constructed using a constructor or copy constructor, but it is just initialization without either of the two.
int x(5);
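A few more spellings in the same spirit, as a sketch with a made-up function name. The `static_assert` checks, via `std::is_class`, that `int` is not a class type, so none of these forms can be calling a constructor.

```cpp
#include <type_traits>

static_assert(!std::is_class<int>::value, "int is not a class type");

// Hypothetical helper: constructor-like syntax that is plain initialization.
int sum_of_initialized_ints() {
    int a(5);        // direct-initialization, looks like a constructor call
    int b = int();   // value-initialization: b is 0
    int c{};         // C++11 brace form of value-initialization: c is 0
    int d = int(7);  // functional-style cast, looks like a temporary being copied
    return a + b + c + d;
}
```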