C++: What is the difference between definition and instantiation?

What is the difference between definition and instantiation?
Sub-question: Are "variable definition" and "variable instantiation" the same?
int x;
The above code can be referred to as both a variable definition and a variable instantiation, right? If so, are these two terms synonyms, or is there a different relation between them?

After quite some edits and also a correction made by Johannes Schaub:
A definition of a variable of a certain type creates a variable of that type. As far as Stroustrup is concerned, this also holds for the definition of objects of a certain class, since a class is nothing more than a (non-native) type. (This makes much sense, although it isn't general OO terminology.)
General object-oriented terminology: Instantiation of a class creates an object of that class. Specialized C++ terminology: Instantiation of a template creates a "perfectly ordinary" (Stroustrup) class.
A class is a type, defined in code, rather than as part of the language.
An implementation is a concrete class that realizes the functionality specified in an abstract class from which it derives.
A declaration is a specification of a variable or function, without allocating memory for it or generating code for it.
So @Riko, the definition on learncpp.com specifying that a definition implements an identifier is not very accurate. Interfaces can be implemented, not types or classes. But one part of the definition is valuable: Definition in general goes hand in hand with memory allocation. You can declare a function or variable as often as you want (e.g. a declaration in a header file), but you can define it only once. If you declare a function, you give the signature (the name, return type and params, but not the body).
If you declare a variable, you used to put the word extern in front of it in a header file, but that isn't done often anymore, since object orientation and classes took over. Defining a variable in a header file, on the other hand, may lead to multiple instances of the variable, since the same header is read during compilation of distinct source files. Since C++ uses independent compilation and just textually includes the header, the variable is defined in multiple files, so there are several variables under the same name. Linkers don't like such ambiguity and will complain.
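A minimal sketch of the declare-often, define-once rule described above, collapsed into a single file so that it compiles on its own. The names request_count and record_request are made up for illustration; in a real project the declarations would sit in a header and the definitions in exactly one source file.

```cpp
#include <cassert>

// What a header would contain: declarations only, no storage is reserved.
extern int request_count;
extern int request_count;   // repeating a declaration is harmless
int record_request();       // function declaration: signature, no body

// What exactly one source file would contain: the definitions.
int request_count = 0;      // storage for the variable is set aside here

int record_request() {      // the function body is provided here
    request_count += 1;
    return request_count;
}

// Every declaration above refers to the single definition.
int demo_declare_and_define() {
    request_count = 0;
    record_request();
    record_request();
    assert(request_count == 2);
    return request_count;
}
```

Adding a second `int request_count = 0;` to another source file of the same program would be the situation the linker complains about.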
While the term instantiation in general means "creating an object of a class", Stroustrup (maker of C++) uses it in a special sense: A class is an instance of a template with all its parameters resolved. Nevertheless in many texts on C++ the word instantiation is used in the general Object Oriented sense, which is confusing.
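To make the two senses of "instantiation" concrete, here is a small sketch. Box is a hypothetical template, not something from Stroustrup's text: instantiating the template yields the ordinary class Box<int>, and defining a variable of that class creates an object.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
struct Box {            // a class template: a pattern, not yet a class
    T value;
};

// Explicit instantiation: the compiler generates the ordinary class Box<int>.
template struct Box<int>;

// Each set of template arguments names its own, distinct class.
static_assert(!std::is_same<Box<int>, Box<double>>::value,
              "Box<int> and Box<double> are different classes");

int demo_box() {
    Box<int> b{41};     // variable definition: creates an object of class Box<int>
    b.value += 1;
    assert(b.value == 42);
    return b.value;
}
```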
@Johannes Schaub: Although I am not too happy with C++ terminology deviating from general OO terminology, I think it's right to follow Stroustrup here, since after all, he created the language.

There are:
1) variable definitions,
2) variable/object instantiations, and
3) template instantiations.
1 & 3 are specific C++ terminology. 2 is more general terminology that might be used with C++. It is not an "officially" defined term for C++.
I understand that your question is about 1 and 2, but not 3. 3 is different than 2, though related in meaning. I won't address 3 further as I don't believe it is part of your question.
Instantiation is the creation of an object instance. It is more usual to use the term in reference to a class object than something like an int or double.
A C++ variable definition does cause an object of the type being defined to be instantiated. It is, however, possible in C++ to instantiate an object other than via a variable definition.
Example 1:
std::string name;
The variable name, a std::string, is defined and (at run-time) instantiated.
Example 2:
std::string *namePointer;
The variable namePointer, a pointer, is defined and might be said (at run-time) to be instantiated (though not initialized). There is no std::string variable and no std::string is instantiated.
//simple example, not what one should usually write in real code
namePointer = new std::string("Some Text");
No additional variable is defined. A std::string object is instantiated (at run-time) and the separate and pre-existing namePointer variable also has its value set.

Definition and Declaration are compile time concerns.
Declaration and definition of identifiers happens while your program is being compiled.
Declaration: A declaration is telling the compiler about the type of an identifier that is defined somewhere else but may be referenced here.
Definition: There can be only one definition of an identifier. This is where the thing is actually defined. All the declarations refer to this definition.
Mostly this is only a distinction we make with classes because built in types are already defined (the compiler already knows what an int is). The only exception I can think of is when we declare a variable to be extern.
Instantiation, this happens at run time.
An object is an instance of a class.
Instantiation is the act of creating a new object.
Instantiation of an object happens while your program is being run. Instantiation is when a new instance of the class is created (an object).
In C++, when a class is instantiated, memory is allocated for the object and the class's constructor is run. In C++ we can instantiate objects in two ways: on the stack as a variable declaration, or on the heap with the new keyword. So for the class A, both of the following create an instance of the class (instantiate it):
struct A {
int a;
};
A inst1;
A* inst2 = new A();
inst1 is a local variable that refers to an instance of the class A that was just created on the stack.
inst2 is a local variable that holds a pointer to an instance of class A that was just created on the heap.
There may be some confusion because the first is not possible in Java or C#, only the second. In C++ both instantiate (create a new runtime instance of) the class A. The only difference between the two is scope and where the memory was allocated.
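As a rough illustration that both forms run the constructor, the sketch below counts constructor calls. Tracked and its constructed counter are hypothetical names used only for this example.

```cpp
#include <cassert>

struct Tracked {
    static int constructed;               // how many times the constructor has run
    int id;
    Tracked() : id(++constructed) {}
};
int Tracked::constructed = 0;

int demo_instantiation() {
    int before = Tracked::constructed;
    Tracked on_stack;                     // instance with automatic storage
    Tracked* on_heap = new Tracked();     // instance with dynamic storage
    int created = Tracked::constructed - before;
    assert(on_heap->id == on_stack.id + 1);
    delete on_heap;                       // heap instances must be released by hand
    return created;                       // both definitions ran the constructor
}
```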

Related

Why do C++ static data members need to be defined but non-static data members do not?

I am trying to understand the difference between the declaration and definition of static and non-static data members. Apologies if I have fundamentally misunderstood the concepts. Your explanations are highly appreciated.
Code I am trying to understand:
class A
{
public:
    int ns; // declare non-static data member.
    static int s; // declare static data member.
    void foo();
};
int A::s; // define static data member.
// int A::ns; //This gives an error if defined.
void A::foo()
{
    ns = 10;
    s = 5; // if s is not defined this gives an error 'undefined reference'
}
When you declare something, you're telling the compiler that the name being declared exists and what kind of name it is (type, variable, function, etc.) The definition could be with the declaration (as with your class A) or be elsewhere—the compiler and linker will have to connect the two later.
The key point of a variable or function definition is that it tells the compiler and linker where this variable/function will live. If you have a variable, there needs to be a place in memory for it. If you have a function, there needs to be a place in the binary containing the function's instructions.
For non-static data members, the declaration is also the definition. That is, you're giving them a place to live¹. This place is within each instance of the class. Every time you make a new A object, it comes with an ns as part of it.
Static data members, on the other hand, have no associated object. Without a definition, you've got a situation where you have N instances of A all sharing the same s, but nowhere to put s. Therefore, C++ makes you choose one translation unit for it via a definition, most often the source file that accompanies that header.
You could argue that the compiler should just pick one instance for it, but this won't work for various reasons, one being that you can use static data members before ever creating an instance, after the last instance is gone, or without having instances at all.
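A short sketch of that last point, using a hypothetical Session class: the static member is readable before any instance exists and after the last one has been destroyed, which is why its storage cannot live inside any particular object.

```cpp
#include <cassert>

struct Session {
    static int open_count;      // declaration only: no storage yet
    Session()  { ++open_count; }
    ~Session() { --open_count; }
};

int Session::open_count = 0;    // the one definition: this is where it lives

int demo_static_member() {
    assert(Session::open_count == 0);   // usable before any Session exists
    int peak = 0;
    {
        Session a, b;
        peak = Session::open_count;     // shared by both instances
    }
    assert(Session::open_count == 0);   // still there after the last Session is gone
    return peak;
}
```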
Now you might wonder why the compiler and linker still can't just figure it out on their own, and... that's actually pretty much what happens if you slap an inline on the variable or function. You can end up with multiple definitions, but only one will be chosen.
1: Giving them a place to live is a little beside the point here. All the compiler needs to know when it creates an object of that class is how much space to give it and which parts of that space are which data members. You could think of it as the compiler doing the definition part for you since there's only one place that data member could possibly live.
static members are essentially global variables with a special name and access rules tied to the class. Hence, they inherit all the problems for usual global variables. Namely, in the whole C++ program (which is the union of all translation units aka .cpp files) there should be exactly one definition of each global variable, no more.
You can think of "variable definition" as "the place which will allocate memory for the variable".
However, classes are typically defined in a header file (.h/.hpp/etc) which is included in multiple translation units. So it's up to the programmer to specify which translation unit actually defines the variable. Note that since C++17 we have the inline keyword which places this burden on a compiler, look for "inline variables". The naming is weird for historical reasons.
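For comparison, a minimal sketch of the C++17 inline variable form mentioned above. It assumes a compiler in C++17 mode or later, and Settings is a hypothetical class. The in-class line is both declaration and definition, so no separate definition in a source file is needed even if the header is included in many translation units.

```cpp
#include <cassert>

struct Settings {
    inline static int verbosity = 1;  // C++17: declared and defined in one place
    int local_level = 0;              // non-static: lives inside each object
};

int demo_inline_member() {
    Settings::verbosity = 3;          // no out-of-class definition was written
    Settings s;
    s.local_level = Settings::verbosity + 1;
    assert(s.local_level == 4);
    return s.local_level;
}
```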
However, non-static members do not really exist until you create an instance of the class, i.e. an object. And it's the object lifetime and storage duration which define how each individual member is created/stored/destroyed. So there is no need to actually define them anywhere outside of the class.
Static variables belong to the class definition. Non-static variables belong to the instances created with the class definition.
int main()
{
    A::s = 5; // this is ok
    A a;
    a.ns = 5; // this is also ok
}

Confusion about the difference between declarations and definitions in C++

I am confused. In part 3.8 of Bjarne Stroustrup's book 'Programming Principles and Practice Using C++' he talks about types of objects. I cite the following list:
A type defines a set of possible values and a set of operations (for an object).
An object is some memory that holds a value of a given type.
A value is a set of bits in memory interpreted according to a type.
A variable is a named object.
A declaration is a statement that gives a name to an object.
A definition is a declaration that sets aside memory for an object.
From his explanation of definition I understand that no memory is set aside for an object during declaration. However, the fact that Bjarne mentions that declaration involves the naming of an object, suggests that memory is actually set aside, as objects are explained as being
some memory that holds a value of a given type.
Can someone clarify this?
One of the complexities of C++ is that compilation is done in "translation units" (without seeing the whole program). Each translation unit contains declarations of some parts defined in other translation units and definitions of some other parts. The declaration provides enough information to generate code that uses the declared part once the address has been resolved by the linker.
Only one definition for an object or non-inline function is allowed in a program but there can be multiple declarations.
Things are indeed even more complex than this because of templates and some magic that C++ can do at link time (e.g. static variables in inline functions).
A declaration says "there is an object/function like this somewhere", a definition says "make an object/function like this".
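A small sketch of that distinction for functions, with a hypothetical area function: the call compiles with only the declaration in sight, and the definition that "makes" the function can come later (or, in a real program, in another translation unit).

```cpp
#include <cassert>

int area(int width, int height);    // declaration: "there is a function like this somewhere"
int area(int width, int height);    // declaring it again is allowed

int demo_area() {
    int result = area(3, 4);        // compiles with only the declaration visible
    assert(result == 12);
    return result;
}

int area(int width, int height) {   // definition: "make a function like this"
    return width * height;
}
```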

Changing struct to class (and other type changes) and ABI/code generation

It is well-established and a canonical reference question that in C++ structs and classes are pretty much interchangeable, when writing code by hand.
However, if I want to link to existing code, can I expect it to make any difference (i.e. break, nasal demons etc.) if I redeclare a struct as a class, or vice versa, in a header after the original code has been generated?
So the situation is the type was compiled as a struct (or a class), and I'm then changing the header file to the other declaration before including it in my project.
The real-world use case is that I'm auto-generating code with SWIG, which generates different output depending on whether it's given structs or classes; I need to change one to the other to get it to output the right interface.
The example is here (Irrlicht, SVertexManipulator.h) - given:
struct IVertexManipulator
{
};
I am redeclaring it mechanically as:
/*struct*/class IVertexManipulator
{public:
};
The original library compiles with the original headers, untouched. The wrapper code is generated using the modified forms, and compiled using them. The two are then linked into the same program to work together. Assume I'm using the exact same compiler for both libraries.
Is this sort of thing undefined? "Undefined", but expected to work on real-world compilers? Perfectly allowable?
Other similar changes I'm making include removing some default values from parameters (to prevent ambiguity), and removing field declarations from a couple of classes where the type is not visible to SWIG (which changes the structure of the class, but my reasoning is that the generated code shouldn't need that information, only to link to member functions). Again, how much havoc could this cause?
e.g. IGPUProgrammingServices.h:
s32 addHighLevelShaderMaterial(
const c8* vertexShaderProgram,
const c8* vertexShaderEntryPointName/*="main"*/,
E_VERTEX_SHADER_TYPE vsCompileTarget/*=EVST_VS_1_1*/,
const c8* pixelShaderProgram=0,
...
CIndexBuffer.h:
public:
//IIndexList *Indices;
...and so on like that. Other changes include replacing some template parameter types with their typedefs and removing the packed attribute from some structs. Again, it seems like there should be no problem if the altered struct declarations are never actually used in machine code (just to generate names to link to accessor functions in the main library), but is this reliably the case? Ever the case?
This is technically undefined behavior.
3.2/5:
There can be more than one definition of a class type, [... or other things that should be defined in header files ...] in a program provided that each definition appears in a different translation unit, and provided the definitions satisfy the following requirements. Given such an entity named D defined in more than one translation unit, then
each definition of D shall consist of the same sequence of tokens; and
...
... If the definitions of D satisfy all these requirements, then the program shall behave as if there were a single definition of D. If the definitions of D do not satisfy these requirements, then the behavior is undefined.
Essentially, you are changing the first token from struct to class, and inserting tokens public and : as appropriate. The Standard doesn't allow that.
But in all compilers I'm familiar with, this will be fine in practice.
Other similar changes I'm making include removing some default values from parameters (to prevent ambiguity)
This actually is formally allowed, if the declaration doesn't happen to be within a class definition. Different translation units and even different scopes within a TU can define different default function arguments. So you're probably fine there too.
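A sketch of what "different scopes can define different default arguments" looks like inside one translation unit. The function names are hypothetical, and this applies to a non-member function, as the answer notes.

```cpp
#include <cassert>

int scale(int value, int factor);       // namespace scope: no default argument

int twice(int v) {
    int scale(int value, int factor = 2);   // block-scope redeclaration with its own default
    return scale(v);                        // calls scale(v, 2)
}

int thrice(int v) {
    int scale(int value, int factor = 3);   // a different default in a different scope
    return scale(v);                        // calls scale(v, 3)
}

// One definition serves every declaration above.
int scale(int value, int factor) {
    return value * factor;
}
```

Default arguments are substituted at the call site, so they never become part of the function's compiled definition.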
Other changes include replacing some template parameter types with their typedefs
Also formally allowed outside of a class definition: two declarations of a function that use different ways of naming the same type refer to the same function.
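For illustration, a sketch with a hypothetical alias named after the s32 in the question (its real definition in Irrlicht is not shown here, so `int` is an assumption): a declaration spelled with the typedef and a definition spelled with the underlying type name the same function.

```cpp
#include <cassert>
#include <type_traits>

typedef int s32;    // assumed alias, for illustration only

static_assert(std::is_same<s32, int>::value, "a typedef is another name for the same type");

s32 halve(s32 value);       // declared using the typedef

int halve(int value) {      // defined using the underlying type: the same function
    return value / 2;
}

int demo_halve() {
    int result = halve(10);
    assert(result == 5);
    return result;
}
```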
... removing field declarations ... and removing the packed attribute from some structs
Now you're in severe danger territory, though. I'm not familiar with SWIG, but if you do this sort of thing, you'd better be darn sure the code using these "wrong" definitions never:
create or destroy an object of the class type
define a type that inherits or contains a member of the class type
use a non-static data member of the class
call an inline or template function that uses a non-static data member of the class
call a virtual member function of the class type or a derived type
try to find sizeof or alignof the class type
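A rough sketch of why the items in that list are dangerous. The two structs below are hypothetical and carry different names so that they fit in one file; in the scenario from the question they would be two definitions of the same class seen by different translation units.

```cpp
#include <cassert>
#include <cstddef>

// Layout the library was compiled against (hypothetical).
struct FullIndexBuffer {
    void* indices;
    int   count;
};

// Layout the wrapper sees after a field was removed from the header (hypothetical).
struct TrimmedIndexBuffer {
    int count;
};

// True when the two views disagree about where 'count' lives or how big the object is.
bool layouts_disagree() {
    return sizeof(FullIndexBuffer) != sizeof(TrimmedIndexBuffer)
        || offsetof(FullIndexBuffer, count) != offsetof(TrimmedIndexBuffer, count);
}
```

Code compiled against the trimmed layout would read and write count at the wrong offset and allocate too little memory, which is why creating objects, touching data members, or taking sizeof is unsafe in that situation.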

Why doesn't C++ need forward declarations for class members?

I was under the impression that everything in C++ must be declared before being used.
In fact, I remember reading that this is the reason why the use of auto in return types is not valid C++0x without something like decltype: the compiler must know the declared type before evaluating the function body.
Imagine my surprise when I noticed (after a long time) that the following code is in fact perfectly legal:
[Edit: Changed example.]
class Foo
{
Foo(int x = y);
static const int y = 5;
};
So now I don't understand:
Why doesn't the compiler require a forward declaration inside classes, when it requires them in other places?
The standard says (section 3.3.7):
The potential scope of a name declared in a class consists not only of the declarative region following the name’s point of declaration, but also of all function bodies, brace-or-equal-initializers of non-static data members, and default arguments in that class (including such things in nested classes).
This is probably accomplished by delaying processing bodies of inline member functions until after parsing the entire class definition.
Function definitions within the class body are treated as if they were actually defined after the class has been defined. So your code is equivalent to:
class Foo
{
Foo();
int x, *p;
};
inline Foo::Foo() { p = &x; }
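A runnable sketch along the same lines, with a hypothetical Sample class: both the default argument and the member function body refer to members that are declared further down in the class.

```cpp
#include <cassert>

struct Sample {
    // Default argument uses 'fallback', which is declared later in the class.
    explicit Sample(int x = fallback) : value(x) {}

    // Function body uses 'value', which is also declared later.
    int doubled() const { return value * 2; }

    static const int fallback = 5;
    int value;
};

int demo_sample() {
    Sample s;                   // uses the default argument
    assert(s.value == 5);
    return s.doubled();
}
```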
Actually, I think you need to reverse the question to understand it.
Why does C++ require forward declaration?
Because of the way C++ works (include files, not modules), it would otherwise need to wait for the whole Translation Unit before being able to assess, for sure, what the functions are. There are several downsides here:
compilation time would take yet another hit
it would be nigh impossible to provide any guarantee for code in headers, since any introduction of a later function could invalidate it all
Why is a class different?
A class is by definition contained. It's a small unit (or should be...). Therefore:
there is little compilation time issue, you can wait until the class end to start analyzing
there is no risk of dependency hell, since all dependencies are clearly identified and isolated
Therefore we can eschew this annoying forward-declaration rule for classes.
Just guessing: the compiler saves the body of the function and doesn't actually process it until the class declaration is complete.
Unlike a namespace, a class's scope cannot be reopened. It is bound.
Imagine implementing a class in a header if everything needed to be declared in advance. I presume that since it is bound, it was more logical to write the language as it is, rather than requiring the user to write forward declarations in the class (or requiring definitions separate from declarations).

Why is the 'Declare before use' rule not required inside a class? [duplicate]

I'm wondering why the declare-before-use rule of C++ doesn't hold inside a class.
Look at this example:
#ifdef BASE
struct Base {
#endif
struct B;
struct A {
B *b;
A(){ b->foo(); }
};
struct B {
void foo() {}
};
#ifdef BASE
};
#endif
int main( ) { return 0; }
If BASE is defined, the code is valid.
Within A's constructor I can use B::foo, which hasn't been declared yet.
Why does this work and, more importantly, why does it only work inside a class?
Well, to be pedantic there's no "declare before use rule" in C++. There are rules of name lookup, which are pretty complicated, but which can be (and often are) roughly simplified into the generic "declare before use rule" with a number of exceptions. (In a way, the situation is similar to "operator precedence and associativity" rules. While the language specification has no such concepts, we often use them in practice, even though they are not entirely accurate.)
This is actually one of those exceptions. Member function definitions in C++ are specifically and intentionally excluded from that "declare before use rule" in a sense that name lookup from the bodies of these members is performed as if they are defined after the class definition.
The language specification states that in 3.4.1/8 (and footnote 30), although it uses a different wording. It says that during the name lookup from the member function definition, the entire class definition is inspected, not just the portion above the member function definition. Footnote 30 additionally states though that the lookup rules are the same for functions defined inside the class definition or outside the class definition (which is pretty much what I said above).
Your example is a bit non-trivial. It raises the immediate question about member function definitions in nested classes: should they be interpreted as if they are defined after the definition of the most enclosing class? The answer is yes. 3.4.1/8 covers this situation as well.
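A compact version of the nested case, as a sketch with hypothetical names: the body of A::call is looked up as if it were written after the end of Outer, so B is complete and B::foo is visible there.

```cpp
#include <cassert>

struct Outer {
    struct B;                               // forward declaration: B is incomplete here
    struct A {
        B* b;
        int call() { return b->foo(); }     // B::foo has not been declared yet in the text
    };
    struct B {
        int foo() { return 42; }
    };
};

int demo_nested() {
    Outer::B target;
    Outer::A caller;
    caller.b = &target;
    int result = caller.call();
    assert(result == 42);
    return result;
}
```

Moving A and B out of Outer to namespace scope, in the same order, would make the body of A::call ill-formed, because B would still be incomplete at that point.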
"Design & Evolution of C++" book describes the reasoning behind these decisions.
That's because member functions are compiled only after the whole class definition has been parsed by the compiler, even when the function definition is written inline, whereas regular functions are compiled immediately after being read. The C++ standard requires this behaviour.
I don't know the chapter and verse of the standard on this.
But if you would apply the "declare before use" rule strictly within a class, you would not be able to declare member variables at the bottom of the class declaration either. You would have to declare them first, in order to use them e.g. in a constructor initialization list.
I could imagine the "declare before use" rule has been relaxed a bit within the class declaration to allow for "cleaner" overall layout.
Just guesswork, as I said.
The most stubborn problems in the definition of C++ relate to name lookup: exactly which uses of a name refer to which declarations? Here, I'll describe just one kind of lookup problem: the ones that relate to order dependencies between class member declarations. [...]
Difficulties arise because of conflicts between goals:
We want to be able to do syntax analysis reading the source text once only.
Reordering the members of a class should not change the meaning of the class.
A member function body explicitly written inline should mean the same thing when written out of line.
Names from an outer scope should be usable from an inner scope (in the same way as they are in C).
The rules for name lookup should be independent of what a name refers to.
If all of these hold, the language will be reasonably fast to parse, and users won't have to worry about these rules because the compiler will catch the ambiguous and near ambiguous cases. The current rules come very close to this ideal.
[The Design And Evolution Of C++, section 6.3.1 called Lookup Issues on page 138]