Confusion about the difference between declarations and definitions in C++ - c++

I am confused. In part 3.8 of Bjarne Stroustrup's book 'Programming Principles and Practice Using C++' he talks about types of objects. I cite the following list:
A type defines a set of possible values and a set of operations (for an object).
An object is some memory that holds a value of a given type.
A value is a set of bits in memory interpreted according to a type.
A variable is a named object.
A declaration is a statement that gives a name to an object.
A definition is a declaration that sets aside memory for an object.
From his explanation of definition I understand that no memory is set aside for an object during declaration. However, the fact that Bjarne mentions that declaration involves the naming of an object, suggests that memory is actually set aside, as objects are explained as being
some memory that holds a value of a given type.
Can someone clarify this?

One of the complexities of C++ is that compilation is done in "translation units" (without seeing the whole program). Each translation unit contains declarations of some parts defined in other translation units and definitions of some other parts. A declaration provides enough information to generate code that uses the declared part; the linker resolves the actual address later.
Only one definition for an object or non-inline function is allowed in a program but there can be multiple declarations.
Things are indeed even more complex than this because of templates and some magic that C++ can do at link time (e.g. static variables in inline functions).
A declaration says "there is an object/function like this somewhere"; a definition says "make an object/function like this".
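To make the distinction concrete, here is a minimal single-file sketch. The names counter, read_counter and counter_plus_one are made up for illustration; in a real program the declarations would typically live in a header and the definitions in exactly one source file.

```cpp
#include <cassert>

// Declarations: introduce a name and its type; no storage, no code.
extern int counter;      // variable declaration (not a definition, because of extern)
extern int counter;      // repeating a declaration is allowed
int read_counter();      // function declaration (no body)

// This function compiles against the declarations alone.
int counter_plus_one() {
    return read_counter() + 1;
}

// Definitions: exactly one of each in the whole program.
int counter = 10;        // storage for counter is set aside here
int read_counter() {     // code for read_counter is generated here
    return counter;
}

// Small self-check, callable from main().
void check_declarations_example() {
    assert(counter_plus_one() == 11);
}
```

Adding a second `int counter = 10;` to the same file would be rejected as a redefinition, while the repeated extern declaration is accepted.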

Related

Can you have two classes with the same name and the same member function in different translation units?

Suppose I have two translation units:
//A.cpp
class X
{
};
//B.cpp
class X
{
    int i;
};
Is the above program well-formed?
If not, no further questions. If the answer is yes, the program is well-formed (ignore the absence of main), then the second question. What if there is a function with the same name in those?
//A.cpp
class X
{
    void f(){}
};
//B.cpp
class X
{
    int i;
    void f(){}
};
Would this be a problem for the linker as it would see &X::f in both object files? Are anonymous namespaces a must in such a situation?
Is the above program well-formed?
No. It violates the One-Definition Rule:
[basic.def.odr]
There can be more than one definition of a
class type ([class]),
...
in a program provided that each definition appears in a different translation unit and the definitions satisfy the following requirements.
Given such an entity D defined in more than one translation unit, for all definitions of D, or, if D is an unnamed enumeration, for all definitions of D that are reachable at any given program point, the following requirements shall be satisfied.
...
Each such definition shall consist of the same sequence of tokens, where the definition of a closure type is ...
...
Are anonymous namespaces a must in such a situation?
If you need different class definitions, they must be separate types. A uniquely named namespace is one option, and an anonymous namespace is a guaranteed way to get a unique (to the translation unit) namespace.
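A rough single-file sketch of that advice follows. The namespace names a_impl and b_impl and the class Helper are hypothetical stand-ins; in the real situation each namespace would sit in its own .cpp file.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for A.cpp: the namespace name a_impl is hypothetical.
namespace a_impl {
    class X {};
}

// Stand-in for B.cpp: a different X with a different layout.
namespace b_impl {
    class X {
        int i = 0;
    public:
        int get() const { return i; }
    };
}

// Both definitions may coexist because they name different types.
static_assert(!std::is_same<a_impl::X, b_impl::X>::value,
              "a_impl::X and b_impl::X are unrelated types");

// An unnamed namespace gives the same separation without inventing a name:
// its members have internal linkage, so they are private to this translation unit.
namespace {
    class Helper {
    public:
        int value() const { return 42; }
    };
}

int use_helper() {
    Helper h;
    return h.value();
}

void check_namespaces_example() {
    assert(use_helper() == 42);
    assert(b_impl::X().get() == 0);
}
```

With the unnamed namespace, another translation unit can define its own, entirely different Helper without violating the One-Definition Rule.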
Short version
Well, no... C++ is based on the assumption that every name in a namespace is unique. If you break that assumption, you have zero guarantee that it will work.
For example, if you have methods with the same name in two translation units (*.o files), the linker won't know which one to use for a given call, so it will just report an error.
Long version
... but actually yes!
There are actually quite a few situations in which you could get away with classes/methods with the same name.
Do not actually use any of these tricks in your programs! Compilers are free to do pretty much anything if they think it will optimize the resulting program, so any of the assumptions below may break.
Classes are the easiest. Let's take some class with only non-static data members and no functions. Such a thing doesn't even leave any trace in the compiled program. Classes/structs are only tools for a programmer to organize data, so there is no need to deal with memory pools and offsets by hand.
So basically, if you have two classes with the same name in different compilation units, it should work. After the compiler is done with them, they will consist of just a few instructions saying how far to move a pointer in memory to access a specific field.
There is hardly anything here that would confuse the linker.
Functions and variables (including static class variables) are trickier because the compiler often creates symbols for them in the *.o file. If you're lucky, the linker may ignore them if such a function/variable is not used, but I wouldn't count on that.
There are ways, though, to avoid creating symbols for them. Static global elements or ones in anonymous namespaces are not visible outside their translation units, so the linker shouldn't complain about them. Also, functions that actually get inlined don't exist as separate entities, so they don't have symbols either, which is especially relevant here because functions defined inside classes are implicitly inline.
If there is no symbol, the linker won't see a conflict and everything should compile and link.
Templates also use some dirty tricks, because they are compiled on demand in each compilation unit that uses them but end up as a single copy in the final program. I don't think this is the same case as multiple different things with the same name, so let's drop the topic.
In conclusion, if your classes don't have static members and they do not define functions outside of their bodies, it may be possible to have two classes with the same name as long as you don't include them in the same file.
This is extremely fragile, though. Even if it works right now, a new version of the compiler may have some fix/optimization/change that would break such a program.
Let alone the fact that includes tend to be pretty interwoven in bigger projects, so there is a decent chance that at some point you will need to include both files in the same place.

Why is the global definition "const Date default_date(1970,1,1);" bad?

When reading the book Programming: Principles and Practice Using C++, 2nd Edition, I came across the following statement:
...what do you do if you really need a global variable (or constant)
with a complicated initializer? A plausible example would be that we
wanted a default value for a Date type we were providing for a library
supporting business transactions:
const Date default_date(1970,1,1); // the default date is January 1, 1970
How would we know that default_date was never used before it was
initialized? Basically, we can’t know, so we shouldn’t write that
definition...
What got me curious about this line of code is the implied idea of using a global variable before its definition. What exactly did the author (Bjarne Stroustrup) mean by using a global variable before its initialization? Of course, one could have declared the variable somewhere else. But that scenario is not mentioned.
If there's another object declared in global scope, somewhere else, with a complex constructor, you have no practical means to specify the relative initialization order of these two objects in a portable manner. You can't expect, for either object, that the other object has been constructed, before it is referenced.
There's nothing inherently wrong with declaring global singleton objects, where they make sense, as long as it is fully understood that the relative initialization order of global objects in different translation units is not specified.
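One common way to sidestep the ordering problem is to wrap the object in a function and make it a local static, so that it is constructed on first use. Below is a minimal sketch; the Date struct is a hypothetical stand-in for the library's class, and default_year is an invented second global that depends on it.

```cpp
#include <cassert>

// Hypothetical Date type standing in for the library's real class.
struct Date {
    int year, month, day;
    Date(int y, int m, int d) : year(y), month(m), day(d) {}
};

// Instead of a namespace-scope object, wrap the constant in a function.
// The local static is initialized the first time control passes through
// its definition, so every caller sees a fully constructed Date.
const Date& default_date() {
    static const Date dd(1970, 1, 1);
    return dd;
}

// Another global whose initializer depends on the default date.
// This is safe regardless of the order in which translation units are
// initialized, because default_date() constructs dd on demand.
const int default_year = default_date().year;

void check_default_date() {
    assert(default_date().year == 1970);
    assert(default_year == 1970);
    assert(&default_date() == &default_date());  // always the same object
}
```

The price is a function call at each use, and the approach only addresses construction order; destruction order at program exit is a separate concern.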

C++ What is the difference between definition and instantiation?

What is the difference between definition and instantiation?
Sub-question: Are "variable definition" and "variable instantiation" the same?
int x;
The above code can be referred to as both a variable definition and a variable instantiation, right? If so, my question is whether these two terms are synonyms (or is there a different relation between them)?
After quite some edits and also a correction made by Johannes Schaub:
A definition of a variable of a certain type creates a variable of that type. As far as Stroustrup is concerned, this also holds for the definition of objects of a certain class, since a class is nothing more than a (non-native) type. (This makes much sense, although it isn't general OO terminology.)
General object-oriented terminology: Instantiation of a class creates an object of that class. Specialized C++ terminology: Instantiation of a template creates a "perfectly ordinary" (Stroustrup) class.
A class is a type, defined in code, rather than as part of the language.
An implementation is a concrete class that realizes the functionality specified in an abstract class from which it derives.
A declaration is a specification of a variable or function, without allocating memory for it or generating code for it.
So @Riko, the definition on learncpp.com specifying that a definition implements an identifier is not very accurate. Interfaces can be implemented, not types or classes. But one part of the definition is valuable: Definition in general goes hand in hand with memory allocation. You can declare a function or variable as often as you want (e.g. a declaration in a header file), but you can define it only once. If you declare a function, you give the signature (the name, return type and parameters, but not the body).
If you declare a variable, you used to put the word extern in front of it in a header file, but that isn't done often anymore, since object orientation and classes took over. Defining a variable in a header file, on the other hand, may lead to multiple instances of the variable, since the same header is read during compilation of distinct source files. Since C++ uses independent compilation and just textually includes the header, the variable is defined in multiple files, so there are several variables under the same name. Linkers don't like such ambiguity and will complain.
While the term instantiation in general means "creating an object of a class", Stroustrup (maker of C++) uses it in a special sense: A class is an instance of a template with all its parameters resolved. Nevertheless in many texts on C++ the word instantiation is used in the general Object Oriented sense, which is confusing.
@Johannes Schaub: Although I am not too happy with C++ terminology deviating from general OO terminology, I think it's right to follow Stroustrup here, since after all, he created the language.
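The two senses of "instantiation" can be put side by side in a short sketch. The template Box and the function make_box_value are invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

// A class template: not yet a class.
template <typename T>
struct Box {
    T value;
};

// Template instantiation (the C++-specific sense): Box<int> and Box<double>
// are two distinct, ordinary classes generated from the template.
static_assert(!std::is_same<Box<int>, Box<double>>::value,
              "each set of template arguments yields its own class");

// Explicit instantiation definition: asks the compiler to generate Box<long> here.
template struct Box<long>;

int make_box_value() {
    // A variable definition. In general OO terms this also "instantiates"
    // an object of the class Box<int>.
    Box<int> b{7};
    return b.value;
}

void check_box_example() {
    assert(make_box_value() == 7);
}
```

So `Box<int>` is an instance of the template (a class), while `b` is an instance of that class (an object).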
There are:
1) variable definitions,
2) variable/object instantiations, and
3) template instantiations.
1 & 3 are specific C++ terminology. 2 is more general terminology that might be used with C++. It is not an "officially" defined term for C++.
I understand that your question is about 1 and 2, but not 3. 3 is different than 2, though related in meaning. I won't address 3 further as I don't believe it is part of your question.
Instantiation is the creation of an object instance. It is more usual to use the term in reference to a class object than something like an int or double.
A C++ variable definition does cause an object of the type being defined to be instantiated. It is, however, possible in C++ to instantiate an object other than via a variable definition.
Example 1:
std::string name;
The variable name, a std::string, is defined and (at run-time) instantiated.
Example 2:
std::string *namePointer;
The variable namePointer, a pointer, is defined and might be said (at run-time) to be instantiated (though not initialized). There is no std::string variable and no std::string is instantiated.
//simple example, not what one should usually write in real code
namePointer = new std::string("Some Text");
No additional variable is defined. A std::string object is instantiated (at run-time) and the separate and pre-existing namePointer variable also has its value set.
Definition and Declaration are compile time concerns.
Declaration and definition of identifiers happens while your program is being compiled.
Declaration: A declaration is telling the compiler about the type of an identifier that is defined somewhere else but may be referenced here.
Definition: There can be only one definition of an identifier. This is where the thing is actually defined. All the declarations refer to this definition.
Mostly this is only a distinction we make with classes because built in types are already defined (the compiler already knows what an int is). The only exception I can think of is when we declare a variable to be extern.
Instantiation, this happens at run time.
An object is an instance of a class.
Instantiation is the act of creating a new object.
Instantiation of an object happens while your program is being run. Instantiation is when a new instance of the class is created (an object).
In C++, when a class is instantiated, memory is allocated for the object and the class's constructor is run. In C++ we can instantiate objects in two ways: on the stack as a variable declaration, or on the heap with the new keyword. So for the class A, both of the following create an instance of the class (instantiate it):
struct A {
int a;
};
A inst1;
A* inst2 = new A();
inst1 is a local variable that refers to an instance of the class A that was just created on the stack.
inst2 is a local variable that holds a pointer to an instance of class A that was just created on the heap.
There may be some confusion because the first is not possible in Java or C#, only the second. In C++ both instantiate (create a new runtime instance of) the class A. The only difference between the two is scope and where the memory was allocated.
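A small sketch that makes both instantiations observable. The static counter live and the function count_live_instances are additions for illustration and are not part of the struct A shown above.

```cpp
#include <cassert>

struct A {
    static int live;   // number of A objects currently alive (illustrative counter)
    int a = 0;
    A() { ++live; }
    ~A() { --live; }
};
int A::live = 0;

// Returns how many A objects existed at the same time inside the block.
int count_live_instances() {
    int seen = 0;
    {
        A inst1;             // definition and instantiation; destroyed at the closing brace
        A* inst2 = new A();  // only the pointer is a variable; the A object itself has no name
        seen = A::live;      // both objects exist here
        delete inst2;        // the heap object must be destroyed explicitly
    }
    return seen;
}

void check_instances() {
    assert(count_live_instances() == 2);
    assert(A::live == 0);    // nothing is left alive afterwards
}
```

In real code a smart pointer such as std::unique_ptr would normally own the heap object; raw new/delete is kept here only to mirror the snippet above.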

Is encapsulation violated, if I use a global variable in a class member function's definition?

I've been asked to explain what encapsulation is and I replied "bundling of data and functions that modify this data, is called encapsulation."
The answer was followed by another question—"So, by your definition if I modify a global variable from a member function of a class then the encapsulation is violated."
It made sense to answer YES.
I am not sure whether my explanation is wrong, or whether the follow-up question is valid and my answer of YES is correct.
Can somebody help?
Quoting from wikipedia:
In programming languages, encapsulation is used to refer to one of two
related but distinct notions, and sometimes to the combination
thereof:
A language mechanism for restricting access to some of the object's components.
A language construct that facilitates the bundling of data with the methods (or other functions) operating on that data
In my humble opinion the answer to the follow up question is subjective and it depends on the interpretation of the notion of encapsulation.
For example, it's not a violation if the encapsulated data are limited to the member variables of classes. A global variable that doesn't belong to an object is accessible by everyone, and thus accessing it via a member function doesn't constitute an encapsulation violation.
On the other hand, if you consider that encapsulation should be applied to your entire program, then this global variable should have been bundled into an object, and thus raw access to it constitutes an encapsulation violation.
The bottom line is that the answer lies in the realms of theology, meaning that it depends on how encapsulation is interpreted by the different programming dogmas.
This depends on how the global variable is defined and accessed.
Imagine a header file containing the declarations, but not the definitions, of the member functions, and a corresponding implementation file containing the implementation of the class members.
Now consider a global variable defined in this header file with internal linkage (static), or placed in an unnamed namespace. It is a global variable, but functionally it does not differ from a private static class member.
It is smelly code, but, I say, that variable is encapsulated properly:
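A minimal sketch of such an arrangement, collapsed into one file. This is not the answerer's own code: the class Counter and the variable instance_count are hypothetical, and the sketch assumes the variable sits in the implementation file's unnamed namespace.

```cpp
#include <cassert>

// --- counter.h (hypothetical): declarations only ---
class Counter {
public:
    Counter();
    ~Counter();
    static int alive();   // how many Counter objects currently exist
};

// --- counter.cpp (hypothetical): member definitions ---
namespace {
    // Namespace-scope variable with internal linkage: no other
    // translation unit can name it, so only the functions below touch it.
    int instance_count = 0;
}

Counter::Counter()  { ++instance_count; }
Counter::~Counter() { --instance_count; }
int Counter::alive() { return instance_count; }

void check_counter() {
    assert(Counter::alive() == 0);
    {
        Counter a, b;
        assert(Counter::alive() == 2);
    }
    assert(Counter::alive() == 0);
}
```

From the outside, instance_count is as unreachable as a private static member would be, which is the sense in which it can be called encapsulated.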

Why is the 'Declare before use' rule not required inside a class? [duplicate]

I'm wondering why the declare-before-use rule of C++ doesn't hold inside a class.
Look at this example:
#ifdef BASE
struct Base {
#endif
struct B;
struct A {
B *b;
A(){ b->foo(); }
};
struct B {
void foo() {}
};
#ifdef BASE
};
#endif
int main( ) { return 0; }
If BASE is defined, the code is valid.
Within A's constructor I can use B::foo, which hasn't been declared yet.
Why does this work and, more importantly, why does it only work inside a class?
Well, to be pedantic there's no "declare before use rule" in C++. There are rules of name lookup, which are pretty complicated, but which can be (and often are) roughly simplified into the generic "declare before use rule" with a number of exceptions. (In a way, the situation is similar to "operator precedence and associativity" rules. While the language specification has no such concepts, we often use them in practice, even though they are not entirely accurate.)
This is actually one of those exceptions. Member function definitions in C++ are specifically and intentionally excluded from that "declare before use rule" in a sense that name lookup from the bodies of these members is performed as if they are defined after the class definition.
The language specification states that in 3.4.1/8 (and footnote 30), although it uses a different wording. It says that during the name lookup from the member function definition, the entire class definition is inspected, not just the portion above the member function definition. Footnote 30 additionally states though that the lookup rules are the same for functions defined inside the class definition or outside the class definition (which is pretty much what I said above).
Your example is a bit non-trivial. It raises the immediate question about member function definitions in nested classes: should they be interpreted as if they are defined after the definition of the outermost enclosing class? The answer is yes. 3.4.1/8 covers this situation as well.
The book "The Design and Evolution of C++" describes the reasoning behind these decisions.
That's because member functions are compiled only after the whole class definition has been parsed by the compiler, even when the function definition is written inline, whereas regular functions are compiled immediately after being read. The C++ standard requires this behaviour.
I don't know the chapter and verse of the standard on this.
But if you would apply the "declare before use" rule strictly within a class, you would not be able to declare member variables at the bottom of the class declaration either. You would have to declare them first, in order to use them e.g. in a constructor initialization list.
I could imagine the "declare before use" rule has been relaxed a bit within the class declaration to allow for "cleaner" overall layout.
Just guesswork, as I said.
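The effect is easy to see in a small sketch (the class Account and the function helper are made-up names): member function bodies and constructor initializer lists may refer to members that are declared further down in the class.

```cpp
#include <cassert>

struct Account {
    // The body uses balance_ and fee(), both declared further down.
    // Lookup inside a member function body sees the complete class.
    int balance_after_fee() const { return balance_ - fee(); }

    // The constructor's initializer list also refers to a later member.
    explicit Account(int start) : balance_(start) {}

    int fee() const { return 2; }

private:
    int balance_;
};

// At namespace scope the simplified "declare before use" rule does apply:
// uncommenting the next line fails to compile, because helper is not yet declared.
// int early() { return helper(); }
int helper() { return 1; }

void check_account() {
    assert(Account(10).balance_after_fee() == 8);
}
```

Moving the body of balance_after_fee out of the class, after the closing brace, would give it the same meaning, which matches the behaviour described above.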
The most stubborn problems in the definition of C++ relate to name lookup: exactly which uses of a name refer to which declarations? Here, I'll describe just one kind of lookup problem: the ones that relate to order dependencies between class member declarations. [...]
Difficulties arise because of conflicts between goals:
We want to be able to do syntax analysis reading the source text once only.
Reordering the members of a class should not change the meaning of the class.
A member function body explicitly written inline should mean the same thing when written out of line.
Names from an outer scope should be usable from an inner scope (in the same way as they are in C).
The rules for name lookup should be independent of what a name refers to.
If all of these hold, the language will be reasonably fast to parse, and users won't have to worry about these rules because the compiler will catch the ambiguous and near ambiguous cases. The current rules come very close to this ideal.
[The Design And Evolution Of C++, section 6.3.1 called Lookup Issues on page 138]