Why is the 'Declare before use' rule not required inside a class? [duplicate] - c++

This question already has answers here:
Do class functions/variables have to be declared before being used?
(5 answers)
Closed 4 years ago.
I'm wondering why the declare-before-use rule of C++ doesn't hold inside a class.
Look at this example:
#ifdef BASE
struct Base {
#endif
struct B;
struct A {
B *b;
A(){ b->foo(); }
};
struct B {
void foo() {}
};
#ifdef BASE
};
#endif
int main( ) { return 0; }
If BASE is defined, the code is valid.
Within A's constructor I can use B::foo, which hasn't been declared yet.
Why does this work and, more to the point, why does it only work inside a class?

Well, to be pedantic there's no "declare before use rule" in C++. There are rules of name lookup, which are pretty complicated, but which can be (and often are) roughly simplified into the generic "declare before use rule" with a number of exceptions. (In a way, the situation is similar to "operator precedence and associativity" rules. While the language specification has no such concepts, we often use them in practice, even though they are not entirely accurate.)
This is actually one of those exceptions. Member function definitions in C++ are specifically and intentionally excluded from that "declare before use rule", in the sense that name lookup from the bodies of these members is performed as if they were defined after the class definition.
The language specification states that in 3.4.1/8 (and footnote 30), although it uses a different wording. It says that during the name lookup from the member function definition, the entire class definition is inspected, not just the portion above the member function definition. Footnote 30 additionally states though that the lookup rules are the same for functions defined inside the class definition or outside the class definition (which is pretty much what I said above).
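A minimal sketch of that rule, for illustration only (the names `Counter`, `Counter2`, `bump`, `total` and `step` are made up here, not taken from the question): the body of `bump` uses members that are declared further down in the class, and the out-of-line version looks names up in the same way.

```cpp
#include <cassert>

struct Counter {
    // The body refers to `total` and `step`, both declared later in the class.
    int bump() { total += step; return total; }

    int total = 0;
    int step = 2;
};

// The same function written out of line: lookup works the same way.
struct Counter2 {
    int bump();
    int total = 0;
    int step = 2;
};
inline int Counter2::bump() { total += step; return total; }

// Small self-check: both forms behave identically.
inline void check_counters() {
    Counter a;
    Counter2 b;
    assert(a.bump() == 2 && b.bump() == 2);
    assert(a.bump() == 4 && b.bump() == 4);
}
```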
Your example is a bit non-trivial. It raises the immediate question about member function definitions in nested classes: should they be interpreted as if they are defined after the definition of the outermost enclosing class? The answer is yes. 3.4.1/8 covers this situation as well.
The book "The Design and Evolution of C++" describes the reasoning behind these decisions.

That's because member functions are compiled only after the whole class definition has been parsed by the compiler, even when the function definition is written inline, whereas regular functions are compiled immediately after being read. The C++ standard requires this behaviour.

I don't know the chapter and verse of the standard on this.
But if you would apply the "declare before use" rule strictly within a class, you would not be able to declare member variables at the bottom of the class declaration either. You would have to declare them first, in order to use them e.g. in a constructor initialization list.
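To illustrate the layout this allows, here is a small sketch (the class `Temperature` and its members are hypothetical): the data member sits at the bottom of the class, yet the constructor's initializer list and the member function body above it can already name it.

```cpp
#include <cassert>

class Temperature {
public:
    // `celsius_` is declared at the bottom of the class, but the
    // mem-initializer list and the function bodies can already use it.
    explicit Temperature(double c) : celsius_(c) {}
    double fahrenheit() const { return celsius_ * 9.0 / 5.0 + 32.0; }

private:
    double celsius_;
};

// Small self-check of the conversion.
inline void check_temperature() {
    assert(Temperature(100.0).fahrenheit() == 212.0);
}
```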
I could imagine the "declare before use" rule has been relaxed a bit within the class declaration to allow for "cleaner" overall layout.
Just guesswork, as I said.

The most stubborn problems in the definition of C++ relate to name lookup: exactly which uses of a name refer to which declarations? Here, I'll describe just one kind of lookup problem: the ones that relate to order dependencies between class member declarations. [...]
Difficulties arise because of conflicts between goals:
We want to be able to do syntax analysis reading the source text once only.
Reordering the members of a class should not change the meaning of the class.
A member function body explicitly written inline should mean the same thing when written out of line.
Names from an outer scope should be usable from an inner scope (in the same way as they are in C).
The rules for name lookup should be independent of what a name refers to.
If all of these hold, the language will be reasonably fast to parse, and users won't have to worry about these rules because the compiler will catch the ambiguous and near ambiguous cases. The current rules come very close to this ideal.
[The Design And Evolution Of C++, section 6.3.1 called Lookup Issues on page 138]

Related

C++ What is the difference between definition and instantiation?

What is the difference between definition and instantiation?
Sub-question: Are "variable definition" and "variable instantiation" the same?
int x;
The above code can be referred to as both a variable definition and a variable instantiation, right? If so, are these two terms synonyms? (Or is there a different relation between them?)
After quite some edits and also a correction made by Johannes Schaub:
A definition of a variable of a certain type creates a variable of that type. As far as Stroustrup is concerned, this also holds for the definition of objects of a certain class, since a class is nothing more than a (non-native) type. (This makes much sense, although it isn't general OO terminology.)
General object-oriented terminology: Instantiation of a class creates an object of that class. Specialized C++ terminology: Instantiation of a template creates a "perfectly ordinary" (Stroustrup) class.
A class is a type, defined in code, rather than as part of the language.
An implementation is a concrete class that realizes the functionality specified in an abstract class from which it derives.
A declaration is a specification of a variable or function, without allocating memory for it or generating code for it.
So @Riko, the definition on learncpp.com specifying that a definition implements an identifier is not very accurate. Interfaces can be implemented, not types or classes. But one part of the definition is valuable: Definition in general goes hand in hand with memory allocation. You can declare a function or variable as often as you want (e.g. a declaration in a header file), but you can define it only once. If you declare a function, you give the signature (the name, return type and params, but not the body).
If you declare a variable, you used to put the word extern in front of it in a header file, but that isn't done often anymore, since object orientation and classes took over. Defining a variable in a header file, on the other hand, may lead to multiple instances of the variable, since the same header is read during compilation of distinct source files. Since C++ uses independent compilation and just textually includes the header, the variable is defined in multiple files, so there are several variables under the same name. Linkers don't like such ambiguity and will complain.
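A minimal sketch of the declare-many, define-once distinction, squeezed into a single file for brevity (in real code the `extern` declaration would normally live in a header; `request_count` and `next_request` are made-up names):

```cpp
#include <cassert>

// Declarations: they only introduce the names and may be repeated.
extern int request_count;   // hypothetical variable, declared but not defined here
extern int request_count;   // repeating a declaration is fine
int next_request();         // function declaration: signature only
int next_request();

// Definitions: exactly one of each in the whole program.
int request_count = 0;
int next_request() { return ++request_count; }

// Small self-check that does not depend on the counter's current value.
inline void check_requests() {
    int before = request_count;
    assert(next_request() == before + 1);
}
```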
While the term instantiation in general means "creating an object of a class", Stroustrup (maker of C++) uses it in a special sense: A class is an instance of a template with all its parameters resolved. Nevertheless in many texts on C++ the word instantiation is used in the general Object Oriented sense, which is confusing.
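As a sketch of "instantiation" in Stroustrup's sense (the template `Box` is hypothetical): each set of template arguments yields a distinct, ordinary class, and objects of those classes are then created in the usual way.

```cpp
#include <cassert>
#include <type_traits>

template <typename T>
struct Box {
    T value;
};

// Explicit instantiation: Box<int> is now an ordinary class.
template struct Box<int>;

// Box<int> and Box<double> are two different classes made from one template.
static_assert(!std::is_same<Box<int>, Box<double>>::value,
              "each instantiation is a distinct type");

inline void check_boxes() {
    Box<int> i{7};        // object of the class Box<int>
    Box<double> d{1.5};   // using Box<double> here instantiates it implicitly
    assert(i.value == 7);
    assert(d.value == 1.5);
}
```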
@Johannes Schaub: Although I am not too happy with C++ terminology deviating from general OO terminology, I think it's right to follow Stroustrup here, since after all, he created the language.
There are:
1) variable definitions,
2) variable/object instantiations, and
3) template instantiations.
1 & 3 are specific C++ terminology. 2 is more general terminology that might be used with C++. It is not an "officially" defined term for C++.
I understand that your question is about 1 and 2, but not 3. 3 is different than 2, though related in meaning. I won't address 3 further as I don't believe it is part of your question.
Instantiation is the creation of an object instance. It is more usual to use the term in reference to a class object than something like an int or double.
A C++ variable definition does cause an object of the type being defined to be instantiated. It is, however, possible in C++ to instantiate an object other than via a variable definition.
Example 1:
std::string name;
The variable name, a std::string, is defined and (at run-time) instantiated.
Example 2:
std::string *namePointer;
The variable namePointer, a pointer, is defined and might be said (at run-time) to be instantiated (though not initialized). There is no std::string variable and no std::string is instantiated.
//simple example, not what one should usually write in real code
namePointer = new std::string("Some Text");
No additional variable is defined. A std::string object is instantiated (at run-time) and the separate and pre-existing namePointer variable also has its value set.
Definition and Declaration are compile time concerns.
Declaration and definition of identifiers happens while your program is being compiled.
Declaration: A declaration is telling the compiler about the type of an identifier that is defined somewhere else but may be referenced here.
Definition: There can be only one definition of an identifier. This is where the thing is actually defined. All the declarations refer to this definition.
Mostly this is only a distinction we make with classes because built in types are already defined (the compiler already knows what an int is). The only exception I can think of is when we declare a variable to be extern.
Instantiation, this happens at run time.
An object is an instance of a class.
Instantiation is the act of creating a new object.
Instantiation of an object happens while your program is being run. Instantiation is when a new instance of the class is created (an object).
In C++ when a class is instantiated, memory is allocated for the object and the class's constructor is run. In C++ we can instantiate objects in two ways: on the stack as a variable declaration, or on the heap with the new keyword. So for the class A both of the following create an instance of the class (instantiate it):
struct A {
int a;
};
A inst1;
A* inst2 = new A();
inst1 is a local variable that refers to an instance of the class A that was just created on the stack.
inst2 is a local variable that holds a pointer to an instance of class A that was just created on the heap.
There may be some confusion because the first is not possible in Java or C#, only the second. In C++ both instantiate (create a new runtime instance of) the class A. The only difference between the two is scope and where the memory was allocated.

Why don't methods of structs have to be declared in C++?

Take, for example, the following code:
#include <iostream>
#include <string>
int main()
{
print("Hello!");
}
void print(std::string s) {
std::cout << s << std::endl;
}
When trying to build this, I get the following:
program.cpp: In function ‘int main()’:
program.cpp:6:16: error: ‘print’ was not declared in this scope
Which makes sense.
So why can I conduct a similar concept in a struct, but not get yelled at for it?
struct Snake {
...
Snake() {
...
addBlock(Block(...));
}
void addBlock(Block block) {
...
}
void update() {
...
}
} snake1;
Not only do I not get warnings, but the program actually compiles! Without error! Is this just the nature of structs? What's happening here? Clearly addBlock(Block) was called before the method was ever declared.
A struct in C++ is actually a class definition where all its content is public, unless specified otherwise by including a protected: or private: declaration.
When the compiler sees a class or struct, it first digests all its declarations from inside the block ({}) before operating on them.
In the regular method case, the compiler hasn't yet seen the type declared.
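For the free-function case, a forward declaration fixes the error. The sketch below is an assumption-laden variant of the question's code: `print` takes an output stream as an extra parameter and `greet` stands in for `main`, purely so the result can be checked.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Forward declaration: tells the compiler about `print` before it is used.
void print(std::ostream& out, const std::string& s);

inline std::string greet() {
    std::ostringstream out;
    print(out, "Hello!");   // OK: declared above, defined below
    return out.str();
}

// The definition can now come after the call site.
void print(std::ostream& out, const std::string& s) {
    out << s << '\n';
}

inline void check_greet() {
    assert(greet() == "Hello!\n");
}
```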
C++ standard 3.4.1:
.4:
A name used in global scope, outside of any function, class or
user-declared namespace, shall be declared before its use in global
scope.
This is why global variables and functions cannot be used before a prior declaration.
.5:
A name used in a user-declared namespace outside of the definition of
any function or class shall be declared before its use in that
namespace or before its use in a namespace enclosing its namespace.
This says the same thing again: paragraph .4 explicitly restricted itself to the global scope, and this paragraph adds that the rule holds in namespaces as well.
.7:
A name used in the definition of a class X outside of a member
function body or nested class definition shall be declared in one of
the following ways: — before its use in class X or be a member of a
base class of X (10.2), or — if X is a nested class of class Y (9.7),
before the definition of X in Y, or shall be a member of a base class
of Y (this lookup applies in turn to Y ’s enclosing classes, starting
with the innermost enclosing class), or — if X is a local class
(9.8) or is a nested class of a local class, before the definition of
class X in a block enclosing the definition of class X, or — if X is a
member of namespace N, or is a nested class of a class that is a
member of N, or is a local class or a nested class within a local
class of a function that is a member of N, before the definition of
class X in namespace N or in one of N ’s enclosing namespaces.
I think this covers every use of a name in a class that is not part of executable code (i.e. the declarative parts of the class).
and finally the interesting part:
3.3.7 Class scope [basic.scope.class]
1 The following rules describe the scope of names declared in classes.
1) The potential scope of a
name declared in a class consists not only of the declarative region
following the name’s point of declaration, but also of all function
bodies, brace-or-equal-initializers of non-static data members, and
default arguments in that class (including such things in nested
classes).
2) A name N used in a class S shall refer to the same
declaration in its context and when re-evaluated in the completed
scope of S. No diagnostic is required for a violation of this rule.
3)
If reordering member declarations in a class yields an alternate valid
program under (1) and (2), the program is ill-formed, no diagnostic is
required.
In particular, the last point states "any ordering is possible" in a negative way: if reordering the members would change what a name refers to, the program is ill-formed. Put positively, you can reorder the member declarations and it must not change anything.
Effectively, names used in function bodies inside a class are looked up in a two-phase fashion: the declarations are collected first, and the bodies are processed against the completed class.
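A small sketch of where the relaxed lookup applies and where it does not (the struct `Widget` and its members are hypothetical). Function bodies see the whole class, but a name used in a member's declaration, such as a return type, still has to be declared first.

```cpp
#include <cassert>

struct Widget {
    using Id = int;   // must precede its use in a member *declaration*

    // Bodies see the completed class: `id_` and `kFactor` are declared below.
    Id id() const { return id_; }
    Id twice() const { return kFactor * id_; }

    // Size size() const;   // would not compile here: `Size` is not declared yet
    using Size = unsigned;

    Id id_ = 21;
    static constexpr int kFactor = 2;
};

inline void check_widget() {
    Widget w;
    assert(w.id() == 21);
    assert(w.twice() == 42);
}
```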
"why can I conduct a similar concept in a struct, but not get yelled at for it?"
In a struct or class definition you're presenting the public interface to a class and it's much easier to understand, search and maintain/update that API if it's presented in:
a predictable order, with
minimal clutter.
For predictable order, people have their own styles and there's a bit of "art" involved, but for example I use each access specifier at most once and always public before protected before private, then within those I normally put typedefs, const data, constructors, destructors, mutating/non-const functions, const functions, statics, friends....
To minimise clutter, if a function is defined in the class, it might as well be without a prior declaration. Having both tends only to obfuscate the interface.
This is different from functions that aren't members of a class - where people who like top-down programming do use function declarations and hide the definitions later in the file - in that:
people who prefer a bottom-up programming style won't appreciate being forced to either have separate declarations in classes or abandon the oft-conflicting practice of grouping by access specifier
Classes are statistically more likely to have many very short functions, largely because they provide encapsulation and wrap a lot of trivial data member accesses or provide operator overloading, casting operators, implicit constructors and other convenience features that aren't relevant to non-OO, non-member functions. That makes a constant forced separation of declarations and definitions more painful for many classes (not so much in the public interfaces where definitions might be in a separate file, but definitely for e.g. classes in anonymous namespaces supporting the current translation unit).
Best practice is for classes not to cram in a wildly extensive interface... you generally want a functional core and then some discretionary convenience functions, after which it's worth considering what can be added as non-member functions. std::string is often claimed to have too many member functions, though I personally think it's quite reasonable. Still, this also differs from a header file declaring a library interface, where exhaustive functionality can be expected to be crammed together, making a separation of even inline implementation more desirable.

Why are access declarations deprecated? What does this mean for SRO and using declarations?

I've been looking high and low for an answer to what I thought was a fairly simple question: Why are access declarations deprecated?
class A
{
public:
int testInt;
};
class B: public A
{
private:
A::testInt;
};
I understand that it can be fixed by simply plopping "using" in front of A::testInt,
but without some sort of understanding as to why I must do so, that feels like a cheap fix.
Worse yet, it muddies my understanding of using declarations/directives, and the scope resolution operator. If I must use a using declaration here, why am I able to use the SRO and only the SRO elsewhere? A trivial example is std::cout. Why not use using std::cout? I used to think that using and the SRO were more or less interchangeable (give or take some handy functionality provided with the "using" keyword, of which I am aware, at least in the case of namespaces).
I've seen the following in the standard:
The access of a member of a base class can be changed in the derived class by mentioning its qualified-id in the derived class declaration. Such mention is called an access declaration. The effect of an access declaration qualified-id; is defined to be equivalent to the declaration using qualified-id; [Footnote: Access declarations are deprecated; member using-declarations (7.3.3) provide a better means of doing the same things. In earlier versions of the C++ language, access declarations were more limited; they were generalized and made equivalent to using-declarations - end footnote]
However, that really does nothing other than confirm what I already know. If you really boiled it down, I am sure my problem stems from the fact that I think using and the SRO are interchangeable, but I haven't seen anything that would suggest otherwise.
Thanks in advance!
If I must use a using declaration here, why am I able to use the SRO and only the SRO elsewhere?
Huh? You are not able to. Not to re-declare a name in a different scope (which is what an access declaration does).
A trivial example is std::cout. Why not use using std::cout?
Because they're not the same thing, not even close.
One refers to a name, the other re-declares a name.
I am sure my problem stems from the fact that I think using and the SRO are interchangeable
I agree that's your problem, because you are entirely wrong. Following a using declaration it is not necessary to qualify the name, but that doesn't make them interchangeable.
std::cout is an expression, it refers to the variable so you can write to it, pass it as a function argument, take its address etc.
using std::cout; is a declaration. It makes the name cout available in the current scope, as an alias for the name std::cout.
std::cout << "This is an expression involving std::cout\n";
using std::cout; // re-declaration of `cout` in current scope
If you're suggesting that for consistency you should do this to write to cout:
using std::cout << "This is madness.\n";
then, erm, that's madness.
In a class, when you want to re-declare a member with a different access you are re-declaring it, so you want a declaration. You aren't trying to refer to the object in order to write to it or involve it in some expression, which (if it was allowed at class scope) would look like this:
class B: public A
{
private:
A::testInt + 1;
};
For consistency with the rest of the language, re-declaring a name from a base class is done with a using-declaration, because that's a declaration, it's not done with something that looks like an expression.
class B: public A
{
private:
A::testInt; // looks like an expression involving A::testInt, but isn't
using A::testInt; // re-declaration of `testInt` in current scope
};
Compare this to the std::cout example above and you'll see that requiring using is entirely consistent, and removing access declarations from C++ makes the language more consistent.
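For completeness, a compilable sketch of the using-declaration form. The accessors `set` and `get` are made up for illustration and are not part of the question's code.

```cpp
#include <cassert>

class A {
public:
    int testInt = 0;
};

class B : public A {
private:
    using A::testInt;   // re-declares testInt in B, now with private access

public:
    // Hypothetical accessors: the member is still usable inside B.
    void set(int v) { testInt = v; }
    int get() const { return testInt; }
};

// B b; b.testInt = 1;   // error: testInt is private within B

inline void check_access() {
    B b;
    b.set(5);
    assert(b.get() == 5);
    A& a = b;              // through the base class the member is still public
    assert(a.testInt == 5);
}
```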

Reason for C++ member function hiding [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
name hiding and fragile base problem
I'm familiar with the rules involving member function hiding. Basically, a derived class with a function that has the same name as a base class function doesn't actually overload the base class function - it completely hides it.
#include <string>
struct Base
{
void foo(int x) const
{
}
};
struct Derived : public Base
{
void foo(const std::string& s) { }
};
int main()
{
Derived d;
d.foo("abc");
d.foo(123); // Will not compile! Base::foo is hidden!
}
So, you can get around this with a using declaration. But my question is, what is the reason for base class function hiding? Is this a "feature" or just a "mistake" by the standards committee? Is there some technical reason why the compiler can't look in the Base class for matching overloads when it doesn't find a match for d.foo(123)?
Name lookup works by looking in the current scope for matching names, if nothing is found then it looks in the enclosing scope, if nothing is found it looks in the enclosing scope, etc. until reaching the global namespace.
This isn't specific to classes, you get exactly the same name hiding here:
#include <iostream>
namespace outer
{
void foo(char c) { std::cout << "outer\n"; }
namespace inner
{
void foo(int i) { std::cout << "inner\n"; }
void bar() { foo('c'); }
}
}
int main()
{
outer::inner::bar();
}
Although outer::foo(char) is a better match for the call foo('c'), name lookup stops after finding outer::inner::foo(int) (i.e. outer::foo(char) is hidden), and so the program prints inner.
If member function names weren't hidden, name lookup in class scope would behave differently from lookup in non-class scope, which would be inconsistent and confusing, and would make C++ even harder to learn.
So there's no technical reason the name lookup rules couldn't be changed, but they'd have to be changed for member functions and for other kinds of name lookup alike. It would also make compilers slower, because they'd have to keep searching for names even after finding matching names in the current scope. Sensibly, if there's a name in the current scope it's probably the one you wanted. A call in a scope A probably wants to find names in that scope, e.g. if two functions are in the same namespace they're probably related (part of the same module or library), so if one uses the name of the other it probably means to call the one in the same scope. If that's not what you want, use explicit qualification or a using declaration to tell the compiler the other name should be visible in that scope.
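A short sketch of the using-declaration workaround applied to the question's classes. As an assumption for illustration, the functions return an `int` here (the originals return `void`) so that the chosen overload is observable.

```cpp
#include <cassert>
#include <string>

struct Base {
    int foo(int) const { return 1; }
};

struct Derived : public Base {
    using Base::foo;   // make Base::foo visible alongside Derived::foo
    int foo(const std::string&) { return 2; }
};

inline void check_unhidden() {
    Derived d;
    assert(d.foo(123) == 1);     // now finds Base::foo(int)
    assert(d.foo("abc") == 2);   // still finds Derived::foo(const std::string&)
}
```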
Is this a "feature" or just a "mistake" by the standards committee?
It's definitely not a mistake, since it's clearly stipulated in the standard. It's a feature.
Is there some technical reason why the compiler can't look in the Base class for matching overloads when it doesn't find a match for d.foo(123)?
Technically, a compiler could look in the base class. Technically. But if it did, it would break the rules set by the standard.
But my question is, what is the reason for base class function hiding?
Unless someone from the committee comes with an answer, I think we can only speculate. Basically, there were two options:
if I declare a function with the same name in a derived class, keep the base class's functions with the same name directly accessible through a derived class
don't
It could have been determined by flipping a coin (...ok, maybe not).
In general, what are the reasons for wanting a function with the same name as that of a base class? There's different functionality - where you'd more likely use polymorphism instead. For handling different cases (different parameters), and if these cases aren't present in the base class, a strategy pattern might be more appropriate to handle the job. So most likely function hiding comes in effect when you actually do want to hide the function. You're not happy with the base class implementation so you provide your own, with the option of using using, but only when you want to.
I think it's just a mechanism to make you think twice before having a function with the same name & different signature.
I believe @Lol4t0 is pretty much correct, but I'd state things much more strongly. If you allowed this, you'd end up with two possibilities: either make a lot of other changes throughout almost the entirety of the language, or else you end up with something almost completely broken.
The other changes you'd make to allow this to work would be to completely revamp how overloading is done -- you'd have to change at least the order of the steps that were taken, and probably the details of the steps themselves. Right now, the compiler looks up the name, then forms an overload set, resolves the overload, then checks access to the chosen overload.
To make this work even sort of well, you'd pretty much have to change that to check access first, and only add accessible functions to the overload set. With that, at least the example in @Lol4t0's answer could continue to compile, because Base::foo would never be added to the overload set.
That still means, however, that adding to the interface of the base class could cause serious problems. If Base didn't originally contain foo, and a public foo were added, then the call in main to d.foo() would suddenly do something entirely different, and (again) it would be entirely outside the control of whoever wrote Derived.
To cure that, you'd just about have to make a fairly fundamental change in the rules: prohibit implicit conversions of function arguments. Along with that, you'd change overload resolution so in case of a tie, the most derived/most local version of a function was favored over a less derived/outer scope. With those rules, the call to d.foo(5.0) could never resolve to Derived::foo(int) in the first place.
That, however, would only leave two possibilities: either calls to free functions would have different rules than calls to member functions (implicit conversions allowed only for free functions) or else all compatibility with C would be discarded entirely (i.e., also prohibit implicit conversions in all function arguments, which would break huge amounts of existing code).
To summarize: to change this without breaking the language entirely, you'd have to make quite a few other changes as well. It would almost certainly be possible to create a language that worked that way, but by the time you were done it wouldn't be C++ with one minor change -- it would be an entirely different language that wasn't much like C++ or C, or much of anything else.
I can only suppose that this decision was made to keep things simpler.
Imagine that the derived function overloaded the base one. Should the following code then generate a compilation error, or use Derived's function?
struct Base
{
private:
void foo(float);
};
struct Derived: public Base
{
public:
void foo(int);
};
int main()
{
Derived d;
d.foo(5.0f);
}
According to the existing behavior of overloads, this should generate an error.
Now imagine that in the first version Base had no foo(float), and in the second version it appears. Now a change to the implementation of the base class breaks the interface of the derived class.
If you are the developer of Derived, cannot influence the developers of Base, and a lot of clients use your interface, you are in a bad situation.
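Under the actual hiding rules the situation is stable. In the sketch below (the function bodies, the `int` return type and the helper `call_foo` are assumptions added so that the result can be checked), lookup stops at Derived::foo, so the private Base::foo(float) is never considered and the call keeps resolving to Derived::foo(int).

```cpp
#include <cassert>

struct Base {
private:
    void foo(float) {}   // hidden by Derived::foo, never part of the overload set
};

struct Derived : public Base {
public:
    int foo(int x) { return x; }
};

inline int call_foo() {
    Derived d;
    return d.foo(5.0f);   // float converts to int; Derived::foo(int) is called
}

inline void check_hiding() {
    assert(call_foo() == 5);
}
```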

Why doesn't C++ need forward declarations for class members?

I was under the impression that everything in C++ must be declared before being used.
In fact, I remember reading that this is the reason why the use of auto in return types is not valid C++0x without something like decltype: the compiler must know the declared type before evaluating the function body.
Imagine my surprise when I noticed (after a long time) that the following code is in fact perfectly legal:
[Edit: Changed example.]
class Foo
{
Foo(int x = y);
static const int y = 5;
};
So now I don't understand:
Why doesn't the compiler require a forward declaration inside classes, when it requires them in other places?
The standard says (section 3.3.7):
The potential scope of a name declared in a class consists not only of the declarative region following the name’s point of declaration, but also of all function bodies, brace-or-equal-initializers of non-static data members, and default arguments in that class (including such things in nested classes).
This is probably accomplished by delaying processing bodies of inline member functions until after parsing the entire class definition.
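A runnable variant of the question's example. The member `value`, the public access and the constructor body are assumptions added so that the effect is observable, and `constexpr` is used in place of `const`. The default argument names `y`, which is declared on a later line.

```cpp
#include <cassert>

class Foo {
public:
    // Default arguments are looked up in the completed class,
    // so `y` may be declared after this constructor.
    Foo(int x = y) : value(x) {}

    static constexpr int y = 5;
    int value;
};

inline void check_default_argument() {
    Foo a;      // uses the default argument y
    Foo b(9);
    assert(a.value == 5);
    assert(b.value == 9);
}
```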
Function definitions within the class body are treated as if they were actually defined after the class has been defined. So your code is equivalent to:
class Foo
{
Foo();
int x, *p;
};
inline Foo::Foo() { p = &x; }
Actually, I think you need to reverse the question to understand it.
Why does C++ require forward declaration?
Because of the way C++ works (include files, not modules), it would otherwise need to wait for the whole Translation Unit before being able to assess, for sure, what the functions are. There are several downsides here:
compilation time would take yet another hit
it would be nigh impossible to provide any guarantee for code in headers, since any introduction of a later function could invalidate it all
Why is a class different?
A class is by definition contained. It's a small unit (or should be...). Therefore:
there is little compilation time issue, you can wait until the class end to start analyzing
there is no risk of dependency hell, since all dependencies are clearly identified and isolated
Therefore we can eschew this annoying forward-declaration rule for classes.
Just guessing: the compiler saves the body of the function and doesn't actually process it until the class declaration is complete.
Unlike a namespace, a class's scope cannot be reopened. It is bounded.
Imagine implementing a class in a header if everything needed to be declared in advance. I presume that, since a class is bounded, it was more logical to write the language as it is, rather than requiring the user to write forward declarations in the class (or requiring definitions separate from declarations).
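A minimal sketch of that difference (the names `shapes`, `sides_of_triangle`, `sides_of_square` and `Shape` are made up): a namespace can be reopened later to add more names, while a class definition is closed once its brace is reached.

```cpp
#include <cassert>

namespace shapes {
int sides_of_triangle() { return 3; }
}

// The same namespace, reopened: new names can be added at any later point.
namespace shapes {
int sides_of_square() { return sides_of_triangle() + 1; }
}

struct Shape {
    int sides() const { return 4; }
};
// struct Shape { int area() const; };   // error: redefinition of 'Shape'

inline void check_scopes() {
    assert(shapes::sides_of_square() == 4);
    assert(Shape().sides() == 4);
}
```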