Parsing declarations before definitions in a class - c++

This question here piqued my interest a little. Is there anywhere in the C++ standard that specifies all declarations within a class must be parsed before any accompanying implementations of member functions? I've seen a few other questions similar to this but no references to the standard in any of the answers.

The Standard doesn't specify how the compiler should parse a translation unit. Instead, it specifies everywhere it is and is not valid to use any identifier to refer to a declaration.
3.3.2p5:
After the point of declaration of a class member, the member name can be looked up in the scope of its
class. [ Note: this is true even if the class is an incomplete class. ]
3.3.7p1:
The following rules describe the scope of names declared in classes.
1) The potential scope of a name declared in a class consists not only of the declarative region following the name’s point of declaration, but also of all function bodies, brace-or-equal-initializers of non-static data members, and default arguments in that class (including such things in nested classes).
2) A name N used in a class S shall refer to the same declaration in its context and when re-evaluated in the completed scope of S. No diagnostic is required for a violation of this rule.
3) If reordering member declarations in a class yields an alternate valid program under (1) and (2), the program is ill-formed, no diagnostic is required.
4) A name declared within a member function hides a declaration of the same name whose scope extends to or past the end of the member function’s class.
5) The potential scope of a declaration that extends to or past the end of a class definition also extends to the regions defined by its member definitions, even if the members are defined lexically outside the class (this includes static data member definitions, nested class definitions, member function definitions (including the member function body and any portion of the declarator part of such definitions which follows the declarator-id, including a parameter-declaration-clause and any default arguments (8.3.6))).
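As a rough illustration of rules 1) and 5) quoted above, here is a minimal sketch. The names Counter, Step, Id and scaled are hypothetical and exist only for this example.

```cpp
struct Counter {
    // Rule 1: a member function body may use `value` and `Step`
    // even though both are declared further down in the class.
    int next() { value += Step{1}.amount; return value; }

    struct Step { int amount; };
    int value = 0;

    using Id = int;
    Id scaled(Id factor) const;
};

// Rule 5: after the declarator-id `Counter::scaled`, class scope applies,
// so the parameter type can be written as plain `Id`. The return type comes
// before the declarator-id and therefore has to be qualified.
Counter::Id Counter::scaled(Id factor) const { return value * factor; }
```

With a fresh Counter, the first call to next() yields 1, the second yields 2, and scaled(3) then yields 6.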

[class.mem] says:
-2- A class is considered a completely-defined object type (3.9) (or complete type) at the closing } of the class-specifier. Within the class member-specification, the class is regarded as complete within function bodies, default arguments, and brace-or-equal-initializers for non-static data members (including such things in nested classes). Otherwise it is regarded as incomplete within its own class member-specification.
For the class to be complete within function bodies, in general all declarations need to be parsed first: without parsing every declaration, you can't know whether something not yet parsed would change the meaning. Possibly related to that is [basic.scope.class]/1, which says:
3) If reordering member declarations in a class yields an alternate valid program under (1) and (2), the program is ill-formed, no diagnostic is required.
That means certain declarations could be used without parsing the entire class, because if another later declaration altered the meaning then the program would be ill-formed.
Of course the "as if" rule allows the compiler to choose any implementation as long as the user can't tell the difference, so maybe a compiler could choose to parse function bodies and then parse definitions as needed, but it would be hard to tell what's needed to process the member function definition (consider a function call which might call one of several overloaded functions, possibly involving enable_if-type tricks.)
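The distinction [class.mem] draws between "complete inside function bodies" and "incomplete elsewhere in the member-specification" can be sketched as follows. Node and its members are hypothetical names chosen for the example.

```cpp
#include <cstddef>

struct Node {
    // Inside a member function body the class counts as complete,
    // so sizeof(Node) is fine even though the closing brace is not reached yet.
    static std::size_t size() { return sizeof(Node); }

    // Node self;          // would be rejected: Node is still incomplete here
    Node* next = nullptr;  // a pointer to an incomplete type is allowed
    int payload = 0;
};
```

Calling Node::size() gives the same value as sizeof(Node) evaluated after the class definition.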

This is a draft of the C++ Programming Language Standard:
Programming Language C++ PDF
I think page 220 has some explanations on member functions.

Related

What are the lookup rules when a name occurs before a function's declarator-id?

#include <iostream>
typedef int Name;
Name func(int){
    return 0;
}
int main(){
}
Consider the above code. I can't find a bullet in [basic.lookup.unqual] that explains how Name is looked up in the definition of func. I will cite some potentially relevant rules here:
A name used in global scope, outside of any function, class or user-declared namespace, shall be declared before its use in global scope.
A name used in a user-declared namespace outside of the definition of any function or class shall be declared before its use in that namespace or before its use in a namespace enclosing its namespace.
In the definition of a function that is a member of namespace N, a name used after the function's declarator-id shall be declared before its use in the block in which it is used or in one of its enclosing blocks ([stmt.block]) or shall be declared before its use in namespace N or, if N is a nested namespace, shall be declared before its use in one of N's enclosing namespaces
Please note the emphasized parts. It seems that my case does not satisfy these bullets, because function definitions have the form
function-definition:
attribute-specifier-seq(opt) decl-specifier-seq(opt) declarator virt-specifier-seq(opt) function-body
Let me analyze the first bullet. It says "outside of any function", but according to the function-definition grammar, I think that Name is within the function (definition), so bullet 1 isn't satisfied. Bullet 2 is similar to bullet 1. Bullet 3 talks about a name used after the function's declarator-id, but in my case Name is used before the function's declarator-id. So which rule covers the lookup of the unqualified name Name in this case?
My confusions:
In my example, Name, func, (int) and { return 0; } (the function body) are all parts of that function definition. So:
What is "outside of any function"? For func in my example, where is the region outside of that function?
What is "outside of the definition of any function"? For the definition of func in my example, where is the region outside of that definition?
I think that Name is within the function(definition), bullet 1 isn't satisfied.
Bullet 1 didn't say "function(definition)". It said "outside of any function". It doesn't specify declaration or definition; merely "outside of any function".
Since being inside or outside of a "function" is not a defined concept, it must be read as plain English. Is a function prototype "outside of the function"? Visually speaking, there's nothing special about a function prototype that is suggestive of an inside/outside distinction. By contrast, the function body's block scope does suggest an inside/outside distinction.
The intent of the text of course is quite obvious; it's talking about the function body. The rule equates "function", "class" and "namespace", all of which have a block that defines scoping for names. That is the most logical place for any inside/outside distinction, so it seems pretty obvious that it's saying that the global scope consists of everything that isn't in the scope of a function body, the scope of a class definition, or the scope of a namespace body.
So this can be easily handled with an editorial change.
Note that the committee recognizes the wording of this section (among others in that area) as somewhat defective, and there's a proposal for rewriting it into something more coherent in C++23. The new wording from the proposal completely rewrites this section.
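To make the "before versus after the declarator-id" distinction concrete, here is a small sketch. Widget, get and twice are hypothetical names; only Name and func come from the question.

```cpp
#include <type_traits>

typedef int Name;             // declared at global scope before its use
Name func(int) { return 0; }  // `Name` precedes the declarator-id: found in global scope

struct Widget {
    using Name = long;        // a different `Name`, visible in class scope
    Name get() const;
    Name twice() const;
};

// Before the declarator-id `Widget::get`, class scope is not in effect yet,
// so the return type has to be qualified to mean Widget::Name.
Widget::Name Widget::get() const { return 7; }

// A trailing return type comes after the declarator-id, so plain `Name`
// already refers to Widget::Name here.
auto Widget::twice() const -> Name { return 2 * get(); }

static_assert(std::is_same<decltype(func(0)), int>::value,
              "func returns ::Name, which is int");
static_assert(std::is_same<decltype(Widget().twice()), long>::value,
              "twice returns Widget::Name, which is long");
```

Here func(1) returns 0, get() returns 7 and twice() returns 14.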

Non-overloadable non-inline function definitions in different translation units

Let's say I have 2 TUs with 2 non-inline function definitions with external linkage which differ only in their return types.
Which paragraph(s) does my program violate?
[basic.def.odr]/4 says:
Every program shall contain exactly one definition of every non-inline function or variable that is odr-used in that program outside of a discarded statement; no diagnostic required.
But
This paragraph says "that is odr-used" which may or may not be the case.
How do I tell if I define the same non-inline function in different TUs, after all? [over.dcl]/1 speaks about the same scope.
I believe you're looking for: [basic.link]/9:
Two names that are the same ([basic.pre]) and that are declared in different scopes shall denote the same variable, function, type, template or namespace if
both names have external or module linkage and are declared in declarations attached to the same module, or else both names have internal linkage and are declared in the same translation unit; and
both names refer to members of the same namespace or to members, not by inheritance, of the same class; and
when both names denote functions or function templates, the signatures ([defns.signature], [defns.signature.templ]) are the same.
If multiple declarations of the same name with external linkage would declare the same entity except that they are attached to different modules, the program is ill-formed; no diagnostic is required. [ Note: using-declarations, typedef declarations, and alias-declarations do not declare entities, but merely introduce synonyms. Similarly, using-directives do not declare entities. — end note ]
And [basic.link]/11:
After all adjustments of types (during which typedefs are replaced by their definitions), the types specified by all declarations referring to a given variable or function shall be identical, except that declarations for an array object can specify array types that differ by the presence or absence of a major array bound ([dcl.array]). A violation of this rule on type identity does not require a diagnostic.
And [defns.signature]:
⟨function⟩ name, parameter-type-list ([dcl.fct]), and enclosing namespace (if any)
The return type isn't part of the signature, so you're violating the rule that same signature means same entity.
Generally speaking, all discussions of scope and name lookup in the standard are pretty broken until Davis "The Hero We Don't Deserve" Herring's work goes through.
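The cross-TU violation itself cannot be shown in a single file, but the underlying point (the signature is name, parameter-type-list and enclosing namespace, not the return type) can be sketched within one translation unit. The namespaces a and b and the functions f and g are hypothetical.

```cpp
namespace a { int  f(int x) { return x + 1; } }
namespace b { long f(int x) { return x + 2L; } }  // other namespace: a distinct entity

int g(int x) { return x; }
// long g(int x);                // rejected within one TU: differs only in return type
long g(long x) { return x * 2; } // different parameter-type-list: a valid overload
```

Within one translation unit the compiler diagnoses the commented-out declaration. Spread across two translation units, the same mismatch violates the quoted rules without a diagnostic being required.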

Template alias scope

As per http://en.cppreference.com/w/cpp/language/type_alias, aliases are block-level declarations. It doesn't say anything special about template aliases, so it should be read that template aliases are block-level declarations as well.
However, it is impossible to use template aliases at block level. The errors are different depending on the compiler - while g++ gives a meaningful message, saying that templates are not allowed at block scope, clang is completely cryptic. (example: http://coliru.stacked-crooked.com/a/0f0862dad6f3da61).
Questions I have so far:
Does cppreference fail to specify that template aliases can not be used at block scope? (Or do I need to take a reading course?)
Are the compilers correct in denying template aliases on block level (the feature I find very interesting for my particular coding habits)
If the answer to the second is Yes, what might be the rationale for this? Why would compiler deny me this pure syntax sugar?
An alias template is defined in [temp.alias]:
A template-declaration in which the declaration is an alias-declaration (Clause 7) declares the identifier to
be a alias template. An alias template is a name for a family of types. The name of the alias template is a
template-name.
And if we look at 14.2 [temp] we have
A template-declaration can appear only as a namespace scope or class scope declaration. In a function
template declaration, the last component of the declarator-id shall not be a template-id.
So yes, cppreference is off in saying that it can be declared at block scope, and your compilers are correct. If you click on the link for block declarations, it brings you to a list of declarations; that list includes Template declaration, which in turn has
declaration of a class (including struct and union), a member class or member enumeration type, a function or member function, a static data member at namespace scope, a variable or static data member at class scope, (since C++14) or an alias template (since C++11) It may also define a template specialization.
As for why the standard says that templates can only be declared at namespace scope or class scope, I like James Kanze's answer:
The problem is probably linked to the historical way templates were implemented: early implementation techniques (and some still used today) require all symbols in a template to have external linkage. (Instantiation is done by generating the equivalent code in a separate file.) And names defined inside a function never have linkage, and cannot be referred to outside of the scope in which they were defined.
The compilers are behaving correctly.
Section 14 of the C++14 standard:
A template-declaration can appear only as a namespace scope or class
scope declaration.
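A short sketch of where alias templates may and may not appear, assuming C++11 or later. Vec, Registry, ByName and count_entries are hypothetical names.

```cpp
#include <map>
#include <string>
#include <vector>

template <class T>
using Vec = std::vector<T>;                   // namespace scope: allowed

struct Registry {
    template <class V>
    using ByName = std::map<std::string, V>;  // class scope: allowed
};

int count_entries() {
    // template <class T> using Local = std::vector<T>;  // block scope: rejected
    using IntVec = Vec<int>;  // a non-template alias is fine at block scope
    IntVec v{1, 2, 3};
    Registry::ByName<int> m{{"a", 1}};
    return static_cast<int>(v.size() + m.size());
}
```

count_entries() returns 4 (three vector elements plus one map entry). A common workaround is to declare the alias template at namespace or class scope and introduce a plain alias to one specialization inside the function.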

Why don't methods of structs have to be declared in C++?

Take, for example, the following code:
#include <iostream>
#include <string>
int main()
{
    print("Hello!");
}
void print(std::string s) {
    std::cout << s << std::endl;
}
When trying to build this, I get the following:
program.cpp: In function ‘int main()’:
program.cpp:6:16: error: ‘print’ was not declared in this scope
Which makes sense.
So why can I conduct a similar concept in a struct, but not get yelled at for it?
struct Snake {
    ...
    Snake() {
        ...
        addBlock(Block(...));
    }
    void addBlock(Block block) {
        ...
    }
    void update() {
        ...
    }
} snake1;
Not only do I not get warnings, but the program actually compiles! Without error! Is this just the nature of structs? What's happening here? Clearly addBlock(Block) was called before the method was ever declared.
A struct in C++ is actually a class definition where all its content is public, unless specified otherwise by including a protected: or private: declaration.
When the compiler sees a class or struct, it first digests all the declarations inside the block ({}) before processing the member function bodies.
In the free-function case, the compiler hasn't yet seen a declaration of print when it reaches the call.
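Both situations can be put side by side in a compilable sketch. The free function needs a declaration before the call, while the member function does not. The names greet and run, and the layout of Block, are assumptions made for this example, since the question elides those details.

```cpp
#include <string>
#include <vector>

// Free function: a declaration has to be visible before the call in run().
std::string greet(const std::string& who);

std::string run() { return greet("Hello"); }

std::string greet(const std::string& who) { return who + "!"; }

struct Block { int x; int y; };  // hypothetical layout

struct Snake {
    // addBlock is declared further down, yet the constructor body may call it.
    Snake() { addBlock(Block{0, 0}); }
    void addBlock(Block b) { blocks.push_back(b); }
    std::vector<Block> blocks;
};
```

run() returns "Hello!", and a default-constructed Snake holds exactly one block.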
C++ standard 3.4.1:
.4:
A name used in global scope, outside of any function, class or
user-declared namespace, shall be declared before its use in global
scope.
This is why global variables and functions cannot be used before they have been declared.
.5:
A name used in a user-declared namespace outside of the definition of
any function or class shall be declared before its use in that
namespace or before its use in a namespace enclosing its namespace.
This is the same rule again: since paragraph .4 explicitly restricted itself to the global scope, this paragraph adds that the same holds within namespaces.
.7:
A name used in the definition of a class X outside of a member function body or nested class definition (footnote 29) shall be declared in one of the following ways:
- before its use in class X or be a member of a base class of X (10.2), or
- if X is a nested class of class Y (9.7), before the definition of X in Y, or shall be a member of a base class of Y (this lookup applies in turn to Y’s enclosing classes, starting with the innermost enclosing class), (footnote 30) or
- if X is a local class (9.8) or is a nested class of a local class, before the definition of class X in a block enclosing the definition of class X, or
- if X is a member of namespace N, or is a nested class of a class that is a member of N, or is a local class or a nested class within a local class of a function that is a member of N, before the definition of class X in namespace N or in one of N’s enclosing namespaces.
I think this covers all the code in a class that is not executable code (i.e. the declarative parts).
and finally the interesting part:
3.3.7 Class scope [basic.scope.class]
1 The following rules describe the scope of names declared in classes.
1) The potential scope of a name declared in a class consists not only of the declarative region following the name’s point of declaration, but also of all function bodies, brace-or-equal-initializers of non-static data members, and default arguments in that class (including such things in nested classes).
2) A name N used in a class S shall refer to the same declaration in its context and when re-evaluated in the completed scope of S. No diagnostic is required for a violation of this rule.
3) If reordering member declarations in a class yields an alternate valid program under (1) and (2), the program is ill-formed, no diagnostic is required.
In particular, the last point defines by negation that any ordering is possible: if reordering would change the result of lookup, the program is ill-formed. It is a negative way of saying "you can reorder the members and it doesn't change anything".
Effectively, within a class, names are looked up in a two-phase fashion: the declarations first, then the function bodies.
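The difference between the declarative part of a class (3.4.1/7, where order still matters) and function bodies (3.3.7, where it does not) can be sketched like this. Grid, Width, Offset and last are hypothetical names.

```cpp
struct Grid {
    // int cells[Width];             // rejected: outside a function body,
    //                               // Width has to be declared before this use
    static constexpr int Width = 4;  // declared first ...
    int cells[Width] = {};           // ... so this member declaration is fine

    // Inside a function body the whole class is in scope, so Offset,
    // declared below, can already be used.
    int last() const { return cells[Width - 1] + Offset; }

    static constexpr int Offset = 10;
};
```

For a fresh Grid, last() returns 10, because all cells start at zero.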
"why can I conduct a similar concept in a struct, but not get yelled at for it?"
In a struct or class definition you're presenting the public interface to a class and it's much easier to understand, search and maintain/update that API if it's presented in:
a predictable order, with
minimal clutter.
For predictable order, people have their own styles and there's a bit of "art" involved, but for example I use each access specifier at most once and always public before protected before private, then within those I normally put typedefs, const data, constructors, destructors, mutating/non-const functions, const functions, statics, friends....
To minimise clutter, if a function is defined in the class, it might as well be without a prior declaration. Having both tends only to obfuscate the interface.
This is different from functions that aren't members of a class - where people who like top-down programming do use function declarations and hide the definitions later in the file - in that:
people who prefer a bottom-up programming style won't appreciate being forced to either have separate declarations in classes or abandon the oft-conflicting practice of grouping by access specifier
Classes are statistically more likely to have many very short functions, largely because they provide encapsulation and wrap a lot of trivial data member accesses or provide operator overloading, casting operators, implicit constructors and other convenience features that aren't relevant to non-OO, non-member functions. That makes a constant forced separation of declarations and definitions more painful for many classes (not so much in the public interfaces where definitions might be in a separate file, but definitely for e.g. classes in anonymous namespaces supporting the current translation unit).
Best practice is for classes not to cram in a wildly extensive interface... you generally want a functional core and then some discretionary convenience functions, after which it's worth considering what can be added as non-member functions. std::string is often claimed to have too many member functions, though I personally think it's quite reasonable. Still, this also differs from a header file declaring a library interface, where exhaustive functionality can be expected to be crammed together, making a separation of even inline implementation more desirable.

Why is the 'Declare before use' rule not required inside a class? [duplicate]

I'm wondering why the declare-before-use rule of C++ doesn't hold inside a class.
Look at this example:
#ifdef BASE
struct Base {
#endif
    struct B;
    struct A {
        B *b;
        A(){ b->foo(); }
    };
    struct B {
        void foo() {}
    };
#ifdef BASE
};
#endif
int main( ) { return 0; }
If BASE is defined, the code is valid.
Within A's constructor I can use B::foo, which hasn't been declared yet.
Why does this work and, above all, why does it only work inside a class?
Well, to be pedantic there's no "declare before use rule" in C++. There are rules of name lookup, which are pretty complicated, but which can be (and often are) roughly simplified into the generic "declare before use rule" with a number of exceptions. (In a way, the situation is similar to "operator precedence and associativity" rules. While the language specification has no such concepts, we often use them in practice, even though they are not entirely accurate.)
This is actually one of those exceptions. Member function definitions in C++ are specifically and intentionally excluded from that "declare before use rule" in a sense that name lookup from the bodies of these members is performed as if they are defined after the class definition.
The language specification states that in 3.4.1/8 (and footnote 30), although it uses a different wording. It says that during the name lookup from the member function definition, the entire class definition is inspected, not just the portion above the member function definition. Footnote 30 additionally states though that the lookup rules are the same for functions defined inside the class definition or outside the class definition (which is pretty much what I said above).
Your example is a bit non-trivial. It raises the immediate question about member function definitions in nested classes: should they be interpreted as if they are defined after the definition of the outermost enclosing class? The answer is yes. 3.4.1/8 covers this situation as well.
The book "The Design and Evolution of C++" describes the reasoning behind these decisions.
That's because member functions are compiled only after the whole class definition has been parsed by the compiler, even when the function definition is written inline, whereas regular functions are compiled immediately after being read. The C++ standard requires this behaviour.
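The nested-class case from the question can be reduced to the following sketch, which corresponds to the variant with BASE defined. The member call and the return value of foo are hypothetical additions so that the effect is observable.

```cpp
struct Base {
    struct B;  // forward declaration: B is incomplete at this point

    struct A {
        B* b = nullptr;
        // The body is treated as if it appeared after Base's closing brace,
        // where B is complete, so calling b->foo() is accepted.
        int call() { return b->foo(); }
    };

    struct B {
        int foo() { return 42; }
    };
};
```

Pointing an A's b at a B object and calling call() returns 42. Without the enclosing Base, A's body would be processed at the end of A, where B is still incomplete, and the call would be rejected.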
I don't know the chapter and verse of the standard on this.
But if you would apply the "declare before use" rule strictly within a class, you would not be able to declare member variables at the bottom of the class declaration either. You would have to declare them first, in order to use them e.g. in a constructor initialization list.
I could imagine the "declare before use" rule has been relaxed a bit within the class declaration to allow for "cleaner" overall layout.
Just guesswork, as I said.
The most stubborn problems in the definition of C++ relate to name lookup: exactly which uses of a name refer to which declarations? Here, I'll describe just one kind of lookup problem: the ones that relate to order dependencies between class member declarations. [...]
Difficulties arise because of conflicts between goals:
We want to be able to do syntax analysis reading the source text once only.
Reordering the members of a class should not change the meaning of the class.
A member function body explicitly written inline should mean the same thing when written out of line.
Names from an outer scope should be usable from an inner scope (in the same way as they are in C).
The rules for name lookup should be independent of what a name refers to.
If all of these hold, the language will be reasonably fast to parse, and users won't have to worry about these rules because the compiler will catch the ambiguous and near ambiguous cases. The current rules come very close to this ideal.
[The Design And Evolution Of C++, section 6.3.1 called Lookup Issues on page 138]
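The goal that an inline member function body should mean the same as one written out of line can be checked with a small sketch. InlineVersion, OutOfLineVersion and limit are hypothetical names.

```cpp
int limit = 100;  // global name that a later member declaration shadows

struct InlineVersion {
    // Finds the member InlineVersion::limit, not ::limit,
    // although the member is declared after this body.
    int get() const { return limit; }
    int limit = 1;
};

struct OutOfLineVersion {
    int get() const;
    int limit = 1;
};

// Written out of line, the body has the same meaning.
int OutOfLineVersion::get() const { return limit; }
```

Both get() functions return 1, the value of the member, while the global limit keeps its value of 100.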