Can user edit invalidate code before current token? - c++

I'm working on a source editor for C++ and came up with simple optimization that I do not need to invalidate (e.g. highlighting, rebuild AST, do static analysis) the code before the currently edited statement (basically, before the previous semicolon/closing brace) but I am not sure if this is always true for C++.
E.g. in Java it is possible to call functions declared/defined after the edit location. Hence, if the user adds an argument to the function then error marker should be placed in the code before the edit location. In C++ function should be declared before it is used (and if the declaration does not match definition the error will be on definition).

Member function bodies defined inline in the class will be conceptually (and actually, in the compilers I know) parsed at the end of the class and can thus access members of the class declared after them.

Templates are compiled in two phases. The first phase is where they're defined, the second where they're instantiated. The current compilers can blame you for template errors at the point of instantiation, when substituting the actual arguments leads to an error in the template.
It would be entirely reasonable for your editor to follow the same logic. If the template definition looks good when you see it, it passes the first phase. When the code being edited instantiates a template, re-check, and blame any errors in the second phase on the instantiation.
This hints at a more fundamental problem. Where do you say an error occurred? The C++ standard doesn't care. As far as it's concerned, "syntax error somewhere" is a sufficient diagnostic. So, if there's an inline method trying to access a non-existing member this->a, you can claim there's an error in the method. But with equal validity, you can claim at the final }; that the class failed to define the necessary member a.
The underlying cause of all these errors is that two pieces of code must agree on something. When they don't, you can choose who to blame. For your editor, you could blame the fragment which came last. It's just a matter of getting the diagnostic wording right.

In general, outside a template definition, the rules for name lookup in C++ require a name to be declared before the point at which it is used. Your idea would therefore work in most cases. Unfortunately - as pointed out by SebastianRedl - there is a special case in which this rule of thumb does not apply.
Within a class definition, the declarations of all members of the class (and its enclosing class(es)) are visible during name lookup within the body of a member function (including a ctor-initializer-list or exception-specification), or in a default argument of a member function.
An illustration:
struct A
{
struct B
{
static void f(int i = M()) // uses 'A::M' declared later
{
A::f(); // calls A::f(int) declared later
}
};
static void f(void*)
{
f(); // calls A::f(int) declared later
}
static void f(int i = M()) // uses 'M' declared later
{
}
typedef int M;
};
If the token that was modified occurs within a member function-body or a default argument, you would have to reparse all classes that enclose the token.
From C++ Working Draft Standard N3337:
3.4.1 Unqualified name lookup [basic.lookup.unqual]
A name used in the definition of a member function (9.3) of class X following the function’s declarator-id
or in the brace-or-equal-initializer of a non-static data member (9.2) of class X shall be declared in one of
the following ways:
— before its use in the block in which it is used or in an enclosing block (6.3), or
— shall be a member of class X or be a member of a base class of X (10.2), or
— if X is a nested class of class Y (9.7), shall be a member of Y, or shall be a member of a base class of Y
(this lookup applies in turn to Y’s enclosing classes, starting with the innermost enclosing class), or
— if X is a local class (9.8) or is a nested class of a local class, before the definition of class X in a block
enclosing the definition of class X, or
— if X is a member of namespace N, or is a nested class of a class that is a member of N, or is a local class
or a nested class within a local class of a function that is a member of N, before the use of the name,
in namespace N or in one of N ’s enclosing namespaces.

Related

How does the lookup rule apply to a name that is used in a default member initializer within a class template?

#include <iostream>
template<class T>
struct A{
int c = T::a;
};
struct B{
A<B> cc;
static const int a = 0;
};
int main(){
}
GCC and Clang both accept the above example. Regarding how the lookup rule applies to a name in a class template specialization, which might be a member of a class that causes the implicit instantiation of the class template specialization is not clear in c++20 or before. Fortunately, it's clear in the current draft.
[basic.lookup#class.member.lookup-3]
The declaration set is the result of a single search in the scope of C for N from immediately after the class-specifier of C if P is in a complete-class context of C or from P otherwise. If the resulting declaration set is not empty, the subobject set contains C itself, and calculation is complete.
It seems that the rule can be used to interpret most cases that I said in the above, however, consider the above example, it's a bit special.
The definition of non-static member cc would cause the implicit instantiation of specialization A<B>, at this point, it's not the complete context of class B. Hence, the subsequent lookup for a name in the scope B should be from this point. If T::a is used as the type-specifier of a non-static member of class template A, It's 100% sure that the compiler would report an error that says no identifier a can be found in the scope of B.
However, except for the following rule
temp.res#general-1
If the name is dependent (as specified in [temp.dep]), it is looked up for each specialization (after substitution) because the lookup depends on a template parameter.
I cannot find any wording in the standard that states whether an implicit instantiation of A can cause the lookup rule to be applied to the name within the default member initializer or is not at that point.
At first glance of the above rule, It should mean that the lookup should be performed for the name when the enclosing class template specialization is instantiating, then a cannot be found in the scope of B since it's not deemed as a complete class at that point and only these declarations that are reachable at point P can be found.
Obviously, the behavior of GCC and Clang consider T is a complete type such that a can be found in T. In other words, it seems that these compilers do not immediately perform the lookup for T::a when A is instantiating. If I miss a certain special rule, please point them out.

Using "using" twice interpreted differently by different compilers

Consider the following code:
struct A {
int propose();
};
struct A1 : A {
int propose(int);
using A::propose;
};
struct B1 : A1 {
protected:
using A1::propose;
public:
using A::propose;
};
int main() {
B1().propose();
}
Let's compile this: g++ -std=c++11 main.cpp.
I'm getting the following compiler error using GNU 4.8.1:
main.cpp: In function 'int main()':
main.cpp:2:9: error: 'int A::propose()' is inaccessible
int propose();
^
main.cpp:18:18: error: within this context
B1().propose();
However, this code compiles in AppleClang 6.0.0.6000056.
I understand that there is no need for the using in B1, (in my code was necessary, but I had 1 using too much by mistake). In any case, why Clang compiles it? Is this expected?
In [namespace.udecl], we have:
When a using-declaration brings names from a base class into a derived class scope, member functions and
member function templates in the derived class override and/or hide member functions and member function
templates with the same name, parameter-type-list (8.3.5), cv-qualification, and ref-qualifier (if any) in a
base class (rather than conflicting).
The standard explicitly says that names brought in will not conflict with names in a base class. But it doesn't say anything about bringing in conflicting names.
The section also says:
A using-declaration is a declaration and can therefore be used repeatedly where (and only where) multiple
declarations are allowed. [ Example:
struct B {
int i;
};
struct X : B {
using B::i;
using B::i; // error: double member declaration
};
—end example ]
And interestingly, in the following example it's GCC that happily compiles it (and prints A) while Clang allows the construction of a C but rejects the call to foo as ambiguous:
struct A {
void foo() { std::cout << "A\n"; }
};
struct B {
void foo() { std::cout << "B\n"; }
};
struct C : A, B {
using A::foo;
using B::foo;
};
int main()
{
C{}.foo();
return 0;
}
So the short answer is - I suspect this is underspecified in the standard and that both compilers are doing acceptable things. I would just avoid writing this sort of code for general sanity.
The declaration is legal.
Calling it is legal and should work anywhere, and it can only be called from the class and derived classes, and it can be called from within any class. You'll note that this makes little sense.
There are no rules that ban that construct in declarations (importing the name twice from two different base classes with the same signature), and it is even used in "real" code where the derived class goes and hides the name after they are imported.
If you don't hide it, you are in the strange situation where the same function A::propose is both protected and public at the same time, as it is named twice (legally) in the same scope with different access control. This is ... unusual.
If you are within a class, a sub-clause says you can use it:
[class.access.base]/5.1
A member m is accessible at the point R when named in class N if — (5.1) m as a member of N is public
and propose is clearly public. (it is also protected but we don't have to keep reading for that case!)
Elsewhere, we have a contradiction. You are told you can use it everywhere without restriction [class.access]/1(3). And you are told that you can only use it in certain circumstances [class.access]/1(2).
I am uncertain how to resolve that ambiguity.
The rest of the logic train:
In [namespace.udecl]/10 we have:
A using-declaration is a declaration and can therefore be used repeatedly where (and only where) multiple declarations are allowed.
And [namespace.udecl]/13:
Since a using-declaration is a declaration, the restrictions on declarations of the same name in the same declarative region
so each of those using X::propose; are declarations.
[basic.scope] has no applicable restrictions on two functions of the same name in a scope, other than [basic.scope.class]/1(3) which states that if reordering of declarations changes the program, the program is ill-formed. So we cannot say that the later one wins.
Two declarations of member functions in the same scope are legal under [basic.scope]. However, under [over], there are restrictions on two member functions with the same name.
[over]/1 states:
When two or more different declarations are specified for a single name in the same scope, that name is said to be overloaded
And there are some restrictions on overloading. This is what usually prevents
struct foo {
int method();
int method();
};
from being legal. However:
[over.load]/1 states:
Not all function declarations can be overloaded. Those that cannot be overloaded are specified here. A program is ill-formed if it contains two such non-overloadable declarations in the same scope. [Note: This
restriction applies to explicit declarations in a scope, and between such declarations and declarations made through a using-declaration (7.3.3). It does not apply to sets of functions fabricated as a result of name lookup (e.g., because of using-directives) or overload resolution (e.g., for operator functions). —end note
the note explicitly permits symbols introduced via two using-declarations from being considered by the overloading restrictions! The rules only apply to two explicit declarations within the scope, or between an explicit declaration within the scope and a using declaration.
There are zero restrictions on two using-declarations. They can have the same name, and their signatures can conflict as much as you'd like.
This is useful, because usually you can go and then hide their declaration (with a declaration in the derived class), and nothing goes wrong [namespace.udecl]/15:
When a using-declaration brings names from a base class into a derived class scope, member functions and member function templates in the derived class override and/or hide member functions and member function templates with the same name, parameter-type-list (8.3.5), cv-qualification, and ref-qualifier (if any) in a base class (rather than conflicting).
However, this is not done here. We then call the method. Overload resolution occurs.
See [namespace.udecl]/16:
For the purpose of overload resolution, the functions which are introduced by a
using-declaration into a derived class will be treated as though they were members of the derived class. In particular, the implicit this parameter shall be treated as if it were a pointer to the derived class rather than to the base class. This has no effect on the type of the function, and in all other respects the function remains a member of
the base class.
So we have to treat them as if they are members of the derived class for the purpose of overload resolution. But there are still 3 declarations here:
protected:
int A::propose(); // X
int A1::propose(int); // Y
public:
int A::propose(); // Z
Thus the call to B1().propose() considers all 3 declarations. Both X and Z are equal. They, however, refer to the same function, and overload resolution states there is an ambiguity if two different functions are selected. So the result is not ambiguous. There may be access control violations, or not, depending on how you read the standard.
[over.match]/3
If a best viable function exists and is unique, overload resolution succeeds and produces it as the result. Otherwise overload resolution fails and the invocation is ill-formed. When overload resolution succeeds, and the best viable function is not accessible (Clause 11) in the context in which it is used, the program is ill-formed.

Why calling a non-member function with the same name as a member function generates an error

I have next code:
void f(int){}
struct A
{
void f()
{
f(1);
}
};
This code is not well-formed with the error message (GCC): error: no matching function for call to ‘A::f(int)’ or (clang) Too many arguments to function call, expected 0, have 1; did you mean '::f'?
Why do I need to use :: to call the non-member function with the same name as the member function, but with different signature? What is the motivation for this requirement?
I think the compiler should be able to figure it out I want to call the non-member function as the signature is different (clang even puts that in the error message!).
Please don't mark this as duplicate - it is a different question from this Calling in C++ a non member function inside a class with a method with the same
Why do I need to use :: to call the non-member function with the same name as the member function, but with different signature
Because those are the rules. Names in a nested scope hide entities with the same name in a wider scope.
What is the motivation for this requirement?
Consider the case where a member function calls another member with a signature that doesn't quite match:
struct A {
void f(double);
void g() {f(42);} // requires int->double conversion
};
Now suppose someone adds an unrelated function in the surrounding namespace
void f(int);
If this were included in the set of overloads within the scope of A, then suddenly the behaviour of A::g would change: it would call this instead of A::f. Restricting the overload set to names in the narrowest available scope prevents this kind of unexpected breakage.
As you (and your helpful compiler) say, the outer name is still available (with qualification) if you need it.
The compiler performs unqualified name lookup, for f, specified in §3.4.1 [basic.lookup.unqual] (thankfully, there's no ADL here):
1 In all the cases listed in 3.4.1, the scopes are searched for a
declaration in the order listed in each of the respective categories;
name lookup ends as soon as a declaration is found for the name. If no
declaration is found, the program is ill-formed.
8 For the members of a class X, a name used in a member function
body, in a default argument, in an exception-specification, in the
brace-or-equal-initializer of a non-static data member (9.2), or in the definition of a class member outside of the definition of X,
following the member’s declarator-id, shall be declared in one of
the following ways:
before its use in the block in which it is used or in an enclosing block (6.3), or
shall be a member of class X or be a member of a base class of X (10.2), or
if X is a nested class of class Y (9.7), shall be a member of Y, or shall be a member of a base class of Y (this lookup applies
in turn to Y’s enclosing classes, starting with the innermost
enclosing class), or
if X is a local class (9.8) or is a nested class of a local class, before the definition of class X in a block enclosing the definition
of class X, or
if X is a member of namespace N, or is a nested class of a class that is a member of N, or is a local class or a nested class within
a local class of a function that is a member of N, before the use of
the name, in namespace N or in one of N’s enclosing namespaces.
Name lookup stops as soon as a declaration is found. So once it finds the member f() at the second bullet point it stops and never searches elsewhere.
Rejection of nonviable functions are done after name lookup, at overload resolution.
Why do I need to use :: to call the non-member function with the same name as the member function, but with different signature? What is the motivation for this requirement?
That is the whole point of having namespaces. A local (closer-scoped) name is preferred and more visible over a global name. Since a struct is again a scope, its f is shadowing the visibility of ::f. When you've to have the global one, you've to say you do. Why?
This is provided as a feature to make sure you can peacefully call functions you defined assuming they would get called, and when you need one from a different namespace, say the standard library, you'd state that explicitly, like std::. It's just a clean form of disambiguation, without leaving room for chance to play its part.
To understand the reason of your error and why you need to explicitly use the ::f() syntax, you may want to consider some aspects of the C++ compiler process:
The first thing the compiler does is name lookup.
Unqualified name lookup starts from the current scope, and then moves outwards; it stops as soon as it finds a declaration for the name of the function, even if this function will be later determined to not be a viable candidate for the function call.
Then, overload resolution is executed over the set of the functions that name lookup found.
(And, finally, access check is executed on the function that overload resolution picked up.)

Why VS 2008 compile with no warning with an erroneous template logic?

I have a simple template example that is as follow:
template<class T> class A {
friend int f(T);
}
int main(){
A<int> a;
return 0;
}
That code compile and execute without warning in VS2008 (except for the unused variable). I believe there should be a problem since we obtain many versions of a non-template function in the same class with only one definition. Did I miss something?
Why should this code produce an error? For every T you instantiate A with, a new function will be declared and friended. There will never be two identical functions, since you can't instantiate a template twice for the same type (you will just reuse the old instantiation).
Also, even if it was somehow possible to generate two equal declarations, there would be no ambiguity, since the functions are first declared inside the class. As such, they can never be found by anything other than argument dependant lookup. (Basically, those functions are useless as they cannot be called)
§7.3.1.2 [namespace.memdef] p3
[...] If a friend declaration in a nonlocal class first declares a class or function the friend class or function is a member of the innermost enclosing namespace. The name of the friend is not found by unqualified lookup or by qualified lookup until a matching declaration is provided in that namespace scope (either before or after the class definition granting friendship). [...]
Also, see this.
According to the C++ standard, the degree of syntax checking for unused template functions is up to the implementation. The compiler does not do any semantic checking—for example, symbols are not looked up.

Use of qualified name in function parameter

According to the C++ Standard, function parameter's name is parsed by a declarator-id, and a declarator-id can also be a qualified name. That means, the following code is perfectly valid (if I've understood the relevant sections from the Standard correctly):
template<class T>
struct Sample
{
int fun(int T::count); //T::count is qualified variable name
};
My question basically is, why would anyone write such code? In what situations, the use of qualified name (in function parameter-list) can be advantageous?
EDIT:
It seems I understood the sections incorrectly. Instead of the above code, we can probably write the following code (as per the C++ standard):
template<class T>
struct sample
{
void fun(int arr[T::count]);
};
gcc-4.3.4 compiles it perfectly. But then, I'm not totally satisfied, because T::count is not a parameter anymore (I guess).
It's invalid. The syntax allows arbitrary declarators, but 8.3.5p8 says
An identifier can optionally be
provided as a parameter name; if
present in a function definition
(8.4), it names a parameter (sometimes
called “formal argument”)
Edit Another quote which syntactically constraints declarators (8.3p1, [dcl.meaning]):
Each declarator contains exactly one
declarator-id; it names the identifier
that is declared. The id-expression of
a declarator-id shall be a simple
identifier except for the declaration
of some special functions (12.3, 12.4,
13.5) and for the declaration of template specializations or partial
specializations (14.7). A declarator-id
shall not be qualified except for the
definition of a member function (9.3)
or static data member (9.4) or nested
class (9.7) outside of its class, the
definition or explicit instantiation
of a function, variable or class
member of a namespace outside of its
namespace, or the definition of a
previously declared explicit
specialization outside of its
namespace, or the declaration of a
friend function that is a member of
another class or namespace (11.4).
So in a parameter declaration, you must not use qualified names.
Edit: In the edited form, the function parameter type decays to an int*, even before a test is being made whether T::count actually exists and is an integer constant. If you want an example where a qualified name in such a signature would do something meaningful, consider
template<class T>
struct sample
{
void fun(int S=T::count);
};
When fun gets called without parameter, the compiler needs to determine the default argument, which then fails if T does not have a count member, or that cannot be converted to int.
As far as I understand your code is ill formed because
$8.3/1 : When the declarator-id is qualified, the declaration shall refer to a previously declared member of the class or namespace to which the qualifier refers, and the member shall not have been introduced by a using-declaration in the scope of the class or namespace nominated by the nested-name-specifier of the declarator-id. [Note: if the qualifier is the global ::scope resolution operator, the declarator-id refers to a name declared in the global namespace scope. ]
P.S: I am not 100% sure. Please correct me if I am wrong. :)
In what situations, the use of qualified name (in function parameter-list) can be advantageous?
Read Items 31 and 32 from Exceptional C++ by Herb Sutter. Both the items deal with Koenig lookup and the Interface principle.
It seems I understood the sections incorrectly. Instead of that code, we can probably write the following code (as per the C++ standard):
template<class T>
struct sample
{
void fun(int arr[T::count]);
};
gcc-4.3.4 compiles it perfectly. But then, I'm not totally satisfied, because T::count is not a parameter anymore (I guess).