Using clang++ version 11 and C++ 17 with the compiler flag -Wall, clang will normally complain if you use a variable before it is initialized. However, it does not detect the following case:
struct Bar
{
bool b1;
};
class Foo {
public:
Foo()
: b2(Bar{b2}.b1) // We are using b2 here before it is initialized, but clang doesn't complain
{ }
bool b2;
};
This is the simplest example that I can create. It seems to only happen when initializing a member variable in the constructor (b2 in this case) with a member variable (b1) of an object (Bar). Does anyone know why clang fails to detect the problem here?
I recognize that this is a contrived example, but it actually caused a problem for me and I'd like to understand it.
It is not possible to detect if you used a member variable before it is initialized in the general case. Doing so for general programs violates Rice's theorem.
Compilers do not try.
Instead, they have some simple and cheap heuristics that catch common cases.
You cannot rely on your compiler to detect every case where you use uninitialized variables.
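To illustrate why a general check is out of reach, here is a minimal sketch (the function name read_if_requested and its parameters are made up for this illustration). Whether x is read before it is written depends entirely on run-time arguments, so a compiler can at best warn that it may be uninitialized:

```cpp
// Hypothetical illustration: whether the read of x is valid depends on
// values that are only known at run time.
int read_if_requested(bool initialize, bool read)
{
    int x;                 // deliberately left uninitialized
    if (initialize)
        x = 1;             // x gets a value only on this path
    if (read)
        return x;          // fine if initialize was true, undefined behavior otherwise
    return 0;              // x is never read on this path
}
```

Only the call read_if_requested(false, true) is problematic, and nothing in the function itself tells the compiler whether such a call ever happens.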
In this particular case, you pass b2 to another class before it is initialized, but only use it to initialize a temporary. That temporary is then used to initialize b2 itself, the very variable that was uninitialized to begin with.
If that is your simplest case, clang is doing a pretty good job. Compilers tend to be worse at this when you use a variable as part of its own initialization statement.
Here is an even simpler case:
class Foo {
public:
Foo()
: b2((bool const&)b2) // We are using b2 here before it is initialized, but clang doesn't complain
{ }
bool b2;
};
Another case:
struct Bob {
bool b;
operator bool() const{ return b; }
};
class Foo {
public:
Foo()
: b2(Bob{b2}) // We are using b2 here before it is initialized, but clang doesn't complain
{ }
bool b2;
};
And another:
Bob bob = {Bob{bob.b}.b};
I can go on.
Clang does not claim to detect all uninitialized variable uses. So it failing to detect one is not a "bug". Rather, getting them to detect another uninitialized case is a new feature.
Related
The clang-tidy static analyzer detects uses of variables after being moved.
#include <memory>
#include <utility>
class a_class {
std::unique_ptr<int> p_;
public:
auto p() -> auto& {return p_;}
void f() const {}
};
int main() {
auto aa = a_class{};
[[maybe_unused]] auto bb = std::move(aa);
aa.f();
}
error: Method called on moved-from object 'aa' [clang-analyzer-cplusplus.Move,-warnings-as-errors]
This is great!
How can I make the compiler (Clang or GCC) detect the same issue too? Either by activating some warning option or by some (non-standard?) attribute?
I tried using -Wmove in clang and the [[consumed]] attribute but they didn't help.
Perhaps I used them incorrectly.
The code is here: https://godbolt.org/z/18hr4vn7x (the lower panel is clang-tidy and the mid panel on the right is the [empty] compiler output)
Is there a chance a compiler will warn about this or it is just too costly for the compiler to check for this pattern?
I found one way to do it, using attributes in Clang.
(A GCC or a more standard solution is still welcome.)
needs clang 6 or higher
mark the class as "consumable"
mark the method(s) "callable-when-unconsumed" (not sure how to make this the default)
#include <memory>
class [[clang::consumable(unconsumed)]] a_class {
std::unique_ptr<int> p_;
public:
[[clang::callable_when(unconsumed)]]
void f() {}
// private: [[clang::set_typestate(consumed)]] void invalidate() {} // not needed but good to know
};
https://godbolt.org/z/45q8vzdnc
The recipe is simplified from https://awesomekling.github.io/Catching-use-after-move-bugs-with-Clang-consumed-annotations/ .
I couldn't find detailed documentation on how to use these features.
It is simplified because:
a) it seems that a moved-from object of a "consumable" class becomes "consumed" by default, so there is no need to write a special invalidating function (no need for [[clang::set_typestate(consumed)]]);
b) constructors seem to leave the object in an unconsumed state by default (no need for [[clang::return_typestate(unconsumed)]]).
If a compiler isn't built with a setting to do this, then you can't make it do this. Use-after-move is a legitimate thing in C++, so no compiler is obligated to consider it an error.
These kinds of things are what static analyzers are for.
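A minimal sketch of why use-after-move cannot simply be rejected (the function name reuse_after_move is invented for this example, and C++14 is assumed for std::make_unique). The standard guarantees that a moved-from std::unique_ptr is null, and assigning a new value to a moved-from object is perfectly valid:

```cpp
#include <memory>
#include <utility>

// Hypothetical helper: moves from a unique_ptr, then legitimately reuses it.
int reuse_after_move()
{
    auto p = std::make_unique<int>(1);
    auto q = std::move(p);          // p is now moved-from
    if (p != nullptr)               // the standard guarantees p is null here
        return -1;
    p = std::make_unique<int>(2);   // reassigning a moved-from object is valid
    return *p + *q;                 // 2 + 1
}
```

A compiler that treated every use of p after the std::move as an error would reject this well-defined code, which is why the check is left to opt-in annotations and static analyzers.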
So, I just got through with a grueling multi-hour debug session of a large server application. The error eventually came down to a barely noticeable typo in a constructor. Basically, it was something like:
template <class T>
class request_handler
{
public:
request_handler(T& request, Log& error_log)
: m_request(m_request), m_error_log(error_log)
{
/*... some code ... */
}
...
};
See the bug? Well, I didn't. The problem is a small typo in the initializer list: m_request(m_request) is assigning an uninitialized reference to itself. Obviously, it's supposed to read m_request(request).
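For reference, a self-contained sketch of the intended constructor. Log is reduced to an empty placeholder type here, and the accessor request() is added only so the result can be inspected; neither is taken from the real code base.

```cpp
struct Log {};  // placeholder for the real logging class

template <class T>
class request_handler
{
public:
    request_handler(T& request, Log& error_log)
        : m_request(request),      // bind to the parameter, not to the member itself
          m_error_log(error_log)
    {
    }

    T& request() { return m_request; }  // accessor added for illustration only

private:
    T& m_request;
    Log& m_error_log;
};
```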
Now, the member variable m_request is of type T&. So - is there some reason the compiler didn't warn me that I was using an uninitialized variable here?
Using GCC 4.6 with the -Wall flag, if I say:
int x;
x = x;
...it will issue a warning: warning: ‘x’ is used uninitialized in this function [-Wuninitialized]
So, why didn't the compiler warn me when I assigned m_request to itself: essentially assigning an uninitialized reference to itself? It would have saved me hours of annoyance.
Annoying bug to track down. It turns out, you don't even need templates to silently fail on this one. This'll do the trick:
class C {
int a, b;
public:
C(int t, int z) : a(a), b(z) { };
};
Clang warns on this with -Wuninitialized.
Good news for gcc folks: according to gnu's bugzilla, gcc 4.7.0 has fixed this.
Update
On gcc 4.7.0, compile with -Wall (which enables -Wuninitialized) to get this warning (verified by sbellef):
tst.cc: In constructor ‘C::C(int, int)’: tst.cc:4:9: warning: ‘C::a’ is initialized with itself [-Wuninitialized]
I like to use the trick of using the same name for the members as the constructor parameters.
template <class T>
class request_handler
{
public:
request_handler(T& request, Log& error_log)
: request(request), error_log(error_log)
{
/*... some code ... */
}
private:
T& request;
Log& error_log;
};
This will always prevent the error. You have to be careful, though: inside the constructor body, request refers to the parameter, not the member. This of course doesn't matter for simple types such as references, but I don't recommend it for classes.
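A small sketch of the pitfall mentioned above, using a made-up counter class. In the mem-initializer list, count(count) initializes the member from the parameter, but inside the body the unqualified name is the parameter, so this-> is needed to reach the member:

```cpp
// Hypothetical class used only to show the name lookup rules.
struct counter
{
    int count;

    explicit counter(int count)
        : count(count)        // member count is initialized from parameter count
    {
        count += 100;         // modifies the parameter only; the member is unchanged
        this->count += 1;     // this-> reaches the member
    }
};
```

Constructing counter c(5) leaves c.count at 6: the += 100 is applied to the parameter and silently discarded.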
Consider the following pair of mutually referencing types:
struct A;
struct B { A& a; };
struct A { B& b; };
This can be initialized with aggregate initialization in GCC, Clang, Intel, MSVC, but not SunPro which insists that user-defined ctors are required.
struct {A first; B second;} pair = {pair.second, pair.first};
Is this initialization legal?
slightly more elaborate demo: http://ideone.com/P4XFw
Now, heeding Sun's warning, what about classes with user-defined constructors? The following works in GCC, clang, Intel, SunPro, and MSVC, but is it legal?
struct A;
struct B { A& ref; B(A& a) : ref(a) {} };
struct A { B& ref; A(B& b) : ref(b) {} };
struct {B first; A second;} pair = {pair.second, pair.first};
demo: http://ideone.com/QQEpA
And finally, what if the container is not trivial either, e.g. the following (works in G++, Intel, and Clang (with warnings), but not in MSVC ("pair" unknown in initializer) or SunPro ("pair is not a structure")):
std::pair<A, B> pair(pair.second, pair.first);
From what I can see, §3.8[basic.life]/6 forbids access to a non-static data member before lifetime begins, but is lvalue evaluation of pair.second "access" to second? If it is, then are all three initializations illegal? Also, §8.3.2[dcl.ref]/5 says "reference shall be initialized to refer to a valid object" which probably makes all three illegal as well, but perhaps I'm missing something and the compilers accept this for a reason.
PS: I realize these classes are not practical in any way, hence the language-lawyer tag. Related and marginally more practical old discussion here: Circular reference in C++ without pointers
This one was warping my mind at first, but I think I've got it now. As per 12.6.2/5 of the 1998 Standard, C++ guarantees that data members are initialized in the order in which they are declared in the class, and that the constructor body is executed after all members have been initialized. This means that the expression
struct A;
struct B { A& a; };
struct A { B& b; };
struct {A first; B second;} pair = {pair.second, pair.first};
makes sense since pair is an auto (local, stack) variable, so its relative address and address of members are known to the compiler, AND there are no constructors for first and second.
Why the two conditions mean the code above makes sense: when first, of type A, is constructed (before any other data member of pair), first's data member b is set to reference pair.second, the address of which is known to the compiler because it is a stack variable (space already exists for it in the program, AFAIU). Note that pair.second as an object, i.e. as a memory segment, has not been initialized (it contains garbage), but that doesn't change the fact that the address of that garbage is known at compile time and can be used to set references. Since A has no constructor, it can't attempt to do anything with b, so the behavior is well defined. Once first has been initialized, it is the turn of second, and the same applies: its data member a references pair.first, which is of type A, and the address of pair.first is known to the compiler.
If the addresses were not known to the compiler (say, because heap memory obtained via operator new is used), there should be a compile error or, failing that, undefined behavior. Judicious use of placement new might allow it to work, though, since then the addresses of both first and second could again be known by the time first is initialized.
Now for the variation:
struct A;
struct B { A& ref; B(A& a) : ref(a) {} };
struct A { B& ref; A(B& b) : ref(b) {} };
struct {B first; A second;} pair = {pair.second, pair.first};
The only difference from the first code example is that A and B now have explicitly defined constructors, but the assembly code is surely identical, as there is no code in the constructor bodies. So if the first code sample works, the second should too.
HOWEVER, if there is code in the constructor body of B, which receives a reference to something (pair.second) that hasn't been initialized yet (but whose address is defined and known), and that code uses a, then you are clearly looking for trouble. If you're lucky you'll get a crash, but writing to a will probably fail silently, as the values get overwritten later when the A constructor is eventually called.
From compiler point of view references are nothing else but const pointers. Rewrite your example with pointers and it becomes clear how and why it works:
struct A;
struct B { A* a; };
struct A { B* b; };
struct {A first; B second;} pair = {&(pair.second), &(pair.first)}; //parentheses for clarity
As Schollii wrote: the memory is allocated beforehand and is therefore addressable. Because only references/pointers are involved, nothing is accessed or evaluated. It merely takes the addresses of "second" and "first", which is simple pointer arithmetic.
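A runnable sketch of the pointer version (the named Pair struct and the function links_are_consistent are introduced here only so that the result can be checked). It takes the addresses of the two members inside pair's own initializer and then verifies that each pointer ends up pointing at the other member:

```cpp
struct A;
struct B { A* a; };
struct A { B* b; };
struct Pair { A first; B second; };

// Hypothetical check: builds the self-referential pair and verifies the links.
bool links_are_consistent()
{
    // Only addresses of the members are taken; nothing is read before it is initialized.
    Pair pair = {{&pair.second}, {&pair.first}};
    return pair.first.b == &pair.second && pair.second.a == &pair.first;
}
```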
I could rant about how using references in any place other than operator is language abuse, but I think this example highlights the issue well enough :)
(From now on I write all the ctors manually. Your compiler may or may not do this automagically for you.)
Try using new:
struct A;
struct B { A& a; B(A& arg):a(arg){;} };
struct A { B& b; A(B& arg):b(arg){;} };
typedef struct PAIR{A first; B second; PAIR(B& argB, A& argA):first(argB),second(argA){;}} *PPAIR, *const CPPAIR;
PPAIR pPair = NULL;// just to clean garbage or 0xCDCD
pPair = new PAIR(pPair->second, pPair->first);
Now it depends on the order of execution. If the assignment is made last (after the ctor), second.a will point to 0x0000 and first.b to e.g. 0x0004.
Actually, http://codepad.org/yp911ug6 here it's the ctors which are run last (makes most sense!), therefore everything works (even though it appears it shouldn't).
Can't speak about templates, though.
But your question was "Is that legal?". No law forbids it.
Will it work? Well, I don't trust compiler makers enough to make any statements about that.
When I compile and run this with Visual C++ 2010:
#include <iostream>
int main() {
int subtrahend = 5;
struct Subtractor {
int &subtrahend;
int operator()(int minuend) { return minuend - subtrahend; }
} subtractor5 = { subtrahend };
std::cout << subtractor5(47);
}
I get the correct answer, 42.
Nevertheless, the compiler complains that this is impossible:
Temp.cpp(9) : warning C4510: main::Subtractor : default constructor could not be generated
Temp.cpp(6) : see declaration of main::Subtractor
Temp.cpp(9) : warning C4512: main::Subtractor : assignment operator could not be generated
Temp.cpp(6) : see declaration of main::Subtractor
Temp.cpp(9) : warning C4610: struct main::Subtractor can never be instantiated - user defined constructor required
What's going on?
The first two warnings are just letting you know that the implicitly declared member functions cannot be generated due to the presence of a reference data member.
The third warning is a Visual C++ compiler bug.
All three warnings can be ignored with no ill effects, though you can easily make all three go away by making the reference data member a pointer instead (reference data members are almost never worth the trouble).
The first warning tells you that a reference cannot be default-constructed (references are guaranteed to refer to some object). Switch subtrahend to a regular integer and the problem will go away.
I am pretty sure the second warning is of similar nature.
(Just saying, it is generally much better to rely on something like boost::function or a similar implementation (std::tr1::function?) instead of writing this code manually.)
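As a sketch of that suggestion, assuming a compiler with C++11 lambdas and std::function (the wrapper function subtract_demo is invented for this example), the local struct can be replaced by a lambda that captures subtrahend by reference:

```cpp
#include <functional>

// Hypothetical rewrite of the question's example using a lambda.
int subtract_demo()
{
    int subtrahend = 5;
    // The lambda captures subtrahend by reference, like the int& member did.
    std::function<int(int)> subtractor5 =
        [&subtrahend](int minuend) { return minuend - subtrahend; };
    return subtractor5(47);   // 47 - 5
}
```

Since no class with a reference member is declared by hand, none of the three warnings has anything to attach to.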
It's because the variable subtractor5 is an unnamed struct. If you want to make the errors go away, give the structure used for subtractor5 a name.
For example:
struct subtractor {
:
} subtractor5 = { subtrahend };
I unfortunately don't know enough C++ language-ese to know why it works, but I do know why the warning happens.
A user-defined constructor is mandatory in the following cases:
Initializing constant data members (const int c_member;).
Initializing reference data members (int & r_member;)
Having a data member whose type doesn't have default constructor. Eg:
class NoDefCtor
{
public:
NoDefCtor(int);
};
class ContainThat
{
NoDefCtor no_ctor_member;
};
Inheriting from a base class, where base class doesn't have default constructor. Almost same as above (NoDefCtor).
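A minimal sketch of the third case, with a user-defined constructor supplied. The member value, the public access, and the argument 42 are arbitrary choices made for this illustration:

```cpp
class NoDefCtor
{
public:
    explicit NoDefCtor(int v) : value(v) {}
    int value;  // added so the result can be inspected
};

class ContainThat
{
public:
    // Without this constructor, "ContainThat c;" would not compile,
    // because no_ctor_member cannot be default-constructed.
    ContainThat() : no_ctor_member(42) {}
    NoDefCtor no_ctor_member;
};
```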
We know that the compiler generates some member functions for a user-defined class if those member functions are used but not defined, right? So I have this kind of code:
class AA
{
};
void main()
{
AA a;
AA b(a);
a = b;
}
This code works fine. I mean no compiler error. But the following code....
class AA
{
int member1;
int member2;
};
But this code gives a run-time error, because variable "a" is used without being initialized!
So my question is this: when we instantiate an int, it has a value. So why doesn't the default constructor work and initialize variable "a" using those two int values?
EDIT: Platform: Win Vista, Compiler: Visual Studio 2008 compiler; Flags: Default
The compiler-synthesised default constructor calls the default constructors for all class members that have constructors. But integers don't have constructors, and so are not initialised. However, I find it hard to believe that this will cause a run-time error.
To initialise those variables:
class AA {
public:
AA() : member1(0), member2(0) {}
private:
int member1;
int member2;
};
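An alternative sketch, assuming C++11 or later (so not the Visual Studio 2008 compiler from the question): default member initializers let the implicitly generated default constructor do the initialization. The accessor sum() and the function copy_and_assign are added only for illustration:

```cpp
class AA
{
public:
    int sum() const { return member1 + member2; }  // accessor added for illustration
private:
    int member1 = 0;   // default member initializer (C++11)
    int member2 = 0;
};

// Mirrors the statements from the question.
int copy_and_assign()
{
    AA a;        // members are 0, not indeterminate
    AA b(a);     // implicit copy constructor
    a = b;       // implicit copy assignment
    return a.sum();
}
```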
Firstly, from a practical point of view this is not a genuine run-time error. This is a built-in debugging feature of your development environment. The compiler attempts to catch situations when you read an uninitialized value, which is exactly what happens in your case.
Secondly, when we "instantiate" an int, it doesn't have a value. More precisely, it contains an indeterminate value which is not even guaranteed to be stable (you can get different values by reading the same uninitialized variable several times in a row). Theoretically, reading an uninitialized int variable leads to undefined behavior, since it might contain an illegal ("trap") representation. In fact, you can perceive the "run-time error" generated by your development environment as a manifestation of that undefined behavior.
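To make the difference concrete, here is a small sketch (the helper name value_initialized_sum is made up, and the members are public only so they can be read). Writing AA() requests value-initialization, which zero-initializes the ints of a class without a user-provided constructor, whereas a plain "AA a;" leaves them indeterminate:

```cpp
struct AA
{
    int member1;
    int member2;
};

// Hypothetical helper contrasting default- and value-initialization.
int value_initialized_sum()
{
    AA a = AA();   // value-initialization: both members are zero
    // "AA a;" would instead leave member1 and member2 indeterminate.
    AA b(a);       // copying is now well-defined
    return b.member1 + b.member2;
}
```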
What platform? compiler? compiler flags? You must have some extra checking being added because there is nothing in normal C++ that checks initialization status.
In fact, the default and copy constructors do work. But in C++, uninitialized variables actually contain garbage. Therefore, you get your error (member1 and member2 contain garbage, and you try to copy this garbage into object b).
Firstly, When you instantiate an int without initializing it, it has an indeterminate value. A built-in basic type does not have a constructor.
Secondly, that code should not generate a runtime error. It just copies indeterminate int values in the autogenerated copy constructor and assignment operators. It should generate a compiler warning that an uninitialized variable is being used.
Thirdly, your signature for main is wrong - the correct signature is
int main(void)