What do clang and gcc qualify as variable being unused [duplicate] - c++

In a PR review I noticed an unused variable, and we wondered why the compiler hadn't caught it. So I tested the following code, which has a bunch of unused variables, on godbolt, and was surprised that some were reported as unused and others were not, even though all of them are unused.
#include <string>
struct Index
{
Index(int index) : m_index(index) {}
int m_index;
};
int main()
{
std::string str = "hello"; // case 1. no warning here - unexpected
int someValue = 2; // case 2. warning - as expected
const int someConstant = 2; // case 3. warning - as expected
Index index1(2); // case 4. just as equally not used but no warning - unexpected
// here using the assignment but do get a warning here
// but the str assignment doesn't give a warning - weird
Index index2 = 2; // case 5.
Index index3{2}; // case 6. just as equally not used but no warning - unexpected
Index index4 = {2}; // case 7. just as equally not used but no warning - unexpected
return 0;
}
warning: unused variable 'someValue' [-Wunused-variable]
warning: unused variable 'index2' [-Wunused-variable] (warning only on clang, not on gcc)
warning: unused variable 'someConstant' [-Wunused-variable]
So what do clang and gcc qualify as unused? What if I'm using a lock? I declare it and never use it directly; it exists only to release a resource automatically. How do I tell the compiler that I am using it if one day it starts to give a warning about the lock?
#include <mutex>
int g_i = 0;
std::mutex g_i_mutex; // protects g_i
void safe_increment()
{
const std::lock_guard<std::mutex> lock(g_i_mutex);
++g_i;
// g_i_mutex is automatically released when lock goes out of scope
}
flags: -Wunused-variable
clang: 14.0.0
gcc: 11.3

The reason why there's no warning is that variables of non-trivial class type aren't technically unused when you initialize them but then never access them in your function.
Consider this example:
struct Trivial {};
struct NonTrivial {
NonTrivial() {
//Whatever
}
};
void test() {
Trivial t;
NonTrivial nt;
}
GCC warns about Trivial t; being unused since this declaration never causes any user-defined code to run; the only things that run are the trivial constructor and trivial destructor, which are no-ops. So no operation at all is performed on Trivial t and it is truly unused (its memory is never even touched).
NonTrivial nt; doesn't cause a warning, however, since it is in fact used to run its constructor, which is user-defined code.
That's also why compilers are not going to warn about "unused lock guards" or similar RAII classes - they're used to run user-defined code at construction and destruction, which means that they are used (a pointer to the object is passed to the user-defined constructor/destructor = address taken = used).
This can further be proved by marking the object's constructor with the gnu::pure attribute:
struct Trivial {};
struct NonTrivial {
[[gnu::pure]] NonTrivial() {
//Whatever
}
};
void test() {
Trivial t;
NonTrivial nt;
}
In this case, GCC warns about both of them because it knows that NonTrivial::NonTrivial() doesn't have side-effects, which in turn enables the compiler to prove that construction and destruction of a NonTrivial is a no-op, giving us back our "unused variable" warning. (It also warns about gnu::pure being used on a void function, which is fair enough. You shouldn't usually do that.)
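The trivial/non-trivial distinction used above can be checked at compile time with the standard type traits. This is only a sketch, assuming C++17 for the _v variable templates; the traits describe the language-level notion of triviality and are not necessarily the exact test a compiler applies when deciding whether to warn.

```cpp
#include <type_traits>

struct Trivial {};

struct NonTrivial {
    NonTrivial() {}  // user-provided, so not trivial even though the body is empty
};

// Trivial has no user-provided special member functions at all.
static_assert(std::is_trivially_default_constructible_v<Trivial>);
static_assert(std::is_trivially_destructible_v<Trivial>);

// NonTrivial's default constructor is user-provided; its destructor is still implicit.
static_assert(!std::is_trivially_default_constructible_v<NonTrivial>);
static_assert(std::is_trivially_destructible_v<NonTrivial>);
```

If these assertions compile, declaring a Trivial runs no user code at all, while declaring a NonTrivial runs at least the constructor.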
Clang's warning about the following code also does make sense.
struct Hmm {
int m_i;
Hmm(int i): m_i(i) {}
};
void test() {
Hmm hmm = 2; //Case 5 from the question
}
This is equivalent to the following:
void test() {
Hmm hmm = Hmm(2);
}
Construction of the temporary Hmm(2) has side-effects (it calls the user-defined constructor), so this temporary is not unused. However, the temporary then gets moved into the local variable Hmm hmm (at least conceptually; since C++17 the move is elided and hmm is initialized directly). Both the move constructor and the destructor of that local variable are trivial (and therefore don't invoke user code), so the variable is indeed unused since the compiler can prove that the behavior of the program would be the same whether or not that variable is present (trivial ctor + trivial dtor + no other access to the variable = unused variable, as explained above). It wouldn't be unused if Hmm had a non-trivial move constructor or a non-trivial destructor.
Note that a trivial move constructor leaves the moved-from object intact, so it truly does not have any side-effects (other than initializing the object that's being constructed).
This can easily be verified by deleting the move constructor, which causes both Clang and GCC to complain.
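As for the lock from the question: if a compiler ever did start warning about a guard object, the C++17 [[maybe_unused]] attribute is the standard way to say that the variable is intentionally never referenced. A minimal sketch, reusing the question's names; increment_twice is a hypothetical helper added only to exercise the function.

```cpp
#include <cassert>
#include <mutex>

int g_i = 0;
std::mutex g_i_mutex;  // protects g_i

void safe_increment()
{
    // The guard exists only for its constructor/destructor side effects;
    // [[maybe_unused]] documents that and suppresses unused-variable warnings.
    [[maybe_unused]] const std::lock_guard<std::mutex> lock(g_i_mutex);
    ++g_i;
    // g_i_mutex is released when lock goes out of scope
}

// Hypothetical helper: each call takes and releases the lock in turn.
void increment_twice()
{
    safe_increment();
    safe_increment();
    assert(g_i == 2);
}
```

The attribute changes nothing about the generated code; it only affects diagnostics.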

Related

Can a constructor affect other fields of an enclosing object, or is this a static analysis false positive?

Consider this C++ code:
struct SomeStruct {
SomeStruct() noexcept;
};
//SomeStruct::SomeStruct() noexcept {}
class SomeClass {
const bool b;
const SomeStruct s;
public:
SomeClass() : b(true) {}
operator bool() const { return b; }
};
void f() {
int *p = new int;
if (SomeClass())
delete p;
}
When I run clang --analyze -Xanalyzer -analyzer-output=text on it, I get this:
q72007867.cpp:20:1: warning: Potential leak of memory pointed to by 'p' [cplusplus.NewDeleteLeaks]
}
^
q72007867.cpp:17:12: note: Memory is allocated
int *p = new int;
^~~~~~~
q72007867.cpp:18:7: note: Assuming the condition is false
if (SomeClass())
^~~~~~~~~~~
q72007867.cpp:18:3: note: Taking false branch
if (SomeClass())
^
q72007867.cpp:20:1: note: Potential leak of memory pointed to by 'p'
}
^
1 warning generated.
Uncommenting the definition of SomeStruct's constructor makes the warning go away, though. Swapping the order of const bool b; and const SomeStruct s; also makes it go away. In the original program, is there actually some other definition of SomeStruct's constructor that would lead to the false branch being taken there, or is this a false positive in Clang's static analyzer?
There is no standard compliant way for a const member to be changed after initialization; any mechanism is going to be UB.
Like
struct foo{
const bool b=true;
foo(){ b=false; }
};
is illegal, as is code that const_casts b to edit it like:
struct foo{
const bool b=true;
foo(){ const_cast<bool&>(b)=false; }
};
(this second version compiles, but produces UB).
Such UB is sadly not that rare. For example, I could implement the constructor of SomeStruct to fiddle with memory before the this pointer address. It would be doubly illegal (modifying a const value after construction, and violating reachability rules), but depending on optimization settings it could work.
On the other hand, the compiler is free to notice that the only constructor assigns true to b and then convert operator bool to just return true.
But instead, the static code analyzer gives up on proving the state of b once a call to a function body outside of the visible source code occurs. This is a pretty reasonable thing to give up on. Here, the function even gets a pointer into the same temporary object; doing a full proof that said pointer cannot change some state regardless of what code is run is possible, but failing to do that also seems reasonable.
Style wise, the code is also a bit of a mess. A provably true branch either should not exist, or the failure branch should semantically make sense. Neither occurs here; anyone reading this code cannot determine correctness from code structure; the code structure looks misleading.
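A sketch of the first option (no branch at all), assuming the only purpose of the if was to free p. Tracked, g_destroyed and check_f are hypothetical names used to make the cleanup observable; the question used a plain int.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter so the cleanup can be observed.
static int g_destroyed = 0;

struct Tracked {
    ~Tracked() { ++g_destroyed; }
};

void f()
{
    auto p = std::make_unique<Tracked>();
    // ... use *p here ...
}   // p's destructor deletes the object on every path out of f()

// Hypothetical check: one call to f() destroys exactly one object.
void check_f()
{
    f();
    assert(g_destroyed == 1);
}
```

With ownership expressed through the type, neither a reader nor the analyzer has to reason about whether a condition is always true.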

Why does Return Value Optimization not happen if no destructor is defined?

I expected to see copy elision from Named Return Value Optimization (NRVO) from this test program but its output is "Addresses do not match!" so NRVO didn't happen. Why is this?
// test.cpp
// Compile using:
// g++ -Wall -std=c++17 -o test test.cpp
#include <string>
#include <iostream>
void *addr = NULL;
class A
{
public:
int i;
int j;
#if 0
~A() {}
#endif
};
A fn()
{
A fn_a;
addr = &fn_a;
return fn_a;
}
int main()
{
A a = fn();
if (addr == &a)
std::cout << "Addresses match!\n";
else
std::cout << "Addresses do not match!\n";
}
Notes:
If a destructor is defined by enabling the #if above, then the NRVO does happen (and it also happens in some other cases such as defining a virtual method or adding a std::string member).
No methods have been defined so A is a POD struct, or in more recent terminology a trivial class. I don't see an explicit exclusion for this in the above links.
Adding compiler optimisation (to a more complicated example that doesn't just reduce to the empty program!) doesn't make any difference.
Looking at the assembly for a second example shows that this even happens when I would expect mandatory Return Value Optimization (RVO), so the NRVO above was not prevented by taking the address of fn_a in fn(). Clang, GCC, ICC and MSVC on x86-64 show the same behaviour suggesting this behaviour is intentional and not a bug in a specific compiler.
class A
{
public:
int i;
int j;
#if 0
~A() {}
#endif
};
A fn()
{
return A();
}
int main()
{
// Where NRVO occurs the call to fn() is preceded on x86-64 by a move
// to RDI, otherwise it is followed by a move from RAX.
A a = fn();
}
The language rule which allows this in case of returning a prvalue (the second example) is:
[class.temporary]
When an object of class type X is passed to or returned from a function, if X has at least one eligible copy or move constructor ([special]), each such constructor is trivial, and the destructor of X is either trivial or deleted, implementations are permitted to create a temporary object to hold the function parameter or result object.
The temporary object is constructed from the function argument or return value, respectively, and the function's parameter or return object is initialized as if by using the eligible trivial constructor to copy the temporary (even if that constructor is inaccessible or would not be selected by overload resolution to perform a copy or move of the object).
[Note: This latitude is granted to allow objects of class type to be passed to or returned from functions in registers. -- end note]
Why does Return Value Optimization not happen [in some cases]?
The motivation for the rule is explained in the note of the quoted rule. Essentially, RVO is sometimes less efficient than no RVO.
If a destructor is defined by enabling the #if above, then the RVO does happen (and it also happens in some other cases such as defining a virtual method or adding a std::string member).
In the second case, this is explained by the rule because creating the temporary is only allowed when the destructor is trivial.
In the NRVO case, I suppose this is up to the language implementation.
On many ABIs, if a return value is a trivially copyable object whose size/alignment is equal to or less than that of a pointer/register, then the ABI will not permit elision. The reason is that it is more efficient to return the value in a register than through a stack memory address.
Note that when you get the address either of the object in the function or the returned object, the compiler will force the object onto the stack. But the actual passing of the object will be via a register.
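To see which side of the rule a given class falls on, the relevant properties can be queried with type traits. This is a rough sketch assuming C++17: the traits approximate the condition quoted from [class.temporary] rather than reproduce its exact wording, and whether registers are actually used is still up to the platform ABI. AWithDtor is a hypothetical name for the question's class with the #if enabled.

```cpp
#include <type_traits>

struct A {           // the question's class with the destructor disabled
    int i;
    int j;
};

struct AWithDtor {   // hypothetical name: the same class with ~A() {} enabled
    int i;
    int j;
    ~AWithDtor() {}
};

// A is trivially copyable and trivially destructible, so an implementation
// may return it through a temporary (for example in registers).
static_assert(std::is_trivially_copyable_v<A>);
static_assert(std::is_trivially_destructible_v<A>);

// A user-provided destructor removes that latitude.
static_assert(!std::is_trivially_destructible_v<AWithDtor>);
static_assert(!std::is_trivially_copyable_v<AWithDtor>);
```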

default constructor in modern c++

When I compile this code, the VS compiler reports an error saying that t1.mem is an uninitialized local variable.
#include <string>
#include <iostream>
struct T1
{
int mem;
};
struct T2
{
int mem;
T2() { } // "mem" is not in the initializer list
};
int main()
{
T1 t1; // class, calls implicit default ctor
std::cout << t1.mem << std::endl;
const T2 t2; // const class, calls the user-provided default ctor
// t2.mem is default-initialized (to indeterminate value)
std::cout << t2.mem << std::endl;
}
Since I have not declared a constructor for struct T1, the compiler has to generate the default constructor, right? And struct T2's constructor has an empty initializer list, so why is there no error for it?
My understanding is that the compiler is trying to protect you from its own generated code, and assumes "you know best" when using your provided constructor. In addition, checking whether or not your constructor actually ends up initializing T2.mem anywhere, including in the body of the constructor, could be an arbitrarily complex task, so the compiler authors may have decided that was a task better left unattempted than poorly executed.
This seems to be supported by the warning you would get from MSVC if you declared t1 as a const T1:
'const' automatic data initialized with compiler generated default constructor produces unreliable results
Note the wording "compiler generated default constructor".
Btw, you'll see this same warning if you request the compiler-generated default constructor with T2() = default.
Well, compilers aren't perfect. Sometimes they warn for one thing, but they don't for another, similar thing. Many compilers also offer runtime instrumentation of the generated code, where they insert special instructions that detect errors like use of uninitialized variables and will abort the program when that happens. But again, the system is not perfect and it can miss things.
In any event, you don't actually need a constructor. You can inline-initialize the class members:
struct T1
{
int mem = 0;
};
The inline initialization will be used by default, unless a constructor initializes the member to something else:
struct T1
{
int mem = 0;
T1() = default;
T1(int m) : mem(m) { }
};
// ...
T1 first; // first.mem == 0
T1 second(1); // second.mem == 1
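Another option, if you would rather not touch the class: initialize the object with empty braces at the point of declaration. For the original T1 (no constructors, no default member initializer) this zero-initializes mem. A minimal sketch; value_initialized_mem and check are hypothetical helper names.

```cpp
#include <cassert>

struct T1
{
    int mem;  // no default member initializer, as in the question
};

// Hypothetical helper: returns the member of a brace-initialized T1.
int value_initialized_mem()
{
    T1 t1{};  // empty braces: mem is zero-initialized
    return t1.mem;
}

// Hypothetical check of the claim above.
void check()
{
    assert(value_initialized_mem() == 0);
}
```

This does not help T2: with a user-provided constructor, T2 t2{} still calls that constructor, and mem stays uninitialized unless the constructor sets it.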

Does the C++ standard guarantee that a function return value has a constant address?

Consider this program:
#include <stdio.h>
struct S {
S() { print(); }
void print() { printf("%p\n", (void *) this); }
};
S f() { return {}; }
int main() { f().print(); }
As far as I can tell, there is exactly one S object constructed here. There is no copy elision taking place: there is no copy to be elided in the first place, and indeed, if I explicitly delete the copy and/or move constructor, compilers continue to accept the program.
However, I see two different pointer values printed. This happens because my platform's ABI returns trivially copyable types such as this one in CPU registers, so there is no way with that ABI of avoiding a copy. clang preserves this behaviour even when optimising away the function call altogether. If I give S a non-trivial copy constructor, even if it's inaccessible, then I do see the same value printed twice.
The initial call to print() happens during construction, which is before the start of the object's lifetime, but using this inside a constructor is normally valid so long as it isn't used in a way that requires the construction to have finished -- no casting to a derived class, for instance -- and as far as I know, printing or storing its value doesn't require the construction to have finished.
Does the standard allow this program to print two different pointer values?
Note: I'm aware that the standard allows this program to print two different representations of the same pointer value, and technically, I haven't ruled that out. I could create a different program that avoids comparing pointer representations, but it would be more difficult to understand, so I would like to avoid that if possible.
T.C. pointed out in the comments that this is a defect in the standard. It's core language issue 1590. It's a subtly different issue than my example, but the same root cause:
Some ABIs require that an object of certain class types be passed in a register [...]. The Standard should be changed to permit this usage.
The current suggested wording would cover this by adding a new rule to the standard:
When an object of class type X is passed to or returned from a function, if each copy constructor, move constructor, and destructor of X is either trivial or deleted, and X has at least one non-deleted copy or move constructor, implementations are permitted to create a temporary object to hold the function parameter or result object. [...]
For the most part, this would permit the current GCC/clang behaviour.
There is a small corner case: currently, when a type has only a deleted copy or move constructor that would be trivial if defaulted, by the current rules of the standard, that constructor is still trivial if deleted:
12.8 Copying and moving class objects [class.copy]
12 A copy/move constructor for class X is trivial if it is not user-provided [...]
A deleted copy constructor is not user-provided, and nothing of what follows would render such a copy constructor non-trivial. So as specified by the standard, such a constructor is trivial, and as specified by my platform's ABI, because of the trivial constructor, GCC and clang create an extra copy in that case too. A one-line addition to my test program demonstrates this:
#include <stdio.h>
struct S {
S() { print(); }
S(const S &) = delete;
void print() { printf("%p\n", (void *) this); }
};
S f() { return {}; }
int main() { f().print(); }
This prints two different addresses with both GCC and clang, even though the proposed resolution would still require the same address to be printed twice. This appears to suggest that while we will get an update to the standard to not require a radically incompatible ABI, we will still need to get an update to the ABI to handle the corner case in a manner compatible with what the standard will require.
This is not an answer, rather a note on the different behavior of g++ and clang in this case, depending on the -O optimization flag.
Consider the following code:
#include <stdio.h>
struct S {
int i;
S(int _i): i(_i) {
int* p = print("from ctor");
printf("about to put 5 in %p\n", (void *)&i);
*p = 5;
}
int* print(const char* s) {
printf("%s: %p %d %p\n", s, (void *) this, i, (void *)&i);
return &i;
}
};
S f() { return {3}; }
int main() {
f().print("from main");
}
We can see that clang (3.8) and g++ (6.1) are taking it a bit differently, but both get to the right answer.
clang (for no -O, -O1, -O2) and g++ (for no -O, -O1)
from ctor: 0x7fff9d5e86b8 3 0x7fff9d5e86b8
about to put 5 in 0x7fff9d5e86b8
from main: 0x7fff9d5e86b0 5 0x7fff9d5e86b0
g++ (for -O2)
from ctor: 0x7fff52a36010 3 0x7fff52a36010
about to put 5 in 0x7fff52a36010
from main: 0x7fff52a36010 5 0x7fff52a36010
It seems that they both do it right in both cases - when they decide to skip the register optimization (g++ -O2) and when they go with the register optimization but copy the value to the actual i on time (all other cases).

Warning: Unreferenced local variable

When I provide a constructor for class A, I don't get the "unreferenced local variable" warning. Why?
What does the empty constructor do to eliminate the warning?
class A
{
public:
A() {}
};
int main()
{
A a;
}
This is only a theory, but because a constructor may contain code that can cause side effects, someone may decide to construct an unused object just to run that code. If you have no constructor and never reference an object that you've constructed, then it can safely be determined that the object has no purpose.
For example, if A is something that holds a mutex lock (and releases the lock when destructed), then this code
int main()
{
A a;
// other actions
}
is able to keep this function thread-safe, even though a is never referenced.
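A sketch of what such an A might look like, assuming it guards a single global mutex. g_mutex, g_counter, work and work_twice are hypothetical names; in real code std::lock_guard already does this job.

```cpp
#include <cassert>
#include <mutex>

std::mutex g_mutex;   // hypothetical shared mutex
int g_counter = 0;    // hypothetical data protected by g_mutex

class A
{
public:
    A() { g_mutex.lock(); }     // user-provided: acquiring the lock is a side effect
    ~A() { g_mutex.unlock(); }  // released when the object goes out of scope
    A(const A&) = delete;
    A& operator=(const A&) = delete;
};

void work()
{
    A a;          // never referenced again, yet it guards everything below
    ++g_counter;  // other actions
}

// Hypothetical check: the second call only proceeds because the first
// call's A released the mutex in its destructor.
void work_twice()
{
    work();
    work();
    assert(g_counter == 2);
}
```

Removing the "unused" variable a would silently drop the locking, which is why a warning on it would be unhelpful.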