-O2 and -fPIC option in gcc - c++

For performance optimization, I would like to make use of the reference of a string rather than its value. Depending on the compilation options, I obtain different results. The behavior is a bit unclear to me, and I do not know the actual gcc flag that causes that difference.
My code is
#include <string>
#include <iostream>
const std::string* test2(const std::string& in) {
// Here I want to make use of the pointer &in
// ...
// it's returned only for demonstration purposes...
return &in;
}
int main() {
const std::string* t1 = test2("text");
const std::string* t2 = test2("text");
// only for demonstration, the cout is printed....
std::cout<<"References are: "<<(t1==t2?"equivalent. ":"different. ")<<t1<<"\t"<<t2<<std::endl;
return 0;
}
There are three compilation options:
gcc main.cc -o main -lstdc++ -O0 -fPIC && ./main
gcc main.cc -o main -lstdc++ -O2 -fno-PIC && ./main
gcc main.cc -o main -lstdc++ -O2 -fPIC && ./main
The first two yield equivalent results (References are: different.), so the pointers are different, but the third one results in equivalent pointers (References are: equivalent.).
Why does this happen, and which option do I have to add to the options -O2 -fPIC such that the pointers become again different?
Since this code is embedded into a larger framework, I cannot drop the options -O2 or -fPIC.
Since I get the desired result with the option -O2 and also with -fPIC, but a different behavior if both flags are used together, the exact behavior of these flags is unclear to me.
I tried with gcc4.8 and gcc8.3.

Both t1 and t2 are dangling pointers: they point to a temporary std::string that has already been destroyed. The temporary std::string is constructed from the string literal during each call to test2("text") and lives until the end of the full-expression (the ;).
Their exact values depend on how the compiler (re-)uses stack space at a particular optimization level.
which option do I have to add to the options -O2 -fPIC such that the pointers become again different?
The code exhibits undefined behavior because it's illegal to compare invalid pointer values. Simply don't do this.
If we ignore the comparing part, then we end up with this version:
#include <string>
#include <iostream>
void test2(const std::string& in) {
std::cout << "Address of in: " << (void*)&in << std::endl;
}
int main() {
test2("text");
test2("text");
}
Now this code is free from UB, and it will print either the same address or different addresses, depending on how the compiler re-uses stack space between function calls. There is no way to control this, but it's no problem because keeping track of addresses of temporaries is a bad idea to begin with.
You can try using const char* as the input argument instead; then no temporary is created in a call test2("text"). But here again, whether or not two instances of "text" point to the same location is unspecified by the standard. GCC does coalesce identical string literals, though, so at least in GCC you should observe the behavior you're after.
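If distinct, stable addresses are what the surrounding code needs, one option is to keep named std::string objects alive for as long as the pointers are in use. The following is only a minimal sketch: address_of is a hypothetical stand-in for test2, and the helper function name is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for test2: returns the address of its argument.
// The pointer is only valid while the referenced object is alive.
const std::string* address_of(const std::string& in) {
    return &in;
}

// Returns true when two live, named strings yield different addresses.
bool named_strings_have_distinct_addresses() {
    const std::string a = "text";  // outlives both calls below
    const std::string b = "text";
    const std::string* t1 = address_of(a);
    const std::string* t2 = address_of(b);
    // Both objects are still alive, so dereferencing and comparing is fine.
    assert(*t1 == *t2);  // same contents
    return t1 != t2;     // two distinct live objects never share an address
}
```

Because a and b are alive at the same time, the result no longer depends on how the compiler reuses stack space, whatever the optimization level or -fPIC setting.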

Related

How can I make GCC/Clang warn about the use of uninitialized members?

I am compiling the code below
#include <iostream>
class Test {
public:
Test() {}
int k;
};
int main() {
Test t;
std::cout << t.k << "\n";
}
like
g/clang++ main.cpp -Wall -Wextra --std=c++14 -o exe; ./exe
Why does neither compiler warn me about the indeterminate value of the integer? Isn't it a very serious potential bug? How can I enable a warning for indeterminate initializations?
For this example, GCC gives me the desired warning when I give it -O1 (or higher).
Presumably whatever mechanism it uses to detect this is tied into the optimisation effort level somehow. It's a notoriously hard thing to do.
Ensure that you heed your release-build warnings as well as debug-build warnings.
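Independently of which warnings fire, the indeterminate value can be avoided at the source. A small sketch, assuming C++11 or later so that a default member initializer is available; read_k is a hypothetical helper added only to show the read:

```cpp
#include <cassert>

class Test {
public:
    Test() {}
    int k = 0;  // default member initializer: k is never indeterminate
};

// Hypothetical helper: reads the member of a freshly constructed object.
int read_k() {
    Test t;
    assert(t.k == 0);  // well defined, because k was initialized
    return t.k;
}
```

With the initializer in place the read is well defined at every optimization level, so the program no longer relies on the compiler's ability to diagnose the problem.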

Segmentation fault in capturing aligned variables in lambdas

This little code snippet segfaults with g++ 6.2.0 and clang++ 3.8.1 with:
clang++ -std=c++11 -O3 -mavx -pthread or g++ -std=c++11 -O3 -mavx -pthread
#include <thread>
#include <iostream>
class alignas(32) AlignedObject {
public:
float dummy[8];
};
int main() {
while (true) {
std::thread([](){
AlignedObject x;
std::cout << &x;
std::thread([x](){
std::cout << &x;
}).join();
}).join();
}
return 0;
}
Looking at the disassembly, both compilers are inserting vmovaps instructions that are failing, suggesting that compiler-generated objects somewhere aren't being aligned properly. It works fine if -mavx is removed since the instruction doesn't get used anymore. Is this a compiler bug or is this code relying on undefined behaviour?
Alignment specifiers such as alignas(n) or __attribute__((aligned(n))) are observed only for variables with automatic storage class. However std::function (which is used by the lambda) is permitted (and sometimes required) to dynamically allocate the function closure, in which case alignment specifiers are ignored and only alignments up to std::max_align_t are guaranteed.
In conclusion, short of passing your own custom allocator to the underlying std::function, objects with extended alignment requirements cannot be safely captured by value in lambdas, and must be captured by reference. (I guess this is more a property of std::bind and not lambdas specifically).
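The by-reference capture suggested above can be sketched as follows. Threads are left out to keep the example small, and is_aligned_32 and the wrapper function are hypothetical names. The sketch assumes a compiler that honours 32-byte alignment for automatic variables, as GCC and Clang do on x86-64.

```cpp
#include <cassert>
#include <cstdint>

class alignas(32) AlignedObject {
public:
    float dummy[8];
};

// True when p sits on a 32-byte boundary.
bool is_aligned_32(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 32 == 0;
}

bool capture_by_reference_keeps_alignment() {
    AlignedObject x{};  // automatic storage: alignas(32) applies here
    assert(is_aligned_32(&x));
    // [&x] stores only a reference in the closure, so no copy of x is made
    // and nothing over-aligned has to live inside the closure object.
    auto inner = [&x]() { return is_aligned_32(&x); };
    return inner();
}
```

When the lambda is handed to std::thread, the referenced object must outlive the thread. In the question that holds because the inner thread is joined before x goes out of scope.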

How to clearly produce inlining results in C++

I've been reading again Scott Meyers' Effective C++ and more specifically Item 30 about inlining.
So I wrote the following, trying to induce that optimization with gcc 4.6.3
// test.h
class test {
public:
inline int max(int i) { return i > 5 ? 1 : -1; }
int foo(int);
private:
int d;
};
// test.cpp
int test::foo(int i) { return max(i); }
// main.cpp
#include "test.h"
int main(int argc, const char *argv[]) {
test t;
return t.foo(argc);
}
and produced the relevant assembly using alternatively the following:
g++ -S -I. test.cpp main.cpp
g++ -finline-functions -S -I. test.cpp main.cpp
Both commands produced the same assembly as far as the inline method is concerned;
I can see both the max() method body (also having a cmpl statement and the relevant jumps) and its call from foo().
Am I missing something terribly obvious? I can't say that I combed through the gcc man page, but couldn't find anything relevant standing out.
So, I just increased the optimization level to -O3 which has the inline optimizations on by default, according to:
g++ -c -Q -O3 --help=optimizers | grep inline
-finline-functions [enabled]
-finline-functions-called-once [enabled]
-finline-small-functions [enabled]
Unfortunately, this optimized (as expected) the above code fragment almost out of existence.
max() is no longer there (at least as an explicitly tagged assembly block) and foo() has been reduced to:
_ZN4test3fooEi:
.LFB7:
.cfi_startproc
rep
ret
.cfi_endproc
which I cannot clearly understand at the moment (and is out of research scope).
Ideally, what I would like to see, would have been the assembly code for max() inside the foo() block.
Is there a way (either through cmd-line options or using a different (non-trivial?) code fragment) to produce such an output?
The compiler is entirely free to inline functions even if you don't ask it to, whether or not you use the inline keyword and whether or not you pass -finline-functions (although probably not if you pass -fno-inline-functions: that would be contrary to what you asked for, and although the C++ standard doesn't say so, the flag becomes pretty pointless if it doesn't do something like what it says).
Next, the compiler is not always certain that your function won't be used "somewhere else", so it will produce an out-of-line copy of most inline functions unless it is entirely clear that the function cannot possibly be called from elsewhere (for example, the class is declared such that it can't be reached elsewhere).
And if you don't use the result of a function, and the function doesn't have side-effects (e.g. writing to a global variable, performing I/O or calling a function the compiler "doesn't know what it does"), then the compiler will eliminate that code as "dead" - because you don't really want unnecessary code, do you? Adding a return in front of max(i) in your foo function should help.
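A variant of the fragment in which the result of max() depends on run-time data and is actually used might look like the sketch below. The constructor and the run function are additions made for illustration and are not part of the original code.

```cpp
#include <cassert>

class test {
public:
    explicit test(int limit) : d(limit) {}
    // Defined in the class body, so implicitly inline.
    int max(int i) { return i > d ? 1 : -1; }
    int foo(int i);
private:
    int d;
};

// The result of max() is returned, so the call is not dead code.
int test::foo(int i) { return max(i); }

// Hypothetical driver: feed a run-time value through foo().
int run(int value) {
    test t(5);
    return t.foo(value);
}
```

Compiling this with something like g++ -O2 -S and looking at the body of foo should show the comparison from max() in place of a call, although the exact output depends on the compiler version and target.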

Why is no 'unused variable' warning given for boost::scoped_lock instances? [duplicate]

boost::mutex::scoped_lock is a handy RAII wrapper around locking a mutex. I use a similar technique for something else: a RAII wrapper around asking a data interface to detach from/re-attach to a serial device.
What I can't figure out, though, is why in the code below only my object mst — whose instantiation and destruction do have side effects — causes g++ to emit an "unused variable" warning error whereas l manages to remain silent.
Do you know? Can you tell me?
[generic#sentinel ~]$ cat test.cpp
#include <boost/shared_ptr.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>
struct MyScopedThing;
struct MyWorkerObject {
void a() { std::cout << "a"; }
void b() { std::cout << "b"; }
boost::shared_ptr<MyScopedThing> getScopedThing();
};
struct MyScopedThing {
MyScopedThing(MyWorkerObject& w) : w(w) {
w.a();
}
~MyScopedThing() {
w.b();
}
MyWorkerObject& w;
};
boost::shared_ptr<MyScopedThing> MyWorkerObject::getScopedThing() {
return boost::shared_ptr<MyScopedThing>(new MyScopedThing(*this));
}
int main() {
boost::mutex m;
boost::mutex::scoped_lock l(m);
MyWorkerObject w;
const boost::shared_ptr<MyScopedThing>& mst = w.getScopedThing();
}
[generic#sentinel ~]$ g++ test.cpp -o test -lboost_thread -Wall
test.cpp: In function ‘int main()’:
test.cpp:33: warning: unused variable ‘mst’
[generic#sentinel ~]$ ./test
ab[generic#sentinel ~]$ g++ -v 2>&1 | grep version
gcc version 4.4.5 20110214 (Red Hat 4.4.5-6) (GCC)
Note that the question has changed since the other answers were written.
Likely the reason g++ doesn't warn in the current form is because mst is a reference, and constructing and destructing a reference has no side effects. It's true that here the reference is extending the lifetime of a temporary, which has effects in its constructor and destructor, but apparently g++ doesn't realise that makes a difference.
If my memory serves me right, g++ has the unfortunate habit of emitting unused variable errors differently depending on the optimization settings because the detection works at the optimizer level.
That is, the code is optimized in SSA form, and if the optimizer detects that a variable, after optimization, is unused, then it may emit a warning (I much prefer Clang analysis for this...).
Therefore it is probably a matter of detecting what the destructor does. I wonder if it takes a conservative approach whenever the definition of the destructor is out of line; I would surmise that this amounts to a function call, and that the call qualifies as a use of the variable.
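The distinction between a destructor that does nothing and one that does observable work can be sketched like this. All names are hypothetical, and the counter exists only to make the destructor's effect visible.

```cpp
#include <cassert>
#include <type_traits>

int g_released = 0;  // hypothetical counter used to observe the destructor

struct TrivialThing {
    int value;  // implicit, trivial destructor
};

struct ScopedThing {
    ScopedThing() {}
    ~ScopedThing() { ++g_released; }  // user-provided, hence non-trivial
};

static_assert(std::is_trivially_destructible<TrivialThing>::value,
              "destroying a TrivialThing runs no code");
static_assert(!std::is_trivially_destructible<ScopedThing>::value,
              "destroying a ScopedThing runs code");

int scope_demo() {
    g_released = 0;
    {
        ScopedThing guard;  // never named again, but destroyed at the brace
    }
    return g_released;
}
```

An otherwise unused local of type ScopedThing still has an effect when it goes out of scope, which is why RAII guards such as scoped_lock are typically not reported as unused.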
I suspect the reason is that your class has a trivial destructor, and that g++ only warns about unused variables if the destructor is trivial. Invoking a non-trivial destructor is a "use".

How to trace out why gcc and g++ produces different code

Is it possible to see what is going on behind gcc and g++ compilation process?
I have the following program:
#include <stdio.h>
#include <unistd.h>
size_t sym1 = 100;
size_t *addr = &sym1;
size_t *arr = (size_t*)((size_t)&arr + (size_t)&addr);
int main (int argc, char **argv)
{
(void) argc;
(void) argv;
printf("libtest: addr of main(): %p\n", &main);
printf("libtest: addr of arr: %p\n", &arr);
while(1);
return 0;
}
Why is it possible to produce the binary without error with g++ while there is an error using gcc?
I'm looking for a method to trace what makes them behave differently.
# gcc test.c -o test_app
test.c:7:1: error: initializer element is not constant
# g++ test.c -o test_app
I think the reason may be the fact that gcc uses cc1 as a compiler and g++ uses cc1plus.
Is there a way to make more precise output of what actually has been done?
I've tried to use -v flag but the output is quite similar. Are there different flags passed to linker?
What is the easiest way to compare two compilation procedures and find the difference in them?
In this case, gcc produces nothing because your program is not valid C. As the compiler explains, the initializer element (expression used to initialize the global variable arr) is not constant.
C requires initialization expressions for global variables to be compile-time constants, so that the contents of those variables can be placed in the data segment of the executable. This cannot be done for arr because the addresses of the variables involved are not known until link time, and their sum cannot be trivially filled in by the dynamic linker, as is the case for addr. C++ allows this, so g++ generates initialization code that evaluates the non-constant expressions and stores them in global variables. This code is executed before invocation of main().
Executables cc1 and cc1plus are internal details of the implementation of the compiler, and as such irrelevant to the observed behavior. The relevant fact is that gcc expects valid C code as its input, and g++ expects valid C++ code. The code you provided is valid C++, but not valid C, which is why g++ compiles it and gcc doesn't.
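The C++ side of this can be sketched as follows. The variable sum and the checking function are hypothetical, and the sum is stored as an integer rather than cast back to a pointer to keep the example simple.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

std::size_t sym1 = 100;
std::size_t* addr = &sym1;  // address constant: fine in both C and C++

// Not a constant expression in C. C++ accepts it and, where it cannot be
// computed ahead of time, evaluates it during start-up.
std::uintptr_t sum = reinterpret_cast<std::uintptr_t>(&sum) +
                     reinterpret_cast<std::uintptr_t>(&addr);

// Recomputes the expression at run time and compares it to the global.
bool sum_was_initialized() {
    return sum == reinterpret_cast<std::uintptr_t>(&sum) +
                  reinterpret_cast<std::uintptr_t>(&addr);
}
```

By the time any function in this translation unit runs, the global already holds the computed value, which matches the behavior described for the g++ build.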
There is a slightly more interesting question lurking here. Consider the following test cases:
#include <stdint.h>
#if TEST==1
void *p=(void *)(unsigned short)&p;
#elif TEST==2
void *p=(void *)(uintptr_t)&p;
#elif TEST==3
void *p=(void *)(1*(uintptr_t)&p);
#elif TEST==4
void *p=(void *)(2*(uintptr_t)&p);
#endif
gcc (even with the very conservative flags -ansi -pedantic-errors) rejects test 1 but accepts test 2, and accepts test 3 but rejects test 4.
From this I conclude that some operations that are easily optimized away (like casting to an object of the same size, or multiplying by 1) get eliminated before the check for whether the initializer is a constant expression.
So gcc might be accepting a few things that it should reject according to the C standard. But when you make them slightly more complicated (like adding the result of a cast to the result of another cast - what useful value can possibly result from adding two addresses anyway?) it notices the problem and rejects the expression.