GCC still generates guard variables with -fno-threadsafe-statics - c++

Below is a simple example using a static member inside a template. If this is compiled with g++ / avr-g++ using the -fno-threadsafe-statics option, the compiler still generates guard variables. This is unnecessary in my opinion.
struct A {
A() {}
void foo() {}
};
template<typename T>
struct B {
static void foo() {
mTop.foo();
}
inline static T mTop;
};
int main() {
B<A>::foo();
}
Does anybody know how to disable the generation of the 8-byte guard variable?
Edit: if you remove the empty ctor (or use a defaulted ctor) so that the compiler-generated one is used, no guards are created. And if you simply create a global variable of type A (which should be conceptually equivalent to the above), no guards are created either.

From the docs, emphasis mine:
-fno-threadsafe-statics
Do not emit the extra code to use the routines specified in the C++ ABI for thread-safe initialization of local statics. You can use this option to reduce code size slightly in code that doesn’t need to be thread-safe.
The example program in the question doesn't have any local statics, so there's nothing for this flag to actually apply to. However, even if thread safety isn't necessary, statics do still need a guard variable so as to avoid multiple initialization. Can't get around that.

How does the use of `static` affect the speed of my code?

I was solving an exercise online, and at one point I needed to delete the quotation marks ("") from the beginning and end of a string. This was my code:
void static inline process_value(std::string &value) {
if (value.back() !='>') {
value = value.substr(1, value.size()-2);
}
}
Called from this benchmark loop:
static void UsingStatic(benchmark::State& state) {
// Code inside this loop is measured repeatedly
for (auto _ : state) {
std::string valor("\"Hola\"");
process_valueS(valor);
// Make sure the variable is not optimized away by compiler
benchmark::DoNotOptimize(valor);
}
}
Just out of curiosity, I ran a benchmark.
Compiler: Clang-9.0
std: c++20
optim: O3
STL: libstdc++(GNU)
While I was at it I decided to remove static from process_value, making void inline process_value that was otherwise the same. To my surprise it was slower.
I thought that static only meant that the function was local to a file. But here it says that " 'static' means that the function should be inlined by the compiler if possible". In that case, though, I think removing static should not have changed the result. Now I am confused: what does static do other than restrict the function to a single .cpp file, and how does that affect performance?
The disassembly on QuickBench shows that the NoUsingStatic loop actually calls process_value instead of inlining it, despite the inline keyword making it legal for the compiler to do so. But UsingStatic does inline the call to process_valueS. That difference in compiler decision-making presumably explains the difference in performance, but why would clang choose not to inline a simple function declared void inline process_value(std::string &value){ ... }?
EDIT: Because the question was closed for not being clear enough, I deleted the parts that were not related to the question. If I am missing some information, please tell me in the comments.
Clang uses a cost based decision whether a function will be inlined or not. This cost is affected by a lot of things. It is affected by static.
Fortunately, clang has an output, where we can observe this. Check out this godbolt link:
void call();
inline void a() {
call();
}
static inline void b() {
call();
}
void foo() {
a();
b();
}
In this little example, a() and b() are the same, the only exception is that b() is static.
If you move the mouse over the calls a() or b() on godbolt (in OptViewer window), you can read:
a(): cost=0, threshold=487
b(): cost=-15000, threshold=487
(clang will inline a call, if the cost is less than the threshold.)
clang gave b() a much lower cost because it is static. It seems that clang applies this -15000 cost reduction to a static function only once: if b() is called several times, the cost of every call to b() except one will be zero.
Here are the numbers for your case, link:
process_value(): cost=400, threshold=325 -> it is just above the threshold, won't be inlined
process_valueS(): cost=-14600, threshold=325 -> OK to inline
So, apparently, static can have a lot of impact if the function is only called once. That makes sense, because inlining a static function once doesn't increase code size.
Tip: if you want to force clang to inline a function, use __attribute__((always_inline)) on it.
inline is just a hint to the compiler, which may or may not actually inline your particular code.
Regarding the static keyword: if it is applied to a global variable, the variable has file scope (as you mentioned) when the file is compiled as a separate compilation unit. It is therefore possible to access static global variables from other files if they are all compiled as a single compilation unit. In reality, then, the scope of a static global variable is not the file but the compilation unit (which may or may not be a single file).
The same applies to your static function: it has internal linkage, so it is visible throughout its compilation unit but not from other compilation units.
EDIT:
As suggested by @Peter Cordes in the comments below, mixing inline and static can get messy. According to cppreference ( https://en.cppreference.com/w/cpp/language/inline ), the rules that allow an inline function (and, since C++17, an inline variable) to be defined in several translation units apply only to those with external linkage, that is, those not declared static.

Can an unused function instantiate a variable template with side-effects according to C++14?

Here is my code:
#include <iostream>
class MyBaseClass
{
public:
static int StaticInt;
};
int MyBaseClass::StaticInt = 0;
template <int N> class MyClassT : public MyBaseClass
{
public:
MyClassT()
{
StaticInt = N;
};
};
template <int N> static MyClassT<N> AnchorObjT = {};
class UserClass
{
friend void fn()
{
std::cout << "in fn()" << std::endl; //this never runs
(void)AnchorObjT<123>;
};
};
int main()
{
std::cout << MyBaseClass::StaticInt << std::endl;
return 0;
}
The output is:
123
...indicating MyClassT() constructor was called despite that fn() was never called.
Tested on gcc and clang with -O0, -O3, -Os and even -Ofast
Question
Does this program have undefined behavior according to C++ standard?
In other words: if later versions of compilers manage to detect that fn() will never be called can they optimize away template instantiation together with running the constructor?
Can this code somehow be made deterministic i.e. force the constructor to run - without referencing function name fn or the template parameter value 123 outside of the UserClass?
UPDATE: A moderator truncated my question and suggested further truncation. Original verbose version can be viewed here.
Template instantiation is a function of the code, not a function of any kind of dynamic runtime conditions. As a simplistic example:
template <typename T> void bar();
void foo(bool b) {
if (b) {
bar<int>();
} else {
bar<double>();
}
}
Both bar<int> and bar<double> are instantiated here, even if foo is never invoked or even if foo is only ever invoked with true.
For variable templates specifically, the rule is [temp.inst]/6:
Unless a variable template specialization has been explicitly instantiated or explicitly specialized, the variable template specialization is implicitly instantiated when it is referenced in a context that requires a variable definition to exist or if the existence of the definition affects the semantics of the program.
In your function:
friend void fn()
{
(void)AnchorObjT<123>;
};
AnchorObjT<123> is referenced in a context that requires a definition (regardless of whether fn() is ever called or, as in this case, whether it is even possible to call it), hence it is instantiated.
But AnchorObjT<123> is a global variable, so its instantiation means we have an object that is constructed before main(): by the time we enter main(), AnchorObjT<123>'s constructor will have been run, setting StaticInt to 123. Note that we do not need to actually run fn() to invoke this constructor. fn()'s role here is just to instantiate the variable template; its constructor is invoked elsewhere.
Printing 123 is the correct, expected behavior.
Note that while the language requires the global object AnchorObjT<123> to exist, the linker may still remove the object because there is no reference to it. Assuming your real program does more with this object, if you need it to exist, you may need to do more to prevent the linker from removing it (e.g. gcc has the used attribute).
"If later versions of compilers manage to detect that fn() will never be called [and] they optimize away template instantiation" then those compilers would be broken.
C++ compilers are free to implement any optimization that has no observable effect. In the situation you have outlined there would be at least one observable effect: namely a static class member does not get constructed and initialized, so a C++ compiler cannot completely optimize that away. It won't happen.
A compiler can ignore everything else about the function call, and not actually compile the function call itself, but the compiler must do whatever it needs to do to make arrangements so that the static class member gets initialized as if that function call was made.
If the compiler can determine that nothing else in the program actually uses the static class member, and removing it completely has no observable effect, then the compiler can remove the static class member, and the function that initializes it (since nothing else references the function).
Note, that even taking an address of a function (or a class member) would result in an observable effect, so even if nothing actually calls the function, but something takes the address of the function, it can't just go away.
P.S. -- all of the above presumes no undefined behavior in the C++ code. With undefined behavior entering the picture, all the rules go out the window.
The short answer is it works.
The long answer is it works unless the linker discards your entire translation unit (.obj).
This can happen when you create a .lib and link it. The linker typically picks which .obj to link from the lib based on a dependency graph of "do I use symbols that obj exports".
So if you use this technique in a cpp file, and that cpp file has no symbols that are used elsewhere in your executable (including indirectly, via other obj files in your lib that are in turn used by the executable), the linker may discard your obj file.
I have experienced this with clang. We were creating self-registering factories, and some were being dropped. To fix it we created some macros that caused a trivial dependency to exist, preventing the obj file from being discarded.
This doesn't contradict the other answers, because the process of linking a lib is about deciding what is and what is not in your program.

Why can't a static constexpr member variable be passed to a function?

The following code produces an undefined reference to 'Test::color'.
#include <iostream>
struct Color{
int r,g,b;
};
void printColor(Color color) {
//printing color
}
class Test {
static constexpr Color color = {242,34,4};
public:
void print(){
printColor(color);
}
};
int main() {
Test test;
test.print();
return 0;
}
Why does this code produce the above error and what is the best way to avoid it, considering I want to use the latest version of the standard, C++17?
Should I define the static member variable, just like it was needed in earlier revisions of the standard (see the first answer here: Undefined reference to static constexpr char[]) or should I just create a new Color structure as can be seen below?
printColor(Color{color.r, color.g, color.b});
Edit:
I'm using CLion on Ubuntu 16.04, which as far as I could find out, uses g++ 5.4 for compiling. I have set it to use C++17 and still get the same error. The error is only present when color is passed to a function.
This is due to the fact that, before C++17, you had to specifically define the static variable outside the class:
class Test {
/* ... etc etc ... */
};
constexpr Color Test::color;
The constexpr-ness of the static member does not let you 'waive' this explicit definition requirement.
With C++17, you no longer need to define static members explicitly. They are implicitly "inline" variables, which get auto-defined at some point, and just once per binary, without you having to take care of it. See here for the long proposal of this feature.
Note that, before C++17, this out-of-class definition must appear in exactly one translation unit (so probably not in a header with class Test that gets included in many places).
The problem was neither with the code itself, nor with the standard being used. CLion's default compiler does not fully support C++17, so that's why it showed a strange behavior that it could compile static constexpr member variables, but only as long as they were not passed to functions.
After updating to the most recent compiler version, I was able to run the code successfully without any changes.
Thank you for all your contribution.

How to make a debuggable file scoped (static?) class in c++?

I define a lot of classes in my cpp files, often with unimaginative names such as 'Implementation', and I am worried about name collisions, as these are horrible to debug (assuming the compiler silently drops one of the definitions). However, I also want to be able to debug my program, so anonymous namespaces are not an option. So, can file-scoped (aka translation-unit-scoped) classes be defined in c++ without using namespaces? If yes, how?
File-scoping in c
In c, one can create a global variable with int foo;, and limit the variable to file-scope using static int foo;. Similarly, file-scoped functions are also created with the static keyword, for example static void bar();. And structs do not define data or executable code, and thus are effectively always file-scoped.
Virtual member functions
Now c++ introduced virtual member functions, which caused structs (aka. classes) to contain data and executable code. Hence structs are now globally defined. However, it is not allowed to use static to make the struct file-scoped again. I.e. static struct Foo{}; does not compile.
Namespaces
To tackle the problem of name collisions, c++ also introduced named namespaces. Those can also be used to wrap the struct, such that it becomes pseudo-file-scoped. Though this still forces you to pick a unique name (that you don't even care about) for every file. As a solution, they also introduced anonymous namespaces. Unfortunately, neither gdb nor the visual studio debugger seem to support anonymous namespaces (it is basically impossible to refer to anything inside the anonymous namespace).
A partial solution is to use anonymous structs/classes and typedef these, as both are file-scoped (C - Limit struct Scope), except that it is impossible to explicitly define constructors or destructors (Constructor for a no-named struct). The following example compiles and runs fine with gcc and also allows gdb to put a breakpoint on method():
interface.h:
struct Interface {
virtual void method()=0;
};
implementation_a.cpp and implementation_b.cpp:
#include "interface.h"
typedef struct : Interface {
virtual void method();
} Implementation;
void Implementation::method() {
// Do stuff...
}
Interface * new_implementation_a() { // or new_implementation_b
return new Implementation();
}
main.cpp:
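#include "interface.h"
// Factory functions defined in implementation_a.cpp and implementation_b.cpp
Interface * new_implementation_a();
Interface * new_implementation_b();
int main() {
Interface * a = new_implementation_a();
Interface * b = new_implementation_b();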
a->method();
b->method();
return 0;
}

Can unused data members be optimized out in C++

I have a C++ class which has a private unused char[] strictly to add padding to the class to prevent false sharing when the class is used in a shared array. My question is 2-fold:
Can this data member be optimized out by the compiler in some circumstances?
How can I silence the private field * not used warnings when I compile with -Wall? Preferably, without explicitly silencing the warning as I still want to catch instances of this issue elsewhere.
I wrote a little test to check for my compiler, and it seems that the member isn't removed, but I want to know if the standards allow this sort of optimization.
padding.cc
#include <iostream>
class A {
public:
int a_ {0};
private:
char padding_[64];
};
int main() {
std::cout << sizeof(A) << std::endl;
return 0;
}
compilation
$ clang++ --version
clang version 3.3 (tags/RELEASE_33/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ clang++ -std=c++11 -O3 -Wall padding.cc
padding.cc:8:8: warning: private field 'padding_' is not used [-Wunused-private-field]
char padding_[64];
^
1 warning generated.
$ ./a.out
68
I don't know about the compiler optimizations, but you can get rid of the warnings in two ways: Either use pragmas:
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wunused-private-field"
class A{
//...
};
#pragma clang diagnostic pop
or, which is probably better suited for you, include a fake friend function in your class:
class A{
friend void i_do_not_exist();
//...
};
In that way, the compiler cannot know if the field is used or not. Therefore, it does not complain and will definitely not throw anything out. This can lead to safety issues if the i_do_not_exist() function is ever defined anywhere, as that function is given direct access to the private members of the class.
A third solution is to define a dummy function which accesses the padding_ member:
class A {
private:
void ignore_padding__() { padding_[0] = 0; }
//...
};
The compiler can perform any change that can't be detected by a conforming program. So the answer is yes. But a compiler that makes changes that make your code worse is a lousy compiler. Odds are, you aren't using a lousy compiler.
I'm pretty sure compilers aren't allowed to reorder or remove data members, so the .h files are self-documenting for anyone writing an API that accepts such a struct. They're only allowed to use simple and well-defined padding rules so developers can easily infer the offsets just from reading the code.
That said, why are you making assumptions on the cache size and the likelihood of false sharing? The cache size should be the compiler's responsibility, and I suspect the real issue is trying to share an array between multiple threads. Update the struct locally on each thread and only write out the changes to the shared array at the end.
How can I silence the "private field * not used" warnings when I compile with -Wall?
First, you might use alignas to avoid manual padding
(if the value is a power of 2):
class alignas(64) A
{
public:
int a_{0};
};
It is not the case of your example :-/
So you might use the [[maybe_unused]] attribute (available since C++17) to silence the warning:
class A
{
public:
int a_ {0};
private:
[[maybe_unused]]char padding_[64];
};