I have a program which nearly immediately finishes with -O0 on gcc, but hangs forever with gcc and -O3. It also exits immediately if I remove the [[gnu::pure]] function attribute, even though the function does not modify global state. The program is in three files:
thread.hpp
#include <atomic>
extern ::std::atomic<bool> stopthread;
extern void threadloop();
[[gnu::pure]] extern int get_value_plus(int x);
thread.cpp
#include <thread>
#include <atomic>
#include "thread.hpp"
namespace {
::std::atomic<int> val;
}
::std::atomic<bool> stopthread;
void threadloop()
{
while (!stopthread.load())
{
++val;
}
}
[[gnu::pure]] int get_value_plus(int x)
{
return val.load() + x;
}
main.cpp
#include <thread>
#include "thread.hpp"
int main()
{
stopthread.store(false);
::std::thread loop(threadloop);
while ((get_value_plus(5) + get_value_plus(5)) % 2 == 0)
;
stopthread.store(true);
loop.join();
return 0;
}
Is this a compiler bug? A lack of documentation for the proper caveats to using [[gnu::pure]]? A misreading of the documentation for [[gnu::pure]] such that I've coded a bug?
I have a program which nearly immediately finishes with -O0 on gcc, but hangs forever with gcc and -O3
Yes, because the program gets compiled down to an infinite loop when optimizations are enabled.
Is this a compiler bug? A lack of documentation for the proper caveats to using [[gnu::pure]]? A misreading of the documentation for [[gnu::pure]] such that I've coded a bug?
It isn't a compiler bug. get_value_plus is not a pure function:
[[gnu::pure]] int get_value_plus(int x)
{
return val.load() + x;
}
since the return value can change at any time (for the same x), because val is expected to be modified by the other thread.
The compiler, however, trusts the attribute and assumes that get_value_plus always returns the same value for the same argument. It therefore performs common subexpression elimination (CSE) and assumes this:
while ((get_value_plus(5) + get_value_plus(5)) % 2 == 0);
can be written as:
int x = get_value_plus(5);
while ((x + x) % 2 == 0);
which is indeed an infinite loop regardless of the value of x:
while (true);
Please see the GCC documentation on pure for more details.
In general, avoid using optimization hints unless they are well understood!
In this case, the misunderstanding is that pure functions are allowed to read global memory, but not if that memory is changed between calls by someone other than the caller:
However, functions declared with the pure attribute can safely read any non-volatile objects, and modify the value of objects in a way that does not affect their return value or the observable state of the program.
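To make the distinction concrete, here is a minimal sketch contrasting a function that must not be marked pure with one that reasonably can be. The names counter and sum_plus are hypothetical and not part of the question's code, and the sketch assumes GCC or Clang for the [[gnu::pure]] attribute.

```cpp
#include <atomic>

namespace {
// Hypothetical stand-in for `val`: another thread may modify it at any time.
std::atomic<int> counter{0};
}

// NOT pure: two back-to-back calls with the same x may return different
// values, so the attribute must be left off.
int get_value_plus(int x)
{
    return counter.load() + x;
}

// Reasonable candidate for pure (assumption: nobody writes to `data` while
// the call runs): the result depends only on the arguments and on memory
// that does not change between consecutive calls.
[[gnu::pure]] int sum_plus(const int* data, int n, int x)
{
    int total = x;
    for (int i = 0; i < n; ++i)
        total += data[i];
    return total;
}
```

With the attribute removed from get_value_plus, the compiler has to perform both calls in the loop condition, so the loop in main can observe a change made by the other thread.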
As it turns out, I misread the documentation. From the online documentation about the pure attribute in gcc:
The pure attribute prohibits a function from modifying the state of the program that is observable by means other than inspecting the function’s return value. However, functions declared with the pure attribute can safely read any non-volatile objects, and modify the value of objects in a way that does not affect their return value or the observable state of the program.
and a different paragraph:
Some common examples of pure functions are strlen or memcmp. Interesting non-pure functions are functions with infinite loops or those depending on volatile memory or other system resource, that may change between consecutive calls (such as the standard C feof function in a multithreading environment).
These two paragraphs make it clear that I've been lying to the compiler, and the function I wrote does not qualify as being 'pure' because it depends on a variable that might change at any time.
The reason I asked this question is because the answers to this question: __attribute__((const)) vs __attribute__((pure)) in GNU C didn't address this problem at all (at the time I asked my question anyway). And a recent C++ Weekly episode had a comment asking about threads and pure functions. So it's clear there's some confusion out there.
So the criterion for a function to qualify for this marker is that it must not modify global state, though it is allowed to read it. But if it does read global state, it must not read any global state that could be considered 'volatile'. This is best understood as state that might change between two immediately successive calls to the function, i.e. if the state it's reading can change in a situation like this:
f();
f();
I have a legacy interface that has a function with a signature that looks like the following:
int provide_values(int &x, int &y)
x and y are considered output parameters in this function. Note: I'm aware of the drawbacks of using output parameters and that there are better design choices for such an interface. I'm not trying to debate the merits of this interface.
Within the implementation of this function, it first checks to see if the addresses of the two output parameters are the same, and returns an error code if they are.
if (&x == &y) {
return -1; // Error: both output parameters are the same variable
}
Is there a way at compile time to prevent callers of this function from providing the same variable for the two output parameters without having such a check within the body of the function? I'm thinking of something similar to the restrict keyword in C, but that only is a signal to the compiler for optimization, and only provides a warning when compiling code that calls such a function with the same pointer.
No, there's not. Keep in mind that the calling code could derive x and y from references returned by arbitrary black-box functions. Even setting that aside, the compiler cannot robustly determine whether two references are bound to the same object: that is determined by the execution of the program, and deciding such run-time properties statically is undecidable in general (a consequence of the halting problem).
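As a small illustration of why this depends on run-time behavior, consider the sketch below. The body of provide_values is a placeholder, and pick is a hypothetical "black box" helper that is not part of the original interface.

```cpp
// Same shape as the legacy interface from the question; the body is a
// placeholder for illustration only.
int provide_values(int& x, int& y)
{
    if (&x == &y) {
        return -1; // Error: both output parameters are the same variable
    }
    x = 1;
    y = 2;
    return 0;
}

// Hypothetical "black box": which object the result refers to depends on
// a value that may only be known at run time.
int& pick(int& a, int& b, bool first)
{
    return first ? a : b;
}
```

A call such as provide_values(a, pick(a, b, flag)) aliases its arguments only when flag is true, so whether the error case is hit cannot be decided just by looking at the call site.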
If all you want to do is prevent the user from calling provide_values(xyz, xyz), you can use a macro as in the following example. However, this won't protect the user from calling provide_values(xyz, reference_to_xyz), so the whole thing is probably pointless anyway.
#include <cstring>
void provide_values(int&, int&) {}
#define PROV_VAL(x, y) if (strcmp((#x),(#y))) { provide_values(x, y); } else { throw -1; }
int main()
{
int x;
int y;
PROV_VAL(x,y);
//PROV_VAL(x,x); // this throws
int& z = x;
PROV_VAL(x,z); // this passes though!
}
Consider the following code:
file_1.hpp:
typedef void (*func_ptr)(void);
func_ptr file1_get_function(void);
file_1.cpp:
// file_1.cpp
#include "file_1.hpp"
static void some_func(void)
{
do_stuff();
}
func_ptr file1_get_function(void)
{
return some_func;
}
file2.cpp
#include "file_1.hpp"
void file2_func(void)
{
func_ptr function_pointer_to_file1 = file1_get_function();
function_pointer_to_file1();
}
While I believe the above example is technically possible - to call a function with internal linkage only via a function pointer, is it bad practice to do so? Could there be some funky compiler optimizations that take place (auto inline, for instance) that would make this situation problematic?
There's no problem, this is fine. In fact, IMHO, it is a good practice which lets your function be called without polluting the space of externally visible symbols.
It would also be appropriate to use this technique in the context of a function lookup table, e.g. a calculator which passes in a string representing an operator name, and expects back a function pointer to the function for doing that operation.
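A minimal sketch of such a lookup table might look like the following. All names here (binary_op, op_table, find_operation and the operator strings) are made up for illustration; only find_operation has external linkage.

```cpp
#include <cstring>

typedef int (*binary_op)(int, int);

// Internal linkage: these names are invisible outside this translation unit.
static int add(int a, int b) { return a + b; }
static int subtract(int a, int b) { return a - b; }
static int multiply(int a, int b) { return a * b; }

struct op_entry {
    const char* name;
    binary_op   fn;
};

static const op_entry op_table[] = {
    {"add", add},
    {"sub", subtract},
    {"mul", multiply},
};

// The only externally visible symbol; returns nullptr for unknown names.
binary_op find_operation(const char* name)
{
    for (const op_entry& entry : op_table) {
        if (std::strcmp(entry.name, name) == 0)
            return entry.fn;
    }
    return nullptr;
}
```

A caller in another file would only need the typedef and the declaration of find_operation, and should check the result for nullptr before calling through it.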
The compiler/linker isn't allowed to make optimizations which break correct code and this is correct code.
Historical note: back in C89, externally visible symbols had to be unique on the first 6 characters; this was relaxed in C99 and also commonly by compiler extension.
In order for this to work, you have to expose some portion of it as external and that's the clue most compilers will need.
Is there a chance that there's a broken compiler out there that will make mincemeat of this strange practice because they didn't foresee someone doing it? I can't answer that.
I can only think of false reasons to want to do this though: fingerprint hiding, which fails because you have to expose it in the function pointer decl, unless you are planning to cast your way around things, in which case the question is "how badly is this going to hurt".
The other reason would be facading callbacks - you have some super-sensitive static local function in module m and you now want to expose the functionality in another module for callback purposes, but you want to audit that so you want a facade:
static void voodoo_function() {
}
fnptr get_voodoo_function(const char* file, int line) {
// you tagged the question as C++, so C++ io it is.
std::cout << "requested voodoo function from " << file << ":" << line << "\n";
return voodoo_function;
}
...
// question tagged as c++, so I'm using c++ syntax
auto* fn = get_voodoo_function(__FILE__, __LINE__);
but that's not really helping much, you really want a wrapper around execution of the function.
At the end of the day, there is a much simpler way to expose a function pointer. Provide an accessor function.
static void voodoo_function() {}
void do_voodoo_function() {
// provide external access to voodoo
voodoo_function();
}
Because here you provide the compiler with an optimization opportunity - when you link, if you specify whole program optimization, it can detect that this is a facade that it can eliminate, because you let it worry about function pointers.
But is there a really compelling reason not to just remove the static from in front of voodoo_function, other than not exposing the internal name for it? And if so, why is the internal name so precious that you would go to these lengths to hide it?
static void ban_account_if_user_is_ugly() {
...;
}
void do_that_thing() {
ban_account_if_user_is_ugly();
}
vs
void do_that_thing() { // ban account if user is ugly
...
}
--- EDIT ---
Conversion. Your function pointer is int(*)(int) but your static function is unsigned int(*)(unsigned int) and you don't want to have to cast it.
Again: Just providing a facade function would solve the problem, and it will transform into a function pointer later. Converting it to a function pointer by hand can only be a stumbling block for the compiler's whole program optimization.
But if you're casting, let's consider this:
// v1
fnptr get_fn_ptr() {
// brute force cast because otherwise it's 'hassle'
return (fnptr)(static_fn);
}
// v2
int facade_fn(int i) {
auto ui = static_cast<unsigned int>(i);
auto result = static_fn(ui);
return static_cast<int>(result);
}
Ok unsigned to signed, not a big deal. And then someone comes along and changes what fnptr needs to be to void(int, float);. One of the above becomes a weird runtime crash and one becomes a compile error.
What if I define main as a reference to function?
#include<iostream>
#include<cstring>
using namespace std;
int main1()
{
cout << "Hello World from main1 function!" << endl;
return 0;
}
int (&main)() = main1;
What will happen? I tested it in an online compiler and got a "Segmentation fault" error.
And under VC++ 2013 it will create a program crashing at run-time!
The compiled program treats the data of the function pointer as code and jumps to it, which crashes immediately on launch.
I would also like an ISO C++ standard quote about this.
The concept will be useful if you want to define either of 2 entry-points depending on some macro like this:
int main1();
int main2();
#ifdef _0_ENTRY
int (&main)() = main1;
#else
int (&main)() = main2;
#endif
That's not a conformant C++ program. C++ requires that (section 3.6.1)
A program shall contain a global function called main
Your program contains a global not-a-function called main, which introduces a name conflict with the main function that is required.
One justification for this would be it allows the hosted environment to, during program startup, make a function call to main. It is not equivalent to the source string main(args) which could be a function call, a function pointer dereference, use of operator() on a function object, or construction of an instance of a type main. Nope, main must be a function.
One additional thing to note is that the C++ Standard never says what the type of main actually is, and prevents you from observing it. So implementations can (and do!) rewrite the signature, for example adding any of int argc, char** argv, char** envp that you have omitted. Clearly it couldn't know to do this for your main1 and main2.
This would be useful if you want to define either of 2 entry-points depending on some macro
No, not really. You should do this:
int main1();
int main2();
#ifdef _0_ENTRY
int main() { return main1(); }
#else
int main() { return main2(); }
#endif
This is soon going to become clearly ill-formed, thanks to the resolution of CWG issue 1886, currently in "tentatively ready" status, which adds, among other things, the following to [basic.start.main]:
A program that declares a variable main at global scope or that
declares the name main with C language linkage (in any namespace) is
ill-formed.
What will happen in practice is highly dependent on the implementation.
In your case your compiler apparently implements that reference as a "pointer in disguise". In addition to that, the pointer has external linkage. I.e. your program exports an external symbol called main, which is actually associated with memory location in data segment occupied by a pointer. The linker, without looking too much into it, records that memory location as the program's entry point.
Later, trying to use that location as an entry point causes segmentation fault. Firstly, there's no meaningful code at that location. Secondly, a mere attempt to pass control to a location inside a data segment might trigger the "data execution protection" mechanisms of your platform.
By doing this you apparently hoped that the reference will get optimized out, i.e. that main will become just another name for main1. In your case it didn't happen. The reference survived as an independent external object.
Looks like you already answered the question about what happens.
As for why: in your code, main is a reference/pointer to a function. That is different from a function, and I would expect a segmentation fault if the code is calling a pointer instead of a function.
Considering the following setup, I have run into quite a strange phenomenon, which I cannot really explain. Using Visual Studio 2005, the following piece of code results in a crash. I would really like to know the reason.
playground.cpp
static int local=-1;
#include "common.h"
int main(int arg)
{
setit();
docastorUpdate();
return 0;
}
common.h
#include <stdio.h>
#include <iostream>
void docastorUpdate();
static int *gemini;
inline void setit()
{
gemini = &local;
}
castor.cpp
static int local = 2;
#include "common.h"
void docastorUpdate() {
setit();
// crashing here, dereferencing a null pointer
std::cout << "castor:" << *gemini << std::endl;
}
The thing is, that the crash disappears when
I move the inline function setit() to an unnamed namespace
I make it static
To put it in a nutshell, I would need help to understand the reasons. Any suggestion is appreciated! (I am aware that this solution is not one of the best practices; I am just curious.)
This breaks because you are violating the one-definition rule. The one-definition rule says that in a program, across all translation units, there is only one definition of any given function. inline is sort of an exception to this rule, and it more or less means "dear compiler, there will be several definitions of this function, but they will all be the same, I promise".
static, when used here for local, means "dear compiler, this is an internal detail that only this translation unit will ever see; please do not confuse it with variables named local from other translation units"
So you promised the compiler that all definitions of setit will be the same, and asked the compiler to give each translation unit its very own local variable.
However, since the setit function uses whatever variable named local is in scope, the end result is two different definitions of setit, each one using a different variable. You just broke your promise. The compiler trusted you and the result was a totally messed up program. It thought it could do certain things with the code based on your promises, but since you broke them behind its back, those things it tried to do with the code didn't work at all.
Your code invokes undefined behaviour, in a very subtle way.
[C++11: 7.1.2/4]: An inline function shall be defined in every translation unit in which it is ODR-used and shall have exactly the same definition in every case. [..]
Although the definition looks the same because it's lexically copy-pasted into each translation unit by your #include, it's not because &local isn't taking the address of the same variable, in each case.
As a result, anything can happen when you run your program, including transferring all your hard-earned savings into my bank account, or taking away all of Jon Skeet's rep.
This is why making the function static solves the problem: it then has internal linkage, so each translation unit gets its own separate setit. Putting it in an unnamed namespace likewise makes it a different function in each translation unit.
You can avoid using static everywhere; in your case you can use extern instead:
common.h:
extern int *gemini;
common.cpp:
int *gemini = nullptr;
Also avoid using local like that; instead, you can do this:
inline void setit(int * p)
{
gemini = p;
}
void docastorUpdate()
{
static int local = 2;
setit(&local);
std::cout << "castor:" << *gemini << std::endl;
}
Let's say you have a function in C/C++, that behaves a certain way the first time it runs. And then, all other times it behaves another way (see below for example). After it runs the first time, the if statement becomes redundant and could be optimized away if speed is important. Is there any way to make this optimization?
bool val = true;
void function1() {
if (val == true) {
// do something
val = false;
}
else {
// do other stuff, val is never set to true again
}
}
gcc has a builtin function that lets you inform the implementation about branch prediction:
__builtin_expect
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
For example in your case:
bool val = true;
void function1()
{
if (__builtin_expect(val, 0)) {
// do something
val = false;
}
else {
// do other stuff, val is never set to true again
}
}
You should only make the change if you're certain that it truly is a bottleneck. With branch-prediction, the if statement is probably instant, since it's a very predictable pattern.
That said, you can use callbacks:
#include <iostream>
using namespace std;
typedef void (*FunPtr) (void);
FunPtr method;
void subsequentRun()
{
std::cout << "subsequent call" << std::endl;
}
void firstRun()
{
std::cout << "first run" << std::endl;
method = subsequentRun;
}
int main()
{
method = firstRun;
method();
method();
method();
}
produces the output:
first run
subsequent call
subsequent call
You could use a function pointer but then it will require an indirect call in any case:
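// Forward declarations so the pointer can be initialized below.
void firstCall();
void otherCalls();
void (*yourFunction)(void) = &firstCall;
void firstCall() {
// ..
// From now on, calls go to otherCalls.
yourFunction = &otherCalls;
}
void otherCalls() {
// ..
}
int main()
{
yourFunction();
}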
One possible method is to compile two different versions of the function (this can be done from a single function in the source with templates), and use a function pointer or object to decide at runtime. However, the pointer overhead will likely outweigh any potential gains unless your function is really expensive.
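A minimal sketch of that idea, assuming a single function template whose bool parameter selects the first-call or steady-state behavior. The names trace, run_impl and run are hypothetical, and trace exists only so the effect can be observed.

```cpp
#include <vector>

// Hypothetical record of which path ran (1 = first call, 2 = later calls).
std::vector<int> trace;

template <bool FirstCall>
void run_impl();

// Callers always go through this pointer; it starts at the first-call version.
void (*run)() = &run_impl<true>;

template <bool FirstCall>
void run_impl()
{
    // FirstCall is a compile-time constant in each instantiation, so the
    // compiler is free to drop the branch that can never be taken.
    if (FirstCall) {
        trace.push_back(1);        // one-time work
        run = &run_impl<false>;    // switch to the steady-state version
    } else {
        trace.push_back(2);        // work for every later call
    }
}
```

Each call through run is still an indirect call, which is the overhead mentioned above, so this is only worth considering when the per-call work is significant.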
You could use a static member variable instead of a global variable.
Or, if the code you're running the first time changes something for all future uses (e.g. opening a file), you could use that change as a check to determine whether or not to run the code (i.e. check if the file is open). This would save you the extra variable. Also, it might help with error checking: if for some reason the initial change is undone by another operation (e.g. the file is on removable media that is removed improperly), your check could try to redo the change.
A compiler can only optimize what is known at compile time.
In your case, the value of val is only known at runtime, so it can't be optimized.
The if test is very quick, you shouldn't worry about optimizing it.
If you'd like to make the code a little bit cleaner you could make the variable local to the function using static:
void function() {
static bool firstRun = true;
if (firstRun) {
firstRun = false;
...
}
else {
...
}
}
On entering the function for the first time, firstRun would be true, and it would persist so each time the function is called, the firstRun variable will be the same instance as the ones before it (and will be false each subsequent time).
This could be used well with @ouah's solution.
Compilers like g++ (and I'm sure msvc) support generating profile data upon a first run, then using that data to better guess what branches are most likely to be followed, and optimizing accordingly. If you're using gcc, look at the -fprofile-generate option.
The expected behavior is that the compiler will optimize that if statement so that the else branch is ordered first, thus avoiding the jmp operation on all your subsequent calls and making it pretty much as fast as if it weren't there, especially if you return somewhere in that else (thus avoiding having to jump past the 'if' statements).
One way to make this optimization is to split the function in two. Instead of:
void function1()
{
if (val == true) {
// do something
val = false;
} else {
// do other stuff
}
}
Do this:
void function1()
{
// do something
}
void function2()
{
// do other stuff
}
One thing you can do is put the logic into the constructor of an object, which is then defined static. If such a static object occurs in a block scope, the constructor is run the first time that an execution of that scope takes place. The once-only check is emitted by the compiler.
You can also put static objects at file scope, and then they are initialized before main is called.
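As a rough sketch of the block-scope variant (OneTimeSetup and init_count are hypothetical names, and the counter is there only to make the effect visible):

```cpp
// Hypothetical counter showing how many times the one-time logic ran.
int init_count = 0;

struct OneTimeSetup {
    OneTimeSetup()
    {
        // "do something": the first-call-only logic goes here.
        ++init_count;
    }
};

int function1()
{
    // Constructed the first time control passes through this declaration;
    // the compiler emits the "already initialized?" check itself.
    static OneTimeSetup setup;

    // "do other stuff": runs on every call.
    return init_count;
}
```

Note that the hidden check does not disappear; it is simply generated by the compiler instead of being written by hand. Since C++11 this initialization is also thread-safe.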
I'm giving this answer because perhaps you're not making effective use of C++ classes.
(Regarding C/C++, there is no such language. There is C and there is C++. Are you working in C that has to also compile as C++ (sometimes called, unofficially, "Clean C"), or are you really working in C++?)
What is "Clean C" and how does it differ from standard C?
To remain compiler-independent, you can put the body of the if branch in one function and the body of the else branch in another. Almost all compilers optimize if/else, and the branch most likely to be taken here is the else. So keep the occasionally executed code in the if branch, and put the rest in a separate function that is called from the else branch.