To the best of my knowledge, g++ with optimization turned on will remove the function call to bar entirely in the following case:
int bar() { }
int foo() { bar(); }
However, consider the following two cases, with bar defined as above:
Case 1:
int foo(int a, int b) {
if (a > b) bar();
}
Case 2:
int foo() { bar(); }
int foo2() {foo(); }
In Case 1, will the if statement also be removed, since it executes dead code even if the condition is true?
In Case 2, will the call to foo inside foo2 be removed?
Following the suggestions in the comments, I tried this myself and it appears that the empty function calls are indeed removed recursively and completely in both cases I described, at least for gcc 4.8.1 and g++ 4.8.1.
I compiled the following two programs, first with gcc -S and then with gcc -S -O2.
Program 1:
int bar() { }
int foo() { bar(); }
int main() {
foo();
}
Program 2:
int bar() { }
int foo(int a, int b) {
if (a > b) bar();
}
int main() {
foo(2,1);
}
I also tried with foo's arguments passed in from the command line, to make sure the removal was not because of constants passed to foo.
int main(int argc, char** argv) {
foo(argc,1);
}
The compiler doesn't remove entire functions, the linker does.
If you're building an executable (or the functions are not exported out of a library), then the linker will remove all the orphan functions. If it doesn't, it's a bug :).
BTW, storing the address of a function (e.g. in a variable or passing it to another function) guarantees that the function will stay.
Edit
Just, to be clear, the optimizing compiler will inline functions, effectively removing the function calls where it deems necessary. In the above case (super simple functions) there's no doubt it will inline them. BTW, STL implementations (and boost) depend on this feature heavily.
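As a rough illustration of the inlining described above, here is a minimal sketch. The names add_one, add_two and compute are hypothetical and not from the question. With optimization enabled, a compiler will typically inline both helpers so that compute() reduces to returning a constant, though the language does not guarantee it.

```cpp
// Hypothetical helpers, small enough that an optimizer will usually inline them.
static inline int add_one(int v) { return v + 1; }
static inline int add_two(int v) { return add_one(add_one(v)); }

// At -O2 this usually compiles to the equivalent of "return 42;"
// (an expectation about common compilers, not a language guarantee).
int compute() { return add_two(40); }
```

Comparing the output of g++ -S with and without -O2, as done in the question, is the way to check what a particular compiler actually does.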
Related
As I understand it (https://learn.microsoft.com/en-us/cpp/cpp/noalias?view=vs-2019), __declspec(noalias) means that the function only modifies memory inside its body or through its parameters, so it's not modifying static variables or memory through double pointers. Is that correct?
static int g = 3;
class Test
{
int x;
Test& __declspec(noalias) operator +(const int b) //is noalias correct?
{
x += b;
return *this;
}
void __declspec(noalias) test2(int& x) { //correct here?
x = 3;
}
void __declspec(noalias) test3(int** x) { //not correct here!?
**x = 5;
}
};
Given something like:
extern int x;
extern int bar(void);
int foo(void)
{
if (x)
bar();
return x;
}
a compiler that knows nothing about bar() would need to generate code that allows for the possibility that it might change the value of x, and would thus have to load the value of x both before and after the function call. While some systems use so-called "link time optimization" to defer code generation of a function until after any functions it calls have been analyzed to see what external objects, if any, they might access, MS uses a simpler approach of allowing function prototypes to say that they don't access any outside objects which the calling code might want to cache. This is a crude approach, but it allows compilers to reap low-hanging fruit cheaply and easily.
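To make the reload concrete, here is a small sketch in standard C++. It assumes, purely for illustration, a bar() that really does modify x; in the scenario above the compiler cannot see bar()'s body and has to assume this could happen.

```cpp
int x = 1;  // global that other code may modify

// Stand-in for an external function the caller cannot see into.
// This hypothetical version really does modify x.
int bar() { x = 0; return 0; }

int foo() {
    if (x)      // first load of x
        bar();  // may change x
    return x;   // second load: a value cached before the call could be stale
}
```

Here foo() returns 0 even though x was 1 when it was tested. If the compiler had reused the earlier value, it would return 1, which is why it may only skip the second load when it knows the callee does not touch x.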
Preamble
I'm using avr-g++ for programming AVR microcontrollers and therefore I always need to get very efficient code.
GCC can usually optimize a function if its arguments are compile-time constants. For example, I have a function pin_write(uint8_t pin, bool val) which determines the AVR registers for the pin (using my own map from an integer pin to a port/pin pair) and writes the corresponding values to these registers. This function isn't very small because of its generality. But if I call it with a compile-time constant pin and val, GCC can do all the calculations at compile time and reduce the call to a couple of AVR instructions, e.g.
sbi PORTB,1
sbi DDRB,1
Amble
Let's write code like this:
class A {
int x;
public:
A(int x_): x(x_) {}
void foo() { pin_write(x, 1); }
};
A a(8);
int main() {
a.foo();
}
We have only one object of class A and it's initialized with a constant (8). So, it's possible to make all calculations at compile-time:
foo() -> pin_write(x,1) -> pin_write(8,1) -> a couple of asm instructions
But GCC doesn't do so.
Surprisingly, if I remove the global A a(8) and write just
A(8).foo()
I get exactly what I want:
00000022 <main>:
22: c0 9a sbi 0x18, 0 ; 24
24: b8 9a sbi 0x17, 0 ; 23
Question
So, is there a way to force GCC to make all possible calculations at compile time for single global objects with constant initializers?
Because of this trouble I have to manually expand such cases and replace my original code with this:
const int x = 8;
class A {
public:
A() {}
void foo() { pin_write(x, 1); }
};
UPD. It's very strange: A(8).foo() inside main is optimized to 2 asm instructions. A a(8); a.foo() too! But if I declare A a(8) as a global, the compiler produces big general code. I tried adding static, but it didn't help. Why?
But if I declare A a(8) as a global, the compiler produces big general code. I tried adding static, but it didn't help. Why?
In my experience, gcc is very reluctant if the object / function has external linkage. Since we don't have your code to compile, I made a slightly modified version of your code:
#include <cstdio>
class A {
int x;
public:
A(int x_): x(x_) {}
int f() { return x*x; }
};
A a(8);
int main() {
printf("%d", a.f());
}
I have found 2 ways to make the generated assembly correspond to this:
int main() {
printf("%d", 64);
}
In words: to eliminate everything at compile time so that only the necessary minimum remains.
One way to achieve this with both clang and gcc is:
#include <cstdio>
class A {
int x;
public:
constexpr A(int x_): x(x_) {}
constexpr int f() const { return x*x; }
};
constexpr A a(8);
int main() {
printf("%d", a.f());
}
gcc 4.7.2 already eliminates everything at -O1, clang 3.5 trunk needs -O2.
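With the constexpr version, the compile-time evaluation can also be checked in the source rather than by reading assembly. The following sketch repeats the class above and adds a static_assert, which is my addition for illustration and not part of the original test:

```cpp
class A {
    int x;
public:
    constexpr A(int x_) : x(x_) {}
    constexpr int f() const { return x * x; }
};

constexpr A a(8);

// Compiles only if a.f() is a constant expression equal to 64.
static_assert(a.f() == 64, "a.f() is evaluated at compile time");
```

This only proves that the expression can be evaluated at compile time. Whether a call such as printf("%d", a.f()) is folded in the generated code still depends on the optimizer.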
Another way to achieve this is:
#include <cstdio>
class A {
int x;
public:
A(int x_): x(x_) {}
int f() const { return x*x; }
};
static const A a(8);
int main() {
printf("%d", a.f());
}
It only works with clang at -O3. Apparently the constant folding in gcc is not that aggressive. (As clang shows, it can be done but gcc 4.7.2 did not implement it.)
You can force the compiler to fully optimize the function with all known constants by changing the pin_write function into a template. I don't know if the particular behavior is guaranteed by the standard though.
template< int a, int b >
void pin_write() { some_instructions; }
This will probably require fixing all lines where pin_write is used.
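A slightly fuller sketch of the template idea follows. Everything in it is an assumption made for illustration: the fake_ports array stands in for the real AVR registers, and the pin_map layout (pins 0-7 on port 0, pins 8-15 on port 1) is hypothetical, not the asker's actual mapping.

```cpp
#include <cstdint>

// Stand-in for hardware registers (hypothetical; real code would use PORTx/DDRx).
std::uint8_t fake_ports[2] = {0, 0};

// Hypothetical pin map: pins 0-7 live on port 0, pins 8-15 on port 1.
template <int Pin>
struct pin_map {
    static_assert(Pin >= 0 && Pin < 16, "pin out of range");
    static constexpr int port = Pin / 8;
    static constexpr std::uint8_t mask =
        static_cast<std::uint8_t>(1u << (Pin % 8));
};

// Pin and value are template arguments, so the port index and bit mask
// are compile-time constants in every instantiation.
template <int Pin, bool Val>
void pin_write() {
    if (Val)
        fake_ports[pin_map<Pin>::port] |= pin_map<Pin>::mask;
    else
        fake_ports[pin_map<Pin>::port] &=
            static_cast<std::uint8_t>(~pin_map<Pin>::mask);
}
```

A call then looks like pin_write<9, true>(). The price is that the pin can no longer be chosen at run time.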
Additionally, you can declare the function as inline. The compiler isn't guaranteed to inline the function (the inline keyword is just a hint), but if it does, it has a greater chance to optimize compile-time constants away (assuming the compiler can know the value is a compile-time constant, which may not always be the case).
Your a has external linkage, so the compiler can't be sure that there isn't other code somewhere modifying it.
If you were to declare a const then you make clear it shouldn't change, and also stop it having external linkage; both of those should help the compiler to be less pessimistic.
(I'd probably declare x const too - it may not help here, but if nothing else it makes it clear to the compiler and the next reader of the code that you never change it.)
My tentative answer is no, as observed by the following test code:
#include <functional>
#include <iostream>
#include <string>
#include <vector>
using namespace std;
void TestFunc (void);
int TestFuncHelper (vector<int>&, int, int);
int main (int argc, char* argv[]) {
TestFunc ();
return 0;
} // End main ()
void TestFunc (void) {
// Recursive lambda
function<int (vector<int>&, int, int)> r = [&] (vector<int>& v_, int d_, int a_) {
if (d_ == v_.size ()) return a_;
else return r (v_, d_ + 1, a_ + v_.at (d_));
};
int UpperLimit = 100000; // Change this value to possibly observe different behaviour
vector<int> v;
for (auto i = 1; i <= UpperLimit; i++) v.push_back (i);
// cout << TestFuncHelper (v, 0, 0) << endl; // Uncomment this, and the programme works
// cout << r (v, 0, 0) << endl; // Uncomment this, and we have this web site
} // End Test ()
int TestFuncHelper (vector<int>& v_, int d_, int a_) {
if (d_ == v_.size ()) return a_;
else return TestFuncHelper (v_, d_ + 1, a_ + v_.at (d_));
} // End TestHelper ()
Is there a way to force the compiler to optimise recursive tail calls in lambdas?
Thanks in advance for your help.
EDIT
I just wanted to clarify that I meant to ask if C++11 optimizes recursive tail calls in lambdas. I am using Visual Studio 2012, but I could switch environments if it is absolutely known that GCC does the desired optimization.
You are not actually doing a tail call in the "lambda" code, at least not directly. std::function is a polymorphic function wrapper, meaning it can store any kind of callable entity. A lambda in C++ has a unique, unnamed class type and is not a std::function object; it can merely be stored in one.
Since std::function uses type erasure, it has to jump through several hoops to call the thing that was originally passed to it. These hoops are commonly implemented with either virtual functions or function pointers to function template specializations and void*.
The very nature of indirection makes it hard for optimizers to see through it. In the same vein, it's very hard for a compiler to see through std::function and decide whether you have a tail-recursive call.
Another problem is that r may be changed from within r or concurrently, since it's a simple variable, and suddenly you don't have a recursive call anymore! With function identifiers, that's just not possible, they can't change meanings mid-way.
I just wanted to clarify that I meant to ask if C++11 optimizes recursive tail calls in lambdas.
The C++11 standard describes how a working program on an abstract machine behaves, not how the compiler optimizes stuff. In fact, the compiler is only allowed to optimize things if it doesn't change the observable behaviour of the program (with copy-elision/(N)RVO being the exception).
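Since the optimization cannot be relied on, one option is to not depend on it at all. The sketch below is not a way to force tail-call elimination; it rewrites the accumulation from the question as a plain loop. The helper name sum_from is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the same accumulation as the recursive versions,
// written as a loop, so stack depth stays constant whatever the optimizer does.
int sum_from(const std::vector<int>& v, std::size_t d, int a) {
    for (; d != v.size(); ++d)
        a += v[d];
    return a;
}
```

Calling sum_from(v, 0, 0) corresponds to r(v, 0, 0) in the question, without the std::function indirection or the deep recursion.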
I'm considering a solution where I would like to initialize a cell of an array that is defined in another module (there will be many modules initializing one table). The array won't be read before main runs (so there is no problem with static initialization order).
My approach:
/* secondary module */
extern int i[10]; // the array
const struct Initialize {
Initialize() { i[0] = 12345; }
} init;
/* main module */
#include <stdio.h>
int i[10];
int main()
{
printf("%d\n", i[0]); // check if the value is initialized
}
Compiler won't strip out init constant because constructor has side effects. Am I right? Is the mechanism OK? On GCC (-O3) everything is fine.
//EDIT
In the real world there will be many modules. I want to avoid an extra module, a central place that would gather all the minor initialization routines (for better scalability). So it is important that each module triggers its own initialization.
This works with MSVC compilers but not with GNU C++ (at least for me). The GNU linker will strip all the symbols not used outside your compilation unit. I know only one way to guarantee such initialization: the "init once" idiom. For example:
init_once.h:
template <typename T>
class InitOnce
{
T *instance;
static unsigned refs;
public:
InitOnce() {
if (!refs++) {
instance = new T();
}
}
~InitOnce() {
if (!--refs) {
delete instance;
}
}
};
template <typename T> unsigned InitOnce<T>::refs(0);
unit.h:
#include "init_once.h"
class Init : public InitOnce<Init>
{
public:
Init();
~Init();
};
static Init module_init_;
secondary.cpp:
#include "unit.h"
extern int i[10]; // the array
Init::Init()
{
i[0] = 12345;
}
...
I don't think you want the extern int i[10]; in your main module, though, adf88.
EDIT
/*secondary module (secondary.cpp) */
int i[10];
void func()
{
i[0]=1;
}
.
/*main module (main.cpp)*/
#include<iostream>
extern int i[];
void func();
int main()
{
func();
std::cout<<i[0]; //prints 1
}
Compile, link and create an executable using g++ secondary.cpp main.cpp -o myfile
In general, constructors are used (and should be used) only for initializing members of a class.
This might work, but it's dangerous. Globals/statics construction order within a single module is undefined, and so is module loading order (unless you're managing it explicitly). For example, you assume that during secondary.c Initialize() ctor run, i is already present. You'd have to be very careful not to have two modules initialize the same common data, or have two modules carry out initializations with overlapping side effects.
I think a cleaner design to tackle such a need is to have the owner of the common data (your main module) expose it as a global singleton, with an interface to carry out whichever data initializations needed. You'd have a central place to control init-order, and maybe even control concurrent access (using critical sections or other concurrency primitives). Along the lines of your simplified example, that might be -
/* main module (main.c) */
#include <stdio.h>
class CommonDat
{
int i;
public:
const int GetI() { return i;}
void SetI(int newI) { i = newI; }
void incI()
{
AcquireSomeLock();
i++;
ReleaseTheLock();
}
};
CommonDat g_CommonDat;
CommonDat* getCommonDat() { return &g_CommonDat; }
int main(void)
{
printf("%d",getCommonDat()->GetI());
}
It's also preferable to have the secondary modules call these interfaces at controlled times in runtime (and not during the global c'tors pass).
(NOTE: you named the files as C files, but tagged the question as c++. The suggested code is c++, of course).
May I ask why you use an array (running the risk of getting out of bounds) when you could use a std::vector ?
std::vector<int>& globalArray()
{
static std::vector<int> V;
return V;
}
bool const push_back(std::vector<int>& vec, int v)
{
vec.push_back(v);
return true; // dummy return for static init
}
This array is lazily initialized on the first call to the function.
You can use it like such:
// module1.cpp
static bool const dummy = push_back(globalArray(), 1);
// module2.cpp
static bool const dummy = push_back(globalArray(), 2);
It seems much easier and less error-prone. It's not multithread compliant until C++0x though.
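For a quick experiment, the same pattern can be squeezed into a single file. In this sketch the names registry and register_value are hypothetical, and the two namespaces merely stand in for the separate module1.cpp and module2.cpp:

```cpp
#include <vector>

// Function-local static: constructed on first use, so it exists before
// any registration runs, whatever the initialization order is.
std::vector<int>& registry() {
    static std::vector<int> v;
    return v;
}

// Returns a dummy value so the call can be used as a static initializer.
bool register_value(int x) {
    registry().push_back(x);
    return true;
}

namespace module1 { static const bool registered = register_value(1); }
namespace module2 { static const bool registered = register_value(2); }
```

Within one translation unit the registrations run in declaration order. Across real separate modules the relative order is unspecified, so the code should not depend on it.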
Could someone please tell me if this is possible in C or C++?
void fun_a();
//int fun_b();
...
main(){
...
fun_a();
...
int fun_b(){
...
}
...
}
or something similar, as e.g. a class inside a function?
thanks for your replies,
Wow, I'm surprised nobody has said yes! Free functions cannot be nested, but functors and classes in general can.
void fun_a();
//int fun_b();
...
main(){
...
fun_a();
...
struct { int operator()() {
...
} } fun_b;
int q = fun_b();
...
}
You can give the functor a constructor and pass references to local variables to connect it to the local scope. Otherwise, it can access other local types and static variables. Local classes can't be arguments to templates in C++03, though (C++11 lifts this restriction).
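A minimal sketch of a local functor connected to the enclosing scope through its constructor could look like this. The names Adder and sum_with_local_functor are made up for the example:

```cpp
int sum_with_local_functor() {
    int total = 0;

    // Local class: visible only inside this function.
    struct Adder {
        int& total_;  // reference to the enclosing function's local variable
        explicit Adder(int& t) : total_(t) {}
        void operator()(int v) { total_ += v; }
    };

    Adder add(total);
    add(2);
    add(40);
    return total;  // updated through the functor
}
```

The functor behaves like a nested function that can read and write total, at the cost of spelling out the captured state by hand.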
C++ does not support nested functions, however you can use something like boost::lambda.
C — Yes for gcc as an extension.
C++ — No.
You can't create a function inside another function in C++.
You can however create a local class functor:
int foo()
{
class bar
{
public:
int operator()()
{
return 42;
}
};
bar b;
return b();
}
In C++0x you can create a lambda expression:
int foo()
{
auto bar = []()->int{return 42;};
return bar();
}
No, but in C++0x you can use lambda functions (http://en.wikipedia.org/wiki/C%2B%2B0x#Lambda_functions_and_expressions), which may take another few years to be fully supported. The standard is not complete at the time of this writing.
-edit-
Yes
If you can use MSVC 2010. I ran the code below with success
void test()
{
[]() { cout << "Hello function\n"; }();
auto fn = [](int x) -> int { cout << "Hello function (" << x << " :))\n"; return x+1; };
auto v = fn(2);
fn(v);
}
output
Hello function
Hello function (2 :))
Hello function (3 :))
(I wrote >> c:\dev\loc\uniqueName.txt in the project working arguments section and copy pasted this result)
The term you're looking for is nested function. Neither standard C nor C++ allow nested functions, but GNU C allows it as an extension. Here is a good wikipedia article on the subject.
Clang/Apple are working on 'blocks', anonymous functions in C! :-D
^ ( void ) { printf("hello world\n"); }
info here and spec here, and ars technica has a bit on it
No, and there's at least one reason why it would complicate matters to allow it. Nested functions are typically expected to have access to the enclosing scope. This makes it so the "stack" can no longer be represented with a stack data structure. Instead a full tree is needed.
Consider the following code that does actually compile in gcc as KennyTM suggests.
#include <stdio.h>
typedef double (*retdouble)();
retdouble wrapper(double a) {
double square() { return a * a; }
return square;
}
int use_stack_frame(double b) {
return (int)b;
}
int main(int argc, char** argv) {
retdouble square = wrapper(3);
printf("expect 9 actual %f\n", square());
printf("expect 3 actual %d\n", use_stack_frame(3));
printf("expect 16 actual %f\n", wrapper(4)());
printf("expect 9 actual %f\n", square());
return 0;
}
I've placed what most people would expect to be printed, but in fact, this gets printed:
expect 9 actual 9.000000
expect 3 actual 3
expect 16 actual 16.000000
expect 9 actual 16.000000
Notice that the last line calls the "square" function, but the "a" value it accesses was modified during the wrapper(4) call. This is because a separate "stack" frame is not created for every invocation of "wrapper".
Note that these kinds of nested functions are actually quite common in other languages that support them like lisp and python (and even recent versions of Matlab). They lead to some very powerful functional programming capabilities, but they preclude the use of a stack for holding local scope frames.
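For comparison, here is how the same wrapper might be written with a C++11 lambda. This is a sketch, not a drop-in replacement for the GNU C code above. The lambda captures a by value, so each closure carries its own copy instead of pointing into a stack frame that no longer exists.

```cpp
#include <functional>

// Returns a closure that owns its own copy of a (captured by value),
// so it stays valid after wrapper() has returned.
std::function<double()> wrapper(double a) {
    return [a]() { return a * a; };
}
```

With this version, calling wrapper(4) after creating square = wrapper(3) does not disturb square, which still returns 9.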
void foo()
{
class local_to_foo
{
public: static void another_foo()
{ printf("whatevs"); }
};
local_to_foo::another_foo();
}
Or lambdas in C++0x.
You can nest a local class within a function, in which case the class will only be accessible to that function. You could then write your nested function as a member of the local class:
#include <iostream>
int f()
{
class G
{
public:
int operator()()
{
return 1;
}
} g;
return g();
}
int main()
{
std::cout << f() << std::endl;
}
Keep in mind, though, that in C++03 you can't pass a function object of a local class type to an STL algorithm such as sort() (C++11 lifts this restriction).
int f()
{
class G
{
public:
bool operator()(int i, int j)
{
return false;
}
} g;
std::vector<int> v;
std::sort(v.begin(), v.end(), g); // Fails to compile
}
The error that you would get from gcc is "test.cpp:18: error: no matching function for call to `sort(__gnu_cxx::__normal_iterator > >, __gnu_cxx::__normal_iterator > >, f()::G&)'".
It is not possible to declare a function within a function. You may, however, declare a function within a namespace or within a class in C++.
Not in standard C, but gcc supports them as an extension. See the gcc online manual.
Though C and C++ both prohibit nested functions, a few compilers support them anyway (e.g., if memory serves, gcc can, at least with the right flags). A nested functor is a lot more portable though.
No nested functions in C/C++, unfortunately.
As other answers have mentioned, standard C and C++ do not permit you to define nested functions. (Some compilers might allow it as an extension, but I can't say I've seen it used).
You can declare another function inside a function so that it can be called, but the definition of that function must exist outside the current function:
#include <stdlib.h>
#include <stdio.h>
int main( int argc, char* argv[])
{
int foo(int x);
/*
int bar(int x) { // this can't be done
return x;
}
*/
int a = 3;
printf( "%d\n", foo(a));
return 0;
}
int foo( int x)
{
return x+1;
}
A function declaration without an explicit linkage specifier has external linkage. So while the declaration of the name foo in function main() is scoped to main(), it will link to the foo() function that is defined later in the file (or in another file if that's where foo() is defined).