I don't really understand what #pragma does, and I have a question.
If I run my program with the -O2 flag and there is #pragma optimize("O3") in my code, will it use O3 or O2 optimization?
Speaking from an MSVC standpoint, #pragma optimize generally behaves independently of compiler flags. From the point where you write the pragma onward, your compiler will apply the optimizations you list, if you use the on parameter, like so: #pragma optimize("...", on). One caveat: MSVC's optimization-list is made of single-letter codes (g, s, t, y), so "O3" is not actually a valid argument to this particular pragma.
As stated in the MS docs...
The optimize pragma must appear outside a function and takes effect at the first function defined after the pragma is seen. The on and off arguments turn options specified in the optimization-list on or off.
Gleaned from https://msdn.microsoft.com/en-us/library/chh3fb0k.aspx
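For illustration, here is a minimal sketch of how the MSVC pragma is meant to be used. The option letters come from the MS docs; the function itself is invented for the example:

#pragma optimize("t", on)     // favor fast code ("t") from the next function onward
int hot_path(int x) { return x * x; }
#pragma optimize("", on)      // an empty list restores the command-line settings

The empty-string form at the end is the documented way to return to whatever optimizations were specified on the command line.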
Related
I consistently run into compiler errors where I forget to put the opening brace for a #pragma omp critical section on the line following the statement, instead of on the same line:
#pragma omp parallel
{
    static int i = 0;

    // this code fails to compile
    #pragma omp critical {
        i++;
    }

    // this code compiles fine
    #pragma omp critical
    {
        i++;
    }
}
My question is, why can't the compiler parse the braces on the same line? It can do this for any C++ syntax. Why should white space matter for the OpenMP #pragma statements, when it does not in C++?
According to cppreference:
The preprocessing directives control the behavior of the preprocessor. Each directive occupies one line and has the following format:
# character
preprocessing instruction (one of define, undef, include, if, ifdef, ifndef, else, elif, endif, line, error, pragma)
arguments (depends on the instruction)
line break.
The null directive (# followed by a line break) is allowed and has no effect.
So the preprocessor, not the compiler, reads the entire line as a directive; the compiler, by contrast, does not care about line breaks.
Source: http://en.cppreference.com/w/cpp/preprocessor
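One detail worth adding (my own illustration, not part of the quote): "one line" here means one logical line. Backslash-newline splicing happens before directives are parsed, so a directive may still span several physical lines:

#define SQUARE(x) \
    ((x) * (x))   /* one logical line after splicing: still a single directive */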
Because it does.
To say that whitespace "never matters" in any C++ construct is foolhardy. For example, the following pieces of code are not the same, and I don't believe that anyone would expect them to be:
1.
int x = 42;
2.
intx=42;
It is closer to the truth to say that newlines and space characters are generally treated in the same way, but even that's not quite right. For example:
3.
void foo() // a comment
{
4.
void foo() // a comment {
Of course, in this case, the reason the snippets aren't the same is because // takes effect until the end of the line.
But so does #.
Both constructs are resolved by the preprocessor, not by the compiler, and the preprocessor works in lines. It is not until later in the build process that more complex parsing takes place. This is logical, consistent, predictable, and practical. All syntax highlighters expect it to work that way.
Could the preprocessor be modified to treat a { at the end of a preprocessor directive as if it were written on the next line? Sure, probably. But it won't be.
Thinking purely about this actual example: the range of acceptable parameters to a #pragma is implementation-defined (in fact, this is the whole point of the #pragma directive), so it is literally not possible for the C++ standard to define a more complex set of semantics for it than "use the whole line, whatever's provided". And, without the C++ standard guiding it, such logic would potentially result in the same source code meaning completely different things on different compilers. No thanks!
My question is, why can't the compiler parse the braces on the same line? It can do this for any C++ syntax. Why should white space matter for the OpenMP #pragma statements, when it does not in C++?
There are two standards that define what the compiler can do: the C/C++ language standard and the OpenMP specification. The C/C++ OpenMP specification says (chapter 2, lines 10-11; section 2.1, line 7):
In C/C++, OpenMP directives are specified by using the #pragma mechanism provided by the C and C++ standards.
2.1 Directive Format
The syntax of an OpenMP directive is as follows:
#pragma omp directive-name [clause[ [,] clause] ... ] new-line
So, the new-line is required by OpenMP, and by the syntax of #pragma in the C and C++ standards (while OpenMP directives look like preprocessor directives, they are actually directives to the compiler).
But if you want to use OpenMP pragmas in places where newlines are prohibited (inside a macro definition, say), or you want to place the { on the same line, there is sometimes an alternative: _Pragma (from the C99 and C++11 standards) or the non-standard, MS-specific __pragma. See Difference between #pragma and _Pragma() in C and https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html
_Pragma("omp parallel for")
_Pragma("omp critical")
This variant may work in some C++ compilers and may not work in others; it also depends on the language options of the compilation process.
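As a hedged sketch of the macro case (the macro name OMP_CRITICAL is made up for the example), _Pragma is an operator rather than a directive, so it can appear inside a macro replacement list where # cannot:

#define OMP_CRITICAL _Pragma("omp critical")

static int counter = 0;

void bump(void) {
    OMP_CRITICAL          // expands to #pragma omp critical
    {
        counter++;
    }
}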
Let's imagine a blah.h header file that contains:
// A declaration without any code. We force inline
__attribute__((always_inline)) void inline_func();
And a blah.cpp source file that contains:
#include "blah.h"
// The code of the inline function
void inline_func() {
    ...
}

// Use the inline function
void foo() {
    inline_func();
}
The question is, will the compiler actually inline inline_func()? Does the code have to live with the declaration, or can they be separate?
Assume no LTO
Note the (GCC) force inline decoration in inline_func()
Inlining is a two-step process:
* Is it possible?
* Is it worthwhile?
The first step is fairly trivially decided by the compiler, the second is a far more complex heuristic. Thus it makes sense to only consider the benefits of possible optimizations.
always_inline means that the second step is ignored. It does not affect the first consideration. Now, you've also stated that LTO is disabled, which means that the first consideration, the ability to inline, is restricted. This shows that LTO and always_inline are pretty unrelated, since they affect two different inlining considerations.
Not that LTO matters for your example anyway. The two functions under consideration are in the same Translation Unit. There appear to be no other restrictions such as recursion, library calls, or other observable side effects. That means it should be possible to inline, and since that's the only consideration, it should be inlined.
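To make that concrete, here is a minimal sketch (GCC syntax; the function names are invented). In my experience GCC honors always_inline even at -O0, and pairing the attribute with the inline keyword avoids a "might not be inlinable" warning:

__attribute__((always_inline)) inline int add_one(int x) {
    return x + 1;
}

int caller(int x) {
    return add_one(x);   // expected to be expanded in place, even at -O0
}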
You need to have the body available at the time the inlining is supposed to happen.
I.e. if you have the following files:
inlineFunc.h
inlineFunc.c
main.c
And you compile with:
compile inlineFunc.c
compile main.c
link inlineFunc.o main.o yourCoolProgram
there is no way that inlineFunc gets inlined in main.c; however, calls to inlineFunc within inlineFunc.c can be inlined.
As Paolo mentioned, inline is only a hint to the compiler; however, some compilers also have ways to force the inlining, i.e. for gcc you can use __attribute__((always_inline)). Take a look here for a discussion on how gcc handles inlining.
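A common way around this, sketched under the same file layout (no LTO): put the definition in the header, so every translation unit that includes it has the body available. The body shown is invented for the example:

/* inlineFunc.h */
static inline int inlineFunc(int x) {
    return x * 2;   /* body visible in every TU that includes this header */
}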
An interesting side note:
The warning is issued in this case as the definition of foo() is not
available when main() is compiled. However, with -O2 or better
optimizations, gcc performs a kind of "backward inlining", meaning
that even function definitions that are further ahead in the source
file can be embedded into a caller. Consequently, the warning
disappears as soon as one uses at least -O2 optimizations. Is there
any specific option responsible for this behavior? I would like to
enable "backward inlining" even with -O1 or -O0.
Well, it depends. In your example, it will be inlined, because the definition of the function is in the same translation unit where it is used.
Otherwise, if no LTO is possible, and at compile time the definition of the function is not available to the compiler, then no, the function will not be inlined.
Prior answer
The answer is: it depends. It depends on the compiler, and it may also depend on the compiler's configuration(1)(2).
See also inline description at cppreference.com (quoted below):
The intent of the inline keyword is to serve as an indicator to the optimizer that inline substitution of the function is preferred over function call, that is, instead of executing the call CPU instruction to transfer control to the function body, a copy of the function body is executed without generating the call. This avoids extra overhead created by the function call (copying the arguments and retrieving the result) but it may result in a larger executable as the code for the function has to be repeated multiple times.
Since this meaning of the keyword inline is non-binding, compilers are free to use inline substitution for any function that's not marked inline, and are free to generate function calls to any function marked inline. Those choices do not change the rules regarding multiple definitions and shared statics listed above.
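In other words (a toy example of my own, not from the quoted page), neither of these definitions binds the compiler:

inline int twice(int x) { return x + x; }   // may still be called out of line
int thrice(int x) { return 3 * x; }         // may still be inlined anyway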
How portable is code that uses #pragma optimize? Do most compilers support it and how complete is the support for this #pragma?
#pragma is the sanctioned and portable way for compilers to add non-sanctioned and non-portable language extensions *.
Basically, you never know for sure, and at least one major C++ compiler (g++) does not support this pragma as is.
*:
From the C++ standard (N3242):
16.6 Pragma directive [cpp.pragma]
A preprocessing directive of the form
# pragma pp-tokens[opt] new-line
causes the implementation to behave in an implementation-defined manner. The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner. Any pragma that is not recognized by the implementation is ignored.
From the C standard (Committee Draft — April 12, 2011):
6.10.6 Pragma directive
Semantics
A preprocessing directive of the form
# pragma pp-tokens[opt] new-line
where the preprocessing token STDC does not immediately follow pragma in the directive (prior to any macro replacement) causes the implementation to behave in an implementation-defined manner. The behavior might cause translation to fail or cause the translator or the resulting program to behave in a non-conforming manner. Any such pragma that is not recognized by the implementation is ignored.
And here's an example:
int main () {
    #pragma omp parallel for
    for (int i = 0; i < 16; ++i) {}
}
A big part of the C and C++ OpenMP API is implemented as #pragmas.
It is often not a good idea to rely on this, since each compiler has its own behaviour. A pragma like this injects a compilation-level setting into your source code. Normally, and in theory, a compiler should simply ignore such a pragma if it does not support it.
The #pragma keyword is portable in the sense that it should always compile regardless of the compiler. However, pragmas are compiler-specific, so when you change compilers it will likely complain with some warnings. Some pragmas are widely used, such as the ones from OpenMP. To make the code as portable as possible, you can surround your pragmas with #ifdef/#endif blocks that depend on the compiler you're using. For example:
#ifdef __ICC
#pragma optimize
#endif
Compilers usually predefine some macros, such as __ICC, that let the code detect which compiler is being used.
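Extending that idea into a hedged sketch covering two more common compilers (_MSC_VER for MSVC and __GNUC__ for GCC are well-documented predefined macros; the chosen options are just examples):

#if defined(_MSC_VER)
#pragma optimize("t", on)        /* MSVC: favor fast code */
#elif defined(__GNUC__)
#pragma GCC optimize("O3")       /* GCC spells its version of the pragma differently */
#endif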
Any use of #pragma is compiler specific.
For example :
GNU, Intel and IBM :
#warning "Do not use ABC, which is deprecated. Use XYZ instead."
Microsoft :
#pragma message("Do not use ABC, which is deprecated. Use XYZ instead.")
Regarding your specific question about #pragma optimize: it is supported by Microsoft, and gcc provides its own spelling (#pragma GCC optimize), but that doesn't mean it will stay that way in the future.
#pragma is not portable, full stop. There was a version of gcc that used to start a game whenever it came across one.
Of the compilers we use at work, two definitely don't support #pragma optimize, and I can't answer for the others.
And even if they did, as the command line switches for optimisation are different, the chances are that the options for the pragma would be different.
When compiling a multithreaded program, we use gcc like below:
gcc -lpthread -D_REENTRANT -o someprogram someprogram.c
what exactly is the flag -D_REENTRANT doing over here?
Defining _REENTRANT causes the compiler to use thread safe (i.e. re-entrant) versions of several functions in the C library.
You can search your header files to see what happens when it's defined.
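What you would typically find is a guard along these lines. This is an illustrative sketch, not a verbatim excerpt from any particular libc header:

#include <stddef.h>   /* for size_t */

#if defined(_REENTRANT) || defined(_THREAD_SAFE)
extern int getlogin_r(char *name, size_t namesize);   /* re-entrant variant */
#endif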
Excerpt from the libc 8.2 manual:
Macro: _REENTRANT
Macro: _THREAD_SAFE
These macros are obsolete. They have the same effect as defining
_POSIX_C_SOURCE with the value 199506L.
Some very old C libraries required one of these macros to be defined
for basic functionality (e.g. getchar) to be thread-safe.
We recommend you use _GNU_SOURCE in new programs. If you don’t specify
the ‘-ansi’ option to GCC, or other conformance options such as
-std=c99, and don’t define any of these macros explicitly, the effect is the same as defining _DEFAULT_SOURCE to 1.
When you define a feature test macro to request a larger class of
features, it is harmless to define in addition a feature test macro
for a subset of those features. For example, if you define
_POSIX_C_SOURCE, then defining _POSIX_SOURCE as well has no effect. Likewise, if you define _GNU_SOURCE, then defining either
_POSIX_SOURCE or _POSIX_C_SOURCE as well has no effect.
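So on a modern glibc, the quoted equivalence means a command-line -D_REENTRANT boils down to the same thing as writing this at the top of the file (sketch):

#define _POSIX_C_SOURCE 199506L   /* what _REENTRANT is documented to imply */
#include <unistd.h>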
JayM replied:
Defining _REENTRANT causes the compiler to use thread safe (i.e. re-entrant) versions of several functions in the C library.
You can search your header files to see what happens when it's defined.
Since OP and I were both interested in the question, I decided to actually post the answer. :) The following things happen with _REENTRANT on Mac OS X 10.11.6:
<math.h> gains declarations for lgammaf_r, lgamma_r, and lgammal_r.
On Linux (Red Hat Enterprise Server 5.10), I see the following changes:
<unistd.h> gains a declaration for the POSIX 1995 function getlogin_r.
So it seems like _REENTRANT is mostly a no-op, these days. It might once have declared a lot of new functions, such as strtok_r; but these days those functions are mostly mandated by various decades-old standards (C99, POSIX 95, POSIX.1-2001, etc.) and so they're just always enabled.
I have no idea why the two systems I checked avoid declaring lgamma_r and getlogin_r, respectively, when _REENTRANT is not #defined. My wild guess is that this is just historical cruft that nobody ever bothered to go through and clean up.
Of course my observations on these two systems might not generalize to all systems your code might ever encounter. You should definitely still pass -pthread to the compiler (or, less good but okay, -lpthread -D_REENTRANT) whenever your program requires pthreads.
In multithreaded programs, you tell the compiler that you need this feature by defining the _REENTRANT macro before any #include lines in your program. This does three things, and does them so elegantly that usually you don’t even need to know what was done:
1. Some functions get prototypes for a re-entrant safe equivalent. These are normally the same function name, but with _r appended so that, for example, gethostbyname is changed to gethostbyname_r.
2. Some stdio.h functions that are normally implemented as macros become proper re-entrant safe functions.
3. The variable errno, from errno.h, is changed to call a function, which can determine the real errno value in a multithread-safe way.
Taken from Beginning Linux Programming
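The third point can be sketched as follows. __errno_location is the function glibc actually uses for this, though the guard here is simplified for illustration:

#ifdef _REENTRANT
extern int *__errno_location(void);    /* glibc's per-thread errno accessor */
#define errno (*__errno_location())    /* each thread sees its own errno */
#endif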
It simply defined _REENTRANT for the preprocessor. Somewhere in the associated code, you'll probably find #ifdef _REENTRANT or #if defined(_REENTRANT) in at least a few places.
Also note that the name "_REENTRANT" is in the implementer's name space (any name starting with an underscore followed by another underscore or a capital letter is), so defining it means you've stepped outside what the standard defines (at least the C and C++ standards).
Does GCC, when compiling C++ code, ever try to optimize for speed by choosing to inline functions that are not marked with the inline keyword?
Yes. Any compiler is free to inline any function whenever it thinks it is a good idea. GCC does that as well.
At the -O2 optimization level, inlining is done when the compiler thinks it is worth doing (a heuristic is used) and when it will not increase the size of the code. At -O3 it is done whenever the compiler thinks it is worth doing, regardless of whether it will increase the size of the code. Additionally, at all optimization levels (that is, whenever optimization is enabled), static functions that are called only once are inlined.
As noted in the comments below, these -Ox are actually compound settings that encompass multiple more specific settings, including inlining-related ones (like -finline-functions and such), so one can also describe the behavior (and control it) in terms of those more specific settings.
Yes, especially if you have a high level of optimizations enabled.
There is a flag you can provide to the compiler to disable this: -fno-inline-functions.
If you use '-finline-functions' or '-O3' it will inline functions. You can also use '-finline-limit=N' to tune how much inlining it does.
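Conversely, if you need to keep one specific function out of this at high optimization levels, gcc offers a per-function attribute (the function name here is invented):

__attribute__((noinline)) int keep_outlined(int x) {
    return x + 1;   /* gcc emits a real call to this even at -O3 */
}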
Yes, it does, although it will also generate a non-inlined function body for non-static non-inline functions as this is needed for calls from other translation units.
For inline functions, it is an error to fail to provide a function body if the function is used in any particular translation unit, so this isn't a problem.
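A small sketch of both cases (the function names are invented):

int f(int x) { return x * 3; }     // external linkage: a standalone body is
                                   // emitted even if calls to it get inlined
int g(int x) { return f(x); }      // this call may be expanded in place

inline int h(int x) { return x * 3; }   // inline: every TU that uses h must
                                        // see this body, so no standalone copy
                                        // is needed unless e.g. its address is taken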