Good day,
I noticed that when I have the following code:
int foo(const int arg){
    return arg*10;
}
const int MY_VAR = foo(10);
int main(){
    while(true){
    }
}
Then MY_VAR is placed in the RW data section (RAM). Honestly, I expected a compiler error. I'm using the GNU ARM 6.2 2016q4 release.
If I make MY_VAR constexpr, then I get a compiler error. If I make foo constexpr then, as expected, MY_VAR is placed into the .text section (i.e. in ROM).
As constexpr variables cannot be declared extern, I will have to use const variables for truly global constants.
What ways are there that I can automatically (i.e. compiler warning or error) detect that a constant is not being assigned to ROM?
I do want to use the ability to initialise some of the const globals with functions. Though I would want to catch the cases where the function is not constexpr automatically.
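Your constant variable MY_VAR is initialized with the result of a call to a non-constexpr function. The compiler is therefore not required to evaluate it at compile time, and here it does not, so the variable cannot be put in ROM. The initialisation is done at run time, during the startup of your application.
GCC 6.2 has no warning for such placements. After all, you told the compiler to do so. (C++20 later added constinit, which turns a dynamically initialised variable into a compile error, but that is not available in this toolchain.)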
You can, however, have the linker generate a link map and manually check whether all your constants have actually ended up in the proper segments.
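One way to get a compile-time check with a C++11 compiler is to route the initialiser through a constexpr variable first. The following is a minimal sketch, not a guaranteed recipe for every toolchain: foo and MY_VAR come from the question, kMyVarInit is a hypothetical helper name, and the final section placement still depends on the compiler and linker script.

```cpp
#include <cassert>

// Assumption: foo is meant to be usable at compile time.
constexpr int foo(const int arg) {
    return arg * 10;
}

// kMyVarInit is a hypothetical helper. This line stops compiling as soon as
// foo (or anything it calls) is no longer constexpr.
constexpr int kMyVarInit = foo(10);

// The externally visible constant is now constant-initialised, so the
// toolchain is free to place it in a read-only section.
extern const int MY_VAR = kMyVarInit;

static_assert(kMyVarInit == 100, "foo(10) must be a constant expression");
```

Other translation units would still declare it as extern const int MY_VAR; and the link map remains the place to confirm where it actually ended up.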
Is declaring a static const array inside a getter function a reasonable way of keeping code coherent? I had never seen it before, but in a code review I found something like this:
#include <array>
#include <string>

static const std::array<std::string, 4>& fooArrayGetter()
{
    static const std::array<std::string, 4> fooArray =
        {"foo", "bar", "baz", "qux"};
    return fooArray;
}
It looks correct: the array is allocated only once, and it is quite elegant because the getter is combined with its value. But running it in Godbolt https://godbolt.org/z/K8Wv94 gives really messy assembler compared to doing the whole thing in a more standard way. As motivation for this code I was pointed to Item 4 of Meyers' Effective C++.
Edit: GCC compiler with --std=c++11 flag
Is declaring a static const array inside a getter function a reasonable way of keeping code coherent?
It can be. It can also be unnecessarily complicated if you don't need it. Whether there is a better alternative depends on what you intend to do with it.
gives really messy assembler
Messiness of non-optimised assembly rarely matters.
Warning
The static keyword here has two different meanings!
The first one, which you might think applies to the return value, says: the function fooArrayGetter should be visible only in this translation unit (only the current source file can use this function)! It has no impact on the return type!
The one inside the function says: the variable fooArray has a lifetime like a global (it is initialized on the first call to fooArrayGetter and lives as long as the application). This static makes returning a const reference safe, since it makes this variable/constant eternal.
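The two meanings can be observed in a small sketch. The counter g_builds and the helper makeFooArray are hypothetical additions, present only to show that the function-local static is built exactly once:

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical counter, only here to observe how often the array is built.
static int g_builds = 0;

// Hypothetical helper that builds the array and records the construction.
static std::array<std::string, 4> makeFooArray() {
    ++g_builds;
    return {{"foo", "bar", "baz", "qux"}};
}

// Outer "static": fooArrayGetter has internal linkage (this TU only).
static const std::array<std::string, 4>& fooArrayGetter() {
    // Inner "static": fooArray is initialised on the first call and lives
    // until program exit, so returning a reference to it is safe.
    static const std::array<std::string, 4> fooArray = makeFooArray();
    return fooArray;
}
```

Calling fooArrayGetter() repeatedly returns a reference to the same object each time, and g_builds stays at 1.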
Introduction/confirmation of basic facts
It is well known that with GCC style C and C++ compilers, you can use inline assembly with a "memory" clobber:
asm("":::"memory");
to prevent reordering of (most) code past it, acting as a (thread local) "memory barrier" (for example for the purpose of interacting with async signals).
Note: these "compiler barriers" do NOT accomplish inter-threads synchronization.
It does the equivalent of a call to a non inline function, potentially reading all objects that can be read outside of the current scope and altering all those that can be altered (non const objects):
int i;
void f() {
    int j = i;
    asm("":::"memory"); // can change i
    j += i; // not j *= 2
    // ... (assume j isn't unused)
}
Essentially it's the same as calling a NOP function that's separately compiled, except that the non inline NOP function call is later (1) inlined so nothing survives from it.
(1) say, after compiler middle pass, after analysis
So here j cannot be changed as it's local, and is still the copy of the old i value, but i might have changed, so the compilation is pretty much the same as:
volatile int vi;
int f2() {
    int j = vi;
    ; // can "change" vi
    j += vi; // not j *= 2
    return j;
}
Both reads of vi are needed (for a different reason) so the compiler doesn't change that into 2*vi.
Is my understanding correct up to that point? (I presume it is. Otherwise the question doesn't make sense.)
The real issue: extern or static
The above was just the preamble. The issue I have is with static variables, possible calls to static functions (or the C++ equivalent, anonymous namespaces):
Can a memory clobber access static data that isn't otherwise accessible via non static functions, and call static functions that aren't otherwise callable, as none of these are visible at link stage, from other modules, if they aren't named explicitly in the input arguments of the asm directive?
static int si;
int f3() {
    int j = si;
    asm("":::"memory"); // can access si?
    j += si; // optimized to j = si*2; ?
    return j;
}
[Note: the use of static is a little ambiguous. The suggestion is that the boundary of the TU is important, and that the static variable is TU-private, but I have not described how it is manipulated. Let's assume it really is modified in that TU; otherwise the compiler might assume it's effectively a constant.]
In other words, is that "clobber" the equivalent of a call to:
an external NOP function, which wouldn't be able to name si directly, nor to access it in any indirect way, as no function in the TU either communicates the address of si, or makes si indirectly modifiable
a locally defined NOP function that can access si
?
Bonus question: global optimization
If the answer is that static variables aren't treated like extern variables in that case, what is the impact when compiling the program at once? More specifically:
During global compilation of the whole program, with global analysis and inference over variables values, is the knowledge of the fact that for example a global variable is never modified (or never assigned a negative value...), except possibly in an asm "clobber", an input of the optimizer?
In other words, if non static i is only named in one TU, can it be optimized as if it was a static int even if there are asm statements? Should global variables be explicitly listed as clobbers in that case?
It does the equivalent of a call to a non inline function, potentially reading all objects that can be read outside of the current scope and altering all those that can be altered (non const objects):
No.
The compiler can decide to inline any function in the same compilation unit (and then, if the function wasn't static, also provide a separate "not inlined" copy for callers in other compilation units so that the linker can find one); and with link-time code optimization/link-time code generation the linker can decide to inline any functions in different compilation units. The only case where it's currently impossible for any function to be inlined is when it is in a shared library; but this limitation currently exists because operating systems currently aren't capable of "load-time optimization".
In other words; any appearance of any kind of barrier for any function is an unintended side-effect of optimizer weaknesses and not guaranteed; and therefore can not/should not be relied on.
The real issue: inline assembly
There are 5 possibilities:
a) The compiler understands all assembly, and is able to examine the inline assembly and determine what is/isn't clobbered; there is no clobber list (and no need for one). In this case (depending on how advanced the compiler/optimiser is) the compiler may be able to determine things like "this area of memory may be clobbered but that area of memory won't be clobbered" and avoid the cost of reloading data from the area of memory that wasn't clobbered.
b) The compiler doesn't understand any assembly and there is no clobber list, so the compiler has to assume everything will be clobbered; which means that the compiler has to generate code that saves everything (e.g. values currently held in registers, etc) to memory before the inline assembly is executed and reloads everything afterwards, which will give extremely bad performance.
c) The compiler doesn't understand any assembly, and expects the programmer to provide a clobber list to avoid (some of) the performance disaster of having to assume everything will be clobbered.
d) The compiler understands some assembly but not all assembly, and doesn't have a clobber list. If it doesn't understand the assembly it assumes everything may have been clobbered.
e) The compiler understands some assembly but not all assembly, and does have an (optional?) clobber list. If it doesn't understand the assembly it relies on the clobber list (and/or falls back to "assume everything is clobbered" if there is no clobber list), and if it does understand the assembly it ignores the clobber list.
Of course a compiler that uses "option c)" can be improved to use "option e)"; and a compiler that uses "option e)" can be improved to use "option a)".
In other words; any appearance of any kind of barrier for something like "asm("":::"memory");" is an unintended side-effect of the compiler being "improvable"; and therefore can not/should not be relied on.
Summary
None of the things you've mentioned are actually a barrier of any kind. It's all just "unintended and undesired failure to optimize".
If you do need a barrier, then use an actual barrier (e.g. "asm("mfence":::"memory");"). However (unless you need inter-thread synchronization and aren't using atomics), it's extremely likely that you do not need a barrier in the first place.
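For the inter-thread case mentioned above, standard C++11 atomics are one portable alternative to a hand-written barrier instruction. This is a minimal sketch, assuming <atomic> is available; payload, ready, publish and try_consume are hypothetical names, not something from the question:

```cpp
#include <atomic>
#include <cassert>

int payload = 0;                 // plain data being handed over
std::atomic<bool> ready{false};  // flag the consumer polls

// Producer side: write the data, then publish it.
void publish(int value) {
    payload = value;
    // Release fence: the write to payload cannot move below the flag store.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

// Consumer side: returns true and fills out once the data is published.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_relaxed)) {
        return false;
    }
    // Acquire fence: the read of payload cannot move above the flag load.
    std::atomic_thread_fence(std::memory_order_acquire);
    out = payload;
    return true;
}
```

In real code publish and try_consume would run on different threads; the fences constrain both the compiler and the hardware, which an empty asm statement does not attempt to do.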
I have the following global constant in my C++ program:
const int K = 123456 ;
When I compile the program, the resulting executable contains the literal value 123456 in all the places where the value is used (dozens of times).
But, if I remove the const qualifier, the value 123456 appears only once in the entire executable (in the .data section).
This is the result I'm looking for. I want the value 123456 to appear only once so that it can be changed simply by editing the .exe file with a HEX editor.
However, I don't want to remove the const qualifier because I want the compiler to prevent me from accidentally modifying the constant in the source code.
Is it possible to instruct the compiler somehow to not inline the value of said constant?
The reason I need to do this is so that the executable is easily modifiable by students who will be tasked with "cracking" an example program to alter its behavior. The exercise must be simple enough for inexperienced people.
If you don't want K to be inlined then put this in a header file:
extern const int K;
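This means "K is defined somewhere else". Then put this in a cpp file that includes the header (without the extern declaration in scope, a namespace-scope const has internal linkage in C++ and the linker would not find it):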
const int K = 123456;
In all the places where K is used, the compiler only knows that K is a const int declared externally. The compiler doesn't know the value of K, so it cannot be inlined. The linker will find the definition of K in the cpp file and put it in a data section of the executable (typically a read-only one such as .rodata or .rdata).
Alternatively, you could define K like this:
const volatile int K = 123456;
This means "K might magically change so you better not assume its value". It has a similar effect to the previous approach as the compiler won't inline K because it can't assume that K will always be 123456. The previous approach would fail if LTO was enabled but using volatile should work in that case.
I must say, this is a really weird thing to do. If you want to make your program configurable, you should put the value of K into a text file and then read the file at startup.
The simplest option is probably to declare it as global without const, so the compiler can't assume that it still has the value of the static initializer.
int K = 123456;
Even link-time optimization can't know that a library function doesn't access this global, assuming you call any in your program.
If you used static int K = 123456;, the compiler could notice that no functions in the compilation unit write the value, and none of them pass or return its address, so escape analysis for the whole compilation unit could discover that it was effectively a constant and could be optimized away.
(If you really wanted it to be static int K;, include a global function like void setK(int x){K=x;} that you never actually call. Without Link-Time Optimization, the compiler will have to assume that something outside this compilation unit could have called this function and changed K, and that any call to a function whose definition isn't visible might result in such a call.)
Beware that volatile const int K = 123456; can hurt optimization significantly more than making it non-const, especially if you have expressions that use K multiple times.
(But either of these can hurt a lot, depending on what optimizations were possible. Constant-propagation can be a huge win.)
The compiler is required to emit asm that loads K exactly once for each time the C abstract machine reads it. (i.e. reading K is considered a visible side-effect, like a read from an MMIO port or a location you have a hardware watchpoint on.)
If you want to let a compiler load it once per loop, and assume K is a loop invariant, then code that uses it should do int local_k = K;. It's up to you how often you want to re-read K, i.e. what scope you do / redo local_k = K at.
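A minimal sketch of that local_k pattern follows. K is the constant from the question; sum_scaled and its parameters are hypothetical names used only for illustration:

```cpp
#include <cassert>

// K from the discussion above; volatile forces a real load on every read.
const volatile int K = 123456;

// Hypothetical function: reads K once, then treats the copy as loop-invariant.
long long sum_scaled(const int* data, int n) {
    const int local_k = K;  // the only volatile read in this function
    long long total = 0;
    for (int i = 0; i < n; ++i) {
        // local_k is an ordinary int, so the optimizer may keep it in a register.
        total += static_cast<long long>(data[i]) * local_k;
    }
    return total;
}
```

Writing data[i] * K inside the loop instead would force one load of K per iteration.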
On x86, using a memory source operand that stays hot in L1d cache is probably not much of a performance problem, but it will prevent auto-vectorization.
The reason I need to do this is so that the executable is easily modifiable by students who will be tasked with "cracking" an example program to alter its behavior. The exercise must be simple enough for inexperienced people.
For this use-case, yes volatile is exactly what you want. Having all uses re-read from memory on the spot makes it slightly simpler than following the value cached in a register.
And performance is essentially irrelevant, and you won't want auto-vectorization. Probably just light optimization so the students don't have to wade through store/reload of everything after every C++ statement. Like gcc's -Og would be ideal.
With MSVC, maybe try -O1 or -O2 and see if it does anything confusing. I don't think it has options for some but not too aggressive optimization, it might be either debug build (nice for single-stepping the C++ source, bad for reading asm), or fully optimized for size or speed.
Try declaring the constant as volatile. That should result in a single and changeable value that won't be inlined.
When do I need to initialize variables in C++? Some people assert that it's important, but maybe this is more of an issue in C?
I am referring to primitives, i.e. char, int, long, double.
Let's say I have the following code snippet:
int len;
double sum, mean;
char ch;
while (true) {
    // here I use these primitives where they are initialized.
}
So, should I initialize these primitives here as good programming practice?
In C++, compilers usually do not initialize local (automatic) variables. These variables are created on the stack and hold indeterminate values until they are assigned. You often do not need to initialize a variable at its definition, but read carefully what the compiler says. Try:
int main() {
    int x;
    x=x+1;
}
and compile it with -Wall switch (I'm using gcc). When the message
x.cpp: In function ‘int main()’:
x.cpp:3:6: warning: ‘x’ is used uninitialized in this function [-Wuninitialized]
x=x+1;
is written, then it would be better to initialize such variable.
The problem is, of course, the use of uninitialised variables as in
int x;
int y=1+x; // oops what is y?
AFAIK, the language standard does not require the compiler to initialise x: its value is indeterminate, and reading it is undefined behaviour. A compiler may happen to zero it, but it may equally leave it uninitialised. In any case, most optimisations (-O) will omit an initialisation in the above situation.
If you use full warning compiler flags (e.g. -Wall -Wextra -pedantic) the compiler will spot many uses of uninitialised variables (it will also warn about uses of uninitialised variables in library header files, such as boost headers; the boost developers appear not to use such useful diagnostics).
In general, whether or not to initialise all variables is a matter of style. I would provide an explicit initialisation whenever there is a sensible initial value for a variable and/or if there is a danger of it being used uninitialised. Unlike in C, uninitialised variables are quite rare in C++, in particular when passing by return value (including move semantics).
You should initialize all variables to prevent "garbage in, garbage out".
When do I need to initialize variables in c++?
You should initialize local variables with a sensible value when you define them. If you cannot give a variable a sensible value yet, then you should probably define it later.
The goal here is to minimize the amount of state in your functions in order to make them easier to understand. When all variables are defined at the beginning of the function, you don't know what they are used for. When they are defined at the point they are needed, it's clear that they are not used before that point. This also helps to limit the scope in which variables are declared (e.g. inside the loop instead of before it, thus less state outside the loop) and it allows you to define more variables as const (thus not adding state).
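As an illustration of that advice, here is a minimal sketch that reuses the sum and len names from the question. The function mean_of and its treatment of an empty batch are assumptions made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of a batch; returns 0.0 for an empty batch.
double mean_of(const std::vector<double>& batch) {
    if (batch.empty()) {
        return 0.0;
    }
    double sum = 0.0;  // defined where first needed, with a sensible value
    for (const double value : batch) {
        sum += value;
    }
    const std::size_t len = batch.size();  // never changes, so it is const
    return sum / static_cast<double>(len);
}
```

Nothing is declared before it has a meaningful value, and the only mutable state is sum.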
I was reading intro on gtest and found this part confusing:
The compiler complains about "undefined references" to some static
const member variables, but I did define them in the class body.
What's wrong?
If your class has a static data member:
// foo.h
class Foo {
    ...
    static const int kBar = 100;
};
You also need to define it outside of the class body in foo.cc:
const int Foo::kBar; // No initializer here.
Otherwise your code is invalid C++, and may break in unexpected
ways. In particular, using it in Google Test comparison assertions
(EXPECT_EQ, etc) will generate an "undefined reference" linker error.
Can somebody explain why defining a static const int in a class without defining it outside the class body is illegal C++?
First things first: what appears inside the class body is a declaration, not a definition. The declaration specifies the type and value of the constant; the definition reserves storage space. You might not need the storage space, for instance if you only use the value as a compile-time constant. In this case your code is perfectly legal C++. But if you do something like pass the constant by reference, or make a pointer point to the constant, then you are going to need the storage as well. In these cases you would get an 'undefined reference' error.
The standard basically states that even though you can give a value in the header, if the static variable is "used" you must still define it in the source file.
In this context "used" is generally understood to mean that some part of the program needs actual memory and/or an address of the variable.
Most likely the google test code takes the address of the variable at some point (or uses it in some other equivalent way).
Roughly: In the class definition, static const int kBar = 100; tells the compiler "Foo will have a kBar constant (which I promise will always be 100)". However, the compiler doesn't know where that variable is yet. In the foo.cc file, the const int Foo::kBar; tells the compiler "alright, make kBar in this spot". Otherwise, the linker goes looking for kBar, but can't find it anywhere.
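The difference between using only the value and needing storage can be sketched in one file. Foo and kBar come from the gtest FAQ excerpt; read_by_ref is a hypothetical helper standing in for any code that binds a reference to the constant:

```cpp
#include <cassert>

// foo.h (sketch)
class Foo {
public:
    static const int kBar = 100;  // declaration with an in-class initializer
};

// foo.cc (sketch): the definition that reserves storage. No initializer here.
const int Foo::kBar;

// Hypothetical helper: binding a reference needs kBar to have an address.
int read_by_ref(const int& value) {
    return value;
}

// Using only the value needs no storage at all.
static_assert(Foo::kBar == 100, "usable as a compile-time constant");
```

Removing the out-of-class definition typically leaves the static_assert working, while a call such as read_by_ref(Foo::kBar) may then fail to link, depending on how much the optimizer folds away.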