Why is this "invalid C++" - c++

I was reading intro on gtest and found this part confusing:
The compiler complains about "undefined references" to some static
const member variables, but I did define them in the class body.
What's wrong?
If your class has a static data member:
// foo.h
class Foo {
...
static const int kBar = 100;
};
You also need to define it outside of the class body in foo.cc:
const int Foo::kBar; // No initializer here.
Otherwise your code is invalid C++, and may break in unexpected
ways. In particular, using it in Google Test comparison assertions
(EXPECT_EQ, etc) will generate an "undefined reference" linker error.
Can somebody explain why defining a static const in a class without defining it outside the class body is illegal C++?

First things first: what appears inside the class body is a declaration, not a definition. The declaration specifies the type and value of the constant; the definition reserves storage space. You might not need the storage space, for instance if you only use the value as a compile-time constant. In that case your code is perfectly legal C++. But if you do something like pass the constant by reference, or make a pointer point to the constant, then you are going to need the storage as well. In these cases you would get an 'undefined reference' error.

The standard basically states that even though you can give a value in the header, if the static variable is "used" you must still define it in the source file.
In this context "used" is generally understood to mean that some part of the program needs actual memory and/or an address of the variable.
Most likely the google test code takes the address of the variable at some point (or uses it in some other equivalent way).

Roughly: In the class definition, static const int kBar = 100; tells the compiler "Foo will have a kBar constant (which I promise will always be 100)". However, the compiler doesn't know where that variable is yet. In the foo.cc file, the const int Foo::kBar; tells the compiler "alright, make kBar in this spot". Otherwise, the linker goes looking for kBar, but can't find it anywhere.
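The distinction drawn in the answers above can be sketched in a single file. This is only an illustration: the helper names arrayLength, addressOf and demo are made up for the example, and the comments mark which lines would live in foo.h and foo.cc in a real project.

```cpp
#include <cassert>

// foo.h (sketch)
struct Foo {
    static const int kBar = 100;  // declaration with an in-class initializer
};

// foo.cc (sketch): the single definition; it reserves storage, no initializer
const int Foo::kBar;

// Uses only the value as a constant expression: no storage is required.
int arrayLength() {
    int buffer[Foo::kBar] = {};
    return static_cast<int>(sizeof(buffer) / sizeof(buffer[0]));
}

// Binds a reference to the member: this needs the out-of-class definition.
const int* addressOf(const int& value) { return &value; }

void demo() {
    assert(arrayLength() == 100);
    assert(addressOf(Foo::kBar) == &Foo::kBar);
    assert(*addressOf(Foo::kBar) == 100);
}
```

Removing the `const int Foo::kBar;` line would leave arrayLength() valid, while addressOf(Foo::kBar) may then fail at link time, depending on the compiler and optimization level.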

Related

Return static const value

Is declaring a static const array inside a getter function a reasonable way of keeping code coherent? I had never seen it before, but in a code review I found something like this:
static const std::array<std::string, 4>& fooArrayGetter()
{
static const std::array<std::string, 4> fooArray =
{"foo" ,"bar", "baz", "qux"};
return fooArray;
}
It looks correct: the array is allocated only once, and it is quite elegant because the getter is combined with its value, but running it in Godbolt https://godbolt.org/z/K8Wv94 gives really messy assembly compared to doing the whole operation in a more standard way. As motivation for this code I was pointed to Item 4 of Meyers' Effective C++.
Edit: GCC compiler with --std=c++11 flag
Is declaring a static const array inside a getter function a reasonable way of keeping code coherent?
It can be. It can also be unnecessarily complicated if you don't need it. Whether there is a better alternative depends on what you intend to do with it.
gives really messy assembler
Messiness of non-optimised assembly rarely matters.
Warning
The static keyword has two different meanings here!
The one you think applies to the return value actually says: the function fooArrayGetter should be visible only in this translation unit (only the current source file can use this function)! It has no effect on the return type!
The one inside the function says: the variable fooArray has a lifetime like that of a global (it is initialized on the first call to fooArrayGetter and lives as long as the application). This static makes returning a const reference safe, since it makes the variable/constant eternal.
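A minimal sketch of the getter with only the function-local static kept, assuming the function is meant to be callable from other translation units (so the leading static is dropped). The demo function is made up for the example.

```cpp
#include <array>
#include <cassert>
#include <string>

// No leading `static`: the function itself has external linkage.
const std::array<std::string, 4>& fooArrayGetter() {
    // Initialized the first time control passes through this line,
    // then kept alive until the program exits.
    static const std::array<std::string, 4> fooArray = {{"foo", "bar", "baz", "qux"}};
    return fooArray;
}

void demo() {
    // Every call returns a reference to the same object.
    assert(&fooArrayGetter() == &fooArrayGetter());
    assert(fooArrayGetter()[2] == "baz");
}
```

Since C++11 the initialization of a function-local static is also thread-safe, which is part of why the generated assembly contains extra guard code.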

C++ global constant in RAM not ROM

Good day,
I noticed that when I have the following code:
int foo(const int arg){
return arg*10;
}
const int MY_VAR = foo(10);
int main(){
while(true){
}
}
Then the MY_VAR is placed in the RW data section (RAM). Honestly I expected a compiler error. I'm using GNU ARM 6.2 2016q4 release.
If I make MY_VAR constexpr, then I get a compiler error. If I make foo constexpr then, as expected, MY_VAR is placed into the .text section (i.e. in ROM).
As constexpr variables can not be used as extern, I will have to use const variables for truly global constants.
What ways are there that I can automatically (i.e. compiler warning or error) detect that a constant is not being assigned to ROM?
I do want to use the ability to initialise some of the const globals with functions. Though I would want to catch the cases where the function is not constexpr automatically.
Your constant variable MY_VAR is initialized with the result of a call to a function that is not constexpr. This means the compiler is not required to initialize it at compile time, and in your case it does not, so it cannot be put in ROM. The initialisation is done at run time, during the startup of your application.
There is no way to generate a warning if such placements are done - After all, you told the compiler to do so.
You can, however, have the linker generate a link map and manually check whether all your constants have actually ended up in the proper segments.
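One possible way to get a compile-time diagnostic with C++11, sketched under the assumption that the initializer can be routed through a constexpr helper variable. The name kMyVarInit is made up for the example; whether the final object really lands in ROM still depends on the toolchain and linker script, so the link map check remains useful.

```cpp
#include <cassert>

// If this function loses its constexpr, the kMyVarInit line below stops compiling.
constexpr int foo(const int arg) { return arg * 10; }

// Header part: the constant stays usable as an extern global.
extern const int MY_VAR;

// Source part: the constexpr helper forces compile-time evaluation.
constexpr int kMyVarInit = foo(10);
const int MY_VAR = kMyVarInit;  // constant-initialized from a constant expression

static_assert(kMyVarInit == 100, "initializer was evaluated at compile time");

void demo() {
    assert(MY_VAR == 100);
}
```

C++20 adds the constinit keyword for a similar purpose, but the toolchain in the question predates it.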

Can the compiler not determine whether a variable is const by Itself?

I know for a function this simple it will be inlined:
int foo(int a, int b){
return a + b;
}
But my question is, can't the compiler just auto-detect that this is the same as:
int foo(const int a, const int b){
return a + b;
}
And since that could be detected, why would I need to type const anywhere? I know that the inline keyword has become obsolete because of compiler advances. Isn't it time that const do the same?
You don't put const as the result of not modifying a variable. You use const to enforce you not modifying it. Without const, you are allowed to modify the value. With const, the compiler will complain.
It's a matter of semantics. If the value should not be mutable, then use const, and the compiler will enforce that intention.
Yes, the compiler can prove constness in your example.
No, it would be of no use :-).
Update: Herb Sutter dedicated one of his Guru of the Week (GotW) articles to the topic (http://www.gotw.ca/gotw/081.htm). Summary:
const helps most by making the compiler and linker choose functions for const objects including const member functions which can be coded to be more efficient.
const doesn't help with the usual translation unit model [differs from what I supposed]; the compiler needs to see the whole program for verifying factual constness (which the mere declaration does not guarantee) and exploiting it, as well as prove the absence of aliasing ...
... and when the compiler can see the whole program and can prove factual constness it actually of course doesn't need the const declaration any longer! It can prove it. Duh.
The one place where const makes a big difference is a definition because the compiler may store the object in read-only memory.
The article is, of course, worth reading.
With respect to whole program optimization/translation which usually is necessary to exploit constness cf. the comments below from amdn and Angew.
can't the compiler just auto-detect that this is the same as...
If by that you mean whether the compiler can detect that the variables are not modified in the second case, most likely yes. The compiler is likely to produce the same output for both code samples. However, const might help the compiler in more complex situations. But the most important point is that it keeps you from inadvertently modifying one of the variables.
The compiler will always know what you did and will infer internal constness from that in order to optimize the code.
What the compiler can never know is what you wanted to do.
If you wanted a variable to remain constant but accidentally change it later in the code the compiler can only trap this error if you tell the compiler what you wanted.
This is what the const keyword is for.
struct bar {
const int* x;
};
bar make_bar(const int& x){
return {&x};
}
std::map<int,bar> data;
shuffle(data);
knowing that bar will never modify x (or cause it to be modified) in its lifetime requires understanding every use of bar in the program, or, say, making x a pointer to const.
Even with perfect whole-program optimization (which cannot exist: Turing machines are not perfectly understandable), dynamic linking means you cannot know at compile time how data will be used. const is a promise, and breaking that promise (in certain contexts) can be UB. The compiler can use that UB to optimize in ways that ignore the promise being broken.
inline is not obsolete: it means the same thing it ever did, that linker collisions of this symbol are to be ignored, and it mildly suggests injecting the code into the calling scope.
const simplifies certain optimizations (which may make them possible), and enforces things on the programmer (which helps the programmer), and can change what code means (const overloading).
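The "const overloading" point can be illustrated with a small sketch. The Widget type and its which() member are made up for the example; they only report which overload the compiler selected.

```cpp
#include <cassert>
#include <string>

struct Widget {
    // Selected when the object expression is non-const.
    std::string which() { return "non-const"; }
    // Selected when the object expression is const.
    std::string which() const { return "const"; }
};

// Inside this function the argument is seen through a reference to const.
std::string viaConstRef(const Widget& w) { return w.which(); }

void demo() {
    Widget w;
    const Widget cw{};
    assert(w.which() == "non-const");
    assert(cw.which() == "const");
    assert(viaConstRef(w) == "const");  // same object, different overload
}
```

Because the choice of overload depends on the declared constness, a compiler that silently inferred const could change which function is called.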
Maybe it could, but the const qualifier is also for you. If you declare a variable const and try to assign a new value afterwards, you will get an error. If the compiler decided constness by itself, this would not work.
The const qualifier is a way to enforce the behavior of the variables inside your scope. It only gives the compiler the means to scream at you if you try to modify them inside the scope where they are declared const.
A variable might be truly const (meaning it is written to a read-only location, hence the compiler optimizations) if it is const at the point of its declaration.
You can pass non-const variables to your second function; they become "const" only inside the function scope.
Alternatively, you can bypass const by casting, so the compiler cannot simply parse your code to figure out whether the values will be changed inside the function scope.
Considering that const qualifiers are mainly for code enforcing, and that compilers will generate the same code in 99% of cases if a variable is const or non const, then NO, the compiler shouldn't auto-detect constness.
Short answer: because not all problems are that simple.
Longer answer: You cannot assume that an approach which works with a simple problem also works with a complex problem
Exact answer: const expresses intent. The main goal of const is to prevent you from doing anything accidentally. If the compiler added const automatically, it would just see that the code is NOT const and leave it at that. Using the const keyword will raise an error instead.

Is it a fixed order that all global variables are initialized prior to main()? [duplicate]

C++ guarantees that variables in a compilation unit (.cpp file) are initialised in order of declaration. With several compilation units, this rule holds for each one separately (I mean static variables outside of classes).
But, the order of initialization of variables, is undefined across different compilation units.
Where can I see some explanations about this order for gcc and MSVC (I know that relying on that is a very bad idea - it is just to understand the problems that we may have with legacy code when moving to new GCC major and different OS)?
As you say the order is undefined across different compilation units.
Within the same compilation unit the order is well defined: The same order as definition.
This is because this is not resolved at the language level but at the linker level. So you really need to check out the linker documentation. Though I really doubt this will help in any useful way.
For gcc: Check out ld
I have found that even changing the order of object files being linked can change the initialization order. So it is not just your linker that you need to worry about, but how the linker is invoked by your build system. Even trying to solve the problem is practically a non-starter.
This is generally only a problem when initializing globals that reference each other during their own initialization (so only affects objects with constructors).
There are techniques to get around the problem.
Lazy initialization.
Schwarz Counter
Put all complex global variables inside the same compilation unit.
Note 1: globals:
Used loosely to refer to static storage duration variables that are potentially initialized before main().
Note 2: Potentially
In the general case we expect static storage duration variables to be initialized before main, but the compiler is allowed to defer initialization in some situations (the rules are complex see standard for details).
I expect the constructor order between modules is mainly a function of what order you pass the objects to the linker.
However, GCC does let you use init_priority to explicitly specify the ordering for global ctors:
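#include <cstdio>
class Thingy
{
public:
// Take a const char* (string literals are const) and never pass data as the format string.
Thingy(const char* p) { printf("%s", p); }
};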
Thingy a("A");
Thingy b("B");
Thingy c("C");
outputs 'ABC' as you'd expect, but
Thingy a __attribute__((init_priority(300))) ("A");
Thingy b __attribute__((init_priority(200))) ("B");
Thingy c __attribute__((init_priority(400))) ("C");
outputs 'BAC'.
Since you already know that you shouldn't rely on this information unless absolutely necessary, here it comes. My general observation across various toolchains (MSVC, gcc/ld, clang/llvm, etc) is that the order in which your object files are passed to the linker is the order in which they will be initialized.
There are exceptions to this, and I do not claim to know all of them, but here are the ones I ran into myself:
1) GCC versions prior to 4.7 actually initialize in the reverse order of the link line. This ticket in GCC is when the change happened, and it broke a lot of programs that depended on initialization order (including mine!).
2) In GCC and Clang, usage of constructor function priority can alter the initialization order. Note that this only applies to functions that are declared to be "constructors" (i.e. they should be run just like a global object constructor would be). I have tried using priorities like this and found that even with highest priority on a constructor function, all constructors without priority (e.g. normal global objects, constructor functions without priority) will be initialized first. In other words, the priority is only relative to other functions with priorities, but the real first class citizens are those without priority. To make it worse, this rule is effectively the opposite in GCC prior to 4.7 due to point (1) above.
3) On Windows, there is a very neat and useful shared-library (DLL) entry-point function called DllMain(), which, if defined, will run with parameter "fdwReason" equal to DLL_PROCESS_ATTACH directly after all global data has been initialized and before the consuming application has a chance to call any functions on the DLL. This is extremely useful in some cases, and there is no directly analogous behavior on other platforms with GCC or Clang with C or C++. The closest you will find is making a constructor function with priority (see point (2) above), which is not the same thing and won't work for many of the use cases that DllMain() works for.
4) If you are using CMake to generate your build systems, which I often do, I have found that the order of the input source files will be the order of their resultant object files given to the linker. However, your application/DLL is often also linking in other libraries, in which case those libraries will be on the link line after your input source files. If you are looking to have one of your global objects be the very first one to initialize, then you are in luck: you can put the source file containing that object first in the list of source files. However, if you are looking to have one be the very last one to initialize (which can effectively replicate DllMain() behavior!), then you can make a call to add_library() with that one source file to produce a static library, and add the resulting static library as the very last link dependency in your target_link_libraries() call for your application/DLL. Be wary that your global object may get optimized out in this case; you can use the --whole-archive flag to force the linker not to remove unused symbols for that specific tiny archive file.
Closing Tip
To absolutely know the resulting initialization order of your linked application/shared-library, pass --print-map to ld linker and grep for .init_array (or in GCC prior to 4.7, grep for .ctors). Every global constructor will be printed in the order that it will get initialized, and remember that the order is opposite in GCC prior to 4.7 (see point (1) above).
The motivating factor for writing this answer is that I needed to know this information, had no other choice but to rely on initialization order, and found only sparse bits of this information throughout other SO posts and internet forums. Most of it was learned through much experimentation, and I hope that this saves some people the time of doing that!
http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.12 - this link moves around. this one is more stable but you will have to look around for it.
edit: osgx supplied a better link.
A robust solution is to use a getter function that returns a reference to a static variable. A simple example is shown below, a complex variant in our SDG Controller middleware.
// Foo.h
class Foo {
public:
Foo() {}
static bool insertIntoBar(int number);
private:
static std::vector<int>& getBar();
};
// Foo.cpp
std::vector<int>& Foo::getBar() {
static std::vector<int> bar;
return bar;
}
bool Foo::insertIntoBar(int number) {
getBar().push_back(number);
return true;
}
// A.h
class A {
public:
A() {}
private:
static bool a1;
};
// A.cpp
bool A::a1 = Foo::insertIntoBar(22);
The initialization would begin with the only static member variable, bool A::a1. This would then call Foo::insertIntoBar(22). This would then call Foo::getBar(), in which the initialization of the static std::vector<int> variable would occur before returning a reference to the initialized object.
If the static std::vector<int> bar were placed directly as a member variable of the Foo class, there would be a possibility, depending on the naming ordering of the source files, that bar would be initialized after insertIntoBar() were called, thereby crashing the program.
If multiple static member variables would call insertIntoBar() during their initialization, the order would not be dependent on the names of the source files, i.e., random, but the std::vector<int> would be guaranteed to be initialized before any values be inserted into it.
In addition to Martin's comments, coming from a C background, I always think of static variables as part of the program executable, incorporated and allocated space in the data segment. Thus static variables can be thought of as being initialised as the program loads, prior to any code being executed. The exact order in which this happens can be ascertained by looking at the data segment of the map file output by the linker, but for most intents and purposes the initialisation is simultaneous.
Edit: Depending on construction order of static objects is liable to be non-portable and should probably be avoided.
If you really want to know the final order I would recommend you to create a class whose constructor logs the current timestamp and create several static instances of the class in each of your cpp files so that you could know the final order of initialization. Make sure to put some little time consuming operation in the constructor just so you don't get the same time stamp for each file.
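A sketch of the logging idea above, with one deliberate change: it records a sequence of names instead of timestamps, which avoids the need for an artificial delay. InitTracer and the tracer names are made up for the example. In a real project, one or more tracer objects would go into each .cpp file; here both sit in one file, so their relative order is the guaranteed in-file order.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: each instance records its name when it is constructed.
class InitTracer {
public:
    explicit InitTracer(const char* name) {
        log().push_back(name);
        std::printf("init #%zu: %s\n", log().size(), name);
    }
    // The log is a function-local static, so it exists before its first use
    // regardless of which tracer happens to be constructed first.
    static std::vector<std::string>& log() {
        static std::vector<std::string> entries;
        return entries;
    }
};

// In practice, place lines like these in different source files.
static InitTracer traceA("tracer A");
static InitTracer traceB("tracer B");

void demo() {
    assert(InitTracer::log().size() == 2);
    assert(InitTracer::log()[0] == "tracer A");
    assert(InitTracer::log()[1] == "tracer B");
}
```

The observed order across files is only a property of one particular build and link line, as discussed above, not something the language guarantees.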

Volatile keyword allows access to const structures in UnitTest++

I'm using the UnitTest++ framework to implement unit tests on some C code I'm responsible for. The end product is embedded and uses const structures to hold configuration information. Since the target host can modify the configuration asynchronously the members of the structure are all volatile. Some of the structures are also declared as volatile.
I'm getting segmentation faults when I use const_cast to try to modify the structure instances lacking the volatile keyword on the UnitTest Windows 7 host. This makes sense to me. However, if the structure instance was declared with the volatile keyword then the test passes. This does not make sense to me.
Here's a quick code example that shows the problem with gcc on Win7. Switching the define value causes the segfault to appear or not, depending on if the volatile instance of the struct is used or not.
typedef struct
{
volatile int foo;
volatile int bar;
} TestStruct;
const TestStruct constStruct = { 1, 2};
volatile const TestStruct volatileConstStruct = { 3, 4};
#define SEG_FAULT 0
int main(void)
{
TestStruct * constPtr = const_cast<TestStruct*>(&constStruct);
TestStruct * constVolPtr = const_cast<TestStruct*>(&volatileConstStruct);
#if(SEG_FAULT == 0)
constVolPtr->foo = 10;
#else
constPtr->foo = 20;
#endif
}
Can anyone help me understand why the volatile keyword presents a workaround for the segfault? Also, can anyone suggest a method to allow me to modify the values in the structure for unit test without adding the volatile keyword to all the structure instances?
EDIT:
I've just discovered that you can do this in C:
#define const
Including the effective "const undefine" above in the test fixture allows my target compiler to see the const keyword and correctly place the structures into flash memory. However, the preprocessor on the UnitTest++ compiler strips out the const keyword, so my test fixture is able to modify the struct.
The drawback to this solution is that I cannot add unit tests that verify correct const operation of function calls. However, since removing the const from the struct instances is not an option (need the data to be placed in flash) this appears to be a drawback I will have to live with.
Why this strange behavior?
Modifying a const object using const_cast is an Undefined Behavior.
const_cast is meant for the case where you have a pointer-to-const that actually refers to a non-const object and you need to modify that object through the pointer.
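A small sketch of the difference between the defined and the undefined use of const_cast. The function setThroughConstPointer is made up for the example.

```cpp
#include <cassert>

// Hypothetical helper: writes through a pointer-to-const.
// This is defined behaviour only if the pointed-to object was not declared const.
void setThroughConstPointer(const int* p, int value) {
    *const_cast<int*>(p) = value;
}

void demo() {
    int mutableValue = 1;             // the object itself is not const
    const int* view = &mutableValue;  // only the access path is const
    setThroughConstPointer(view, 20);
    assert(mutableValue == 20);

    // const int fixed = 1;
    // setThroughConstPointer(&fixed, 20);  // undefined behaviour: 'fixed' is a const object
}
```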
Why it works with volatile?
Not sure. However, It is still an Undefined Behavior and you are just lucky that it works.
The problem with Undefined Behavior is that all bets are off and the program might show any behavior. It might appear to work, or it might not work. It may crash or show any weird behavior.
It is best not to write any code that exhibits Undefined Behavior; that saves you from having to explain situations like this one.
How to solve this?
Don't declare the objects you modify as const. Since you intend to modify them during the course of your program/test, they should not be const. Currently, you are making a promise to the compiler that your structure objects are immutable (const), but later you break that contract by modifying them. Make this promise only if you can keep it.
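If the target build must keep const so the data is placed in flash, one narrower alternative to redefining the const keyword itself is a project-specific macro. This is only a sketch: CONFIG_CONST, UNIT_TEST and configStruct are names made up for the example, and UNIT_TEST would normally be supplied by the test build rather than defined in the source.

```cpp
#include <cassert>

// Assumption for this sketch: the host test build defines UNIT_TEST
// (for example on the compiler command line); it is defined here only
// so that the example is self-contained.
#define UNIT_TEST 1

#ifdef UNIT_TEST
#define CONFIG_CONST          // host test build: objects stay writable
#else
#define CONFIG_CONST const    // target build: objects can go to read-only memory
#endif

struct TestStruct {
    volatile int foo;
    volatile int bar;
};

CONFIG_CONST TestStruct configStruct = {1, 2};

void demo() {
    configStruct.foo = 10;    // legal in the test build: the object is not const
    assert(configStruct.foo == 10);
    assert(configStruct.bar == 2);
}
```

Only the objects marked with the macro lose their constness in the test build, so const correctness of function parameters can still be checked by the host compiler.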
I believe a footnote in the standard gives you the answer. (Note that footnotes are not normative.)
In §6.7.3 of the standard draft N1570:
132) The implementation may place a const object that is not volatile
in a read-only region of storage.
This means that the structure defined with the volatile keyword will be placed in read-write memory, despite the fact that it is defined const.
One could argue that the compiler is not allowed to place either of the structures in read-only memory, as they both contain volatile members. I would send in a compiler bug report if I were you.
Can anyone help me understand why the volatile keyword presents a
workaround for the segfault? Also, can anyone suggest a method to
allow me to modify the values in the structure for unit test without
adding the volatile keyword to all the structure instances?
You can't. A const object may be placed in read-only memory, and if it is, writing to it will trigger a segfault. Either drop the const or add volatile -- I would strongly recommend dropping const.