Is clang's global-constructors warning too strict? - c++

In our project we often use a construct like this (simplified for clarity, we actually use a more safe to use version):
struct Info
{
Info(int x, int y) : m_x(x), m_y(y)
{}
int m_x;
int m_y;
};
struct Data
{
static const Info M_INFO_COLLECTION[3];
};
const Info Data::M_INFO_COLLECTION[] = // global-constructors warning
{
Info(1, 2),
Info(10, 9),
Info(0, 1)
};
M_INFO_COLLECTION can contain a large number of data points. The initialization part resides in the cpp file, which is often code-generated.
Now this structure gives us a fair amount of global-constructors-warnings in clang across our code base. I've read in a blog post that using warnings in the -Weverything group is a bad idea for nightly builds and I agree, and even the clang docdoes not recommend to use it.
Since the decision to switch off the warning is out of my hands, I could use a helpful trick to get rid of the warning (and a potential static initialization order fiasco), by converting the static member into a function initializing and returning a local static variable.
However, since our project does not use dynamically allocated memory as a rule, the original idea has to be used without a pointer, which can lead to deinitialization problems when my Data class is used by other objects in a weird way.
So, long story short: The global-constructors warning points to a piece of code that I can review as safe, since I know what the Data class does. I could get rid of it using a workaround that can lead to problems if other classes use Data in a particular way, but this won't generate a warning. My conclusion is that I would be better off just leaving the code as it is and ignore the warning.
So now I am stuck with a bunch of warnings, that may in some cases point to a SIOF and which I would like to address, but which get buried unter a mountain of warnings that I deliberately do not want to fix, because the fix would actually make things worse.
This brings me to my actual question: Is clang too strict in its interpretation of the warning? From my limited compiler understanding, shouldn't it be possible for the compiler to realize that in this particular case, the static member M_INFO_COLLECTION cannot possibly lead to a SIOF, since all its dependencies are non-static?
I played around with this issue a bit and even this piece of code gets the warning:
//at global scope
int get1()
{
return 1;
}
int i = get1(); // global-constructors warning
Whereas this works fine, as I would expect:
constexpr int get1()
{
return 1;
}
int i = 1; // no warning
int j = get1(); // no warning
This looks fairly trivial to me. Am I missing something or should clang be able to suppress the warning for this example (and possibly my original example above as well)?

The problem is that it is not constant initialized. This means that M_INFO_COLLECTION may be zero-initialized and then dynamically initialized at run time.
Your code generates assembley to set M_INFO_COLLECTION dynamically because of the "global constructor" (non-constant initialization): https://godbolt.org/z/45x6q6
An example where this leads to unexpected behaviour:
// data.h
struct Info
{
Info(int x, int y) : m_x(x), m_y(y)
{}
int m_x;
int m_y;
};
struct Data
{
static const Info M_INFO_COLLECTION[3];
};
// data.cpp
#include "data.h"
const Info Data::M_INFO_COLLECTION[] =
{
Info(1, 2),
Info(10, 9),
Info(0, 1)
};
// main.cpp
#include "data.h"
const int first = Data::M_INFO_COLLECTION[0].m_x;
int main() {
return first;
}
Now if you compile main.cpp before data.cpp, first may access Info out of it's lifetime. In practice, this UB just makes first 0.
E.g.,
$ clang++ -I. main.cpp data.cpp -o test
$ ./test ; echo $?
0
$ clang++ -I. data.cpp main.cpp -o test
$ ./test ; echo $?
1
Of course, this is undefined behaviour. At -O1, this issue disappears, and clang behaves as if M_INFO_COLLECTION is constant initialized (as-if it reordered the dynamic initialization to before first's dynamic initialization (and every other dynamic initialization), which it is allowed to do).
The fix to this is to not use global constructors. If your static storage duration variable is able to be constant initialized, make the relevant functions/constructors constexpr.
If you are not able to add constexprs or have to use a non-constant initialized variable, then you can resolve the static initialization order fiasco without dynamic memory using placement-new:
// data.h
struct Info
{
Info(int x, int y) : m_x(x), m_y(y)
{}
int m_x;
int m_y;
};
struct Data
{
static auto M_INFO_COLLECTION() -> const Info(&)[3];
static const Info& M_ZERO();
};
// data.cpp
#include "data.h"
#include <new>
auto Data::M_INFO_COLLECTION() -> const Info(&)[3] {
// Need proxy type for array reference
struct holder {
const Info value[3];
};
alignas(holder) static char storage[sizeof(holder)];
static auto& data = (new (storage) holder{{
Info(1, 2),
Info(10, 9),
Info(0, 1)
}})->value;
return data;
}
const Info& Data::M_ZERO() {
// Much easier for non-array types
alignas(Info) static char storage[sizeof(Info)];
static const Info& result = *new (storage) Info(0, 0);
return result;
}
Though this does have minor runtime overhead for each access, especially the very first, compared to regular static storage duration variables. It should be faster than the new T(...) trick, since it doesn't call a memory allocation operator.
In short, it's probably best to add constexpr to be able to constant initialize your static storage duration variables.

Related

Can a C++ pointer point to a static member array of string literals?

Everything works in Visual Studio 2017, but I get linker errors in GCC (6.5.0).
Here is some sample code that isolates my problem:
#include <iostream>
struct Foo{
static constexpr const char* s[] = {"one","two","three"};
};
int main(){
std::cout << Foo::s[0] << std::endl; //this works in both compilers
const char* const* str_ptr = nullptr;
str_ptr = Foo::s; //LINKER ERROR in GCC; works in VS
std::cout << str_ptr[1] << std::endl; //works in VS
return 0;
}
In GCC, I get undefined reference to 'Foo::s'. I need the initialization of Foo::s to stay in the declaration of the struct, which is why I used constexpr. Is there a way to reference Foo::s dynamically, i.e. with a pointer?
More Background Info
I'll now explain why I want to do this. I am developing embedded software to control configuration of a device. The software loads config files that contain the name of a parameter and its value. The set of parameters being configured is determined at compile-time, but needs to be modular so that it's easy to add new parameters and expand them as development of the device continues. In other words, their definition needs to be in one single place in the code base.
My actual code base is thousands of lines and works in Visual Studio, but here is a boiled-down toy example:
#include <iostream>
#include <string>
#include <vector>
//A struct for an arbitrary parameter
struct Parameter {
std::string paramName;
int max_value;
int value;
const char* const* str_ptr = nullptr;
};
//Structure of parameters - MUST BE DEFINED IN ONE PLACE
struct Param_FavoriteIceCream {
static constexpr const char* n = "FavoriteIceCream";
enum { vanilla, chocolate, strawberry, NUM_MAX };
static constexpr const char* s[] = { "vanilla","chocolate","strawberry" };
};
struct Param_FavoriteFruit {
static constexpr const char* n = "FavoriteFruit";
enum { apple, banana, grape, mango, peach, NUM_MAX };
static constexpr const char* s[] = { "apple","banana","grape","mango","peach" };
};
int main() {
//Set of parameters - determined at compile-time
std::vector<Parameter> params;
params.resize(2);
//Configure these parameters objects - determined at compile-time
params[0].paramName = Param_FavoriteIceCream::n;
params[0].max_value = Param_FavoriteIceCream::NUM_MAX;
params[0].str_ptr = Param_FavoriteIceCream::s; //!!!! LINKER ERROR IN GCC !!!!!!
params[1].paramName = Param_FavoriteFruit::n;
params[1].max_value = Param_FavoriteFruit::NUM_MAX;
params[1].str_ptr = Param_FavoriteFruit::s; //!!!! LINKER ERROR IN GCC !!!!!!
//Set values by parsing files - determined at run-time
std::string param_string = "FavoriteFruit"; //this would be loaded from a file
std::string param_value = "grape"; //this would be loaded from a file
for (size_t i = 0; i < params.size(); i++) {
for (size_t j = 0; j < params[i].max_value; j++) {
if (params[i].paramName == param_string
&& params[i].str_ptr[j] == param_value) {
params[i].value = j;
break;
}
}
}
return 0;
}
As you can see, there are enums and string arrays involved and these need to match, so for maintenance purposes I need to keep these in the same place. Furthermore, since this code is already written and will be used in both Windows and Linux environments, the smaller the fix, the better. I would prefer not to have to re-write thousands of lines just to get it to compile in Linux. Thanks!
The program is valid in C++17. The program is not valid in C++14 or older standards. The default standard mode of GCC 6.5.0 is C++14.
To make the program conform to C++14, you must define the static member (in exactly one translation unit). Since C++17, the constexpr declaration is implicitly an inline variable definition and thus no separate definition is required.
Solution 1: Upgrade your compiler and use C++17 standard (or later if you're from the future), which has inline variables. Inline variables have been implemented since GCC 7.
Solution 2: Define the variable outside the class definition in exactly one translation unit (the initialisation remains in the declaration).
It looks like you are not using C++17, and pre-C++17 this is undefined behavior, and one that I really dislike. You do not have a definition for your s, but you are ODR-using it, which means that you need to have a definition.
To define this, you have to define it in .cpp file.
In C++17 that would be the valid code. I am not familiar with MSVC, so I am not sure why it works fine there - be it that it is compiled as C++17, or because it is just a different manifestation of undefined behavior.
For C++98, C++11, and C++14 you need to explicitly tell the compiler where the initialization of Foo::s lives (see below) and you are good to go.
struct Foo{
static const char* s[];
};
const char* Foo::s[] = {"one","two","three"};
And as explained in one of the comments your initialization is OK from C++17.

C++ How to avoid access of members, of a object that was not yet initialized

What are good practice options for passing around objects in a program, avoiding accessing non initialized member variables.
I wrote a small example which I think explains the problem very well.
#include <vector>
using namespace std;
class container{public:container(){}
vector<int> LongList;
bool otherInfo;
};
class Ship
{
public:Ship(){}
container* pContainer;
};
int main()
{
//Create contianer on ship1
Ship ship1;
ship1.pContainer = new container;
ship1.pContainer->LongList.push_back(33);
ship1.pContainer->otherInfo = true;
Ship ship2;
//Transfer container from ship1 onto ship2
ship2.pContainer = ship1.pContainer;
ship1.pContainer = 0;
//2000 lines of code further...
//embedded in 100 if statements....
bool info = ship1.pContainer->otherInfo;
//and the program crashes
return 0;
}
The compiler cannot determine if you are introducing undefined behavior like shown in your example. So there's no way to determine if the pointer variable was initialized or not, other than initializing it with a "special value".
What are good practice options for passing around objects in a program, avoiding accessing non initialized member variables.
The best practice is always to initialize the pointer, and check before dereferencing it:
class Ship {
public:
Ship() : pContainer(nullptr) {}
// ^^^^^^^^^^^^^^^^^^^^^
container* pContainer;
};
// ...
if(ship1.pContainer->LongList) {
ship1.pContainer->LongList.push_back(33);
}
As for your comment:
So there are no compiler flags that could warn me?
There are more simple and obvious cases, where the compiler may leave you with a warning:
int i;
std::cout << i << std::endl;
Spits out
main.cpp: In functin 'int main()':
main.cpp:5:18: warning: 'i' is used uninitialized in this function [-Wuninitialized]
std::cout << i << std::endl;
^
See Live Demo
One good practice to enforce the checks is to use std::optional or boost::optional.
class Ship
{
public:
Ship() : pContainer(nullptr) {}
std::optional<container*> Container()
{
if(!pContainer)
return {};
return pContainer;
}
private:
container* pContainer;
};
It will force you (or better: provide a firm reminder) to check the result of your getter:
std::optional<container*> container = ship1.Container();
container->otherInfo; // will not compile
if(container)
(*container)->otherInfo; // will compile
You would always need to check the result of operation if you use pointers. What I mean is that with optional the situation is more explicit and there's less probability that you as the programmer will forget to check the result.
It seems that you are looking for a way to make your code
bool info = ship1.pContainer->otherInfo;
work even though the pContainer may be null.
You can use a sentinel object, which holds some default data:
container default_container;
default_container.otherInfo = false; // or whatever the default is
Then use a pointer to the sentinel object instead of a null pointer:
//Transfer container from ship1 onto ship2
ship2.pContainer = ship1.pContainer;
ship1.pContainer = &default_container; // instead of 0
//2000 lines of code further...
//embedded in 100 if statements....
bool info = ship1.pContainer->otherInfo;
If you use this, you should make sure the sentinel object cannot be destroyed (e.g. make it a static member, or a singleton).
Also, in the constructor, initialize your pointers so they point to the sentinel object:
class Ship
{
public: Ship(): pContainer(&default_container) {}
...
};
I found an additional solution. It is admittedly not preventing the access of uninitialized objects, but at least the program crashes AND returns an error message, that enables us to correct our mistake. (This solution is particularly for the g++ compiler.)
First of all set the compiler flag _GLIBCXX_DEBUG. Then instead of naked pointer use unique_ptr.
#include <vector>
#include <iostream>
#include <memory>
using namespace std;
class container{
public:container(){}
int otherInfo = 33;
};
class Ship
{
public:Ship(){}
std::unique_ptr<container> upContainer;
};
int main()
{
Ship ship1;
cout<<ship1.upContainer->otherInfo<<endl;
return 0;
}
This code will produce an error:
std::unique_ptr<_Tp, _Dp>::pointer = container*]: Assertion 'get() != pointer()' failed.
Hence telling us that we should probably include an if(ship1.upContainer) check.
What are good practice options for passing around objects in a program, avoiding accessing non initialized member variables.
Good practice would be to initialize everything in the constructor.
Debatable better practice is to initialize everything in the constructor and provide no way of modifying any members.

Can GCC optimize methods of a class with compile-time constant variables?

Preamble
I'm using avr-g++ for programming AVR microcontrollers and therefore I always need to get very efficient code.
GCC usually can optimize a function if its argument are compile-time constants, e.g. I have function pin_write(uint8_t pin, bool val) which determine AVR's registers for the pin (using my special map from integer pin to a pair port/pin) and write to these registers correspondent values. This function isn't too small, because of its generality. But if I call this function with compile-time constant pin and val, GCC can make all calculations at compile-time and eliminate this call to a couple of AVR instructions, e.g.
sbi PORTB,1
sbi DDRB,1
Amble
Let's write a code like this:
class A {
int x;
public:
A(int x_): x(x_) {}
void foo() { pin_write(x, 1); }
};
A a(8);
int main() {
a.foo();
}
We have only one object of class A and it's initialized with a constant (8). So, it's possible to make all calculations at compile-time:
foo() -> pin_write(x,1) -> pin_write(8,1) -> a couple of asm instructions
But GCC doesn't do so.
Surprisely, but if I remove global A a(8) and write just
A(8).foo()
I get exactly what I want:
00000022 <main>:
22: c0 9a sbi 0x18, 0 ; 24
24: b8 9a sbi 0x17, 0 ; 23
Question
So, is there a way to force GCC make all possible calculation at compile-time for single global objects with constant initializers?
Because of this trouble I have to manually expand such cases and replace my original code with this:
const int x = 8;
class A {
public:
A() {}
void foo() { pin_write(x, 1); }
}
UPD. It very wonderful: A(8).foo() inside main optimized to 2 asm instructions. A a(8); a.foo() too! But if I declare A a(8) as global -- compiler produce big general code. I tried to add static -- it didn't help. Why?
But if I declare A a(8) as global -- compiler produce big general code. I tried to add static -- it didn't help. Why?
In my experience, gcc is very reluctant if the object / function has external linkage. Since we don't have your code to compile, I made a slightly modified version of your code:
#include <cstdio>
class A {
int x;
public:
A(int x_): x(x_) {}
int f() { return x*x; }
};
A a(8);
int main() {
printf("%d", a.f());
}
I have found 2 ways to achive that the generated assembly corresponds to this:
int main() {
printf("%d", 64);
}
In words: to eliminate everything at compile time so that only the necessary minimum remains.
One way to achive this both with clang and gcc is:
#include <cstdio>
class A {
int x;
public:
constexpr A(int x_): x(x_) {}
constexpr int f() const { return x*x; }
};
constexpr A a(8);
int main() {
printf("%d", a.f());
}
gcc 4.7.2 already eliminates everything at -O1, clang 3.5 trunk needs -O2.
Another way to achieve this is:
#include <cstdio>
class A {
int x;
public:
A(int x_): x(x_) {}
int f() const { return x*x; }
};
static const A a(8);
int main() {
printf("%d", a.f());
}
It only works with clang at -O3. Apparently the constant folding in gcc is not that aggressive. (As clang shows, it can be done but gcc 4.7.2 did not implement it.)
You can force the compiler to fully optimize the function with all known constants by changing the pin_write function into a template. I don't know if the particular behavior is guaranteed by the standard though.
template< int a, int b >
void pin_write() { some_instructions; }
This will probably require fixing all lines where pin_write is used.
Additionally, you can declare the function as inline. The compiler isn't guaranteed to inline the function (the inline keyword is just an hint), but if it does, it has a greater chance to optimize compile time constants away (assuming the compiler can know it is an compile time constant, which may be not always the case).
Your a has external linkage, so the compiler can't be sure that there isn't other code somewhere modifying it.
If you were to declare a const then you make clear it shouldn't change, and also stop it having external linkage; both of those should help the compiler to be less pessimistic.
(I'd probably declare x const too - it may not help here, but if nothing else it makes it clear to the compiler and the next reader of the code that you never change it.)

G++: Can __attribute__((__may_alias__)) be used for pointer to class instance rather than to class definition itself?

I'm looking to the answer to the following question: is may_alias suitable as attribute for pointer to an object of some class Foo? Or must it be used at class level only?
Consider the following code(it is based on a real-world example which is more complex):
#include <iostream>
using namespace std;
#define alias_hack __attribute__((__may_alias__))
template <typename T>
class Foo
{
private:
/*alias_hack*/ char Data[sizeof (T)];
public:
/*alias_hack*/ T& GetT()
{
return *((/*alias_hack*/ T*)Data);
}
};
struct Bar
{
int Baz;
Bar(int baz)
: Baz(baz)
{}
} /*alias_hack*/; // <- uncommeting this line apparently solves the problem, but does so on class-level(rather than pointer-level)
// uncommenting previous alias_hack's doesn't help
int main()
{
Foo<Bar> foo;
foo.GetT().Baz = 42;
cout << foo.GetT().Baz << endl;
}
Is there any way to tell gcc that single pointer may_alias some another?
BTW, please note that gcc detection mechanism of such problem is imperfect, so it is very easy to just make this warning go away without actually solving the problem.
Consider the following snippet of code:
#include <iostream>
using namespace std;
int main()
{
long i = 42;
long* iptr = &i;
//(*(short*)&i) = 3; // with warning
//(*(short*)iptr) = 3; // without warning
cout << i << endl;
}
Uncomment one of the lines to see the difference in compiler output.
Simple answer - sorry, no.
__attrbite__ gives instructions to the compiler. Objects exist in the memory of the executed program. Hence nothing in __attribute__ list can relate to the run-time execution.
Dimitar is correct. may_alias is a type attribute. It can only apply to a type, not an instance of the type. What you'd like is what gcc calls a "variable attribute". It would not be easy to disable optimizations for one specific pointer. What would the compiler do if you call a function with this pointer? The function is potentially already compiled and will behave based on the type passed to the function, not based on the address store in the pointer (you should see now why this is a type attribute)
Now depending on your code something like that might work:
#define define_may_alias_type(X) class X ## _may alias : public X { } attribute ((may_alias));
You'd just pass your pointer as Foo_may_alias * (instead of Foo *) when it might alias. That's hacky though
Wrt your question about the warning, it's because -Wall defaults to -Wstrict-aliasing=3 which is not 100% accurate. Actually, -Wstrict-aliasing is never 100% accurate but depending on the level you'll get more or less false negatives (and false positives). If you pass -Wstrict-aliasing=1 to gcc, you'll see a warning for both

how to initialize a static struct in c++?

I have managed to initialize correct any variable of basic type(i.e. int, char, float etc) but when declaring a little complex variable all i can see is errors.
In the header file timer.h i declare
class AndroidTimerConcept {
...
private:
//struct that holds the necessary info for every event
struct Resources{
timer_delegate_t membFunct;
void *data;
int size;
millis_t time;
};
//declaring an array of 10 Resources structs
static struct Resources ResData;
static int best;
...
}
inside the timer.cpp file
#include <iostream>
#include "timer.h"
using namespace std;
int AndroidTimerModel::best=1000;
struct Resources AndroidTimerModel::ResData.size; //line 17!!
//constructor that initializes all the necessary variables
AndroidTimerModel::AndroidTimerModel()
{
signal(SIGALRM,signalHandler);
for(int i=0; i<MAX_EVENTS; i++)
{
//ResData.data=NULL;
ResData.size=-1;
//ResData.time=-1;
}
best=1000;
}
when compiling the .cpp file i get the error:
timer.cpp:7: error: expected initializer before ‘.’ token
Any suggestions would be really helpful.
btw i use g++
You can use a struct initializer in C++, but only in the pre-C99 style (i.e, you cannot use designated initializers). Designated intializers, which allow you to specify the members to be initialized by name, rather than relying on declaration order, were introduced in C99, but aren't part of any C++ standard at the moment (belying the common assumption that C++ is a superset of C).
If you are willing to write non-portable C++ code that specifically targets g++, you can always use the GCC-specific extension which has the same functionality as designated constructors. The syntax is like this:
struct value_t my_val = { member_a: 1, member_b: 1.2f };
This reference provides a pretty good overview of both types of initialization in the C context.
Here's an excerpt that shows both the earlier (without designators) and C99 styles:
When initializing a struct, the first initializer in the list
initializes the first declared member (unless a designator is
specified) (since C99), and all subsequent initializers without
designators (since C99) initialize the struct members declared after
the one initialized by the previous expression.
struct point {double x,y,z;} p = {1.2, 1.3}; // p.x=1.2, p.y=1.3, p.z=0.0
div_t answer = {.quot = 2, .rem = -1 }; // order of elements in div_t may vary
In some cases you may need to write some code to initialize a structure, and in this case you can use the result of a function, like:
struct Resources AndroidTimerModel::ResData = function_that_acts_like_a_constructor();
You don't separately define individual instance members within a static member.
This should be enough:
AndroidTimerModel::Resources AndroidTimerModel::ResData;
You need to declare and define a constructor for struct Resources.
eg
struct Resources{
timer_delegate_t membFunct;
void *data;
int size;
millis_t time;
Resources():membFunct(0), data(0), size(0), time(0) {}
....
};
You need to initialise the whole struct variable, something like this:
AndroidTimerConcept::Resources AndroidTimerModel::ResData = { NULL, NULL, 0, 0 };
Is it AndroidTimerModel or AndroidTimerConcept, you can't use different names and expect the compiler to think they're the same thing.
You need to scope the name Resources, it's not in global scope, it's in the scope of the AndroidTimerModel class:
AndroidTimerModel::Resources AndroidTimerModel::ResData;
I suggest you give Resources a constructor:
struct Resources{
Resources(timer_delegate_t aMembFunct, void* aData, int aSize, millis_t aTime )
: membFunc(aMembFunct)
, data(aData)
, size(aSize)
, time(aTime)
{}
timer_delegate_t membFunct;
void *data;
int size;
millis_t time;
};
And you can then define Res in your .cpp as:
AndroidTimerModel::Resources AndroidTimerModel::ResData(/* params here */);
Why is your struct part of a class? I would make it global outside of the class.
memset(&structname, 0, sizeof(structname)); will initialize your structure to 0.