Is there a way to detect inline function ODR violations? - c++

So I have this code in 2 separate translation units:
// a.cpp
#include <stdio.h>
inline int func() { return 5; }
int proxy();
int main() { printf("%d", func() + proxy()); }
// b.cpp
inline int func() { return 6; }
int proxy() { return func(); }
When compiled normally the result is 10. When compiled with -O3 (inlining on) I get 11.
I have clearly committed an ODR violation for func().
It showed up when I started merging sources of different dll's into fewer dll's.
I have tried:
GCC 5.1 -Wodr (which requires -flto)
gold linker with -detect-odr-violations
setting ASAN_OPTIONS=detect_odr_violation=1 before running an instrumented binary with the address sanitizer.
Asan can supposedly catch other ODR violations (global vars with different types or something like that...)
This is a really nasty C++ issue and I am amazed there isn't reliable tooling for detecting it.
Perhaps I have misused one of the tools I tried? Or is there a different tool for this?
EDIT:
The problem remains unnoticed even when I make the 2 implementations of func() drastically different so they don't get compiled to the same amount of instructions.
This also affects class methods defined inside the class body - they are implicitly inline.
// a.cpp
struct A { int data; A() : data(5){} };
// b.cpp
struct A { int data; A() : data(6){} };
Legacy code with lots of copy/paste + minor modifications after that is a joy.

The tools are imperfect.
I think Gold's check will only notice when the symbols have different types or different sizes, which isn't true here (both functions will compile to the same number of instructions, just using a different immediate value).
I'm not sure why -Wodr doesn't work here, but I think it only works for types, not functions, i.e. it will detect two conflicting definitions of a class type T but not your func().
I don't know anything about ASan's ODR checking.

The simplest way to detect such violations is to copy all the functions into a single compilation unit (create one temporarily if needed). Any C++ compiler will then be able to detect and report duplicate definitions when compiling that file.

Related

For a function that takes a const struct, does the compiler not optimize the function body?

I have the following piece of code:
#include <stdio.h>
typedef struct {
bool some_var;
} model_t;
const model_t model = {
true
};
void bla(const model_t *m) {
if (m->some_var) {
printf("Some var is true!\n");
}
else {
printf("Some var is false!\n");
}
}
int main() {
bla(&model);
}
I'd imagine that the compiler has all the information required to eliminate the else clause in the bla() function. The only code path that calls the function comes from main, and it takes in const model_t, so it should be able to figure out that that code path is not being used. However:
With GCC 12.2 we see that the second part is linked in.
If I inline the function this goes away though:
What am I missing here? And is there some way I can make the compiler do some smarter work? This happens in both C and C++ with -O3 and -Os.
The compiler does eliminate the else path in the copy of the function that is inlined into main. What you are looking at is the stand-alone global definition of the function, which is not called anyway and will be discarded by the linker eventually.
If you use the -fwhole-program flag to let the compiler know that no other file is going to be linked, that unused segment is discarded.
Additionally, you can use the static or inline keywords to achieve something similar.
The compiler cannot optimize the else path away as the object file might be linked against any other code. This would be different if the function were static or if you used whole-program optimization.
The only code path that calls the function comes from main
GCC can't know that unless you tell it so with -fwhole-program or maybe -flto (link-time optimization). Otherwise it has to assume that some static constructor in another compilation unit could call it. (Including possibly in a shared library, but another .cpp that you link with could do it.) e.g.
// another .cpp
typedef struct { bool some_var; } model_t;
void bla(const model_t *m); // declare the things from the other .cpp
int foo() {
model_t model = {false};
bla(&model);
return 1;
}
int some_global = foo(); // C++ only: non-constant static initializer.
Example on Godbolt with these lines in the same compilation unit as main, showing that it outputs both Some var is false! and then Some var is true!, without having changed the code for main.
ISO C doesn't have easy ways to get init code executed, but GNU C (and GCC specifically) have ways to get code run at startup, not called by main. This works even for shared libraries.
With -fwhole-program, the appropriate optimization would be simply not emitting a definition for it at all, as it's already inlined into the call-site in main. Like with inline (In C++, a promise that any other caller in another compilation unit can see its own definition of the function) or static (private to this compilation unit).
Inside main, it has optimized away the branch after constant propagation. If you ran the program, no branch would actually execute; nothing calls the stand-alone definition of the function.
The stand-alone definition of the function doesn't know that the only possible value for m is &model. If you put that inside the function, then it could optimize like you're expecting.
Only -fPIC would force the compiler to consider the possibility of symbol-interposition so the definition of const model_t model isn't the one that is in effect after (dynamic) linking. But you're compiling code for an executable not a library. (You can disable symbol-interposition for a global variable by giving it "hidden" visibility, __attribute__((visibility("hidden"))), or use -fvisibility=hidden to make that the default).

Can an inline variable be changed after initialization in C++17?

My scenario is the following (it worked in clang but not in gcc)
liba.hpp:
inline int MY_GLOBAL = 0;
libother.cpp: (dll)
#include "liba.hpp"
void myFunc() {
//
MY_GLOBAL = 28;
}
someexe.cpp:
RunAppThatUsesBothLibAandLibOther();
The problem is that the inline variable was showing 0 in places where I expected 28 because it was already modified at run-time. MSVC disagrees with this, but clang does the thing I would expect.
The question is: can inline variables be modified at run-time in my scenario? (I solved the problem by de-inlining the variable.)
Yes, inline variables can be modified after initialization.
However, DLLs are strange things on Windows with MSVC. To a close approximation, each DLL is modelled as its own C++ program, with an entirely independent runtime. Therefore, there is one copy of your inline variable for the main program, and another for the DLL.

Can't get warnings to work for header-only library

I'm creating a header-only library, and I would like to get warnings for it displayed during compilation. However, it seems that only warnings for the "main" project including the library get displayed, but not for the library itself.
Is there a way I can force the compiler to check for warnings in the included library?
// main.cpp
#include "MyHeaderOnlyLib.hpp"
int main() { ... }
// Compile
g++ ./main.cpp -Wall -Wextra -pedantic ...
// Warnings get displayed for main.cpp, but not for MyHeaderOnlyLib.hpp
I'm finding MyHeaderOnlyLib.hpp via a CMake script, using find_package. I've checked the command executed by CMake, and it's using -I, not -isystem.
I've tried both including the library with <...> (when it's in the /usr/include/ directory), or locally with "...".
I suppose that you have a template library and you are complaining about the lack of warnings from its compilation. Don't look for a bad #include path; that would end up as an error. Unfortunately, without instantiation (that is, unless the templates are actually used by the .cpp), the compiler has no way to interpret the templates reliably, let alone produce sensible warnings. Consider this:
#include <vector>
template <class C>
struct T {
bool pub_x(const std::vector<int> &v, int i)
{
return v.size() < i;
}
bool pub_y(const std::vector<int> &v, int i)
{
return v.size() < i;
}
};
typedef T<int> Tint; // will not help
bool pub_z(const std::vector<int> &v, unsigned int i) // if signed, produces warning
{
return v.size() < i;
}
class WarningMachine {
WarningMachine() // note that this is private
{
//T<int>().pub_y(std::vector<int>(), 10); // to produce warning for the template
}
};
int main()
{
//Tint().pub_y(std::vector<int>(), 10); // to produce warning for the template
return 0;
}
You can try it out in codepad. Note that pub_z produces a signed / unsigned comparison warning as soon as it is compiled (when its parameter i is declared signed), despite never being called. It is a whole different story for the templates, though. Even if T::pub_y is called, T::pub_x still passes unnoticed without a warning. This depends on the compiler implementation: some compilers perform more aggressive checking once all the information is available, others tend to be lazy. Note that neither T::pub_x nor T::pub_y depends on the template argument.
The only way to do it reliably is to instantiate the templates and call the functions. Note that the code which does that does not need to be reachable (such as in WarningMachine), making it a candidate to be optimized away (but that depends), and also meaning that the values passed to the functions need not be valid, as the code will never run (that will save you allocating arrays or preparing whatever data the functions may need).
On the other hand, since you will have to write a lot of code to really check all the functions, you may as well pass valid data and check for result correctness and make it useful, instead of likely confusing the hell out of anyone who reads the code after you (as is likely in the above case).

same class, different size...?

See the code below and you will understand what confuses me.
Test.h
class Test {
public:
#ifndef HIDE_VARIABLE
int m_Test[10];
#endif
};
Aho.h
class Test;
int GetSizeA();
Test* GetNewTestA();
Aho.cpp
//#define HIDE_VARIABLE
#include "Test.h"
#include "Aho.h"
int GetSizeA() { return sizeof(Test); }
Test* GetNewTestA() { return new Test(); }
Bho.h
class Test;
int GetSizeB();
Test* GetNewTestB();
Bho.cpp
#define HIDE_VARIABLE // important!
#include "Test.h"
#include "Bho.h"
int GetSizeB() { return sizeof(Test); }
Test* GetNewTestB() { return new Test(); }
TestPrj.cpp
#include <iostream> // std::cout, std::cin
#include <tchar.h>  // _tmain, _TCHAR
#include "Aho.h"
#include "Bho.h"
#include "Test.h"
int _tmain(int argc, _TCHAR* argv[]) {
int a = GetSizeA();
int b = GetSizeB();
Test* pA = GetNewTestA();
Test* pB = GetNewTestB();
pA->m_Test[0] = 1;
pB->m_Test[0] = 1;
// output : 40 1
std::cout << a << '\t' << b << std::endl;
char temp;
std::cin >> temp;
return 0;
}
Aho.cpp does not #define HIDE_VARIABLE, so GetSizeA() returns 40, but
Bho.cpp does #define HIDE_VARIABLE, so GetSizeB() returns 1.
But, Test* pA and Test* pB both have member variable m_Test[].
If the size of class Test from Bho.cpp is 1, then pB is weird, isn't it?
I don't understand what's going on, please let me know.
Thanks, in advance.
Environment:
Microsoft Visual Studio 2005 SP1 (or SP2?)
You violated the requirements of One Definition Rule (ODR). The behavior of your program is undefined. That's the only thing that's going on here.
According to ODR, classes with external linkage have to be defined identically in all translation units.
Your code exhibits undefined behavior. You are violating the one definition rule (class Test is defined differently in two places). Therefore the compiler is allowed to do whatever it wants, including "weird" behavior.
In addition to ODR.
Most of the grief is caused by including the headers only in the cpp files, allowing you to change the definition between the compilation units.
But, Test* pA and Test* pB both have member variable m_Test[].
No, pB doesn't have m_Test[]. However, the TestPrj compilation unit doesn't know that and applies the wrong layout of the class, so it compiles.
Unless you compile in debug with capturing of memory overrun you would most times not see a problem.
pB->m_Test[9] = 1; would write to memory that was not allocated for pB, which may or may not be a valid place for you to write.
As many people have said here, you've violated the so-called One Definition Rule (ODR).
It's important to realize how C/C++ programs are assembled. The translation units (cpp files) are compiled separately, without any connection to each other. Next, the linker assembles the executable from the symbols and the pieces of code generated by the compiler. It doesn't have any high-level type information, hence it is unable to (and is not expected to) detect the problem.
So you've actually cheated the compiler, beaten yourself, shot yourself in the foot, whatever you like.
One point that I'd like to mention is that the ODR is actually violated very frequently due to subtle changes in the various included header files and miscellaneous defines, but usually there's no problem and people don't even realize it.
For instance, a structure may have a member of LPCTSTR type, which is a pointer to either char or wchar_t, depending on the defines, includes, etc. But this type of violation is "almost ok". As long as you don't actually use this member in differently compiled translation units there's no problem.
There are also many other common examples. Some arise from member functions implemented in-class (inlined), which actually compile into different code within different translation units (due to different compiler options for different translation units, for instance).
However this is usually ok. In your case however the memory layout of the structure has changed. And here we have a real problem.

Weird seg fault problem

Greetings,
I'm having a weird seg fault problem. My application dumps a core file at runtime. After digging into it I found it died in this block:
#include <lib1/c.h>
...
x::c obj;
obj.func1();
I defined class c in a library lib1:
namespace x
{
struct c
{
c();
~c();
void func1();
vector<char *> _data;
};
}
x::c::c()
{
}
x::c::~c()
{
for ( int i = 0; i < _data.size(); ++i )
delete _data[i];
}
I could not figure it out for some time till I ran nm on the lib1.so file: there are more function definitions than I defined:
x::c::c()
x::c::c()
x::c::~c()
x::c::~c()
x::c::func1()
x::c::func2()
After searching the code base I found someone else had defined a class with the same name in the same namespace, but in another library lib2, as follows:
namespace x
{
struct c
{
c();
~c();
void func2();
vector<string> strs_;
};
}
x::c::c()
{
}
x::c::~c()
{
}
My application links to lib2, which has dependency on lib1. This interesting behavior brings several questions:
Why would it even work? I would expect a "multiple definitions" error while linking against lib2 (which depends upon lib1) but never got one. The application seems to be doing what's defined in func1 except that it dumps a core at runtime.
After attaching debugger, I found my application calls the ctor of class c in lib2, then calls func1 (defined in lib1). When going out of scope it calls dtor of class c in lib2, where the seg fault occurs. Can anybody teach me how this could even occur?
How can I prevent such problems from happening again? Is there any C++ syntax I can use?
Forgot to mention I'm using g++ 4.1 on RHEL4, thank you very much!
1.
Violations of the "one definition rule" don't have to be diagnosed by your compiler. In fact, they are often only going to be known at link time when you link multiple object files together.
At link time, the information about the original class definitions may not exist any more (they are not needed after the compiler step) so having multiple definitions of a class is typically not easy to flag to the user.
2.
Once you have two distinct definitions pretty much anything can happen, you are in the territory of undefined behaviour. Whatever happens, it's a possible outcome.
3.
The most sensible thing to do is to communicate with the other members of your team. Agree who's going to use which namespaces and you won't get these problems. Otherwise, you can run a documentation tool or static analysis tool over your entire project. Many such tools will be able to diagnose multiple inconsistent definitions of classes.
Just a guess but I don't see any using namespace x; so perhaps it used one namespace instead of the other?
With the advent of templates it became necessary to allow multiple definitions of a body of code with the same name; there was no way for the compiler to know if the same template code had already been generated in another compilation unit i.e. source file. When the linker finds these duplicates, it assumes they are identical. The burden is on you to make sure that they are - this is called the One Definition Rule.
On the linker level this is library interpositioning. The effective symbol bound unfortunately depends on the order of object files on linker command line (this is, sigh, historical).
From what you describe it looks like lib1 comes first in the linker argument list and lib2 comes second and interposes on symbols from lib1. This explains the calls to constructors and destructors from lib2 but calls to func1 from lib1 (since there's no func1-derived symbol in lib2, there's no "hiding" and the call is bound to lib1).
The solution to this particular problem is to reverse the order of libraries on the linker invocation command.
There are lots of answers about the one definition rule. However, to me, this looks a lot more like a missing copy constructor.
To elaborate:
If your object is ever copied, the implicitly generated copy constructor copies only the pointers, so delete will be called on the same set of pointers twice. That is a double delete (undefined behaviour, typically a crash), not a memory leak.
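#include <cstddef>
#include <cstring>
#include <vector>
using std::vector;
namespace x
{
struct c
{
c() {
}
~c() {
// Assumes every element was allocated with new char[].
for ( std::size_t i = 0; i < _data.size(); ++i )
delete[] _data[i];
}
c(const c & rhs) {
// Deep-copy each string so that every object owns its own buffers.
for (std::size_t i = 0; i < rhs._data.size(); ++i) {
std::size_t len = std::strlen(rhs._data[i]);
char *mem = new char[len + 1];
std::memcpy(mem, rhs._data[i], len + 1); // copies the terminating '\0' too
_data.push_back(mem);
}
}
void func1();
vector<char *> _data;
};
}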