Same class, different size...? - C++

Look at the code below and you will understand what I'm confused about.
Test.h
class Test {
public:
#ifndef HIDE_VARIABLE
    int m_Test[10];
#endif
};
Aho.h
class Test;
int GetSizeA();
Test* GetNewTestA();
Aho.cpp
//#define HIDE_VARIABLE
#include "Test.h"
#include "Aho.h"
int GetSizeA() { return sizeof(Test); }
Test* GetNewTestA() { return new Test(); }
Bho.h
class Test;
int GetSizeB();
Test* GetNewTestB();
Bho.cpp
#define HIDE_VARIABLE // important!
#include "Test.h"
#include "Bho.h"
int GetSizeB() { return sizeof(Test); }
Test* GetNewTestB() { return new Test(); }
TestPrj.cpp
#include "Aho.h"
#include "Bho.h"
#include "Test.h"
int _tmain(int argc, _TCHAR* argv[]) {
int a = GetSizeA();
int b = GetSizeB();
Test* pA = GetNewTestA();
Test* pB = GetNewTestB();
pA->m_Test[0] = 1;
pB->m_Test[0] = 1;
// output : 40 1
std::cout << a << '\t' << b << std::endl;
char temp;
std::cin >> temp;
return 0;
}
Aho.cpp does not #define HIDE_VARIABLE, so GetSizeA() returns 40, but
Bho.cpp does #define HIDE_VARIABLE, so GetSizeB() returns 1.
But, Test* pA and Test* pB both have member variable m_Test[].
If the size of class Test from Bho.cpp is 1, then pB is weird, isn't it?
I don't understand what's going on, please let me know.
Thanks, in advance.
Environment:
Microsoft Visual Studio 2005 SP1 (or SP2?)

You violated the requirements of the One Definition Rule (ODR). The behavior of your program is undefined. That's the only thing that's going on here.
According to the ODR, classes with external linkage have to be defined identically in all translation units.
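One ODR-safe way to get what the questioner seems to want (hiding the members from some clients) is the pimpl idiom: the class definition stays identical in every translation unit, and only the implementation file knows the real layout. A minimal sketch (the accessor name is hypothetical):

// Test.h - identical in all translation units
class TestImpl;             // layout hidden behind a pointer
class Test {
public:
    Test();
    ~Test();
    int& value(int i);      // hypothetical accessor replacing the public array
private:
    TestImpl* m_impl;
};

// Test.cpp - the only file that knows the real layout
class TestImpl {
public:
    int m_Test[10];
};
Test::Test() : m_impl(new TestImpl) {}
Test::~Test() { delete m_impl; }
int& Test::value(int i) { return m_impl->m_Test[i]; }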

Your code exhibits undefined behavior. You are violating the one definition rule (class Test is defined differently in two places). Therefore the compiler is allowed to do whatever it wants, including "weird" behavior.

In addition to the ODR:
Most of the grief is caused by including the header only in the .cpp files, which allows the definition to change between compilation units.

But, Test* pA and Test* pB both have member variable m_Test[].

No, pB doesn't have m_Test[]. However, the TestPrj compilation unit doesn't know that and applies the wrong structure to the class, so it will compile.
Unless you compile a debug build with memory-overrun detection, most of the time you won't see a problem.
pB->m_Test[9] = 1; would write to memory that was not allocated for pB, which may or may not be a valid place for you to write.
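If you want this kind of layout mismatch to fail at compile time rather than corrupt memory at runtime, one option is to assert the expected size in every translation unit that uses the class. A minimal sketch, assuming a C++11 compiler (the questioner's VS2005 would need a macro-based equivalent):

// At the top of each .cpp that uses Test:
#include "Test.h"
static_assert(sizeof(Test) == sizeof(int) * 10,
              "Test has an unexpected layout in this translation unit");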

Like many people have told you here, you've violated the so-called One Definition Rule (ODR).
It's important to realize how C/C++ programs are assembled: the translation units (cpp files) are compiled separately, without any connection to each other. Then the linker assembles the executable from the symbols and code pieces generated by the compiler. It doesn't have any high-level type information, hence it's unable (and should not be expected) to detect the problem.
So you've actually cheated the compiler and shot yourself in the foot.
One point I'd like to mention is that the ODR is actually violated very frequently, due to subtle changes in the various include header files and miscellaneous defines, but usually there's no problem and people don't even realize it.
For instance, a structure may have a member of type LPCTSTR, which is a pointer to either char or wchar_t depending on the defines, includes, etc. But this type of violation is "almost ok": as long as you don't actually use that member in differently compiled translation units, there's no problem.
There are also many other common examples. Some arise from member functions implemented in-class (and hence inlined), which actually compile into different code in different translation units (due to different compiler options for different translation units, for instance).
However, this is usually ok. In your case, however, the memory layout of the structure has changed. And here we have a real problem.
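To illustrate the LPCTSTR hazard mentioned above, a sketch assuming the Windows headers (where LPCTSTR is const char* or const wchar_t* depending on whether UNICODE is defined):

// common.h - included by two .cpp files compiled with different settings
#include <windows.h>
struct Settings {
    LPCTSTR title;   // const char* in ANSI builds, const wchar_t* in UNICODE builds
};
// If one .cpp is compiled with UNICODE defined and another without,
// the two translation units disagree about this member's type:
// a quiet ODR violation that is harmless until the member is actually shared.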

Related

Why is the declaration/definition order still important in C++?

Many times now, I have had problems with declaration and definition order in C++:
struct A {
    void Test() { B(); }
};
void B() {
    A a;
}
Of course this can be solved by forward-declaring B(), as sketched right below. Usually this is good enough to solve any of these problems. But when working with module-based header-only libraries or similarly complex include systems, this declaration/definition concept can be really painful. I have included a fuller example in the Example section below.
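For reference, the forward-declaration fix for the opening snippet:

void B();                   // forward declaration
struct A {
    void Test() { B(); }    // B is now known to the compiler
};
void B() {
    A a;
}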
Nowadays most compilers for modern languages make two passes over the source files, building the declarations in the first pass and processing the definitions in the second. Introducing this scheme into C++ shouldn't break any old code either. Therefore:
Why hasn't this, or a similar approach, been introduced into C++ already?
Are there any relevant clauses in the current standard inhibiting this approach?
Example
This is an example of a module-based header-only library, which has blocking includes because of missing forward declarations. To solve this, the user of the library would have to forward-declare the "missing" classes, which is not feasible.
Of course this problem might be solved by using a common include header that orders all declarations before the definitions, but with a two-pass compiler this code would also work, with no modification required.
oom.h
#pragma once
#include "string.h"
struct OOM {
    String message;
};
string.h
#pragma once
#include "array.h"
struct String {
    Array data;
};
array.h
#pragma once
struct Array {
    void Alloc();
};
#include "oom.h"
inline void Array::Alloc() { throw OOM(); }   // inline, since this header is included from several TUs
str_usage.cpp
#include "string.h"
int main() {
    String str;
}
void f(int);
void g() { f(3.14); }
void f(double);
g currently calls f(int), because it's the only f visible. What does it call in your world?
If it calls f(double), you just broke copious existing code.
If you came up with some rules to make it still call f(int), then that means if I write
void g2() { f2(3.14); }
void f2(double);
and then introduce a worse match for the argument - say, void f2(int); before g2, g2 will suddenly start calling the wrong thing. That's a maintainability nightmare.
A much simpler solution is to separate class definitions from function definitions:
struct A {
    void Test();
};
struct B {
    A a;
};
inline void A::Test() {
    B();
}
There are ambiguities in the C++ grammar that can only be resolved if you know what an identifier refers to.
For example:
a * b;
can be either a multiplication if a is a variable, or a pointer declaration if a is a type. Each of these leads to a different parse tree, so the parser must know what a is.
This means that parsing and name resolution cannot be performed in separate passes, but must be done in one pass, leading to the requirement to pre-declare names.
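A short sketch of that ambiguity (names hypothetical):

namespace case1 {
    int a, b;
    void f() { a * b; }     // expression: multiply a by b, discard the result
}
namespace case2 {
    struct a {};
    void f() { a * b; }     // declaration: b is a local pointer to struct a
}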

Including different versions of the same class

I lost most of this afternoon tracking down a bug which basically came down to two different versions of the same header file, declaring the same class, being included in Visual Studio 2015. Greatly simplified, it appears as follows:
oldcamera.h
#pragma once
class camera
{
public:
    camera();
    int a;
    double x,y,z;
};
camera.h
#pragma once
class camera
{
public:
    camera();
    double x,y,z;
};
camera.cpp
#include "camera.h"
camera::camera()
{
    x = y = 0;
    z = 1;
}
mytransclass.h
#pragma once
#include "oldcamera.h"
class trans
{
public:
    camera m_camera;
};
func.cpp
#include "mytransclass.h"
void MyFunc(trans *ptrans)
{
    ptrans->m_camera.x = 1.0;
    ptrans->m_camera.y = 2.0;
    ptrans->m_camera.z = 3.0;
}
The project includes camera.cpp and func.cpp, and when single-stepping through MyFunc, the debugger showed the assignments weren't actually doing anything. The question is whether this should compile and link without warning, and if it is legal (which, knowing the convoluted heritage of C++, is likely), why does the assignment fail? If it is legal, is there any way to flag it as an error? The compiler is Visual C++ 2015.
When you #include something, you're basically copying and pasting that file into the place where you #included it. It's not actually an error to have multiple definitions of the same class, as long as two of them don't end up in the same compilation unit (cpp). If they do, that breaks the one definition rule.
In your setup, code using the version of camera from oldcamera.h is calling the member functions compiled for camera.h (those are the only ones that were compiled, since you said oldcamera.cpp wasn't in the project). However, those functions rely on the data layout of the class camera being a certain way. Since oldcamera.h's camera and the new camera.h's camera have different data layouts (the extra int a shifts the offsets of x, y, z), shit hits the fan.
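If the legacy header has to stay around, one way to make the two definitions coexist safely (a hypothetical fix, assuming you can edit the old header and its users) is to wall the old class off in its own namespace:

// oldcamera.h
#pragma once
namespace legacy {
class camera
{
public:
    camera();
    int a;
    double x,y,z;
};
}
// Old code now refers to legacy::camera, which the linker can no
// longer confuse with the camera defined in camera.h.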
Your program violates the One Definition Rule, which in particular says ([basic.def.odr]/6):
There can be more than one definition of a class type (Clause 9), ... in a program provided that each definition
appears in a different translation unit, and provided the definitions satisfy the following requirements. Given
such an entity named D defined in more than one translation unit, then
(6.1) — each definition of D shall consist of the same sequence of tokens
...
If the definitions of D do not satisfy these requirements, then the behavior is undefined.

Is there a way to detect inline function ODR violations?

So I have this code in 2 separate translation units:
// a.cpp
#include <stdio.h>
inline int func() { return 5; }
int proxy();
int main() { printf("%d", func() + proxy()); }
// b.cpp
inline int func() { return 6; }
int proxy() { return func(); }
When compiled normally the result is 10. When compiled with -O3 (inlining on) I get 11.
I have clearly done an ODR violation for func().
It showed up when I started merging sources of different dll's into fewer dll's.
I have tried:
GCC 5.1 -Wodr (which requires -flto)
gold linker with -detect-odr-violations
setting ASAN_OPTIONS=detect_odr_violation=1 before running an instrumented binary with the address sanitizer.
Asan can supposedly catch other ODR violations (global vars with different types or something like that...)
This is a really nasty C++ issue and I am amazed there isn't reliable tooling for detecting it.
Perhaps I have misused one of the tools I tried? Or is there a different tool for this?
EDIT:
The problem remains unnoticed even when I make the 2 implementations of func() drastically different, so that they don't get compiled to the same number of instructions.
This also affects class methods defined inside the class body - they are implicitly inline.
// a.cpp
struct A { int data; A() : data(5){} };
// b.cpp
struct A { int data; A() : data(6){} };
Legacy code with lots of copy/paste + minor modifications after that is a joy.
The tools are imperfect.
I think Gold's check will only notice when the symbols have different types or different sizes, which isn't true here (both functions will compile to the same number of instructions, just using a different immediate value).
I'm not sure why -Wodr doesn't work here, but I think it only works for types, not functions, i.e. it will detect two conflicting definitions of a class type T but not your func().
I don't know anything about ASan's ODR checking.
The simplest way to detect such concerns is to copy all the functions into a single compilation unit (create one temporarily if needed). Any C++ compiler will then be able to detect and report duplicate definitions when compiling that file.
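A sketch of that diagnostic, assuming the two sources can be combined textually (file names from the question):

// odr_check.cpp - compiled only as a one-off check, never shipped
#include "a.cpp"
#include "b.cpp"   // compiler error: redefinition of 'int func()'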

Weird seg fault problem

Greetings,
I'm having a weird seg fault problem. My application dumps a core file at runtime. After digging into it I found it died in this block:
#include <lib1/c.h>
...
x::c obj;
obj.func1();
I defined class c in a library lib1:
namespace x
{
    struct c
    {
        c();
        ~c();
        void func1();
        vector<char *> _data;
    };
}
x::c::c()
{
}
x::c::~c()
{
    for ( int i = 0; i < _data.size(); ++i )
        delete _data[i];
}
I could not figure it out for some time till I ran nm on the lib1.so file: there are more function definitions than I defined:
x::c::c()
x::c::c()
x::c::~c()
x::c::~c()
x::c::func1()
x::c::func2()
After searching the code base I found that someone else had defined a class with the same name in the same namespace, but in another library, lib2, as follows:
namespace x
{
    struct c
    {
        c();
        ~c();
        void func2();
        vector<string> strs_;
    };
}
x::c::c()
{
}
x::c::~c()
{
}
My application links to lib2, which has a dependency on lib1. This interesting behavior raises several questions:
Why would it even work? I would expect a "multiple definitions" error while linking against lib2 (which depends upon lib1), but never got one. The application seems to do what's defined in func1, except that it dumps a core at runtime.
After attaching debugger, I found my application calls the ctor of class c in lib2, then calls func1 (defined in lib1). When going out of scope it calls dtor of class c in lib2, where the seg fault occurs. Can anybody teach me how this could even occur?
How can I prevent such problems from happening again? Is there any C++ syntax I can use?
Forgot to mention I'm using g++ 4.1 on RHEL4, thank you very much!
1.
Violations of the "one definition rule" don't have to be diagnosed by your compiler. In fact, they are often only going to be known at link time when you link multiple object files together.
At link time, the information about the original class definitions may not exist any more (they are not needed after the compiler step) so having multiple definitions of a class is typically not easy to flag to the user.
2.
Once you have two distinct definitions, pretty much anything can happen; you are in the territory of undefined behaviour. Whatever happens, it's a possible outcome.
3.
The most sensible thing to do is to communicate with the other members of your team. Agree on who's going to use which namespaces and you won't get these problems. Otherwise, point a documentation tool or static-analysis tool at your entire project; many such tools can diagnose multiple inconsistent definitions of classes.
Just a guess but I don't see any using namespace x; so perhaps it used one namespace instead of the other?
With the advent of templates it became necessary to allow multiple definitions of a body of code with the same name; there was no way for the compiler to know whether the same template code had already been generated in another compilation unit (i.e. source file). When the linker finds these duplicates, it assumes they are identical. The burden is on you to make sure that they are - this is called the One Definition Rule.
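For example (a minimal sketch), a function template instantiated from a shared header produces a copy of the code in every object file that uses it, emitted as a weak symbol; the linker keeps one copy and assumes the rest are identical:

// header.h - included by both a.cpp and b.cpp
template <typename T>
T twice(T v) { return v + v; }

// a.cpp and b.cpp each call twice<int>(21), so each object file
// carries its own instantiation of twice<int>; the linker silently
// keeps just one of them.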
On the linker level this is library interpositioning. Which symbol ends up bound unfortunately depends on the order of object files on the linker command line (this is, sigh, historical).
From what you describe, it looks like lib1 comes first in the linker argument list and lib2 comes second and interposes on symbols from lib1. This explains the calls to the constructor and destructor from lib2 but the call to func1 from lib1 (since there's no func1-like symbol in lib2, there's no "hiding", and the call is bound to lib1).
The solution to this particular problem is to reverse the order of libraries on the linker invocation command.
There are lots of answers here about the one definition rule. To me, however, this also looks a lot like a missing copy constructor.
To elaborate:
If the copy constructor is called on your object, you will get a double free, because delete will be called on the same set of pointers twice.
#include <cstring>   // strlen, strncpy
#include <vector>
using std::vector;

namespace x
{
    struct c
    {
        c() {
        }
        ~c() {
            for ( size_t i = 0; i < _data.size(); ++i )
                delete [] _data[i];
        }
        c(const c & rhs) {
            for ( size_t i = 0; i < rhs._data.size(); ++i ) {
                size_t len = strlen(rhs._data[i]);
                char *mem = new char[len + 1];   // new[] paired with delete[] in the destructor
                strncpy(mem, rhs._data[i], len + 1);
                _data.push_back(mem);
            }
        }
        void func1();
        vector<char *> _data;
    };
}
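Of course, the simplest fix along these lines is to let the standard library do the memory management, which makes the compiler-generated copy constructor correct. A sketch:

#include <string>
#include <vector>
namespace x
{
    struct c
    {
        void func1();
        std::vector<std::string> _data;   // copies and frees itself correctly
    };
}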

When should linkers generate multiply defined X warnings?

Never turn your back on C++. It'll getcha.
I'm in the habit of writing unit tests for everything I do. As part of this I frequently define classes with names like A and B in the .cxx of the test to exercise code, safe in the knowledge that i) because this code never becomes part of a library or is used outside of the test, name collisions are likely very rare, and ii) the worst that could happen is that the linker will complain about a multiply defined A::A() or whatever, and I'll fix that error. How wrong I was.
Here are two compilation units:
#include <iostream>
using namespace std;
// Fwd decl.
void runSecondUnit();
class A {
public:
    A() : version( 1 ) {
        cerr << this << " A::A() --- 1\n";
    }
    virtual ~A() {
        cerr << this << " A::~A() --- 1\n";
    }
    int version;
};
void runFirstUnit() {
    A a;
    // Reports 1, correctly.
    cerr << " a.version = " << a.version << endl;
    // If you uncomment these, you will call
    // secondCompileUnit: A::getName() instead of A::~A !
    //A* a2 = new A;
    //delete a2;
}
int main( int argc, char** argv ) {
    cerr << "firstUnit BEGIN\n";
    runFirstUnit();
    cerr << "firstUnit END\n";
    cerr << "secondUnit BEGIN\n";
    runSecondUnit();
    cerr << "secondUnit END\n";
}
and
#include <iostream>
using namespace std;
void runSecondUnit();
// Uncomment to fix all the errors:
//#define USE_NAMESPACE
#if defined( USE_NAMESPACE )
namespace mySpace
{
#endif
class A {
public:
    A() : version( 2 ) {
        cerr << this << " A::A() --- 2\n";
    }
    virtual const char* getName() const {
        cerr << this << " A::getName() --- 2\n";
        return "A";
    }
    virtual ~A() {
        cerr << this << " A::~A() --- 2\n";
    }
    int version;
};
#if defined( USE_NAMESPACE )
} // mySpace
using namespace mySpace;
#endif
void runSecondUnit() {
    A a;
    // Reports 1. Not 2 as above!
    cerr << " a.version = " << a.version << endl;
    cerr << " a.getName()=='" << a.getName() << "'\n";
}
Ok, ok. Obviously I shouldn't have declared two classes called A. My bad. But I bet you can't guess what happens next...
I compiled each unit, and linked the two object files (successfully) and ran. Hmm...
Here's the output (g++ 4.3.3):
firstUnit BEGIN
0x7fff0a318300 A::A() --- 1
a.version = 1
0x7fff0a318300 A::~A() --- 1
firstUnit END
secondUnit BEGIN
0x7fff0a318300 A::A() --- 1
a.version = 1
0x7fff0a318300 A::getName() --- 2
a.getName()=='A'
0x7fff0a318300 A::~A() --- 1
secondUnit END
So there are two separate A classes. In the second use, the destructor and constructor of the first one were used, even though only the second one was visible in its compilation unit. Even more bizarre: if I uncomment the lines in runFirstUnit, then instead of A::~A, A::getName is called. Clearly, in the first use the object gets the vtable for the second definition (getName is the second virtual function in the second class; the destructor is the second in the first). And it even correctly gets the constructor from the first.
So my question is: why didn't the linker complain about the multiply defined symbols?
It appears to choose the first match. Reordering the objects in the link step confirms this.
The behavior is identical in Visual Studio, so I'm guessing this is some standard-defined behavior. My question is, why? Clearly it would be easy for the linker to barf, given the duplicate names.
If I add,
void f() {}
to both files it complains. Why not for my class constructors and destructors?
EDIT The problem isn't, "what should I have done to avoid this", or "how is the behavior explained". It is, "why don't linkers catch it?" Projects may have thousands of compilation units. Sensible naming practices don't really solve this issue -- they only make the problem obscure, and only then if you can train everyone to follow them.
The above example leads to ambiguous behavior that is easily and definitively detectable by compiler tools. So why don't they detect it? Is this simply a bug? (I suspect not.)
** EDIT ** See litb's answer below. I'm repeating it back to make sure my understanding is right:
Linkers only generate errors for multiply defined strong symbols.
Because we have shared headers, inline function definitions (i.e. where the declaration and definition appear in the same place, and likewise template functions) are compiled into multiple object files, once for each TU that sees them. Because there's no easy way to restrict the generation of this code to a single object file, the linker has the job of choosing one of many definitions. So that the linker doesn't generate errors, the symbols for these compiled definitions are tagged as weak symbols in the object file.
The compiler and linker rely on both classes being exactly the same. In your case they are different, so strange things happen. The one definition rule says the result is undefined behavior, so the behavior is not at all required to be consistent among compilers. I suspect that in runFirstUnit, the delete line emits a call through the first virtual table entry (because in that translation unit, the destructor may occupy the first entry).
In the second translation unit, that entry happens to point to A::getName, but in the first translation unit (where you execute the delete), the entry points to A::~A. Since these two are differently named (A::~A vs A::getName), you don't get a symbol clash (code is emitted for both the destructor and getName). But since the class name is the same, their v-tables will clash: the linker thinks they belong to the same class and assumes they have the same contents.
Notice that all member functions were defined in-class, which means they are all inline functions. These may be defined multiple times in a program; for in-class definitions, the rationale is that you may include the same class definition into different translation units from a header file. Your test function void f() {}, however, isn't inline, so including it in different translation units triggers a linker error.
If you enable the namespaces, there is no clash whatsoever, because ::A and ::mySpace::A are different classes, and of course they get different v-tables.
A simple way to restrict each class to the current translation unit is to enclose it in an anonymous namespace:
// a.cpp
namespace {
    class A {
        // ...
    };
}
// b.cpp
namespace {
    class A {
        // ...
    };
}
This is perfectly legal. Because the two classes are in separate translation units and are inside anonymous namespaces, they won't conflict.
The functions are defined as inline. Inline functions can be defined multiple times in a program. See point 3 in the summary here:
http://en.wikipedia.org/wiki/One_Definition_Rule
The important point is:
For a given entity, each definition must be the same.
Try not defining the functions as inline. The linker should start to give duplicate symbol errors then.
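A sketch of that suggestion: moving the definitions out of the class body makes them ordinary (strong) symbols, so the duplicates become a hard link error (two-file layout as in the question):

// a.cpp
class A {
public:
    A();        // declared only; the out-of-class definition is not inline
};
A::A() {}       // strong symbol in a.o

// b.cpp
class A {
public:
    A();
};
A::A() {}       // strong symbol in b.o -> linker: multiple definition of A::A()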