gcc linking externs incorrectly? - c++

I'm relatively new to gcc and I'm using 'gcc (tdm-1) 5.1.0'. I've come across a very peculiar circumstance by mistake. I've shrunk my concern down to a very small reproducible example...
main.cpp
extern int g;
int main(){
return g;
}
someOtherFile.cpp
#include<windows.h>
RECT g;
I compile this with:
gcc -c someOtherFile.cpp
gcc main.cpp someOtherFile.o
and this will link without error.
Am I missing something here as to why this is allowed to link?

3.5/10:
After all adjustments of types (during which typedefs (7.1.3) are replaced by their definitions), the types specified by all declarations referring to a given variable or function shall be identical, except that declarations for an array object can specify array types that differ by the presence or absence of a major array bound (8.3.4). A violation of this rule on type identity does not require a diagnostic.
That last sentence means the compiler and linker are not required to give you an error message. It's your job to get it straight.
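One common way to make the compiler catch such a mismatch is to put the declaration in a header that both files include, so that the declaration and the definition are seen in the same translation unit. Below is a minimal single-file sketch of that idea; the header name shared.h, the helper read_g and the value 42 are assumptions made for illustration and are not part of the question.

```cpp
#include <type_traits>

// --- shared.h (hypothetical header, included by every file that uses g) ---
extern int g;

// --- someOtherFile.cpp (would start with #include "shared.h") ---
// Because the declaration above is visible here, defining g with any other
// type (for example `long g;`) is rejected by the compiler as a conflicting
// declaration instead of silently linking.
int g = 42;

// Optional extra check that the definition has the intended type.
static_assert(std::is_same<decltype(g), int>::value, "g must be an int");

// --- main.cpp (would also #include "shared.h") ---
int read_g() { return g; }
```

With this layout the type of g is written down once, and every file that disagrees with it fails to compile rather than relying on the linker.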

In C++ this is not valid, because the type of g does not match between main.cpp and someOtherFile.cpp. You need either int g; in someOtherFile.cpp or, the other way round, extern RECT g; in main.cpp.
In C this will compile and link, but not in C++.
To compile and link as C++:
g++ -c someOtherFile.cpp
g++ -c main.cpp
g++ main.o someOtherFile.o -o test
Alternatively, you may use functions for this:
main.cpp
int g();
int main(){
return g();
}
someOtherFile.cpp
#include<windows.h>
RECT r;
int g()
{
return (int)r.left; // a RECT cannot be cast to int directly, so read its first member
}
Obviously not recommended (there isn't much point in converting a RECT to an int), but that would be similar in effect to what you were trying to do.
The variable g in someOtherFile.cpp isn't related to the extern int g declared in main.cpp. Strictly speaking, int g isn't defined anywhere, so you would expect the linker to fail for this reason.
In practice g++ does compile and link this, because the symbol name it emits for a global variable does not encode the variable's type, whereas the Microsoft linker reports an error for this kind of mistake.

Related

C++ assign function in order to re-export it as extern "C"

Is it possible to export a C++ function with a C-compatible type (in this case (int, int) -> int) as a C symbol just by assigning it? Are there any hidden gotchas you have to watch out for?
The following code compiles without warnings and the resulting file has two symbols exposed.
I was surprised that it compiles at all, since I'm not sure what it means to copy a function.
namespace simplemath {
int add(int x, int y) {
return x + y;
}
}
extern "C" auto add = simplemath::add;
$ clang++ -Wall -Werror -pedantic --std=c++11 -c example.cc
$ nm example.o
0000000000000000 T __ZN10simplemath3addEii
0000000000000018 D _add
Is the code above equivalent to the following (up to whether or not simplemath::add is inlined)?
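namespace simplemath { int add(int x, int y); } // forward declaration so the wrapper below compiles
extern "C" int add(int x, int y) {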
return simplemath::add(x, y);
}
namespace simplemath {
int add(int x, int y) {
return x + y;
}
}
You get
$ clang++ -Wall -Werror -pedantic -c example.cc
$ nm example.o
0000000000000020 T __ZN10simplemath3addEii
0000000000000000 T _add
No, this won't, in general, work. Functions with C linkage and functions with C++ linkage have different types, even if they take the same argument types and return the same type. So that function-pointer assignment is not legal. Most compilers don't enforce this, however, so you might get away with it. Of course, as soon as you upgrade to a compiler that enforces the correct semantics here your code will break.
Many people misunderstand the difference between C and C++ linkage, and think that it's just a matter of name mangling. But it's more than that. For example, the compiler can use different calling conventions for the two different linkages, and in that case, there's no way you could use a pointer to a C++ function in C code. extern "C" tells the compiler to compile the function so that it can be called from C.
The first version doesn't really copy a function (you can't do that), but rather creates a pointer to the function.
That might perhaps work if the compiler uses the same calling conventions for C and C++ functions. It might fail to compile otherwise, I don't know.
Anyway, I would use the second version as that is the intended way to create wrapper functions, and will be more portable. By "repacking" the parameters in the extern "C" function we know that they are passed correctly.
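The difference between the two versions can be checked at compile time. The sketch below uses the hypothetical names add_ptr and add_fn (the question uses add for both) so that both variants fit in one file. It shows that the first form defines a pointer object, which is consistent with nm listing it as data (D), while the second defines a real function (T).

```cpp
#include <type_traits>

namespace simplemath {
int add(int x, int y) { return x + y; }
}

// Version 1: defines a *variable* with C linkage whose value is a pointer
// to the C++ function. Nothing is copied.
extern "C" auto add_ptr = simplemath::add;

// Version 2: defines a real function with C linkage that forwards its
// arguments to the C++ function.
extern "C" int add_fn(int x, int y) { return simplemath::add(x, y); }

static_assert(std::is_pointer<decltype(add_ptr)>::value,
              "add_ptr is a data object holding a function pointer");
static_assert(std::is_function<decltype(add_fn)>::value,
              "add_fn is an actual function");
```

A C caller of version 1 would therefore have to declare add as a function pointer variable, not as a function, which is one more reason to prefer the wrapper.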

What's the usecase of gcc's used attribute?

#include <stdio.h>
// xyz will be emitted with -flto (or if it is static) even when
// the function is unused
__attribute__((__used__))
void xyz() {
printf("Hello World!\n");
}
int main() {
return 0;
}
What do I need this for?
Is there any way I could still reach xyz somehow besides directly calling the function, like some dlsym() like magic?
The used attribute is helpful in situations where you want to force the compiler to emit a symbol that it might otherwise omit. As GCC's documentation says:
This attribute, attached to a function, means that code must be
emitted for the function even if it appears that the function is not
referenced. This is useful, for example, when the function is
referenced only in inline assembly.
For instance, if you have code as follows:
#include <iostream>
static int foo(int a, int b)
{
return a + b;
}
int main()
{
int result = 0;
// some inline assembly that calls foo and updates result
std::cout << result << std::endl;
}
you might notice that no symbol foo is present when compiling with the -O flag (optimization level -O1):
g++ -O -pedantic -Wall check.cpp -c
check.cpp:3: warning: ‘int foo(int, int)’ defined but not used
nm check.o | c++filt | grep foo
As a result you cannot reference foo within this (imaginary) inline assembly.
By adding:
__attribute__((__used__))
it turns into:
g++ -O -pedantic -Wall check.cpp -c
nm check.o | c++filt | grep foo
00000000 t foo(int, int)
thus now foo can be referenced within it.
You may also have spotted that gcc's warning is now gone, as you have told your compiler that you are sure that foo is actually used "behind the scenes".
A particular usecase is for interrupt service routines in a static library.
For example, a timer overflow interrupt:
void __attribute__((interrupt(TIMERA_VECTOR),used)) timera_isr(void)
This timera_isr is never called by any function in the user code, but it might form an essential part of a library.
To ensure it is linked and there isn't an interrupt vector pointing to an empty section, the attribute ensures the linker doesn't optimise it out.
If you declare a global variable or function that is unused, gcc may optimize it out (with a warning), but if you declare the global variable or the function with '__attribute__((used))', gcc will include it in the object file (and in the linked executable).
https://gcc.gnu.org/legacy-ml/gcc-help/2013-09/msg00108.html
Another use case is to generate proper coverage information for header files. Functions defined in header files are usually removed by the compiler when they are unreferenced. Therefore, you can get 100% coverage in your coverage reports even if you forgot to call some functions that are located in the header file. To prevent this, you may mark your functions with __attribute__((used)) in your coverage builds.
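A minimal sketch of that coverage idea, assuming a hypothetical COVERAGE_BUILD macro that your build system defines only for coverage builds. KEEP_FOR_COVERAGE and clamp_to_byte are likewise made-up names, and whether your coverage tool then reports the function as expected is something to verify in your own setup.

```cpp
// Expands to the attribute only in coverage builds with GCC-compatible
// compilers, and to nothing otherwise, so normal builds are unaffected.
#if defined(COVERAGE_BUILD) && defined(__GNUC__)
#define KEEP_FOR_COVERAGE __attribute__((used))
#else
#define KEEP_FOR_COVERAGE
#endif

// A small header-style helper. In a coverage build the attribute asks the
// compiler to emit code for it even if no caller exists in this file.
KEEP_FOR_COVERAGE constexpr int clamp_to_byte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}
```

Keeping the attribute behind a macro means release builds still let the compiler drop unreferenced helpers.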

Confusion about pointer values being compile-time constants

In C++, it is possible for pointer values to be compile-time constants. This must be true, because otherwise non-type template parameters and constexpr wouldn't work with pointers. However, as far as I know, the addresses of functions and of objects with static storage duration are known (at the earliest) at link time rather than at compile time. The following is an illustration:
main.cpp
#include <iostream>
template <int* p>
void f() { std::cout << p << '\n'; }
extern int a;
int main() {
f<&a>();
}
a.cpp
int a = 0;
I'm just wondering how the address of a could possibly be known when compiling main.cpp. I hope somebody could explain this a little to me.
In particular, consider this
template <int* p, int* pp>
constexpr std::size_t f() {
return (p + 1) == (pp + 7) ? 5 : 10;
}
int main() {
int arr[f<&a, &b>()] = {};
}
How should the storage for arr be allocated?
PLUS: This mechanism seems to be rather robust. Even when I enabled Randomized Base Address, the correct output is obtained.
The compiler doesn't need to know the value of &a at compile time any more than it needs the value of function addresses.
Think of it like this: the compiler will instantiate your function template with &a as a parameter and generate "object code" (in whatever format it uses to pass to the linker). The object code will look like (well it won't, but you get the idea):
func f__<funky_mangled_name_to_say_this_is_f_for_&a>__:
reg0 <- /* linker, pls put &std::cout here */
reg1 <- /* hey linker, stuff &a in there ok? */
call std::basic_ostream::operator<<(int*) /* linker, fun addr please? */
[...]
If you instantiate f<&b>, assuming b is another global static, the compiler does the same thing:
func f__<funky_mangled_name_to_say_this_is_f_for_&b>__:
reg0 <- /* linker, pls put &std::cout here */
reg1 <- /* hey linker, stuff &b in there ok? */
call std::basic_ostream::operator<<(int*) /* linker, fun addr please? */
[...]
And when your code calls for calling either of those:
fun foo:
call f__<funky_mangled_name_to_say_this_is_f_for_&a>__
call f__<funky_mangled_name_to_say_this_is_f_for_&b>__
Which exact function to call is encoded in the mangled function name.
The generated code doesn't depend on the runtime value of &a or &b.
The compiler knows there will be such things at runtime (you told it so), that's all it needs. It'll let the linker fill in the blanks (or yell at you if you failed to deliver on your promise).
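The same point as a small sketch in real C++: the template argument is the identity of the variable rather than a numeric address, so a declaration is enough to instantiate the template. The names tag and b are assumptions added for illustration, and the definitions sit at the end of the same file only to keep the example self-contained.

```cpp
#include <type_traits>

// Declarations only: at this point the compiler knows nothing about where
// a and b will eventually live in memory.
extern int a;
extern int b;

template <int* p>
struct tag {
    // The numeric address is left for the linker to fill in.
    static int* get() { return p; }
};

// Different variables give different instantiations; the same variable
// always names the same instantiation.
static_assert(!std::is_same<tag<&a>, tag<&b>>::value,
              "tag<&a> and tag<&b> are distinct types");
static_assert(std::is_same<tag<&a>, tag<&a>>::value,
              "tag<&a> always names the same type");

// In the question, the definition of a lives in a.cpp.
int a = 0;
int b = 0;
```

Both static_asserts are evaluated before the compiler has seen the definitions, which is the "promise" described above.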
For your addition, I'm afraid I'm not familiar enough with the constexpr rules, but the two compilers I have tell me that this function will be evaluated at runtime, which, according to them, makes the code non-conforming. (If they're wrong, then the answer above is, at least, incomplete.)
template <int* p, int* pp>
constexpr std::size_t f() {
return (p + 1) == (pp + 7) ? 5 : 10;
}
int main() {
int arr[f<&a, &b>()] = {};
}
clang 3.5 in C++14 standards conforming mode:
$ clang++ -std=c++14 -stdlib=libc++ t.cpp -pedantic
t.cpp:10:10: warning: variable length arrays are a C99 feature [-Wvla-extension]
int arr[f<&a, &b>()];
^
1 warning generated.
GCC g++ 5.1, same mode:
$ g++ -std=c++14 t.cpp -O3 -pedantic
t.cpp: In function 'int main()':
t.cpp:10:22: warning: ISO C++ forbids variable length array 'arr' [-Wvla]
int arr[f<&a, &b>()];
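For comparison, a pointer comparison that stays within the rules is accepted as a constant expression. The sketch below is a variation of the question's code, not the original: it drops the pointer arithmetic (pp + 7 would point outside the object b, which cannot be part of a constant expression) and only asks whether the two template arguments designate the same object. Defining a and b in the same file is an assumption made to keep it self-contained.

```cpp
#include <cstddef>

int a = 0;
int b = 0;

// Comparing the addresses of two named objects for equality needs no
// numeric address: the compiler knows whether they are the same object.
template <int* p, int* pp>
constexpr std::size_t f() {
    return p == pp ? 5 : 10;
}

static_assert(f<&a, &a>() == 5, "same object");
static_assert(f<&a, &b>() == 10, "different objects");

// A genuine fixed-size array of 10 ints, not a variable length array.
int arr[f<&a, &b>()] = {};
```

Here the storage question from the post has a simple answer, because the array bound is decided from object identity alone.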
As far as I know, variables with static storage duration and functions are stored simply as symbols (placeholders) in the symbol table while compiling. It is in the linking phase that the placeholders are resolved.
The compiler outputs machine code keeping the placeholders intact. Then the linker replaces the placeholders of the variables and functions with their respective memory locations. So in this case too, if you just compile main.cpp without compiling a.cpp and linking with it, you are bound to get a linker error, as you can see here http://codepad.org/QTdJCgle (I compiled main.cpp only)

How to trace out why gcc and g++ produce different code

Is it possible to see what is going on behind gcc and g++ compilation process?
I have the following program:
#include <stdio.h>
#include <unistd.h>
size_t sym1 = 100;
size_t *addr = &sym1;
size_t *arr = (size_t*)((size_t)&arr + (size_t)&addr);
int main (int argc, char **argv)
{
(void) argc;
(void) argv;
printf("libtest: addr of main(): %p\n", &main);
printf("libtest: addr of arr: %p\n", &arr);
while(1);
return 0;
}
Why is it possible to produce the binary without error with g++ while there is an error using gcc?
I'm looking for a method to trace what makes them behave differently.
# gcc test.c -o test_app
test.c:7:1: error: initializer element is not constant
# g++ test.c -o test_app
I think the reason may be that gcc uses cc1 as the compiler while g++ uses cc1plus.
Is there a way to make more precise output of what actually has been done?
I've tried to use -v flag but the output is quite similar. Are there different flags passed to linker?
What is the easiest way to compare two compilation procedures and find the difference in them?
In this case, gcc produces nothing because your program is not valid C. As the compiler explains, the initializer element (expression used to initialize the global variable arr) is not constant.
C requires the initializers of objects with static storage duration to be constant expressions, so that the contents of global variables can be placed in the data segment of the executable. This cannot be done for arr because the addresses of the variables involved are not known until link time, and their sum cannot be trivially filled in by the dynamic linker, as is the case for addr. C++ allows this, so g++ generates initialization code that evaluates the non-constant expressions and stores the results in the global variables. This code is executed before main() is invoked.
Executables cc1 and cc1plus are internal details of the implementation of the compiler, and as such irrelevant to the observed behavior. The relevant fact is that gcc expects valid C code as its input, and g++ expects valid C++ code. The code you provided is valid C++, but not valid C, which is why g++ compiles it and gcc doesn't.
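The distinction can be made visible in C++ itself. In the sketch below, constexpr forces the compiler to prove that an initializer is a constant expression; the name sum is an assumption standing in for the question's arr, and the description of what the compiler emits is the typical behaviour rather than a guarantee.

```cpp
#include <cstddef>
#include <cstdint>

std::size_t sym1 = 100;

// An address constant: the compiler can leave a slot for the linker to
// fill in, so this kind of initializer is also accepted in C.
constexpr std::size_t* addr = &sym1;

// The sum of two addresses is not a constant expression, so marking this
// constexpr would not compile. C++ accepts it as a dynamic initializer,
// which typically runs before main().
std::uintptr_t sum = reinterpret_cast<std::uintptr_t>(&sym1) +
                     reinterpret_cast<std::uintptr_t>(&addr);

static_assert(addr == &sym1, "addr is known at compile time to point at sym1");
```

Compiling the equivalent of the sum line as C is what produces the "initializer element is not constant" error shown in the question.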
There is a slightly more interesting question lurking here. Consider the following test cases:
#include <stdint.h>
#if TEST==1
void *p=(void *)(unsigned short)&p;
#elif TEST==2
void *p=(void *)(uintptr_t)&p;
#elif TEST==3
void *p=(void *)(1*(uintptr_t)&p);
#elif TEST==4
void *p=(void *)(2*(uintptr_t)&p);
#endif
gcc (even with the very conservative flags -ansi -pedantic-errors) rejects test 1 but accepts test 2, and accepts test 3 but rejects test 4.
From this I conclude that some operations that are easily optimized away (like casting to an object of the same size, or multiplying by 1) get eliminated before the check for whether the initializer is a constant expression.
So gcc might be accepting a few things that it should reject according to the C standard. But when you make them slightly more complicated (like adding the result of a cast to the result of another cast - what useful value can possibly result from adding two addresses anyway?) it notices the problem and rejects the expression.

Runtime initialization of a const

I have a header-only file constants.hpp, where I define all the constant variables to be used later in the library. However, there is one variable that I would like to define at run time in an implementation file. I tried to do something like this:
constants.hpp
extern const unsigned int numTests;
somewhere else in run.cpp
const unsigned int numTests = 10;
and, then yet another file tester.cpp uses
if ( n < numTests) {
// do something
}
Now, when I compile it, I get a linker error for tester.o: undefined symbol numTests. I sort of understand why this is happening: tester.cpp includes constants.hpp and not run.cpp, so it cannot find the constant numTests initialized in run.cpp.
Is there any better way to do it?
TIA,
Nikhil
Make sure you are compiling both run.cpp and tester.cpp when you compile your program and you won't get a linker error.
You need to link run.o when creating the executable:
g++ -o tester tester.cpp run.o    # for GNU C++
(Check your own compiler's command line switches if you're not using GNU C++)
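One more thing worth checking, beyond what the answers above mention: in C++ a const variable at namespace scope has internal linkage unless it has been declared extern, so run.cpp should see the extern declaration (for instance by including the header) before it defines numTests. Below is a single-file sketch of the three pieces, with shouldRun as a hypothetical helper standing in for the code in tester.cpp.

```cpp
// --- constants.hpp ---
extern const unsigned int numTests;

// --- run.cpp (includes constants.hpp) ---
// The earlier extern declaration gives this definition external linkage.
// Without it, `const unsigned int numTests = 10;` would be private to
// run.cpp and tester.o would still report an undefined symbol.
const unsigned int numTests = 10;

// --- tester.cpp (includes constants.hpp) ---
bool shouldRun(unsigned int n) { return n < numTests; }
```

If run.cpp cannot include the header for some reason, writing the definition as extern const unsigned int numTests = 10; has the same effect on linkage.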