I am trying to compile a program so that it starts at a different entry point. I am running WSL1 with Ubuntu 20.04.5, and GCC and G++ 9.4.0.
I found that adding the flag -Wl,--entry=foo to the compiler makes the linker use foo() as the entry function. In my tests this works with gcc, but not with g++.
Using the example file src/main.c:
#include <stdlib.h>
#include <stdio.h>

int main()
{
    printf("Entering %s:%s\n", __FILE__, __func__);
    return 0;
}

int not_main()
{
    printf("Entering %s:%s\n", __FILE__, __func__);
    exit(0); // Necessary, otherwise it throws a segfault
}
When compiled with gcc -Wl,--entry=not_main -o entry.o src/main.c the output is what I want: Entering src/main.c:not_main.
However, when compiled with g++ -Wl,--entry=not_main -o entry.o src/main.c, the following warning appears: /usr/bin/ld: warning: cannot find entry symbol not_main; defaulting to 0000000000001080.
This defaults to the main() function, outputting Entering src/main.c:main. The function not_main() is not found by the linker, but it is present in the source code.
The documentation for g++ says:
g++ is a program that calls GCC and automatically specifies linking against the C++ library.
I don't see how g++ can differ from gcc, if internally one calls the other. I understand that it is not the compiler but the linker which changes the entry point, and that g++ (unlike gcc) is linking against the C++ library, but I fail to understand how that is problematic.
What am I missing?
Because of name mangling, the function is not not_main but _Z8not_mainv.
On "how g++ can differ from gcc", see:
What is the difference between g++ and gcc? Why use g++ instead of gcc to compile *.cc files?
C++, unlike C, uses name mangling to distinguish different overloads of the same function name, and g++ compiles src/main.c as C++ even though the file has a .c extension.
When compiled with gcc:
$ objdump -t entry.o | grep not_main
000000000000117c g F .text 0000000000000036 not_main
When compiled with g++:
$ objdump -t entry.o | grep not_main
0000000000000000 *UND* 0000000000000000 not_main
000000000000117c g F .text 0000000000000036 _Z8not_mainv
The *UND*efined reference to not_main was probably placed there by the linker since you requested this as the entry point. The actual not_main function has its name mangled to _Z8not_mainv.
To export not_main under its original name, use extern "C":
extern "C" int not_main()
{
    printf("Entering %s:%s\n", __FILE__, __func__);
    exit(0); // Necessary, otherwise it throws a segfault
}
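To see why the linker could not find not_main, it can help to look at how overloads force distinct symbol names. The following is a minimal sketch: overload_id and plain_entry are hypothetical names chosen for illustration, and the mangled names in the comments assume the Itanium C++ ABI that g++ uses on Linux.

```cpp
// Two overloads share one source-level name, so the compiler has to encode
// the parameter types into each symbol name.
int overload_id(int)    { return 1; }   // symbol: _Z11overload_idi
int overload_id(double) { return 2; }   // symbol: _Z11overload_idd

// C linkage turns mangling off: the symbol is simply "plain_entry".
// The price is that a C-linkage function cannot be overloaded.
extern "C" int plain_entry() { return 0; }
```

Compiling this with g++ -c and running nm on the resulting object file should list the three names from the comments; only plain_entry could be passed to --entry unchanged.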
I am trying to write a program to JIT some code. The JITted code needs to call back into the running application for run-time support, but the run-time support symbols are not found when the function is materialized.
I tried to follow the Kaleidoscope tutorial. I need to call a function in the run-time from generated IR. For example, I want to call this function from LLVM IR:
extern "C" void* llvmNewVector() {
    return new vector<int>();
}
According to the Kaleidoscope tutorial it should be declared extern "C" and in the run-time of the application. Within the LLVM IR I have created a function prototype and the IR is correctly generated (no errors after checking the function I am jitting).
It would seem to me that there would be something more to do to link this function to the jitted code, but the Kaleidoscope tutorial doesn't seem to do that.
My problem is that the jitted code fails to materialize because the external symbols are not resolved.
The following code prints "made it here" but gets no further.
cerr << "made it here." << endl;
auto Sym = ExitOnErr(TheJIT->lookup(name));
NativeCodePtr FP = (NativeCodePtr)Sym.getAddress();
assert(FP && "Failed to find function ");
cerr << "returning jitted function " << name << endl;
return FP;
I am sure I am doing something wrong or missing some step, but I have not been able to find it.
The output I get is:
made it here.
JIT session error: Symbols not found: { llvmNewVector }
Failed to materialize symbols: { my_test }
The code was compiled using LLVM-9 with the following flags:
clang++ -I. -g -I../include/ -std=c++11 -fexceptions -fvisibility=hidden -fno-rtti -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -MT main.o -MD -MP -MF .deps/main.Tpo -c -o main.o main.cpp
For linking the following was used:
llvm-config --libs
I ran into this same issue and solved it the following way.
The following lines of code in the tutorial, whose goal is to resolve symbols in the host process, do not seem to work:
ES.getMainJITDylib().setGenerator(
cantFail(DynamicLibrarySearchGenerator::GetForCurrentProcess(DL)));
So instead, I manually registered the symbols that I wanted linked like this:
SymbolMap M;
// Register every symbol that can be accessed from the JIT'ed code.
M[Mangle("llvmNewVector")] = JITEvaluatedSymbol(
    pointerToJITTargetAddress(&llvmNewVector), JITSymbolFlags());
cantFail(ES.getMainJITDylib().define(absoluteSymbols(M)));
I added this code right after the two lines of code that I mentioned above, from the tutorial.
How about adding the -Xlinker --export-dynamic option to clang?
I ran into a similar problem in the tutorial.
In my environment (Ubuntu 20.04), sin and cos can be resolved, but printd and putchard (the functions defined in the source code of the Kaleidoscope processor) cannot.
After compilation, can you see the function name in the dynamic symbol table of the program?
objdump -T program | grep llvmNewVector
If your objdump has no -T option (e.g., on Mac), this check may not apply.
In my case, neither printd nor putchard appears in the dynamic symbol table (although both appear in the regular symbol table).
To add these function names to the dynamic symbol table, you need to pass the -Xlinker --export-dynamic option to clang (the option is actually passed on to ld). For example, for the tutorial:
clang++ -Xlinker --export-dynamic -g toy.cpp `llvm-config --ldflags --system-libs --libs all` -O3 -o toy
After compilation, the function names appear in the dynamic symbol table, and the examples of the tutorial work well.
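As far as I understand, the tutorial's process-symbol generator finds host functions through the platform's dynamic loader, which is why the dynamic symbol table matters. A rough way to check the same thing from inside the program is sketched below. visible_in_process is a hypothetical helper, it assumes a POSIX system with dlopen/dlsym, and older glibc versions need -ldl at link time.

```cpp
#include <dlfcn.h>

// Returns true if `name` can be found through the dynamic loader, starting
// from the main program. Functions defined in the executable itself are
// only found this way when they are in the dynamic symbol table
// (for example after linking with --export-dynamic).
bool visible_in_process(const char *name) {
    void *self = dlopen(nullptr, RTLD_LAZY);  // handle for the main program
    if (self == nullptr) return false;
    bool found = dlsym(self, name) != nullptr;
    dlclose(self);
    return found;
}
```

Calling visible_in_process("llvmNewVector") in the host before creating the JIT should return true once the function is exported, and false otherwise.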
It depends on which LLVM version you use. LLVM 10 has the LLJIT class, and it worked for me in the following way:
auto J = ExitOnErr(LLJITBuilder().create());
auto M = createDemoModule();
auto &dl = J->getDataLayout();
MangleAndInterner Mangle(J->getExecutionSession(), dl);
auto &jd = J->getMainJITDylib();
auto s = absoluteSymbols(
    {{Mangle("printd"),
      JITEvaluatedSymbol(pointerToJITTargetAddress(&printd),
                         JITSymbolFlags::Exported)}});
// define() returns an llvm::Error, which has to be checked.
ExitOnErr(jd.define(s));
The printd function was defined in the same file:
extern "C" int32_t printd() {
    std::cout << "calling " << __FUNCTION__ << "...\n";
    return 11;
}
I am using Cygwin 32-bit under Win7 in a 64-bit machine.
The following program
makefile:
runme: main.cpp asm.o
	g++ main.cpp asm.o -o executable
asm.o: asm.asm
	nasm -f elf asm.asm -o asm.o
asm.asm:
section .data
section .bss
section .text
global GetValueFromASM
GetValueFromASM:
mov eax, 9
ret
main.cpp:
#include <iostream>
using namespace std;

extern "C" int GetValueFromASM();

int main()
{
    cout << "GetValueFromASM() returned = " << GetValueFromASM() << endl;
    return 0;
}
is giving me the following error:
$ make
nasm -f elf asm.asm -o asm.o
g++ main.cpp asm.o -o executable
/tmp/cc3F1pPh.o:main.cpp:(.text+0x26): undefined reference to `GetValueFromASM'
collect2: error: ld returned 1 exit status
make: *** [makefile:2: runme] Error 1
I don't understand why this error is being generated.
How can I get rid of this issue?
You have to prefix your symbols with _, as is customary on 32-bit Windows/Cygwin, where the compiler adds a leading underscore to C symbol names:
section .data
section .bss
section .text
global _GetValueFromASM
_GetValueFromASM:
mov eax, 9
ret
The rest of your code should work fine.
An alternative would be to compile with -fno-leading-underscore. However, this may break linking with other (Cygwin system) libraries. I suggest using the first option if portability to other platforms does not matter to you.
Quoting from the GNU Online Docs:
-fleading-underscore
This option and its counterpart, -fno-leading-underscore, forcibly change the way C symbols are represented in the object file. One use is to help link with legacy assembly code.
Warning: the -fleading-underscore switch causes GCC to generate code that is not binary compatible with code generated without that switch. Use it to conform to a non-default application binary interface. Not all targets provide complete support for this switch.
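GCC and Clang expose the prefix they put in front of C symbols as the predefined macro __USER_LABEL_PREFIX__, which gives a quick way to check what name the assembly file has to export. A small sketch, where c_symbol_name is a hypothetical helper and the macro is assumed to be defined (other compilers may not provide it):

```cpp
#include <string>

#define STRINGIFY_RAW(x) #x
#define STRINGIFY(x) STRINGIFY_RAW(x)

// Builds the linker-level name of an extern "C" function by prepending the
// target's symbol prefix: "_" on targets such as 32-bit Cygwin, empty on
// x86-64 Linux.
std::string c_symbol_name(const std::string &identifier) {
    return std::string(STRINGIFY(__USER_LABEL_PREFIX__)) + identifier;
}
```

Printing c_symbol_name("GetValueFromASM") on the target in question shows the exact label the global directive in asm.asm needs to declare.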
I'd like to build a dynamic library from a Rust program and link it to an existing C++ project.
For the C++ project, we are stuck on using gcc for compilation (a relatively old gcc 4.8.2, but I'm also trying with gcc 7.3.0 with the same issue).
This is a minimal example of the issue:
src/lib.rs
#[no_mangle]
pub unsafe extern "C" fn hello() {
    println!("Hello World, Rust here!");
}
Cargo.toml
[package]
name = "gcc-linking"
version = "0.1.0"
authors = ..
edition = "2018"
[lib]
crate-type = ["dylib"]
[dependencies]
hello.cpp:
extern "C" void hello();

int main() {
    hello();
    return 0;
}
Now, when I link with clang, everything is fine:
cargo build --lib
clang -L target/debug -l gcc_linking hello.cpp -o hello
LD_LIBRARY_PATH=target/debug:$LD_LIBRARY_PATH ./hello
As expected, this results in:
Hello World, Rust here!
But if I try to link this with gcc, I get the following linking error:
gcc -L target/debug -l gcc_linking hello.cpp -o hello
Output:
/tmp/ccRdGJOK.o: In function `main':
hello.cpp:(.text+0x5): undefined reference to `hello'
collect2: error: ld returned 1 exit status
Looking at the dynamic library:
# objdump -T output
0000000000043f60 g DF .text 0000000000000043 Base hello
# nm -gC output
0000000000043f60 T hello
I suspect the problem has something to do with mangling of function names, but I cannot figure out how to solve it.
Any ideas?
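As @Jmb suggested, the solution was to change the order of the arguments to gcc and list the shared library after the C++ file. The linker processes its inputs in order, so a library named before the file that needs it can be skipped: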
gcc -L target/debug hello.cpp -l gcc_linking -o hello
I am using Apple LLVM version 8.0.0 (clang-800.0.42.1) to compile. There are about 1200 files, and I have used them before. I compile them all with no problems. Then I make my static library (ar rcs libblib.a *.o), again with no problems. The problem appears when I try to use my brand new library.
gcc main.c -L. -lblib
Undefined symbols for architecture x86_64:
"_N_method", referenced from:
_main in main-7fc584.o
ld: symbol(s) not found for architecture x86_64
But, I know this is defined. I check to see that the file is included (ar -t libblib.a | grep N_METHOD.o) and it is in there. Check the source file, and there is the method, exactly named as it is in the header file. What is the problem I am having here? I am at a complete loss and I am hoping I am missing something simple.
I did nm -g N_METHOD.o and got back:
0000000000000000 T __Z8N_methodP6stacks
Transferring comments into an answer.
Based on the question content, I asked:
Have you checked that N_METHOD.o is a 64-bit object file (or a fat object file with both 32-bit and 64-bit code in it)? If it is a 32-bit object file, then it is no use for a 64-bit program. However, that's a little unlikely; you have to go out of your way to create a 32-bit object file on Mac.
Have you run nm -g N_METHOD.o to see whether _N_method is defined in the object file?
I did nm -g N_METHOD.o and got back:
0000000000000000 T __Z8N_methodP6stacks
Don't compile C code with a C++ compiler. Or don't try to compile C++ code with a C compiler. The mangled name (__Z8N_methodP6stacks) is for C++. Maybe you simply need to link with g++ instead of gcc? They are different languages — this is the property of 'type-safe linkage' that is characteristic of C++ and completely unknown to C.
First step — compile and link with:
g++ main.c -L. -lblib
Assuming that the source is in the C++ subset of C (or C subset of C++), then the chances are that should work. At least, if the code contains N_method(&xyz) where xyz is a variable of type stacks, then there's a chance it will call __Z8N_methodP6stacks.
The following code:
typedef struct stacks stacks;
extern int N_method(stacks*);
extern int relay(stacks *r);
int relay(stacks *r) { return N_method(r); }
compiles with a C++ compiler to produce the nm -g output:
0000000000000000 T __Z5relayP6stacks
U __Z8N_methodP6stacks
It also compiles with a C compiler to produce the nm -g output:
0000000000000038 s EH_frame1
U _N_method
0000000000000000 T _relay
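If the library has to stay callable from C, another option is to give the function C linkage in the library itself, so that a C++ compiler emits the unmangled name. This is a sketch only: the body of struct stacks and the return value are invented for illustration, since the question does not show them.

```cpp
typedef struct stacks stacks;
struct stacks { int depth; };   // hypothetical layout, not from the question

// The guard lets the same declaration be read by a C and a C++ compiler.
#ifdef __cplusplus
extern "C" {
#endif
int N_method(stacks *s);
#ifdef __cplusplus
}
#endif

// The definition inherits C linkage from the declaration above, so the
// object file exports N_method (_N_method on macOS) instead of a mangled name.
int N_method(stacks *s) { return s ? s->depth : -1; }
```

With the declaration wrapped this way in the shared header, gcc main.c -L. -lblib should be able to resolve _N_method even when the library objects were built by a C++ compiler.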
I am trying to use my own printf function, so I don't want to include the standard include files. I am therefore compiling my code with -nostdinc.
I have created my program something like this:
extern int printf(const char*,...);
printf("Value:%d",1234);
//printf("\n");
This code works fine, but when I add printf("\n") the linker reports an undefined reference to 'putchar'.
If I comment out printf("\n"); then the nm command shows
$ nm test1.o
U exit
00000000 T main
U printf
00000030 T _start
but if I use printf("\n"); then the nm command shows
$ nm test1.o
U exit
00000000 T main
U printf
U putchar
0000003c T _start
I don't understand how and from where putchar is getting pulled in.
gcc version 4.8.2 (GCC)
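gcc optimizes printf in certain situations. You can look at the function fold_builtin_printf in the GCC source for the complete details. IIRC, when the return value is unused and the format string contains no conversions, a one-character string such as "\n" is turned into a putchar call, and a longer string ending in a newline is turned into a puts call. You can turn this off by specifying -fno-builtin (see the gcc docs).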
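The rule can be approximated in a few lines. What follows is a rough model written from memory of fold_builtin_printf, not GCC's actual code: Lowering and likely_lowering are hypothetical names, the model assumes the result of printf is ignored, and it leaves out some cases (for example "%s" with a constant string argument).

```cpp
#include <string>

enum class Lowering { Printf, Putchar, Puts, Nothing };

// Guesses which library call GCC is likely to emit for printf(fmt, ...)
// when the return value is unused and builtins are enabled.
Lowering likely_lowering(const std::string &fmt) {
    if (fmt == "%s\n") return Lowering::Puts;      // printf("%s\n", s) -> puts(s)
    if (fmt == "%c") return Lowering::Putchar;     // printf("%c", c)   -> putchar(c)
    if (fmt.find('%') != std::string::npos)
        return Lowering::Printf;                   // other conversions stay printf
    if (fmt.empty()) return Lowering::Nothing;     // printf("") does nothing
    if (fmt.size() == 1) return Lowering::Putchar; // printf("\n")      -> putchar('\n')
    if (fmt.back() == '\n') return Lowering::Puts; // printf("hi\n")    -> puts("hi")
    return Lowering::Printf;                       // no newline at the end: unchanged
}
```

Under this model, "Value:%d" stays a printf call while "\n" becomes putchar, which matches the two nm listings in the question.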