Symbol not found when using template defined in a library - c++

I'm trying to use the Adobe XMP library in an iOS application, but I'm getting link errors. I double-checked that the appropriate headers and libraries are on my path. I also looked for the mangled names of the methods with the nm command, and they aren't in the library. What am I doing wrong?
Library Header:
#if defined ( TXMP_STRING_TYPE )
#include "TXMPMeta.hpp"
#include "TXMPIterator.hpp"
#include "TXMPUtils.hpp"
typedef class TXMPMeta <TXMP_STRING_TYPE> SXMPMeta; // For client convenience.
typedef class TXMPIterator <TXMP_STRING_TYPE> SXMPIterator;
typedef class TXMPUtils <TXMP_STRING_TYPE> SXMPUtils;
.mm file:
#include <string>
using namespace std;
#define IOS_ENV
#define TXMP_STRING_TYPE string
#import "XMP.hpp"
void DoStuff()
{
SXMPMeta meta;
string returnValue;
meta.SetProperty ( kXMP_NS_PDF, "test", "{ formId: {guid} }" );
meta.DumpObject(DumpToString, &returnValue);
}
Link Errors:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::DumpObject(int (*)(void*, char const*, unsigned int), void*) const", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::TXMPMeta()", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::SetProperty(char const*, char const*, char const*, unsigned int)", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::~TXMPMeta()", referenced from:
(null): Linker command failed with exit code 1 (use -v to see invocation)

Basically, what has happened is that you only have the declarations in the headers, not the definitions.
If I write
template<class T> T something(T); somewhere, that tells the compiler "trust me, it exists, leave it to the linker",
and the compiler adds a reference to the symbol in the object file as if it did exist. Because it can see the prototype, it knows the argument types, the return type and so on, so it sets things up so that the linker can come along later and fill in the address of the function.
BUT in your case there is no address. With a template, the definition (not just the declaration) must be visible in the translation unit that uses it, so that the compiler can instantiate it there (with weak linkage), unless some other translation unit instantiates it explicitly. Here the compiler assumed the member functions exist, but nowhere is the class actually stamped out from the template, so the linker doesn't find them, hence the error.
The addendum below expands on this.
Addendum 1:
template<class T> void output(T&);
int main(int,char**) {
int x = 5;
output(x);
return 0;
}
This will compile but NOT link.
Output:
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
g++ build/main.o -o a.out
build/main.o: In function `main':
(my home)/src/main.cpp:13: undefined reference to `void output<int>(int&)'
collect2: error: ld returned 1 exit status
make: *** [a.out] Error 1
(I hijacked an open project for this, hence the names.)
As you can see, the compile command (the one that ends in -o build/main.o) works fine, because we told the compiler "look, this function exists".
So the object file tells the linker (in a name-mangled form that preserves the template arguments) "put the address of void output<int>(int&) here", and the linker can't find it.
Compiles and links
#include <iostream>
template<class T> void output(T&);
int main(int,char**) {
int x = 5;
output(x);
return 0;
}
template<class T> void output(T& what) {
std::cout<<what<<"\n";
std::cout.flush();
}
Notice line 2: we tell the compiler "there exists a function template in T called output that returns nothing and takes a T reference". That means it can be used in main (remember, when the compiler is parsing main it hasn't seen the definition of output yet, it has only been told it exists). Because the definition appears later in the same translation unit, the compiler instantiates output<int> there and the linker resolves the call. Modern compilers are much smarter (because we have more RAM :) ) and restructure your code aggressively, and link-time optimisation does this even more, but this is how it used to work, and how it can still be thought of today.
Output:
make all
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
g++ build/main.o -o a.out
As you can see it compiled fine and linked fine.
Multiple files without include as proof of this
main.cpp
#include <iostream>
int TrustMeCompilerIExist();
int main(int,char**) {
std::cout<<TrustMeCompilerIExist();
std::cout.flush();
return 0;
}
proof.cpp
int TrustMeCompilerIExist() {
return 5;
}
Compile and link
make all
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/proof.cpp >> build/proof.o.d ; then rm build/proof.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/proof.cpp -o build/proof.o
g++ build/main.o build/proof.o -o a.out
(Outputs 5)
Remember, #include LITERALLY pastes the named file in place of the directive (plus some line markers that keep line numbers straight); the result of preprocessing a source file this way is called a translation unit. Rather than using a header file to hold "int TrustMeCompilerIExist();", which declares that the function exists (again, the compiler doesn't know where it is or what code is inside it, just that it exists), I repeated the declaration by hand.
Let's look at proof.o
command
objdump proof.o -t
output
proof.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 proof.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .debug_info 0000000000000000 .debug_info
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_aranges 0000000000000000 .debug_aranges
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text 0000000000000006 _Z21TrustMeCompilerIExistv
Right at the bottom there's a function (the F) in the .text section, at offset 0 and 6 bytes long; the g means it is a global symbol. You can see its name starts with _Z (this is why names beginning with an underscore followed by a capital letter are reserved for the implementation): _Z is the prefix for mangled C++ names, 21 is the length of the name that follows, and the trailing v means the parameter list is void (the return type of an ordinary function is not encoded).
The zeros at the start, by the way, are the symbol's value, here its offset within the section; the field is 16 hex digits wide because binaries can be HUGE.
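To read mangled names like the one above without decoding them by hand, GCC and Clang expose the demangler through <cxxabi.h>. The following is a minimal sketch, assuming an Itanium-ABI toolchain (this header is not part of standard C++); the helper name demangle is made up for illustration.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang extension, not standard C++)
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper: returns the human-readable form of an Itanium-ABI
// mangled name, or the input unchanged if it is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    // Passing null for the buffer and length makes the function malloc the result.
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;  // e.g. a plain C symbol such as "main"
    }
    std::string result(text);
    std::free(text);     // the caller owns the malloc'ed buffer
    return result;
}
```

Calling demangle("_Z21TrustMeCompilerIExistv") gives back the function name followed by an empty parameter list, which is the same thing objdump -t -C or c++filt would show.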
Disassembly
running:
objdump proof.o -S gives
proof.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z21TrustMeCompilerIExistv>:
int TrustMeCompilerIExist() {
return 5;
}
0: b8 05 00 00 00 mov $0x5,%eax
5: c3 retq
Because I compiled with -g, objdump can interleave the source code that each run of instructions came from (this makes more sense with bigger functions, where it shows what the instructions up to the next source block actually do); it wouldn't normally be there.
main.o
Here's the symbol table, obtained the same way as the above:
objdump main.o -t
main.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 main.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .text.startup 0000000000000000 .text.startup
0000000000000030 l F .text.startup 0000000000000026 _GLOBAL__sub_I_main
0000000000000000 l O .bss 0000000000000001 _ZStL8__ioinit
0000000000000000 l d .init_array 0000000000000000 .init_array
0000000000000000 l d .debug_info 0000000000000000 .debug_info
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_loc 0000000000000000 .debug_loc
0000000000000000 l d .debug_aranges 0000000000000000 .debug_aranges
0000000000000000 l d .debug_ranges 0000000000000000 .debug_ranges
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text.startup 0000000000000026 main
0000000000000000 *UND* 0000000000000000 _Z21TrustMeCompilerIExistv
0000000000000000 *UND* 0000000000000000 _ZSt4cout
0000000000000000 *UND* 0000000000000000 _ZNSolsEi
0000000000000000 *UND* 0000000000000000 _ZNSo5flushEv
0000000000000000 *UND* 0000000000000000 _ZNSt8ios_base4InitC1Ev
0000000000000000 *UND* 0000000000000000 .hidden __dso_handle
0000000000000000 *UND* 0000000000000000 _ZNSt8ios_base4InitD1Ev
0000000000000000 *UND* 0000000000000000 __cxa_atexit
See how it says *UND* (undefined)? That's because the compiler doesn't know where the symbol is, just that it exists (the same goes for the standard library symbols, which the linker will find by itself).
In closing
USE HEADER GUARDS, and with templates put #include "file.cpp" (the file holding the template definitions) at the bottom of the header, BEFORE the closing header guard. That way you can include header files as usual :)

The answer to your question is present in every sample that comes with the XMP SDK Toolkit. Clients must compile XMP.incl_cpp to ensure that all client-side glue code is generated. Do this by including it in exactly one of your source files.
For ready reference, I am pasting below the more detailed explanation from the section "Template classes and accessing the API" of XMPProgrammersGuide.pdf, which comes with the XMP SDK Toolkit.
Template classes and accessing the API
The full client API is defined and documented in the TXMP*.hpp header files. The TXMP* classes are C++ template classes that must be instantiated with a string class such as std::string, which is used to return text strings for property values, serialized XMP, and so on. To allow your code to access the entire XMP API you must:
Provide a string class such as std::string to instantiate the template classes.
Provide access to XMPCore and XMPFiles by including the necessary defines and headers. To do this, add the necessary define and includes directives to your source code so that all necessary code is incorporated into the build:
#include <string>
#define XMP_INCLUDE_XMPFILES 1 //if using XMPFiles
#define TXMP_STRING_TYPE std::string
#include "XMP.hpp"
The SDK provides complete reference documentation for the template classes, but the templates must be instantiated for use. You can read the header files (TXMPMeta.hpp and so on) for information, but do not include them directly in your code. There is one overall header file, XMP.hpp, which is the only one that C++ clients should include using the #include directive. Read the instructions in this file for instantiating the template classes. When you have done this, the API is available through the concrete classes named SXMP*; that is, SXMPMeta, SXMPUtils, SXMPIterator, and SXMPFiles. This document refers to the SXMP* classes, which you can instantiate and which provide static functions.
Clients must compile XMP.incl_cpp to ensure that all client-side glue code is generated. Do this by including it in exactly one of your source files.
Read XMP_Const.h for detailed information about types and constants for namespace URIs and option flags.

Related

How to export template instantiation as non weak?

C++ template functions are exported as weak symbols to work around the one definition rule (related question). In a situation where the function is explicitly instantiated for every use case, is there a way to export the symbol as non-weak?
Example use case:
// foo.hpp
template<typename T>
void foo();
// All allowed instantiations are explicitly listed.
extern template void foo<int>();
extern template void foo<short>();
extern template void foo<char>();
// foo.cpp
template<typename T>
void foo()
{
// actual implementation
}
// All explicit instantiations.
template void foo<int>();
template void foo<short>();
template void foo<char>();
When I compile the code above with GCC or ICC, they are tagged as weak:
$ nm foo.o
U __gxx_personality_v0
0000000000000000 W _Z3fooIcEvv
0000000000000000 W _Z3fooIiEvv
0000000000000000 W _Z3fooIsEvv
Is there a way to prevent that? Since they are actually definitive, I would want them to not be candidate for replacement.
objcopy supports the --weaken option, but you want the opposite.
It also supports the --globalize-symbol, but that appears to have no effect on weak symbols:
gcc -c t.cc
readelf -Ws t.o | grep _Z3fooI
14: 0000000000000000 7 FUNC WEAK DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC WEAK DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC WEAK DEFAULT 9 _Z3fooIcEvv
objcopy -w --globalize-symbol _Z3fooI* t.o t1.o &&
readelf -Ws t1.o | grep _Z3fooI
14: 0000000000000000 7 FUNC WEAK DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC WEAK DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC WEAK DEFAULT 9 _Z3fooIcEvv
Not to be deterred, we can first localize the symbols, then globalize them:
objcopy -w -L _Z3fooI* t.o t1.o &&
objcopy -w --globalize-symbol _Z3fooI* t1.o t2.o &&
readelf -Ws t2.o | grep _Z3fooI
14: 0000000000000000 7 FUNC GLOBAL DEFAULT 7 _Z3fooIiEvv
15: 0000000000000000 7 FUNC GLOBAL DEFAULT 8 _Z3fooIsEvv
16: 0000000000000000 7 FUNC GLOBAL DEFAULT 9 _Z3fooIcEvv
Voilà: the symbols are now strongly defined.
The problem I am trying to solve is that the link time is too slow and I want to reduce the work of the linker to the minimum.
If this makes the linker do less work (which I doubt), I'd consider that a bug in the linker -- if the symbol is defined once, it shouldn't matter to the linker whether that definition is strong or weak.

g++ skipping a function when compiling to object file [duplicate]

The solution to this is probably trivial, but I can't find it. I tried to google it, with no luck.
I'm working on a C++ project using g++ on Linux (gcc version 10.1.0 Ubuntu 10.1.0-2ubuntu1-18.04).
g++ compiles a C++ file into an object .o without raising any error, but the end object file is missing a function! The other 8 library files that I wrote are all compiled and linked fine, only this one is giving me trouble. Why, and how do I solve it?
The library header file bpo_interface.h is:
#pragma once
#include <boost/program_options/options_description.hpp>
#include <boost/program_options/variables_map.hpp>
#include <boost/algorithm/string.hpp>
#include <optional>
#include <string>
namespace bpo = boost::program_options;
namespace ibsimu_client::bpo_interface {
template <typename T>
std::optional<T> get(bpo::variables_map &params_op, std::string key);
}
The bpo_interface.cpp:
#include "bpo_interface.h"
namespace ic_bpo = ibsimu_client::bpo_interface;
template <typename T>
std::optional<T> ic_bpo::get(bpo::variables_map &params_op, std::string key)
{
try {
const T& value =
params_op[key].as<T>();
return value;
}
catch(const std::exception& e) {
return std::nullopt;
}
return std::nullopt;
}
The g++ command used to compile the file:
g++-10 -std=c++20 -lboost_program_options -Wall -g `pkg-config --cflags ibsimu-1.0.6dev` -c -o bin/build/bpo_interface.o src/bpo_interface.cpp
and the output of objdump -t -C bin/build/bpo_interface.o:
bin/build/bpo_interface.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 bpo_interface.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000000 l O .rodata 0000000000000001 __pstl::execution::v1::seq
0000000000000001 l O .rodata 0000000000000001 __pstl::execution::v1::par
0000000000000002 l O .rodata 0000000000000001 __pstl::execution::v1::par_unseq
0000000000000003 l O .rodata 0000000000000001 __pstl::execution::v1::unseq
0000000000000004 l O .rodata 0000000000000004 __gnu_cxx::__default_lock_policy
0000000000000008 l O .rodata 0000000000000008 boost::container::ADP_nodes_per_block
0000000000000010 l O .rodata 0000000000000008 boost::container::ADP_max_free_blocks
0000000000000018 l O .rodata 0000000000000008 boost::container::ADP_overhead_percent
0000000000000020 l O .rodata 0000000000000008 boost::container::ADP_only_alignment
0000000000000028 l O .rodata 0000000000000008 boost::container::NodeAlloc_nodes_per_block
0000000000000030 l O .rodata 0000000000000001 boost::container::ordered_range
0000000000000031 l O .rodata 0000000000000001 boost::container::ordered_unique_range
0000000000000032 l O .rodata 0000000000000001 boost::container::default_init
0000000000000033 l O .rodata 0000000000000001 boost::container::value_init
0000000000000000 l d .debug_info 0000000000000000 .debug_info
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_aranges 0000000000000000 .debug_aranges
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .comment 0000000000000000 .comment
Consistent with the objdump result, the linker complains that it cannot find the ic_bpo::get() function - specifically:
undefined reference to 'std::optional<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > ibsimu_client::bpo_interface::get<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(boost::program_options::variable_maps&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >)
If I copy and paste the function body into the header file bpo_interface.h and remove bpo_interface.cpp and bpo_interface.o from the project, everything works fine.
So I guess g++ at compile time is perfectly able to process that function and match its declaration with its use in the project.
But why is it not compiled into the bpo_interface.o object file?
Thank you
You have placed the definition of the function template get() in a source (.cpp) file. This means the definition is not available when it is needed to instantiate the function template for particular specializations outside of this source file. Note that the definition of a function template is not equivalent to the definition of a non-template function; it is more like a blueprint for how to generate definitions for particular specializations, as needed for implicit or explicit instantiation.
When you move the definition to the header file, the function template definition is readily available whenever particular specializations are instantiated. You can place the definition of a function template in a source file, but as that source file is then the only translation unit that can see the definition, you also need to provide explicit instantiation definitions for all the specializations you want the function template to provide. This is quite uncommon, and usually only used for e.g. static dependency injection into class templates that have only a single specialization intended for production code (which can then be explicitly instantiated) and other instantiations for test code (e.g. injecting mocked or stubbed implementations).
But why is it not compiled into the bpo_interface.o object file?
From cppreference - function templates [emphasis mine]:
Function template instantiation
A function template by itself is not a type, or a function, or any other entity. No code is generated from a source file that contains only template definitions. In order for any code to appear, a template must be instantiated: the template arguments must be determined so that the compiler can generate an actual function (or class, from a class template).
tl;dr: Add the following to your .cpp file:
template std::optional<std::string> ic_bpo::get<std::string>(bpo::variables_map &, std::string);
(and of course make sure to include the <string> header.)
But why is it not compiled into the bpo_interface.o object file?
Because you defined a template; you did not instantiate that template at all. Only instantiations are actual functions, which can be compiled and put in object files. So, you need to force an instantiation of your template; that's what the line above does, for the case of T = std::string.
Alternatively, if you keep the template definition in your header, then other translation units can instantiate it themselves as needed.
See also:
Explicit template instantiation - when is it used?
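To make the explicit-instantiation route concrete, here is a minimal self-contained sketch. It is not the asker's code: a plain std::map (aliased as settings_map, a made-up name) stands in for bpo::variables_map so the example needs no Boost, and the header and source parts are shown in one block with comments marking where each would live.

```cpp
#include <map>
#include <optional>
#include <string>

// Stand-in for bpo::variables_map (assumption: keeps the sketch free of Boost).
using settings_map = std::map<std::string, std::string>;

// --- would go in the header: declaration only ---
template <typename T>
std::optional<T> get(const settings_map& params, const std::string& key);

// --- would go in the .cpp file: the template definition ---
template <typename T>
std::optional<T> get(const settings_map& params, const std::string& key) {
    auto it = params.find(key);
    if (it == params.end()) {
        return std::nullopt;   // key not present
    }
    return T(it->second);      // convert the stored string to T
}

// Explicit instantiation definition: forces the compiler to emit
// get<std::string> into this object file, so other translation units that
// only saw the declaration can link against it.
template std::optional<std::string> get<std::string>(const settings_map&,
                                                     const std::string&);
```

With the last line present, objdump -t -C on the resulting object file should list a get<std::string> symbol; without it, the object file contains no code for get at all, which matches the symbol table shown in the question.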

Multiple definition of CUDA device functions

I'm trying to compile some functions to use them in host code and in device cuda code but I'm getting a multiple definition linking error.
What I'm trying to achieve is the following:
I have a CudaConfig.h file with the following content
CudaConfig.h
#ifdef __CUDACC__
#define CUDA_CALLABLE_DEVICE __device__
#define CUDA_CALLABLE_HOST __host__
#define CUDA_CALLABLE __host__ __device__
#else
#define CUDA_CALLABLE_DEVICE
#define CUDA_CALLABLE_HOST
#define CUDA_CALLABLE
#endif
In my foo.h file I have some functions with the following signature
#include "CudaConfig.h"
struct Bar {Eigen::Vector3d v;};
CUDA_CALLABLE_DEVICE Eigen::Vector3d &foo(Bar &aBar);
and I implement them in foo.cpp and a foo.cu files.
foo.cpp
#include "foo.h"
Eigen::Vector3d &foo(Bar &aBar) {aBar.v += {1,1,1}; return aBar.v;}
foo.cu
#include "foo.h"
Eigen::Vector3d &foo(Bar &aBar) {aBar.v += {1,1,1}; return aBar.v;}
I need to keep the two implementations in different files because Eigen disables some SIMD operations when it is used from a __device__ function, so for performance reasons I don't want to implement both in the foo.cu file.
Should I implement the function directly in the .h file and mark it inline, so that I don't get the multiple definition linking error? As Eigen disables SIMD for the __device__ code, wouldn't this make the __host__ and __device__ functions different, contrary to what inline expects?
Here's what's happening:
rthoni#rthoni-lt1:~/projects/nvidia/test_device_host$ cat test.cu
extern "C" {
__device__ void test_device_fn()
{
}
}
rthoni#rthoni-lt1:~/projects/nvidia/test_device_host$ nvcc test.cu -c -o test_cu.o
rthoni#rthoni-lt1:~/projects/nvidia/test_device_host$ objdump -t test_cu.o
test_cu.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 tmpxft_000004d9_00000000-5_test.cudafe1.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l O .bss 0000000000000001 _ZL22__nv_inited_managed_rt
0000000000000008 l O .bss 0000000000000008 _ZL32__nv_fatbinhandle_for_managed_rt
0000000000000000 l F .text 0000000000000016 _ZL37__nv_save_fatbinhandle_for_managed_rtPPv
0000000000000010 l O .bss 0000000000000008 _ZZL22____nv_dummy_param_refPvE5__ref
000000000000002f l F .text 0000000000000016 _ZL22____nv_dummy_param_refPv
0000000000000000 l d __nv_module_id 0000000000000000 __nv_module_id
0000000000000000 l O __nv_module_id 000000000000000f _ZL15__module_id_str
0000000000000018 l O .bss 0000000000000008 _ZL20__cudaFatCubinHandle
0000000000000045 l F .text 0000000000000022 _ZL26__cudaUnregisterBinaryUtilv
0000000000000067 l F .text 000000000000001a _ZL32__nv_init_managed_rt_with_modulePPv
0000000000000000 l d .nv_fatbin 0000000000000000 .nv_fatbin
0000000000000000 l .nv_fatbin 0000000000000000 fatbinData
0000000000000000 l d .nvFatBinSegment 0000000000000000 .nvFatBinSegment
0000000000000000 l O .nvFatBinSegment 0000000000000018 _ZL15__fatDeviceText
0000000000000020 l O .bss 0000000000000008 _ZZL31__nv_cudaEntityRegisterCallbackPPvE5__ref
0000000000000081 l F .text 0000000000000026 _ZL31__nv_cudaEntityRegisterCallbackPPv
00000000000000a7 l F .text 0000000000000045 _ZL24__sti____cudaRegisterAllv
0000000000000000 l d .init_array 0000000000000000 .init_array
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000016 g F .text 0000000000000019 test_device_fn
0000000000000000 *UND* 0000000000000000 _GLOBAL_OFFSET_TABLE_
0000000000000000 *UND* 0000000000000000 exit
0000000000000000 *UND* 0000000000000000 __cudaUnregisterFatBinary
0000000000000000 *UND* 0000000000000000 __cudaInitModule
0000000000000000 *UND* 0000000000000000 __cudaRegisterFatBinary
0000000000000000 *UND* 0000000000000000 atexit
As you can see, even though the function is tagged as __device__ only, nvcc will still generate a symbol for it in the object file.
This behavior is a bug of nvcc. (#845649 in our bug tracker)
There are three ways of getting rid of this error:
Let nvcc generate both device and host code
Change the way you compile cu files to just build device code
Wrap your __device__ function in an empty namespace
In your specific case it looks like you can just make it a constexpr undecorated function:
constexpr Eigen::Vector3d &foo(Bar &aBar) noexcept {aBar.v += {1,1,1}; return aBar.v;}
and invoke nvcc with --expt-relaxed-constexpr:
--expt-relaxed-constexpr (-expt-relaxed-constexpr)
Experimental flag: Allow host code to invoke __device__ constexpr functions,
and device code to invoke __host__ constexpr functions. Note that the behavior
of this flag may change in future compiler releases.
This should work for both device and host code.
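For comparison, the first option in the list above (letting nvcc generate both host and device code from one definition) might look roughly like the sketch below. Vec3 is a hypothetical stand-in for Eigen::Vector3d so the example compiles without Eigen, and the macro mirrors CUDA_CALLABLE from the question's CudaConfig.h. Note that this deliberately ignores the asker's wish to keep separate host and device implementations for SIMD reasons.

```cpp
// Same idea as CudaConfig.h in the question: the qualifiers vanish
// when the file is compiled by a plain host compiler.
#ifdef __CUDACC__
#define CUDA_CALLABLE __host__ __device__
#else
#define CUDA_CALLABLE
#endif

// Hypothetical stand-in for Eigen::Vector3d.
struct Vec3 {
    double x, y, z;
    CUDA_CALLABLE Vec3& operator+=(const Vec3& o) {
        x += o.x; y += o.y; z += o.z;
        return *this;
    }
};

struct Bar { Vec3 v; };

// One inline definition in the header, usable from host and device code.
// inline lets every translation unit that includes the header carry the
// definition without a multiple-definition error at link time.
CUDA_CALLABLE inline Vec3& foo(Bar& aBar) {
    aBar.v += Vec3{1.0, 1.0, 1.0};
    return aBar.v;
}
```

Compiled with a host compiler, the macro expands to nothing and foo is an ordinary inline function; compiled with nvcc, the same source yields both a host and a device version.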

Using dlopen/dlsym to open C++ shared library - dlsym returns NULL

I have not yet dealt with shared libraries in C++, and am having some trouble. I want to create a shared library and then have a C function pick up on that library. So here is my shared library file:
extern int nothing();
//sym.cpp
int nothing() {
return 0;
}
Below is my dlopen/dlsym script:
//symtest.c
#include <stdio.h>
#include <dlfcn.h>
int main(){
void *handle;
handle = dlopen("/path/to/lib/sym.so",RTLD_NOW);
int (*onload)(void *, void **, int);
onload = (int (*)(void *, void **, int))(unsigned long) dlsym(handle,"nothing");
if(onload==NULL) {
printf("NULL");
}
return 0;
}
Compiled and ran as follows:
$ g++ -shared -fPIC -o sym.so sym.cpp
$ gcc symtest.c -ldl -o symtest
$ ./symtest
NULL
Why am I getting NULL? I am pretty sure this symbol is getting exported, at least by observing the output of the following commands.
nm:
$ nm -CD sym.so | grep " T "
0000000000000670 T nothing()
000000000000067c T _fini
0000000000000518 T _init
objdump:
$ objdump -T sym.so
sym.so: file format elf64-x86-64
DYNAMIC SYMBOL TABLE:
0000000000000518 l d .init 0000000000000000 .init
0000000000000000 w D *UND* 0000000000000000 __gmon_start__
0000000000000000 w D *UND* 0000000000000000 _Jv_RegisterClasses
0000000000000000 w D *UND* 0000000000000000 _ITM_deregisterTMCloneTable
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 w DF *UND* 0000000000000000 GLIBC_2.2.5 __cxa_finalize
0000000000200970 g D .bss 0000000000000000 Base _end
0000000000200968 g D .got.plt 0000000000000000 Base _edata
0000000000200968 g D .bss 0000000000000000 Base __bss_start
0000000000000518 g DF .init 0000000000000000 Base _init
000000000000067c g DF .fini 0000000000000000 Base _fini
0000000000000670 g DF .text 000000000000000b Base _Z7nothingv
There is a bigger picture here (Creating C++ Redis Module - "does not export RedisModule_OnLoad() symbol") but I looked through some Redis source code to produce a minimalistic example. Anyone have any idea what I am doing wrong here?
As requested, nm without -C option:
$ nm -D sym.so
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
0000000000000670 T _Z7nothingv
0000000000200968 B __bss_start
w __cxa_finalize
w __gmon_start__
0000000000200968 D _edata
0000000000200970 B _end
000000000000067c T _fini
0000000000000518 T _init
C++ has function overloading, so there can be multiple functions with the same name that differ only in their parameters.
Since object files store only names, C++ applies what is called name mangling: it adds extra characters representing the parameters to the name, to differentiate between the different overloads.
When using dlsym, one must use the mangled name (here _Z7nothingv) to get at the function address.
Since name mangling is platform specific, it is often better to use C linkage (no name mangling).
This can be done with an extern "C" declaration.
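Applied to the library from the question, that could look like the following sketch. The type alias nothing_fn is a made-up name, added only to show a pointer type that matches the function's real signature.

```cpp
// sym.cpp, rewritten with C linkage so the exported symbol is plain "nothing"
// instead of the mangled "_Z7nothingv".
extern "C" int nothing() {
    return 0;
}

// Hypothetical alias: the pointer type a caller would cast the dlsym result to.
// It matches the function above (no parameters, returns int).
using nothing_fn = int (*)();
```

After rebuilding the shared library with this change, nm -D sym.so should list nothing rather than _Z7nothingv, and dlsym(handle, "nothing") can find it. The cast in symtest.c would also be better written with a pointer type that matches the function, rather than int (*)(void *, void **, int).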

ld of data file makes size of data an *ABS* and not an integer

I have a c++ program which includes an external dependency on an empty xlsx file. To remove this dependency I converted this file to a binary object in view of linking it in directly, using:
ld -r -b binary -o template.o template.xlsx
followed by
objcopy --rename-section .data=.rodata,alloc,load,readonly,data,contents template.o template.o
Using objdump, I can see three variables declared :
$ objdump -x template.o
template.o: file format elf64-x86-64
template.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .rodata 00000fd1 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000fd1 g *ABS* 0000000000000000 _binary_template_xlsx_size
0000000000000000 g .rodata 0000000000000000 _binary_template_xlsx_start
0000000000000fd1 g .rodata 0000000000000000 _binary_template_xlsx_end
I then tell my program about this data :
template.h:
#ifndef TEMPLATE_H
#define TEMPLATE_H
#include <cstddef>
extern "C" {
extern const char _binary_template_xlsx_start[];
extern const char _binary_template_xlsx_end[];
extern const int _binary_template_xlsx_size;
}
#endif
This compiles and links fine (although I am having some trouble automating it with cmake, see here: compile and add object file from binary with cmake).
However, when I use _binary_template_xlsx_size in my code, it is interpreted as a pointer to an address that doesn't exist. So to get the size of my data, I have to pass (int)&_binary_template_xlsx_size (or (int)(_binary_template_xlsx_end - _binary_template_xlsx_start))
Some research tells me that the *ABS* in the objdump above means "absolute value" but I don't get why. How can I get my c++ (or c) program to see the variable as an int and not as a pointer?
An *ABS* symbol is an absolute address; it's more often created by passing --defsym foo=0x1234 to ld.
--defsym symbol=expression
Create a global symbol in the output file, containing the absolute
address given by expression. [...]
Because an absolute symbol is a constant, it's not possible to link it into a C source file as a variable; all C object variables have an address, but a constant doesn't.
To make sure you don't dereference the address (i.e. read the variable) by accident, it's best to define it as const char [] as you have with the other symbols:
extern const char _binary_template_xlsx_size[];
If you want to make sure you're using it as an int, you could use a macro:
extern const char _abs_binary_template_xlsx_size[] asm("_binary_template_xlsx_size");
#define _binary_template_xlsx_size ((int) (intptr_t) _abs_binary_template_xlsx_size)