linker error: undefined symbol - c++

What I'm trying to do
I am attempting to create 2 c++ classes.
One, named Agent that will be implemented as a member of class 2
Two, named Env that will be exposed to Python through boost.python (though I suspect this detail to be inconsequential to my problem)
The problem
After successful compilation with my make file, I attempt to run my python script and I receive an import error on my extension module (the c++ code) that reads "undefined symbol: _ZN5AgentC1Effff". All the boost-python stuff aside, I believe this to be a simple c++ linker error.
Here are my files:
Agent.h
class Agent {
public:
float xy_pos[2];
float xy_vel[2];
float yaw;
float z_pos;
Agent(float x_pos, float y_pos, float yaw, float z_pos);
};
Agent.cpp
#include "Agent.h"
Agent::Agent(float x_pos, float y_pos, float yaw, float z_pos)
{
xy_vel[0] = 0;
xy_vel[1] = 0;
xy_pos[0] = x_pos;
xy_pos[1] = y_pos;
z_pos = z_pos;
yaw = yaw;
};
test_ext.cpp (where my Env class lives)
#include "Agent.h"
#include <boost/python.hpp>
class Env{
public:
Agent * agent;
//some other members
Env() {
agent = new Agent(13, 10, 0, 2);
}
np::ndarray get_agent_vel() {
return np::from_data(agent->xy_vel, np::dtype::get_builtin<float>(),
p::make_tuple(2),
p::make_tuple(sizeof(float)),
p::object());
}
void set_agent_vel(np::ndarray vel) {
agent->xy_vel[0] = p::extract<float>(vel[0]);
agent->xy_vel[1] = p::extract<float>(vel[1]);
}
}
BOOST_PYTHON_MODULE(test_ext) {
using namespace boost::python;
class_<Env>("Env")
.def("set_agent_vel", &Env::set_agent_vel)
.def("get_agent_vel", &Env::get_agent_vel)
}
Makefile
PYTHON_VERSION = 3.5
PYTHON_INCLUDE = /usr/include/python$(PYTHON_VERSION)
# location of the Boost Python include files and library
BOOST_INC = /usr/local/include/boost_1_66_0
BOOST_LIB = /usr/local/include/boost_1_66_0/stage/lib/
# compile mesh classes
TARGET = test_ext
CFLAGS = --std=c++11
$(TARGET).so: $(TARGET).o
g++ -shared -Wl,--export-dynamic $(TARGET).o -L$(BOOST_LIB) -lboost_python3 -lboost_numpy3 -L/usr/lib/python3.5/config-3.5m-x86_64-linux-gnu -lpython3.5 -o $(TARGET).so
$(TARGET).o: $(TARGET).cpp Agent.o
g++ -I$(PYTHON_INCLUDE) -I$(BOOST_INC) -fPIC -c $(TARGET).cpp $(CFLAGS)
Agent.o: Agent.cpp Agent.h
g++ -c -Wall Agent.cpp $(CFLAGS)

You never link with Agent.o anywhere.
First of all you need to build it like you build test_ext.o with the same flags. Then you need to actually link with Agent.o when creating the shared library.

Related

Undefined symbol when trying to link with shared library built from CUDA objects

I'm experimenting with building a simple application from a couple of .cu source files and a very simple C++ main that calls a function from one of the .cu files. I'm making a shared library (.so file) from the compiled .cu files. I'm finding that everything builds without trouble, but when I try to run the application, I get a linker undefined symbol error, with the mangled name of the .cu function I'm calling from main(). If I build a static library instead, my application runs just fine. Here's the makefile I've set up:
.PHONY: clean
NVCCFLAGS = -std=c++11 --compiler-options '-fPIC'
CXXFLAGS = -std=c++11
HLIB = libhello.a
SHLIB = libhello.so
CUDA_OBJECTS = bridge.o add.o
all: driver
%.o :: %.cu
nvcc -o $# $(NVCCFLAGS) -c -I. $<
%.o :: %.cpp
c++ $(CXXFLAGS) -o $# -c -I. $<
$(HLIB): $(CUDA_OBJECTS)
ar rcs $# $^
$(SHLIB): $(CUDA_OBJECTS)
nvcc $(NVCCFLAGS) --shared -o $# $^
#driver : driver.o $(HLIB)
# c++ -std=c++11 -fPIC -o $# driver.o -L. -lhello -L/usr/local/cuda-10.1/targets/x86_64-linux/lib -lcudart
driver : driver.o $(SHLIB)
c++ -std=c++11 -fPIC -o $# driver.o -L. -lhello
clean:
-rm -f driver *.o *.so *.a
Here are the various source files that the makefile takes as fodder.
add.cu:
__global__ void add(int n, int* a, int* b, int* c) {
int index = threadIdx.x;
int stride = blockDim.x;
for (int ii = index; ii < n; ii += stride) {
c[ii] = a[ii] + b[ii];
}
}
add.h:
extern __global__ void add(int n, int* a, int* b, int* c);
bridge.cu:
#include <iostream>
#include "add.h"
void bridge() {
int N = 1 << 16;
int blockSize = 256;
int numBlocks = (N + blockSize - 1)/blockSize;
int* a;
int* b;
int* c;
cudaMallocManaged(&a, N*sizeof(int));
cudaMallocManaged(&b, N*sizeof(int));
cudaMallocManaged(&c, N*sizeof(int));
for (int ii = 0; ii < N; ii++) {
a[ii] = ii;
b[ii] = 2*ii;
}
add<<<numBlocks, blockSize>>>(N, a, b, c);
cudaDeviceSynchronize();
for (int ii = 0; ii < N; ii++) {
std::cout << a[ii] << " + " << b[ii] << " = " << c[ii] << std::endl;
}
cudaFree(a);
cudaFree(b);
cudaFree(c);
}
bridge.h:
extern void bridge();
driver.cpp:
#include "bridge.h"
int main() {
bridge();
return 0;
}
I'm very new to cuda, so I expect that's where I'm doing something wrong. I've played a bit with using extern "C" declarations, but that just seems to move the "undefined symbol" error from run time to build time.
I'm familiar with various ways that one can end up with an undefined symbol, and I've mentioned various experiments I've already performed (static linking, extern "C" declarations) that make me think that this problem isn't addressed by the proposed duplicate question.
My unresolved symbol is _Z6bridgev
It looks to me as though the linker should be able resolve the symbol. If I can nm on driver.o, I see:
0000000000000000 T main
U _Z6bridgev
And if I run nm on libhello.so, I see:
0000000000006e56 T _Z6bridgev
When Robert Crovella was able to get my example to work on his machine, while I wasn't able to get his example to work on mine, I started realizing that my problem had nothing to do with cuda or nvcc. It was the fact that with a shared library, the loader has to resolve symbols at runtime, and my shared library wasn't in a "well-known location". I built a simple test case just now, purely with c++ sources, and repeated my failure. Once I copied libhello.so to /usr/local/lib, I was able to run driver successfully. So, I'm OK with closing my original question, if that's the will of the people.

cublasSasum causing segmentation fault in custom TensorFlow op [duplicate]

The code of cublas below give us the errors:core dumped while being at "cublasSnrm2(handle,row,dy,incy,de)",could you give some advice?
main.cu
#include <iostream>
#include "cublas.h"
#include "cublas_v2.h"
#include "helper_cuda.h"
using namespace std;
int main(int argc,char *args[])
{
float y[10] = {1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0};
int dev=0;
checkCudaErrors(cudaSetDevice(dev));
//cublas init
cublasStatus stat;
cublasInit();
cublasHandle_t handle;
stat = cublasCreate(&handle);
if (stat !=CUBLAS_STATUS_SUCCESS)
{
printf("cublas handle create failed!\n");
cublasShutdown();
}
float * dy,*de,*e;
int incy = 1,ONE = 1,row = 10;
e = (float *)malloc(sizeof(float)*ONE);
e[0]=0.0f;
checkCudaErrors(cudaMalloc(&dy,sizeof(float)*row));
checkCudaErrors(cudaMalloc(&de,sizeof(float)*ONE));
checkCudaErrors(cudaMemcpy(dy,y,row*sizeof(float),cudaMemcpyHostToDevice));
checkCudaErrors(cudaMemcpy(de,e,ONE*sizeof(float),cudaMemcpyHostToDevice));
stat = cublasSnrm2(handle,row,dy,incy,de);
if (stat !=CUBLAS_STATUS_SUCCESS)
{
printf("norm2 compute failed!\n");
cublasShutdown();
}
checkCudaErrors(cudaMemcpy(e,de,ONE*sizeof(float),cudaMemcpyDeviceToHost));
std::cout<<e[0]<<endl;
return 0;
}
makefile is below:
NVIDIA = $(HOME)/NVIDIA_CUDA-5.0_Samples
CUDA = /usr/local/cuda-5.0
NVIDINCADD = -I$(NVIDIA)/common/inc
CUDAINCADD = -I$(CUDA)/include
CC = -L/usr/lib64/ -lstdc++
GCCOPT = -O2 -fno-rtti -fno-exceptions
INTELOPT = -O3 -fno-rtti -xW -restrict -fno-alias
DEB = -g
NVCC = -G
ARCH = -arch=sm_35
bcg:main.cu
nvcc $(DEB) $(NVCC) $(ARCH) $(CC) -lm $(NVIDINCADD) $(CUDAINCADD) -lcublas -I./ -o $(#) $(<)
clean:
rm -f bcg
rm -f hyb
My OS is linux redhat 6.2,CUDA's version is 5.0, GPU is K20M.
The problem is here:
cublasSnrm2(handle,row,dy,incy,de);
By default, the last parameter is a host pointer. So either pass e to the snrm2 call rather than de or do this:
cublasSetPointerMode(handle,CUBLAS_POINTER_MODE_DEVICE);
stat = cublasSnrm2(handle,row,dy,incy,de);
The pointer mode needs to be set to device if you want to pass a device pointer to store the result.

Dispatching SIMD instructions + SIMDPP + qmake

I'm developing a QT widget that makes use of SIMD instruction sets. I've compiled 3 versions: SSE3, AVX, and AVX2(simdpp allows to switch between them by a single #define).
Now, what I want is for my widget to switch automatically between these implementations, according to best supported instruction set. Guide that is provided with simdpp makes use of some makefile magic:
CXXFLAGS=""
test: main.o test_sse2.o test_sse3.o test_sse4_1.o test_null.o
g++ $^ -o test
main.o: main.cc
g++ main.cc $(CXXFLAGS) -c -o main.o
test_null.o: test.cc
g++ test.cc -c $(CXXFLAGS) -DSIMDPP_EMIT_DISPATCHER \
-DSIMDPP_DISPATCH_ARCH1=SIMDPP_ARCH_X86_SSE2 \
-DSIMDPP_DISPATCH_ARCH2=SIMDPP_ARCH_X86_SSE3 \
-DSIMDPP_DISPATCH_ARCH3=SIMDPP_ARCH_X86_SSE4_1 -o test_null.o
test_sse2.o: test.cc
g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE2 -msse2 -o test_sse2.o
test_sse3.o: test.cc
g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE3 -msse3 -o test_sse3.o
test_sse4_1.o: test.cc
g++ test.cc -c $(CXXFLAGS) -DSIMDPP_ARCH_X86_SSE4_1 -msse4.1 -o test_sse4_1.o
Here is a link to the guide: http://p12tic.github.io/libsimdpp/v2.0~rc2/libsimdpp/arch/dispatch.html
I have no idea how to implement such behavior with qmake. Any ideas?
First that comes to mind is to create a shared library with dispatched code, and link it to the project. Here I'm stuck again. App is cross-platform, which means it has to compile with both GCC and MSVC(vc120, to be exact), which forces using nmake in Windows, and I tried, really, but it was like the worst experience in my whole programmer life.
Thanks in advance, programmers of the world!
sorry if this is a bit late. Hope I can still help.
You need to consider 2 areas: Compile time and run time.
Compile time - need to create code to support different features.
Run time - need to create code to decide which features you can run.
What you are wanting to do is create a dispatcher...
FuncImpl.h:
#pragma once
void execAvx2();
void execAvx();
void execSse();
void execDefault();
FuncImpl.cpp:
// Compile this file once for each variant with different compiler settings.
#if defined(__AVX2__)
void execAvx2()
{
// AVX2 impl
...
}
#elif defined (__AVX__)
void execAvx()
{
// AVX impl
...
}
#elif defined (__SSE4_2__)
void execSse()
{
// Sse impl
...
}
#else
void execDefault()
{
// Vanilla impl
...
}
#endif
DispatchFunc.cpp
#include "FuncImpl.h"
// Decide at runtime which code to run
void dispatchFunc()
{
if(CheckCpuAvx2Flag())
{
execAvx2();
}
else if(CheckCpuAvxFlag())
{
execAvx();
}
else if(CheckCpuSseFlags())
{
execSse();
}
else
{
execDefault();
}
}
What you can do is create a set of QMAKE_EXTRA_COMPILERS.
SampleCompiler.pri (Do this for each variant):
MyCompiler.name = MyCompiler # Name
MyCompiler.input = MY_SOURCES # Symbol of the source list to compile
MyCompiler.dependency_type = TYPE_C
MyCompiler.variable_out = OBJECTS
# EXTRA_CXXFLAGS = -mavx / -mavx2 / -msse4.2
# _var = creates FileName_var.o => replace with own variant (_sse, etc)
MyCompiler.output = ${QMAKE_VAR_OBJECTS_DIR}${QMAKE_FILE_IN_BASE}_var$${first(QMAKE_EXT_OBJ)}
MyCompiler.commands = $${QMAKE_CXX} $(CXXFLAGS) $${EXTRA_CXXFLAGS} $(INCPATH) -c ${QMAKE_FILE_IN} -o${QMAKE_FILE_OUT}
QMAKE_EXTRA_COMPILERS += MyCompiler # Add my compiler
MyProject.pro
...
include(SseCompiler.pri)
include(AvxCompiler.pri)
include(Avx2Compiler.pri)
..
# Normal sources
# Will create FuncImpl.o and DispatchFunc.o
SOURCES += FuncImpl.cpp \
DispatchFunc.cpp
# Give the other compilers their sources
# Will create FuncImpl_avx2.o FuncImpl_avx.o FuncImpl_sse.o
AVX2_SOURCES += FuncImpl.cpp
AVX_SOURCES += FuncImpl.cpp
SSE_SOURCES += FuncImpl.cpp
# Link all objects
...
All you need now is to call dispatchFunc()!
Checking cpu flags is another exercise for you:
cpuid
These are just project defines. You set them with DEFINES += in your .pro file.You set the flags for the instructions sets you want to support and simdpp takes care of selecting the best one for the processor at runtime.
See for example, Add a define to qmake WITH a value?
Here is a qmake .pro file for use with SIMD dispatchers. It is quite verbose, so for more instruction sets, it is better to generate the dispatched blocks by a script, write it to a .pri file and then include it from your main .pro file.
TEMPLATE = app
TARGET = simd_test
INCLUDEPATH += .
QMAKE_CXXFLAGS = -O3 -std=c++17
SOURCES += main.cpp
SOURCES_dispatch = test.cpp
{
# SSE2
DISPATCH_CXXFLAGS = -msse2
DISPATCH_SUFFIX = _sse2
src_dispatch_sse2.name = src_dispatch_sse2
src_dispatch_sse2.input = SOURCES_dispatch
src_dispatch_sse2.dependency_type = TYPE_C
src_dispatch_sse2.variable_out = OBJECTS
src_dispatch_sse2.output = ${QMAKE_VAR_OBJECTS_DIR}${QMAKE_FILE_IN_BASE}$${DISPATCH_SUFFIX}$${first(QMAKE_EXT_OBJ)}
src_dispatch_sse2.commands = $${QMAKE_CXX} $(CXXFLAGS) $${DISPATCH_CXXFLAGS} $(INCPATH) -c ${QMAKE_FILE_IN} -o ${QMAKE_FILE_OUT}
QMAKE_EXTRA_COMPILERS += src_dispatch_sse2
}
{
# SSE3
DISPATCH_CXXFLAGS = -msse3
DISPATCH_SUFFIX = _sse3
src_dispatch_sse3.name = src_dispatch_sse3
src_dispatch_sse3.input = SOURCES_dispatch
src_dispatch_sse3.dependency_type = TYPE_C
src_dispatch_sse3.variable_out = OBJECTS
src_dispatch_sse3.output = ${QMAKE_VAR_OBJECTS_DIR}${QMAKE_FILE_IN_BASE}$${DISPATCH_SUFFIX}$${first(QMAKE_EXT_OBJ)}
src_dispatch_sse3.commands = $${QMAKE_CXX} $(CXXFLAGS) $${DISPATCH_CXXFLAGS} $(INCPATH) -c ${QMAKE_FILE_IN} -o ${QMAKE_FILE_OUT}
QMAKE_EXTRA_COMPILERS += src_dispatch_sse3
}
{
# SSE41
DISPATCH_CXXFLAGS = -msse4.1
DISPATCH_SUFFIX = _sse41
src_dispatch_sse41.name = src_dispatch_sse41
src_dispatch_sse41.input = SOURCES_dispatch
src_dispatch_sse41.dependency_type = TYPE_C
src_dispatch_sse41.variable_out = OBJECTS
src_dispatch_sse41.output = ${QMAKE_VAR_OBJECTS_DIR}${QMAKE_FILE_IN_BASE}$${DISPATCH_SUFFIX}$${first(QMAKE_EXT_OBJ)}
src_dispatch_sse41.commands = $${QMAKE_CXX} $(CXXFLAGS) $${DISPATCH_CXXFLAGS} $(INCPATH) -c ${QMAKE_FILE_IN} -o ${QMAKE_FILE_OUT}
QMAKE_EXTRA_COMPILERS += src_dispatch_sse41
}

How to Link static or shared library to Kernel Module?

There is a function in aaa.c
int myadd(int a, int b){
return a+b;
}
and aaa.c was built into a static library using
gcc -c aaa.c -o aaa.o && ar -cr libaaa.a aaa.o
and a shared library using
gcc -c aaa.c -o aaa.o && gcc -shared -fPCI -o libaaa.so aaa.o
Then I wrote a file call.c, and try to call function myadd() in libaaa.so, but failed.
Please give me some advice,
test.c:
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
MODULE_LICENSE("Dual BSD/GPL");
extern int myadd(int a, int b);
static int hello_init(void)
{
int c = 0;
printk(KERN_ALERT "hello,I am Destiny\n");
c = myadd(1, 2);
printk(KERN_ALERT "res is (%d)\n", c);
return 0;
}
static void hello_exit(void)
{
printk(KERN_ALERT "goodbye,kernel\n");
}
module_init(hello_init);
module_exit(hello_exit);
MODULE_AUTHOR("Destiny");
MODULE_DESCRIPTION("This is a simple example!\n");
MODULE_ALIAS("A simplest example");
This Makefile will make both c file into call.ko, and it will work. But that's not what I want.
Makefile :
KVERSION = $(shell uname -r)
obj-m = call.o
call-objs = aaa.o test.o
Debug:
make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules
All:Debug
cleanDebug:
make -C /lib/modules/$(KVERSION)/build M=/home/Destiny/myProject/kernel/cbtest/ clean
clean:cleanDebug
installDebug:Debug
rmmod /lib/modules/2.6.18-348.12.1.el5/test/call.ko
/bin/cp call.ko /lib/modules/$(KVERSION)/test/
depmod -a
insmod /lib/modules/2.6.18-348.12.1.el5/test/call.ko
install:installDebug
main.o : defs.h
Ko files are running in kernel space , not user space where application running. Libc or libc++ and so on on are prepared for user space application. So you can not link libc/c++ functions, Just like you cannot link any libc functions in kernel.

How do you wrap C++ OpenCV code with Boost::Python?

I want to wrap my C++ OpenCV code with boost::python, and to learn how to do it, I tried a toy example, in which
I use the Boost.Numpy project to provide me with boost::numpy::ndarray.
The C++ function to be wrapped, square() takes a boost::numpy::ndarray and modifies it in place by squaring each element in it.
The exported Python module name is called test.
The square() C++ function is exported as the square name in the exported module.
I am not using bjam because IMO it is too complicated and just doesn't work for me no matter what. I'm using good old make.
Now, here's the code:
// test.cpp
#include <boost/python.hpp>
#include <boost/numpy.hpp>
#include <boost/scoped_array.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>
namespace py = boost::python;
namespace np = boost::numpy;
void square(np::ndarray& array)
{
if (array.get_dtype() != np::dtype::get_builtin<int>())
{
PyErr_SetString(PyExc_TypeError, "Incorrect array data type.");
py::throw_error_already_set();
}
size_t rows = array.shape(0), cols = array.shape(1);
size_t stride_row = array.strides(0) / sizeof(int),
stride_col = array.strides(1) / sizeof(int);
cv::Mat mat(rows, cols, CV_32S);
int *row_iter = reinterpret_cast<int*>(array.get_data());
for (int i = 0; i < rows; i++, row_iter += stride_row)
{
int *col_iter = row_iter;
int *mat_row = (int*)mat.ptr(i);
for (int j = 0; j < cols; j++, col_iter += stride_col)
{
*(mat_row + j) = (*col_iter) * (*col_iter);
}
}
for (int i = 0; i < rows; i++, row_iter += stride_row)
{
int *col_iter = row_iter;
int *mat_row = (int*)mat.ptr(i);
for (int j = 0; j < cols; j++, col_iter += stride_col)
{
*col_iter = *(mat_row + j);
}
}
}
BOOST_PYTHON_MODULE(test)
{
using namespace boost::python;
def("square", square);
}
And here's the Makefile:
PYTHON_VERSION = 2.7
PYTHON_INCLUDE = /usr/include/python$(PYTHON_VERSION)
BOOST_INC = /usr/local/include
BOOST_LIB = /usr/local/lib
OPENCV_LIB = $$(pkg-config --libs opencv)
OPENCV_INC = $$(pkg-config --cflags opencv)
TARGET = test
$(TARGET).so: $(TARGET).o
g++ -shared -Wl,--export-dynamic \
$(TARGET).o -L$(BOOST_LIB) -lboost_python \
$(OPENCV_LIB) \
-L/usr/lib/python$(PYTHON_VERSION)/config -lpython$(PYTHON_VERSION) \
-o $(TARGET).so
$(TARGET).o: $(TARGET).cpp
g++ -I$(PYTHON_INCLUDE) $(OPENCV_INC) -I$(BOOST_INC) -fPIC -c $(TARGET).cpp
With this scheme, I can type make and test.so gets created. But when I try to import it,
In [1]: import test
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-73ae3ffe1045> in <module>()
----> 1 import test
ImportError: ./test.so: undefined symbol: _ZN5boost6python9converter21object_manager_traitsINS_5numpy7ndarrayEE10get_pytypeEv
In [2]:
This is a linker error which I can't seem to fix. Can anyone please help me with what's going on? Do you have (links to) code that already does integrate OpenCV, numpy and Boost.Python without things like Py++ or the likes?.
Okay I fixed this. It was a simple issue, but a sleepy brain and servings of bjam had made me ignore it. In the Makefile, I'd forgotten to put -lboost_numpy that links the Boost.Numpy libs to my lib. So, the modified Makefile looks like this:
PYTHON_VERSION = 2.7
PYTHON_INCLUDE = /usr/include/python$(PYTHON_VERSION)
BOOST_INC = /usr/local/include
BOOST_LIB = /usr/local/lib
OPENCV_LIB = $$(pkg-config --libs opencv)
OPENCV_INC = $$(pkg-config --cflags opencv)
TARGET = test
$(TARGET).so: $(TARGET).o
g++ -shared -Wl,--export-dynamic \
$(TARGET).o -L$(BOOST_LIB) -lboost_python -lboost_numpy \
$(OPENCV_LIB) \
-L/usr/lib/python$(PYTHON_VERSION)/config -lpython$(PYTHON_VERSION) \
-o $(TARGET).so
$(TARGET).o: $(TARGET).cpp
g++ -I$(PYTHON_INCLUDE) $(OPENCV_INC) -I$(BOOST_INC) -fPIC -c $(TARGET).cpp