Unable to link PAPI library with opt llvm - c++

I am working on a project where I need to generate just the bitcode using clang, run some optimization passes using opt and then create an executable and measure its hardware counters.
I am able to link through clang directly using:
clang -g -O0 -w -I/opt/apps/papi/5.3.0/include -Wl,-rpath,$PAPI_LIB -L$PAPI_LIB \
-lpapi /scratch/02681/user/papi_helper.c prog.c -o a.out
However now I want to link it after using the front end of clang and applying optimization passes using opt.
I am trying the following way:
clang -g -O0 -w -c -emit-llvm -I/opt/apps/papi/5.3.0/include -Wl,-rpath,$PAPI_LIB -L$PAPI_LIB \
-lpapi /scratch/02681/user/papi_helper.c prog.c -o prog.o
llvm-link prog.o papi_helper.o -o prog-link.o
// run optimization passes
opt -licm prog-link.o -o prog-opt.o
llc -filetype=obj prog-opt.o -o prog-exec.o
clang prog-exec.o
After going through the above process I get the following error:
undefined reference to `PAPI_event_code_to_name'
It's not able to resolve papi functions. Thanks in advance for any help.

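You need to add -lpapi (together with -L$PAPI_LIB and -Wl,-rpath,$PAPI_LIB) to the last clang invocation; otherwise the linker has no way of knowing about libpapi. The linker flags on the first command have no effect, because with -c clang only compiles and never runs the linker.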
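Putting that together, the whole pipeline might look like the sketch below. It is not a tested build script: the `step` helper, the `RUN` switch, the `PAPI_INC` variable, the placeholder library path and the `.bc` file names are all assumptions made for illustration. Each source file is compiled to bitcode separately, since clang rejects `-o` with several inputs under `-c`. By default the script only prints the commands.

```shell
#!/bin/sh
# Paths: the include dir is taken from the question; the lib dir is a placeholder.
PAPI_INC=${PAPI_INC:-/opt/apps/papi/5.3.0/include}
PAPI_LIB=${PAPI_LIB:-/path/to/papi/lib}

# Hypothetical helper: print a command, and execute it only when RUN=1.
step() { echo "$*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

build() {
  # 1. Front end only: one bitcode file per source file (no linker flags needed here).
  step clang -g -O0 -w -c -emit-llvm -I"$PAPI_INC" prog.c -o prog.bc
  step clang -g -O0 -w -c -emit-llvm -I"$PAPI_INC" /scratch/02681/user/papi_helper.c -o papi_helper.bc
  # 2. Merge the bitcode modules.
  step llvm-link prog.bc papi_helper.bc -o prog-link.bc
  # 3. Run optimization passes (newer LLVM releases spell this -passes=licm).
  step opt -licm prog-link.bc -o prog-opt.bc
  # 4. Lower to a native object file.
  step llc -filetype=obj prog-opt.bc -o prog-exec.o
  # 5. Final link: this is the only step where -L, -rpath and -lpapi matter.
  step clang prog-exec.o -L"$PAPI_LIB" -Wl,-rpath,"$PAPI_LIB" -lpapi -o a.out
}

build
```

Running it with `RUN=1 PAPI_LIB=$PAPI_LIB sh build.sh` (the script name is arbitrary) would execute the commands instead of just printing them.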


clang: error: unknown argument: '-fno-leading-underscore'

I am trying to compile my .cpp with -fno-leading-underscore option but it raises an error saying:
clang: error: unknown argument: '-fno-leading-underscore'
g++ -m32 -fno-use-cxa-atexit -nostdlib -fno-builtin -fno-rtti -fno-exceptions -fno-leading-underscore -o kernel.o -c kernel.cpp
How can I fix this? I am new to Mac; it used to work on Linux Mint.
In your terminal, type g++ and press enter. You will probably get:
clang: error: no input files
As you can see, it's still clang underneath.
To fix this, first cd into /usr/local/bin and type ls. You will see the binaries inside:
g++-9
...
Next, create a symlink to it so that you can invoke g++ directly:
ln -s g++-9 g++
This only helps if /usr/local/bin comes before /usr/bin in your PATH. If it still doesn't work for some reason, you can explicitly write g++-9 (or whatever version you have), or even give the full path, /usr/local/bin/g++-9.
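To confirm which compiler a given g++ really is, you can look at its version banner. The sketch below assumes that clang's banner contains the word "clang" and that GCC's contains "GCC" or "Free Software Foundation"; the function name `compiler_kind` is made up for this example.

```shell
# Classify a compiler from the text of its --version banner.
compiler_kind() {
  case "$1" in
    *clang*)                            echo clang ;;    # Apple's g++ wrapper ends up here
    *GCC*|*"Free Software Foundation"*) echo gcc ;;
    *)                                  echo unknown ;;
  esac
}

# Check whatever "g++" currently resolves to, if there is one on the PATH.
if command -v g++ >/dev/null 2>&1; then
  compiler_kind "$(g++ --version 2>&1)"
fi
```

If this prints clang after you created the symlink, the shell is still picking up /usr/bin/g++ first.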

How to pass compiler flags to nvcc from clang

I am trying to compile CUDA with clang, but the code I am trying to compile depends on a specific nvcc flag (-default-stream per-thread). How can I tell clang to pass the flag to nvcc?
For example, I can compile with nvcc and everything works fine:
nvcc -default-stream per-thread *.cu -o app
But when I compile with clang, the program does not behave correctly because I cannot pass the default-stream flag:
clang++ --cuda-gpu-arch=sm_35 -L/usr/local/cuda/lib64 *.cu -o app -lcudart_static -ldl -lrt -pthread
How do I get clang to pass flags to nvcc?
It looks like it may not be possible.
Behind the scenes, nvcc calls the host compiler (gcc or clang) with some custom generated flags, and then calls ptxas and some other tools to create the binary.
e.g.
nvcc -default-stream per-thread foo.cu
# Behind the scenes
gcc -custom-nvcc-generated-flag -DCUDA_API_PER_THREAD_DEFAULT_STREAM=1 -o foo.ptx
ptxas foo.ptx -o foo.cubin
When compiling to CUDA from clang, clang compiles directly to ptx and then calls ptxas:
clang++ foo.cu -o app -lcudart_static -ldl -lrt -pthread
# Behind the scenes
clang++ -triple nvptx64-nvidia-cuda foo.cu -o foo.ptx
ptxas foo.ptx -o foo.cubin
clang never actually calls nvcc. It just targets ptx and calls the ptx assembler.
Unless you know what custom backend flags will be produced by nvcc and manually include them when calling clang, I'm not sure you can automatically pass an nvcc flag from clang.
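As an illustration of including such a flag manually, the sketch below adds the macro shown in the nvcc example above to the clang command line from the question. This is an untested assumption, not a confirmed equivalent of -default-stream per-thread: the macro changes what the CUDA runtime headers declare, but whether kernel launches compiled by clang also end up on the per-thread stream depends on the clang version. The variable names are invented for the example.

```shell
# Macro that nvcc defines for -default-stream per-thread (per the example above).
PER_THREAD_DEF=-DCUDA_API_PER_THREAD_DEFAULT_STREAM=1

# The clang command from the question with the macro added by hand.
cmd="clang++ --cuda-gpu-arch=sm_35 $PER_THREAD_DEF -L/usr/local/cuda/lib64 *.cu -o app -lcudart_static -ldl -lrt -pthread"

# Print the command for inspection; run it with: eval "$cmd"
echo "$cmd"
```

It is worth verifying the result, for example by checking in a profiler that work issued from different host threads really lands on different streams.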
If you are using clang-specific features only on the host side and don't actually need clang for the device side, you're probably looking for this:
https://devblogs.nvidia.com/separate-compilation-linking-cuda-device-code/
As @Increasingly-Idiotic points out, I believe clang does not "call" nvcc internally, so I don't think you can pass arguments to it.

gcc - linking and compiling in one command

I am new to C++ and am learning RTI DDS at the moment by compiling their examples. I am currently using their makefiles, but I want to learn how to compile individual files using gcc directly. The makefiles first compile the objects and then link them together, as shown below.
g++ -DRTI_UNIX -DRTI_LINUX -DRTI_64BIT -m64 -O2 -o objs/x64Linux3gcc4.8.2/HelloPublisher.o -Isrc -Isrc/idl -I/opt/rti_connext_dds-5.2.3/include -I/opt/rti_connext_dds-5.2.3/include/ndds -c src/HelloPublisher.cpp
g++ -m64 -static-libgcc -Wl,--no-as-needed objs/x64Linux3gcc4.8.2/HelloPublisher.o -o objs/x64Linux3gcc4.8.2/HelloPublisher -L/opt/rti_connext_dds-5.2.3/lib/x64Linux3gcc4.8.2 -lnddscppz -lnddscz -lnddscorez -ldl -lnsl -lm -lpthread -lrt
How can I write a single command using g++/gcc to do both?
The usual way is
g++ -o $prog -DRTI_UNIX $moreflags $file1.cpp $file2.cpp $prog.cpp $libs
You may have to experiment a bit with the many arguments you've got, since order matters: libraries should come after the source or object files that use them.
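Applied to the two commands from the question, a combined command might look like the sketch below. It simply merges the compile flags and the link flags, drops -c, and keeps the libraries last. It assumes HelloPublisher.cpp is the only source file needed; if the makefile also builds generated IDL sources, they would have to be listed as well. The variable names are only there to keep the line readable.

```shell
# Preprocessor definitions and include paths from the compile step.
DEFS="-DRTI_UNIX -DRTI_LINUX -DRTI_64BIT"
INCS="-Isrc -Isrc/idl -I/opt/rti_connext_dds-5.2.3/include -I/opt/rti_connext_dds-5.2.3/include/ndds"
# Linker options and libraries from the link step.
LDOPTS="-static-libgcc -Wl,--no-as-needed -L/opt/rti_connext_dds-5.2.3/lib/x64Linux3gcc4.8.2"
LIBS="-lnddscppz -lnddscz -lnddscorez -ldl -lnsl -lm -lpthread -lrt"

# No -c: g++ compiles and links in one go. Sources come before the libraries.
cmd="g++ -m64 -O2 $DEFS $INCS $LDOPTS src/HelloPublisher.cpp -o objs/x64Linux3gcc4.8.2/HelloPublisher $LIBS"

# Print the command for inspection; run it with: eval "$cmd"
echo "$cmd"
```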

Linking CUDA + plain C++ code: undefined reference to `__fatbinwrap_66_tmpxft_ etc

Somehow my CUDA binary build process has been messed up. All of the .cu files compile nicely to .o files, but when I try to link, I get:
CMakeFiles/tester.dir/tester_intermediate_link.o: In function `__cudaRegisterLinkedBinary_66_tmpxft_00007a5f_00000000_16_cuda_device_runtime_compute_52_cpp1_ii_8b1a5d37':
/tmp/tmpxft_00006b54_00000000-2_tester_intermediate_link.reg.c:7: undefined reference to `__fatbinwrap_66_tmpxft_00007a5f_00000000_16_cuda_device_runtime_compute_52_cpp1_ii_8b1a5d37'
Now, I have not used compute_52 anywhere. My nvcc command-line is:
/usr/local/cuda/bin/nvcc -M -D__CUDACC__ /home/joeuser/src/my_project/src/kernel_specific/elementwise/Add.cu -o /home/joeuser/src/my_project/CMakeFiles/tester.dir/src/kernel_specific/elementwise/tester_generated_Add.cu.o.NVCC-depend -ccbin /usr/bin/gcc-4.9.3 -m64 --std c++11 -D__STRICT_ANSI__ -Xcompiler ,\"-Wall\",\"-g\",\"-g\",\"-O0\" -gencode arch=compute_35,code=compute_35 -g -G --generate-line-info -DNVCC -I/usr/local/cuda/include -I/opt/cub -I/usr/local/cuda/include
and my link line is:
/usr/bin/g++-4.9.3 -Wall -std=c++11 -g some.o files.o here.o blah.o blahblah.o bar.cu.o baz.cu.o -o bin/myapp -rdynamic -Wl,-Bstatic -lcudart_static -Wl,-Bdynamic -lpthread -lrt -ldl /usr/lib/libboost_system.so /usr/lib/libboost_program_options.so -Wl,-Bstatic -lcudart_static -Wl,-Bdynamic -lpthread -lrt -ldl /usr/local/cuda/extras/CUPTI/lib64/libcupti.so -lnvToolsExt -lOpenCL /usr/lib/libboost_system.so /usr/lib/libboost_program_options.so /usr/local/cuda/extras/CUPTI/lib64/libcupti.so -lnvToolsExt -lOpenCL -Wl,-rpath,/usr/lib:/usr/local/cuda/extras/CUPTI/lib64
I'll note I have separate compilation enabled, and do not seem to have skipped my intermediate link phase.
Why is this happening?
CUDA has two compilation modes: relocatable and static.
The relocatable mode is required for some configurations, which we will not get into here.
If you want to compile in relocatable mode (-rdc=true), you'll need the CUDA device runtime library, which is located in the file cudadevrt.lib (libcudadevrt.a on Linux).
In some cases, supplying -lcudadevrt as a command-line switch to the CUDA linker does the job, but with e.g. MSVC you'll also need to specify cudadevrt.lib as a link dependency.
Well, I'm not sure why I'm seeing missing references to Compute 5.2 calls, but adding -lcudadevrt to the end of the link command makes the error go away.

How to run two executables generated with different gcc versions on the same system

I have two executables generated with different gcc versions.
One was built with gcc 3.4.2 and the other with gcc 4.3.2 on my Linux box.
Both have to run in the same environment, i.e. with the same LD_LIBRARY_PATH.
Currently the path of 4.3.2 is placed before that of 3.4.2, and the 3.4.2 one gives this error:
libstdc++.so.6: version 'GLIBCXX_3.4.9' not found (required by ../../src/hello)
I am thinking about a solution where I can store in the executable itself the information about where to find the libraries it needs at load time.
I created the build scripts below, which give the problem (the -O3 option enables optimization):
/opt/gcc-4.3.2/bin/g++ -pipe -O3 -c hello4_3_2.cpp
/opt/gcc-4.3.2/bin/g++ -o hello4_3_2 hello4_3_2.o -L$/opt/gcc-4.3.2/lib64/libstdc++
/opt/gcc-3.4.2/bin/g++ -pipe -O3 -c hello3_4_2.cpp
/opt/gcc-3.4.2/bin/g++ -o hello3_4_2 hello3_4_2.o -L$/opt/gcc-3.4.2/lib64/libstdc++
The script below works for me (without the -O3 option):
/opt/gcc-4.3.2/bin/g++ -pipe -c hello4_3_2.cpp
/opt/gcc-4.3.2/bin/g++ -o hello4_3_2 hello4_3_2.o -L$/opt/gcc-4.3.2/lib64/libstdc++
/opt/gcc-3.4.2/bin/g++ -pipe -c hello3_4_2.cpp
/opt/gcc-3.4.2/bin/g++ -o hello3_4_2 hello3_4_2.o -L$/opt/gcc-3.4.2/lib64/libstdc++
Now I would like to know:
Is there any other way to achieve this?
Is there any drawback to doing it this way?
Specify an rpath when linking:
/opt/gcc-4.3.2/bin/g++ -o hello4_3_2 hello4_3_2.o -Wl,-rpath,/opt/gcc-4.3.2/lib64
#                                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This will hardcode a library search path into the executable.
You can use ldd ./hello4_3_2 to check, without running the program, whether the correct libraries are being found.
The libstdc++ manual lists several options.
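For the two executables in the question, the rpath approach could be sketched as follows. The helper name `link_cmd` is made up, and the lib64 locations are taken from the paths in the question. The function only prints the link command for each toolchain, so it can be inspected before being run.

```shell
# Print the link command for one toolchain, e.g. "link_cmd 4.3.2".
link_cmd() {
  v=$1                                      # GCC version
  name=hello$(printf '%s' "$v" | tr . _)    # hello4_3_2 / hello3_4_2, as in the question
  echo "/opt/gcc-$v/bin/g++ -o $name $name.o -Wl,-rpath,/opt/gcc-$v/lib64"
}

link_cmd 4.3.2
link_cmd 3.4.2

# After linking for real, check what was embedded and what the loader resolves:
#   readelf -d ./hello4_3_2 | grep -E 'RPATH|RUNPATH'
#   ldd ./hello4_3_2 | grep libstdc++
```

One thing to watch for in the readelf output: an entry recorded as RPATH is searched before LD_LIBRARY_PATH, whereas RUNPATH is searched after it. If a newer linker emits RUNPATH, a conflicting LD_LIBRARY_PATH can still win, and passing -Wl,--disable-new-dtags asks GNU ld for the older RPATH behaviour.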