CUDA linker error with template class

Using CUDA 5.0 on ubuntu with gcc/g++ 4.6, I'm getting errors when linking against CUDA code with templates.
cu_array.cu:
#include "cu_array.hpp"
template<class T>
CuArray<T>::CuArray(unsigned int n) {
cudaMalloc(&data,n*sizeof(T));
}
cu_array.hpp:
#pragma once
template<class T>
class CuArray {
public:
CuArray(unsigned int n);
private:
T* data;
};
main.cu:
#include "cu_array.hpp"
int main() {
CuArray<float> a(10);
}
These compile fine with nvcc -c, but linking with nvcc cu_array.o main.o gives undefined reference to CuArray<float>::CuArray(unsigned int). If I move the contents of cu_array.cu into the header and only build the main, it uses the templates just fine. Or if I remove the templates altogether, the code naturally links fine.
I'm sure there's a simple answer for this. Any ideas?

You haven't instantiated the class in the compilation unit where it is defined, so the compiler doesn't emit any code for the class member function, and linkage fails. This isn't specific to CUDA: this greedy style of instantiation is the compilation/linkage model g++ uses, and lots of people get caught out by it.
As you have found already, the simplest solution is to include everything into the same compilation unit, and the problem disappears.
Otherwise if you explicitly instantiate CuArray::CuArray at the bottom of cu_array.cu like this:
template CuArray<float>::CuArray(unsigned int);
the compiler will emit code where it would otherwise not, and the linkage problem will be fixed. You will need to instantiate every class function for every type you want to use elsewhere in the code to make this approach work.
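As a rough host-only sketch of the same idea, the whole class can be instantiated in one statement instead of member by member. HostArray is a hypothetical stand-in that uses new[] in place of cudaMalloc so it builds with a plain C++11 compiler; the file split is only indicated in comments.

```cpp
#include <cassert>
#include <cstddef>

// --- would live in host_array.hpp (hypothetical file name) ---
template <class T>
class HostArray {
public:
    explicit HostArray(std::size_t n);
    ~HostArray();
    HostArray(const HostArray&) = delete;             // owning raw pointer: no copies
    HostArray& operator=(const HostArray&) = delete;
    std::size_t size() const;
    T& at(std::size_t i);
private:
    T* data;
    std::size_t count;
};

// --- would live in host_array.cpp ---
template <class T>
HostArray<T>::HostArray(std::size_t n) : data(new T[n]()), count(n) {}

template <class T>
HostArray<T>::~HostArray() { delete[] data; }

template <class T>
std::size_t HostArray<T>::size() const { return count; }

template <class T>
T& HostArray<T>::at(std::size_t i) {
    assert(i < count);   // bounds check, active in debug builds only
    return data[i];
}

// Explicit instantiation definition: emits code for every member of
// HostArray<float> defined above, so other object files can link to them.
template class HostArray<float>;
```

A caller that only sees the header could then write HostArray<float> a(10); and link against this object file. Any other element type would still need its own explicit instantiation line.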

Related

CUDA C++ Templating of Kernel Parameter

I'm trying to templatize a CUDA kernel based on a boolean variable (as shown here: Should I unify two similar kernels with an 'if' statement, risking performance loss?), but I keep getting a compiler error that says my function is not a template. I think that I'm just missing something obvious so it's pretty frustrating.
The following does NOT work:
util.cuh
#include "kernels.cuh"
//Utility functions
kernels.cuh
#ifndef KERNELS
#define KERNELS
template<bool approx>
__global__ void kernel(...params...);
#endif
kernels.cu
template<bool approx>
__global__ void kernel(...params...)
{
if(approx)
{
//Approximate calculation
}
else
{
//Exact calculation
}
}
template __global__ void kernel<false>(...params...); //Error occurs here
main.cu
#include "kernels.cuh"
kernel<false><<<dimGrid,dimBlock>>>(...params...);
The following DOES work:
util.cuh
#include "kernels.cuh"
//Utility functions
kernels.cuh
#ifndef KERNELS
#define KERNELS
template<bool approx>
__global__ void kernel(...params...);
template<bool approx>
__global__ void kernel(...params...)
{
if(approx)
{
//Approximate calculation
}
else
{
//Exact calculation
}
}
#endif
main.cu
#include "kernels.cuh"
kernel<false><<<dimGrid,dimBlock>>>(...params...);
If I throw in the
template __global__ void kernel<false>(...params...);
line at the end of kernels.cuh it also works.
I get the following errors (both referring to the marked line above):
kernel is not a template
invalid explicit instantiation declaration
If it makes a difference I compile all of my .cu files in one line, like:
nvcc -O3 -arch=sm_21 -I. main.cu kernels.cu -o program

All explicit specialization declarations must be visible at the time of the template instantiation. Your explicit specialization declaration is visible only in the kernels.cu translation unit, but not in main.cu.
The following code is indeed working correctly (apart from adding a __global__ qualifier at the explicit instantiation instruction).
#include <cuda_runtime.h>
#include <stdio.h>
template<bool approx>
__global__ void kernel()
{
if(approx)
{
printf("True branch\n");
}
else
{
printf("False branch\n");
}
}
template __global__ void kernel<false>();
int main(void) {
kernel<false><<<1,1>>>();
cudaDeviceSynchronize(); // wait for the kernel so its printf output is flushed
return 0;
}
EDIT
In C++, templated functions are not compiled until an instantiation of the function is encountered, either implicitly through a use or through an explicit instantiation. From this point of view, CUDA, which now fully supports templates, behaves exactly the same way as C++.
To make a concrete example, when the compiler finds something like
template<class T>
__global__ void kernel(...params...)
{
...
T a;
...
}
it just checks the function syntax, but produces no object code. So, if you compile a file containing only a templated function as above, you will get an "empty" object file. This is reasonable, since the compiler would not know which type to assign to a.
The compiler produces object code only when it encounters an instantiation of the function template, either implicit (a use with the definition visible) or explicit. This is, at the moment, how compilation of templated functions works, and this behavior introduces a restriction for multiple-file projects: the definition of a templated function must be visible in every file that uses it, unless the needed instantiations are explicitly provided elsewhere. So you cannot simply keep the interface in kernels.cuh and the implementation in kernels.cu, which is the main reason why the first version of your code does not compile. Accordingly, you must include both interface and implementation in any file that uses the templates; namely, you must include both kernels.cuh and kernels.cu in main.cu.
Since no code is generated without an instantiation, compilers tolerate the same template file, with both declarations and definitions, being included more than once in a project without generating linkage errors.
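A minimal host-only sketch of the two routes to generated code, assuming a hypothetical function template area in place of the kernel (3.0 stands in for an "approximate" pi purely for illustration):

```cpp
#include <cassert>

// Declaration, as it would appear in a header.
template <bool approx>
double area(double r);

// Definition. On its own this produces no object code.
template <bool approx>
double area(double r) {
    assert(r >= 0.0);
    if (approx) {
        return 3.0 * r * r;                 // approximate calculation
    } else {
        return 3.141592653589793 * r * r;   // exact (to double precision)
    }
}

// Explicit instantiation definitions: the compiler now emits code for
// both variants, even if nothing in this file calls them.
template double area<true>(double);
template double area<false>(double);
```

If the two explicit instantiation lines were removed, code would only be emitted in translation units that both see the definition and call area<true> or area<false>, which is the implicit route.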
There are several tutorials on using templates in C++. An Idiot's Guide to C++ Templates - Part 1, apart from the irritating title, will provide you with a step-by-step introduction to the topic.

Curiously Recursive Template Pattern in GCC 3.4 (MoSync to be exact)

I'm currently trying to write an Artemis like game component/entity system in C++. I was planning on getting this system to work with a cross platform tool for writing applications on Android and iOS called MoSync.
Unfortunately MoSync currently uses an old version of GCC and when porting the library that I had been testing in Visual Studio, I got a whole bunch of errors. Most of these I could solve, but there is one bug with templates that I can't get my head around.
I wrote a small example
template <typename T>
struct Base
{
static int type;
};
struct Derived : public Base<Derived>
{
};
template <typename T>
int Base<T>::type(-1);
extern "C" int MAMain()
{
Derived d;
d.type = 0;
}
My library uses the Curiously Recursive Template Pattern for defining Components. This example compiles fine in GCC 4.4 and Visual Studio 2010. However when I try to compile this in MoSync (which uses GCC 3.4.6) I get this linker error
C:\MoSync\workspace\pede\main.cpp: Error: Unresolved symbol '__ZN4BaseI7DerivedE4typeE',
Is there a workaround to get this to work in this compiler, or will I have to find another way to define my Components?
Edit*
In fact I can make this error occur with an even simpler example:
template <typename T>
struct Component {
static int t;
};
template <typename T>
int Component<T>::t(-1);
extern "C" int MAMain()
{
Component<int>::t = 0;
}
Gives this error
C:\MoSync\workspace\Components\main.cpp:9: Error: Unresolved symbol '__ZN9ComponentIiE1tE',
I guess this might not have anything to do with the Curiously Recursive Template Pattern at all. What can I do to get this to compile under GCC 3.4.6?

According to this bug report on the gcc bugtracker, the problem is caused by specifying a default value in the static variable definition. The code should link if you remove the initialisation as so:
int Base<T>::type;
The bug report seems to have been resolved as not a bug. Despite this, your samples compile fine in GCC 4.4.
To work around this, you can use a class type with a constructor that will automatically initialise itself.
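A minimal sketch of that workaround, assuming a hypothetical wrapper type TypeId; it has not been tested against GCC 3.4.6 itself, so treat it as an illustration of the idea rather than a confirmed fix:

```cpp
#include <cassert>

// Hypothetical wrapper: an int that default-constructs to -1.
struct TypeId {
    int value;
    TypeId() : value(-1) {}
    TypeId& operator=(int v) {
        assert(v >= -1);   // assumption: ids below -1 are invalid in this sketch
        value = v;
        return *this;
    }
    operator int() const { return value; }
};

template <typename T>
struct Base {
    static TypeId type;
};

// No initialiser in the definition: the default constructor supplies the -1.
template <typename T>
TypeId Base<T>::type;

struct Derived : public Base<Derived> {};
```

With this in place, d.type = 0; on a Derived object still reads the same as in the original sample, and the value can be used wherever an int is expected.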

Does adding
template <> int Base<Derived>::type(-1);
help?
GCC 3.4 is really getting old and doesn't cope well with template sorcery.

Separate compiling with MinGW

Using this tutorial Makefile, I want to build a simple program with separate compilation. The main problem is that I cannot compile the files in either the Eclipse Indigo IDE (C/C++ perspective) or MinGW. The error I get is:
undefined reference to `double getAverage<int, 85u>(int (&) [85u])'
undefined reference to `int getMax<int, 85u>(int (&) [85u])'
undefined reference to `int getMin<int, 85u>(int (&) [85u])'
undefined reference to `void print<int, 85u>(int (&) [85u])'
undefined reference to `void sort<int, 85u>(int (&) [85u])'
undefined reference to `void print<int, 85u>(int (&) [85u])'
The main.cpp file is this :
#include "Tools.h"
#include <iostream>
using namespace std;
int main(int argc,char* argv[])
{
int numbers[] = {1,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8,-2,7,14,5,6,16,8};
cout <<"Average = "<< getAverage(numbers) << endl;
cout <<"Max element = "<< getMax(numbers) << endl;
cout <<"Minimal element = "<< getMin(numbers) << endl;
print(numbers);
sort(numbers);
print(numbers);
return 0;
}
and I have a Tools.h file :
#ifndef TOOLS_H_
#define TOOLS_H_
#include <iostream>
int getBigger(int numberOne,int numberTwo);
template <typename T,size_t N> double getAverage(T (&numbers)[N]);
template <typename T,size_t N> T getMax(T (&numbers)[N]);
template <typename T,size_t N> T getMin(T (&numbers)[N]);
/**
* Implementing a simple sort method of big arrays
*/
template <typename T,size_t N> void sort(T (&numbers)[N]);
/**
* Implementing a method for printing arrays
*/
template <typename T,size_t N> void print(T (&numbers)[N]);
#endif

When you compile Tools.cpp, your compiler has no idea about the template parameters that you have used in main.cpp. Therefore it compiles nothing related to these templates.
You need to include these template definitions from the compilation unit that uses them. The file Tools.cpp is often renamed to something like Tools.inl to indicate that it's neither a header file nor a separate compilation unit.
The compilation unit "main.cpp" could look like this:
#include "Tools.h"
#include "Tools.inl"
int main()
{
int numbers[] = {1,2,3};
getAverage(numbers);
}
Since the compiler identifies the required specialization it can generate the code from the implementation file.

For most cases, harper's answer is appropriate. But for completeness' sake, explicit template instantiation should also be mentioned.
When you include the implementation in every compilation unit, your template classes and functions will be instantiated and compiled in all of them. Sometimes, this is not desirable. It is mostly due to compile-time memory restrictions and compilation time, if your template classes and functions are very complicated. This becomes a very real issue when you, or the libraries you use rely heavily on template metaprogramming. Another situation could be that your template function implementations might be used in many compilation units, and when you change the implementation, you will be forced to re-compile all those compilation units.
So, the solution in these situations is to include a header file like your tools.h, and have a tools.cpp, implementing the templates. The catch is that, you should explicitly instantiate your templates for all the template arguments that will be used throughout your program. This is accomplished via adding the following to tools.cpp:
template double getAverage<int,85>(int (&numbers)[85]);
Note: You obviously have to do something about that "85", such as defining it in a header file and using it across tools.cpp and main.cpp
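One way to handle the "85" could look like the sketch below. The constant name kNumberCount is made up for illustration, and the file boundaries are only marked in comments so the snippet stands on its own.

```cpp
#include <cassert>
#include <cstddef>

// --- Tools.h ---
// Hypothetical shared constant; main.cpp would declare int numbers[kNumberCount].
const std::size_t kNumberCount = 85;

template <typename T, std::size_t N>
double getAverage(T (&numbers)[N]);

// --- Tools.cpp ---
template <typename T, std::size_t N>
double getAverage(T (&numbers)[N]) {
    assert(N > 0);
    double sum = 0.0;
    for (std::size_t i = 0; i < N; ++i) {
        sum += numbers[i];
    }
    return sum / N;
}

// Explicit instantiation for the one element type and size the program uses.
template double getAverage<int, kNumberCount>(int (&)[kNumberCount]);
```

The trade-off is that an array of any other size or element type will fail to link until a matching explicit instantiation is added to Tools.cpp.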

I've found this article, which is useful: templates and header files
I declared the functions in Tools.h, included Tools.hpp at the end of it, and then defined them in Tools.hpp.

I haven't tried to compile .cpp and .c files together, but maybe my example will help.
I had a similar problem compiling two separate assembly files (.s) on MinGW with the standard gcc
compiler, and I achieved it as follows:
gcc -m32 -o test test.s hello.s
-m32 means I'm compiling 32-bit code
-o is the output file (which in my example is the "test" file)
test.s and hello.s are my source files. test.s is the main file and hello.s has the helper function. (Also worth mentioning: both files are in the same directory.)

C++ template function compiles in header but not implementation

I'm trying to learn templates and I've run into this confounding error. I'm declaring some functions in a header file and I want to make a separate implementation file where the functions will be defined.
Here's the code that calls the header (dum.cpp):
#include <iostream>
#include <vector>
#include <string>
#include "dumper2.h"
int main() {
std::vector<int> v;
for (int i=0; i<10; i++) {
v.push_back(i);
}
test();
std::string s = ", ";
dumpVector(v,s);
}
Now, here's a working header file (dumper2.h):
#include <iostream>
#include <string>
#include <vector>
void test();
template <class T> void dumpVector(const std::vector<T>& v,std::string sep);
template <class T> void dumpVector(const std::vector<T>& v, std::string sep) {
typename std::vector<T>::const_iterator vi;
vi = v.cbegin();
std::cout << *vi;
vi++;
for (;vi<v.cend();vi++) {
std::cout << sep << *vi ;
}
std::cout << "\n";
return;
}
With implementation (dumper2.cpp):
#include <iostream>
#include "dumper2.h"
void test() {
std::cout << "!olleh dlrow\n";
}
The weird thing is that if I move the code that defines dumpVector from the .h to the .cpp file, I get the following error:
g++ -c dumper2.cpp -Wall -Wno-deprecated
g++ dum.cpp -o dum dumper2.o -Wall -Wno-deprecated
/tmp/ccKD2e3G.o: In function `main':
dum.cpp:(.text+0xce): undefined reference to `void dumpVector<int>(std::vector<int, std::allocator<int> >, std::basic_string<char, std::char_traits<char>, std::allocator<char> >)'
collect2: ld returned 1 exit status
make: *** [dum] Error 1
So why does it work one way and not the other? Clearly the compiler can find test(), so why can't it find dumpVector?

The problem you're having is that the compiler doesn't know which versions of your template to instantiate. When you move the implementation of your function to x.cpp it is in a different translation unit from main.cpp, and main.cpp can't link to a particular instantiation because it doesn't exist in that context. This is a well-known issue with C++ templates. There are a few solutions:
1) Just put the definitions directly in the .h file, as you were doing before. This has pros & cons, including solving the problem (pro), possibly making the code less readable & on some compilers harder to debug (con) and maybe increasing code bloat (con).
2) Put the implementation in x.cpp, and #include "x.cpp" from within x.h. If this seems funky and wrong, just keep in mind that #include does nothing more than read the specified file and compile it as if its contents were part of the including file (here x.h). In other words, this does exactly what solution #1 does above, but it keeps them in separate physical files. When doing this kind of thing, it is critical that you not try to compile the #included file on its own. For this reason, I usually give these kinds of files an hpp extension to distinguish them from h files and from cpp files.
File: dumper2.h
#include <iostream>
#include <string>
#include <vector>
void test();
template <class T> void dumpVector( std::vector<T> v,std::string sep);
#include "dumper2.hpp"
File: dumper2.hpp
template <class T> void dumpVector(std::vector<T> v, std::string sep) {
typename std::vector<T>::iterator vi;
vi = v.begin();
std::cout << *vi;
vi++;
for (;vi<v.end();vi++) {
std::cout << sep << *vi ;
}
std::cout << "\n";
return;
}
3) Since the problem is that a particular instantiation of dumpVector is not known to the translation unit that is trying to use it, you can force a specific instantiation of it in the same translation unit as where the template is defined. Simply by adding this: template void dumpVector<int>(std::vector<int> v, std::string sep); ... to the file where the template is defined. Doing this, you no longer have to #include the hpp file from within the h file:
File: dumper2.h
#include <iostream>
#include <string>
#include <vector>
void test();
template <class T> void dumpVector( std::vector<T> v,std::string sep);
File: dumper2.cpp
#include "dumper2.h"
template <class T> void dumpVector(std::vector<T> v, std::string sep) {
typename std::vector<T>::iterator vi;
vi = v.begin();
std::cout << *vi;
vi++;
for (;vi<v.end();vi++) {
std::cout << sep << *vi ;
}
std::cout << "\n";
return;
}
template void dumpVector<int>(std::vector<int> v, std::string sep);
By the way, and as a total aside, your template function is taking a vector by-value. You may not want to do this, and pass it by reference or pointer or, better yet, pass iterators instead to avoid making a temporary & copying the whole vector.
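A sketch of what the iterator-based alternative might look like. joinRange is a hypothetical helper name, not something from the question, and the dumpVector shown here takes its arguments by const reference instead of by value:

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats [first, last) with sep between elements.
// Nothing is copied, and an empty range simply yields an empty string.
template <class Iter>
std::string joinRange(Iter first, Iter last, const std::string& sep) {
    std::ostringstream out;
    for (Iter it = first; it != last; ++it) {
        if (it != first) {
            out << sep;
        }
        out << *it;
    }
    assert(out.good());   // formatting into a string stream should not fail
    return out.str();
}

// Same job as the original dumpVector, but taking the vector by const reference.
template <class T>
void dumpVector(const std::vector<T>& v, const std::string& sep) {
    std::cout << joinRange(v.begin(), v.end(), sep) << "\n";
}
```

Unlike the original loop, this version does not dereference begin() before checking for an empty vector. The instantiation issue discussed above is unchanged: these definitions still have to be visible to the caller or be explicitly instantiated.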

This was what the export keyword was supposed to accomplish (i.e., by exporting the template, you'd be able to put it in a source file instead of a header). Unfortunately, only one compiler (Comeau) ever really implemented export completely.
As to why the other compilers (including gcc) didn't implement it, the reason is pretty simple: because export is extremely difficult to implement correctly. Code inside the template can change meaning (almost) completely, based on the type over which the template is instantiated, so you can't generate a conventional object file of the result of compiling the template. Just for example, x+y might compile to native code like mov eax, x/add eax, y when instantiated over an int, but compile to a function call if instantiated over something like std::string that overloads operator+.
To support separate compilation of templates, you have to do what's called two-phase name lookup (i.e., lookup the name both in the context of the template and in the context where the template is being instantiated). You typically also have the compiler compile the template to some sort of database format that can hold instantiations of the template over an arbitrary collection of types. You then add in a stage between compiling and linking (though it can be built into the linker, if desired) that checks the database and if it doesn't contain code for the template instantiated over all the necessary types, re-invokes the compiler to instantiate it over the necessary types.
Due to the extreme effort, lack of implementation, etc., the committee has voted to remove export from the next version of the C++ standard. Two other, rather different, proposals (modules and concepts) have been made that would each provide at least part of what export was intended to do, but in ways that are (at least hoped to be) more useful and reasonable to implement.

Template parameters are resolved at compile time.
The compiler finds the .h, finds a matching definition for dumpVector, and stores it. The compiling is finished for this .h. Then, it continues parsing files and compiling files. When it reads the dumpVector implementation in the .cpp, it's compiling a totally different unit. Nothing is trying to instantiate the template in dumper2.cpp, so the template code is simply skipped. The compiler won't try every possible type for the template, hoping there will be something useful later for the linker.
Then, at link time, no implementation of dumpVector for the type int has been compiled, so the linker won't find any. Hence why you're seeing this error.
The export keyword was designed to solve this problem; unfortunately, few compilers support it. So keep your implementation in the same file as your declaration.

A template function is not a real function. The compiler turns a template function into a real function when it encounters a use of that function. So the entire template definition has to be in scope when it finds the call to dumpVector, otherwise it can't generate the real function.
Amazingly, a lot of C++ intro books get this wrong.

This is exactly how templates work in C++, you must put the implementation in the header.
When you declare/define a template function, the compiler can't magically know which specific types you may wish to use the template with, so it can't generate code to put into a .o file like it could with a normal function. Instead, it relies on generating a specific instantiation for a type when it sees the use of that instantiation.
So when the implementation is in the .C file, the compiler basically says "hey, there are no users of this template, don't generate any code". When the template is in the header, the compiler is able to see the use in main and actually generate the appropriate template code.

Most compilers don't allow you to put template function definitions in a separate source file, even though this is technically allowed by the standard.
See also:
http://www.parashift.com/c++-faq-lite/templates.html#faq-35.12
http://www.parashift.com/c++-faq-lite/templates.html#faq-35.14

"Undefined symbols" linker error with simple template class

Been away from C++ for a few years and am getting a linker error from the following code:
Gene.h
#ifndef GENE_H_INCLUDED
#define GENE_H_INCLUDED
template <typename T>
class Gene {
public:
T getValue();
void setValue(T value);
void setRange(T min, T max);
private:
T value;
T minValue;
T maxValue;
};
#endif // GENE_H_INCLUDED
Gene.cpp
#include "Gene.h"
template <typename T>
T Gene<T>::getValue() {
return this->value;
}
template <typename T>
void Gene<T>::setValue(T value) {
if(value >= this->minValue && value <= this->minValue) {
this->value = value;
}
}
template <typename T>
void Gene<T>::setRange(T min, T max) {
this->minValue = min;
this->maxValue = max;
}
Using Code::Blocks and GCC if it matters to anyone. Also, clearly porting some GA stuff to C++ for fun and practice.

The template definition (the cpp file in your code) has to be included prior to instantiating a given template class, so you either have to include function definitions in the header, or #include the cpp file prior to using the class (or do explicit instantiations if you have a limited number of them).

Including the cpp file containing the implementations of the template class functions works. However, IMHO, this is weird and awkward. There must surely be a slicker way of doing this?
If you have only a few different instances to create, and know them beforehand, then you can use "explicit instantiation"
This works something like this:
At the bottom of Gene.cpp, after the member function definitions, add the following lines
template class Gene<int>;
template class Gene<float>;
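Put together, a sketch of how the two files could look. The file split is only marked in comments, the explicit instantiations sit after the member definitions so that every member is already defined at that point, and the upper-bound check against maxValue is an assumption about what the original code intended:

```cpp
#include <cassert>

// --- Gene.h ---
template <typename T>
class Gene {
public:
    T getValue();
    void setValue(T value);
    void setRange(T min, T max);
private:
    T value;
    T minValue;
    T maxValue;
};

// --- Gene.cpp ---
template <typename T>
T Gene<T>::getValue() {
    return this->value;
}

template <typename T>
void Gene<T>::setValue(T value) {
    // Assumption: the upper bound is meant to be maxValue.
    if (value >= this->minValue && value <= this->maxValue) {
        this->value = value;
    }
}

template <typename T>
void Gene<T>::setRange(T min, T max) {
    assert(min <= max);
    this->minValue = min;
    this->maxValue = max;
}

// Explicit instantiations for the only types the program is assumed to use.
template class Gene<int>;
template class Gene<float>;
```

Code elsewhere can then include only Gene.h and use Gene<int> or Gene<float>; any other type, such as Gene<double>, would again produce an undefined-symbol error at link time.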

In if(value >= this->minValue && value <= this->minValue) the second minValue should be maxValue, no?
Echo what Sean said: What's the error message? You've defined and declared the functions, but you've not used them in anything anywhere, nor do I see an error (besides the typo).

TLDR
It seems that you need an Explicit Instantiation i.e. to actually create the class. Since template classes are just "instructions" on how to create a class you actually need to tell the compiler to create the class. Otherwise the linker won't find anything when it goes looking.
The thorough explanation
When compiling your code, g++ goes through a number of steps; the problem you're seeing occurs in the linking step. Template classes define how classes "should" be created; they're literally templates. At compile time g++ compiles each cpp file individually, so the compiler sees your template for how to create a class but no instructions on which classes to create, and therefore ignores it. Later, during the linking step, g++ attempts to link against the code for the class (which doesn't exist), fails to find it, and returns an error.
To remedy this you actually need to "explicitly instantiate" the class by adding the following lines to Gene.cpp after the definition of the class
template class Gene<whatever_type_u_wanna_use_t>;
Check out these docs I found them to be super helpful.
