Calling a functor from a Cuda Kernel [duplicate]

This question already has answers here:
default visibility of C++ class/struct members
(4 answers)
Closed 7 years ago.
I am trying to call a functor from a CUDA kernel. The functor is supplied by the programmer, and my library uses it to process an array and return the result.
Since the functor lives in host memory, I am copying the object to the device and using it in my kernel call.
Error: the compiler says the functor's operator() is inaccessible from the kernel.
I cannot understand where I am going wrong.
Note: the full error message is dumped at the end.
Here is the full code:
#include <cstdio>
using namespace std;
class my_functor
{
__device__
int operator() (int x)
{
return x*10;
}
};
template <class T,typename Func>
__global__
void for_each_kernel (T* d_v,int N,Func f)
{
int idx = blockIdx.x*blockDim.x + threadIdx.x;
int num_threads = gridDim.x * blockDim.x;
__shared__ T s_x[1024];
for(int i = idx; i < N; i += num_threads)
{
s_x[threadIdx.x] = d_v[i];
// Call The Functor Here
s_x[threadIdx.x] = f (s_x[threadIdx.x]);
d_v[i] = s_x[threadIdx.x];
}
}
template <class T>
class device_vector
{
T *d_v;
int numEle;
public :
device_vector (T *h_v,int N)
{
cudaMalloc ((T**)&d_v,N*sizeof(T));
cudaMemcpy(d_v, h_v, N * sizeof(T), cudaMemcpyHostToDevice);
numEle = N;
}
void set (T data,int index)
{
cudaMemcpy (&d_v[index],&data,sizeof(T),cudaMemcpyHostToDevice);
}
T get (int index)
{
T temp;
cudaMemcpy (&temp,&d_v[index],sizeof(T),cudaMemcpyDeviceToHost);
return temp;
}
// Only provide the start and end indices for which you want to do some operation
template <typename Func>
void for_each (int start,int end,Func f)
{
Func *new_func;
cudaMalloc (&new_func,sizeof(my_functor));
cudaMemcpy (new_func,&f,sizeof (my_functor),cudaMemcpyHostToDevice);
for_each_kernel<<<26,1024>>> (d_v,end-start+1,*new_func);
}
};
int a[1<<28];
int main ()
{
int N = 1<<28;
my_functor functor;
for (int i=0;i<N;i++)
a[i] = i;
device_vector<int> d (a,N);
d.for_each (0,N-1,functor);
printf ("Getting Element At Index %d : %d \n",100,d.get(100));
return 0;
}
Error Message Dump :
device_vector.cu(40): error: function "my_functor::operator()"
(18): here is inaccessible
detected during:
instantiation of "void for_each_kernel(T *, int, Func) [with T=int, Func=my_functor]"
(107): here
instantiation of "void device_vector<T>::for_each(int, int, Func) [with T=int, Func=my_functor]"
(125): here
1 error detected in the compilation of "/tmp/tmpxft_00005da2_00000000-9_device_vector.cpp1.ii".

You are getting the inaccessible error because my_functor is a class. Class members are, by default, private. If you change your definition of my_functor like this:
class my_functor
{
public:
__device__
int operator() (int x)
{
return x*10;
}
};
or change it to a struct (note struct members are public by default):
struct my_functor
{
__device__
int operator() (int x)
{
return x*10;
}
};
you might find the code compiles. There are possibly other things wrong with the code, but either of these modifications should remove the source of that particular compilation error.
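To see the access rule in isolation, here is a minimal host-only sketch (plain C++, no CUDA). The names times_ten, for_each_host and demo_times_ten are made up for illustration and are not part of the code above. The functor is handed to the template by value, which is also how kernel arguments are copied to the device at launch.

```cpp
#include <vector>

// With `struct`, members are public by default, so operator() can be
// called from outside the type. With `class`, a `public:` label is needed.
struct times_ten
{
    int operator()(int x) const { return x * 10; }
};

// Host-side stand-in for the kernel (hypothetical helper): applies f to
// every element in place. The functor is taken by value.
template <class T, typename Func>
void for_each_host(std::vector<T>& v, Func f)
{
    for (T& x : v)
        x = f(x);
}

// Usage: multiplies {1, 2, 3} element-wise and returns the result.
std::vector<int> demo_times_ten()
{
    std::vector<int> v{1, 2, 3};
    for_each_host(v, times_ten{});
    return v;
}
```

If times_ten were declared with class and no public: label, the call f(x) inside for_each_host would fail with the same kind of "inaccessible" error.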

Related

Templated elipsis constructor C++

I want to create a templated math vector class, but my code editor reports an ambiguous constructor.
I use only a header file.
.h
template<class T>
class MathVec {
private:
std::vector<T> mathVec;
size_t dimension;
public:
MathVec(size_t dim);
MathVec(size_t dim, ...);
MathVec(const MathVec& other);
MathVec();
void print();
};
template<class T>
MathVec<T>::MathVec(size_t dim) {
this->dimension = dim;
this->mathVec = std::vector<T>(dim,0);
}
template<class T>
MathVec<T>::MathVec(size_t dim, ...) {
this->mathVec = std::vector<T>(dim);
this->dimension = dim;
va_list list;
va_start(list, dim);
for (size_t i = 0; i < dim; i++){
this->mathVec[i] = va_arg(list, T);
}
va_end(list);
}
template<class T>
MathVec<T>::MathVec(const MathVec & other) {
this->dimension = other.dimension;
this->mathVec = other.mathVec;
}
template<class T>
MathVec<T>::MathVec() {
this->dimension = 0;
}
template<class T>
void MathVec<T>::print() {
for(int i = 0; i < this->dimension; ++i)
std::cout << this->mathVec[i] << " ";
std::cout << std::endl;
}
In the main.cpp file, the following code works:
MathVec<int> vect(3,1,2,3);
vect.print();
But the following code does not work:
MathVec<int> vect(1);
vect.print();
The error message is the following.
main.cpp:9:18: error: call to constructor of 'MathVec<int>' is ambiguous
mathvec.h:14:9: note: candidate constructor
mathvec.h:15:9: note: candidate constructor
mathvec.h:16:9: note: candidate constructor
I assume that something is wrong with the ellipsis, but I do not know what the problem is or how to solve it.

Modern C++ has a better way to accomplish that. Create the following constructor:
MathVec( std::initializer_list<T> values )
Usage:
MathVec<int> vect( { 1, 2, 3 } );
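A fuller sketch of how that constructor might sit next to the size constructor. This is a cut-down version of the class from the question, not the asker's full code; the size() and operator[] accessors are added here only so the result can be inspected.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

template <class T>
class MathVec {
    std::vector<T> mathVec;
public:
    MathVec() = default;
    // Zero-filled vector of the given dimension. `explicit` avoids
    // accidental conversions from a plain integer.
    explicit MathVec(std::size_t dim) : mathVec(dim, T{}) {}
    // Element list, replacing the C-style ellipsis constructor.
    MathVec(std::initializer_list<T> values) : mathVec(values) {}

    // Accessors added for illustration only.
    std::size_t size() const { return mathVec.size(); }
    const T& operator[](std::size_t i) const { return mathVec[i]; }
};
```

With this, MathVec<int> a(3); gives three zeros and MathVec<int> b{1, 2, 3}; gives the listed elements. One thing to watch: braces prefer the initializer_list constructor, so MathVec<int> c{3}; is a one-element vector holding 3, not a vector of dimension 3.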

My guess would be that the compiler does not know what constructor to call, as these two:
MathVec(size_t dim);
MathVec(size_t dim, ...);
can both be called by
MathVec<int> vect(1);

May I offer some improvements to your design?
There is no reason to pass the dimension as a constructor parameter. I think it would be better to make it a template parameter.
That way, if you mix matrices of different dimensions in an expression, you catch the error at compile time rather than when the program runs ;)
Working example:
#include <vector>
#include <iostream>
template <typename T, unsigned int dim>
class Matrix
{
const unsigned int m_dim = dim;
std::vector<T> m_vec;
public:
Matrix(): m_vec(std::vector<T>(m_dim, 0)) {}
virtual ~Matrix() {}
//suppose, we want m_vec be R/O for public
const std::vector<T>& Read() {return m_vec;}
};
int main(int, char **)
{
Matrix<double, 4> m;
for (auto& x : m.Read())
std::cout << x << std::endl;
}

In general I would avoid using C style variadic arguments, check variadic templates instead (C++11) or fold expressions (C++17)
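As a rough sketch of the variadic template route, assuming a static factory function named of (a hypothetical name, chosen so it cannot clash with the MathVec(size_t) constructor). It uses a C++11 pack expansion; the accessors are again only for illustration.

```cpp
#include <cstddef>
#include <vector>

template <class T>
class MathVec {
    std::vector<T> mathVec;
public:
    // Takes the elements themselves; the dimension is the argument count,
    // so it cannot disagree with the number of values passed.
    template <class... Args>
    static MathVec of(Args... args)
    {
        MathVec v;
        // Pack expansion: each argument is converted to T in order.
        v.mathVec = {static_cast<T>(args)...};
        return v;
    }

    // Accessors added for illustration only.
    std::size_t size() const { return mathVec.size(); }
    const T& operator[](std::size_t i) const { return mathVec[i]; }
};
```

Usage would be auto v = MathVec<int>::of(1, 2, 3);. Unlike va_arg, the argument types are checked by the compiler, and there is no separate dim to keep in sync.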

How do I call template array operator overloading function?

I need to create an adapter C++ class, which accepts an integer index, and retrieves some types of data from a C module by the index, and then returns it to the C++ module.
The data retrieving functions in the C module are like:
int getInt(int index);
double getDouble(int index);
const char* getString(int index);
// ...and etc.
I want to implement an array-like interface for the C++ module, so I created the following class:
class Arguments {
public:
template<typename T> T operator[] (int index);
};
template<> int Arguments::operator[] (int index) { return getInt(index); }
template<> double Arguments::operator[] (int index) { return getDouble(index); }
template<> std::string Arguments::operator[] (int index) { return getString(index); }
(Template class doesn't help in this case, but only template member functions)
The adapter class is no biggie, but calling the Arguments::operator[] is a problem!
I found out that I can only call it in this way:
Arguments a;
int i = a.operator[]<int>(0); // OK
double d = a.operator[]<double>(1); // OK
int x = a[0]; // doesn't compile! it doesn't deduce.
But it looks like a joke, doesn't it?
If this is the case, I would rather create normal member functions, like template<typename T> T get(int index).
So here comes the question: if I create array-operator-overloading function T operator[]() and its specializations, is it possible to call it like accessing an array?
Thank you!

The simple answer is: No, not possible. You cannot overload a function based on its return type. See here for a similar question: overload operator[] on return type
However, there is a trick that lets you deduce a type from the lhs of an assignment:
#include <iostream>
#include <type_traits>
struct container;
struct helper {
container& c;
size_t index;
template <typename T> operator T();
};
struct container {
helper operator[](size_t i){
return {*this,i};
}
template <typename T>
T get_value(size_t i){
if constexpr (std::is_same_v<T,int>) {
return 42;
} else {
return 0.42;
}
}
};
template <typename T>
helper::operator T(){
return c.get_value<T>(index);
}
int main() {
container c;
int x = c[0];
std::cout << x << "\n";
double y = c[1];
std::cout << y ;
}
Output is:
42
0.42
The line int x = c[0]; goes via container::get_value<int> where the int is deduced from the type of x. Similarly double y = c[1]; uses container::get_value<double> because y is double.
The price you pay is lots of boilerplate and using auto like this
auto x = c[1];
will get you a helper, not the desired value which might be a bit unexpected.

How to pass a dynamically allocated array with "size determined at run time" as a reference?

I know how to pass an array of constant size by reference, but I want to know how to pass an array of variable size by reference to another function. Any help would be much appreciated. Thank you
For example, I have the following code snippet:
void y(int (&arr)[n]) //Gives error
{}
void x(Node * tree, int n)
{
int arr[n];
y(arr);
}
I heard that we can templatize the function and make the size a template parameter, but I am unable to do so.

Simple: don't. Use std::array or std::vector instead:
int get_max(std::vector<int> & vec) {//Could use const& instead, if it doesn't need to be modified
int max = std::numeric_limits<int>::min();
for(int & val : vec) {if(max < val) max = val;}
return max;
}
int get_max(std::array<int, 20> & arr) {//Could use const& instead
int max = std::numeric_limits<int>::min();
for(int & val : arr) {if(max < val) max = val;}
return max;
}
If you want this to work for any std::array or any std::vector, you can template them like so:
template<typename T>
T get_max(std::vector<T> const& vec) {
if(vec.size() == 0) throw std::runtime_error("Vector is empty!");
T const* max = &vec[0];
for(T const& val : vec) if(*max < val) max = &val;
return *max;
}
template<typename T, size_t N>
T get_max(std::array<T, N> const& arr) {
static_assert(N > 0, "Array is empty!");
T const* max = &arr[0];
for(T const& val : arr) if(*max < val) max = &val;
return *max;
}
Your code should now look like this to compensate:
void y(std::vector<int> & arr) //Can be const& if you don't need to modify it.
{}
void x(Node * tree, int n)
{
std::vector<int> arr(n); //Will initialize n elements to all be 0.
y(arr);
}

This answer illustrates how to work with a VLA in C++ when passing it as a function parameter.
In C99, the syntax allows you to pass the size of the array as a parameter to the function, and use the function parameter to declare the size of the VLA:
void y (int n, int (*arr)[n])
{}
void x (int n)
{
int arr[n];
y(n, &arr);
}
C++ uses "function name mangling" as a technique to encode the parameter types accepted by the function into the function name to support function overloading. However, in GCC, since VLA is not a C++ supported feature, there is no mangling convention for it. One could argue this is a G++ bug (or incomplete support of the VLA extension), but it is what it is. To mimic the pass by reference, accept the decayed pointer as the parameter, and cast it to a reference to the VLA.
void y(int n, int *x)
{
int (&arr)[n] = reinterpret_cast<int (&)[n]>(*x);
}
void x(int n)
{
int arr[n];
y(n, arr);
}
I have verified this works for GCC 4.8.

Xirema already mentioned how to resolve this using std::vector/std::array.
I don't know your exact case, so I will just describe some other options, even though std::vector/std::array is the best choice.
Pointers option
Here you have to trust that the arr and n arguments of y are consistent, and handle the size of arr manually.
void y(int * arr, const int n) {}
void x(Node * tree, int n) {
int arr[n];
y(arr, n);
}
Templates option
This will work, however it will instantiate two templates for each new value of N, and N must be known at compile time.
template <size_t N>
void y(int (&arr)[N]) {}
template <size_t N>
void x(Node * tree) {
int arr[N];
y<N>(arr);
}
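For the templates option, the size does not have to be spelled out at the call site, because it can be deduced from the array type. A small sketch with made-up names (sum, demo_sum); it only applies to arrays whose size is a compile-time constant, not to a VLA:

```cpp
#include <cstddef>

// N is deduced from the array type, so the caller does not pass it.
template <std::size_t N>
int sum(const int (&arr)[N])
{
    int total = 0;
    for (std::size_t i = 0; i < N; ++i)
        total += arr[i];
    return total;
}

// Usage: the array size is fixed at compile time.
int demo_sum()
{
    int arr[4] = {1, 2, 3, 4};
    return sum(arr);   // N is deduced as 4
}
```

For a size that is only known at run time, std::vector passed by reference remains the simpler choice.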

The proper way to invoke function template in main() having reference parameters in C++

I have a template as below
template<typename T>
T insert(T a[], int& n ,const T& x)
{
int i;
for (i = n-1; i > 1 && x < a[i]; --i)
a[i+1] = a[i];
a[i+1] = x;
++n;
}
What's the correct way to invoke it in main()

I'm not sure that is even correctly defined. Aren't you missing :: ?
Secondly, I don't see the template definition or its name.
Thirdly you're attributing a T return type to this function, but you're not even returning anything.
A correct template should look like this:
template <typename T> class tMyExample
{
public:
tMyExample () { /* constructor code */ }
void insert (T a[], int &n, const T &x) { /* your code here */ }
};
int main (void)
{
tMyExample <int> MyExample;
return 0;
}
Well, my template definition is just the bare minimum. If you don't specialize it or add implementations outside the definition, that is another story. But the declaration inside a function (like main) is the same: template_name identifier;
Honestly, I have never used [] in parameters; I use pointers (no need for allocations if you use arrays).
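For the function template exactly as posted in the question, no class is needed: it can be called like an ordinary function and T is deduced from the arguments. Below is a sketch under two assumptions about the intent, since the question does not say: that the function is meant to insert x into a sorted array (so the loop bound is changed to i >= 0), and that nothing needs to be returned (so the return type is changed to void). demo_insert is a made-up helper.

```cpp
// Inserts x into the sorted range a[0..n) and increments n.
// Assumes the array has room for at least n + 1 elements.
template <typename T>
void insert(T a[], int& n, const T& x)
{
    int i;
    for (i = n - 1; i >= 0 && x < a[i]; --i)
        a[i + 1] = a[i];   // shift larger elements one slot to the right
    a[i + 1] = x;
    ++n;
}

// Usage: T is deduced as int from the arguments.
int demo_insert()
{
    int a[5] = {1, 3, 5, 7};   // capacity 5, four elements in use
    int n = 4;                 // must be a variable, it binds to int&
    insert(a, n, 4);           // a is now {1, 3, 4, 5, 7}
    return n;                  // n was incremented to 5
}
```

The template argument can also be given explicitly, as in insert<int>(a, n, 4), but that is only needed when deduction fails or would pick the wrong type.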

Nontype template parameter

I'm having trouble with nontype(int variable) template parameter.
Why can't I pass a constant int variable to a function and let the function instantiate the template?
template<int size>
class MyTemplate
{
// do something with size
};
void run(const int j)
{
MyTemplate<j> b; // not fine
}
int main()
{
const int i = 3;
MyTemplate<i> a; // fine;
run(i); // not fine
}
not fine : compiler says, error: 'j' cannot appear in constant-expression
EDIT
This is what I ended up with.
Maybe someone might use it, someone might suggest better way.
enum PRE_SIZE
{
PRE_SIZE_256 = 256,
PRE_SIZE_512 = 512,
PRE_SIZE_1024 = 1024,
};
template<int size>
class SizedPool : public Singleton< SizedPool<size> >
{
public:
SizedPool()
: mPool(size)
{
}
void* Malloc()
{
return mPool.malloc();
}
void Free(void* memoryPtr)
{
mPool.free(memoryPtr);
}
private:
boost::pool<> mPool;
};
template<int size>
void* SizedPoolMalloc()
{
return SizedPool<size>::GetInstance()->Malloc();
}
template<int size>
void SizedPoolFree(void* memoryPtr)
{
SizedPool<size>::GetInstance()->Free(memoryPtr);
}
void* SizedPoolMalloc(int size)
{
if (size <= PRE_SIZE_256)
return SizedPoolMalloc<PRE_SIZE_256>();
else if (size <= PRE_SIZE_512)
return SizedPoolMalloc<PRE_SIZE_512>();
return 0; // size exceeds every pool handled above
}
void toRun(const int j)
{
SizedPoolMalloc(j);
}
void Test17()
{
const int i = 3;
toRun(i);
}

Because non-type template parameters require values at compile-time. Remember that templates are a compile-time mechanism; templates do not exist in the final executable. Also remember that functions and the passing of arguments to functions are runtime mechanisms. The value of the j parameter in run() will not be known until the program actually runs and invokes the run() function, well after the compilation stage.
void run(const int j)
{
// The compiler can't know what j is until the program actually runs!
MyTemplate<j> b;
}
const int i = 3;
run(i);
That's why the compiler complains that "'j' cannot appear in constant-expression".
On the other hand, this is fine because the value of i is known at compile-time.
const int i = 3;
// The compiler knows i has the value 3 at this point,
// so we can actually compile this.
MyTemplate<i> a;
You can pass compile-time values to run-time constructs, but not the other way around.
However, you can have your run() function accept a non-type template parameter the same way your MyTemplate template class accepts a non-type template parameter:
template<int j>
void run()
{
MyTemplate<j> b;
}
const int i = 3;
run<i>();

Basically, C++ has two kinds of constants:
const int a = 5;
MyTemplate<a> foo; // OK
const int b = rand();
MyTemplate<b> foo; // Not OK.
The first example is a compile-time constant. In C++ standard speak, it's an Integral Constant Expression (ICE). The second example is a run-time constant. It has the same C++ type (const int) but it's not an ICE.
The parameter j of your function void run(const int j) is a run-time constant. You could even pass in user input. Therefore it's not a valid template argument.
The reason for the rule is that the compiler must generate code based on the template argument value. It can't do so if it doesn't have a compile-time constant.
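Since C++11 the distinction can be made explicit with constexpr. A small sketch, where MyTemplate is reduced to a struct exposing its argument and square is a made-up helper:

```cpp
template <int size>
struct MyTemplate
{
    static constexpr int value = size;
};

constexpr int square(int x) { return x * x; }

// Initialized from a constant expression: usable as a template argument.
constexpr int a = square(4);
static_assert(MyTemplate<a>::value == 16, "a is a compile-time constant");

// A function parameter is never a constant expression inside the function,
// even if every caller passes a literal, so this would not compile:
//   int bad(const int j) { return MyTemplate<j>::value; }

// Passing the value as a template argument keeps it at compile time.
template <int j>
int run()
{
    return MyTemplate<j>::value;
}
```

Marking a variable constexpr makes the compiler reject an initializer such as rand(), so the mistake shows up at the declaration instead of at the template instantiation.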

Because j should be known at compile time. In your example it is not.
