How to specify OpenMP execution policy in cuda thrust calls on Windows? - c++

While porting code from Linux to Windows with Visual Studio Community 2015, I encountered a compilation error that I cannot understand.
Below is a sample program exhibiting this error. It builds a vector of doubles and then sorts it with CUDA Thrust, using OpenMP.
#include <thrust/sort.h>
#include <thrust/system/omp/execution_policy.h>
#include <chrono>
#include <random>
#include <vector>
double unit_random()
{
    static std::default_random_engine generator(std::chrono::system_clock::now().time_since_epoch().count());
    static std::uniform_real_distribution<double> distribution(double(0), double(1));
    return distribution(generator);
}
int main(int argc, char* argv[])
{
    constexpr size_t input_size = 100000;
    std::vector< double > input(input_size, 0);
    for ( size_t i = 0; i < input_size; ++i)
        input[i] = unit_random() * 1000;
    thrust::sort(thrust::omp::par, input.begin(), input.end());
    return 0;
}
Here is the error seen in the Visual Studio console (file names are shortened):
thrust/system/omp/detail/sort.inl(136): error C2146: syntax error: missing ';' before identifier 'nseg'
thrust/detail/sort.inl(83): note: see reference to function template instantiation 'void thrust::system::omp::detail::stable_sort<thrust::system::omp::detail::par_t,RandomAccessIterator,StrictWeakOrdering>(thrust::system::omp::detail::execution_policy<thrust::system::omp::detail::par_t> &,RandomAccessIterator,RandomAccessIterator,StrictWeakOrdering)' being compiled
with
[
RandomAccessIterator=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<double>>>,
StrictWeakOrdering=thrust::less<value_type>
]
thrust/system/detail/generic/sort.inl(63): note: see reference to function template instantiation 'void thrust::stable_sort<DerivedPolicy,RandomAccessIterator,StrictWeakOrdering>(const thrust::detail::execution_policy_base<DerivedPolicy> &,RandomAccessIterator,RandomAccessIterator,StrictWeakOrdering)' being compiled
with
[
DerivedPolicy=thrust::system::omp::detail::par_t,
RandomAccessIterator=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<double>>>,
StrictWeakOrdering=thrust::less<value_type>
]
thrust/detail/sort.inl(56): note: see reference to function template instantiation 'void thrust::system::detail::generic::sort<Derived,RandomAccessIterator,StrictWeakOrdering>(thrust::execution_policy<Derived> &,RandomAccessIterator,RandomAccessIterator,StrictWeakOrdering)' being compiled
with
[
Derived=thrust::system::omp::detail::par_t,
RandomAccessIterator=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<double>>>,
StrictWeakOrdering=thrust::less<value_type>
]
thrust/system/detail/generic/sort.inl(49): note: see reference to function template instantiation 'void thrust::sort<DerivedPolicy,RandomAccessIterator,thrust::less<value_type>>(const thrust::detail::execution_policy_base<DerivedPolicy> &,RandomAccessIterator,RandomAccessIterator,StrictWeakOrdering)' being compiled
with
[
DerivedPolicy=thrust::system::omp::detail::par_t,
RandomAccessIterator=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<double>>>,
StrictWeakOrdering=thrust::less<value_type>
]
thrust/detail/sort.inl(41): note: see reference to function template instantiation 'void thrust::system::detail::generic::sort<Derived,RandomAccessIterator>(thrust::execution_policy<Derived> &,RandomAccessIterator,RandomAccessIterator)' being compiled
with
[
Derived=thrust::system::omp::detail::par_t,
RandomAccessIterator=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<double>>>
]
windows_cuda_thrust_error.cc(24): note: see reference to function template instantiation 'void thrust::sort<DerivedPolicy,std::_Vector_iterator<std::_Vector_val<std::_Simple_types<double>>>>(const thrust::detail::execution_policy_base<DerivedPolicy> &,RandomAccessIterator,RandomAccessIterator)' being compiled
with
[
DerivedPolicy=thrust::system::omp::detail::par_t,
RandomAccessIterator=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<double>>>
]
thrust/system/omp/detail/sort.inl(136): error C2275: 'IndexType': illegal use of this type as an expression
thrust/system/omp/detail/sort.inl(113): note: see declaration of 'IndexType'
thrust/system/omp/detail/sort.inl(136): error C2065: 'nseg': undeclared identifier
thrust/system/omp/detail/sort.inl(142): error C2065: 'nseg': undeclared identifier
thrust/system/omp/detail/sort.inl(159): error C2065: 'nseg': undeclared identifier
========== Build: 0 succeeded, 1 failed, 1 up-to-date, 0 skipped ==========
The same code works fine on Linux.
How are we supposed to specify an OpenMP execution policy in a CUDA Thrust call on Windows? Alternatively, what am I doing wrong in this particular context?
The Thrust version used is 1.8.1. Here is an excerpt of the Thrust function, in file thrust/system/omp/detail/sort.inl, that raises the compilation errors:
template<typename DerivedPolicy,
         typename RandomAccessIterator,
         typename StrictWeakOrdering>
void stable_sort(execution_policy<DerivedPolicy> &exec,
                 RandomAccessIterator first,
                 RandomAccessIterator last,
                 StrictWeakOrdering comp)
{
    // ...
    typedef typename thrust::iterator_difference<RandomAccessIterator>::type IndexType;
    if(first == last)
        return;
    #pragma omp parallel
    {
        thrust::system::detail::internal::uniform_decomposition<IndexType> decomp(last - first, 1, omp_get_num_threads());
        // process id
        IndexType p_i = omp_get_thread_num();
        // every thread sorts its own tile
        if(p_i < decomp.size())
        {
            thrust::stable_sort(thrust::seq,
                                first + decomp[p_i].begin(),
                                first + decomp[p_i].end(),
                                comp);
        }
        #pragma omp barrier
        IndexType nseg = decomp.size(); // line 136
        // ...
    }
}

As suggested by @kangshiyin, I filed an issue on GitHub (see issue #817), and the Thrust developers found a workaround. The problem came from the way MSVC currently deals with OpenMP code, so the code provided in the question was perfectly fine.
If a similar problem arises, first try updating to the latest version of Thrust. You can also try applying the same workaround: simply add a semicolon before the line raising the error.
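To make the placement of that semicolon concrete, here is a minimal sketch. It is a hypothetical stand-alone function, not the actual Thrust patch: the name count_threads_past_barrier is made up, and the only point is the empty statement between the barrier directive and the declaration that follows it. The code also builds without OpenMP, in which case the pragmas are simply ignored.

```cpp
#include <cassert>

// Hypothetical helper (not part of Thrust): returns how many threads
// got past the barrier inside the parallel region.
int count_threads_past_barrier()
{
    int passed = 0; // shared between the threads of the region
    #pragma omp parallel
    {
        #pragma omp barrier
        ;            // workaround: empty statement right after the directive
        int one = 1; // a declaration like the one the error in the question pointed at
        #pragma omp atomic
        passed += one;
    }
    assert(passed >= 1); // at least the calling thread ran the region
    return passed;
}
```

Without OpenMP enabled the function returns 1; with OpenMP it returns the number of threads in the team.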

Related

Using lambda expression as Compare for std::set, when it's inside a vector

I want to use a lambda expression as custom Compare for a std::set of integers. There are many answers on this site explaining how to do this, for example https://stackoverflow.com/a/46128321/10774939. And indeed,
#include <vector>
#include <set>
#include <iostream>
int main() {
    auto different_cmp = [](int i, int j) -> bool {
        return j < i;
    };
    std::set<int, decltype(different_cmp)> integers(different_cmp);
    integers.insert(3);
    integers.insert(4);
    integers.insert(1);
    for (int integer : integers) {
        std::cout << integer << " ";
    }
    return 0;
}
compiles and outputs
4 3 1
as expected. However, when I try to put this set in a vector with
std::vector<std::set<int, decltype(different_cmp)>> vec_of_integers;
vec_of_integers.push_back(integers);
the compiler complains. I'm using Visual Studio 2017 and I get different compiler errors depending on the surrounding code. In the above example, it's
1>c:\program files (x86)\microsoft visual studio\2017\community\vc\tools\msvc\14.16.27023\include\utility(77): error C2664: 'void std::swap(std::exception_ptr &,std::exception_ptr &) noexcept': cannot convert argument 1 from '_Ty' to 'std::exception_ptr &'
1> with
1> [
1> _Ty=main::<lambda_48847b4f831139ed92f5310c6e06eea1>
1> ]
Most of the errors I've seen so far with this seem to have to do with copying the set.
So my question is:
Why does the above code not work and how can I make it work, while still using a locally defined lambda?
This seems to be a bug in the MS compiler, as the code compiles fine with GCC and Clang.
To make it work with the MS compiler (Visual Studio 2017), you can do this:
std::vector<std::set<int, decltype(different_cmp)>> vec_of_integers{integers};
This compiles cleanly. See here.
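Another option that does not depend on how the compiler copies the closure type is sketched below. It assumes that type-erasing the comparator with std::function is acceptable (it adds an indirect call per comparison) and it has not been checked against Visual Studio 2017 specifically. The alias DescendingSet and the function make_vec_of_sets are names made up for this example; the lambda is still defined locally.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// The comparator type is std::function, which is default constructible
// and copy assignable, unlike a lambda's closure type before C++20.
using DescendingSet = std::set<int, std::function<bool(int, int)>>;

std::vector<DescendingSet> make_vec_of_sets()
{
    // Locally defined lambda, same ordering as in the question.
    auto different_cmp = [](int i, int j) -> bool { return j < i; };

    DescendingSet integers(different_cmp);
    integers.insert(3);
    integers.insert(4);
    integers.insert(1);

    std::vector<DescendingSet> vec_of_integers;
    vec_of_integers.push_back(integers); // copies the set together with its comparator
    assert(vec_of_integers.size() == 1);
    return vec_of_integers;
}
```

Iterating over the first element of the returned vector visits 4, 3, 1, the same order as the original set.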

Visual Studio warning about function not in global namespace

I didn't really know what to write in the title, but basically I have a single .cpp, with only standard library headers included and no "using" keywords. I made my own "generate(...)" function. After including the library, Visual Studio shows me an error (where the function is being called), basically saying that it doesn't know whether to choose std::generate(...) or generate(...) because they have matching argument lists.
Is this a bug or have I missed something? I might also add that I am using VS2015.
#include <iostream>
#include <ctime>
#include <vector>
#include <algorithm>
template<typename Iter, typename Function>
Function generate(Iter begin, Iter end, Function f)
{
    while (begin != end)
    {
        *begin = f();
        ++begin;
    }
    return f;
}
class Random
{
public:
    Random(int low, int high)
        : mLow(low), mHigh(high)
    {}
    int operator()()
    {
        return mLow + rand() % (mHigh - mLow + 1);
    }
private:
    int mLow;
    int mHigh;
};
class Print
{
    void operator()(int t)
    {
        std::cout << t << " ";
    }
};
int main()
{
    srand(time(0));
    std::vector<int> intVec;
    intVec.resize(15);
    Random r(2, 7);
    generate(intVec.begin(), intVec.end(), r);
}
Error output:
1>------ Build started: Project: Functor, Configuration: Debug Win32 ------
1> Main.cpp
1>c:\users\michael sund\documents\visual studio 2015\projects\gi_cpp\functor\main.cpp(44): warning C4244: 'argument': conversion from 'time_t' to 'unsigned int', possible loss of data
1>c:\users\michael sund\documents\visual studio 2015\projects\gi_cpp\functor\main.cpp(50): error C2668: 'generate': ambiguous call to overloaded function
1> c:\users\michael sund\documents\visual studio 2015\projects\gi_cpp\functor\main.cpp(7): note: could be 'Function generate<std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>,Random>(Iter,Iter,Function)'
1> with
1> [
1> Function=Random,
1> Iter=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>
1> ]
1> c:\program files (x86)\microsoft visual studio 14.0\vc\include\algorithm(1532): note: or 'void std::generate<std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>,Random>(_FwdIt,_FwdIt,_Fn0)' [found using argument-dependent lookup]
1> with
1> [
1> _FwdIt=std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>,
1> _Fn0=Random
1> ]
1> c:\users\michael sund\documents\visual studio 2015\projects\gi_cpp\functor\main.cpp(50): note: while trying to match the argument list '(std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>, std::_Vector_iterator<std::_Vector_val<std::_Simple_types<int>>>, Random)'
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
This happens not just with VC++ (VS 2015), but with g++ 4.9+ as well. The issue here is the tricky Argument Dependent Lookup (Koenig Lookup).
It looks at the two iterators you're passing and sees the "generate" function in std because the iterators also come from the std namespace (this is the point of Argument Dependent Lookup).
This problem actually bit me at one point, when I wrote my own tie implementation that did a few things on top of what tie does. I had to call mine tye because Koenig Lookup caused the considered overloads to be ranked equally, which causes an error like this one.
Either prefix generate with :: to start lookup from the global namespace (::generate( vec.begin(), vec.end(), ... );), or name it differently.
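A small sketch of the qualified call, using the generate template from the question. The Counter functor and the fill_sequence function are made up for this example. Because the name is qualified with ::, argument-dependent lookup is not performed and std::generate is never considered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Same shape as the user-defined function in the question.
template <typename Iter, typename Function>
Function generate(Iter begin, Iter end, Function f)
{
    while (begin != end)
    {
        *begin = f();
        ++begin;
    }
    return f;
}

// Hypothetical functor: yields 0, 1, 2, ...
struct Counter
{
    int next;
    int operator()() { return next++; }
};

std::vector<int> fill_sequence(std::size_t n)
{
    std::vector<int> v(n);
    // Qualified name: no ADL, so only the global generate is a candidate.
    Counter after = ::generate(v.begin(), v.end(), Counter{0});
    assert(static_cast<std::size_t>(after.next) == n); // the functor was called once per element
    return v;
}
```

Writing the call as (generate)(v.begin(), v.end(), Counter{0}) also suppresses argument-dependent lookup, but the :: prefix states the intent more clearly.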

Why do I fail to compile VS2012 C++ code with Boost 1.54 on Win64?

I have a Win64 application compiled using VS2010 and Boost 1_54_0; everything works as expected.
I'm now moving the application to a new platform that requires libraries compiled with VS2012.
When trying to compile Boost with VS2012 (to later link it into my project), I get the following compiler warning:
1>c:\<my_path>\boost\boost_1_54_0\boost\functional\hash\hash.hpp(176): warning C6295: Ill-defined for-loop: 'unsigned int' values are always of range '0' to '4294967295'. Loop executes infinitely.
1>C:\<my_path>\boost\boost_1_54_0\boost/functional/hash/hash.hpp(201) : see reference to function template instantiation 'size_t boost::hash_detail::hash_value_unsigned<T>(T)' being compiled
1> with
1> [
1> T=boost::ulong_long_type
1> ]
1> C:\<my_path>\boost\boost_1_54_0\boost/functional/hash/hash.hpp(439) : see reference to function template instantiation 'boost::hash_detail::enable_hash_value::type boost::hash_value<boost::ulong_long_type>(T)' being compiled
1> with
1> [
1> T=boost::ulong_long_type
1> ]
(<my_path> represents the local path on my development machine, so it can be ignored)
Looking at hash.hpp around line 176, I see the following:
template <class T>
inline std::size_t hash_value_unsigned(T val)
{
    const int size_t_bits = std::numeric_limits<std::size_t>::digits;
    // ceiling(std::numeric_limits<T>::digits / size_t_bits) - 1
    const int length = (std::numeric_limits<T>::digits - 1)
        / size_t_bits;
    std::size_t seed = 0;
    // Hopefully, this loop can be unrolled.
    for(unsigned int i = length * size_t_bits; i > 0; i -= size_t_bits)
    {
        seed ^= (std::size_t) (val >> i) + (seed<<6) + (seed>>2);
    }
    seed ^= (std::size_t) val + (seed<<6) + (seed>>2);
    return seed;
}
Line #176 is the for statement: for(unsigned int i = length * size_t_bits; i > 0; i -= size_t_bits).
I don't understand what exactly the compiler is warning me about. If the condition were i >= 0, the warning would make sense (per the MSDN explanation of C6295), but the logic of this for statement looks OK to me.
What is the root cause of this warning, and how can I work around it?
P.S. As my application uses warning level 4 with warnings treated as errors, I can't compile it because of this warning.
Thanks
This was fixed in Boost: https://svn.boost.org/trac/boost/ticket/8568.
I suggest updating to Boost 1.55, patching your copy of Boost locally, or disabling that specific warning using /wd6295 or pragmas around the Boost includes.
Although it may not apply to you in this case, this is one reason why forcing warnings-as-errors in build scripts shipped with released sources is generally a bad idea: new compiler releases add new warnings.
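The pragma variant could look roughly like the sketch below. To keep it self-contained, the function from the question stands in for the line that would normally be #include <boost/functional/hash.hpp>; the helper check_hash is made up for this example. Whether a push/disable/pop placed around a template definition also covers diagnostics issued when the template is instantiated is an assumption here, so /wd6295 remains the blunter fallback.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable: 6295) // silence C6295 only for the code up to the pop
#endif

// Stand-in for the Boost include: the function shown in the question.
template <class T>
inline std::size_t hash_value_unsigned(T val)
{
    const int size_t_bits = std::numeric_limits<std::size_t>::digits;
    const int length = (std::numeric_limits<T>::digits - 1) / size_t_bits;
    std::size_t seed = 0;
    for (unsigned int i = length * size_t_bits; i > 0; i -= size_t_bits)
    {
        seed ^= (std::size_t) (val >> i) + (seed << 6) + (seed >> 2);
    }
    seed ^= (std::size_t) val + (seed << 6) + (seed >> 2);
    return seed;
}

#ifdef _MSC_VER
#pragma warning(pop) // restore the previous warning state for the rest of the file
#endif

// Hypothetical self-check: when T is no wider than std::size_t, length is 0,
// the loop body never runs, and the hash is the value itself.
inline void check_hash()
{
    assert(hash_value_unsigned(42u) == 42u);
}
```

The check also shows why the loop is harmless for the type named in the warning: when T is no wider than std::size_t, i starts at 0 and the body is skipped entirely.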

Workaround for VS 2013 SFINAE deficiencies

I'm trying to fix up a library (entityx) that does not currently compile on Windows using VS 2013. It compiles fine on Linux with gcc and also on Windows with MinGW.
It seems the problem is with SFINAE - I guess VS 2013 doesn't properly ignore substitution failures for templates.
There is a report of this issue on Microsoft Connect, here.
Before I dive into entityx, here is a sample of the problem (taken from the report at Microsoft Connect):
#include <vector>
#include <future>
using namespace std;
typedef int async_io_op;
struct Foo
{
    //! Invoke the specified callable when the supplied operation completes
    template<class R> inline std::pair < std::vector < future < R >> , std::vector < async_io_op >> call(const std::vector<async_io_op> &ops, const std::vector < std::function < R() >> &callables);
    //! Invoke the specified callable when the supplied operation completes
    template<class R> std::pair < std::vector < future < R >> , std::vector < async_io_op >> call(const std::vector < std::function < R() >> &callables) { return call(std::vector<async_io_op>(), callables); }
    //! Invoke the specified callable when the supplied operation completes
    template<class R> inline std::pair<future<R>, async_io_op> call(const async_io_op &req, std::function<R()> callback);
    //! Invoke the specified callable when the supplied operation completes
    template<class C, class... Args> inline std::pair<future<typename std::result_of<C(Args...)>::type>, async_io_op> call(const async_io_op &req, C callback, Args... args);
};
int main(void)
{
    Foo foo;
    std::vector<async_io_op> ops;
    std::vector < std::function < int() >> callables;
    foo.call(ops, std::move(callables));
    return 0;
}
I get the following error when attempting to compile this:
error C2064: term does not evaluate to a function taking 0 arguments
c:\program files (x86)\microsoft visual studio 12.0\vc\include\xrefwrap 58
Apparently, we can use std::enable_if to work around this problem. However, I can't figure out how.
Does anyone know how I can fix this compile error?
Edit: Full output from VS 2013:
1>------ Build started: Project: VS2013_SFINAE_Failure, Configuration: Debug Win32 ------
1> test_case.cpp
1>c:\program files (x86)\microsoft visual studio 12.0\vc\include\xrefwrap(58): error C2064: term does not evaluate to a function taking 0 arguments
1> c:\program files (x86)\microsoft visual studio 12.0\vc\include\xrefwrap(118) : see reference to class template instantiation 'std::_Result_of<_Fty,>' being compiled
1> with
1> [
1> _Fty=std::vector<std::function<int (void)>,std::allocator<std::function<int (void)>>>
1> ]
1> c:\users\jarrett\downloads\vs2013_sfinae_failure\vs2013_sfinae_failure\test_case.cpp(25) : see reference to class template instantiation 'std::result_of<std::vector<std::function<int (void)>,std::allocator<_Ty>> (void)>' being compiled
1> with
1> [
1> _Ty=std::function<int (void)>
1> ]
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========
VS2013 compiles the code if you replace std::result_of with its decltype + std::declval equivalent. So change the last Foo::call() definition to
template<class C, class... Args>
inline pair<future<decltype(declval<C>()(declval<Args>()...))>, async_io_op>
call(const async_io_op& req, C callback, Args... args);
If I understand the error correctly, it's related to the defect described here, but then it's surprising that both GCC and clang manage to compile the result_of code without errors.
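The same decltype + std::declval spelling, reduced to a free function so it can be tried in isolation. This is a sketch: call_now, add and AddResult are names made up for this example and are not part of entityx or the Connect report. If the expression inside decltype is ill-formed for some C and Args, the substitution failure happens in the function's declared type and the overload is simply discarded.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical helper: invokes callback(args...). The return type is spelled
// with decltype + std::declval instead of std::result_of<C(Args...)>::type.
template <class C, class... Args>
auto call_now(C callback, Args... args)
    -> decltype(std::declval<C>()(std::declval<Args>()...))
{
    return callback(args...);
}

inline int add(int a, int b) { return a + b; }

// The type computed by the decltype expression for a call to add(int, int).
using AddResult = decltype(std::declval<int (*)(int, int)>()(std::declval<int>(), std::declval<int>()));
static_assert(std::is_same<AddResult, int>::value, "calling add(int, int) yields int");

// Small self-check of the helper.
inline void check_call_now()
{
    assert(call_now(add, 2, 3) == 5);
}
```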

Filling std::map with std::transform (error: cannot deduce argument)

I'm trying to fill a std::map with std::transform. The following code compiles without error:
std::set<std::wstring> in; // "in" is filled with data
std::map<std::wstring, unsigned> out;
std::transform(in.begin(), in.end()
    , boost::counting_iterator<unsigned>(0)
    , std::inserter(out, out.end())
    , [] (std::wstring _str, unsigned _val) { return std::make_pair(_str, _val); }
);
But if I replace the line
, [] (std::wstring _str, unsigned _val) { return std::make_pair(_str, _val); }
with
, std::make_pair<std::wstring, unsigned>
or
, std::ptr_fun(std::make_pair<std::wstring, unsigned>)
I get errors:
foo.cpp(327): error C2784: '_OutTy *std::transform(_InIt1,_InIt1,_InTy (&)[_InSize],_OutTy (&)[_OutSize],_Fn2)' : could not deduce template argument for '_InTy (&)[_InSize]' from 'boost::counting_iterator<Incrementable>'
with
[
Incrementable=unsigned int
]
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\algorithm(1293) : see declaration of 'std::transform'
foo.cpp(327): error C2784: '_OutTy *std::transform(_InIt1,_InIt1,_InIt2,_OutTy (&)[_OutSize],_Fn2)' : could not deduce template argument for '_OutTy (&)[_OutSize]' from 'std::insert_iterator<_Container>'
with
[
_Container=std::map<std::wstring,unsigned int>
]
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\algorithm(1279) : see declaration of 'std::transform'
foo.cpp(327): error C2784: '_OutIt std::transform(_InIt1,_InIt1,_InTy (&)[_InSize],_OutIt,_Fn2)' : could not deduce template argument for '_InTy (&)[_InSize]' from 'boost::counting_iterator<Incrementable>'
with
[
Incrementable=unsigned int
]
C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\algorithm(1267) : see declaration of 'std::transform'
foo.cpp(327): error C2914: 'std::transform' : cannot deduce template argument as function argument is ambiguous
and so on...
Please explain what the problem with compilation is.
UPDATE: Thanks for the answers. I realized that it is an MSVC2010 bug. By the way, the line
&std::make_pair<const std::wstring&, const unsigned&>
causes the same error.
This is a bug in the Visual C++ library implementation. The following program demonstrates the issue:
#include <utility>
int main()
{
    &std::make_pair<int, int>;
}
This program yields the error:
error C2568: 'identifier' : unable to resolve function overload
In the C++ language specification, make_pair is not an overloaded function. The Visual C++ 2010 library implementation includes four overloads taking various combinations of lvalue and rvalue references.
While an implementation of the C++ Standard Library is allowed to add overloads for member functions, it isn't allowed to add overloads for nonmember functions, thus this is a bug.
The bug was reported and will be fixed in the next version of Visual C++. However, as STL notes in the resolution to that bug, you'll need to make the template arguments to std::make_pair lvalue references:
&std::make_pair<const std::wstring&, const unsigned&>
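A way to avoid naming a particular std::make_pair specialization at all is to keep the call inside a lambda, as the working version in the question does, so that ordinary argument deduction picks the overload. The sketch below does this without Boost: a std::vector filled by std::iota stands in for boost::counting_iterator, and the function name number_strings is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <numeric>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Maps each string of the (sorted) set to its position in the set.
std::map<std::wstring, unsigned> number_strings(const std::set<std::wstring>& in)
{
    // Stand-in for boost::counting_iterator<unsigned>(0): 0, 1, 2, ...
    std::vector<unsigned> indices(in.size());
    std::iota(indices.begin(), indices.end(), 0u);

    std::map<std::wstring, unsigned> out;
    std::transform(in.begin(), in.end(),
                   indices.begin(),
                   std::inserter(out, out.end()),
                   // make_pair is called, not addressed, so no overload has to be
                   // selected by hand and no template arguments are spelled out.
                   [](const std::wstring& str, unsigned val) { return std::make_pair(str, val); });
    assert(out.size() == in.size()); // set elements are unique, so every insert succeeds
    return out;
}
```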
g++ 4.4.5 compiles it without errors. Seems to be a Visual Studio 10 defect.