Are there OpenCL bindings that aren't written in C/C++ style? - c++

The current OpenCL C++ bindings in CL/cl.hpp are a very thin wrapper over the C OpenCL API. I understand reasons why it was done this way, although I actually really don't.
Are there any existing alternative wrappers which rely on exceptions as error handling, allowing one to just write code like this:
auto platform_list = cl::Platform::get();
because, well, RVO and readability and such, instead of the current
std::vector<cl::Platform> platform_list;
auto error = cl::Platform::get(&platform_list);
if(error != CL_SUCCESS) { /* error handling omitted */ }
Or if one opts in on exception handling (by defining __CL_ENABLE_EXCEPTIONS):
std::vector<cl::Platform> platform_list;
cl::Platform::get(&platform_list);
Note the actual error handling code is not shown, although in the non-exceptions case this can get quite messy.
I'm sure such bindings would not be terribly hard to write, but edge cases remain edge cases and I'd prefer a solid pre-written wrapper. Call me spoiled, but if C++ bindings do not offer a real C++ interface, I don't really see the point of them.

Check out the Boost.Compute library. It is header-only and provides a high-level C++ API for GPGPU/parallel-computing based on OpenCL.
Getting the list of platforms looks like this:
for(auto platform : boost::compute::system::platforms()){
std::cout << platform.vendor() << std::endl;
}
And it uses exceptions for error handling (which vastly reduces the amount of explicit checking required and gives much nicer error messages on failure):
try {
// attempt to compile the program
program.build();
}
catch(boost::compute::opencl_error &e){
// program failed to compile, print out the build log
std::cout << program.build_log() << std::endl;
}
On top of all that it also offers a STL-like interface with containers like vector<T> and array<T, N> as well as algorithms like sort() and transform() (along with other features like random number generation and lambda expression support).
For example, to sort a vector of floats on the device you just:
// vector of floats on the device
boost::compute::vector<float> vec = ...;
// sort the vector
boost::compute::sort(vec.begin(), vec.end(), queue);
// copy the sorted vector back to the host
boost::compute::copy(vec.begin(), vec.end(), host_vec.begin(), queue);
There are more tutorials and examples in the documentation.

The C++ wrappers are designed to be just a thin layer on top of OpenCL so they can be included just as a header file. There are some C++/OpenCL libraries that offer various kinds of support for C++, such as AMD Bolt.
There is a proposal for a layer/library for C++, SYCL. It is slightly more complex than a wrapper, as it requires a device compiler to produce OpenCL kernels, but provides (IMHO) nice abstractions and exception handling.
The provisional specification is already available, and there is already a (work in progress) open source implementation.
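Short of adopting one of these libraries, the return-by-value, throw-on-error style asked for in the question can be layered over a status-code API in a few lines. The following is only a sketch of the idea: fake_get_ids is a hypothetical stand-in that mimics the two-call "query count, then fill" pattern of functions such as clGetPlatformIDs, and api_error, check and get_ids are illustrative names, not part of any OpenCL binding.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical C-style API (stand-in for something like clGetPlatformIDs):
// returns 0 on success, writes up to `capacity` ids into `out`, and reports
// the total number of ids through `count`.
inline int fake_get_ids(unsigned capacity, int* out, unsigned* count) {
    static const int ids[] = {10, 20, 30};
    if (out == nullptr && count == nullptr) return -30;  // nothing requested
    if (count != nullptr) *count = 3;
    if (out != nullptr)
        for (unsigned i = 0; i < capacity && i < 3; ++i) out[i] = ids[i];
    return 0;
}

// Exception type that keeps the raw status code available to callers.
class api_error : public std::runtime_error {
public:
    api_error(const std::string& what, int code)
        : std::runtime_error(what + " failed with status " + std::to_string(code)),
          code_(code) {}
    int code() const noexcept { return code_; }
private:
    int code_;
};

// Turns a non-zero status into an exception.
inline void check(int status, const char* what) {
    if (status != 0) throw api_error(what, status);
}

// Value-returning wrapper: query the count, then fill a vector of that size.
inline std::vector<int> get_ids() {
    unsigned count = 0;
    check(fake_get_ids(0, nullptr, &count), "fake_get_ids(count)");
    std::vector<int> ids(count);
    if (count != 0) check(fake_get_ids(count, ids.data(), nullptr), "fake_get_ids(fill)");
    return ids;  // returned by value; the copy is elided or moved
}
```

With this in place a caller can write auto ids = get_ids(); and handle api_error in one place. The edge cases of a real API (partial results, invalid handles, build logs) are exactly what a pre-written wrapper would have to cover and are not addressed here.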


C++ std::filesystem::filesystem_error exception trying to read system volume information, etc

I am trying to work around an exception being thrown when trying to recursively walk through all files in root drives, like C:, D:, etc. I am using GCC compiler version 9.3.0 on Mingw64.
I got std::filesystem::filesystem_error when trying to read system volume information, example output:
Checking "D:\\System Volume Information"
filesystem error: cannot increment recursive directory iterator: Invalid argument
Code snippet:
try {
for (auto& p : fs::recursive_directory_iterator(dp, fs::directory_options::skip_permission_denied)) {
cout << "Checking " << p.path() << endl;
string path = p.path().string();
if (fs::is_regular_file(p) && p.path().extension() == ".xyz") {
files.push_back(p.path().string());
}
}
}
catch (fs::filesystem_error &e) {
// How to go back, skip this, and resume?
cerr << e.what() << endl;
}
What I would like to do is skip over these exceptions. Does anyone know how to do that?
Thank you!
Since your error refers to incrementing the recursive_directory_iterator, the error appears to come from the for statement itself, not from the code in the loop body. The range-based for statement internally increments the iterator (operator++) on every pass.
To me, this feels like an error in the implementation of recursive_filesystem_iterator, and your code should have worked with no exceptions. But reading the standard closely, I suppose there are enough ambiguities for an implementation to say that the behavior you see still conforms to the standard.
I don't have an official copy of the C++17 standard, so the references I give here are to the freely-available draft n4659.pdf.
At 30.10.2.1 POSIX conformance, it says
Implementations that do not support exact POSIX behavior are encouraged to provide
behavior as close to POSIX behavior as is reasonable given the limitations of actual
operating systems and file systems. If an implementation cannot provide any reasonable
behavior, the implementation shall report an error as specified in 30.10.7. [Note:This
allows users to rely on an exception being thrown or an error code being set when an
implementation cannot provide any reasonable behavior.— end note]
Implementations are not required to provide behavior that is not supported by a
particular file system. [Example: The FAT file system used by some memory cards, camera
memory, and floppy disks does not support hard links, symlinks, and many other features
of more capable file systems, so implementations are not required to support those
features on the FAT file system but instead are required to report an error as described
above.— end example]
So an attempt to iterate into D:\System Volume Information may fail and throw an exception if the underlying filesystem won't let you do that.
Your constructor specifies directory_options::skip_permission_denied. It seems like this should be sufficient to avoid the exception.
In 30.10.14.1 recursive_directory_iterator members for operator++ it says:
...then either directory(*this)->path() is recursively iterated into or, if
(options() & directory_options::skip_permission_denied) != directory_options::none
and an error occurs indicating that permission to access directory(*this)->path() is denied,
then directory(*this)->path() is treated as an empty directory and no error is reported.
The actual exception that you get doesn't say "permission denied", so I guess it could be argued that the skip_permission_denied option does not apply to it. This would allow the implementation of operator++ to throw an exception in this case. I don't like this interpretation, since it seems like the whole idea of skip_permission_denied is to avoid exceptions like this. But it's not up to me. :)
Aside from perhaps submitting a defect report to your standard library implementation, what can you do? You could write out an old-style for loop and use the increment method of recursive_directory_iterator. The increment method reports failure through a std::error_code rather than by throwing an exception. So your code would look something like:
auto iter = fs::recursive_directory_iterator(dp, fs::directory_options::skip_permission_denied);
auto end_iter = fs::end(iter);
auto ec = std::error_code();
for (; iter != end_iter; iter.increment(ec))
{
if (ec)
{
continue;
}
// The rest of your loop code here...
}
I think the above looks reasonable, but definitely needs testing to make sure there's not some weird corner case where you get an infinite loop or something. Actually I'm not so sure the continue stuff is even needed, but you may want to experiment with it.
Finally, when you catch a filesystem_error, you could print e.path1().native() in addition to e.what() (path1 is a member function, and native() yields a wide string on Windows, so e.path1().string() may be easier to stream to cerr). I think you already mostly know that information because you're printing the path within your loop. But it might provide more information in some cases.
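As a small illustration of that last point, a caught filesystem_error can be turned into a more detailed message like this. The helper name describe and the output format are made up for the example; path1(), path2() and code() are the accessors the exception class provides.

```cpp
#include <filesystem>
#include <sstream>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Builds a diagnostic string from a filesystem_error, including both
// paths carried by the exception and the numeric error code.
inline std::string describe(const fs::filesystem_error& e) {
    std::ostringstream out;
    out << e.what()
        << " [path1=" << e.path1().string()   // string() is narrow on every platform
        << ", path2=" << e.path2().string()   // empty unless the operation had two paths
        << ", code=" << e.code().value() << "]";
    return out.str();
}
```

Inside the catch block from the question this would be used as cerr << describe(e) << endl;.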

Implement Asynchronous Lazy Generator in C++

My intention is to use a generic interface for iterating over files from a variety of I/O sources. For example, I might want an iterator that, authorization permitting, will lazily open every file on my file system and return the open file handle. I'd then want to use the same interface for iterating over, perhaps, objects from an AWS S3 bucket. In this latter case, the iterator would download each object/file from S3 to the local file system, then open that file, and again return a file handle. Obviously the implementation behind both iterator interfaces would be very different.
I believe the three most important design goals are these:
1. For each iter++ invocation, a std::future or PPL pplx::task is returned representing the requested file handle. I need the ability to do the equivalent of the PPL choice(when_any), because I expect to have multiple iterators running simultaneously.
2. The custom iterator implementation must be durable / restorable. That is, it periodically records where it is in a file system scan (or S3 bucket scan, etc.) so that it can attempt to resume scanning from the last known position in the event of an application crash and restart.
3. Best effort to not go beyond C++11 (and possibly C++14).
I'd assume to make the STL input_iterator my point of departure for an interface. After all, I see this 2014 SO post with a simple example. It does not involve IO, but I see another article from 2001 that allegedly does incorporate IO into a custom STL iterator. So far so good.
Where I start to get concerned is when I read an article like "Generator functions in C++". Ack! That article gives me the impression that I can't achieve my intent to create a generator function, disguised as an iterator, possibly not without waiting for C++20. Likewise, this other 2016 SO post makes it sound like it is a hornets-nest to create generator functions in C++.
While the implementation for my custom iterators will be complex, perhaps what those last two links were tackling was something beyond what I'm trying to achieve. In other words, perhaps my plan is not flawed? I'd like to know what barriers I'm fighting if I assume to make a lazy-generator implementation behind a custom input_iterator. If I should be using something else, like Boost iterator_facade, I'd appreciate a bit of explanation around "why". Also, I'd like to know if what I'm doing has already been implemented elsewhere. Perhaps the PPL, which I've only just started to learn, already has a solution for this?
p.s. I gave the example of an S3 iterator that lazily downloads each requested file and then returns an open file handle. Yes I know this means the iterator is producing a side effect, which normally I would want to avoid. However, for my intended purpose, I'm not sure of a more clean way to do this.
Have you looked at the Coroutines TS? It is coming with C++20 and allows what you are looking for.
Some compilers (GCC 10, MSVC) already have some support.
Specific library features on top of standard coroutines that may interest you:
generator<T>
cppcoro::generator<const std::uint64_t> fibonacci()
{
std::uint64_t a = 0, b = 1;
while (true)
{
co_yield b;
auto tmp = a;
a = b;
b += tmp;
}
}
void usage()
{
for (auto i : fibonacci())
{
if (i > 1'000'000) break;
std::cout << i << std::endl;
}
}
A generator represents a coroutine type that produces a sequence of values of type, T, where values are produced lazily and synchronously.
The coroutine body is able to yield values of type T using the co_yield keyword. Note, however, that the coroutine body is not able to use the co_await keyword; values must be produced synchronously.
async_generator<T>
An async_generator represents a coroutine type that produces a sequence of values of type, T, where values are produced lazily and values may be produced asynchronously.
The coroutine body is able to use both co_await and co_yield expressions.
Consumers of the generator can use a for co_await range-based for-loop to consume the values.
Example
cppcoro::async_generator<int> ticker(int count, threadpool& tp)
{
for (int i = 0; i < count; ++i)
{
co_await tp.delay(std::chrono::seconds(1));
co_yield i;
}
}
cppcoro::task<> consumer(threadpool& tp)
{
auto sequence = ticker(10, tp);
for co_await(std::uint32_t i : sequence)
{
std::cout << "Tick " << i << std::endl;
}
}
Sidenote: Boost.Asio has had experimental support for the Coroutines TS for several releases, so if you want you can combine the two.
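If coroutines are not an option (the question asks to stay within C++11/14), the synchronous, lazy half of this can be approximated with a hand-written input iterator that keeps the generator state in ordinary data members. The sketch below mirrors the Fibonacci generator above. fibonacci_range is an illustrative name, the limit parameter is an addition so that the sequence ends, and nothing here addresses the asynchronous (future-returning) part of the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>

// Lazily produces Fibonacci numbers not exceeding `limit`.
// A value is computed only when the iterator is incremented.
class fibonacci_range {
public:
    class iterator {
    public:
        using iterator_category = std::input_iterator_tag;
        using value_type = std::uint64_t;
        using difference_type = std::ptrdiff_t;
        using pointer = const std::uint64_t*;
        using reference = const std::uint64_t&;

        iterator() : a_(0), b_(0), limit_(0), done_(true) {}  // end sentinel
        explicit iterator(std::uint64_t limit)
            : a_(0), b_(1), limit_(limit), done_(limit < 1) {}

        reference operator*() const { return b_; }

        iterator& operator++() {
            // Advance the generator state by one step.
            const std::uint64_t next = a_ + b_;
            a_ = b_;
            b_ = next;
            if (b_ > limit_) done_ = true;
            return *this;
        }
        iterator operator++(int) { iterator tmp = *this; ++*this; return tmp; }

        friend bool operator==(const iterator& x, const iterator& y) {
            if (x.done_ || y.done_) return x.done_ == y.done_;
            return x.a_ == y.a_ && x.b_ == y.b_;
        }
        friend bool operator!=(const iterator& x, const iterator& y) { return !(x == y); }

    private:
        std::uint64_t a_, b_, limit_;
        bool done_;
    };

    explicit fibonacci_range(std::uint64_t limit) : limit_(limit) {}
    iterator begin() const { return iterator(limit_); }
    iterator end() const { return iterator(); }

private:
    std::uint64_t limit_;
};
```

A loop such as for (auto v : fibonacci_range(100)) then visits 1, 1, 2, 3, ... up to 89. Because the state is just a few members, it can be written out and restored, which is the property the durability goal in the question depends on. The cost compared with a coroutine is that the "where am I" bookkeeping has to be written by hand.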

Is it possible to run piece of pure C++ code in GPU

I don't know OpenCL very well, but I know the C/C++ API requires the programmer to provide OpenCL code as a string. Lately I discovered the ArrayFire library, which doesn't require code strings to invoke some calculations. I wonder how it works (it is open source, but the code is a bit confusing). Would it be possible to write a parallel for with an OpenCL backend that invokes any piece of compiled (x86, for example) code, like the following:
template <typename F>
void parallel_for(int starts, int ends, F task) //API
{ /*some OpenCL magic */ }
//...
parallel_for(0, 255, [&tab](int i){ tab[i] *= 0.7; } ); //using
PS: I know I am 99% likely to be too optimistic.
You cannot really call C++ Host code from the device using standard OpenCL.
You can use SYCL, the Khronos standard for single-source C++ programming. SYCL lets you compile C++ directly into device code without requiring OpenCL source strings. You can call any C++ function from inside a SYCL kernel (as long as the source code is available and it stays within SYCL's device-code restrictions). SYCL.tech has more links and updated information.

How can I implement interop between OCaml and C++?

I want to create a bridge between OCaml and C++. For instance I want to use some constructions written in OCaml in C++.
How can I achieve this? Are there any libraries, bindings for this?
You should read the relevant part of the language manual: Interfacing C with OCaml. It is quite detailed even if, by nature, painfully low-level.
If you don't need tight communication between C++ and OCaml code (e.g. you interface GUI code and computation code, but the computationally intensive kernel of your application does not cross application boundaries, or at least the cost of communication is expected to be negligible compared to the time spent on either side), I would recommend that you explore simpler ways, where C++ and OCaml code run in separate processes and exchange information through message passing (in whatever format is most convenient to define: text, s-expressions, binary format, JSON, etc.). I would only try to bridge code in the same process if I were sure the simpler approach cannot work.
Edit: since I wrote this answer last year, the Ctypes library emerged, from Jeremy Yallop; it is a very promising approach that may be significantly simpler than directly interfacing C with OCaml.
The easiest way to do this is in two steps: OCaml → C and then C → C++ using extern "C". I do this extensively in my COH*ML project, which binds OCaml with the Coherence library in C++. For example, in OCaml I have:
type coh_ptr (* Pointer to a Cohml C++ object *)
external coh_getcache: string -> coh_ptr = "caml_coh_getcache"
Then in C++, first a C function:
extern "C" {
value caml_coh_getcache(value cn) {
CAMLparam1(cn);
char* cache_name = String_val(cn);
Cohml* c;
try {
c = new Cohml(cache_name);
} catch (Exception::View ce) {
raise_caml_exception(ce);
}
value v = caml_alloc_custom(&coh_custom_ops, sizeof(Cohml*), 0, 1);
Cohml_val(v) = c;
CAMLreturn(v);
}
}
And finally the C++ implementation:
Cohml::Cohml(char* cn) {
String::View vsCacheName = cn;
hCache = CacheFactory::getCache(vsCacheName);
}
Going the other way is basically the same principle.
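The C → C++ half of that bridge does not depend on OCaml at all, so it can be shown in isolation. The sketch below uses a made-up Greeter class and made-up function names. It illustrates the two points the code above relies on: the exported functions have C linkage and deal only in C-compatible types, and no C++ exception is allowed to escape across the boundary (the code above converts it to an OCaml exception instead; here it becomes a null handle).

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical C++ class that we want to expose to non-C++ callers.
class Greeter {
public:
    explicit Greeter(const std::string& name) : name_(name) {
        if (name.empty()) throw std::invalid_argument("empty name");
    }
    std::size_t length() const { return name_.size(); }
private:
    std::string name_;
};

extern "C" {

// Creates a Greeter and returns it as an opaque handle, or a null
// pointer on failure. Exceptions are caught here so that none
// propagates into the foreign caller.
void* greeter_new(const char* name) {
    try {
        return new Greeter(name ? name : "");
    } catch (...) {
        return nullptr;
    }
}

// Returns the stored name's length, or -1 for a null handle.
int greeter_length(const void* handle) {
    if (handle == nullptr) return -1;
    return static_cast<int>(static_cast<const Greeter*>(handle)->length());
}

// Destroys a handle obtained from greeter_new; a null handle is ignored.
void greeter_free(void* handle) {
    delete static_cast<Greeter*>(handle);
}

}  // extern "C"
```

An OCaml stub would then wrap the handle in a custom block, as caml_alloc_custom does in the answer's code, and call greeter_free from the custom block's finalizer.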

Linux g++ Embedding Prolog Logic Engine Within C++

I have some logic in a C++ program that is not only insanely complex, it also requires multiple solutions, for which Prolog is ideal. It's sort of like a firewall config script, checking input for actions, but sometimes more than one action is required.
What I want is something like this:
class PrologEngine
{
public:
// Load a file of Prolog rules, predicates, facts etc. in textual format. Must be callable
// multiple times to load AND COMPILE (for speed) Prolog rule files. Throws PrologException.
void LoadLogic(const char* filename);
// Returns a vector of matching predicates in text form. Throws PrologException.
std::vector<std::string> Evaluate(const char* predicate_in_string_form = "execute(input, Result)");
};
It needs no ability to call back into C++.
AMI Prolog seems to get it, but it's not available on Linux. I'm trying to use SWI-Prolog and can only find 2 examples and an incredibly byzantine API (my opinion).
Can anyone point me to an example that is close to what I'm looking for?
There is A C++ interface to SWI-Prolog, that's high level.
I'm fighting with it; here is an example of bridging to OpenGL:
PREDICATE(glEvalCoord1d, 1) {
double u = A1;
glEvalCoord1d( u );
return TRUE;
}
This clean code hides many 'byzantinisms', using implicit type conversions and some macros. The interface is well thought out and bidirectional: to call Prolog from C++ there is PlCall (to 'run' a query, similar to the Evaluate you describe in the question) or the more structured PlQuery, for multiple results...
If you don't need to link to OpenGL, or can wait to hear about the answer that I'll hopefully get from the SWI-Prolog mailing list, you should evaluate it.
If you don't mind rewriting the Prolog code for use in a native C++ header-only library, I'd look into the Castor library:
http://www.mpprogramming.com/cpp/