Using clang, I am able to compile OpenCL-C++ kernels (using clang -c). I am trying to load these compiled kernels into my OpenCL application, but am at a loss how to achieve that. I am using Ubuntu 22.04, with an Intel CPU and an Nvidia GPU. The GPU unfortunately does not support SPIR-V injection via clCreateProgramWithIL - if it did, I would happily take that route. I also cannot use clCreateProgramWithSource, because that unfortunately does not support C++ features inside the kernels.
Is there any way I can compile OpenCL-C++ kernels using clang and then load them into my OpenCL application? Or is there a way I can still use clCreateProgramWithSource with C++ features inside the kernels, maybe? Either way would work well! (There has been a similar question here but focusing on macOS, which has its own OpenCL implementation and compiler, as far as I know.)
Related
I'm doing academic robotics research, so we need to integrate several libraries in the field of vision, sensing, actuators.
A major obstacle is integrating these libraries with each other, since some use CUDA, others ROCm, and others OpenCL. I don't have NVidia hardware in my host machine.
I'm starting to research how to become less dependent on a particular vendor (I'm willing to sacrifice performance). There are several projects that compile CUDA to portable C++, or CUDA to OpenCL, so in my opinion having either NVidia or AMD hardware shouldn't be a blocker.
I'd suggest having these libraries in mind
https://github.com/hughperkins/coriander (convert CUDA to OpenCL to run in other cards)
https://github.com/ROCm-Developer-Tools/HIP (convert CUDA to portable C++).
Can you suggest alternatives to these? There may be better ways to use CUDA-enabled libraries on a host without NVidia hardware.
The specific case would be to run the PoseCNN library (it was built with CUDA) without CUDA or Nvidia hardware on an Ubuntu machine. https://github.com/yuxng/PoseCNN
I have a requirement to optimize an OpenCL program for AMD GPUs.
I would like to try rewriting some of the core OpenCL kernels in GCN ISA assembly, but I have to support both Windows and Linux.
I have found the ROCm Platform which looks like it can do the job for Linux, but does not support Windows.
Is there a tool chain I can use to accomplish this?
Yes, RGA (Radeon GPU Analyzer) is what you are looking for.
Version 1.4 of the product added support for OpenCL on top of AMD's LLVM-based Lightning Compiler, the OpenCL compiler for the ROCm platform.
Version 2.0 added a graphical user interface.
RGA acts as an offline compiler, so your machine doesn't have to be ROCm-capable.
Check out the RGA Releases page for more info and download links.
I'm trying to build a simple application with CUDA. I've been trying for hours on end and I just can't make it work on Windows. nvcc absolutely refuses to compile without Visual Studio's compiler, which doesn't support things I need. I tried building using nvcc with clang, but it just asks me to use Visual Studio's compiler. I've also tried using clang directly, since it now supports CUDA, but I receive this error:
clang++.exe: error: Unsupported CUDA gpu architecture: compute_52
This makes no sense to me because I have the CUDA toolkit version 7.5 and my graphics card is a GTX 970 (two of them). I have googled this extensively, and everywhere I come across the error, the person's CUDA toolkit is older than 7.5. I'm on the brink of tears right now trying to get something as simple as a VLA to work in this CUDA application and I just can't achieve it...
The CUDA windows toolchain requires the Visual Studio C++ compiler. You cannot use anything else on that platform. If the VS compiler doesn't support the language features you need within CUDA host code, you have no choice but to change platforms, or your expectations.
You can still potentially compile non-CUDA host code using another compiler and then link that code using NVCC and the VS toolchain.
Try using clang-cl as the host compiler: -ccbin=clang-cl.exe
It may be worth working in a Linux VM or WSL2 within Windows. As per the CUDA docs:
To compile new CUDA applications, a CUDA Toolkit for Linux x86 is
needed. CUDA Toolkit support for WSL is still in preview stage as
developer tools such as profilers are not available yet. However, CUDA
application development is fully supported in the WSL2 environment, as
a result, users should be able to compile new CUDA Linux applications
with the latest CUDA Toolkit for x86 Linux.
https://docs.nvidia.com/cuda/wsl-user-guide/index.html#:~:text=However%2C%20CUDA%20application%20development%20is,becomes%20available%20within%20WSL%202.
Using the AMD C++ binding and SDK (the most recent one), running an OpenCL program that gets a platform and a GPU, then compiles 4 kernels, produces the above error upon startup. It works fine on my computer, whose GPU only supports up to 1.1, but other computers seem to have the above error. Is this a problem in the compilation (as in, I have to define some macros), in the lack of a driver, the C++ binding, or something else? I don't explicitly call clRetainDevice in my own code. Is it part of the binding somewhere?
It happens when you use the C++ bindings header file together with the OpenCL 1.2 header. For instance, when you run an application compiled with the AMD SDK (OpenCL 1.2) on an NVIDIA platform (OpenCL 1.1 only).
As a quick and dirty workaround, you can just edit the AMD SDK cl.h header and undefine the "CL_VERSION_1_2" preprocessor symbol. If you are not interested in 1.2 features, it should fix your problem.
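To see at run time whether a given machine falls into the 1.1-only case, one option is to look at the platform version string. The OpenCL specification gives it the form "OpenCL <major>.<minor> <platform-specific information>". The sketch below only parses that string. The names OpenCLVersion, parse_opencl_version and supports_opencl_1_2 are made up for illustration and are not part of any SDK or of the C++ binding, and this check does not by itself change which entry points the binding calls.

```cpp
#include <cstdio>

// Hypothetical helper type: major/minor numbers taken from a platform version string.
struct OpenCLVersion {
    int major_version;
    int minor_version;
};

// Parses a string of the form "OpenCL <major>.<minor> <vendor text>".
// Returns {0, 0} when the input is null or does not match that form.
inline OpenCLVersion parse_opencl_version(const char* text) {
    OpenCLVersion v{0, 0};
    if (text == nullptr ||
        std::sscanf(text, "OpenCL %d.%d", &v.major_version, &v.minor_version) != 2) {
        return OpenCLVersion{0, 0};
    }
    return v;
}

// True when the reported platform version is 1.2 or newer.
inline bool supports_opencl_1_2(const char* platform_version) {
    const OpenCLVersion v = parse_opencl_version(platform_version);
    return v.major_version > 1 ||
           (v.major_version == 1 && v.minor_version >= 2);
}
```

In a real program the input would come from clGetPlatformInfo with CL_PLATFORM_VERSION. If supports_opencl_1_2 returns false, the platform should not be expected to export 1.2-only functions such as clRetainDevice.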
I am trying to get a program that will run on both ATI and NVidia, and as such, I want to avoid using either SDK. Is it possible to do this without an SDK, using only VS2010 and Windows (XP or 7)?
If so, how can I go about configuring VS2010 Linker so that it will work?
Strictly speaking, no SDK is needed. In fact, no SDK is desired, as both the NVIDIA and AMD/ATI SDKs tie the code to their environments, and, by extension, their hardware. What you do need is:
1) A GPU that will run OpenCL code. See this Question: List of OpenCl Compliant CPU/GPU
2) The OpenCL library (libOpenCL.so on Linux); this is usually included and installed with the Graphics driver, which may be downloaded from AMD or NVIDIA.
3) The OpenCL header files. These may be obtained from Khronos.org, but are included with all OpenCL SDKs that I am aware of. On a Linux system these typically go in the directory /usr/include/CL
The NVIDIA and AMD SDKs provide a number of utilities and wrappers that make using the OpenCL API easier, but they are not required for writing OpenCL code, or for making API calls. These wrappers and utilities are not portable. If you're interested in writing portable code, stick to the OpenCL spec, also available from Khronos.org.
To write code, all that you need to do is include opencl.h in your host program, and then make the API calls that are necessary to set up the OpenCL environment and run your OpenCL program. Also, don't forget to link against the OpenCL library (give gcc the -lOpenCL flag under Linux).
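If it is unclear whether the driver actually installed the OpenCL library (item 2 above), a small probe can load it at run time and ask how many platforms it exposes. This is only a sketch under assumptions that are not in the answer: a Linux system, the library name libOpenCL.so.1, and a hypothetical helper called count_opencl_platforms. It declares the clGetPlatformIDs signature by hand, so it builds without the OpenCL headers and without -lOpenCL (older glibc versions may need -ldl).

```cpp
#include <cstdint>
#include <dlfcn.h>

// Hypothetical helper: returns the number of OpenCL platforms reported by the
// given library, or -1 if the library cannot be loaded, the symbol is missing,
// or the call reports an error.
// Assumption: the default name "libOpenCL.so.1" is the usual Linux soname.
inline int count_opencl_platforms(const char* library_name = "libOpenCL.so.1") {
    void* lib = dlopen(library_name, RTLD_NOW | RTLD_LOCAL);
    if (lib == nullptr) {
        return -1;  // library not installed or not on the loader path
    }

    // Hand-written equivalent of:
    //   cl_int clGetPlatformIDs(cl_uint, cl_platform_id*, cl_uint*);
    // cl_int / cl_uint are 32-bit integers and cl_platform_id is an opaque pointer.
    using GetPlatformIDsFn =
        std::int32_t (*)(std::uint32_t, void**, std::uint32_t*);
    auto get_platform_ids =
        reinterpret_cast<GetPlatformIDsFn>(dlsym(lib, "clGetPlatformIDs"));
    if (get_platform_ids == nullptr) {
        return -1;  // loaded something, but it is not an OpenCL library
    }

    std::uint32_t platform_count = 0;
    // Passing 0 entries and a null array only queries the count; 0 is CL_SUCCESS.
    if (get_platform_ids(0, nullptr, &platform_count) != 0) {
        return -1;
    }
    // The library is intentionally left loaded for the lifetime of the process.
    return static_cast<int>(platform_count);
}
```

A result of -1 suggests the driver or ICD loader is missing or unusable. Any positive value means the normal route of including the headers and linking with -lOpenCL should find at least one platform.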
OpenCL is a standard. It only defines conventions. To use it, you need a driver for your graphics card. NVidia, AMD (ATI) and Apple all provide such drivers. You definitely need an SDK.
@virtuallinux alludes to the right answer: If you're worried about accidentally using some vendor-specific extensions, get the Khronos SDK.