Use Intel OpenCL.dll alongside an NVIDIA CUDA installation - c++

I have a computer that has an Intel CPU and an NVIDIA GPU, running Windows 7. I have a software module that is written in NVIDIA CUDA, and another module written in OpenCL. I would like to run the OpenCL module on the CPU, using the Intel implementation of OpenCL, and at the same time, use the CUDA module.
On my system I installed the CUDA SDK first, and then the SDK from Intel.
I've compiled the program in Visual Studio 2012, instructing the linker to use Intel's library (and I compiled against the OpenCL headers provided by Intel).
However, when I run a simple program to query the hardware, I'm only able to see the NVIDIA card.
I've tried modifying the Windows Registry and the PATH variable, with no luck. When I query the dependencies with Dependency Walker, I see that the program depends on a DLL located in c:\windows\system32, which is not the folder where the Intel DLL is. I've tried deleting this DLL, but I still see this dependency, and I'm only able to access the GPU.
Any idea about what could be happening?

On Windows, "OpenCL.dll" is the ICD provided by Khronos and redistributed by AMD, NVIDIA and Intel.
The actual drivers are referenced by the Registry, and the ICD enumerates them all.
When you query the OpenCL platforms, you'll see one for each installed driver (AMD, NVIDIA, Intel).
Within each platform there will be devices (or device), for example, in the NVIDIA platform you'll find your NVIDIA GPU and under the Intel platform you'll find your CPU.
Don't replace OpenCL.dll.
Run clinfo or GPU-Z to see which OpenCL platforms and devices are visible.
Re-install the Intel CPU driver (a new one was just posted 2 days ago) to make sure their driver is installed.
Note: Your CPU needs to have SSE 4.2 for the Intel CPU driver to work.
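If you prefer to verify this from your own code rather than with clinfo, a minimal enumeration sketch (plain OpenCL C API from C++, error checking omitted for brevity) could look like the following; with both drivers correctly installed you should see at least an NVIDIA and an Intel platform listed:

// Minimal OpenCL platform/device enumeration (sketch).
// Link against OpenCL.lib (the ICD loader); each installed vendor driver
// shows up as its own platform.
#include <CL/cl.h>
#include <cstdio>
#include <vector>

int main() {
    cl_uint numPlatforms = 0;
    clGetPlatformIDs(0, nullptr, &numPlatforms);
    std::vector<cl_platform_id> platforms(numPlatforms);
    clGetPlatformIDs(numPlatforms, platforms.data(), nullptr);

    for (cl_platform_id p : platforms) {
        char name[256] = {};
        clGetPlatformInfo(p, CL_PLATFORM_NAME, sizeof(name), name, nullptr);
        std::printf("Platform: %s\n", name);

        cl_uint numDevices = 0;
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, 0, nullptr, &numDevices);
        std::vector<cl_device_id> devices(numDevices);
        clGetDeviceIDs(p, CL_DEVICE_TYPE_ALL, numDevices, devices.data(), nullptr);

        for (cl_device_id d : devices) {
            char devName[256] = {};
            clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(devName), devName, nullptr);
            std::printf("  Device: %s\n", devName);
        }
    }
    return 0;
}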

You could try the Installable Client Driver (ICD) Loader. However, I have no experience with whether it works on Windows.
Or:
Since you don't want to use the GPU with OpenCL, you can simply copy the Intel OpenCL.dll into your application's directory. The application's own directory is searched first when DLLs are loaded, so even if the NVIDIA OpenCL.dll is installed in your windows\system32 directory, the Intel library is found first and therefore loaded. There may be better solutions, such as loading the DLL on demand as discussed in "Dynamically load a function from a DLL" (see the sketch below), but as a quick solution it should work.
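If you go the on-demand route, a minimal sketch might look like this; the DLL path is purely illustrative (an assumption for this example), so adjust it to wherever the Intel runtime actually installed its OpenCL.dll:

// Sketch: explicitly load a specific OpenCL.dll instead of relying on the
// default DLL search order, then resolve entry points with GetProcAddress.
#include <windows.h>
#include <CL/cl.h>
#include <cstdio>

typedef cl_int (CL_API_CALL *clGetPlatformIDs_fn)(cl_uint, cl_platform_id*, cl_uint*);

int main() {
    // Hypothetical path -- replace with the location of the Intel OpenCL.dll on your system.
    HMODULE lib = LoadLibraryA("C:\\Program Files (x86)\\Intel\\OpenCL SDK\\bin\\x64\\OpenCL.dll");
    if (!lib) { std::printf("Could not load the DLL\n"); return 1; }

    auto getPlatformIDs = reinterpret_cast<clGetPlatformIDs_fn>(
        GetProcAddress(lib, "clGetPlatformIDs"));
    if (!getPlatformIDs) { std::printf("clGetPlatformIDs not found in this DLL\n"); return 1; }

    cl_uint numPlatforms = 0;
    getPlatformIDs(0, nullptr, &numPlatforms);
    std::printf("Platforms exposed by this DLL: %u\n", numPlatforms);

    FreeLibrary(lib);
    return 0;
}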

Related

Setting up OpenCL SDKs

I have a task at uni that starts with setting up the Visual Studio environment with:
OpenCL SDKs:
AMD – AMD APP (Accelerated Parallel Processing)
NVIDIA – CUDA (Compute Unified Device Architecture)
Intel – Intel SDK for OpenCL Applications
OpenCL uses an "Installable Client Driver" (ICD) model
To allow platforms from different vendors to co-exist
Applications can choose a platform at runtime
And I don't know how to do it.
I need help, thanks.
I checked the settings by running Regedit, but I only found the defaults.
To make OpenCL available to pre-compiled programs, you simply need to install the Nvidia, AMD, or Intel GPU drivers, depending on which GPU you have (note that older Intel integrated GPUs don't support OpenCL).
For CPU OpenCL support you can install the Intel runtime (Intel CPUs only) or POCL (open source, all modern CPUs supported, but you need to compile it from source). Unfortunately, AMD no longer provides the APP SDK with CPU support (although a simple web search will still get you the executables).
All of the above automatically register the respective ICD, so you don't have to do anything special about it.
For developing OpenCL applications you need a standalone OpenCL ICD loader (.lib/.a and .dll) and the OpenCL headers (.h), which you can get from those links, though you need to compile the former yourself. Both are also provided in ready-to-use binary form in OpenCL SDKs such as the ones from Intel (which includes Intel's OpenCL CPU runtime) or AMD.
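To illustrate the "applications can choose a platform at runtime" point from the assignment, here is a minimal sketch that picks a platform by vendor string and takes a CPU device from it; the substring "Intel" and the choice of a CPU device are assumptions made just for this example:

// Sketch: select an OpenCL platform at runtime by vendor name, then create a
// context on a CPU device from that platform. Error handling kept minimal.
#include <CL/cl.h>
#include <cstdio>
#include <cstring>
#include <vector>

int main() {
    cl_uint n = 0;
    clGetPlatformIDs(0, nullptr, &n);
    std::vector<cl_platform_id> platforms(n);
    clGetPlatformIDs(n, platforms.data(), nullptr);

    for (cl_platform_id p : platforms) {
        char vendor[256] = {};
        clGetPlatformInfo(p, CL_PLATFORM_VENDOR, sizeof(vendor), vendor, nullptr);
        if (std::strstr(vendor, "Intel") == nullptr)
            continue;  // not the vendor we want

        cl_device_id cpu = nullptr;
        if (clGetDeviceIDs(p, CL_DEVICE_TYPE_CPU, 1, &cpu, nullptr) != CL_SUCCESS)
            continue;  // this platform exposes no CPU device

        cl_int err = CL_SUCCESS;
        cl_context ctx = clCreateContext(nullptr, 1, &cpu, nullptr, nullptr, &err);
        if (err == CL_SUCCESS) {
            std::printf("Created a context on the \"%s\" CPU platform\n", vendor);
            clReleaseContext(ctx);
        }
        return 0;
    }
    std::printf("No matching CPU platform found\n");
    return 1;
}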

GCN ISA assembly in an OpenCL program for both Windows and Linux

I have a requirement to optimize an OpenCL program for AMD GPUs.
I would like to try rewriting some of the core OpenCL kernels in GCN ISA assembly, but I have to support both Windows and Linux.
I have found the ROCm Platform which looks like it can do the job for Linux, but does not support Windows.
Is there a tool chain I can use to accomplish this?
Yes, RGA (Radeon GPU Analyzer) is what you are looking for.
Version 1.4 of the product added support for OpenCL on top of AMD's LLVM-based Lightning Compiler, the OpenCL compiler for the ROCm platform.
Version 2.0 added a graphical user interface.
RGA acts as an offline compiler, so your machine doesn't have to be ROCm-capable.
Check out the RGA Releases page for more info and download links.

How to run a compiled CUDA code on a machine that doesn't have the CUDA toolkit installed?

Will any memory-bound application benefit more from the high memory throughput of a Tesla (cc2.0) than from the high number of CUDA cores of a GeForce (cc5.0)?
How can I run an exe file compiled on a machine with a GeForce card on another machine with a Tesla card, without installing VS2010 and CUDA on the Tesla machine (i.e., I want the exe file to be a standalone application)?
Will any memory-bound application benefit more from the high memory throughput of a Tesla (cc2.0) than from the high number of CUDA cores of a GeForce (cc5.0)?
A memory bound CUDA application will likely run fastest on whichever GPU has higher memory bandwidth. There are certainly other factors that could affect this, but this is a reasonable general principle. I'm not sure which 2 cards you are referring to, but it's entirely possible that a particular GeForce GPU could have higher memory bandwidth than a particular Tesla GPU. The cc2.0 Tesla GPUs (e.g. M2050, C/M2070, C/M2075, M2090) probably do have higher memory bandwidth (over 100GB/s) than the cc5.0 GeForce GPUs I am aware of (e.g. GeForce GTX 750/750Ti -- less than 90GB/s).
How can I run an exe file compiled on a machine with a GeForce card on another machine with a Tesla card, without installing VS2010 and CUDA on the Tesla machine (i.e., I want the exe file to be a standalone application)?
There are a few things that are pretty easy to do, which will make it easier to move a compiled CUDA code from one machine to another.
Make sure the CUDART library is statically linked. This should be the default setting for recent CUDA versions. You can read more about it here. If you are using other libraries (e.g. CUBLAS, etc.), you will want to make sure those libraries are statically linked as well (if possible), or bundle the library (.so file on Linux, .dll on Windows) with your application.
Compile for a range of compute architectures. If you know, for example, that you only need to target cc2.0 and cc5.0, then make sure your nvcc compile command line contains switches that target both cc2.0 and cc5.0 (see the sketch after this list for an example build line). This is a fairly complicated topic, but if you review the CUDA sample codes (makefiles or VS projects) you will find examples of projects that build for a wide variety of architectures. For maximum compatibility, you probably want to make sure you are including both PTX and SASS in your executables. You can read more about it here and here.
Make sure the machines have compatible drivers. For example, if you compile a CUDA code using the CUDA 7.0 toolkit, you will only be able to run it on a machine that has a compatible GPU driver installed (the driver is a separate item from the toolkit; a GPU driver is required to make use of a GPU, but the CUDA toolkit is not). For CUDA 7, this roughly means you want an r346 or newer driver installed on any machine on which you want to run a CUDA 7-compiled code. Other CUDA toolkit versions have other associated minimum driver versions. For reference, this answer gives an idea of the approximate minimum GPU driver versions needed for some recent CUDA toolkit versions.
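Putting the last two points together, a minimal start-up check might look like the sketch below. The build line in the comment is only an example of multi-architecture -gencode switches, and treating "driver version older than runtime version" as fatal is a simplification that matches the CUDA 7-era rule of thumb:

// Sketch: sanity-check driver/runtime compatibility and device presence at startup.
// Example multi-architecture build line (cc2.0 + cc5.0, CUDART linked statically):
//   nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_50,code=sm_50 \
//        -gencode arch=compute_50,code=compute_50 --cudart=static app.cu -o app
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int runtimeVer = 0, driverVer = 0, deviceCount = 0;

    cudaRuntimeGetVersion(&runtimeVer);  // version of the runtime linked into this executable
    cudaDriverGetVersion(&driverVer);    // highest CUDA version the installed driver supports (0 if no driver)
    std::printf("CUDA runtime %d, driver supports up to %d\n", runtimeVer, driverVer);

    if (driverVer == 0 || driverVer < runtimeVer) {
        std::printf("GPU driver is missing or too old for this CUDA runtime.\n");
        return 1;
    }

    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("Found %d CUDA device(s)\n", deviceCount);
    return 0;
}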

OpenCL development on Intel CPU/GPU under Linux

I have an Intel i7 Haswell CPU, and I would like to start exploring OpenCL development. In particular, I am interested in running OpenCL code on the integrated GPU.
Unfortunately, so far I have not been able to find any SDK on Intel's site.
Could you provide some links, together with a summary of the current status of OpenCL tools for the Linux platform and Intel hardware?
I think this would be useful to many other people.
Thanks a lot!
Intel does not provide free support for OpenCL on their iGPUs under Linux - you have to buy the Intel Media Server Studio, minimum $499. On Windows, you can download a free driver to get OpenCL capability for the iGPU: https://software.intel.com/en-us/articles/opencl-drivers#philinux.
Note that you can use any OpenCL SDK you want - it doesn't have to be Intel. The SDK is only useful for building your program. For running an OpenCL program, you need an appropriate runtime (driver) from the manufacturer. The AMD SDK will give you access to the CPU as an OpenCL device, but not the iGPU.
There is an open-source OpenCL implementation for Intel GPUs on Linux called Beignet, maintained by a group of people from Intel.
Sadly, I couldn't personally try it and check whether your GPU is properly supported, but their wiki states:
Supported Targets
4th Generation Intel Core Processors ("Haswell"): currently needs a kernel patch; see the "Known Issues" section.
From the "Known Issues" section: "Beignet: self-test failed" and almost all unit tests fail. Linux 3.15 and 3.16 (commits f0a346b to c9224fa) enable the register whitelist by default but miss some registers needed for Beignet.
This can be fixed by upgrading Linux, or by disabling the whitelist:
# echo 0 > /sys/module/i915/parameters/enable_cmd_parser
On Haswell hardware, Beignet 1.0.1 to 1.0.3 also required the above workaround on later Linux versions, but this should not be required in current (after 83f8739) git master.
So, it's worth a shot. By the way, it worked well on my 3rd-generation HD4000.
Also, the toolchain and driver in question include a bunch of GPU-support test cases.
For anyone who comes across this question as I did, the existing answers have some out-of-date information; Intel now offers free drivers for Linux on the site posted above: https://software.intel.com/en-us/articles/opencl-drivers#philinux
The drivers themselves are only supported on 5th, 6th and 7th gen Core processors (and a bunch of other Celerons and Xeons, see link), with earlier processors such as 4th gen still needing the Media Server Studio.
However, they now offer a Linux Community version of Media Server Studio which is free to download.
They also have a Driver Support Matrix for Intel Media SDK and OpenCL which has some useful information about compatibility: https://software.intel.com/en-us/articles/driver-support-matrix-for-media-sdk-and-opencl
You may check Intel's open-source Beignet OpenCL library: http://arrayfire.com/opencl-on-intel-hd-iris-graphics-on-linux/
For me (Ubuntu 15.10 + Intel i5 4th-generation GPU) it works quite well.
P.S.
I should also mention that I managed to download the "media server" for Linux a couple of months ago (but haven't used it yet), so you may check that as well.

Can I install CUDA without drivers in Linux CentOS 6 (only the CUDA toolkit)?

I tried to install the CUDA toolkit without the display driver in CentOS 6. It gets installed properly. I was able to compile, but the resulting program runs without performing any operation and I get garbage values in array addition. For cudaGetDeviceCount(&count) I get a value of "0", which means I don't have any card on my machine.
You can install the CUDA toolkit without installing the driver.
You can then compile CUDA codes that use the runtime API.
You will not be able to run those codes unless you have a proper CUDA driver and GPU installed in the machine, however.
Codes that depend on the driver API will also not be compilable in this configuration on older CUDA toolkits without additional work. Newer CUDA toolkits provide stub libraries for the driver API, which can be linked against.
This answer covers the method to install the CUDA toolkit without the driver.
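As an aside, checking the return codes of the runtime calls makes this failure mode obvious. The exact error string you see depends on your setup, but a sketch like the following will report an error instead of silently leaving your output arrays full of garbage:

// Sketch: check CUDA API return codes instead of trusting the output values.
// On a machine with the toolkit but no driver/GPU, calls like this fail,
// which explains the garbage results: the kernels never actually ran.
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("Detected %d CUDA device(s)\n", count);
    return 0;
}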
If you just want to run the codes and profile the performance and other parameters, it would be helpful to install the GPGPU-Sim simulator. It doesn't need any graphics card on your machine.