I'm trying to use a parallel kernel algorithm, but it needs to call OpenGL functions (like drawing a line) directly from the kernel. Is this possible?
The drawing target is a texture object. Thanks for any clues.
No, not really.
First of all, OpenGL drawing commands are issued on the CPU side (glDrawArrays, for instance). OpenCL kernels run on the GPU and cannot call these functions.
Additionally, in OpenCL you don't even have access to the fixed-function parts of the GPU rendering pipeline like the tessellation unit or rasterizer.
What you can do of course is create a 2D buffer in OpenCL, implement line rendering by yourself and paint "pixels" in there, but that's probably not what you want.
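As a rough illustration of "implement line rendering by yourself", here is a minimal CPU-side sketch in Python of integer Bresenham rasterization into a flat, row-major 2D buffer. The names draw_line, pixels, WIDTH and HEIGHT are made up for this example. In an actual OpenCL kernel the same loop could run once per work item (one line each), writing into a buffer or image object, but that mapping is an assumption and not something the answer above specifies.

```python
def draw_line(buf, width, height, x0, y0, x1, y1, value=1):
    """Rasterize a line into a flat row-major buffer (integer Bresenham)."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1   # step direction along x
    sy = 1 if y0 < y1 else -1   # step direction along y
    err = dx + dy               # combined error term
    while True:
        # Clip against the buffer so out-of-range points are skipped.
        if 0 <= x0 < width and 0 <= y0 < height:
            buf[y0 * width + x0] = value
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:            # error says: step in x
            err += dy
            x0 += sx
        if e2 <= dx:            # error says: step in y
            err += dx
            y0 += sy


# Hypothetical 8x4 "texture" with one line from corner to corner.
WIDTH, HEIGHT = 8, 4
pixels = [0] * (WIDTH * HEIGHT)
draw_line(pixels, WIDTH, HEIGHT, 0, 0, 7, 3)
```

The buffer could then be uploaded to a texture, or shared with OpenGL through the interoperability extensions mentioned next.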
There are also extensions for OpenGL-OpenCL interoperability, a tutorial is here: https://software.intel.com/en-us/articles/opencl-and-opengl-interoperability-tutorial
Maybe if you tell us what you want to achieve we can give you better alternatives :)
I have a texture in Unity which I will modify frequently.
Now there are two options:
The first option is to change the texture by calling SetPixels and then Texture2D.Apply. I think Apply actually copies the data from the CPU to the GPU.
The second option is to modify the texture in native code by getting the native texture handle and updating it with glTexSubImage2D.
I have read that Apply copies only the changed pixels to the GPU rather than the full texture, but I doubt that is possible. If it is true, does this mean that calling Texture2D.Apply is equivalent to glTexSubImage2D in terms of performance?
If not, what should I use if I need good performance? I would rather not go to the native side, because I would have to maintain native code for each graphics API Unity supports, such as OpenGL, DirectX, etc.
Texture2D.Apply() and glTexSubImage2D are both used to update a texture. They perform the same action, but there are differences between them.
GetPixels, SetPixels and Texture2D.Apply() are done on the CPU.
You should only use GetPixels, SetPixels and Texture2D.Apply() if you need individual pixels. Good example of this is when you want to send the Texture data over the network.
glTexSubImage2D uploads pixel data directly into the GPU texture and does not require SetPixels or GetPixels.
glTexSubImage2D is much faster than GetPixels, SetPixels and Texture2D.Apply().
"If not, what should I use if I need good performance. I actually dont want to go to native side as I will have to manage the native code on for different graphics APIs supported by Unity like opengl,"
You mentioned that you will be modifying the image frequently, so do not use GetPixels, SetPixels and Texture2D.Apply(). I know it is the easiest solution but it is very slow.
For the best performance:
1. Use glTexSubImage2D
Pass Texture.GetNativeTexturePtr() to the native C++ side as an IntPtr, then use glTexSubImage2D to modify the texture directly. I noticed that most of your questions are about C++ and OpenGL, so this shouldn't be hard for someone like you.
As for supporting different graphics APIs, the first to support is OpenGL because that's supported on all major platforms. From the Editor, change the Graphics API to OpenGL then start coding. It should work on Windows, Mac, Linux, Android and iOS. If you want to support Direct3D, Metal and Vulkan then go for them too. You just don't have to. OpenGL is enough for this.
2. Use Shaders
You can combine Unity Shaders and Compute Shaders and still get more performance than glTexSubImage2D because this will be happening on the GPU instead of CPU. I personally find shaders complicated so #1 should be your priority.
Yes, glTexSubImage2D can be used to update a smaller rectangular portion of a larger texture.
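To make the "smaller rectangular portion" idea concrete, here is a minimal Python sketch that mimics what such a sub-rectangle update does to a flat, row-major texture. The function tex_sub_image_2d is a hypothetical stand-in written for this example, not an OpenGL binding. Only the parameter names xoffset, yoffset, width, height and pixels are borrowed from the real glTexSubImage2D, and the single-value-per-texel layout is an assumption.

```python
def tex_sub_image_2d(texture, tex_width, xoffset, yoffset, width, height, pixels):
    """Copy a width x height block of texels into a flat row-major texture.

    Pure-Python stand-in for the effect of a sub-rectangle upload; it does
    not talk to any graphics API.
    """
    tex_height = len(texture) // tex_width
    if (xoffset < 0 or yoffset < 0
            or xoffset + width > tex_width
            or yoffset + height > tex_height):
        raise ValueError("sub-rectangle exceeds texture bounds")
    if len(pixels) != width * height:
        raise ValueError("pixel data does not match sub-rectangle size")
    for row in range(height):
        dst = (yoffset + row) * tex_width + xoffset  # start of row in texture
        src = row * width                            # start of row in source
        texture[dst:dst + width] = pixels[src:src + width]


# Hypothetical 4x4 texture; overwrite a 2x2 block at (1, 2).
TEX_W, TEX_H = 4, 4
texture = [0] * (TEX_W * TEX_H)
tex_sub_image_2d(texture, TEX_W, 1, 2, 2, 2, [9, 9, 9, 9])
```

Only the four touched texels change; the rest of the texture is left as it was, which is why a partial update can be cheaper than re-uploading the whole image.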
Both SDL and Game Maker have the concept of surfaces: images that you can modify on the fly and then display. I'm using OpenGL 1 and I'd like to know whether OpenGL has this concept of a surface.
The only approaches I came up with were:
Every frame create / destroy a new texture based on needs.
Every frame, update said texture based on needs.
These approaches don't seem very performant, but I see no alternative. Maybe this is how surfaces are implemented in the engines mentioned above.
Yes, these two are the ways you would do it in OpenGL 1.0. I don't think there are any other means as far as the 1.0 spec is concerned.
Link : https://www.opengl.org/registry/doc/glspec10.pdf
Do note that the textures are stored on the device memory (GPU) which is fast to access for shading. And the above approaches copy it between host (CPU) memory and device memory. Hence the performance hit is the speed of host-device copy.
Why are you limited to the OpenGL 1.0 spec? If you can go higher, you start getting more options:
Use GLSL shaders to directly edit content from one texture and output the same to another texture. Processing will be done on the GPU and a device-device copy is as fast as it gets.
Use CUDA. Map a texture to a CUDA array, use your kernel to modify the content. Or use OpenCL for non-NVIDIA cards.
These GPU-side options are the better choice as long as the modification can be executed in parallel.
I would suggest trying the CPU copy method first, as it might be fast enough for your needs. Host-device copies are getting faster with recent hardware. You might be able to reach real-time rates of 60 fps or higher even with this copy, unless you plan to do it for a lot of textures.
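To judge whether the copy is "fast enough", it can help to work out how much data per second the upload needs. The sketch below is plain arithmetic in Python. The 1024x1024 RGBA texture and the 60 fps target are assumed example numbers, and upload_bytes_per_second is a name invented here.

```python
def upload_bytes_per_second(width, height, bytes_per_pixel, fps, textures=1):
    """Bytes that must cross the host-device bus each second."""
    return width * height * bytes_per_pixel * fps * textures


# Assumed example: one 1024x1024 RGBA8 texture re-uploaded every frame at 60 fps.
rate = upload_bytes_per_second(1024, 1024, 4, 60)
rate_mib = rate / (1024 * 1024)  # 240.0 MiB/s
```

Comparing that figure with the transfer rate you actually measure on your hardware shows how many textures you can afford to update per frame.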
This is probably a stupid question, but I can't find good examples of how to approach this, or whether it's even possible. I've just finished a project where I used GDI to BitBlt content onto a DIB buffer and then swapped that onto the screen HDC, basically making my own swap chain and drawing with OpenGL.
So then I thought: can I do the same thing using DirectX 11? But I can't seem to find where the DIB/buffer I need to change even is.
Am I even thinking about this correctly? Any ideas on how to handle this?
Yes, you can. Nvidia exposes vendor-specific extensions called WGL_NV_DX_interop and WGL_NV_DX_interop2. With these extensions, you can directly access a DirectX surface (when it resides on the GPU) and render to it from an OpenGL context. There should be minimal (driver-only) overhead for this operation, and the CPU will almost never be involved.
Note that while this is a vendor-specific extension, Intel GPUs support it as well.
However, don't do this simply for the fun of it or if you control all the source code for your application. This kind of interop scenario is meant for cases where you have two legacy/complicated codebases and interop is a cheaper/better option than porting all the logic to the other API.
Yeah you can do it, both OpenGL and D3D support both writeable textures and locking them to get to the pixel data.
Simply render your scene in OpenGL to a texture, lock it, read the pixel data and pass it directly to the D3D locked texture pixel data, unlock it then do whatever you want with the texture.
Performance would be dreadful, of course: you're stalling the GPU multiple times in a single "operation" and forcing it to synchronize with the CPU (which is passing the data) and the bus (for memory access). Plus, there would be absolutely no benefit at all. But if you really want to try it, you can do it.
Is it possible to use shader for calculating some values and then return them back for further use?
For example, I send a mesh down to the GPU along with some parameters describing how it should be modified (changing the positions of vertices), and then take back the resulting mesh. That seems rather impossible to me, because I haven't seen any variable for communication from shaders to the CPU. I'm using GLSL, so there are just uniforms, attributes and varyings. Should I use an attribute or a uniform, and would they still be valid after rendering? Can I change the values of those variables and read them back on the CPU? There are methods for mapping data on the GPU, but would that data be changed and valid?
This is the way I'm thinking about it, though there could be another way that is unknown to me. I would be glad if someone could explain this to me, as I've just read some books about GLSL and would now like to program more complex shaders, and I wouldn't like to rely on methods that are impossible at this time.
Thanks
Great question! Welcome to the brave new world of General-Purpose Computing on Graphics Processing Units (GPGPU).
What you want to do is possible with pixel shaders. You load a texture (that is: data), apply a shader (to do the desired computation) and then use Render to Texture to pass the resulting data from the GPU to the main memory (RAM).
There are tools created for this purpose, most notably OpenCL and CUDA. They greatly aid GPGPU, so that this sort of programming looks almost like CPU programming.
They do not require any 3D graphics experience (although still preferred :) ). You don't need to do tricks with textures, you just load arrays into the GPU memory. Processing algorithms are written in a slightly modified version of C. The latest version of CUDA supports C++.
I recommend starting with CUDA, since it is the most mature one: http://www.nvidia.com/object/cuda_home_new.html
This is easily possible on modern graphics cards using OpenCL, Microsoft DirectCompute (part of DirectX 11), or CUDA. The normal shader languages are utilized (GLSL, HLSL for example). The first two work on both NVIDIA and ATI graphics cards; CUDA is NVIDIA-exclusive.
These are special libraries for computing on the graphics card. I wouldn't use a normal 3D API for this, although it is possible with some workarounds.
Nowadays you can use shader storage buffer objects in OpenGL to write values in shaders that can be read back on the host.
My best guess would be to point you to BehaveRT, which is a library created to harness GPUs for behavioral models. I think that if you can formulate your modifications in terms of the library, you could benefit from its abstraction.
As for passing data back and forth between your CPU and GPU, I'll let you browse the documentation; I'm not sure about it.
To what extent does OpenGL's GLSL utilize SLI setups? Is it utilized at all at the point of execution, or only for end rendering?
Similarly, I know that OpenCL is alien to SLI but assuming one has several GPUs, how does it compare to GLSL in multiprocessing?
Since it might depend on the application, e.g. common transformation, or ray tracing, can you offer insight on differences depending on application type?
The goal of SLI is to divide the rendering workload across several GPUs. First, the graphics driver uses either a sort-first decomposition or a time decomposition (GPU0 works on frame n while GPU1 works on frame n+1). Then the pixels are copied from one GPU to the other.
That said, SLI has nothing to do with the shading language used by OpenGL (the way the pixels are drawn doesn't really matter).
For OpenCL, I would say that you have to divide your workload between the GPUs yourself, but I am not sure.
If you want to take advantage of multiple GPUs with OpenCL, you will have to create command queues for each device and run kernels on each device after splitting up the workload.
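The "splitting up the workload" step can be sketched as below. This is only the host-side partitioning logic in plain Python, with no OpenCL calls. split_workload and the (offset, count) pairs are names invented for this example. In a real program, each pair would plausibly become the global work offset and global work size passed when enqueueing the kernel on that device's command queue.

```python
def split_workload(total_items, num_devices):
    """Return one (offset, count) range per device, covering all items.

    Leftover items are given to the first devices so that counts differ
    by at most one.
    """
    if num_devices <= 0:
        raise ValueError("need at least one device")
    base, extra = divmod(total_items, num_devices)
    ranges = []
    offset = 0
    for device in range(num_devices):
        count = base + (1 if device < extra else 0)
        ranges.append((offset, count))
        offset += count
    return ranges


# Hypothetical example: 10 work items spread over 3 devices.
ranges = split_workload(10, 3)  # [(0, 4), (4, 3), (7, 3)]
```

An even split like this assumes the devices are equally fast; with mismatched GPUs you would probably weight the counts by measured throughput instead.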
See http://developer.nvidia.com/object/sli_best_practices.html
Basically, you have to instruct the driver that you want to use SLI, and in which mode. After this, the driver will (almost) seamlessly do all the work for you.
Alternate Frame Rendering : no sync needed, so better performance, but more lag
Split Frame Rendering : lots of sync, some vertices are processed twice, but less lag.
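The difference between the two modes can be pictured with a small Python model. This is a simplified sketch and not how any driver is implemented: afr_gpu_for_frame and sfr_rows_for_gpu are hypothetical helpers, and the fixed, even split of scanlines is an assumption (a real driver may place the split differently).

```python
def afr_gpu_for_frame(frame_index, num_gpus):
    """Alternate Frame Rendering: whole frames rotate across the GPUs."""
    return frame_index % num_gpus


def sfr_rows_for_gpu(gpu_index, num_gpus, frame_height):
    """Split Frame Rendering: each GPU gets a horizontal band of one frame."""
    start = gpu_index * frame_height // num_gpus
    end = (gpu_index + 1) * frame_height // num_gpus
    return range(start, end)


# Two GPUs: AFR alternates frames, SFR halves a 1080-row frame.
afr_schedule = [afr_gpu_for_frame(n, 2) for n in range(4)]  # [0, 1, 0, 1]
top_band = sfr_rows_for_gpu(0, 2, 1080)                      # rows 0..539
bottom_band = sfr_rows_for_gpu(1, 2, 1080)                   # rows 540..1079
```

In the AFR model each GPU owns a complete frame, so nothing needs to be merged within a frame. In the SFR model both bands belong to the same frame and have to be combined before display, which is where the extra synchronization comes from.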
As for your GLSL vs OpenCL comparison, I don't know of any good benchmark. I'd be interested, though.