When making a surface in SDL, there's an option to use the HWSURFACE tag, which I imagine means the surface is handled by the GPU instead of the CPU. But now SDL2 has textures, and I'm wondering, what's the difference? Will there be and performance difference using hardware surfaces instead of textures? Do they behave the same?
I've tried googling all over, but I can only find info on regular software surfaces.
There are no textures support in SDL1, and there is no HWSURFACE (or any other surface flag) in SDL2. flags in SDL_CreateRGBSurface in SDL2 commented as "The flags are obsolete and should be set to 0". There is no sane way to mix them.
Specifying SDL_HWSURFACE would cause the surface to be created using video memory. PCs without GPUs still have video memory and it is faster to put data in that memory area onto the screen than to copy it from system RAM.
Textures are uploaded to the GPU's own dedicated RAM and data must be passed through that memory in order to be put on the screen at all. SDL2 no longer has the SDL_HWSURFACE flag because the rendering subsystem uses the GPU via OpenGL or Direct3D and cannot use the old way to get graphics on the screen.
Related
I've been learning OpenGL, and as I sit trying to write my VBOs, PBOs, VAOs, textures, quads, bindings, fragment shaders, vertex shaders, and a whole suite of other modern abstractions upon abstractions built after decades of evolution, I wonder: Isn't the display nothing but a large block of memory?
I've heard of tales, that in the "good ol' days" (such as the Commodore 64), all you had to do was assign a value to an arbitrary byte in memory, and the screen would change a pixel. Extremely simple and elegant. In the modern day, this has changed with layers upon layers of abstractions and safeguards, such that changing a pixel on your display is several hundred feet away.
This begs the question, is it possible in the modern day to just "update a pixel of the screen"? Is it possible to write my own graphics driver or something, where I can send commands to some C wrapper which interfaces with the GPU to change those pixels? This is an extremely broad question, but I'm curious. The answer I'm looking for to this question would provide a rough outline of what you'd have to do in order to be able to arbitrarily get some C code to set a pixel on the screen, as well as a rough outline of why OpenGL has progressed the way it has - what problems did VBOs, PBOs, VAOs, bindings, shaders, etc. solve, and how we got to where we are today.
Isn't the display nothing but a large block of memory?
Yes, it is called a framebuffer.
I've heard of tales, that in the "good ol' days" (such as the Commodore 64)
Your current PC works like that right when you power it up! If you use the CPU to write into video memory, that is called a software renderer.
In the modern day, this has changed with layers upon layers of abstractions and safeguards, such that changing a pixel on your display is several hundred feet away.
No, they are not abstractions/safeguards for "changing pixels". Nowadays software renderers are not used anymore. Instead, you have to tell the GPU (which is another computer on its own) how to draw. That "talk" is what the APIs (like OpenGL) do for you.
Now, the GPUs are meant to be fast at drawing, and that requires specialized code and data structures. Those are all the things you mention: VBOs, PBOs, VAOs, shaders, etc. (in OpenGL parlance). There is no way around that, because GPUs are different hardware.
is it possible in the modern day to just "update a pixel of the screen"?
Yes, but that will end up being drawn somehow by the GPU, even if it looks to you like a memory write.
Is it possible to write my own graphics driver or something, where I can send commands to some C wrapper which interfaces with the GPU to change those pixels?
Yes, but that "C wrapper" is the graphics driver. A graphics driver for a modern GPU is very complex.
what you'd have to do in order to be able to arbitrarily get some C code to set a pixel on the screen
You cannot write a "C program" to write to a graphical screen because the C standard does not concern itself with graphical displays.
So it depends on your operating system, your hardware, whether you want 2D or 3D acceleration support, the API you choose...
as well as a rough outline of why OpenGL has progressed the way it has - what problems did VBOs, PBOs, VAOs, bindings, shaders, etc. solve, and how we got to where we are today.
See above.
You can make your own frame buffer - that is just an integer array - and do rasterization on it, then use for example the Windows GDI function SetBitmapBits() to draw it to the display in one go. The final draw-to-display command depends on the operating system.
How you do the rasterization on your framebuffer is completely up to you. You can use the CPU to draw individual pixels or rasterize lines and triangles, see for example this demo of my old CPU graphics engine using Windows GDI: https://youtu.be/GFzisvhtRS4.
Using the CPU is fine as long as you do not rasterize large datasets. From my experience, the limit to real-time 60fps rendering on the CPU is ~50k lines per frame.
If you want to rasterize really large datasets, you have to use a GPU in some way. Since the framebuffer is just an integer array, you can transfer it to/from the GPU using OpenCL or CUDA and on the GPU - if your dataset happens to already be in video memory - do all the rasterization extremely fast in parallel. For this you will need an additional z-buffer to decide which pixels to overdraw by occluding geometries. This way you can rasterize approximately 30 Million lines per frame at 60fps. This demo is rendered on the GPU in real time using OpenCL: https://youtu.be/lDsz2maaZEo
Is it possible in the modern day to just "update a pixel of the screen"?
Yes. In Windows for example, you can use SetPixel() to draw a pixel or BitBlt() to draw in bulk. See this Q/A
This works fine, but this means you're using the CPU for rendering and you'll find the GPU is much more effective for this task, especially if you require decent framerate and non-trivial graphics. The reason there's these "whole suite of other modern abstractions upon abstractions" is to serve as an interface to the GPU since it has an independent set of memory and totally different execution model. Other GPU libraries (OpenCL, DirectX, Vulkan, etc) all have the same kind of abstractions.
I've glossed over many nuances but I hope the point gets across.
This is probably a stupid question, but I cant find good examples on how to approach this, or if its even possible. Im just done with a project where I used gdi to biblt stuff onto a DIB-buffer then swap that onto the screen hdc, basically making my own swapchain and drawing with opengl.
So then I thought, can I do the same thing using directx11? But I cant seem to find where the DIB/buffer I need to change even is.
Am I even thinking about this correctly? Any ideas on how to handle this?
Yes, you can. Nvidia exposes vendor-specific extensions called NV_DX_interop and NV_DX_Interop2. With these extensions, you can directly access a DirectX surface (when it resides on the GPU) and render to it from an OpenGL context. There should be minimal (driver-only) overhead for this operation and the CPU will almost never be involved.
Note that while this is a vendor-specific extension, Intel GPUs support it as well.
However, don't do this simply for the fun of it or if you control all the source code for your application. This kind of interop scenario is meant for cases where you have two legacy/complicated codebases and interop is a cheaper/better option than porting all the logic to the other API.
Yeah you can do it, both OpenGL and D3D support both writeable textures and locking them to get to the pixel data.
Simply render your scene in OpenGL to a texture, lock it, read the pixel data and pass it directly to the D3D locked texture pixel data, unlock it then do whatever you want with the texture.
Performance would be dreadful of course, you're stalling the GPU multiple times in a single "operation" and forcing it to synchronize with the CPU (who's passing the data) and the bus (for memory access). Plus there would be absolutely no benefit at all. But if you really want to try it, you can do it.
I was moving from SDL to SDL2, and was confused of the 'render & texture' system introduced.
Back in SDL, the most frequent operation was creating Surface's and BlitSurface them onto screen. Now there seems a trend of using renderers and textures. However, this is extremely slow (in terms of coding overhead) from my point. Why can't I just load_BMP and BlitSurface as before? What good can be introduced from this whole window-renderer-texture thing?
I have read a couple threads What is a SDL renderer? but still a little confusing.
So..
Will the old Surface way work in SDL2?
What is the point in Renderer & Texture? (could be about hardware-accelaration according to a little googling, but not sure what that means)
You might want to take a look at the migration guide for SDL2, it provides information on the new way of dealing with 2D graphics.
The point of using textures instead of surfaces is that textures works on the GPU and get loaded into video memory and surfaces works in system memory with the CPU and since GPUs are much better suited than CPU for handling graphics it will be faster. Also, the renderer hides the underlying system used (it could be D3D, OpenGL, or something else).
You can still load surfaces, but you'll have to convert them to textures before rendering them or use the SDL_UpdateWindowSurface and SDL_GetWindowSurface functions; the latter link includes an example on how to use them.
As for the SDL2 approach being slow, as you assert, I don't agree with you, you set up the window and renderer once, load your textures much like you loaded surfaces, copy them to the renderer instead of blitting and finally use SDL_RenderPresent instead of SDL_Flip. Not that different really :)
But, take a look at the migration guide, it has the information you're looking for.
Is it possible yet to draw CUDA/OPENCL results directly to the screen with any existing API (opengl, directx, something else)? Skipping the typical drawing a textured quad method.
Even with registering resources and using modern CUDA interop methods, we still have to march through entire rendering pipelines just to render an array of colors. For applications like mine where every ms counts, this is a problem.
There's no way to draw directly on the screen with OpenCL or CUDA.
It is a solvable problem, but as far as I know, NVIDIA has not provided the needed APIs because they would be very complicated both to implement and to use, and the performance benefits would be limited at best.
The two main issues are:
1) the differing layouts of the buffers used for rendering (i.e. you'd have to use surface load/store functionality - a mapping into CUDA's address space is not suitable for graphics because the pitch-linear layout has poor performance in that context) and
2) the platform-specific details of incorporating your CUDA/OpenCL output into the presentation model (be it the desktop or a page-flipped full-screen experience, like a Direct3D game, or incorporating your app's output into the desktop). Bear in mind that most desktops these days are themselves page-flipped, so scribbling on the front buffer is frowned upon in any case.
I very much doubt that there is any performance lost in drawing pixels using a textured quad but you can draw pixels directly on the framebuffer with glDrawPixels.
Is it possible to allocate some memory on the GPU without cuda?
i'm adding some more details...
i need to get the video frame decoded from VLC and have some compositing functions on the video; I'm doing so using the new SDL rendering capabilities.
All works fine until i have to send the decoded data to the sdl texture... that part of code is handled by standard malloc which is slow for video operations.
Right now i'm not even sure that using gpu video will actually help me
Let's be clear: are you are trying to accomplish real time video processing? Since your latest update changed the problem considerably, I'm adding another answer.
The "slowness" you are experiencing could be due to several reasons. In order get the "real-time" effect (in the perceptual sense), you must be able to process the frame and display it withing 33ms (approximately, for a 30fps video). This means you must decode the frame, run the compositing functions (as you call) on it, and display it on the screen within this time frame.
If the compositing functions are too CPU intensive, then you might consider writing a GPU program to speed up this task. But the first thing you should do is determine where the bottleneck of your application is exactly. You could strip your application momentarily to let it decode the frames and display them on the screen (do not execute the compositing functions), just to see how it goes. If its slow, then the decoding process could be using too much CPU/RAM resources (maybe a bug on your side?).
I have used FFMPEG and SDL for a similar project once and I was very happy with the result. This tutorial shows to do a basic video player using both libraries. Basically, it opens a video file, decodes the frames and renders them on a surface for displaying.
You can do this via Direct3D 11 Compute Shaders or OpenCL. These are similar in spirit to CUDA.
Yes, it is. You can allocate memory in the GPU through OpenGL textures.
Only indirectly through a graphics framework.
You can use OpenGL which is supported by virtually every computer.
You could use a vertex buffer to store your data. Vertex buffers are usually used to store points for rendering, but you can easily use it to store an array of any kind. Unlike textures, their capacity is only limited by the amount of graphics memory available.
http://www.songho.ca/opengl/gl_vbo.html has a good tutorial on how to read and write data to vertex buffers, you can ignore everything about drawing the vertex buffer.