I've got a pretty old ATI HD 3400 video card, which has no OpenCL support, so I'm wondering: can I actually play around with the OpenGL libraries provided by the ATI Catalyst driver?
If my algorithm runs in glutDisplayFunc(displayFunc), that is, inside displayFunc(), is it actually costing CPU power or GPU power?
GLUT is just a library which manages platform-specific window and GL context creation. The function you pass to glutDisplayFunc is just called by GLUT at the appropriate time and context for the platform you're running on; it is not executed on the GPU.
It is not possible to have code that you've compiled in the normal fashion as part of a larger program run on the GPU.
However, the individual graphics operations issued inside your display func do, of course, perform the rendering on the GPU; the CPU computes which graphics operation to execute, but does not actually render the results. Each gl function is a normal CPU function, but what it does is send a command through your system bus to your graphics card, which then does the actual rendering.
Furthermore, these operations are asynchronous; the gl functions don't wait for your GPU to finish the operation before letting your program continue. This is useful because your CPU and GPU can both be working simultaneously — the GPU draws graphics while the CPU figures out what graphics to draw. On the other hand, if you do need communication in the other direction — such as glReadPixels — then the CPU has to wait for the GPU to catch up. This is also the difference between glFlush and glFinish.
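To make this concrete, here is a minimal GLUT sketch (the triangle and the glFinish call are purely illustrative): the display callback is an ordinary function executed on the CPU, and each gl* call inside it just queues work for the GPU.

    /* Minimal GLUT sketch: displayFunc runs on the CPU; the gl* calls it
     * makes are queued and executed by the GPU. Illustrative only. */
    #include <GL/glut.h>

    static void displayFunc(void)
    {
        /* CPU side: decide what to draw and issue commands */
        glClear(GL_COLOR_BUFFER_BIT);

        glBegin(GL_TRIANGLES);          /* each call just queues work for the GPU */
        glVertex2f(-0.5f, -0.5f);
        glVertex2f( 0.5f, -0.5f);
        glVertex2f( 0.0f,  0.5f);
        glEnd();

        glFinish();                     /* optional: block the CPU until the GPU is done */
        glutSwapBuffers();
    }

    int main(int argc, char **argv)
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB);
        glutCreateWindow("display callback demo");
        glutDisplayFunc(displayFunc);   /* registered callback, called by GLUT on the CPU */
        glutMainLoop();
        return 0;
    }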
I am working on a 2D graphics application with OpenGL (similar to QGIS). Recently, while running some benchmarks, I noticed a strange performance difference between my two graphics cards. So I made a simple test and drew one million squares using a VBO. That is 4 million vertices of 20 bytes each, so the total VBO size is 80 MB, and I draw the whole thing with a single DrawElements call. When I measured the render time on my laptop, which has two graphics cards, it took about 43 ms on the GeForce and about 1 ms on the integrated Intel card. I expected it to be faster on the GeForce. Why is that? Should I disable some OpenGL options?
My system specification is:
ASUS N53m with an integrated Intel graphics card and a GeForce GT 610M
EDIT:
I also tested on another system with an AMD Radeon HD 5450; it was about 44 ms again. I also tried single precision instead, which reduced it to 30 ms. But the integrated GPU is still faster!
It is definitely not a measurement issue, because I can see the lag when zooming in and out.
The run-time behavior of different OpenGL implementations varies vastly, as I found out in my experiments with low-latency rendering techniques for VR. In general, the only truly reliable timing interval to measure, the one that gives consistent results, is the inter-frame time between the very same step in your drawing. That is, measure the time from buffer swap to buffer swap (if you want to measure raw drawing performance, disable V-Sync), or between the same glClear calls.
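As a rough illustration (using GLUT here purely for brevity; any windowing layer works the same way), the swap-to-swap interval can be logged like this:

    /* Swap-to-swap frame timing: measure between the very same step
     * of consecutive frames. The drawing itself is omitted. */
    #include <GL/glut.h>
    #include <stdio.h>

    static void displayFunc(void)
    {
        static int lastMs = 0;

        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        /* ... actual drawing goes here ... */
        glutSwapBuffers();

        int nowMs = glutGet(GLUT_ELAPSED_TIME);   /* milliseconds since glutInit */
        printf("frame interval: %d ms\n", nowMs - lastMs);
        lastMs = nowMs;

        glutPostRedisplay();                      /* render continuously */
    }

    int main(int argc, char **argv)
    {
        glutInit(&argc, argv);
        glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB | GLUT_DEPTH);
        glutCreateWindow("frame timing");
        glutDisplayFunc(displayFunc);
        glutMainLoop();
        return 0;
    }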
Everything else is only consistent within a particular implementation, not across vendors (at the time of testing I had no AMD GPU around, so I lack data on those). A few notable corner cases I discovered:
SwapBuffers
NVidia: returns only after the back buffer has been presented. That means it either waits for V-Sync or returns only once the buffers have actually been swapped.
Intel/Linux/X11: always returns immediately. V-Sync affects the next OpenGL call that would affect pixels in the not-yet-presented buffer and that does not fit into the command queue. Hence "clearing" the viewport with a large quad, a skybox, or the depth ping-pong method (found only in very old applications) gives very inconsistent frame intervals. A glClear, however, will reliably block until V-Sync after the swap.
glFinish
NVidia: actually finishes the rendering, as expected
Intel/Linux/X11: when drawing to the back buffer it acts like a no-op; when drawing to the front buffer it acts like a finish followed by a copy from an auxiliary back buffer to the front buffer (weird); essentially this means you cannot make the drawing process "visible".
I have yet to test what the Intel driver does when bypassing X11 (using KMS). Note that the OpenGL specification leaves it up to the implementation how and when it does certain things, as long as the outcome is consistent and conforms to the specification, and all of the observed behavior is perfectly conformant.
I am building a real-time signal processing and display system using an NVIDIA Tesla C2050 GPU. The design is such that the signal processing part runs as a separate program and does all its computation using CUDA. In parallel, if needed, I can start a separate display program which displays the processed signal using OpenGL. Since the design was to run these as independent processes, I do not have any CUDA-OpenGL interoperability. The two programs exchange data with each other over a UNIX stream socket.
The signal processing program spends most of its time using the GPU for the CUDA work. I refresh my frame in OpenGL every 50 ms, while the CUDA program runs for roughly 700 ms per run, and two sequential runs are usually separated by 30-40 ms. When I run the programs one at a time (i.e. only the CUDA or only the OpenGL part is running), everything works perfectly. But when I start the programs together, the display is not what it is supposed to be, while the CUDA part produces the correct output. I have checked the socket implementation and I am fairly confident that the sockets are working correctly.
My question is: since I have a single GPU, no CUDA-OpenGL interoperability, and both processes use the GPU regularly, is it possible that the context switching between the CUDA kernel and the OpenGL rendering is causing the interference? Should I change the design to a single program that runs both parts with CUDA-OpenGL interoperability?
Devices of compute capability 5.0 and lower cannot run graphics and compute concurrently. The Tesla C2050 does not support any form of preemption, so while a CUDA kernel is executing, the GPU cannot be used to render the OpenGL commands. CUDA-OpenGL interop does not solve this issue.
If you have a single GPU, the best option is to break the CUDA kernels into shorter launches so that the GPU can switch between compute and graphics. In the aforementioned case, no CUDA kernel should execute for more than 50 ms - GLRenderTime.
Using a second GPU to do the graphics rendering would be the better option.
I have a program that draws parallel coordinates using the OpenGL fixed-function pipeline.
The plot has 7 axes and draws 64k lines, so the output is cluttered, but when I run the code on my laptop, which has an Intel i5 processor and 8 GB of DDR3 RAM, it runs fine. A friend ran the same code on two different systems, both with an Intel i7, 8 GB of DDR3 RAM, and an NVIDIA GPU. On those systems the code runs with stuttering and sometimes the mouse pointer becomes unresponsive. If you can give me some idea why this is happening, it would be of great help. Initially I thought it would run even faster on those systems, as they have a dedicated GPU. My own laptop runs Ubuntu 12.04 and both of the other systems run Ubuntu 10.x.
The fixed-function pipeline is implemented on top of the GPU's programmable features in modern OpenGL drivers. This means most of the work is still done by the GPU; fixed-function OpenGL shouldn't be any slower than doing the same things with GLSL, just far less flexible.
What do you mean by the coordinates having 7 axes? Do you have screenshots of your application?
The mouse stuttering sounds like you are seriously taxing your display driver, which suggests you are making too many OpenGL calls. Are you using immediate mode (glBegin, glVertex, ...)? Some OpenGL drivers may not have a good implementation of immediate mode. You should use vertex buffer objects for your data.
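For example, a minimal sketch of the VBO path (assuming a Linux/Mesa setup where GL_GLEXT_PROTOTYPES exposes the buffer functions, with the vertex data left as a placeholder) could look like this:

    #define GL_GLEXT_PROTOTYPES   /* Linux/Mesa; use an extension loader (e.g. GLEW) elsewhere */
    #include <GL/gl.h>

    #define NUM_VERTS (64000 * 2)             /* two endpoints per line segment */
    static GLfloat lineVerts[NUM_VERTS * 2];  /* x,y pairs, filled elsewhere */
    static GLuint  lineVbo;

    void initLineVbo(void)                    /* once, after context creation */
    {
        glGenBuffers(1, &lineVbo);
        glBindBuffer(GL_ARRAY_BUFFER, lineVbo);
        glBufferData(GL_ARRAY_BUFFER, sizeof(lineVerts), lineVerts, GL_STATIC_DRAW);
    }

    void drawLines(void)                      /* every frame */
    {
        glBindBuffer(GL_ARRAY_BUFFER, lineVbo);
        glEnableClientState(GL_VERTEX_ARRAY);
        glVertexPointer(2, GL_FLOAT, 0, (void *)0);
        glDrawArrays(GL_LINES, 0, NUM_VERTS); /* one call instead of 128k glVertex calls */
        glDisableClientState(GL_VERTEX_ARRAY);
    }

The key point is that the vertex data crosses the bus once at initialization instead of being re-specified call by call every frame.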
Maybe I've misunderstood you, but here I go.
There are API calls, such as glBegin and glEnd, which send commands to the GPU, so they use GPU horsepower; other calls, for example ones that just fill your arrays or functions that have no relation to the API, use the CPU.
It is good practice to preload your models outside your OpenGL onDraw loop by saving the data in buffers (glGenBuffers etc.) and then using those buffers (VBO/IBO) in your onDraw loop.
If managed correctly, this can decrease the load on your GPU/CPU. Hope this helps.
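As a rough sketch of that split (the quad data here is only a placeholder), setup() runs once at startup, while onDraw() only binds the existing buffers and issues the draw call:

    #define GL_GLEXT_PROTOTYPES   /* Linux/Mesa; use an extension loader elsewhere */
    #include <GL/gl.h>

    static GLuint vbo, ibo;
    static GLfloat  verts[]   = { 0.0f, 0.0f,  1.0f, 0.0f,  1.0f, 1.0f,  0.0f, 1.0f };
    static GLushort indices[] = { 0, 1, 2,  0, 2, 3 };   /* one quad as two triangles */

    void setup(void)          /* once, outside the draw loop */
    {
        glGenBuffers(1, &vbo);
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBufferData(GL_ARRAY_BUFFER, sizeof(verts), verts, GL_STATIC_DRAW);

        glGenBuffers(1, &ibo);
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
        glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(indices), indices, GL_STATIC_DRAW);
    }

    void onDraw(void)         /* every frame: bind and draw, no re-upload */
    {
        glBindBuffer(GL_ARRAY_BUFFER, vbo);
        glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ibo);
        glEnableClientState(GL_VERTEX_ARRAY);
        glVertexPointer(2, GL_FLOAT, 0, (void *)0);
        glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, (void *)0);
        glDisableClientState(GL_VERTEX_ARRAY);
    }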
How can I use the graphics card instead of the CPU when I call SDL functions?
You cannot; SDL is only a software renderer. However, you can use SDL to create a window and catch events, and then do your drawing with OpenGL.
If your program is eating 100% CPU, make sure that you limit the FPS correctly (by adding SDL_Delay to the main loop).
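A minimal sketch of such a capped main loop (SDL 1.2 style; the 60 fps target is just an example value):

    #include <SDL/SDL.h>

    const Uint32 FRAME_MS = 1000 / 60;         /* aim for roughly 60 frames per second */

    void main_loop(void)
    {
        int running = 1;
        while (running) {
            Uint32 start = SDL_GetTicks();

            SDL_Event event;
            while (SDL_PollEvent(&event))
                if (event.type == SDL_QUIT)
                    running = 0;

            /* ... update and draw here ... */

            Uint32 elapsed = SDL_GetTicks() - start;
            if (elapsed < FRAME_MS)
                SDL_Delay(FRAME_MS - elapsed); /* yield the CPU instead of spinning */
        }
    }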
If possible, use SDL 2.0. It supports full 3D hardware acceleration through functions similar to SDL 1.2's.
http://wiki.libsdl.org/MigrationGuide
In SDL 2.0, create your renderer with the SDL_RENDERER_ACCELERATED and SDL_RENDERER_PRESENTVSYNC flags. The first one requests hardware acceleration when possible, and the second limits your program to at most 60 fps (depending on the monitor's refresh rate), freeing the CPU from continuous work.
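For example (a minimal sketch; the window size, title and final delay are arbitrary):

    #include <SDL2/SDL.h>

    int main(int argc, char *argv[])
    {
        SDL_Init(SDL_INIT_VIDEO);

        SDL_Window *win = SDL_CreateWindow("demo",
            SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED, 640, 480, 0);

        /* GPU rendering when available, and presentation tied to vblank */
        SDL_Renderer *ren = SDL_CreateRenderer(win, -1,
            SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);

        SDL_SetRenderDrawColor(ren, 0, 0, 0, 255);
        SDL_RenderClear(ren);
        SDL_RenderPresent(ren);   /* waits for vsync, so a render loop won't spin the CPU */

        SDL_Delay(2000);          /* keep the window up briefly for the example */
        SDL_DestroyRenderer(ren);
        SDL_DestroyWindow(win);
        SDL_Quit();
        return 0;
    }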
I'm writing a simple OpenGL program on a multi-core computer that has a GPU. The GPU is a basic GeForce with PhysX, CUDA and OpenGL 2.1 support. When I run this program, is it the host CPU that executes the OpenGL-specific commands, or are they transferred directly to the GPU?
Normally that's a function of the drivers you're using. If you're just using vanilla VGA drivers, then all of the OpenGL computations are done on your CPU. Normally, however, with modern graphics cards and production drivers, calls to OpenGL routines that your graphics card's GPU can handle in hardware are performed there; those that the GPU can't perform are handed off to the CPU.