Does the OpenGL fixed-function pipeline use the GPU or the CPU?

I have code that draws parallel coordinates using the OpenGL fixed-function pipeline.
The plot has 7 axes and draws 64k lines, so the output is cluttered, but when I run the code on my laptop, which has an Intel i5 processor and 8 GB of DDR3 RAM, it runs fine. A friend of mine ran the same code on two different systems, both with an Intel i7, 8 GB of DDR3 RAM and an NVIDIA GPU. On those systems the code stutters and sometimes the mouse pointer becomes unresponsive. Any ideas about why this is happening would be a great help. Initially I thought it would run even faster on those systems, as they have a dedicated GPU. My own laptop runs Ubuntu 12.04 and both the other systems run Ubuntu 10.x.

In modern OpenGL drivers the fixed-function pipeline is implemented using the GPU's programmable features, so most of the work is done by the GPU. Fixed-function OpenGL shouldn't be any slower than doing the same things in GLSL; it is just far less flexible.
What do you mean by coordinates having 7 axes? Do you have screenshots of your application?
Mouse stuttering sounds like you are seriously taxing your display driver, which suggests you are making too many OpenGL calls. Are you using immediate mode (glBegin, glVertex, ...)? Some OpenGL drivers might not have the best implementation of immediate mode. You should use vertex buffer objects for your data.

Maybe I've misunderstood you, but here goes.
API calls such as glBegin and glEnd issue commands to the GPU, so they use GPU horsepower. Array handling and other functions in your program that are not part of the OpenGL API run on the CPU.
It is good practice to load your models once, outside the onDraw loop, by saving the data in buffers (glGenBuffers etc.), and then to use these buffers (VBO/IBO) inside the onDraw loop.
If managed correctly, this can decrease the load on your GPU/CPU. Hope this helps.
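To make the buffer advice concrete, here is a minimal sketch of the CPU-side half only: flattening all polylines into one contiguous float buffer that could be uploaded once and drawn with a single call. The function names, the 2D vertex layout and the GL_LINES segment layout are assumptions for illustration, not the asker's code, and the actual upload (glBufferData) and draw (glDrawArrays) calls are left out because they need a live GL context.

```python
from array import array


def pack_polylines(rows, axis_x):
    """Flatten polylines into one float buffer laid out as line segments.

    rows   -- one sequence per data record, holding one y value per axis
    axis_x -- x position of each axis
    Each segment contributes two 2D vertices: (x_i, y_i), (x_i+1, y_i+1).
    """
    buf = array("f")
    for row in rows:
        for i in range(len(axis_x) - 1):
            buf.extend((axis_x[i], row[i], axis_x[i + 1], row[i + 1]))
    return buf


def immediate_mode_calls(n_lines, n_axes):
    """API calls per frame if each polyline is one glBegin/glEnd line strip."""
    return n_lines * (n_axes + 2)  # n_axes glVertex calls + glBegin + glEnd


# Tiny hypothetical data set: 4 records on 7 evenly spaced axes.
axis_x = [i / 6 for i in range(7)]
rows = [[0.5] * 7 for _ in range(4)]
buf = pack_polylines(rows, axis_x)
vertex_count = len(buf) // 2  # 2 floats per vertex

print(vertex_count)                    # 48 vertices for 4 polylines
print(immediate_mode_calls(65536, 7))  # 589824 calls per frame
```

With 64k lines and 7 axes, drawing in immediate mode this way means roughly 590k driver calls every frame, whereas the packed buffer is built once and then drawn with one call per frame.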


What's causing this unpredictable OpenGL bug?

I have an OpenGL test application that is producing incredibly unusual results. When I start up the application it may or may not feature a severe graphical bug.
It might produce an image like this:
Or like this:
Or just the correct image, like this:
The scene consists of one spinning colored cube (made of 12 triangles) with a simple shader on it that colors the pixels based on the absolute value of their model space coordinates. The junk faces appear to spin with the cube as though they were attached to it and often junk triangles or quads flash on the screen briefly as though they were rendered in 2D.
The thing I find most unusual about this is that the behavior is highly inconsistent. Starting the exact same application repeatedly, without me changing anything else on the system, produces different results: sometimes bugged, sometimes not. The arrangement of the junk faces isn't consistent either.
I can't really post source code for the application as it is very lengthy and the actual OpenGL calls are spread out across many wrapper classes and such.
This is occurring under the following conditions:
Windows 10 64 bit OS (although I have observed very similar behavior under Windows 8.1 64 bit).
AMD FX-9590 CPU (Clocked at 4.7GHz on an ASUS Sabertooth 990FX).
AMD 7970HD GPU (It is a couple years old and occasionally areas of the screen in 3D applications become scrambled, but nothing on the scale of what I'm experiencing here).
Using SDL for window and context creation.
Using GLEW for OpenGL.
Using OpenGL versions 1.0, 3.3 and 4.3 (I'm assuming SDL is indeed using the versions I instructed it to).
AMD Catalyst driver version 15.7.1 (Driver Packaging Version listed as 15.20.1062.1004-150803a1-187674C, although again I have seen very similar behavior on much older drivers).
Catalyst Control Center lists my OpenGL version as
This looks like a broken graphics card to me. Most likely some problem with the memory (either the memory itself, or some soldering problem). Artifacts like those you see can happen if for some reason setting the address for a memory operation does not fully settle or happen at all, before starting the read; that can happen due to a bad connection between the GPU and the memory (solder connections failed) or because the memory itself failed.
Solution: buy a new graphics card. You could also try resoldering it using a reflow process; there are some tutorials on how to do this DIY, but a proper reflow oven gives better results.

Image Geometrical remapping on OpenGL ES

I have an algorithm that runs on a PC and uses OpenCV remap. It is slow and I need to run it on an embedded system (for example a device such as this:
It has OpenGL 3.0 and I am wondering if it is possible to write an OpenGL shader to do the remapping (OpenCV remapping).
I have another device that has OpenGL 2.0. Can that device do shader programming?
Where can I learn about shader programming in OpenGL?
I am using Linux.
Edit 1
The code runs on a PC and takes around 1 min. On an embedded system it takes around 2 hours!
I need to run it on an embedded system, which is why I am thinking of using OpenGL or OpenCL (the board has an OpenCL 1.1 driver).
What is the best option for this? Can I use OpenGL 2 or OpenGL 3?
A PC with a good graphics card (compatible with OpenCV) is much faster than a small embedded board like an ODROID or Banana Pi. In other words, computational power per price or per watt is lower on these platforms.
If your algorithm is slow:
Are you sure your graphics driver is correctly configured to support OpenCV?
Try to improve your algorithm. On a current PC it is easy to get 1 TFLOPS with OpenCL, so if your program really requires more, you should think about cloud computing and the like. Check that you configured the appropriate buffer types, etc.
OpenGL 3 allows general-purpose shaders. OpenGL 2 is more limited: OpenGL 2.0 and OpenGL ES 2.0 do support GLSL vertex and fragment shaders, but with fewer features, so making your algorithm compatible may be considerably harder.
To learn OpenGL/GLSL, be very careful, because most pages teach bad or outdated code.
I recommend a good book, like:
OpenGL 3+ and OpenGL ES 3+ have general-purpose shaders and may be used for fast computing. So yes, you will get a performance increase. But the GPUs on these platforms are very small and slow (usually fewer than 10 cores). Do not expect to get the same 1 min result on this ODROID as on your PC with 500-2000 GPU cores.
OpenGL 2 is shader-capable (GLSL arrived with OpenGL 2.0, and OpenGL ES 2.0 has no fixed-function pipeline at all), but it is harder to use for parallel computing.
If you really need to use an embedded platform, maybe you could use a cluster of them?
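To show why a remap is a natural fit for a shader, here is a rough CPU sketch of what each output pixel does: look up a source coordinate in the maps, then sample the source image with bilinear interpolation. Every output pixel is independent of the others, which is the same shape of work a fragment shader performs with a texture lookup. The function name and the constant-border behaviour are assumptions for illustration; this is not OpenCV's implementation and will not match it bit for bit.

```python
import math


def remap_bilinear(src, map_x, map_y, border=0.0):
    """Per-pixel remap: out[y][x] = src sampled at (map_x[y][x], map_y[y][x]).

    src, map_x and map_y are lists of rows. Samples that fall outside the
    source image read the constant `border` value.
    """
    h, w = len(src), len(src[0])

    def texel(x, y):
        return src[y][x] if 0 <= x < w and 0 <= y < h else border

    out = []
    for row_x, row_y in zip(map_x, map_y):
        out_row = []
        for sx, sy in zip(row_x, row_y):
            x0, y0 = math.floor(sx), math.floor(sy)
            fx, fy = sx - x0, sy - y0  # fractional position inside the cell
            top = texel(x0, y0) * (1 - fx) + texel(x0 + 1, y0) * fx
            bot = texel(x0, y0 + 1) * (1 - fx) + texel(x0 + 1, y0 + 1) * fx
            out_row.append(top * (1 - fy) + bot * fy)
        out.append(out_row)
    return out


src = [[0, 10],
       [20, 30]]

# A horizontal flip expressed as a pair of maps.
flipped = remap_bilinear(src, [[1, 0], [1, 0]], [[0, 0], [1, 1]])
print(flipped)  # [[10.0, 0.0], [30.0, 20.0]]

# A single sample at the centre of the four source pixels.
print(remap_bilinear(src, [[0.5]], [[0.5]]))  # [[15.0]]
```

In a shader version the two maps would typically live in textures and the interpolation would be done by the texture sampler; whether a given embedded GPU can store the maps with enough precision is something to check per device.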

glDrawArrays + VBO increasing memory footprint

I am writing a Windows-based OpenGL viewer application.
I am using VBOs + triangle strips + glDrawArrays to render my meshes. Everything works perfectly on all machines.
On Windows desktops with NVIDIA Quadro cards, the working/peak working memory shoots up when I first call glDrawArrays.
On laptops with NVIDIA mobile graphics cards, the working memory and peak working memory do not shoot up. For the last few days I have been checking almost every forum, post and tutorial about VBO memory issues. I have tried all the VBO combinations, such as GL_STATIC_DRAW/DYNAMIC/STREAM and glMapBuffer/glUnmapBuffer, but nothing stops the memory from shooting up on my desktops.
I suspect that I am missing some flags for VBOs with OpenGL 1.5.
PS: I have almost 500 to 600 VBOs in my application. I am using an array of structures (i.e. v, n, c, t together in a structure), and I am not aligning my VBOs to 16k memory.
Can anyone suggest how I should go about solving this issue? Any hints/pointers would be helpful.
Do you actually run out of memory or does your application increasingly consume memory? If not, why bother? If the OpenGL implementation keeps a working copy for itself, then this is probably for a reason. Also there's little you can do on the OpenGL side to avoid this, since it's entirely up to the driver how it manages its stuff. I think the best course of action, if you really want to keep the memory footprint low, is contacting NVidia, so that they can double check if this may be a bug in their drivers.
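To judge whether the jump is in the range a driver-side working copy would explain, a back-of-the-envelope estimate can help. The sketch below is only arithmetic: the vertex layout (float position, float normal, 4-byte color, float texcoord), the VBO count and the per-VBO vertex count are all assumptions, so substitute the real numbers from the application.

```python
import struct

# Assumed interleaved layout for one vertex (v, n, c, t), tightly packed:
# 3 floats position, 3 floats normal, 4 unsigned bytes color, 2 floats texcoord.
VERTEX_FORMAT = "<3f3f4B2f"
STRIDE = struct.calcsize(VERTEX_FORMAT)  # bytes per vertex


def footprint_bytes(vertex_counts, copies=1):
    """Total bytes for the given VBOs, times the number of copies kept."""
    return sum(vertex_counts) * STRIDE * copies


# Hypothetical scene: 600 VBOs with 10,000 vertices each.
counts = [10_000] * 600
print(STRIDE)                              # 36
print(footprint_bytes(counts))             # 216000000
print(footprint_bytes(counts, copies=2))   # 432000000
```

If the observed growth in working set is roughly the size of the vertex data itself, that would be consistent with the driver keeping its own copy; if it is far larger, that is a more useful detail to include in a bug report.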

How can I stress the GPU

I would like to add some diagnostic code to our application that stresses both the CPU and GPU, and then measures heat. A third party tool is not an option. From what I can tell, CUDA is not an option either, as it requires Nvidia's compiler - is that right? As far as I can tell, my best option is DirectX. Anything simple and non visual on the GPU would do.
Platform: Windows XP Embedded
DirectX 9.0C
Simply create a shader in HLSL which contains an endless loop.
Turn off all culling and instancing and upload tons of triangle data to the GPU for processing and drawing. This will stress the CPU (not too much these days), and the GPU should suffer under the overdraw burden.
You should be able to use the code from any intro tutorial for this (ones that use DrawPrimitiveUP will stress the CPU more, but don't require creating GPU buffers). You probably also want vsync disabled, so that the GPU works as fast as it can (i.e. it waits little or not at all on other events).

Does GLSL utilize SLI? Does OpenCL? What is better, GLSL or OpenCL for multiple GPUs?

To what extent does OpenGL's GLSL utilize SLI setups? Is it utilized at all at the point of execution or only for end rendering?
Similarly, I know that OpenCL is alien to SLI but assuming one has several GPUs, how does it compare to GLSL in multiprocessing?
Since it might depend on the application, e.g. common transformation, or ray tracing, can you offer insight on differences depending on application type?
The goal of SLI is to divide the rendering workload across several GPUs. First, the graphics driver uses either a sort-first or a time-decomposition approach (GPU0 works on frame n while GPU1 works on frame n+1). Then the pixels are copied from one GPU to the other.
That said, SLI has nothing to do with the shading language used by OpenGL (the way the pixels are drawn doesn't really matter).
For OpenCL, I would say that you have to divide your workload between the GPUs yourself, but I am not sure.
If you want to take advantage of multiple GPUs with OpenCL, you will have to create command queues for each device and run kernels on each device after splitting up the workload.
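As a rough illustration of the "splitting up the workload" step, the sketch below divides a 1D range of work items into one contiguous chunk per device. The function name is hypothetical and no OpenCL calls are made; each (offset, size) pair is the kind of value you would pass as the global work offset and global work size when enqueueing the kernel on that device's command queue.

```python
def split_work(total_items, device_count):
    """Return one (offset, size) chunk per device, covering all items once.

    Sizes differ by at most one item, with the earlier devices taking the
    remainder.
    """
    base, extra = divmod(total_items, device_count)
    chunks = []
    offset = 0
    for device in range(device_count):
        size = base + (1 if device < extra else 0)
        chunks.append((offset, size))
        offset += size
    return chunks


print(split_work(10, 3))         # [(0, 4), (4, 3), (7, 3)]
print(split_work(1_000_000, 2))  # [(0, 500000), (500000, 500000)]
```

An even split assumes the devices are equally fast; with mismatched GPUs you would want to weight the chunk sizes, and in either case each device needs its own copy of (or access to) the input data it reads.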
Basically, you have to instruct the driver that you want to use SLI, and in which mode. After this, the driver will (almost) seamlessly do all the work for you.
Alternate Frame Rendering : no sync needed, so better performance, but more lag
Split Frame Rendering : lots of sync, some vertices are processed twice, but less lag.
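The two modes differ only in how work is assigned to GPUs, which can be sketched in a few lines. This is a toy model with hypothetical function names, not how any driver is implemented; in particular, a real driver may move the split line between frames to balance load, whereas this sketch uses fixed equal bands.

```python
def afr_gpu_for_frame(frame_index, gpu_count):
    """Alternate Frame Rendering: whole frames are handed out round-robin."""
    return frame_index % gpu_count


def sfr_bands(frame_height, gpu_count):
    """Split Frame Rendering: each GPU gets a horizontal band of every frame.

    Returns one (first_row, end_row) pair per GPU, end_row exclusive.
    """
    bands = []
    for gpu in range(gpu_count):
        top = frame_height * gpu // gpu_count
        bottom = frame_height * (gpu + 1) // gpu_count
        bands.append((top, bottom))
    return bands


print([afr_gpu_for_frame(f, 2) for f in range(4)])  # [0, 1, 0, 1]
print(sfr_bands(1080, 2))                           # [(0, 540), (540, 1080)]
```

The model also hints at the trade-off listed above: in AFR each GPU works on a different frame, so frames can be in flight concurrently at the cost of latency, while in SFR both GPUs must finish their band of the same frame before it can be shown.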
For your GLSL vs OpenCL comparison, I don't know of any good benchmark. I'd be interested, though.