In GFXBench testing, what is the meaning of "OffScreen test"? - opengl

As the title says: does it measure the computational power of the GPU?
What are the main differences between "OnScreen" and "OffScreen"?

On Screen = GPU drawing operations are performed on a region of memory that goes directly to the display output. Usually also means that the memory in question is managed by the windowing system.
Off Screen = GPU drawing operations are performed on a region of memory that is not directed to the display output and is managed by the program itself. Off Screen rendering is used to generate intermediary images in computer graphics, like reflection maps, shadow maps, postprocessing filters (motion or Depth of Field blur).

Related

OpenGL vector graphics rendering performance on mobile devices

It is generally advised not to use vector graphics in mobile games, or to pre-rasterize them, for performance reasons. Why is that? I thought that OpenGL is at least as good at drawing lines / triangles as at rendering images on screen...
Rasterizing them caches them as images, so there is less overhead than calculating every coordinate of the vector shape and drawing it (more draw cycles and more CPU usage). Drawing a vector is exactly that: you are drawing arcs from point to point on every single call, versus displaying a cached image at a certain coordinate.
Although using impostors is a great optimization trick, depending on the impostor's shape, how much overdraw is involved, and whether you need blending in the process, the trick can make you fill-rate bound. Also, in some scenarios where shapes may change, caching the graphics into impostors may not be feasible or may incur other overheads. It is a matter of balancing your rendering pipeline.
The answer depends on the hardware. Are you using a GPU or NOT?
Modern mobile devices running Android and iOS have a GPU embedded in the chipset.
These GPUs are very good with vector graphics. To prove this point, many GPUs have a dedicated geometry processor in addition to one or more pixel processors (for example, the Mali-400 GPU).
For example, let's say you want to draw 200 transparent circles of different colors.
If you do it with modern OpenGL, you will only need one set of geometry (a list of triangles forming a circle) and a list of parameters for each circle, let's say position and color. If you provide this information to the GPU, it will draw the circles in parallel very quickly.
If you do it using a different texture for each color, your program will be very heavy (in storage size) and will probably be slower due to memory bandwidth limits.
It depends on what you want to do, and the hardware. If your hardware doesn't have a GPU you probably should pre-render your graphics.
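To make the "one set of geometry plus a list of parameters" idea above concrete, here is a minimal CPU-side sketch. It only builds the data and compares sizes; uploading it to the GPU is left out. The struct layout, the function names, and the 128x128 uncompressed RGBA texture size are assumptions for illustration, not something the answer specifies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Per-circle parameters (hypothetical layout): position, radius, RGBA color.
struct CircleInstance {
    Vec2 center;
    float radius;
    float rgba[4];
};

// Builds a unit circle as a triangle fan: one center vertex followed by
// segments + 1 rim vertices (the last rim vertex repeats the first one).
std::vector<Vec2> makeUnitCircleFan(int segments) {
    assert(segments >= 3);
    const float twoPi = 6.28318530718f;
    std::vector<Vec2> verts;
    verts.reserve(static_cast<std::size_t>(segments) + 2);
    verts.push_back({0.0f, 0.0f});
    for (int i = 0; i <= segments; ++i) {
        const float a = twoPi * static_cast<float>(i) / static_cast<float>(segments);
        verts.push_back({std::cos(a), std::sin(a)});
    }
    return verts;
}

// Bytes needed when all circles share one fan and differ only by parameters.
std::size_t sharedGeometryBytes(int segments, std::size_t circleCount) {
    return makeUnitCircleFan(segments).size() * sizeof(Vec2)
         + circleCount * sizeof(CircleInstance);
}

// Bytes needed for one uncompressed RGBA8 square texture per circle
// (assumed format; compressed textures would be smaller).
std::size_t perCircleTextureBytes(std::size_t circleCount, std::size_t texSize) {
    return circleCount * texSize * texSize * 4;
}
```

With 32 segments and 200 circles, `sharedGeometryBytes(32, 200)` comes to a few kilobytes, while `perCircleTextureBytes(200, 128)` is roughly 13 MB, which is the storage and bandwidth gap the answer is pointing at.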

Fastest way of plotting a point on screen in MFC C++ app

I have an application that contains many millions of 3D RGB points that form an image when plotted. What is the fastest way of getting them to the screen in an MFC application? I've tried CDC::SetPixelV in conjunction with a bitmap, which seems quite slow, and am looking towards a Direct3D or OpenGL window in my MFC view class. Any other good places to look?
Double buffering is your solution. There are many examples on codeproject. Check this one for example
Sounds like a point cloud. You might find some good information searching on that term.
3D hardware is the fastest way to take 3D points and get them into a 2D display, so either Direct3D or OpenGL seem like the obvious choices.
If the number of points is much greater than the number of pixels in your display, then you'll probably first want to cull points that are trivially outside the view. You put all your points in some sort of spatial partitioning structure (like an octree) and omit the points inside any node that's completely outside the viewing frustum. This reduces the amount of data you have to push from system memory to GPU memory, which will likely be the bottleneck. (If your point cloud is static, and you're just building a fly-through, and if your GPU has enough memory, you could skip the culling, send all the data at once, and then just update the transforms for each frame.)
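The "completely outside the viewing frustum" check for an octree node can be sketched as a box-versus-planes test. This is a minimal version under stated assumptions: the frustum is given as six planes whose normals point inward, and each node is an axis-aligned box. The type and function names are hypothetical.

```cpp
#include <array>
#include <cassert>

struct Vec3 { float x, y, z; };

// Plane n.p + d = 0; points with n.p + d >= 0 are on the inside (assumed convention).
struct Plane { Vec3 n; float d; };

// Axis-aligned bounding box of an octree node.
struct Aabb { Vec3 lo, hi; };

// Returns true when the box lies entirely outside at least one frustum plane,
// so every point inside that node can be skipped.
bool isOutside(const Aabb& box, const std::array<Plane, 6>& frustum) {
    assert(box.lo.x <= box.hi.x && box.lo.y <= box.hi.y && box.lo.z <= box.hi.z);
    for (const Plane& p : frustum) {
        // Pick the box corner that lies farthest along the plane normal.
        const Vec3 v{
            p.n.x >= 0.0f ? box.hi.x : box.lo.x,
            p.n.y >= 0.0f ? box.hi.y : box.lo.y,
            p.n.z >= 0.0f ? box.hi.z : box.lo.z};
        // If even that corner is behind the plane, the whole box is outside.
        if (p.n.x * v.x + p.n.y * v.y + p.n.z * v.z + p.d < 0.0f) {
            return true;
        }
    }
    return false;
}
```

The test is conservative: a box that straddles a plane is kept, so some invisible points still get sent, but nothing visible is dropped.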
If you don't want to use the GPU and instead write a software renderer, you'll want to render to a bitmap that's in the same pixel format as your display (to eliminate the chance of the blit needing to do any pixel format conversion as it blasts the bitmap to the display). For reasonable window sizes, blitting at 30 frames per second is feasible, but it might not leave much time for the CPU to do the rendering.
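For the software route, the inner loop might look something like the sketch below. It assumes a 32-bit display, so pixels are stored as 0x00RRGGBB (the layout of a 32-bit BI_RGB DIB on little-endian machines), that points have already been projected to pixel coordinates, and that a smaller z means closer to the viewer. The names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

struct Framebuffer {
    int width;
    int height;
    std::vector<std::uint32_t> color;  // one 0x00RRGGBB value per pixel
    std::vector<float> depth;          // nearest z plotted so far per pixel
};

Framebuffer makeFramebuffer(int width, int height) {
    assert(width > 0 && height > 0);
    const std::size_t n = static_cast<std::size_t>(width) * static_cast<std::size_t>(height);
    return Framebuffer{width, height,
                       std::vector<std::uint32_t>(n, 0u),
                       std::vector<float>(n, std::numeric_limits<float>::infinity())};
}

// Writes one already-projected point, keeping only the nearest point per pixel.
void plotPoint(Framebuffer& fb, int x, int y, float z,
               std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    if (x < 0 || y < 0 || x >= fb.width || y >= fb.height) {
        return;  // off-screen points are dropped
    }
    const std::size_t i = static_cast<std::size_t>(y) * static_cast<std::size_t>(fb.width)
                        + static_cast<std::size_t>(x);
    if (z >= fb.depth[i]) {
        return;  // something closer is already there
    }
    fb.depth[i] = z;
    fb.color[i] = (static_cast<std::uint32_t>(r) << 16)
                | (static_cast<std::uint32_t>(g) << 8)
                | static_cast<std::uint32_t>(b);
}
```

After all points are plotted, `fb.color` would be copied to the window in a single blit per frame rather than one call per pixel.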

Reducing RAM usage with regard to Textures

Currently, my app is using a large amount of memory after loading textures (~200 MB).
I am loading the textures into a char buffer, passing it along to OpenGL and then killing the buffer.
It would seem that this memory is used by OpenGL, which is doing its own texture management internally.
What measures could I take to reduce this?
Is it possible to prevent OpenGL from managing textures internally?
One typical solution is to keep track of which textures you need at a given camera position or time frame, and only load those when you need them (as opposed to loading every single texture when the app starts). You will need a "manager" that controls the loading, unloading, and binding of each texture (e.g. a container that associates a string, the name of the texture, with the integer texture name that you pass to glBindTexture).
Another option is to reduce the overall quality/size of the textures you are using.
It would seem that this memory is used by OpenGL,
Yes
which is doing its own texture management internally.
No, not texture management. It just needs to keep the data somewhere. On modern systems the GPU is shared by several processes running simultaneously, and not all of the data may fit into fast GPU memory, so the OpenGL implementation must be able to swap data out. The fast GPU memory is not storage; it's just another cache level, just like system memory is a cache for system storage.
Also, GPUs may crash, and modern drivers reset them in situ without the user noticing. For this they need a full copy of the data as well.
Is it possible to prevent OpenGL from managing textures internally?
No, because this would either be tedious to do or break things. But what you can do is load only the textures you really need for drawing a given scene.
If you look through my writings about OpenGL, you'll notice that for years I have been telling people not to write silly things like "initGL" functions. Put everything into your drawing code. You'll go through a drawing scheduling phase anyway (you must sort translucent objects far-to-near, do frustum culling, etc.). That gives you the opportunity to check which textures you need, and to load them. You can even go as far as loading only the lower-resolution mipmap levels, so that a scene is initially shown with low detail, and load the higher-resolution mipmaps in the background; this of course requires the minimum and maximum mip levels to be set appropriately, as either texture or sampler parameters.
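To get a feel for how much the "load only the lower-resolution mipmap levels first" trick saves, here is a small sketch that adds up the bytes of a mip chain starting at a given level. It assumes an uncompressed texture with a fixed number of bytes per texel; the function name is hypothetical. In OpenGL the loaded range would then be restricted with texture parameters such as GL_TEXTURE_BASE_LEVEL and GL_TEXTURE_MAX_LEVEL, which is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Bytes used by mip levels firstLevel..last of an uncompressed texture.
// Level 0 is the full-size image; each further level halves both dimensions
// (rounding down, never below 1) until the 1x1 level is reached.
std::size_t mipChainBytes(std::size_t width, std::size_t height,
                          std::size_t bytesPerTexel, unsigned firstLevel) {
    assert(width > 0 && height > 0 && bytesPerTexel > 0);
    std::size_t total = 0;
    unsigned level = 0;
    for (;;) {
        if (level >= firstLevel) {
            total += width * height * bytesPerTexel;
        }
        if (width == 1 && height == 1) {
            break;  // the 1x1 level is the last one
        }
        width = std::max<std::size_t>(1, width / 2);
        height = std::max<std::size_t>(1, height / 2);
        ++level;
    }
    return total;
}
```

For a 1024x1024 RGBA8 texture, `mipChainBytes(1024, 1024, 4, 0)` is about 5.6 MB, while starting at level 2 (256x256) with `mipChainBytes(1024, 1024, 4, 2)` needs about 0.35 MB, so a scene can come up with a fraction of the memory and fill in detail later.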

How to speed up offscreen OpenGL rendering with large textures on Win32?

I'm developing some C++ code that can do some fancy 3D transition effects between two images, for which I thought OpenGL would be the best option.
I start with a DIB section and set it up for OpenGL, and I create two textures from input images.
Then for each frame I draw just two OpenGL quads, with the corresponding image texture.
The DIB content is then saved to file.
For example one effect is to locate the two quads (in 3d space) like two billboards, one in front of the other(obscuring it), and then swoop the camera up, forward and down so you can see the second one.
My input images are 1024x768 or so and it takes a really long time to render (100 milliseconds) when the quads cover most of the view. It speeds up if the camera is far away.
I tried rendering each image quad as hundreds of individual tiles, but it takes just the same time, it seems like it depends on the number of visible textured pixels.
I assumed OpenGL could do zillions of polygons a second. Is there something I am missing here?
Would I be better off using some other approach?
Thanks in advance...
Edit :
The GL strings show up for the DIB version as :
Vendor : Microsoft Corporation
Version: 1.1.0
Renderer : GDI Generic
The Onscreen version shows :
Vendor : ATI Technologies Inc.
Version : 3.2.9756 Compatibility Profile Context
Renderer : ATI Mobility Radeon HD 3400 Series
So I guess I'll have to use FBOs. I'm a bit confused as to how to get the rendered data out of the FBO and into a DIB; any pointers (pun intended) on that?
It sounds like rendering to a DIB is forcing the rendering to happen in software. I'd render to a frame buffer object, and then extract the data from the generated texture. Gamedev.net has a pretty decent tutorial.
Keep in mind, however, that graphics hardware is oriented primarily toward drawing on the screen. Capturing rendered data will usually be slower than displaying it, even when you do get the hardware to do the rendering -- though it should still be quite a bit faster than software rendering.
Edit: Dominik Göddeke has a tutorial that includes code for reading back texture data to CPU address space.
One problem with your question:
You provided no actual rendering/texture generation code.
Would I be better off using some other approach?
The simplest thing you can do is to make sure your texture dimensions are powers of two, i.e. instead of 1024x768 use 1024x1024 and use only part of that texture. Explanation: although most modern hardware supports non-power-of-two textures, they are sometimes treated as a "special case", and using such a texture MAY cause a performance drop on some hardware.
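A minimal sketch of the padding step described above: round each dimension up to a power of two and remember which part of the texture the image actually occupies. The struct and function names are hypothetical; the returned `uMax`/`vMax` are the texture coordinates to use at the far corner of the quad instead of 1.0.

```cpp
#include <cassert>
#include <cstdint>

// Smallest power of two that is >= v.
std::uint32_t nextPowerOfTwo(std::uint32_t v) {
    assert(v > 0 && v <= 0x80000000u);
    std::uint32_t p = 1;
    while (p < v) {
        p <<= 1;
    }
    return p;
}

struct PaddedTexture {
    std::uint32_t width;   // padded texture width
    std::uint32_t height;  // padded texture height
    float uMax;            // fraction of the width covered by the image
    float vMax;            // fraction of the height covered by the image
};

PaddedTexture padToPowerOfTwo(std::uint32_t imageWidth, std::uint32_t imageHeight) {
    const std::uint32_t w = nextPowerOfTwo(imageWidth);
    const std::uint32_t h = nextPowerOfTwo(imageHeight);
    return PaddedTexture{w, h,
                         static_cast<float>(imageWidth) / static_cast<float>(w),
                         static_cast<float>(imageHeight) / static_cast<float>(h)};
}
```

For the 1024x768 images in the question, `padToPowerOfTwo(1024, 768)` gives a 1024x1024 texture with `uMax` of 1.0 and `vMax` of 0.75.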
I assumed OpenGL could do zillions of polygons a second. Is there something I am missing here?
Yes, you're missing one important thing. There are a few things that limit GPU performance:
1. System memory to video memory transfer rate (probably not your case; it matters only for dynamic textures/geometry, when data changes every frame).
2. Computation cost (if you write a shader with heavy computations, it will be slow).
3. Fill rate (how many pixels the program can put on screen per second). AFAIK it depends on memory speed on modern GPUs.
4. Vertex processing rate (not your case): how many vertices the GPU can process per second.
5. Texture read rate (how many texels per second the GPU can read). On modern GPUs it depends on GPU memory speed.
6. Texture read caching (not your case): in a fragment shader you can read a texture a few hundred times per pixel with little performance drop IF the coordinates are very close to each other (i.e. almost the same texel in each read), because results are cached. But performance will drop significantly if you try to access 100 randomly located texels for every pixel.
All those characteristics are hardware dependent.
For example, depending on the hardware, you may be able to render 1500000 polygons per frame (if they take up a small amount of screen space), but you can bring the fps to its knees with 100 polygons if each polygon fills the entire screen, uses alpha blending, and is textured with a highly detailed texture.
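The polygon-count versus screen-coverage point can be checked with back-of-the-envelope arithmetic. The sketch below counts shaded fragments per frame and converts that to a frame time for a given fill rate. The function names, the 2-pixel polygon size, and the 1e9 fragments-per-second fill rate are made-up illustration values, not measurements of any real GPU.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Fragments shaded per frame when every polygon covers pixelsPerPolygon pixels
// (overlapping polygons are all counted, i.e. overdraw is included).
std::uint64_t fragmentsPerFrame(std::uint64_t polygonCount, std::uint64_t pixelsPerPolygon) {
    assert(pixelsPerPolygon == 0 ||
           polygonCount <= std::numeric_limits<std::uint64_t>::max() / pixelsPerPolygon);
    return polygonCount * pixelsPerPolygon;
}

// Time spent on fill alone, for a hypothetical sustained fill rate.
double frameMilliseconds(std::uint64_t fragments, double fragmentsPerSecond) {
    assert(fragmentsPerSecond > 0.0);
    return static_cast<double>(fragments) / fragmentsPerSecond * 1000.0;
}
```

With these numbers, 1500000 polygons of 2 pixels each produce 3 million fragments, while 100 full-screen polygons at 1024x768 produce about 78.6 million, which at the assumed fill rate is roughly 79 ms per frame before any blending or texture cost is counted.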
If you think about it, you may notice that there are a lot of video cards that can draw a landscape, but the fps drops when you're doing framebuffer effects (like blur, HDR, etc.).
Also, you may get a performance drop with textured surfaces if you have a built-in GPU. When I fried the PCIe slot on my previous motherboard, I had to work with the built-in GPU (NVidia 6800 or something). The results weren't pleasant. While the GPU supported shader model 3.0 and could run relatively computationally expensive shaders, the fps dropped rapidly every time there was a textured object on screen. This obviously happened because the built-in GPU used part of system memory as video memory, and the transfer rates of "normal" GPU memory and system memory are different.

The legacy device context is too coarse

I have a Process Control system. It has a huge 2D workspace where all the logic is laid out.
The 2D workspace is a coordinate system.
You usually do not see the whole workspace at once, but rather a zoomed-in part of it that focuses on some part of the controlled process. Such subsystem views are bookmarked as predefined named images (Power Generator1, Diesel Generator, Main lubrication pump etc).
This workspace interacts with many legacy MFC software components that individually contribute graphics onto the workspace (the device context is passed around to all contributors).
Now, one of the software components renders AutoCAD drawings onto the surface. However, the resolution of the device context is not sufficient for the details of this job. The device context logical resolution is unfortunately dictated by our own coordinate system, which at high zoom levels is quite different from the device units (pixels).
For example, a line drawn using
DC.MoveTo(1,1);
DC.LineTo(1,2);
.... will actually, even though it's drawn directly onto the device context by increment of just one logical unit, cover quite some distance on the screen. But the width of the line would still be only one device pixel. A circle looks high res, but its data (center point and radius) can only be done in coarse increments.
I have considered the following options:
* When a predefined image is loaded and displayed, create a device context with a better suited resolution. The problem would then be that the other graphic providers interact with it using old logical units, which when used against the new DC would result in way too small and displaced graphical elements.
I wonder if I can create some DC wrapper that accepts both kinds of coordinates through different APIs, which are then translated into high res coordinates internally.
Is it possible to have two DCs with different logical/device unit ratio? And render them both to screen?
I mentioned that a circle is rendered beautifully with one pixel width even though its placement and radius are restricted. Vertical lines are also rendered beautifully, even though the end points can only be given in coarse coordinates. This leads me to believe that it is technically possible to draw in an area that in DC logical coordinates could only be described in decimals.
Does anybody have any idea about what to do?
You need to scale your model, not the device context.
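One possible reading of "scale your model" is to keep the drawing data in fractional model coordinates and convert to device pixels yourself, so the coarse logical units of the DC never limit placement. The sketch below assumes that reading; the `ViewTransform` and `modelToDevice` names are hypothetical, and the y axis is not flipped. The resulting pixel coordinates could then be passed to `MoveTo`/`LineTo` on a DC whose mapping mode is `MM_TEXT` (one logical unit per pixel).

```cpp
#include <cassert>
#include <cmath>

// Describes the currently visible part of the workspace.
struct ViewTransform {
    double originX;        // model x shown at device pixel column 0
    double originY;        // model y shown at device pixel row 0
    double pixelsPerUnit;  // zoom: device pixels per model unit
};

struct DevicePoint { long x, y; };

// Converts a fractional model coordinate to the nearest device pixel.
DevicePoint modelToDevice(const ViewTransform& view, double modelX, double modelY) {
    assert(view.pixelsPerUnit > 0.0);
    return DevicePoint{std::lround((modelX - view.originX) * view.pixelsPerUnit),
                       std::lround((modelY - view.originY) * view.pixelsPerUnit)};
}
```

At a zoom of 100 pixels per unit, the model points (1, 1) and (1, 2) from the question land 100 pixels apart, and an intermediate point such as (1.25, 1.5) maps to its own pixel (125, 150) instead of snapping to a whole unit.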
You could draw the high-def image to another DC in a new window and place that window over your low-res-drawing. Of course you have to handle clipping yourself.