I am creating a GUI program that will run 24/7. I couldn't find much online on the subject, but is OpenGL stable enough to run 24/7 for weeks on end without leaks, crashes, etc?
Should I have any concerns or anything to look into before delving too deep into using OpenGL?
I know that OpenGL and DirectX are primarily used for games or other programs that aren't run for very long stretches at a time. Hopefully someone here has some experience with this or knowledge on the subject. Thanks.
EDIT: Sorry for the lack of detail. This will only be doing 2D rendering, and nothing too heavy. What I have now (which will be similar to production) already runs at a stable 900-1000 FPS on my i5 laptop with a Radeon 6850m.
Going into OpenGL just for making a GUI sounds insane. You should be more worried about what language you use if you are concerned about things like memory leaks. Remember that in C/C++ you manage memory yourself.
Furthermore, do you really need the GUI to be running 24/7? If you are making a service sort of application, you might as well leave it in the background and make a second application which provides the GUI. These two applications would communicate via some IPC (sockets?). That's how this sort of thing usually works, rather than having a window open all the time.
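A minimal sketch of that service/GUI split, assuming a POSIX `socketpair` standing in for the real IPC channel; the two ends are simulated with threads in one process, and all names (`query_service`, the "status" protocol) are made up for illustration:

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <thread>
#include <sys/socket.h>
#include <unistd.h>

// The GUI end asks the service a question only when it needs to redraw.
static std::string query_service(int fd, const std::string& request) {
    ::send(fd, request.c_str(), request.size(), 0);
    char buf[256] = {0};
    ssize_t n = ::recv(fd, buf, sizeof(buf) - 1, 0);
    return std::string(buf, n > 0 ? static_cast<size_t>(n) : 0);
}

int run_demo() {
    int fds[2];
    if (::socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return 1;

    // "Service" end: owns the long-running state, answers one query.
    std::thread service([fd = fds[0]] {
        char buf[256] = {0};
        ssize_t n = ::recv(fd, buf, sizeof(buf) - 1, 0);
        if (n > 0 && std::strncmp(buf, "status", 6) == 0) {
            const char* reply = "uptime=12345s";
            ::send(fd, reply, std::strlen(reply), 0);
        }
        ::close(fd);
    });

    // "GUI" end: connects, asks, displays, disconnects.
    std::string answer = query_service(fds[1], "status");
    ::close(fds[1]);
    service.join();
    return answer == "uptime=12345s" ? 0 : 1;
}
```

In a real deployment the two ends would be separate processes talking over a named AF_UNIX socket or TCP loopback, so the GUI can come and go while the service keeps running.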
In the end, memory leaks are not caused by some graphical library, but by the programmer writing bad code. The library should be the last item on your list of possible reasons for memory leaks/crashes.
I work for a company that makes (windows based) quality assurance software (machine vision) using Delphi.
The main operator screen shows the camera images at up to 20fps (2 x 10fps) with an OpenGL overlay, and has essentially unbounded uptime (longest uptimes close to a year; longer is hard due to power-downs for maintenance). Higher-speed cameras have their display rates throttled.
I would avoid integrated video from Intel for a while longer, though. Since the i5 it meets our minimal requirements (non-power-of-2 textures, mostly), but the initial drivers were bad, and while they have improved, there are still occasional stability and reliability problems.
Related
I'm looking into whether it's better for me to stay with OpenGL or consider a Vulkan migration for intensive bottlenecked rendering.
However I don't want to make the jump without being informed about it. I was looking up what benefits Vulkan offers me, but with a lot of googling I wasn't able to come across exactly what gives performance boosts. People will throw around terms like "OpenGL is slow, Vulkan is way faster!" or "Low power consumption!" and say nothing more on the subject.
Because of this, it makes it difficult for me to evaluate whether or not the problems I face are something Vulkan can help me with, or if my problems are due to volume and computation (and Vulkan would in such a case not help me much).
I'm assuming Vulkan does not magically make things in the pipeline faster (as in shading in triangles is going to be approximately the same between OpenGL and Vulkan for the same buffers and uniforms and shader). I'm assuming all the things with OpenGL that cause grief (ex: framebuffer and shader program changes) are going to be equally as painful in either API.
There are a few things off the top of my head that I think Vulkan offers, based on reading through countless things online (and I'm sure this is not all of the advantages, nor am I certain these are even true):
Texture rendering without [much? any?] binding (or rather, a better version of 'bindless textures'). I noticed a significant performance boost when I switched to bindless textures, but this might not even be worth mentioning if bindless textures already effectively do this, so I'm not sure whether Vulkan adds anything here
Reduced CPU/GPU communication by composing some kind of command list that you can execute on the GPU without needing to send much data
Being able to interface with the GPU in a multithreaded way that OpenGL somehow can't
However I don't know exactly what cases people run into in the real world that demand these, and how OpenGL limits these. All the examples so far online say "you can run faster!" but I haven't seen how people have been using it to run faster.
Where can I find information that answers this question? Or do you know some tangible examples that would answer this for me? Maybe a better question would be where are the typical pain points that people have with OpenGL (or D3D) that caused Vulkan to become a thing in the first place?
An example of an answer that would not be satisfying would be a response like
You can multithread and submit things to Vulkan quicker.
but a response that would be more satisfying would be something like
In Vulkan you can multithread your submissions to the GPU. In OpenGL you can't do this because you rely on the implementation to do the appropriate locking and placing fences on your behalf which may end up creating a bottleneck. A quick example of this would be [short example here of a case where OpenGL doesn't cut it for situation X] and in Vulkan it is solved by [action Y].
The last paragraph above may not be accurate whatsoever, but I was trying to give an example of what I'd be looking for without trying to write something egregiously wrong.
Vulkan really has four main advantages in terms of run-time behavior:
Lower CPU load
Predictable CPU load
Better memory interfaces
Predictable memory load
Specifically, lower GPU load isn't one of the advantages; the same content using the same GPU features will have very similar GPU performance with both APIs.
In my opinion it also has many advantages in terms of developer usability - the programmer's model is a lot cleaner than OpenGL, but there is a steeper learning curve to get to the "something working correctly" stage.
Let's look at each of the advantages in more detail:
Lower CPU load
The lower CPU load in Vulkan comes from multiple areas, but the main ones are:
The API encourages up-front construction of descriptors, so you're not rebuilding state on a draw-by-draw basis.
The API is asynchronous and can therefore move some responsibilities, such as tracking resource dependencies, to the application. A naive application implementation here will be just as slow as OpenGL, but the application has more scope to apply high level algorithmic optimizations because it can know how resources are used and how they relate to the scene structure.
The API moves error checking out to layer drivers, so the release drivers are as lean as possible.
The API encourages multithreading, which is always a great win (especially on mobile where e.g. four threads running slowly will consume a lot less energy than one thread running fast).
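The multithreading point can be shown with a CPU-only sketch (no real Vulkan calls; `Cmd` and `record_parallel` are invented for illustration). Each thread records its slice of the scene into its own "command buffer" with no shared mutable state, and a single thread concatenates and "submits" them in a fixed order, which is roughly the pattern Vulkan's secondary command buffers enable:

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Toy stand-in for a recorded GPU command.
struct Cmd { int drawId; };

std::vector<Cmd> record_parallel(int numDraws, int numThreads) {
    // One private "command buffer" per recording thread.
    std::vector<std::vector<Cmd>> perThread(numThreads);
    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        workers.emplace_back([&, t] {
            // Record this thread's share of the draws, lock-free.
            for (int i = t; i < numDraws; i += numThreads)
                perThread[t].push_back(Cmd{i});
        });
    }
    for (auto& w : workers) w.join();

    // Single-threaded "queue submit": concatenate in a fixed order.
    std::vector<Cmd> submitted;
    for (auto& buf : perThread)
        submitted.insert(submitted.end(), buf.begin(), buf.end());
    return submitted;
}
```

In OpenGL the equivalent work funnels through one context-bound thread; here every thread records independently and only submission is serialized.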
Predictable CPU load
OpenGL drivers do various kinds of "magic", either for performance (specializing shaders based on state only known late at draw time), or to maintain the synchronous rendering illusion (creating resource ghosts on the fly to avoid stalling the pipeline when the application modifies a resource which is still referenced by a pending command).
The Vulkan design philosophy is "no magic". You get what you ask for, when you ask for it. Hopefully this means no random slowdowns because the driver is doing something you didn't expect in the background. The downside is that the application takes on the responsibility for doing the right thing ;)
Better memory interfaces
Many parts of the OpenGL design are based on distinct CPU and GPU memory pools, which require a programming model that gives the driver enough information to keep them in sync. Most modern hardware can do better with hardware-backed coherency protocols, so Vulkan enables a model where you can just map a buffer once, then modify it ad hoc and be guaranteed that the "other process" will see the changes. No more "map" / "unmap" / "invalidate" overhead (provided the platform supports coherent buffers, of course; it's still not universal).
Secondly Vulkan separates the concept of the memory allocation and how that memory is used (the memory view). This allows the same memory to be recycled for different things in the frame pipeline, reducing the amount of intermediate storage you need allocated.
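A toy model of that allocation/view separation, with all names invented: two transient render targets whose lifetimes don't overlap within a frame can alias one backing allocation, so the footprint is the maximum of the two sizes rather than their sum.

```cpp
#include <cassert>
#include <cstddef>
#include <algorithm>

// One backing allocation, sized for the largest transient resource
// bound to it, not for the sum of all of them.
struct Arena {
    std::size_t capacity = 0;
    void require(std::size_t bytes) { capacity = std::max(capacity, bytes); }
};

std::size_t frame_footprint(std::size_t shadowBytes, std::size_t bloomBytes) {
    Arena scratch;
    scratch.require(shadowBytes); // pass 1: shadow map lives here...
    scratch.require(bloomBytes);  // pass 2: ...then bloom reuses the block
    return scratch.capacity;      // max(a, b), not a + b
}
```

In Vulkan terms, the arena corresponds to one `VkDeviceMemory` allocation and the two passes to images bound at the same offset; the sketch only models the sizing arithmetic.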
Predictable memory load
Related to the "no magic" comment for CPU performance, Vulkan won't generate random resources (e.g. ghosted textures) on the fly to hide application problems. No more random fluctuations in resource memory footprint, but again the application has to take on the responsibility to do the right thing.
This is at risk of being opinion based. I suppose I will just reiterate the Vulkan advantages that are written on the box, and hopefully uncontested.
You can disable validation in Vulkan. It obviously uses less CPU (or battery/power/noise) that way. In some cases this can be significant.
OpenGL has poorly defined multi-threading. Vulkan has well-defined multi-threading in the specification. That means you do not immediately lose your mind trying to code with multiple threads, as well as getting better performance if a single thread would otherwise be a CPU bottleneck.
Vulkan is more explicit; it does not (or tries to not) expose big magic black boxes. That means e.g. you can do something about micro-stutter and hitching, and other micro-optimizations.
Vulkan has a cleaner interface to windowing systems. No more odd contexts and default framebuffers. Vulkan does not even require a window to draw (or it can achieve that without weird hacks).
Vulkan is a cleaner and more conventional API. For me that means it is easier to learn (despite the other things) and more satisfying to use.
Vulkan takes shaders as binary intermediate code (SPIR-V), which OpenGL historically did not. That should mean faster compilation of such code.
Vulkan has mobile GPUs as first class citizen. No more ES.
Vulkan has open-source, conventional (GitHub) public trackers. That means you can improve the ecosystem without jumping through hoops. E.g. you can improve or implement a validation check for an error that often trips you up, or you can improve the specification so it makes sense to people who are not insiders.
I have written software in Qt for a company, which works with a hardware device and shows realtime plots. I have problems with its speed and I need to know which parts are CPU-intensive, but there are lots of threads and events, and when I use Valgrind the application gets so slow that serial handlers won't work as expected, timeouts happen, and therefore I can't find out what's going on. The code base is huge and simplification is almost impossible, because things depend on each other. I'm developing on macOS but the application runs on Linux. I wanted to know if there is a faster profiler than Valgrind out there, preferably one that works on macOS, as Valgrind doesn't really work on macOS (it says it does, but there are so many problems that it's just not practical). Thanks in advance.
P.S. If you need any more info, please comment instead of downvoting. I can't offer the code as it is proprietary, but I can say it is well written, and I'm a fairly experienced C++ programmer.
I verified this with two applications using OpenGL via a QGLWidget. If screen updates are very frequent, say 30 fps, and/or the resolution is high, the CPU usage of one of the cores skyrockets. I'm looking for a way to fix this, and/or to verify whether it happens on Windows as well.
In my experience QGLWidget itself is a very efficient thin wrapper around GL and your windowing system; if you have high CPU usage using it, well chances are that you'd have high CPU usage using any other way of implementing an OpenGL app too.
If you have high CPU usage using OpenGL, chances are either:
You're falling back on a software OpenGL implementation (i.e. Mesa); e.g. Debian will do this if you don't install any graphics device drivers.
You're using old-school immediate-mode OpenGL: glBegin, ...vertices..., glEnd. Switch to VBOs instead.
The fact you mention display resolution as a factor rather suggests the former problem.
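As a CPU-side illustration of the VBO point: instead of re-sending every vertex each frame through glBegin/glVertex/glEnd, you build the interleaved vertex data once and hand it to the GL in a single `glBufferData` upload. The `Vertex` layout and `build_quad` below are made up for the sketch, and only the data-building half is shown, since the actual upload needs a live GL context:

```cpp
#include <cassert>
#include <vector>

// Interleaved position (x, y) + color (r, g, b) vertex: the layout you
// would later describe to glVertexAttribPointer, one attribute at a
// time, with stride sizeof(Vertex).
struct Vertex {
    float x, y;
    float r, g, b;
};

// Two triangles forming a quad, built once instead of per frame.
std::vector<Vertex> build_quad(float size) {
    return {
        {0.0f, 0.0f, 1, 0, 0}, {size, 0.0f, 0, 1, 0}, {size, size, 0, 0, 1},
        {0.0f, 0.0f, 1, 0, 0}, {size, size, 0, 0, 1}, {0.0f, size, 1, 1, 1},
    };
}
```

With a context bound, the upload would be a single `glBufferData(GL_ARRAY_BUFFER, verts.size() * sizeof(Vertex), verts.data(), GL_STATIC_DRAW)`, after which each frame is just one draw call.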
You need to get a profiler, profile your code, and see where the bottleneck is. Since your program eats CPU resources (and not GPU), this should be fairly easy.
As far as I know, "AQTime 7 Standard" (Windows) is currently available for free. Or you could use gprof, depending on your toolkit/platform.
One very possible scenario (aside from a software OpenGL fallback) is that you are using dynamic memory allocation too frequently, or running a debug build. Immediate mode can also be a problem if you have 100,000+ polygons per frame.
I have seen a few GL implementations that were terrible at minimising host CPU usage. There appear to be plenty of situations where the CPU will busy-wait while the GPU draws. Often simply turning on vertical sync in the card settings will cause your app to draw less often and still take up just as much CPU.
Unfortunately there is little you can do about this yourself, save for limiting how often your app draws.
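Limiting how often the app draws can be as simple as sleeping until the next frame deadline instead of letting the driver spin; `run_frames` and the target rate below are illustrative, with the draw call left as a comment:

```cpp
#include <cassert>
#include <chrono>
#include <thread>

// Cap the redraw rate so the app hands the CPU back to the OS between
// frames rather than busy-waiting. Returns elapsed wall time in ms.
long run_frames(int frames, int targetFps) {
    using clock = std::chrono::steady_clock;
    const auto budget = std::chrono::microseconds(1000000 / targetFps);
    const auto start = clock::now();
    auto next = start;
    for (int i = 0; i < frames; ++i) {
        // ... draw() would go here ...
        next += budget;                      // fixed deadline, not "now +
        std::this_thread::sleep_until(next); // budget", so drift doesn't
    }                                        // accumulate
    return std::chrono::duration_cast<std::chrono::milliseconds>(
               clock::now() - start).count();
}
```

Scheduling against a cumulative deadline (rather than sleeping a fixed amount after each frame) keeps the average rate at the target even if individual frames vary.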
Recently, I have been spending a lot of my time researching GPUs, and I have come across several articles talking about how PC games are having a hard time staying ahead of the curve compared to console games due to limitations in the APIs. For example, on the Xbox 360, it is my understanding that games run in kernel mode, and that because the hardware will always be the same, games can be programmed "closer to the metal" and the DirectX API has less abstraction. On PC, however, making the same number of draw calls with DirectX or OpenGL may take more than twice as long as on a console, due to switching into kernel mode and more layers of abstraction. I am interested in hearing possible solutions to this problem.
I have heard of a few solutions, such as programming directly against the hardware, but while (from what I understand) ATI has released the specifications of their low-level API, nVidia keeps theirs secret, so that wouldn't work too well, not to mention the added development time of maintaining different profiles.
Would programming an entire "software rendering" solution in OpenCL and running that on a GPU be any better? My understanding is that games with a lot of draw calls are CPU-bound and that the calls are single-threaded (on PC, that is), so is OpenCL a viable option?
So the question is:
What are possible methods to increase the efficiency of, or even remove the need for, graphics APIs such as OpenGL and DirectX?
The general solution is not to make as many draw calls. Texture atlases via array textures, instancing, and various other techniques make this possible.
Or to just use the fact that modern computers have a lot more CPU performance than consoles. Or even better, make yourself GPU bound. After all, if your CPU is your bottleneck, then that means you have GPU power to spare. Use it.
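One way to see why batching cuts draw calls: if you sort your draw items by texture (or any other expensive state), each run of identical state collapses into a single bind-plus-draw. The `Sprite` type and counting function below are invented to show the idea; in a real renderer each batch would also append its quads to one shared vertex buffer:

```cpp
#include <cassert>
#include <algorithm>
#include <vector>

// Hypothetical sprite: which texture it samples and where it goes.
struct Sprite { int textureId; float x, y; };

// After sorting, each contiguous run of equal textureIds is one batch,
// i.e. one texture bind and one draw call instead of one per sprite.
int count_batches(std::vector<Sprite> sprites) {
    std::sort(sprites.begin(), sprites.end(),
              [](const Sprite& a, const Sprite& b) {
                  return a.textureId < b.textureId;
              });
    int batches = 0, lastTex = -1;
    for (const auto& s : sprites) {
        if (s.textureId != lastTex) { ++batches; lastTex = s.textureId; }
        // else: same texture, append to the current batch's vertex data
    }
    return batches;
}
```

Four sprites across two textures become two draw calls instead of four; with an atlas the same trick merges even further, since many logical images share one textureId.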
OpenCL is not a "solution" to anything related to this. OpenCL has no access to the many things one would need in order to actually use a GPU for rendering. To use OpenCL for graphics, you would have to forgo the GPU's rasterizer/clipper, its specialized buffers for transferring information from stage to stage, the post-T&L cache, and the blending/depth-comparison/stencil/etc. hardware. All of that is fixed-function, extremely fast and specialized. And completely unavailable to OpenCL.
And even then, it doesn't actually stop you from being CPU-bound. You still have to marshal what you're rendering and so forth. And you probably won't have access to the graphics FIFO, so you'll have to find another way to feed your shaders.
Or, to put it another way, this is a "problem" that doesn't need solving.
If you try to write a renderer in OpenCL, you will end up with something resembling OpenGL and DirectX. You will also most likely end up with something much slower than these APIs which were developed by many experts over many years. They are specialized to handle efficient rasterizing and use internal hooks not available to OpenCL. It could be a fun project, but definitely not a useful one.
Nicol Bolas already gave you some good techniques to increase the load of the GPU relative to the CPU. The final answer is, of course, that the best technique will depend on your specific domain and constraints. For example, if your rendering needs call for lots of pixel overdraw with complicated shaders and lots of textures, the CPU will not be the bottleneck. However, the most important general rule with modern hardware is to limit the number of OpenGL calls through better batching.
For example, on the Xbox 360, it is my understanding that games run in kernel mode, and that because the hardware will always be the same, games can be programmed "closer to the metal" and the DirectX API has less abstraction. On PC, however, making the same number of draw calls with DirectX or OpenGL may take more than twice as long as on a console, due to switching into kernel mode and more layers of abstraction.
The benefits of close-to-the-metal operation on consoles are largely compensated for on PCs by their much greater CPU performance and available memory. Add to this that the HDDs of consoles are nowhere near as fast as modern PC ones (SATA-1 vs. SATA-3, or even just PATA), and that many games read their content from an optical drive, which is slower still.
The PS3, for example, offers only 256MiB of memory for game logic and another 256MiB of RAM for graphics, and you don't get more than that to work with. The Xbox 360 offers 512MiB of unified RAM, so you have to squeeze everything into that. Now compare this with a low-end PC, which easily comes with 2GiB of RAM for the program alone. And even the cheapest graphics cards offer at least 512MiB of RAM. A gamer's machine will have several GiB of RAM, and the GPU will offer somewhere between 1GiB and 2GiB.
This severely limits the possibilities for a game developer, and many PC gamers mourn that so many games are "consoleish", when their PCs could do so much more.
Sometimes code can utilize device drivers up to the point where the system is unresponsive.
Lately I optimized some Win32/VC++ code which made the system almost unresponsive. The CPU usage, however, was very low. The reason was thousands of creations and destructions of GDI objects (pens, brushes, etc.). Once I refactored the code to create all objects only once, the system became responsive again.
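The refactor described can be sketched as a simple handle cache; `FakePen` stands in for a real GDI handle (actual `CreatePen`/`DeleteObject` calls need Windows), and all names are invented:

```cpp
#include <cassert>
#include <map>

// Stand-in for an expensive OS handle (a GDI pen, brush, font, ...).
struct FakePen { int color; };

// Cache handles by key instead of creating and destroying them on
// every paint; the driver-side churn disappears the same way.
class PenCache {
public:
    const FakePen& get(int color) {
        auto it = cache_.find(color);
        if (it == cache_.end()) {
            ++creations_; // only on first use of this color
            it = cache_.emplace(color, FakePen{color}).first;
        }
        return it->second;
    }
    int creations() const { return creations_; }
private:
    std::map<int, FakePen> cache_;
    int creations_ = 0;
};

int paint_many(int frames) {
    PenCache pens;
    for (int f = 0; f < frames; ++f) {
        pens.get(0xFF0000); // the same two pens every frame...
        pens.get(0x0000FF);
    }
    return pens.creations(); // ...are created exactly once each
}
```

A thousand paints now cost two object creations instead of two thousand create/destroy round-trips through the driver.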
This leads me to the question: Is there a way to measure CPU/IO usage of device drivers (GPU/disk/etc) for a given program / function / line of code?
You can use various tools from the SysInternals Utilities (now a Microsoft product, see http://technet.microsoft.com/en-us/sysinternals/bb545027) to get a basic idea before jumping in. In your case, Process Explorer (procexp) and Process Monitor (procmon) do a decent job. They can be used to get a basic idea of what type of slowness it is before doing a profiling drill-down.
Then you can use xperf http://msdn.microsoft.com/en-us/performance/default to drill down. With the correct setup, this tool can lead you to the exact function that causes the slowness without injecting profiling code into your existing program. There's a PDC video about how to use it http://www.microsoftpdc.com/2009/CL16 and I highly recommend this tool. In my own experience, it's always better to observe with procexp/procmon first, then target your suspects with xperf, because xperf can generate an overwhelming amount of information if not filtered in a smart way.
In certain hard cases involving lock contention, Debugging Tools for Windows (WinDbg) will be very handy, and there are dedicated books on its usage. These books typically talk about hang detection, and quite a few of the techniques can be used to detect slowness, too (e.g. !runaway).
Maybe you could use ETW for this? I'm not sure it will show you which line causes what, but it should give you a good overall picture of how your app is doing.
To find the CPU/memory/disk usage of the program in real time, you can use the Resource Monitor and Task Manager programs that come with Windows. You can find the amount of time that a block of code takes relative to other blocks by printing out the system time. Remember not to do too much monitoring at once, because that can throw off your measurements.
If you know how much CPU time the program takes and what percentage of that time the block of code accounts for, then you can estimate approximately how much CPU time the block of code takes.
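That estimate is just a product; a tiny helper (name invented for illustration) makes the arithmetic concrete:

```cpp
#include <cassert>

// If the whole program used totalCpuMs of CPU time and the block
// accounted for pct percent of it, the block's share is the product.
long block_cpu_ms(long totalCpuMs, double pct) {
    return static_cast<long>(totalCpuMs * (pct / 100.0));
}
```

E.g. a program that used 2000 ms of CPU, with a block responsible for 25% of it, spent roughly 500 ms of CPU in that block.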