Sometimes we can trigger a "CPU Crash" with code like:
*(int*)0=1234;
Correspondingly, any simple code can trigger a "GPU Crash" (may need some DirectX interface)?
Why I need this is because I want to learn about Unreal's(UE4/UE5) GPU crash processing flow.
Any code with UE is expected.
THX
It is very difficult to crash a modern GPU with decent drivers. Generally they will run until task completion. If the task is extremely large or takes a very long time, the OS will assume they have simply stopped responding. After this occurs, there are a few options:
Crash: Very rare
Reset the driver: This will reset the DirectX state and crash
whatever program is using it (unless it is checking for that, but
that seems like a waste)
Abort the operation: This is a feature on
more modern hardware and drivers, but they can simply stop the
program.
Run uninterrupted: This occurs only if the GPU is in use by NOTHING else.
Do some heavy calculation with no end (fractals) and see what happens.
You can trigger a 'Timeout Device Recovery' (TDR) which is very similar to a GPU crash.
Using a Visual Studio Developer Command Prompt running as admin:
dxcap -forcetdr
Keep in mind this triggers a TDR for all running applications.
Related
I have been working for 2.5 years on a personal flight sim project on my leisure time, written in C++ and using Opengl on a windows 7 PC.
I recently had to move to windows 10. Hardware is exactly the same. I reinstalled Code::blocks.
It turns out that on first launch of my project after the system start, performance is OK, similar to what I used to see with windows 7. But, the second, third, and all next launches give me lower performance, with significant less fluidity in frame rate compared to the first run, detectable by eye. This never happened with windows 7.
Any time I start my system, first run is fast, next ones are slower.
I had a look at the task manager while doing some runs. The first run is handled by one of the 4 cores of my CPU (iCore5-6500) at approximately 85%. For the next runs, the load is spread accross the 4 cores. During those slower runs on 4 cores, I tried to modify the affinity and direct my program to only one core without significant improvement in performance. The selected core was working at full load, though.
My C++ code doesn't explicitly use any thread function at this stage. From my modest programmer's point of view, there is only one main thread run in the main(). On the task manager, I can see that some 10 to 14 threads are alive when my program runs. I guess (wrongly?) that they are implicitly created by the use of joysticks, track ir or other communication task with GPU...
Could it come from memory not being correctly freed when my program stops? I thought windows would free it properly, even if I forgot some 'delete' after using 'new'.
Has anyone encountered a similar situation? Any explanation coming to your minds?
Any suggestion to better understand these facts? Obviously, my ultimate goal is to have a consistent performance level whatever the number of launches.
trying to upload screenshots of second run as viewed by task manager:
trying to upload screenshots of first run as viewed by task manager:
Well I got a problems when switching to win10 for clients at my work too here few I encountered all because Windows10 has changed up scheduling of processes creating a lot of issues like:
older windowses blockless thread synchronizations techniques not working anymore
well placed Sleep() helps sometimes. Btw. similar problems was encountered when switching from w2k to wxp.
huge slowdowns and frequent freezes for few seconds on older single threaded apps
usually setting affinity to single core solves this. You can do this also in task manager just to check and if helps then you can do this in code too. Here an example on how to do it with winapi:
Cache size estimation on your system?
messed up drivers timings causing zombies processes even total freeze and or BSOD
I deal with USB in my work and its a nightmare sometimes on win10. On top of all this Win10 tends to enforce wrong drivers on devices (like gfx cards, custom USB systems etc ...)
auto freeze close app if it does not respond the wndproc in time
In Windows10 the timeout is much much smaller than in older versions. If the case You can try running in compatibility mode (set in icon properties on desktop) for older windows (however does not work for #1,#2), or change the apps code to speed up response. For example in VCL you can call Process Messages from inside of blocking code to remedy this... Or you can use threads for the heavy lifting ... just be careful with rendering and using winapi as accessing some winapi (any window/visual related stuff) functions from outside main thread causes havoc ...
On top of all this old IDEs (especially for MCUs) don't work properly anymore and new ones are usually much worse (or unusable because of lack of functionality that was present in older versions) to work with so I stayed faith full to Windows7 for developer purposes.
If none of above helps then try to log the times some of your task did need ... it might show you which part of code is the problem. I usually do this using timing graph like this:
both x,y axises are time and each task has its own color and row in graph. the graph is scrolling in time (to the left side in my case) and has changeable time scale. The numbers are showing actual and max (or sliding avg) value ...
This way I can see if some task is not taking too much time or even overlaps its next execution, peaks are also nicely visible and all runs during runtime without any debug tools which might change the behavior of execution.
I've come across an inexplicable error in SDL 2.0.3 when using hardware rendered graphics. For some reason, around 5 minutes after the program starts my graphical window closes but my console window stays open. There is no error thrown or anything to signify a problem.
When I pause the debugger, the program puts the breakpoint inside of SDL_RenderPresent(). I followed the call stack to a function inside of ntdll.dll called WaitForSingleObject() but I'm not sure what's causing it to hang forever.
Also, this does not happen when I use software rendered graphics. I am running it on an AMD FirePro M5100 FireGL V with the latest drivers installed.
My question is, does anyone know what might cause SDL_RenderPresent() to never return?
From the description seems that there are locks not released by the lower levels of the graphics pipeline.
From the fact that it happens after 5 minutes seems that there is a resource leak somewhere.
All of this is just a wild guess of course, but I'd say that either the application code or the SDL code is leaking resources (handles to textures, vertex buffers and the like) and that part of the code (either at the lower levels of SDL or in the driver) when running out is not behaving nicely (this happens often... in many cases low-resource conditions are not very well tested and handled).
This doesn't happen in software rendering because there resources are basically unlimited. A confirm of this kind of problem would be that when running in software rendering the program works but process memory use keeps growing and growing.
Pay attention also to any code that "catches" any exception/failure and keeps running after that. Writing complex software that works correctly after an abnormal state is extremely difficult (basically impossible beyond trivial cases because exception safety doesn't scale by composition: the only way that doesn't make the complexity explode is to have logical partitioning "walls" and re-initialize whole subsystems).
I have a console application that runs some temperamental hardware. If I don't detach from it nicely, windows tends to bluescreen five minutes later. I can catch when the application is closed by using SetConsoleCtrlHandler. But when I hit 'stop debugging' in visual studio, it skips this process and just kills the program brutally. As of 2009 it appears nobody has a solution for this problem.
Is this still the case? Do I really have to live with a bluescreen if I accidentally hit the wrong button?
Killing the process is what that button is for. If you want the program to terminate gracefully, you need to let it run to completion. You can use the debugger's "continue" button, or just detach from the process, and then quit the program in the normal way. Or you can use the debugger to do something that makes the program quit itself, such as setting a "done" flag that controls a main loop.
You might consider splitting your program into two parts: a service that deals with the hardware's peculiarities and presents a "clean" interface to the console application, and the console application which talks to the service instead of directly to the hardware.
You immediate problem is the TerminateProcess function that even Task Manager can execute which gives you no recourse to shutdown cleanly. Even if you could solve the Visual Studio problem, someone could right click and "End Process" and you be in the same boat.
Your root problem is poorly-written device drivers. These driver should not blue-screen even if you are abruptly terminated. If you have choice but to use them then you can try to bulletproof your process by running with administrative privileges, running as a service, or confining operations to short windows of time. Or you can simply train your operators not to do the things that will cause your delicate system to blow up. And hope.
I am trying to make a program that records a whole bunch of things periodically.
The specific reason is that if it bluescreens, a developer can go back and check a lot of the environment and see what was going on around that time.
My problem, is their a way to cause a bluescreen?
Maybe with a windowsAPI call (ZeroMemory maybe?).
Anywhoo, if you can think of a way to cause a bluescreen on call I would be thankful.
The computer I am testing this on is designed to take stuff like this haha.
by the way the language I am using is C\C++.
Thank you
You can configure a machine to crash on a keystroke (Ctrl-ScrollLock)
http://support.microsoft.com/kb/244139
Since it appears that there are times when that won't work on some systems with USB keyboards, you can also get the Debugging Tools for Windows, install the kernel debugger, and use the ".crash" command to force a bugcheck.
http://www.microsoft.com/whdc/devtools/debugging/default.mspx
In order to cause a BSOD, a driver running in kernel mode needs to cause it. If you really want to do this, you can write a driver which exposes KeBugCheck to usermode.
http://msdn.microsoft.com/en-us/library/ms801640.aspx
Thanks to Andrew below for pointing this utility out:
http://download.sysinternals.com/files/NotMyFault.zip
If you kill the csrss process you'll get a blue-screen rather quickly.
If you want to simulate a hard crash such as a bluescreen, you'd pretty much have to yank the power cord. NOT recommended.
In case of a crash, anything not saved to persistent storage will be lost. If you want to simulate a crash for purposes of logging, write a "kill switch" into your logger, which stops the logging. Now you can simulate a crash by killing the logging and making sure you have the data you would have wanted in case of an actual crash.
First of all, I would advise you to use a Virtual Machine to test this BSOD on. This will allow you to keep a backup just in case the BSOD does some damage to the system. Here's a tip on how to generate a BSOD simply by pressing CTRL+SCROLLLOCK+SCROLLLOCK.
Is there a Windows API to generate one? No, according to this article. Still, if you would call certain API's with invalid data, they could still cause a crash inside the kernel, which would result in your BSOD.
I'm not sure exactly what you'd be testing. Since your program runs periodically, surely it's enough to check that the information is being dumped at the frequency that you specify while the system is running? Are you checking that the information stays around after the blue screen? Depending on how you are dumping it (and whether you are flushing buffers), this may not be necessary.
If you dont want to write code (driver, IOCTL...) you can use DiskCryptor. Note that no disk encrypting is need.
Just need to install the driver:
dcinst.exe -setup
And then generate a bsod using the DC console:
dccon.exe -bsod
Run process as critic and exit http://waleedassar.blogspot.co.uk/2012/03/rtlsetprocessiscritical.html
we have a problem with an application we're developing. Very seldom, like once in a hundred, the application crashes at start up. When the crash happens it brings down the whole system, the computer starts to beep and freezes up completely, the only way to recover is to turn off the power (we're using Windows XP). The rarity of the crash combined with the fact that we can't break into the debugger or even generate a stackdump when it occurs makes it extremely hard to debug.
I'm looking for something that logs all function calls to a file. Does such a tool exist? It shouldn't be impossible to implement, profilers like VTune does something very similar.
We're using visual studio 2008 (C++).
Thanks
A.B.
Logging function entries/exits is a low-level approach to your problem. I would suggest using automatic debugger instrumentation (using Debugger key under Image File Execution Options with regedit or using gflags from the package I provide a link to below) and trying to repro the problem until it crashes. Additionally, you can have the debugger log function call history of suspected module(s) using a script or have collect any other information.
But not knowing the details of your application it is very hard to suggest a solution. Is it a user app, service or a driver? What does "crashes at startup" mean - at windows startup or app's startup?
Use this debugger package to troubleshoot.
The only problem with the logging idea is that when the system crashes, the latest log entries might still be in the cache and have no chance to be written to disk...
If it was me I would try running the program on a different PC - it might be flaky hardware or drivers causing the problem. An application program "shouldn't" be able to bring down the system.
A few Ideas-
There is a good chance that just prior to your crash there is some sort of exception in the application. if you set you handler for all unhandled exceptions using SetUnhandledExceptionFilter() and write a stack trace to your log file, you might have a chance to catch the crash in action.
Just remember to flush the file after every write.
Another option is to use a tool such as strace which logs all of the system calls into the kernel (there are multiple flavors and implementations for that so pick your favorite). if you look at the log just before the crash you might find the culprit
Have you considered using a second machine as a remote debugger (via the network)? When the application (and system) crashes, the second machine should still show some useful information, if not the actual point of the problem. I believe VC++ has that ability, at least in some versions.
For Visual C++ _penter() and _pexit() can be used to instrument your code.
See also Method Call Interception in C++.
GCC (including the version MingGW for Windows development) has a code generation switch called -finstrument-functions that tells the compiler to emit special calls to functions called __cyg_profile_func_enter and __cyg_profile_func_exit around every function call. For Visual C++, there are similar options called /GH and /Gh. These cause the compiler to emit calls to __penter and __pexit around function calls.
These instrumentation modes can be used to implement a logging system, with you implementing the calls that the compiler generates to output to your local filesystem or to another computer on your network.
If possible, I'd also try running your system using valgrind or a similar checking tool. This might catch your problem before it gets out-of-hand.