Heisenbug issue with using a dll. What do I do next? - c++

I am working on a system that uses a Voltage Controlled Oscillator chip (VCO) to help process a signal. The makers of the chip (Analog Devices) provide a program to load setup files onto the VCO, but I want to be able to set up the chip from within the overarching signal processing control system. Fortunately, Analog Devices also provides a DLL that lets me interface with their chip and load setup files myself. I am programming in Visual C++ 6.0 (old, I know) and my program is a dialog application.
I got the system to work perfectly, writing setup files to the card and reading its status. I then decided that I needed to handle the case where there are multiple cards attached and one must be selected. The DLL provides GetDeviceCount(), which returns an integer. For some reason, every time the executable runs it returns 15663105 (garbage, I assume). Whenever I debug my code, however, the function returns the correct number of cards. Here is my call to GetDeviceCount().
typedef int (__stdcall *GetDeviceCount)();
int AD9516_Setup()
{
    int NumDevices;
    GetDeviceCount _GetDeviceCount;
    HINSTANCE hInstLibrary = LoadLibrary("AD9516Interface.dll");
    _GetDeviceCount = (GetDeviceCount)GetProcAddress(hInstLibrary,"GetDeviceCount");
    NumDevices = _GetDeviceCount();
    return NumDevices;
}
Just to be clear: every other function from the DLL I have used is called exactly like this and works perfectly in the executable and debugger. I did some research and found out that a common cause of Heisenbugs is threading. I know that there is some threading behind the scenes in the dialogs I am using so I deleted all my calls to the function except one. I also discovered that the debugger code executes slower than executable code and I thought the chip may not have enough time to finish processing each command. First I tried taking up time between each chip function call by inserting an empty for loop and when that did not work I commented out all other calls to the DLL.
I do not have access to the source code used to build the DLL and I have no idea why its function would be returning garbage in the executable and not debugger. What other differences are there between running in debugger and executing that could cause an error? What are some other things I can do to search for this error?

Some compilers/IDEs add extra padding to variables in debug builds or initialize them to 0 - this might explain the differences you're encountering between debugging and "normal" execution.
Some things that might be worth checking:
- are you using the correct calling convention?
- do you get the same return value if no devices are connected?
- are you using the correct return type (uint vs int vs long vs ..)?
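On the return-type question, one observation may help: the reported value 15663105 is 0x00EF0001 in hexadecimal, so its low 16 bits (and its low 8 bits) are exactly 1. If the DLL actually returns a narrower type than int, the upper bits of the register would be leftover garbage, and a debugger run could happen to leave them zero. This is only a hypothesis, since the real prototype of GetDeviceCount is not known here. The helper names below are made up for illustration; the sketch just shows how to inspect a suspicious value that way.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical helpers: keep only the low bits of a suspicious return value.
std::uint16_t lowWord(std::uint32_t raw) {
    return static_cast<std::uint16_t>(raw & 0xFFFFu);
}

std::uint8_t lowByte(std::uint32_t raw) {
    return static_cast<std::uint8_t>(raw & 0xFFu);
}

// Print the raw value in hex next to its truncated interpretations.
void describe(std::uint32_t raw) {
    std::printf("raw=0x%08X lowWord=%u lowByte=%u\n",
                static_cast<unsigned>(raw),
                static_cast<unsigned>(lowWord(raw)),
                static_cast<unsigned>(lowByte(raw)));
}
```

If the truncated value matches the number of attached cards for several configurations, declaring the function pointer with the narrower return type (an assumption to verify against the vendor's header) would be the next thing to try.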

Try setting _GetDeviceCount to 0 before calling the function; that could be what the debugger is doing for you.

Related

Unexpected IConnectionPointImpl::Unadvise call on Windows Embedded Compact 7

We have a large software system running on Win CE6 without problems. The core functionality is implemented in a COM server DLL that provides connection points. The COM client program registers event handlers for the connection points on program startup to get status notifications etc. On program exit it unregisters the handlers by calling the corresponding IConnectionPointImpl::Unadvise methods.
Now we are porting the program to run on Win EC 7. The new Board Support Package (BSP) for Win EC 7 works well. There are also different versions of it, with different options, created at different times from different Microsoft sources, but our software always shows the same issue.
On program startup, ~10s after launch, IConnectionPointImpl::Unadvise is called unexpectedly on all registered event handlers. We only have one method in our source code that calls IConnectionPointImpl::Unadvise and this is definitely not executed.
The issue appears in ~95% of launches, but sometimes the program starts and runs without problems. We cannot use the debugger because, given the size of the program, performance under it is very poor.
We guess that the COM runtime calls the IConnectionPointImpl::Unadvise methods for some reason, but we have no idea how to prevent this.
Has anybody observed the same issue? Is there a solution/workaround available? Thanks.
So we finally found how to solve this problem.
We removed our dependency on MarshalByRefObject and replaced it with a proper implementation of ISerializable.
That allows us to load our assembly properly inside a custom AppDomain, and events are no longer lost.
But this has a side effect on the path from which the assembly and its configuration file are loaded. To solve this, we also implemented a handler for the AppDomain.AssemblyResolve event, which allows us to redirect the loading to the proper place.
I hope this can help you ;)

WMVCore crashes when WMIsContentProtected method was called

My team is developing an application that is capable of recording videos and exporting them to .wmv files.
The export function is done via a method that uses a Windows API method from the WMVCore.dll library.
However, on some PCs this export function crashes, and according to the dump file taken from the crash, the last method called was WMVCore.WMIsContentProtected. The call stack from the dump file looks like this:
379fe8d8 65aa0cdb 299f94f0 00025800 379fe8f4 WMVCORE!WMIsContentProtected+0x4bc3
379fe970 6102e07e 1cae3f20 299f94f0 379fece8 WMVCORE!WMCreateWriterPushSink+0x1e278
It is really frustrating getting this crash without any more information out of the dump file. I also thought that the library might be corrupted on that specific PC, but there is no way to re-install the library as it comes with Windows Media Player. Any suggestion would be appreciated.
The code simply called WMVCore.WMCreateWriterPushSink which is followed by WMVCore.WMIsContentProtected. The OS version is Windows 7 Enterprise 64 bit. The application is a web based application on IE, in which case when the crash happens, IE would also crash and stop. The version of the WMVCore.dll is 12.0.7601.17514 and the question here is whether anyone has experienced the same issue and if yes, what sort of things can be tried/done to prevent this crash from happening?

Standalone Windows app hangs upon change of focus

I have written an app in C++ with Visual Studio 2008, running on Windows 7. It runs fine under the control of the debugger, in both the debug and release versions. When running standalone, in either version, it also runs fine until I click on any unrelated window, say a file explorer window, whereupon the app hangs without any warning from Windows; I just see the little circle thing.
The code is doing something rather computationally intensive, accessing data from a 10 MB global array, and it is well within the 2 GB limit of 32-bit Windows. I have checked for the obvious things, such as uninitialized variables, infinite loops, and the like, and I am not allocating any big local arrays, but I have found nothing wrong. The code is running directly from the UI thread, blocking, but I do not care as there is nothing else to do till that task completes. Alternatively, I could put this code in its own worker thread communicating back to the UI thread by an interlocked buffer, but this seems redundant. I've tried this on two different machines running Windows 7 and get identical behavior. Is there something about Windows process management that I am overlooking? Is there a way to tell whether there is some sort of memory corruption going on that could cause some other process to affect the app's process?
[Edit1 by spektre] just copied user3481340's code from comment to be readable
I do not think that the computational time, which is about an hour
has anything to do with the problem.
Rather, the windows messaging for the edit box is getting messed-up.
The relevant code is:
int textlen=GetWindowTextLength(Editwin);
int k=strcspn(messagebuf,"\n");
if(k<strlen(messagebuf))textlength=strlen(messagebuf)-k;
else textlength+=k;
SendMessage (Editwin, EM_SETSEL, (WPARAM)textlen, (LPARAM)textlen);
SendMessage (Editwin, EM_REPLACESEL, 0, (LPARAM) messagebuf);
Somehow Windows stops responding to these messages.
Windows 7 has some changes (compared with previous versions of Windows) that can cause problems like this:
- Process scheduling: the changes can break lock-less multi-threaded apps that were 100% thread-safe on older versions of Windows. There is not much you can do about this, other than adding some safety sleeps or locks.
- Critical sections: I am not 100% convinced that critical sections work properly on Win7 SP1, as I have seen issues on one specific machine setup where they fail with the combination of:
  - heavy-duty USB HS bulk data transfers
  - OpenGL use
  - multi-threading with heavy use of multiple critical sections
  But it could be a hidden bug, a messed-up Windows 7 installation, or even a hardware error.
- WindowProc: if your message loop gets stuck for longer than it should, the program will be detected as not responding even if it is not. Running in compatibility mode for XP SP3 usually helps, but sadly not on all machines or for all apps :( This is the main reason why many games do not run on Win7. I suspect this is your case, and according to your last comment I was right. Transfer the critical processing to a separate thread so that the WindowProc is not blocked. If you need an event after the computation is done, set a flag variable and check it from a timer so that you can respond inside the main thread (to avoid WinAPI calls outside the main thread).
- 32-bit driver + app use on x64 Windows (WOW64): if your app accesses any driver, then on WOW64 you need a special driver build to access the real hardware instead of the one emulated by WOW64! If you do not have one, your app is most likely waiting for a response from the real device but getting an emulated one, which can differ and cause real hang-ups. In that case you need to:
  - compile your app as x64,
  - or use some kind of WOW64 x86 -> x64 bridge,
  - or use a driver that can handle it itself (usually by linking to a different DLL).
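The advice to move the long computation off the UI thread and poll a "done" flag can be sketched roughly as follows. This is a minimal illustration, not the asker's code: it uses std::thread and std::atomic from C++11 (newer than Visual Studio 2008), the names BackgroundJob and runAndPoll are made up, and the sleep loop merely stands in for a timer handler in a real message loop.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical helper: runs a long computation on a worker thread and
// exposes a flag that the main (UI) thread can poll cheaply.
struct BackgroundJob {
    std::atomic<bool> done{false};
    long long result = 0;
    std::thread worker;

    void start(int n) {
        worker = std::thread([this, n] {
            long long sum = 0;
            for (int i = 1; i <= n; ++i) sum += i;  // stand-in for the real work
            result = sum;                           // publish the result first...
            done.store(true, std::memory_order_release);  // ...then raise the flag
        });
    }

    // Non-blocking check, suitable for calling from a timer handler.
    bool poll() const { return done.load(std::memory_order_acquire); }

    ~BackgroundJob() {
        if (worker.joinable()) worker.join();
    }
};

// Stand-in for the UI side: in a real app each iteration would be one
// timer tick, with the message loop running freely in between.
long long runAndPoll(int n) {
    BackgroundJob job;
    job.start(n);
    while (!job.poll()) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return job.result;
}
```

The result is written before the flag is set with release ordering, so a poller that sees the flag with acquire ordering also sees the finished result. All window updates would still happen on the main thread, in the code that reacts to poll() returning true.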

Synchronizing output between a main program and a QProcess?

I'm building a program that performs some user tests and needs to record data on what the users are doing at very small intervals (every 10ms). Most of the data can be obtained from Qt, but unfortunately I need to use a separate program to calculate mouse movement (I need to get movement even when the mouse has already hit the edge of the screen, but Qt just ignores off-screen movement).
Therefore I've built a Windows program that deals with the low-level mouse input and outputs the changes in coordinates detected. The problem, however, is that I can't get the data from the Windows program to line up with the output from the main program.
In my main program, I use the following code.
mouseTracker = new QProcess();
mouseTracker->start("C:\\WindowsFun.exe",QIODevice::ReadWrite|QIODevice::Unbuffered);
mouseTracker->setProcessChannelMode(QProcess::MergedChannels);
connect(mouseTracker,SIGNAL(readyRead()), this, SLOT(readMouseData()),Qt::DirectConnection);
and the readMouseData function looks like this.
void HideWindow::readMouseData(){
    QByteArray data = mouseTracker->readAll();
    QString text = QString(data);
    saveFileStream << text.toStdString();
}
Some of this stuff might be unnecessary. I added in the "MergedChannels" mode and the "DirectConnection" bit in an attempt to solve the problem.
The result I'm getting is that the output from the windows program shows up in large blocks every 100ms or so, rather than being inserted into the filestream right when it occurs. It seems like there is a buffer somewhere that needs to fill, or a delay before the readyRead() signal is processed. Does anyone have any suggestions for how I can get the output from both the main program and the QProcess in real time? (Well, at least with a delay less than 10ms).
Also, if it is important, I am running Windows 7 and using MinGW for compilation of the main program and Visual Studio 2008 for the Windows program that detects mouse movement. The output code in the Windows program looks like this:
int xPosRelative = raw->data.mouse.lLastX;
int yPosRelative = raw->data.mouse.lLastY;
char output[100];
int n;
n = std::sprintf(output,"%d %d",xPosRelative,yPosRelative);
std::printf("%s\n",output,n);
std::fflush(0);
Let me know if any more information is needed.
Thanks,
-Keilan
Unfortunately, the Windows implementation of QProcess is hard-coded to check once every 100 milliseconds for stdout/stderr from the external process. Looking through the Qt code, you may be able to get around this by calling waitForReadyRead frequently (with a small timeout value).
I would never depend on the standard input/output of a process on Windows. It seems that there are some limitations to the performance that hit me multiple times, even without Qt getting involved.
You'll do perfectly fine using a network connection on a localhost. That's the most universal and portable interprocess means of communication. Everything that Qt runs on supports it, and the performance is expected to be the same ballpark on each platform.

Logging/monitoring all function calls from an application

We have a problem with an application we're developing. Very seldom, like once in a hundred starts, the application crashes at startup. When the crash happens it brings down the whole system: the computer starts to beep and freezes up completely, and the only way to recover is to turn off the power (we're using Windows XP). The rarity of the crash, combined with the fact that we can't break into the debugger or even generate a stack dump when it occurs, makes it extremely hard to debug.
I'm looking for something that logs all function calls to a file. Does such a tool exist? It shouldn't be impossible to implement; profilers like VTune do something very similar.
We're using visual studio 2008 (C++).
Thanks
A.B.
Logging function entries/exits is a low-level approach to your problem. I would suggest using automatic debugger instrumentation (using the Debugger key under Image File Execution Options with regedit, or using gflags from the package I provide a link to below) and trying to repro the problem until it crashes. Additionally, you can have the debugger log the function call history of the suspected module(s) using a script, or have it collect any other information.
But not knowing the details of your application it is very hard to suggest a solution. Is it a user app, service or a driver? What does "crashes at startup" mean - at windows startup or app's startup?
Use this debugger package to troubleshoot.
The only problem with the logging idea is that when the system crashes, the latest log entries might still be in the cache and have no chance to be written to disk...
If it was me I would try running the program on a different PC - it might be flaky hardware or drivers causing the problem. An application program "shouldn't" be able to bring down the system.
A few ideas:
There is a good chance that just prior to your crash there is some sort of exception in the application. If you set your handler for all unhandled exceptions using SetUnhandledExceptionFilter() and write a stack trace to your log file, you might have a chance to catch the crash in action.
Just remember to flush the file after every write.
Another option is to use a tool such as strace, which logs all of the system calls into the kernel (there are multiple flavors and implementations of it, so pick your favorite). If you look at the log just before the crash, you might find the culprit.
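The "flush after every write" advice can be sketched as a tiny logger like the one below. The class name FlushingLog is hypothetical, and the sketch uses only the standard library. Note that flush() hands the data to the operating system; it does not guarantee the bytes have reached the disk, so a hard system freeze may still lose the last entries.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical minimal logger: every line is pushed out of the
// process-level buffer immediately instead of waiting for it to fill.
class FlushingLog {
public:
    explicit FlushingLog(const std::string& path)
        : out_(path.c_str(), std::ios::app) {}

    // Appends one line and flushes; returns false if the stream failed.
    bool write(const std::string& line) {
        out_ << line << '\n';
        out_.flush();
        return out_.good();
    }

private:
    std::ofstream out_;
};
```

A typical use would be to create one FlushingLog at startup and call write() at each milestone (or from the unhandled-exception handler), so that the last line in the file shows how far the program got.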
Have you considered using a second machine as a remote debugger (via the network)? When the application (and system) crashes, the second machine should still show some useful information, if not the actual point of the problem. I believe VC++ has that ability, at least in some versions.
For Visual C++ _penter() and _pexit() can be used to instrument your code.
See also Method Call Interception in C++.
GCC (including the MinGW version for Windows development) has a code generation switch called -finstrument-functions that tells the compiler to emit calls to functions named __cyg_profile_func_enter and __cyg_profile_func_exit at the entry and exit of every function. For Visual C++, there are similar options called /Gh and /GH. These cause the compiler to emit calls to _penter and _pexit at function entry and exit, respectively.
These instrumentation modes can be used to implement a logging system, with you implementing the calls that the compiler generates to output to your local filesystem or to another computer on your network.
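For the GCC case, the two hooks could look roughly like this. It is a sketch under assumptions: the counter g_depth and the choice to log to stderr are illustrative, and the hooks only get called automatically when the rest of the program is compiled with -finstrument-functions. The no_instrument_function attribute keeps the hooks themselves from being instrumented, which would otherwise recurse.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical call-depth counter, used only to make the trace readable.
static int g_depth = 0;

extern "C" {

// Called by instrumented code just after entering a function.
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    ++g_depth;
    std::fprintf(stderr, "enter %p (called from %p), depth %d\n",
                 this_fn, call_site, g_depth);
    std::fflush(stderr);  // push each entry out right away
}

// Called by instrumented code just before leaving a function.
__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    std::fprintf(stderr, "exit  %p (called from %p), depth %d\n",
                 this_fn, call_site, g_depth);
    std::fflush(stderr);
    --g_depth;
}

}  // extern "C"
```

The logged values are raw addresses; they can be mapped back to function names afterwards with a tool such as addr2line, given a build with debug information.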
If possible, I'd also try running your system using valgrind or a similar checking tool. This might catch your problem before it gets out-of-hand.