Name conflict in DLL exports leads to wrong function exported - c++

I exported a function from a DLL, using a .def file, similar to this:
void __stdcall remove(FlagFile* flagP)
{
if (flagP == nullptr) {
return;
}
delete flagP;
}
Trying to call the function, I received crashes I couldn't place, until I saw in the disassembler that instead of calling the code above, the linker had exported a different function from the CRT instead, and then that function was called in place of the function I expected to be called. This resulted in a stack corruption, likely because the function above has void result, as opposed to CRT's remove(), which returns int. The solution was to simply rename the exported function to resolve the name conflict. Unfortunately, MSVC does not give a warning about the name conflict.
(Just putting this here so people can object if my analysis is wrong. This post hopefully gives others a clue who run into a similar problem.)
This was observed with MSVC 2019 under Windows 11. Not sure if it applies to other compilers/operating systems (shared libraries in that case).
So the question would be: aside from giving all functions a prefix (like OpenGL's gl), what are your strategies for dealing with this problem, if it ever was a problem?

Related

Linking to external functions in dynamic libraries with LLVM

In my project, I am emitting LLVM IR which makes calls to external functions in dynamic libraries.
I declare my external functions like:
declare %"my_type"* #"my_function"()
In the external library, functions are declared like:
extern "C" {
my_type* my_function();
}
When I compile the IR and run it, the process immediately crashes. The same behavior happens if I declare and call a nonsense function that I know doesn't exist, so I assume what's happening is that the external function is not being found/linked. (I don't think that the function itself is crashing).
I am using Python's llvmlite library for this task, and within the same process where I JIT and invoke my LLVM IR, I have another python library imported which requires the external dynamic library; so I assume that library is loaded and in-memory.
The procedure I'm using to compile and execute my LLVM code is basically the same as what's in this document, except that the IR declares and invokes an external function. I have tried invoking cos(), as in the Kaleidoscope tutorial, and this succeeds, so I am not sure what is different about my own library functions.
I have tried adding an underscore to the beginning of the function name, but I get the same result. (Do I need to add the underscore in the LLVM function declaration?)
How do I verify my assumption that the process is crashing because the named function isn't found?
How do I diagnose why the function isn't being found?
What do I need to do in order to make use of external functions in a dynamic library, from LLVM code?
Edit: It seems there is indeed trouble getting the function pointers to my external function. If I try to merely print the function address by replacing my calls with %"foo" = ptrtoint %"my_type"* ()* #"my_function" to i64 and return/print the result, it still segfaults. Merely trying to obtain the pointer is enough to cause a crash! Why is this, and how do I fix it?
Edit: Also forgot to mention— this is on Ubuntu (in a Docker container, on OSX).
I figured it out— I was missing that I need to call llvmlite.binding.load_library_permanently(filename) in order for the external symbols to become available. Even though the library is already in memory, the function still needs to be invoked. (This corresponds to the LLVM native function llvm::sys::DynamicLibrary::LoadLibraryPermanently()).
From this answer, it appears calling the above function with nullptr will import all the symbols available to the process.
Oddly, on OSX, I found that the external symbols were available even though load_library_permanently() had not been explicitly called— I'm not sure why this is (perhaps the OSX build of llvmlite itself happened to call the function with nullptr, as above?).

Access violation calling C++ dll

I created c++ dll (using mingw) from code I wrote on linux (gcc), but somehow have difficulties using it in VC++. The dll basically exposes just one class, I created pure virtual interface for it and also factory function which creates the object (the only export) which looks like this:
extern "C" __declspec(dllexport) DeviceDriverApi* GetX5Driver();
I added extern "C" to prevent name mangling, dllexport is replaced by dllimport in actual code where I want to use the dll, DeviceDriverApi is the pure virtual interface.
Now I wrote simple code in VC++ which just call the factory function and then just tries to delete the pointer. It compiles without any problems but when I try to run it I get access violation error. If I try to call any method of the object I get access violation again.
When I compile the same code in MinGW (gcc) and use the same library, it runs without any problems. So there must be something (hehe, I guess many differences actually :)) between how VC++ code uses the library and gcc code.
Any ideas what?
Cheers,
Tom
Edit:
The code is:
DeviceDriverApi* x5Driver = GetX5Driver();
if (x5Driver->isConnected())
Console::WriteLine(L"Hello World");
delete x5Driver;
It's crashing when I try to call the method and when I try to delete the pointer as well. The object is created correctly though (the first line). There are some debug outputs when the object is created and I can see them before I get the access violation error.
You're using one compiler (mingw) for the DLL, and another (VC++) for the calling code.
You're calling a 'C' function, but returning a pointer to a C++ Object.
That will never work, because VTable layouts are almost guranteed to be incompatible. And, the DLL and app are probably using different memory managers, so you're doing new() with one and delete() with the other. Again, it just won't work.
For this to work the two compilers need to both support a standard ABI (Application Binary Interface). I don't think such a thing exists for Windows.
The best option is to expose all you DLL object methods and properties via C functions (including one to delete the object). You can the re-wrap into a C++ object on the calling end.
The two different compilers may be using different calling conventions. Try putting _cdecl before the function name in both the client code and the DLL code and recompiling both.
More info on calling conventions here: http://en.wikipedia.org/wiki/X86_calling_conventions
EDIT: The question was updated with more detail and it looks likely the problem is what Adrien Plisson describes at the end of his answer. You're creating an object in one module and freeing it in another, which is wrong.
(1) I suspect a calling covnention problem as well, though the simple suggestion by Leo doesn't seem to have helped.
Is isConnected virtual? It is possible that MinGW and VC++ use different implementations for a VTable, in which case, well, tough luck.
Try to see how far you get with the debugger: does it crash at the call, or the return? Do you arrive at invalid code? (If you know to read assembly, that usually helps a lot with these problems.)
Alternatively, add trace statements to the various methods, to see how far you get.
(2) For a public DLL interface, never free memory in the caller that was allocated by a callee (or vice versa). The DLL likely runs with a completely different heap, so the pointer is not known.
If you want to rely on that behavior, you need to make sure:
Caller and Callee (i.e. DLL and main program, in your case) are compiled with the same version of the sam compiler
for all supported compilers, you have configured the compile options to ensure caller and callee use the same shared runtime library state.
So the best way is to change your API to:
extern "C" __declspec(dllexport) DeviceDriverApi* GetX5Driver();
extern "C" __declspec(dllexport) void FreeDeviceDriver(DeviceDriverApi* driver);
and, at caller site, wrap in some way (e.g. in a boost::intrusive_ptr).
try looking at the imported libraries from both your DLL and your client executable. (you can use the Dependency Viewer or dumpbin or any other tool you like). verify that both the DLL and the client code are using the same C++ runtime.
if it is not the case, you can indeed run into some issues since the way the memory is managed may be different between the 2, leading to a crash when freeing from one runtime a pointer allocated from another runtime.
if this is really your problem, try not destroying the pointer in your client executable, but rather declare and export a function in your DLL which will take care of destroying the pointer.

What is the purpose of __cxa_pure_virtual?

Whilst compiling with avr-gcc I have encountered linker errors such as the following:
undefined reference to `__cxa_pure_virtual'
I've found this document which states:
The __cxa_pure_virtual function is an error handler that is invoked when a pure virtual function is called.
If you are writing a C++ application that has pure virtual functions you must supply your own __cxa_pure_virtual error handler function. For example:
extern "C" void __cxa_pure_virtual() { while (1); }
Defining this function as suggested fixes the errors but I'd like to know:
what the purpose of this function is,
why I should need to define it myself and
why it is acceptable to code it as an infinite loop?
If anywhere in the runtime of your program an object is created with a virtual function pointer not filled in, and when the corresponding function is called, you will be calling a 'pure virtual function'.
The handler you describe should be defined in the default libraries that come with your development environment. If you happen to omit the default libraries, you will find this handler undefined: the linker sees a declaration, but no definition. That's when you need to provide your own version.
The infinite loop is acceptable because it's a 'loud' error: users of your software will immediately notice it. Any other 'loud' implementation is acceptable, too.
1) What's the purpose of the function __cxa_pure_virtual()?
Pure virtual functions can get called during object construction/destruction. If that happens, __cxa_pure_virtual() gets called to report the error. See Where do "pure virtual function call" crashes come from?
2) Why might you need to define it yourself?
Normally this function is provided by libstdc++ (e.g. on Linux), but avr-gcc and the Arduino toolchain don't provide a libstdc++.
The Arduino IDE manages to avoid the linker error when building some programs because it compiles with the options "-ffunction-sections -fdata-sections" and links with "-Wl,--gc-sections", which drops some references to unused symbols.
3) Why is it acceptable to code __cxa_pure_virtual() as an infinite loop?
Well, this is at least safe; it does something predictable. It would be more useful to abort the program and report the error. An infinite loop would be awkward to debug, though, unless you have a debugger that can interrupt execution and give a stack backtrace.

Call function in c++ dll without header

I would like to call a method from an dll, but i don't have the source neither the header file. I tried to use the dumpbin /exports to see the name of the method, but i can found the methods signature?
Is there any way to call this method?
Thanks,
If the function is a C++ one, you may be able to derive the function signature from the mangled name. Dependency Walker is one tool that will do this for you. However, if the DLL was created with C linkage (Dependency Walker will tell you this), then you are out of luck.
The C++ language does not know anything about dlls.
Is this on Windows? One way would be to:
open the dll up in depends.exe shipped with (Visual Studio)
verify the signature of the function you want to call
use LoadLibrary() to get load this dll (be careful about the path)
use GetProcAddress() to get a pointer to the function you want to call
use this pointer-to-function to make a call with valid arguments
use FreeLibrary() to release the handle
BTW: This method is also commonly referred to as runtime dynamic linking as opposed to compile-time dynamic linking where you compile your sources with the associated lib file.
There exists some similar mechanism for *nixes with dlopen, but my memory starts to fail after that. Something called objdump or nm should get you started with inspecting the function(s).
As you have found, the exports list in a DLL only stores names, not signatures. If your DLL exports C functions, you will probably have to disassemble and reverse engineer the functions to determine method signatures. However, C++ encodes the method signature in the export name. This process of combining the method name and signature is called "name mangling". This Stackoverflow question has a reference for determining the method signature from the mangled export name.
Try the free "Dependency Walker" (a.k.a. "depends") utility. The "Undecorate C++ Functions" option should determine the signature of a C++ method.
It is possible to figure out a C function signature by analysing beginnig of its disassembly. The function arguments will be on the stack and the function will do some "pops" to read them in reverse order. You will not find the argument names, but you should be able to find out their number and the types. Things may get more difficult with return value - it may be via 'eax' register or via a special pointer passed to the function as the last pseudo-argument (on the top of the stack).
If you indeed know or strongly suspect the function is there, you can dynamically load the DLL with loadLibrary and get a pointer to the function with getProcAddress. See MSDN
Note that this is a manual, dynamic way to load the library; you'll still have to know the correct function signature to map to the function pointer in order to use it. AFAIK there is no way to use the dll in a load-time capability and use the functions without a header file.
Calling non-external functions is a great way to have your program break whenever the 3rd party DLL is updated.
That said, the undname utility may also be helpful.

Weird MSC 8.0 error: "The value of ESP was not properly saved across a function call..."

We recently attempted to break apart some of our Visual Studio projects into libraries, and everything seemed to compile and build fine in a test project with one of the library projects as a dependency. However, attempting to run the application gave us the following nasty run-time error message:
Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call. This is usually a result of calling a function pointer declared with a different calling convention.
We have never even specified calling conventions (__cdecl etc.) for our functions, leaving all the compiler switches on the default. I checked and the project settings are consistent for calling convention across the library and test projects.
Update: One of our devs changed the "Basic Runtime Checks" project setting from "Both (/RTC1, equiv. to /RTCsu)" to "Default" and the run-time vanished, leaving the program running apparently correctly. I do not trust this at all. Was this a proper solution, or a dangerous hack?
This debug error means that the stack pointer register is not returned to its original value after the function call, i.e. that the number of pushes before the function call were not followed by the equal number of pops after the call.
There are 2 reasons for this that I know (both with dynamically loaded libraries). #1 is what VC++ is describing in the error message, but I don't think this is the most often cause of the error (see #2).
1) Mismatched calling conventions:
The caller and the callee do not have a proper agreement on who is going to do what. For example, if you're calling a DLL function that is _stdcall, but you for some reason have it declared as a _cdecl (default in VC++) in your call. This would happen a lot if you're using different languages in different modules etc.
You would have to inspect the declaration of the offending function, and make sure it is not declared twice, and differently.
2) Mismatched types:
The caller and the callee are not compiled with the same types. For example, a common header defines the types in the API and has recently changed, and one module was recompiled, but the other was not--i.e. some types may have a different size in the caller and in the callee.
In that case, the caller pushes the arguments of one size, but the callee (if you're using _stdcall where the callee cleans the stack) pops the different size. The ESP is not, thus, returned to the correct value.
(Of course, these arguments, and others below them, would seem garbled in the called function, but sometimes you can survive that without a visible crash.)
If you have access to all the code, simply recompile it.
I read this in other forum
I was having the same problem, but I just FIXED it. I was getting the same error from the following code:
HMODULE hPowerFunctions = LoadLibrary("Powrprof.dll");
typedef bool (*tSetSuspendStateSig)(BOOL, BOOL, BOOL);
tSetSuspendState SetSuspendState = (tSuspendStateSig)GetProcAddress(hPowerfunctions, "SetSuspendState");
result = SetSuspendState(false, false, false); <---- This line was where the error popped up.
After some investigation, I changed one of the lines to:
typedef bool (WINAPI*tSetSuspendStateSig)(BOOL, BOOL, BOOL);
which solved the problem. If you take a look in the header file where SetSuspendState is found (powrprof.h, part of the SDK), you will see the function prototype is defined as:
BOOLEAN WINAPI SetSuspendState(BOOLEAN, BOOLEAN, BOOLEAN);
So you guys are having a similar problem. When you are calling a given function from a .dll, its signature is probably off. (In my case it was the missing WINAPI keyword).
Hope that helps any future people! :-)
Cheers.
Silencing the check is not the right solution. You have to figure out what is messed up with your calling conventions.
There are quite a few ways to change the calling convetion of a function without explicitly specifying it. extern "C" will do it, STDMETHODIMP/IFACEMETHODIMP will also do it, other macros might do it as well.
I believe if run your program under WinDBG (http://www.microsoft.com/whdc/devtools/debugging/default.mspx), the runtime should break at the point where you hit that problem. You can look at the call stack and figure out which function has the problem and then look at its definition and the declaration that the caller uses.
I saw this error when the code tried to call a function on an object that was not of the expected type.
So, class hierarchy: Parent with children: Child1 and Child2
Child1* pMyChild = 0;
...
pMyChild = pSomeClass->GetTheObj();// This call actually returned a Child2 object
pMyChild->SomeFunction(); // "...value of ESP..." error occurs here
I was getting similar error for AutoIt APIs which i was calling from VC++ program.
typedef long (*AU3_RunFn)(LPCWSTR, LPCWSTR);
However, when I changed the declaration which includes WINAPI, as suggested earlier in the thread, problem vanished.
Code without any error looks like:
typedef long (WINAPI *AU3_RunFn)(LPCWSTR, LPCWSTR);
AU3_RunFn _AU3_RunFn;
HINSTANCE hInstLibrary = LoadLibrary("AutoItX3.dll");
if (hInstLibrary)
{
_AU3_RunFn = (AU3_RunFn)GetProcAddress(hInstLibrary, "AU3_WinActivate");
if (_AU3_RunFn)
_AU3_RunFn(L"Untitled - Notepad",L"");
FreeLibrary(hInstLibrary);
}
It's worth pointing out that this can also be a Visual Studio bug.
I got this issue on VS2017, Win10 x64. At first it made sense, since I was doing weird things casting this to a derived type and wrapping it in a lambda. However, I reverted the code to a previous commit and still got the error, even though it wasn't there before.
I tried restarting and then rebuilding the project, and then the error went away.
I was getting this error calling a function in a DLL which was compiled with a pre-2005 version of Visual C++ from a newer version of VC (2008).
The function had this signature:
LONG WINAPI myFunc( time_t, SYSTEMTIME*, BOOL* );
The problem was that time_t's size is 32 bits in pre-2005 version, but 64 bits since VS2005 (is defined as _time64_t). The call of the function expects a 32 bit variable but gets a 64 bit variable when called from VC >= 2005. As parameters of functions are passed via the stack when using WINAPI calling convention, this corrupts the stack and generates the above mentioned error message ("Run-Time Check Failure #0 ...").
To fix this, it is possible to
#define _USE_32BIT_TIME_T
before including the header file of the DLL or -- better -- change the signature of the function in the header file depending on the VS version (pre-2005 versions don't know _time32_t!):
#if _MSC_VER >= 1400
LONG WINAPI myFunc( _time32_t, SYSTEMTIME*, BOOL* );
#else
LONG WINAPI myFunc( time_t, SYSTEMTIME*, BOOL* );
#endif
Note that you need to use _time32_t instead of time_t in the calling program, of course.
I was having this exact same error after moving functions to a dll and dynamically loading the dll with LoadLibrary and GetProcAddress. I had declared extern "C" for the function in the dll because of the decoration. So that changed calling convention to __cdecl as well. I was declaring function pointers to be __stdcall in the loading code. Once I changed the function pointer from __stdcall to__cdecl in the loading code the runtime error went away.
Are you creating static libs or DLLs? If DLLs, how are the exports defined; how are the import libraries created?
Are the prototypes for the functions in the libs exactly the same as the function declarations where the functions are defined?
do you have any typedef'd function prototypes (eg int (*fn)(int a, int b) )
if you dom you might be have gotten the prototype wrong.
ESP is an error on the calling of a function (can you tell which one in the debugger?) that has a mismatch in the parameters - ie the stack has restored back to the state it started in when you called the function.
You can also get this if you're loading C++ functions that need to be declared extern C - C uses cdecl, C++ uses stdcall calling convention by default (IIRC). Put some extern C wrappers around the imported function prototypes and you may fix it.
If you can run it in the debugger, you'll see the function immediatey. If not, you can set DrWtsn32 to create a minidump that you can load into windbg to see the callstack at the time of the error (you'll need symbols or a mapfile to see the function names though).
Another case where esp can get messed up is with an inadvertent buffer overflow, usually through mistaken use of pointers to work past the boundary of an array. Say you have some C function that looks like
int a, b[2];
Writing to b[3] will probably change a, and anywhere past that is likely to hose the saved esp on the stack.
You would get this error if the function is invoked with a calling convention other than the one it is compiled to.
Visual Studio uses a default calling convention setting thats decalred in the project's options. Check if this value is the same in the orignal project settings and in the new libraries. An over ambitious dev could have set this to _stdcall/pascal in the original since it reduces the code size compared to the default cdecl. So the base process would be using this setting and the new libraries get the default cdecl which causes the problem
Since you have said that you do not use any special calling conventions this seems to be a good probability.
Also do a diff on the headers to see if the declarations / files that the process sees are the same ones that the libraries are compiled with .
ps : Making the warning go away is BAAAD. the underlying error still persists.
This happened to me when accessing a COM object (Visual Studio 2010). I passed the GUID for another interface A for in my call to QueryInterface, but then I cast the retrieved pointer as interface B. This resulted in making a function call to one with an entirely signature, which accounts for the stack (and ESP) being messed up.
Passing the GUID for interface B fixed the problem.
In my MFC C++ app I am experiencing the same problem as reported in Weird MSC 8.0 error: “The value of ESP was not properly saved across a function call…”. The posting has over 42K views and 16 answers/comments none of which blamed the compiler as the problem. At least in my case I can show that the VS2015 compiler is at fault.
My dev and test setup is the following: I have 3 PCs all of which run Win10 version 10.0.10586. All are compiling with VS2015, but here is the difference. Two of the VS2015s have Update 2 while the other has Update 3 applied. The PC with Update 3 works, but the other two with Update 2 fail with the same error as reported in the posting above. My MFC C++ app code is exactly the same on all three PCs.
Conclusion: at least in my case for my app the compiler version (Update 2) contained a bug that broke my code. My app makes heavy use of std::packaged_task so I expect the problem was in that fairly new compiler code.
ESP is the stack pointer. So according to the compiler, your stack pointer is getting messed up. It is hard to say how (or if) this could be happening without seeing some code.
What is the smallest code segment you can get to reproduce this?
If you're using any callback functions with the Windows API, they must be declared using CALLBACK and/or WINAPI. That will apply appropriate decorations to make the compiler generate code that cleans the stack correctly. For example, on Microsoft's compiler it adds __stdcall.
Windows has always used the __stdcall convention as it leads to (slightly) smaller code, with the cleanup happening in the called function rather than at every call site. It's not compatible with varargs functions, though (because only the caller knows how many arguments they pushed).
Here's a stripped down C++ program that produces that error. Compiled using (Microsoft Visual Studio 2003) produces the above mentioned error.
#include "stdafx.h"
char* blah(char *a){
char p[1];
strcat(p, a);
return (char*)p;
}
int main(){
std::cout << blah("a");
std::cin.get();
}
ERROR:
"Run-Time Check Failure #0 - The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention."
I had this same problem here at work. I was updating some very old code that was calling a FARPROC function pointer. If you don't know, FARPROC's are function pointers with ZERO type safety. It's the C equivalent of a typdef'd function pointer, without the compiler type checking.
So for instance, say you have a function that takes 3 parameters. You point a FARPROC to it, and then call it with 4 parameters instead of 3. The extra parameter pushed extra garbage onto the stack, and when it pops off, ESP is now different than when it started. So I solved it by removing the extra parameter to the invocation of the FARPROC function call.
Not the best answer but I just recompiled my code from scratch (rebuild in VS) and then the problem went away.