I stumbled upon an issue while using libstdc++'s std::any implementation with mingw across a shared library boundary. It produces a std::bad_any_cast where it obviously should not (I believe).
I use mingw-w64, gcc-7 and compile the code with -std=c++1z.
The simplified code:
main.cpp:
#include <any>
#include <string>
// prototype from lib.cpp
void do_stuff_with_any(const std::any& obj);
int main()
{
do_stuff_with_any(std::string{"Hello World"});
}
lib.cpp:
Will be compiled into a shared library and linked with the executable from main.cpp.
#include <any>
#include <iostream>
#include <string>
void do_stuff_with_any(const std::any& obj)
{
std::cout << std::any_cast<const std::string&>(obj) << "\n";
}
This triggers a std::bad_any_cast although the any passed to do_stuff_with_any does contain a string. I dug into gcc's any implementation, and it seems to compare the address of a static inline member function (a manager chosen from a template struct depending on the type of the stored object) to check whether the any holds an object of the requested type.
And the address of this function seems to change across the shared library boundary.
Isn't std::any guaranteed to work across shared library boundaries? Does this code trigger UB somewhere? Or is this a bug in the gcc implementation? I am pretty sure it works on Linux, so is this only a bug in mingw? Is it known, or should I report it somewhere? Any ideas for (temporary) workarounds?
While it is true that this is an issue with how Windows DLLs work, and that the issue still remains as of GCC 8.2.0, it can be easily worked around by changing the __any_caster function inside the any header to this:
template<typename _Tp>
void* __any_caster(const any* __any)
{
if constexpr (is_copy_constructible_v<decay_t<_Tp>>)
{
#if __cpp_rtti
if (__any->type().hash_code() == typeid(_Tp).hash_code())
#else
if (__any->_M_manager == &any::_Manager<decay_t<_Tp>>::_S_manage)
#endif
{
any::_Arg __arg;
__any->_M_manager(any::_Op_access, __any, &__arg);
return __arg._M_obj;
}
}
return nullptr;
}
Or something similar; the only relevant part is the comparison line wrapped in the #if.
To elaborate, there are two copies of the manager function, one in the exe and one in the dll. The passed object contains the address of the exe's copy because that's where it was created, but once it reaches the dll side, that pointer gets compared to the address of the dll's copy, which will never match. So the type info hash_codes should be compared instead.
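The idea of asking for the stored type instead of comparing manager addresses can be tried without patching the library header. The sketch below uses a hypothetical helper, holds_type, which is not part of libstdc++. It compares std::type_info objects with operator== rather than hash_code(), because the standard only guarantees that equal types have equal hash codes, not that different types have different ones. Whether type_info comparison is itself reliable across a DLL boundary depends on the toolchain, so this illustrates the check rather than guaranteeing a fix:

```cpp
#include <any>
#include <cassert>
#include <string>
#include <typeinfo>

// Hypothetical helper, not part of libstdc++: checks the stored type through
// std::type_info instead of comparing manager function addresses.
template <typename T>
bool holds_type(const std::any& value)
{
    return value.type() == typeid(T);
}

// Exercises the helper on a std::any that stores a std::string.
void holds_type_example()
{
    const std::any value = std::string{"Hello World"};
    assert(holds_type<std::string>(value));
    assert(!holds_type<int>(value));
}
```

Note that this only answers the question "which type is stored"; std::any_cast itself still goes through the library's own check.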
Yesterday I asked a question about this problem, but I wasn't able to give an MCVE. I've now managed to reproduce it with a simple program. The problem is with using a std::list as a static inline declaration in a class. Microsoft Visual Studio does support this new C++17 feature. It had some bugs as of March, but as far as I know they've been fixed since. Here are the steps to reproduce the problem; it happens in debug mode.
In main.cpp
#include <iostream>
#include "header1.h"
int main()
{
return 0;
}
In header1.h:
#include <list>
struct Boo
{
static inline std::list<int> mylist;
};
In anotherCPP.cpp
#include "Header1.h"
When the program exits main() it destroys all the static objects and throws an exception.
If this doesn't crash, maybe on your system the compiler/linker optimised some code out, so you can try making main.cpp and anotherCPP.cpp do something. In anotherCPP.cpp:
#include <iostream>
#include "Header1.h"
void aFunction()
{
std::cout << Boo::mylist.size();
}
And make main.cpp:
#include <iostream>
#include "Header1.h"
void aFunction();
int main()
{
std::cout << Boo::mylist.size();
aFunction();
return 0;
}
When the program exits I get an exception here when the std::list is being cleared. Here is the Visual Studio debug code where it crashes:
for (_Nodeptr _Pnext; _Pnode != this->_Myhead(); _Pnode = _Pnext)
{ // delete an element
_Pnext = _Pnode->_Next; // Here: Exception thrown:
// read access violation.
// _Pnode was 0xFFFFFFFFFFFFFFFF.
this->_Freenode(_Pnode);
}
This happens only if I declare the static inline std::list< int > mylist in the class. If I declare it as static std::list< int > mylist in my class and then define it separately in one .cpp as std::list< int > Boo::mylist; it works fine. This problem arises when I declare the std::list static inline and I include the header for the class in two .cpp files.
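For reference, the variant that works can be sketched in a single file as follows. The comments mark which part would live in header1.h and which in exactly one .cpp file. The lazy_list accessor is an additional, hypothetical alternative (a function-local static) that is not from the question; it is shown only as another way to avoid a separate out-of-class definition:

```cpp
#include <cassert>
#include <list>

// Would live in header1.h: declaration only, no 'inline'.
struct Boo
{
    static std::list<int> mylist;

    // Hypothetical alternative (not from the question): a function-local
    // static that is constructed on first use.
    static std::list<int>& lazy_list()
    {
        static std::list<int> instance;
        return instance;
    }
};

// Would live in exactly one .cpp file: the single definition.
std::list<int> Boo::mylist;

// Touches both lists once and checks that each holds one element.
void boo_example()
{
    Boo::mylist.push_back(1);
    Boo::lazy_list().push_back(2);
    assert(Boo::mylist.size() == 1);
    assert(Boo::lazy_list().size() == 1);
}
```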
In my project I have stepped through the std::list clear loop from above, I took note of the "this" pointer address. I stepped through the loop as it freed nodes in my list. It then came back to free other std::lists, including in std::unordered_map (as they also use std::lists from the looks of it). Finally when the read access exception is thrown and _Pnode is an invalid pointer address, I noticed the "this" pointer address is the same as the "this" pointer address when clearing std::list< int > mylist, which makes me think that it's trying to delete it twice, and probably why it's crashing.
I hope someone can reproduce this. I'm not sure whether it's a bug or something I'm doing wrong. Also, this happens for me in both 32 and 64 bit, but only in debug mode, because the node freeing loop I provided is under a macro:
#if _ITERATOR_DEBUG_LEVEL == 2
This issue was filed as a bug here under the title "Multiple initializations of inline static data member in Debug mode".
This was found in Visual Studio 2017 version 15.7.
The VS compiler team has accepted this and has fixed the problem in an upcoming release.
While trying to replicate the behavior in this question in Visual Studio 2017 I found that instead of linking &FuncTemplate<C> to the exact same address the function template<> FuncTemplate<C>() {} gets copied into dllA and dllB so that the corresponding test program always returns not equal.
The solution was set up fresh with 3 Win32Projects, one as ConsoleApplication, the others as DLL. To link the DLLs I added them as references to the console project (linking manually didn't work either). The only change in code I made was adding the __declspec(dllexport) to a() and b().
Is this behavior standard-conformant? It seems like the ODR should be used here to collapse the copies of the function. Is there a way to get the same behavior seen in the other question?
Template.h
#pragma once
typedef void (*FuncPtr)();
template<typename T>
void FuncTemplate() {}
class C {};
a.cpp - dll project 1
#include "Template.h"
__declspec(dllexport) FuncPtr a() {
return &FuncTemplate<C>;
}
b.cpp - dll project 2
#include "Template.h"
__declspec(dllexport) FuncPtr b() {
return &FuncTemplate<C>;
}
main.cpp - console project
#include <iostream>
#include "Template.h"
// seems like there is no __declspec(dllimport) needed here
FuncPtr a();
FuncPtr b();
int main() {
std::cout << (a() == b() ? "equal" : "not equal") << std::endl;
return 0;
}
C++ compilation is generally split into two parts, the compiler itself and the linker. It is the job of the linker to find and consolidate all the compilations of an identical function into a single unit and throw away the duplicates. At the end of a linking step, every function should either be part of the linker output or flagged as needing to be resolved at execution time from another DLL. Each DLL will contain a copy of the function if it is being used within that DLL or exported from it.
The process of resolving dynamic links at execution time is outside of the C++ tool chain, it happens at the level of the OS. It doesn't have the ability to consolidate duplicates like the linker does.
I think as far as ODR is concerned, each DLL is considered a separate executable.
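By contrast, when both callers end up in the same linked image, only one copy of the instantiation survives and the addresses match. A minimal sketch, collapsing the question's a() and b() into one translation unit (so no DLLs are involved and the __declspec annotations are dropped; check_same_address is a hypothetical helper added for the demonstration):

```cpp
#include <cassert>

typedef void (*FuncPtr)();

template <typename T>
void FuncTemplate() {}

class C {};

// Both functions are part of the same image here, so they refer to the
// one and only instantiation of FuncTemplate<C>.
FuncPtr a() { return &FuncTemplate<C>; }
FuncPtr b() { return &FuncTemplate<C>; }

// Hypothetical helper: verifies that both callers see the same address.
void check_same_address()
{
    assert(a() == b());
}
```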
Suppose I have a header, for example #include <GL/gl.h>. It contains a subset of the OpenGL API functions. I need something like this:
static_assert(has_glDrawArraysIndirect::value, "There is no glDrawArraysIndirect");
Or even better:
PFNGLDRAWARRAYSINDIRECTPROC ptr_glDrawArraysIndirect = ptr_to_glDrawArraysIndirect::ptr;
Where ptr_to_glDrawArraysIndirect::ptr resolves to a pointer to glDrawArraysIndirect if it's defined, or to a stub function stub_glDrawArraysIndirect otherwise.
My target operating system is very specific. Any linker based solution (like GetProcAddress or dlsym) doesn't work for me, since there is no dynamic linker. Moreover, my driver provides neither glXGetProcAddress nor wglGetProcAddress; basically there is no way to query a pointer at run time by function name (actually, I want to implement such a mechanism).
Any ideas?
Here is an answer that can detect it at compile time and produce a boolean value. It works by creating a template function of the same name in a namespace and then using that namespace inside of the is_defined() function. If the real glDrawArraysIndirect() exists it will take preference over the template version. If you comment out the first declaration of glDrawArraysIndirect() the static assert at the bottom will trigger.
#include <type_traits>
enum GLenum {};
void glDrawArraysIndirect(GLenum, const void*);
namespace detail {
struct dummy;
template<typename T>
dummy& glDrawArraysIndirect(T, const void*);
}
constexpr bool is_defined()
{
using namespace detail;
using ftype = decltype(glDrawArraysIndirect(GLenum(), nullptr));
return std::is_same<ftype, void>();
}
static_assert(is_defined(), "not defined");
With a little tweak you can make your custom function the template and use a similar trick:
#include <type_traits>
#include <iostream>
//#define USE_REAL
enum GLenum {TEST};
typedef void (*func_type)(GLenum, const void*);
#ifdef USE_REAL
void glDrawArraysIndirect(GLenum, const void*);
#endif
namespace detail {
struct dummy {};
template<typename T = dummy>
void glDrawArraysIndirect(GLenum, const void*, T = T())
{
std::cout << "In placeholder function" << std::endl;
}
}
void wrapDraw(GLenum x, const void* y)
{
using namespace detail;
glDrawArraysIndirect(x, y);
}
#ifdef USE_REAL
void glDrawArraysIndirect(GLenum, const void*)
{
std::cout << "In real function" << std::endl;
}
#endif
int main()
{
wrapDraw(TEST, nullptr);
}
Include the expression sizeof(&::function) somewhere. (If the function exists, then asking for the size of a pointer to the function is a perfectly valid thing to do; sizeof cannot be applied to the function itself in standard C++.)
It will be benign at runtime, and :: forces the use of the function declared at global scope.
Of course, if function does not exist at global scope, then compilation will fail.
Along with other errors, the compiler will issue a specific error if you were to write something along the lines of
static_assert(sizeof(&::function) > 0, "There is no global function");
My target operating system is very specific. Any linker based solution (like GetProcAddress or dlsym) doesn't work for me, since there is no dynamic linker.
Is this an embedded system or just a weirdly stripped down OS running on standard PC hardware?
Moreover, my driver doesn't provide glXGetProcAddress or wglGetProcAddress; basically there is no way to query a pointer at run time by function name
The ability to query function pointers at runtime does not depend on the presence of a dynamic linker. The two are completely orthogonal, and even a purely statically linked embedded OpenGL implementation can offer a GetProcAddress interface just fine. Instead of trying to somehow solve the problem at compile or link time, I'd rather address it by implementing a GetProcAddress for your OpenGL driver; you can do that even if the driver is available only as a static library in binary form. The steps:
Create function pointer stubs for each and every OpenGL function, statically initialized to NULL and attributed weak linkage. Link this into a static library you may call gl_null_stubs or similar.
Create a GetProcAddress function that for every OpenGL function there is returns the pointer to the function symbol within the scope of the function's compilation unit.
Now link your weird OpenGL driver with the stubs library and the GetProcAddress implementation. For every function the driver provides, the weak linkage of the stub will allow the static library symbol to take precedence. For all OpenGL symbols not in your driver, the stubs will take over.
There: now you have an OpenGL driver library that has a GetProcAddress implementation. That wasn't that hard, was it?
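The lookup half of this can be sketched as a plain name table. Everything below is hypothetical: my_get_proc_address, the table, and the two stand-in entry points are invented for illustration, and the weak-linkage stubs described above are left out because they depend on toolchain-specific attributes:

```cpp
#include <cassert>
#include <cstring>

// Generic function pointer type; the caller casts it back to the real
// signature before calling, as with the usual GetProcAddress interfaces.
using proc_type = void (*)();

// Hypothetical stand-ins for entry points provided by the driver.
void fake_glFlush() {}
void fake_glLineWidth(float) {}

struct proc_entry
{
    const char* name;
    proc_type proc;
};

// One row per function the lookup should know about.
static const proc_entry proc_table[] = {
    {"glFlush", &fake_glFlush},
    {"glLineWidth", reinterpret_cast<proc_type>(&fake_glLineWidth)},
};

// Returns the registered pointer for 'name', or nullptr if it is unknown.
proc_type my_get_proc_address(const char* name)
{
    for (const proc_entry& entry : proc_table)
    {
        if (std::strcmp(entry.name, name) == 0)
            return entry.proc;
    }
    return nullptr;
}
```

A linear scan is enough for a sketch; a sorted table or a hash map would be the obvious next step if the function list is long.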
How to check if function is declared in global scope at compile time?
My target operating system is very specific...
A possible solution, if you are using a recent GCC (probably as a cross-compiler for your weird target OS and ABI), might be to customize the gcc (or g++ etc.) compiler with your own MELT extension.
MELT is a domain specific language, implemented as a free software GCC plugin (mostly on Linux), to customize the GCC compiler.
Suppose there's a library, one version of which defines a function with name foo, and another version has the name changed to foo_other, but both these functions still have the same arguments and return values. I currently use conditional compilation like this:
#include <foo.h>
#ifdef USE_NEW_FOO
#define trueFoo foo_other
#else
#define trueFoo foo
#endif
But this requires some external detection of the library version and setting the corresponding compiler option like -DUSE_NEW_FOO. I'd rather have the code automatically figure what function it should call, based on it being declared or not in <foo.h>.
Is there any way to achieve this in any version of C?
If not, will switching to any version of C++ provide me any ways to do this (assuming the library does all the needed actions like extern "C" blocks in its headers)? Namely, I'm thinking of somehow making use of SFINAE, but for a global function rather than a method, which was discussed in the linked question.
In C++ you can use expression SFINAE for this:
#include <utility> // std::forward
// this template is only enabled if foo is declared with the right args
template <typename... Args>
auto trueFoo (Args&&... args) -> decltype(foo(std::forward<Args>(args)...))
{
return foo(std::forward<Args>(args)...);
}
// ditto for foo_other
template <typename... Args>
auto trueFoo (Args&&... args) -> decltype(foo_other(std::forward<Args>(args)...))
{
return foo_other(std::forward<Args>(args)...);
}
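To try this out without the real library, the sketch below replaces <foo.h> with a hypothetical stand-in that declares only foo_other. The overload that mentions foo then drops out during substitution, and the call resolves to the foo_other wrapper. Two caveats, stated as assumptions about typical use: the declaration must be visible before the templates, and if a header declared both names the call would become ambiguous.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for <foo.h>: this "library version" declares only
// foo_other. Rename it to foo to simulate the other version.
int foo_other(int x) { return x + 1; }

// Only viable if a suitable foo is declared.
template <typename... Args>
auto trueFoo(Args&&... args) -> decltype(foo(std::forward<Args>(args)...))
{
    return foo(std::forward<Args>(args)...);
}

// Only viable if a suitable foo_other is declared.
template <typename... Args>
auto trueFoo(Args&&... args) -> decltype(foo_other(std::forward<Args>(args)...))
{
    return foo_other(std::forward<Args>(args)...);
}

// Calls through the wrapper; with the stand-in above this reaches foo_other.
void true_foo_example()
{
    assert(trueFoo(41) == 42);
}
```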
If you are statically linking to a function, in most versions of C++, the name of the function is "mangled" to reflect its argument list. Therefore, an attempt to statically link to the library, by a program with an out-of-date .hpp file, will result in an "unknown symbol" linker-error.
In the C language, there's no metadata of any kind which indicates what the argument list of any exported function actually is.
Realistically, I think, you simply need to be sure that the .h or .hpp files that you're using to link to a library, actually reflect the corresponding object-code within whatever version of that library you are using. You also need to be sure that the Makefile (or "auto-make" process) will correctly identify any-and-all modules within your application which link-to that library and which therefore must be recompiled in case of any changes to it. (If it were me, I would recompile the entire application.) In short, you must see to it that this issue doesn't occur.
In C++ you can do something like this:
#include <iostream>
#include <type_traits>
//#define DEFINE_F
#ifdef DEFINE_F
void f()
{
}
#endif
namespace
{
constexpr struct special
{
std::false_type operator()() const;
}f;
}
struct checkForF
{
static const constexpr auto value = std::conditional< std::is_same<std::false_type, decltype(::f())>::value, std::false_type, std::true_type >::type();
};
int main()
{
std::cout << checkForF::value << std::endl;
}
Please note I only handle f without any parameters.
Background:
I've found myself with the unenviable task of porting a C++ GNU/Linux application over to Windows. One of the things this application does is search for shared libraries on specific paths and then load classes out of them dynamically using the POSIX dlopen() and dlsym() calls. We have a very good reason for doing loading this way that I will not go into here.
The Problem:
To dynamically discover symbols generated by a C++ compiler with dlsym() or GetProcAddress() they must be unmangled by using an extern "C" linkage block. For example:
#include <list>
#include <string>
using std::list;
using std::string;
extern "C" {
list<string> get_list()
{
list<string> myList;
myList.push_back("list object");
return myList;
}
}
This code is perfectly valid C++ and compiles and runs on numerous compilers on both Linux and Windows. It, however, does not compile with MSVC because "the return type is not valid C". The workaround we've come up with is to change the function to return a pointer to the list instead of the list object:
#include <list>
#include <string>
using std::list;
using std::string;
extern "C" {
list<string>* get_list()
{
list<string>* myList = new list<string>();
myList->push_back("ptr to list");
return myList;
}
}
I've been trying to find an optimal solution for the GNU/Linux loader that will either work with both the new functions and the old legacy function prototype or at least detect when the deprecated function is encountered and issue a warning. It would be unseemly for our users if the code just segfaulted when they tried to use an old library. My original idea was to set a SIGSEGV signal handler during the call to get_list (I know this is icky - I'm open to better ideas). So just to confirm that loading an old library would segfault where I thought it would I ran a library using the old function prototype (returning a list object) through the new loading code (that expects a pointer to a list) and to my surprise it just worked. The question I have is why?
The below loading code works with both function prototypes listed above. I've confirmed that it works on Fedora 12, RedHat 5.5, and RedHawk 5.1 using gcc versions 4.1.2 and 4.4.4. Compile the libraries using g++ with -shared and -fPIC and the executable needs to be linked against dl (-ldl).
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <list>
#include <string>
using std::list;
using std::string;
int main(int argc, char **argv)
{
void *handle;
list<string>* (*getList)(void);
char *error;
handle = dlopen("library path", RTLD_LAZY);
if (!handle)
{
fprintf(stderr, "%s\n", dlerror());
exit(EXIT_FAILURE);
}
dlerror();
*(void **) (&getList) = dlsym(handle, "get_list");
if ((error = dlerror()) != NULL)
{
printf("%s\n", error);
exit(EXIT_FAILURE);
}
list<string>* libList = (*getList)();
for(list<string>::iterator iter = libList->begin();
iter != libList->end(); iter++)
{
printf("\t%s\n", iter->c_str());
}
dlclose(handle);
exit(EXIT_SUCCESS);
}
As aschepler says, it's because you got lucky.
As it turns out, the ABI used for gcc (and most other compilers) for both x86 and x64 returns 'large' structs (too big to fit in a register) by passing an extra 'hidden' pointer arg to the function, which uses that pointer as space to store the return value, and then returns the pointer itself. So it turns out that a function of the form
struct foo func(...)
is roughly equivalent to
struct foo *func(..., struct foo *)
where the caller is expected to allocate space for a 'foo' (probably on the stack) and pass in a pointer to it.
So it just happens that if you have a function that is expecting to be called this way (expecting to return a struct) and instead call it via a function pointer that returns a pointer, it MAY appear to work. If the garbage bits it gets for the extra arg (random register contents left there by the caller) happen to point to somewhere writable, the called function will happily write its return value there and then return that pointer, so the calling code will get back something that looks like a valid pointer to the struct it is expecting. So the code may superficially appear to work, but it's actually probably clobbering a random bit of memory that may be important later.
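The lowering described above can be imitated by hand. The sketch below is an illustration, not real compiler output: get_list_lowered and hidden_pointer_example are hypothetical names, and the explicit storage parameter plays the role of the hidden pointer argument. It shows the well-behaved case, where the caller really does supply valid storage:

```cpp
#include <cassert>
#include <list>
#include <new>
#include <string>

using string_list = std::list<std::string>;

// What the library author wrote: return by value.
string_list get_list()
{
    string_list myList;
    myList.push_back("list object");
    return myList;
}

// Hand-written imitation of the lowered form: the caller passes storage,
// the callee constructs the result there and returns that same address.
string_list* get_list_lowered(void* result_storage)
{
    return new (result_storage) string_list(get_list());
}

// Plays the role of a caller that supplies valid storage for the result.
void hidden_pointer_example()
{
    alignas(string_list) unsigned char storage[sizeof(string_list)];
    string_list* result = get_list_lowered(storage);
    assert(static_cast<void*>(result) == static_cast<void*>(storage));
    assert(result->front() == "list object");
    result->~string_list();  // the caller owns the object, so it destroys it
}
```

In the mismatched call from the question, nothing sets up that storage argument, which is why the write lands wherever the leftover register value happens to point.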