How to Bypass a Standard C++ Function While Maintaining Its Functionality - c++

I am looking for a way to redefine a set of POSIX functions but end each redefinition with a call to the original function. The idea is to create a layer that can restrict which OS APIs can be called depending on which "profile" is active. The profile determines which set of functions is allowed; any function not specified should not be used.
For example, if in one profile I am not allowed to use strcpy, I would like to be able to either cause a compile time error (via static_assert) or print something to the screen saying "strcpy is not allowed in this profile" such as below:
MY_string.h
#include <string.h>
char *strcpy(char *restrict s1, const char *restrict s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assert(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return strcpy(s1, s2);
#endif
}
So that way within main.cpp I can use MY_string.h
#define PROFILE_PASS_THROUGH
#include "MY_string.h"
int main()
{
char temp1[10];
char temp2[10];
sprintf(temp2, "Testing");
if (0 != strcpy(temp1, temp2))
{
printf("temp1 is %s\n", temp1);
}
return 0;
}
Now I realize that the code I have written above will not compile properly due to the redefinition of strcpy, but is there a way to allow this sort of functionality without playing around with macros or creating my own standard C and C++ libraries?

You can write a preprocessor that changes calls to the standard routine to calls to your own routine. Such a preprocessor might be complicated, depending whether you need to recognize the full C++ grammar to distinguish calls using name spaces and so on or you can get away with more casual recognition of the calls.
You can link with your own library, producing a relocatable object module with resolved names stripped. Your library would contain routines with the standard names, such as strcpy, that execute whatever code you desire and call other names, such as Mystrcpy. The object module produced by this is then linked with a second library and with the standard library. The second library contains routines with those names, such as Mystrcpy, that call the original library names strcpy. The details for doing this are of course dependent on your linker. The goal is to have a chain like this: Original code calls strcpy. This is resolved to the version of strcpy in the first library. That version calls Mystrcpy. Mystrcpy calls the standard library strcpy.
You can compile to assembly and edit the names in the assembly so that your routines are called instead of the standard library routines.
On some systems, you can use dlsym and other functions defined in <dlfcn.h> to load the dynamic library that contains the standard implementations and to call them via pointers returned by dlsym instead of by the usual names in source code.
The GNU linker (ld) has a --wrap=foo switch that resolves undefined references to foo to your routine __wrap_foo and resolves references to __real_foo (which you would use in your implementation) to the real foo. When linking through GCC, it is passed as -Wl,--wrap=foo.
See also Intercepting Arbitrary Functions on Windows, UNIX, and Macintosh OS X Platforms.

No, cannot be done in C++. What you want is more akin to a LISP (or derivative) language, where you can grab the slot for an existing function and 'override it in place', potentially punting back to the original implementation.

The typical way of doing this on Unix is via LD_PRELOAD. The example below proxies a function call, malloc in particular:
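#include <dlfcn.h> // dlsym, RTLD_NEXT (glibc needs _GNU_SOURCE, which g++ defines by default)
#include <cstddef> // size_t
/**
* malloc() direct call: finds the next definition of malloc after this library
*/
inline void * libc_malloc(size_t size)
{
typedef void* (*malloc_func_t)(size_t);
// looked up once, on the first call
static malloc_func_t malloc_func = (malloc_func_t) dlsym(RTLD_NEXT, "malloc");
return malloc_func(size);
}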

In your MY_String.h:
... blah blah
using mynamespace::strcpy;
#endif // header guard or maybe not there if using pragma
then all strcpys that are not prefixed with std:: will use yours. If you REALLY want to ban them, grep and take a shotgun with you when you find the person who used it.

If you are using a recent GCC (e.g. version 4.7 or newer), you could also write a GCC plugin or a GCC extension in MELT to replace every call to strcpy with a call to your own mystrcpy. This will probably take some work (perhaps days, not hours), but it has the enormous advantage of working inside the compiler, on GCC's internal representation (Gimple). So it will be done even after inlining, etc. And since you extend the compiler, you can tailor its behavior to what you want.
MELT is a domain specific language to extend GCC. It is designed for such tasks.

You cannot prevent these functions from being called.
A C++ program can do anything it wants; it could contain code that loads the strcpy symbol from libc and runs it. If a malicious developer wants to call that function, you have no way to prevent it. To do that you'd need to run the C++ code in some special environment (a sandbox or a virtual machine), but I'm afraid such technology is not available.
If you trust the developers and are just looking for a way to remind them not to call certain functions, then there are some possible solutions.
One solution is to avoid including libc headers (like cstring) and to include only your own header files, in which you declare only the permitted functions.
Another solution is to inspect the compiled executable to find out which functions are called, or to LD_PRELOAD a library that redefines (and thus overrides) the standard functions so that they print a warning at runtime.

Here is how you would change MY_string.h:
#include <cstdio> // for printf
#include <cstring>
namespace my_functions{
char *strcpy(char *s1, const char *s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return std::strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assert(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return std::strcpy(s1, s2);
#endif
}
}
using namespace my_functions;
For this to work, your code must not contain using namespace std;.
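One caveat: a static_assert(0, ...) inside an ordinary function fires as soon as the header is compiled with PROFILE_ERROR defined, whether or not strcpy is ever called. A minimal sketch of a variant that fails only when a call is actually compiled is shown here. It assumes C++11, and the names checked_strcpy and always_false are hypothetical; a distinct function name is used because <cstring> commonly also declares strcpy in the global namespace, which could make an unqualified call ambiguous.

```cpp
#include <cstdio>
#include <cstring>

namespace my_functions {

// Always false, but dependent on T, so a static_assert that uses it is only
// evaluated when the enclosing template is instantiated by a real call.
template <typename T>
struct always_false { static const bool value = false; };

// Hypothetical wrapper. The defaulted template parameter exists only to
// delay the static_assert until the function is actually used.
template <typename Dummy = void>
char *checked_strcpy(char *dst, const char *src)
{
#if defined(PROFILE_ERROR)
    static_assert(always_false<Dummy>::value,
                  "strcpy is not allowed in this profile");
    return nullptr;
#elif defined(PROFILE_PASS_THROUGH)
    std::printf("strcpy is not allowed in this profile\n");
    return std::strcpy(dst, src);
#else
    return std::strcpy(dst, src);
#endif
}

} // namespace my_functions
```

Callers would write my_functions::checked_strcpy(temp1, temp2); translation units that never call it compile cleanly even under PROFILE_ERROR.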

Related

Can I include a DLL generated by GCC in a MSVC project?

I have a library of code I'm working on upgrading from x86 to x64 for a Windows application.
Part of the code took advantage of MSVC inline assembly blocks. I'm not looking to go through and interpret the assembly but I am looking to keep functionality from this part of the application.
Can I compile the functions using the inline assembly using GCC to make a DLL and link that to the rest of the library?
EDIT 1 (7/7/21): The choice of compiler for the project is open, and I am currently looking into using Clang with MSVC (the Intel C++ compiler is another possibility). As stated in the first sentence, it is a Windows application that I want to keep on Windows. The purpose of using another compiler is that 1) I do not want to rewrite the large amount of assembly and 2) I know that MSVC does not support x64 inline assembly. So far Clang seems to be working, with a couple of issues around how comments are declared inside the assembly block and a few commands. The function performs mathematical operations on a block of data and was written to be as fast as possible when it was developed, but now that it works as intended I'm not looking to upgrade it, just to maintain functionality. So any compiler that supports inline assembly is an option.
EDIT 2 (7/7/21): I forgot to mention in the first edit that I'm not necessarily looking to load the 32-bit DLL into another process, because I'm worried about copying data into and out of shared memory. I've used a similar solution for another project, but the data set here is around 8 MB, and I'm worried that slow copy times would break the time constraint on the math and cause issues at runtime (slowness, lag, and buffering are effects I'm trying to avoid). I'm not trying to make it any faster, but it definitely can't get any slower.
In theory, if you manage to create a plain C interface for that DLL (all symbols exported from the DLL are standard C functions) and don't use memory management functions across the "border" (no mixed memory management), then you should at least be able to dynamically load that DLL from another (MSVC) process and call its functions.
Not sure about statically linking against it... probably not, because the compiler and linker must go hand in hand (MSVC compiler + MSVC linker, or GCC compiler + GCC linker). The output of the GCC linker is probably not compatible with MSVC, at least regarding name mangling.
Here is how I would structure it (without small details):
Header.h (separate header to be included in both DLL and EXE)
//... remember to use your preferred calling convention but be consistent about it
struct Interface{
void (*func0)();
void (*func1)(int);
//...
};
typedef Interface* (*GetInterface)();
DLL (gcc)
#include "Header.h"
//functions implementing specific functionality (not exported)
void f0(){/*...*/}
void f1(int){/*...*/}
//...
// exported with C linkage so that GetProcAddress can find the unmangled name
extern "C" __declspec(dllexport) Interface* getInterface(){
static Interface table;
//initialize the function pointers in the table with the corresponding functions
table.func0 = &f0;
table.func1 = &f1;
//...
return &table;
}
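EXE (MSVC)
#include <windows.h>
#include "Header.h"
int main(){
HMODULE dll = LoadLibraryA("DLL.dll");
if (!dll) return 1; // DLL not found or failed to load
auto getDllInterface = (GetInterface)GetProcAddress(dll, "getInterface");
if (!getDllInterface) return 1; // symbol is not exported under this name
auto* dllInterface = getDllInterface();
dllInterface->func0();
dllInterface->func1(123);
//...
FreeLibrary(dll);
return 0;
}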

C++ Plugins ABI issues on Linux

I am working on a plugin system to replace shared libs.
I am aware of ABI issues when designing an API for shared libs, and that entry points in the libs, such as exported classes, should be carefully designed.
For example, adding, removing or reordering private member variables of an exported class may lead to different memory layout and runtime errors (from my understanding, that's why the Pimpl pattern might be useful). Of course there are many other pitfalls to avoid when modifying exported classes.
I have built a small example here to illustrate my question.
First, I provide the following header for the plugin developer:
// character.h
#ifndef CHARACTER_H
#define CHARACTER_H
#include <string>
class Character
{
public:
virtual std::string name() = 0;
virtual ~Character() = 0;
};
inline Character::~Character() {}
#endif
Then the plugin is built as a shared lib "libcharacter.so" :
#include "character.h"
#include <iostream>
class Wizard : public Character
{
public:
virtual std::string name() {
return "wizard";
}
};
extern "C"
{
Character *createCharacter() // return type matches the Character *(*)() pointer used by the host
{
return new Wizard;
}
}
And finally the main application that uses the plugin :
#include "character.h"
#include <iostream>
#include <dlfcn.h>
int main(int argc, char *argv[])
{
(void)argc, (void)argv;
using namespace std;
Character *(*creator)();
void *handle = dlopen("../character/libcharacter.so", RTLD_NOW);
if (handle == nullptr) {
cerr << dlerror() << endl;
exit(1);
}
void *f = dlsym(handle, "createCharacter");
creator = (Character *(*)())f;
Character *character = creator();
cout << character->name() << endl;
dlclose(handle);
return 0;
}
Is it sufficient to define an abstract class to get rid of all ABI issues?
Short answer:
No.
I wouldn't recommend using C++ for a plugin API (see longer answer below), but if you do decide to stick with C++ then:
Don't use any standard library types in your plugin API.
For instance, Character::name() returns a std::string. If the implementation of std::string ever changes (and it has in the past in GCC) then that will result in Undefined Behavior. Really, anything that you don't control (any third-party library) shouldn't be used in the API.
Don't use exceptions or RTTI across the plugin boundary. On Linux exceptions and RTTI might work if you load the plugin with RTLD_GLOBAL (not a good idea for plugins) and both the host and the plugin use the same runtime. But in general you either won't be able to catch exceptions from another module, or they might even cause heap corruption (if they are allocated by different runtimes).
Only add functions to the end of your abstract classes, or everything will silently break because of the vtable layout changing (and that can be really hard to diagnose).
Always allocate and deallocate an object from the same module. I noticed you don't have a destroyCharacter() function (main() actually leaks the character but that's another question). Always provide symmetric create and destroy functions for resources created by a different module (shared library or plugin).
I believe on Linux with GCC the host application's operator new and operator delete get correctly propagated to the loaded plugin (through weak symbols), but if you want it to work on Windows then don't assume that operator new and operator delete in the host application and the plugin are the same. A statically linked runtime, especially built with LTO, might also mess with this.
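The points above about standard library types and symmetric allocation can be sketched as follows. This is only an illustration under assumptions: destroyCharacter is the hypothetical counterpart that the question's plugin lacks, and name() is changed to return a const char* owned by the plugin so that no std::string crosses the boundary.

```cpp
// character.h (shared by host and plugin)
class Character
{
public:
    // Returns a null-terminated string owned by the plugin;
    // no standard library type appears in the interface.
    virtual const char *name() const = 0;
    virtual ~Character() {}
};

// plugin side
class Wizard : public Character
{
public:
    virtual const char *name() const { return "wizard"; }
};

extern "C"
{
// Allocation and deallocation both happen inside the plugin module.
Character *createCharacter()
{
    return new Wizard;
}

void destroyCharacter(Character *character)
{
    delete character;
}
}
```

The host would look up both symbols with dlsym and call destroyCharacter on every object it obtained from createCharacter, instead of calling delete itself.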
Longer answer:
There are a lot of possible issues when exporting a C++ API from a plugin.
Generally speaking, there are no guarantees about it working if anything about the toolchains used to build the host application and the plugin differs. This can include (but is not limited to) compilers, versions of the language, compiler flags, preprocessor definitions, etc.
The common wisdom regarding plugins is to use a pure C89 API, because the C ABI on all common platforms is very stable.
Keeping to the common subset of C89 and C++ will mean that the host and plugin can use different language standards, standard libraries, etc. Unless the host or the plugin are built with some weird (and probably non-standard-conforming) APIs, this should be reasonably safe. Obviously, you still have to be careful with data structure layouts.
You can then provide a rich C++ header-only wrapper for the C API that handles lifetime and error code/exception conversions, etc.
As a nice bonus, C APIs are producible and consumable by most languages, which could allow the plugin authors to use not just C++.
There are actually quite a few pitfalls even in a C API. If we're being pedantic then the only safe things are functions with fixed-size arguments and return types (pointers, size_t, [u]intN_t) - not even necessarily built-in types (short, int, long, ...), or enums. E.g. in GCC: -fshort-enums can change the size of enums, -fpack-struct[=n] can change the padding within structs.
So, if you really want to be safe then don't use enums and either pack all your structs or don't expose them directly (instead expose accessor functions).
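As a rough illustration of that stricter style, the sketch below uses only fixed-width integer types, replaces an enum with a typedef'd integer, and checks the struct layout at compile time. All names are hypothetical, and the expected sizes assume 8-bit bytes.

```cpp
#include <cstddef>
#include <cstdint>

// A fixed-width integer instead of an enum, so a flag such as
// -fshort-enums cannot change the size of the type.
typedef std::int32_t game_plugin_status;
static const game_plugin_status GAME_PLUGIN_OK = 0;
static const game_plugin_status GAME_PLUGIN_FAILED = 1;

// Members are ordered from largest to smallest so that natural alignment
// leaves no padding; packed and unpacked builds then agree on the layout.
struct game_plugin_stats
{
    std::uint64_t experience;
    std::uint32_t health;
    std::uint16_t level;
    std::uint16_t flags;
};

// Compile-time checks that both sides can keep in the shared header.
static_assert(sizeof(game_plugin_stats) == 16, "unexpected struct size");
static_assert(offsetof(game_plugin_stats, health) == 8, "unexpected offset");
static_assert(offsetof(game_plugin_stats, level) == 12, "unexpected offset");
static_assert(offsetof(game_plugin_stats, flags) == 14, "unexpected offset");
```

If the host and the plugin are built with flags that disagree about this layout, at least one of the builds fails at compile time instead of misbehaving at run time.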
Other considerations:
These aren't strictly related to the question but should definitely be considered before committing to a specific style of API.
Error handling: Whether or not you use C++, you'll need an alternative to exceptions.
This will probably be some form of error code. std::error_code in C++ can be then used to wrap the raw enum/int as soon as you're in C++ land, and if the API uses C++ then a std::expected-like or Boost.Outcome-like type with a stable ABI could be used.
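A minimal sketch of that wrapping, assuming C++11: the raw code crosses the boundary as a plain integer and is turned into a std::error_code (or an exception) only on the C++ side. The game_plugin_not_found value, the category class, and throw_if_failed are hypothetical names used for illustration.

```cpp
#include <string>
#include <system_error>

// Raw codes as they would appear in the C API (values are hypothetical).
enum game_plugin_error_code { game_plugin_success = 0, game_plugin_not_found = 1 };

// Category that gives the raw integers a name and a message.
class game_plugin_category_impl : public std::error_category
{
public:
    const char *name() const noexcept override { return "game_plugin"; }
    std::string message(int code) const override
    {
        switch (code) {
        case game_plugin_success: return "success";
        case game_plugin_not_found: return "not found";
        default: return "unknown plugin error";
        }
    }
};

inline const std::error_category &game_plugin_category()
{
    static const game_plugin_category_impl instance{};
    return instance;
}

// Wraps a raw code once it has crossed into C++.
inline std::error_code make_error_code(game_plugin_error_code code)
{
    return std::error_code(static_cast<int>(code), game_plugin_category());
}

// Converts a failing code into an exception, on the C++ side only.
inline void throw_if_failed(game_plugin_error_code code)
{
    if (code != game_plugin_success)
        throw std::system_error(make_error_code(code));
}
```

The exception never crosses the plugin boundary: it is created in whichever module called throw_if_failed and is caught there.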
Loading the plugin and importing symbols: With abstract classes it's easy - a simple factory function is all you need. With a traditional C API you might end up needing to import hundreds of symbols. One way of dealing with that would be to emulate a vtable in C. Make each object that has associated functions start with a pointer to a dispatch table, e.g.
typedef struct game_string_view { const char *data; size_t size; } game_string_view;
typedef enum game_plugin_error_code { game_plugin_success = 0, /* ... */ } game_plugin_error_code;
typedef struct game_plugin_character_impl *GamePluginCharacter; // handle to a Character
typedef struct game_plugin_character_dispatch_table { // basically a vtable
void (*destroy)(GamePluginCharacter character); // you could even put destroy() here
game_string_view (*name)(GamePluginCharacter character);
void (*update)(GamePluginCharacter character, /*...*/, game_plugin_error_code *ec); // might fail
} game_plugin_character_dispatch_table;
typedef struct game_plugin_character_impl {
// every call goes through this table and takes GamePluginCharacter as its first argument
const game_plugin_character_dispatch_table *dispatch;
} game_plugin_character_impl;
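To show how such a table might be filled in and consumed, here is a hedged sketch. It repeats the declarations above without update() so that it is self-contained; wizard_impl, create_wizard, and the CharacterHandle wrapper are hypothetical names, and both sides live in one file only for the sake of the example.

```cpp
#include <cstddef>
#include <string>

// C-compatible declarations (update() omitted for brevity).
typedef struct game_string_view { const char *data; std::size_t size; } game_string_view;
typedef struct game_plugin_character_impl *GamePluginCharacter;
typedef struct game_plugin_character_dispatch_table {
    void (*destroy)(GamePluginCharacter character);
    game_string_view (*name)(GamePluginCharacter character);
} game_plugin_character_dispatch_table;
typedef struct game_plugin_character_impl {
    const game_plugin_character_dispatch_table *dispatch;
} game_plugin_character_impl;

// Plugin side: the dispatch pointer is the first member, so the handle
// can be converted back to the concrete type.
struct wizard_impl {
    game_plugin_character_impl base;
    int mana;
};

static void wizard_destroy(GamePluginCharacter character)
{
    delete reinterpret_cast<wizard_impl *>(character);
}

static game_string_view wizard_name(GamePluginCharacter)
{
    game_string_view view = { "wizard", 6 };
    return view;
}

static const game_plugin_character_dispatch_table wizard_table = {
    wizard_destroy, wizard_name
};

extern "C" GamePluginCharacter create_wizard()
{
    wizard_impl *wizard = new wizard_impl;
    wizard->base.dispatch = &wizard_table;
    wizard->mana = 100;
    return &wizard->base;
}

// Host side: a small RAII wrapper that only ever calls through the table.
class CharacterHandle {
public:
    explicit CharacterHandle(GamePluginCharacter handle) : handle_(handle) {}
    ~CharacterHandle() { if (handle_) handle_->dispatch->destroy(handle_); }
    CharacterHandle(const CharacterHandle &) = delete;
    CharacterHandle &operator=(const CharacterHandle &) = delete;

    std::string name() const
    {
        const game_string_view view = handle_->dispatch->name(handle_);
        return std::string(view.data, view.size);
    }

private:
    GamePluginCharacter handle_;
};
```

With this layout the host needs to import only one symbol per object type (here create_wizard); everything else is reached through the table.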
Future extensibility and compatibility: You should design the API knowing that you'll want to change it in the future while keeping compatibility. IMO, a C API lends itself well to this because it forces you to be very precise about what is exposed. The plugin should be able to expose its API version to the host in a way that is forward and backward compatible.
It's a good idea to think about extensibility when designing each function signature. E.g. if a struct is passed by pointer (instead of by value), then its size can be extended without breaking compatibility (so long as at run time both the caller and the callee agree on its size).
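One common way for caller and callee to agree on the size at run time is to make it the first field of the struct. The sketch below assumes that convention; spawn_options, its fields, and effective_mana are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Current version of the struct. The first version ended after level;
// mana was appended later, so the older fields keep their offsets.
struct spawn_options {
    std::uint32_t struct_size;  // caller sets this to the size of the struct it was built with
    std::uint32_t level;
    std::uint32_t mana;         // appended in a later version
};

// Reads only the fields that the caller's struct is large enough to hold.
extern "C" std::uint32_t effective_mana(const spawn_options *options)
{
    const std::uint32_t default_mana = 50;
    if (options->struct_size >= offsetof(spawn_options, mana) + sizeof(options->mana))
        return options->mana;
    return default_mana;  // older caller: the field is not present
}
```

A caller compiled against the older, 8-byte layout reports struct_size 8, so the callee never reads past the end of what was actually passed.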
Visibility: Maybe look into visibility on Linux and other platforms. This isn't really a question of API design, just helps clean up the symbols exported from a shared library.
All of the above is by no means exhaustive.
I would suggest the talk "Hourglass Interfaces for C++ APIs" as further "reading".
And of course there are other good talks and articles on the matter (that I can't remember off the top of my head).

How to create a DLL, which accepts strings from MT4 and returns back string type?

I have been trying for two weeks to create a DLL to which I can pass strings and get strings back, but still without success.
I tried this with Dev-C++ (TDM-GCC 4.9.2) and Visual Studio Community 2015. I searched a lot and tried almost every sample I found, but had no success.
I have to use this DLL with MetaTrader Terminal 4.
Here is one sample I used. This code compiles successfully, but when I send a string to it from MT4, I get an access violation error.
#ifndef MYLIB_HPP
#define MYLIB_HPP
#include <string>
#ifdef MYLIB_EXPORTS
#define MYLIB_API __declspec(dllexport) // building the DLL itself
#else
#define MYLIB_API __declspec(dllimport) // consuming the DLL
#endif
bool MYLIB_API MyTest(const std::string& str);
#endif
bool MYLIB_API MyTest(const std::string& str)
{
return (str == "Hi There");
}
If you share a C++ string between a DLL and another executable, both need to have been compiled with the same toolchain. This is because std::string is defined in headers only, so if the DLL and the executable use different string headers, they may well be binary incompatible.
If you want to make sure that things work with different toolchains, stick to null-terminated C strings.
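A minimal sketch of such an interface, under assumptions: only a pointer to a null-terminated char array crosses the boundary, the function has C linkage, and the result is a plain int rather than bool. The export attribute is left out so that the sketch stays portable; on Windows the function would also carry __declspec(dllexport). What MT4 actually passes for a string parameter is a separate matter and is not covered here.

```cpp
#include <string>

// std::string is used only inside the DLL; the caller never sees it.
extern "C" int MyTest(const char *str)
{
    if (str == nullptr)
        return 0;  // treat a missing string as "no match"
    const std::string text(str);
    return text == "Hi There" ? 1 : 0;
}
```

Because nothing but a pointer and an int is exchanged, the caller and the DLL no longer need to agree on the layout of std::string.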
You have just experienced one of the MQL4 tricks: an MQL4 string is not a C++ std::string but a struct. The #import on the MQL4 side therefore makes MT4 pass something that does not match what the C++ side of your DLL expects, and the access violation follows directly, as your code tried to access memory laid out by MQL4.
First rule to design API/DLL: READ the documentation very carefully.
Yes, one may object that the MQL4 documentation is somewhat tricky to follow, but that only reinforces Rule #1: read the documentation very, very, very carefully, as some important design facts are noted almost hidden in unexpected chapters, in the explanations of ENUM tables, or in side notes on compiler directives and pragmas.
Second rule: design the API/DLL interface so as to allow smooth integration.
MQL4 changed the rules somewhere around Build 670+. The good news is that MetaQuotes has announced there will be no further investment on their side into MT4 development, so the MT4 side of the DLL/API integration will hopefully stop creeping.
Given your statement that you design the DLL/API, try to design a future-proof specification: use blocks of uchar[] instead of the interpretation-sensitive string, pass both inputs and outputs by reference, and return just some form of int aReturnCODE = myDLL_FUNC( byRefParA, byRefParB, byRefRESULT );. Your efforts will result in cleaner code and better portability among third-party language wrappers, and will also minimise your further maintenance costs.
Most likely, your code and the one you're linking against have been compiled with a different ABI for std::string, i.e. the string used by the library has a different memory layout (and sizeof) than the one you're compiling with.
I once ran into this problem when linking against the hdf5 library and using gcc. In this case, the problem could be solved by reverting to a previous ABI, as explained here.
However, the problem also occurred with clang, where such a solution was not available. Thus, to make this all work I had to avoid using std::string in any calls to the library (hdf5 in my case) that was compiled with the different ABI, and instead make do with the hdf5 interface using const char*.

Shadowing functions of C stdlib/stdio

I am writing a game, and so far I have implemented a filesystem on top of SQLite with a class and its methods. To make life easier, I planned to write some functions like fopen, fclose, fread, rename, etc. to shadow the base functions and direct my calls to my filesystem rather than to the original one. For the first three functions everything worked fine with these prototypes:
File *fopen(String _Filename, String _Mode); // i have my own optimized File struct
void fclose(File *_File);
size_t fread(String *_DstBuf, size_t _ElementSize, size_t _Count, File *_File);
This worked fine, as I am either returning another struct or the parameters accept a File* and not a FILE*. However, the rename function seems to be a bit trickier!
int rename(String _OldFilename, String _NewFilename);
This is nearly the same prototype, except that I use std::string (typedef'ed as String) rather than const char*. Any idea how I could convince my compiler either to use my function or to ignore the stdio one?
And what is the reason that you cannot simply use your own functions by any other name?
If the whole conflict is with overload resolution, you should simply shadow the actual prototypes; you can make them forward to your own functions.
However, I recommend against the general approach here: even with that 'fix' in place you will at the very best have include ordering issues, and possibly even duplicate link symbols.
If your functions don't do the same thing, give them another name. Since you are using C++, you could do this vile trick (otherwise ill-advised) in MyFsFunctions.h:
namespace MyFsFunctions
{
// prototypes for fopen, fclose, fwrite, fread etc
}
using namespace MyFsFunctions;
// or:
using MyFsFunctions::fopen;
using MyFsFunctions::fclose;
using MyFsFunctions::fread;
using MyFsFunctions::fwrite; // etc...
I'm pretty sure you will still want (need) to shadow the exact function prototypes (or the compiler may still complain about ambiguous identifier references).
Other hints:
use a FUSE file system driver (on Linux/UNIX/macOS; it might be overkill, but implementing it seems a lot more robust and may even be simpler than what you are doing here).
there are always C macros (-10 points for evil)
the GNU linker has options that let you 'replace' link symbols, mainly for debugging purposes, but you can leverage those here
How about implementing a rename with the standard signature that does nothing but call your String version?
Doesn't sound complicated to me. Something like this:
int rename(const char *charOld, const char *charNew)
{
std::string stdOld(charOld);
std::string stdNew(charNew);
return rename(stdOld, stdNew);
}

wrap a c++ library in c? (don't "extern c")

Is it possible to wrap a C++ library in C?
How could I do this?
Are there any existing tools?
(I need to get access to an existing C++ library, but only from C.)
You can write object-oriented code in C, so if it's an object-oriented C++ library, it's possible to wrap it in a C interface. However, doing so can be very tedious, especially if you need to support inheritance, virtual functions and such stuff.
If the C++ library employs Generic Programming (templates), it might get really hairy (you'd need to provide all needed instances of a template) and quickly approaches the point where it's just not worth doing it.
Assuming it's OO, here's a basic sketch of how you can do OO in C:
C++ class:
class cpp {
public:
cpp(int i);
void f();
};
C interface:
#ifdef __cplusplus
extern "C" {
#endif
typedef void* c_handle;
c_handle c_create(int i)
{
return new cpp(i);
}
void c_f(c_handle hdl)
{
static_cast<cpp*>(hdl)->f();
}
void c_destroy(c_handle hdl)
{
delete static_cast<cpp*>(hdl);
}
#ifdef __cplusplus
}
#endif
Depending on your requirements, you could amend that. For example, if this is going to be a public C interface to a private C++ API, handing out real pointers as handles might make it vulnerable. In that case you would hand out handles that are, essentially, integers, store the pointers in a handle-to-pointer map, and replace the cast by a lookup.
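The integer-handle variant could look roughly like the sketch below. It is an illustration under assumptions rather than a hardened implementation: the map and counter are hypothetical globals, the code is not thread-safe, and cpp::f() is given a return value here only so that the effect of the lookup is observable.

```cpp
#include <map>

class cpp {
public:
    explicit cpp(int i) : value_(i) {}
    int f() const { return value_; }
private:
    int value_;
};

namespace {
std::map<int, cpp *> g_objects;  // handle -> object
int g_next_handle = 1;           // 0 is reserved as the invalid handle
}

extern "C" {

typedef int c_handle;

c_handle c_create(int i)
{
    const c_handle hdl = g_next_handle++;
    g_objects[hdl] = new cpp(i);
    return hdl;
}

// Returns 0 on success and -1 for an unknown handle; the result goes to *out.
int c_f(c_handle hdl, int *out)
{
    std::map<int, cpp *>::iterator it = g_objects.find(hdl);
    if (it == g_objects.end() || out == nullptr)
        return -1;
    *out = it->second->f();
    return 0;
}

void c_destroy(c_handle hdl)
{
    std::map<int, cpp *>::iterator it = g_objects.find(hdl);
    if (it == g_objects.end())
        return;  // stale or forged handles are ignored
    delete it->second;
    g_objects.erase(it);
}

}
```

A stale or made-up handle now produces an error code instead of a wild pointer dereference, at the cost of a map lookup per call.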
Having functions returning strings and other dynamically sized resources can also become quite elaborate. You would need the C caller to provide the buffer, but it can't know the size beforehand. Some APIs (like parts of the Win32 API) then allow the caller to call such a function with a buffer of length 0, in which case they return the length of the buffer required. Doing so, however, can make calling through the API horribly inefficient. (If you only know the length of the required buffer after the algorithm has executed, it needs to be executed twice.)
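The size-query convention described above can be sketched as follows. This is a minimal illustration, not any particular Win32 function: c_get_description and compute_description are hypothetical names, and the returned size is assumed to include the terminating null character.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for the C++ code that produces the string (hypothetical).
static std::string compute_description()
{
    return "hello";
}

// Copies the string into buffer if it is large enough. Always returns the
// number of bytes required, including the terminating null character, so a
// call with a null or zero-length buffer acts as a size query.
extern "C" std::size_t c_get_description(char *buffer, std::size_t buffer_size)
{
    const std::string text = compute_description();
    const std::size_t required = text.size() + 1;
    if (buffer != nullptr && buffer_size >= required)
        std::memcpy(buffer, text.c_str(), required);
    return required;
}
```

The caller first passes a null buffer to learn the size, allocates, and calls again; note that compute_description runs on both calls, which is exactly the inefficiency mentioned above.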
One thing I've done in the past is to hand out handles (similar to the handle in the above code) to internally stored strings and provide an API to ask for the required buffer size, retrieve the string providing the buffer, and destroy the handle (which deletes the internally stored string).
That's a real PITA to use, but such is C.
Write a C++ wrapper that declares its functions extern "C", compile that with a C++ compiler, and call your wrapper from C.
(don't “extern c”)
extern "C" only ensures that the names exported from the DLL appear as you wrote them, without C++ name mangling.
You can use
dumpbin /EXPORTS your.dll
to see what happens to the names with extern "C" and without it.
http://msdn.microsoft.com/en-us/library/c1h23y6c(v=vs.71).aspx
To answer your question: it depends, but it is highly unlikely that you can use it without wrappers. If the C++ library uses just simple functions and types, you can just use it. If it uses a complex class structure, you will probably be unable to use it from C without wrapping. This is because the internals of classes may be laid out one way or another depending on many conditions (for example, inheritance with virtual tables or abstract classes). A complex C++ library may also have its own object creation mechanisms, so you HAVE to use it the way it was designed or you will get unpredictable behavior.
So, I think, you have to prepare yourself for doing some wrapping.
And here is a good article about wrapping C++ classes. In the article, the author wraps C++ classes for C#, but he uses C as the first step.
http://www.codeproject.com/KB/cs/marshalCPPclass.aspx
If the C++ library is written so that it can be compiled with a C compiler after slight editing (such as changing bool to int, false to 0 and true to 1, etc.), then that can be done.
But not all C++ code can be wrapped in C. Templates are one C++ feature that cannot be wrapped, or only with great difficulty.
Wrap it in a C++ .cpp file that calls that DLL, and declare the functions in that file extern "C".