C equivalent of IOMemoryDescriptor class - c++

I'm writing some C code using IOKit, and need to use IOMemoryDescriptor methods. Unfortunately, I can only compile pure C sources, and that is a C++ class. So, I'm asking if there is some C interface that lets me perform the same operations.
Specifically, I want a function that does pretty much this, but that can be compiled as C:
#include <IOKit/IOMemoryDescriptor.h>
extern "C" void CopyOut(mach_vm_address_t src, void *dst, size_t size)
{
IOMemoryDescriptor *memDesc;
memDesc = IOMemoryDescriptor::withAddressRange(src, size, kIODirectionOut, current_task());
// Error checking removed for brevity
memDesc->prepare();
memDesc->readBytes(0, dst, size);
memDesc->complete();
memDesc->release();
}

Being based on BSD, xnu has inherited some of BSD's kernel APIs, including the copyin and copyout functions. They are declared in libkern.h, and they do pretty much what you're using an IOMemoryDescriptor for, but nothing else.
You do mention you're using IOKit - if you need anything beyond this out of IOKit's functionality, you'll pretty much have to go with a C++ compiler, or use C to call mangled names directly.
If you're new to using a weird compiler for building kexts, I'll just warn you that kernel code for x86_64 must not use the red zone of the stack, as that can't exist due to interrupt handling. If your compiler assumes a red zone is present, you'll get bizarre crashes. Clang and gcc have corresponding flags for disabling the red zone. (-mno-red-zone, if I remember correctly, automatically activated via the kernel mode flag) Even if you're using a non-official compiler, linking against an object file built with clang's C++ compiler at the last stage should work fine for wrapping any other C++ APIs.

Related

Can I include a DLL generated by GCC in a MSVC project?

I have a library of code I'm working on upgrading from x86 to x64 for a Windows application.
Part of the code took advantage of MSVC inline assembly blocks. I'm not looking to go through and interpret the assembly but I am looking to keep functionality from this part of the application.
Can I compile the functions using the inline assembly using GCC to make a DLL and link that to the rest of the library?
EDIT 1 (7/7/21): The choice of compiler for this project is open, and I am currently looking into using Clang with MSVC (the Intel C++ compiler is another possibility). As stated in the first sentence, it is a Windows application that I want to keep on Windows. The reason for using another compiler is that 1) I don't want to rewrite the large amount of assembly and 2) I know that MSVC does not support x64 inline assembly. So far Clang seems to be working, apart from a couple of issues with how comments are written inside the assembly block and with a few commands. The function performs mathematical operations on a block of data and was written to be as fast as possible when it was developed; now that it works as intended, I'm not looking to upgrade it, just to maintain functionality. So any compiler that supports inline assembly is an option.
EDIT 2 (7/7/21): I forgot to mention in the first edit that I'd rather not load the 32-bit DLL into another process, because I'm worried about copying data into and out of shared memory. I've used a similar solution for another project, but here the data set is around 8 MB, and I'm worried that slow copy times would break the time constraint on the math and cause issues at runtime (slow, laggy, and buffering are effects I'm trying to avoid). I'm not trying to make it any faster, but it definitely can't get any slower.
In theory, if you manage to create a plain C interface for that DLL (all symbols exported from the DLL are standard C functions) and don't use memory management functions across the "border" (no mixed memory management), then you should at least be able to dynamically load that DLL from another (MSVC) process and call its functions.
Not sure about statically linking against it... probably not, because the compiler and linker must go hand in hand (MSVC compiler+MSVC linker or GCC compiler+GCC linker) . The output of GCC linker is probably not compatible with MSVC at least regarding name mangling.
Here is how I would structure it (without small details):
Header.h (separate header to be included in both DLL and EXE)
//... remember to use your preferred calling convention but be consistent about it
struct Interface{
void (*func0)();
void (*func1)(int);
//...
};
typedef Interface* (*GetInterface)();
DLL (gcc)
#include "Header.h"
//functions implementing specific functionality (not exported)
void f0(){/*...*/}
void f1(int){/*...*/}
//...
extern "C" Interface* getInterface(){//this must be exported from DLL (compiler specific); extern "C" keeps the name unmangled for GetProcAddress
static Interface interface;
//initialize functions pointers from interface with corresponding functions
interface.func0 = &f0;
interface.func1 = &f1;
//...
return &interface;
}
EXE (MSVC)
#include "Header.h"
int main(){
auto dll = LoadLibrary("DLL.dll");
auto getDllInterface = (GetInterface)GetProcAddress(dll, "getInterface");
auto* dllInterface = getDllInterface();
dllInterface->func0();
dllInterface->func1(123);
//...
return 0;
}

How to dynamically register class in a factory class at runtime period with c++

I have implemented a factory class that dynamically creates classes from an identification string; please see the following code:
void IOFactory::registerIO()
{
Register("NDAM9020", []() -> IOBase * {
return new NDAM9020();
});
Register("BK5120", []() -> IOBase * {
return new BK5120();
});
}
std::unique_ptr<IOBase> IOFactory::createIO(std::string ioDeviceName)
{
std::unique_ptr<IOBase> io = createObject(ioDeviceName);
return io;
}
So we can create the IO class with the registered name:
IOFactory ioFactory;
auto io = ioFactory.createIO("BK5120");
The problem with this method is if we add another IO component, we should add another register code in registerIO function and compile the whole project again. So I was wondering if I could dynamically register class from a configure file(see below) at runtime.
io_factory.conf
------------------
NDAM9020:NDAM9020
BK5120:BK5120
------------------
The first is identification name and the second is class name.
I have tried with macros, but the parameter of a macro can't be a string. So I was wondering if there are some other ways. Thanks in advance.
Update:
I didn't expect so many comments and answers, Thank you all and sorry for replying late.
Our current OS is Ubuntu16.04 and we use the builtin compiler that is gcc/g++5.4.0, and we use CMake to manage the build.
And I should mention that it is not a must that I should register class at runtime period, it is also OK if there is a way to do this in compile period. What I want is just avoiding the recompiling when I want to register another class.
So I was wondering if I could dynamically register class from a configure file(see below) at runtime.
No. As of C++20, C++ has no reflection features allowing it. But you could do it at compile time by generating a simple C++ implementation file from your configuration file.
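As a minimal sketch of that compile-time generation: a small generator reads io_factory.conf and emits the body of registerIO(), which the build (for example a CMake custom command) then compiles into the program. The function name generateRegisterIO is hypothetical, and the sketch assumes every class named in the file is default-constructible and declared in a header that the generated file includes.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical generator: turns "name:ClassName" lines into the source text
// of IOFactory::registerIO(). Lines without a colon (separators, blank
// lines) are skipped.
std::string generateRegisterIO(std::istream &conf)
{
    std::ostringstream out;
    out << "void IOFactory::registerIO()\n{\n";
    std::string line;
    while (std::getline(conf, line)) {
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos)
            continue;
        const std::string name = line.substr(0, colon);
        const std::string cls = line.substr(colon + 1);
        if (name.empty() || cls.empty())
            continue;
        out << "    Register(\"" << name << "\", []() -> IOBase * { return new "
            << cls << "(); });\n";
    }
    out << "}\n";
    return out.str();
}
```

Adding a device then means editing the configuration file and rebuilding only the generated translation unit, followed by a relink.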
How to dynamically register class in a factory class at runtime period with c++
Read much more about C++: at least a good C++ programming book and a good C++ reference website, and later n3337, the C++11 standard. Read also the documentation of your C++ compiler (perhaps GCC or Clang) and, if you have one, of your operating system. If plugins are possible in your OS, you can register a factory function at runtime (by referring to that function after a plugin providing it has been loaded). For example, the Mozilla Firefox browser, recent GCC compilers (e.g. GCC 10 with plugins enabled), and the fish shell do this.
So I was wondering if I could dynamically register class from a configure file(see below) at runtime.
Most C++ programs are running under an operating system, such as Linux. Some operating systems provide a plugin mechanism. For Linux, see dlopen(3), dlsym(3), dlclose(3), dladdr(3) and the C++ dlopen mini-howto. For Windows, dive into its documentation.
So, with a recent C++ implementation and some recent operating systems, you can register a factory class at runtime (using plugins), and you could find libraries (e.g. Qt or POCO) to help you.
However, in pure standard C++, the set of translation units is statically known and plugins do not exist. So the set of functions, lambda-expressions, or classes in a given program is finite and does not change with time.
In pure C++, the set of valid function pointers, or the set of valid possible values for a given std::function variable, is finite. Anything else is undefined behavior. In practice, many real-life C++ programs accept plugins thru their operating systems or JIT-compiling libraries.
You could of course consider using JIT-compiling libraries such as asmjit or libgccjit or LLVM. They are implementation specific, so your code won't be portable.
On Linux, a lot of Qt or GTKmm applications (e.g. KDE, and most web browsers, e.g. Konqueror, Chrome, or Firefox) are coded in C++ and do load plugins with factory functions. Check with strace(1) and ltrace(1).
The Trident engine of Microsoft's Internet Explorer is rumored to be coded in C++ and probably accepts plugins.
I have tried with Macros, but the parameter in Macros can't be string.
A macro parameter can be stringized. And you could play x-macros tricks.
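A minimal sketch of both tricks together, assuming stand-in definitions for IOBase and the two device classes (the real ones are not shown in the question): one X-macro list names the classes, and the # operator stringizes each name so it can serve as the registry key.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Stand-in class hierarchy, for illustration only.
struct IOBase {
    virtual ~IOBase() = default;
    virtual std::string name() const = 0;
};
struct NDAM9020 : IOBase { std::string name() const override { return "NDAM9020"; } };
struct BK5120 : IOBase { std::string name() const override { return "BK5120"; } };

// The single list of device classes: adding a device means adding one X(...) line.
#define IO_DEVICE_LIST \
    X(NDAM9020)        \
    X(BK5120)

using Creator = std::function<std::unique_ptr<IOBase>()>;

std::map<std::string, Creator> makeRegistry()
{
    std::map<std::string, Creator> registry;
    // #cls stringizes the macro argument, so key and class share one spelling.
#define X(cls) registry[#cls] = [] { return std::unique_ptr<IOBase>(new cls()); };
    IO_DEVICE_LIST
#undef X
    return registry;
}
```

The list is still expanded at compile time, so this shortens the registration code but does not remove the need to recompile the file that contains it.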
What I want is just avoiding the recompiling when I want to register another class.
On Ubuntu, I recommend accepting plugins in your program or library.
Use dlopen(3) with an absolute file path; the plugin would typically be passed as a program option (like RefPerSys does, or like GCC does) and dlopen-ed at program or library initialization time. Practically speaking, you can have lots of plugins (tens of thousands, see manydl.c and check with pmap(1) or proc(5)). The dlsym(3)-ed C++ functions in your plugins should be declared extern "C" to disable name mangling.
A single C++ file plugin (in yourplugin.cc) can be compiled with g++ -Wall -O -g -fPIC -shared yourplugin.cc -o yourplugin.so and later you would dlopen "./yourplugin.so" or an absolute path (or configure suitably your $LD_LIBRARY_PATH -see ld.so(8)- and pass "yourplugin.so" to dlopen). Be also aware of Rpath.
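The loading side could look roughly like the sketch below. The hook name register_io_plugin and its void* parameter are hypothetical conventions chosen for this example; only dlopen, dlsym, dlerror and dlclose are real API. On older glibc versions the program must be linked with -ldl.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <string>

// Hypothetical contract: every plugin exports
//   extern "C" void register_io_plugin(void *factory);
using RegisterFn = void (*)(void *);

// Loads the shared object at `path` and calls its registration hook.
// Returns true on success; on failure `error` holds a description.
bool loadIoPlugin(const std::string &path, void *factory, std::string &error)
{
    assert(!path.empty());
    void *handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char *msg = dlerror();
        error = msg ? msg : "dlopen failed";
        return false;
    }
    void *symbol = dlsym(handle, "register_io_plugin");
    if (symbol == nullptr) {
        const char *msg = dlerror();
        error = msg ? msg : "register_io_plugin not found";
        dlclose(handle);
        return false;
    }
    reinterpret_cast<RegisterFn>(symbol)(factory);
    return true; // the handle stays open so the plugin's code remains mapped
}
```

The handle is deliberately never closed on success: objects created by the plugin's factory functions keep pointing into its code for the lifetime of the program.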
Consider also (after upgrading your GCC to GCC 9 at least, perhaps by compiling it from its source code) using libgccjit (it is faster than generating temporary C++ code in some file and compiling that file into a temporary plugin).
For ease of debugging your loaded plugins, you might be interested by Ian Taylor's libbacktrace.
Notice that your program's global symbols (declared as extern "C") can be accessed by name by passing a nullptr file path to dlopen(3), then using dlsym(3) on the obtained handle. You want to pass -rdynamic -ldl when linking your program (or your shared library).
What I want is just avoiding the recompiling when I want to register another class.
You might register classes in a different translation unit (a short one, presumably). You could take inspiration from RefPerSys's multiple #include-s of its generated/rps-name.hh file. Then you would simply recompile a single *.cc file and relink your entire program or library. Notice that Qt plays similar tricks in its moc, and I recommend taking inspiration from it.
Read also J.Pitrat's book on Artificial Beings: the Conscience of a Conscious Machine ISBN which explains why a metaprogramming approach is useful. Study the source code of GCC (or of RefPerSys), use or take inspiration from SWIG, ANTLR, GNU bison (they all generate C++ code) when relevant
You seem to have asked for more dynamism than you actually need. You want to avoid the factory itself having to be aware of all of the classes registered in it.
Well, that's doable without going all the way to runtime code generation!
There are several implementations of such a factory; but I am obviously biased in favor of my own: einpoklum's Factory class (gist.github.com)
simple example of use:
#include "Factory.h"
// we now have:
//
// template<typename Key, typename BaseClass, typename... ConstructionArgs>
// class Factory;
//
#include <string>
struct Foo { Foo(int x) { } };
struct Bar : Foo { Bar(int x) : Foo(x) { } };
int main()
{
util::Factory<std::string, Foo, int> factory;
factory.registerClass<Bar>("key_for_bar");
auto* my_bar_ptr = factory.produce("key_for_bar");
}
Notes:
The std::string is used as a key; you could have a factory with numeric values as keys instead, if you like.
All registered classes must be subclasses of the BaseClass value chosen for the factory. I believe you can change the factory to avoid that, but then you'll always be getting void *s from it.
You can wrap this in a singleton template to get a single, global, static-initialization-safe factory you can use from anywhere.
Now, if you load some plugin dynamically (see @BasileStarynkevitch's answer), you just need that plugin to expose an initialization function which makes registerClass() calls on the factory; call this initialization function right after loading the plugin. Or, if you have a static-initialization-safe singleton factory, you can make the registration calls in a static block in your plugin shared library - but be careful with that, I'm not an expert on shared library loading.
Definitely YES!
There's an old post from 2006 that has served me well for many years. The implementation revolves around a centralized registry with a decentralized registration method that is expanded using a REGISTER_X macro, check it out:
https://web.archive.org/web/20100618122920/http://meat.net/2006/03/cpp-runtime-class-registration/
I have to admit that @einpoklum's factory looks awesome too. I created a header-only sample gist containing the code and a sample:
https://gist.github.com/h3r/5aa48ba37c374f03af25b9e5e0346a86
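The idea of a central registry with decentralized registration can be sketched as follows. This is not the code from the linked post or gist; ioRegistry, IORegistrar and REGISTER_IO are hypothetical names, and IOBase and BK5120 are stand-ins. Each class registers itself through a static object, so the factory never has to list the classes.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

struct IOBase { virtual ~IOBase() = default; };

using Creator = std::function<std::unique_ptr<IOBase>()>;

// Function-local static: constructed on first use, so it is safe to call
// from the constructors of other static objects.
std::map<std::string, Creator> &ioRegistry()
{
    static std::map<std::string, Creator> registry;
    return registry;
}

// Constructing one of these adds an entry to the central registry.
struct IORegistrar {
    IORegistrar(const std::string &key, Creator make)
    {
        ioRegistry()[key] = std::move(make);
    }
};

// Placed next to a class definition; runs before main() in that translation unit.
#define REGISTER_IO(cls) \
    static const IORegistrar registrar_##cls(#cls, [] { return std::unique_ptr<IOBase>(new cls()); })

struct BK5120 : IOBase {};
REGISTER_IO(BK5120);
```

One caveat with this pattern: if the class lives in a static library and nothing else references its object file, the linker may drop it together with its registrar.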

How to get a caller graph from a given symbol in a binary

This question is related to a question I asked earlier today: I wonder if it's possible to generate a caller graph from a given function (or symbol name, e.g. taken from nm), even if the function of interest is not part of "my" source code (e.g. located in a library, e.g. malloc()).
For example to know where malloc is being used in my program named foo I would first lookup the symbol name:
nm foo | grep malloc
U malloc@@GLIBC_2.2.5
And then run a tool (which might need a specially compiled/linked version of my program or some compiler artifacts):
find_usages foo-with-debug-symbols "malloc@@GLIBC_2.2.5"
Which would generate a (textual) caller graph I can then process further.
Reading this question I found radare2, which seems to accomplish nearly everything you can imagine, but somehow I haven't managed to generate a caller graph from a given symbol yet.
Progress
Using radare2 I've managed to generate a dot caller graph from an executable, but something is still missing. I'm compiling the following C++ program which I'm quite sure has to use malloc() or new:
#include <string>
int main() {
auto s = std::string("hello");
s += " welt";
return 0;
}
I compile it with static libraries in order to be sure all calls I want to analyze can be found in the binary:
g++ foo.cpp -static
By running nm a.out | grep -E "_Znwm|_Znam|_Znwj|_Znaj|_ZdlPv|_ZdaPv|malloc|free" you can see a lot of symbols which are used for memory allocation.
Now I run radare2 on the executable:
r2 -qAc 'agCd' a.out > callgraph.dot
With a little script (inspired by this answer) I'm looking for a call-path from any symbol containing "sym.operatornew" but there seems to be none!
Is there a way to make sure all the information needed to generate a call graph from/to any function which gets called inside that binary is available?
Is there a better way to run radare2? It looks like the different call graph visualization types provide different information - e.g. the ascii art generator does provide names for symbols not provided by the dot generator while the dot generator provides much more details regarding calls.
In general, you cannot extract an exact control flow graph from a binary, because of indirect jumps and calls there. A machine code indirect call is jumping into the content of some register, and you cannot reliably estimate all the values that register could take (doing so could be proven equivalent to the halting problem).
Is there a way to make sure all the information needed to generate a call graph from/to any function which gets called inside that binary is available?
No, and that problem is equivalent to the halting problem, so there would be never a sure way to get that call graph (in a complete and sound way).
The C++ compiler would (usually) generate indirect jumps for virtual function calls (they jump thru the vtable) and probably when using a shared library (read Drepper's How To Write Shared Libraries paper for more).
Look into the BINSEC tool (developed by colleagues from CEA, LIST and by INRIA), at least to find references.
If you really want to find most (but not all) dynamic memory allocations in your C++ source code, you might use static source code analysis (like Frama-C or Frama-Clang) and other tools, but they are not a silver bullet.
Remember that allocating functions like malloc or operator new could be put in function pointer locations (and your C++ code might have some allocator deeply buried somewhere, then you are likely to have indirect calls to malloc)
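A small illustration of that point, with hypothetical function names: the allocator is selected at run time and called through a pointer, so the call site is typically compiled to an indirect call and a call graph extracted from the binary shows no direct edge from makeBuffer to malloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

using AllocFn = void *(*)(std::size_t);

void *zeroedAlloc(std::size_t size) { return std::calloc(1, size); }

// The target is chosen at run time; a static tool cannot know in general
// which function pointer comes back.
AllocFn chooseAllocator(bool zeroed)
{
    if (zeroed)
        return &zeroedAlloc;
    return &std::malloc;
}

// Unless the compiler can prove the pointer's value, the call through
// `alloc` is an indirect call.
void *makeBuffer(AllocFn alloc, std::size_t size)
{
    assert(alloc != nullptr);
    return alloc(size);
}
```

With inlining and constant propagation the compiler may still resolve such a call in simple cases, which is one more reason the graph depends on the optimization level.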
Maybe you could spend months of effort in writing your own GCC plugin to look for calls to malloc after optimizations inside the GCC compiler (but notice that GCC plugins are tied to one particular version of GCC). I am not sure it is worth the effort. My old (obsolete, non maintained) GCC MELT project was able to find calls to malloc with a size above some given constant. Perhaps in at least a year -end of 2019 or later- my successor project (bismon, funded by CHARIOT H2020 project) might be mature enough to help you.
Remember also that GCC is capable of quite fancy optimizations related to malloc. Try to compile the following C code
//file mallfree.c
#include <stdlib.h>
int weirdsum(int x, int y) {
int*ar2 = malloc(2*sizeof(int));
ar2[0] = x; ar2[1] = y;
int r = ar2[0] + ar2[1];
free (ar2);
return r;
}
with gcc -S -fverbose-asm -O3 mallfree.c. You'll see that the generated mallfree.s assembler file contains no call to malloc or to free. Such an optimization is permitted by the as-if rule, and is practically useful to optimize most usages of C++ standard containers.
So what you want is not simple even for apparently "simple" C++ code (and is impossible in the general case).
If you want to code a GCC plugin and have more than a full year to spend on that issue (or could pay at least 500k€ for that), please contact me. See also
https://xkcd.com/1425/ (your question is a virtually impossible one).
BTW, of course, what you really care about is dynamic memory allocation in optimized code (you really want inlining and dead code elimination, and GCC does that quite well with -O3 or -O2). When GCC is not optimizing at all (e.g. with -O0, which is the default optimization level) it would do a lot of "useless" dynamic memory allocation, especially with C++ code (using the C++ standard library). See also the CppCon 2017 talk by Matt Godbolt, “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”.

How to Bypass a Standard C++ Function While Maintaining Its Functionality

I am looking for a way to be able to redefine a set of POSIX functions but then end the redefinition with a call to the original function. The idea is that I am trying to create a layer that can restrict what OS API's can be called depending on which "profile" is active. This "profile" determines what set of functions are allowed and any not specified should not be used.
For example, if in one profile I am not allowed to use strcpy, I would like to be able to either cause a compile time error (via static_assert) or print something to the screen saying "strcpy is not allowed in this profile" such as below:
MY_string.h
#include <string.h>
char *strcpy(char *restrict s1, const char *restrict s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assert(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return strcpy(s1, s2);
#endif
}
So that way within main.cpp I can use MY_string.h
#define PROFILE_PASS_THROUGH
#include "MY_string.h"
int main()
{
char temp1[10];
char temp2[10];
sprintf(temp2, "Testing");
if (0 != strcpy(temp1, temp2))
{
printf("temp1 is %s\n", temp1);
}
return 0;
}
Now I realize that the code I have written above will not compile properly due to the redefinition of strcpy, but is there a way to allow this sort of functionality without playing around with macros or creating my own standard c and c++ libraries?
You can write a preprocessor that changes calls to the standard routine to calls to your own routine. Such a preprocessor might be complicated, depending whether you need to recognize the full C++ grammar to distinguish calls using name spaces and so on or you can get away with more casual recognition of the calls.
You can link with your own library, producing a relocatable object module with resolved names stripped. Your library would contain routines with the standard names, such as strcpy, that execute whatever code you desire and call other names, such as Mystrcpy. The object module produced by this is then linked with a second library and with the standard library. The second library contains routines with those names, such as Mystrcpy, that call the original library names strcpy. The details for doing this are of course dependent on your linker. The goal is to have a chain like this: Original code calls strcpy. This is resolved to the version of strcpy in the first library. That version calls Mystrcpy. Mystrcpy calls the standard library strcpy.
You can compile to assembly and edit the names in the assembly so that your routines are called instead of the standard library routines.
On some systems, you can use dlsym and other functions defined in <dlfcn.h> to load the dynamic library that contains the standard implementations and to call them via pointers returned by dlsym instead of by the usual names in source code.
The GCC linker has a --wrap switch that resolves calls to foo to your routine __wrap_foo and resolves calls to __real_foo (which you would use in your implementation) to the real foo.
See also Intercepting Arbitrary Functions on Windows, UNIX, and Macintosh OS X Platforms.
No, cannot be done in C++. What you want is more akin to a LISP (or derivative) language, where you can grab the slot for an existing function and 'override it in place', potentially punting back to the original implementation.
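The typical way of doing this on Unix is via LD_PRELOAD. The example (Unix) below proxies a function call, malloc in particular (full example):
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // RTLD_NEXT is a GNU extension on glibc
#endif
#include <dlfcn.h>
#include <stddef.h>
/**
* malloc() direct call
*/
inline void * libc_malloc(size_t size)
{
typedef void* (*malloc_func_t)(size_t);
// RTLD_NEXT finds the next definition of malloc after the current library
static malloc_func_t malloc_func = (malloc_func_t) dlsym(RTLD_NEXT, "malloc");
return malloc_func(size);
}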
In your MY_String.h:
... blah blah
using mynamespace::strcpy;
#endif // header guard or maybe not there if using pragma
then all strcpys that are not prefixed with std:: will use yours. If you REALLY want to ban them, grep and take a shotgun with you when you find the person who used it.
If you are using a recent GCC (e.g. version 4.7 or newer), you could also write a GCC plugin or a GCC extension in MELT to replace every call to strcpy with a call to your own mystrcpy. This will probably take you some work (perhaps days, not hours) but has the enormous advantage of working inside the compiler, on the GCC compiler's internal representations (Gimple). So it will be done even after inlining, etc. And since you extend the compiler, you can tailor its behavior to what you want.
MELT is a domain specific language to extend GCC. It is designed for such tasks.
You cannot prevent these functions from being called.
A C++ program can do anything it wants; it could have some code that loads the strcpy symbol from libc and runs it. If a malicious developer wants to call that function, you have no way to prevent it. To do that you'd need to run the C++ code in some special environment (in a sandbox, or virtual machine), but I'm afraid such technology is not available.
If you trust the developers, and you're just looking for a way to remind them not to call certain functions, then there could be some solution.
One solution could be to avoid #including libc headers (like cstring), and to include only your own header files, in which you declare only the desired functions.
Another solution could be to inspect the compiled executable in order to find out which functions are called, or to LD_PRELOAD a library that redefines (and thus overrides) standard functions to make them print a warning at runtime.
Here is how you would change MY_string.h
#include <cstdio>
#include <cstring>
namespace my_functions{
char *strcpy(char *s1, const char *s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return std::strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assert(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return std::strcpy(s1, s2);
#endif
}
}
using namespace my_functions;
For this to work you cannot include or have using namespace std;

Why Are Vtables Not Being Implemented Correctly On Embedded Platform?

I am developing code for an embedded system (specifically, the PSoC 5, using PSoC Creator), and writing in C++.
I've overcome most hurdles with using C++: compiling as C++ using the compiler flag -x c++, defining the new and delete operators, and making sure exceptions aren't thrown with the compiler flag -fno-exceptions. However, I've hit a brick wall when it comes to using virtual functions.
If I try to declare a virtual function, the compiler gives me the error undefined reference to "vtable for __cxxabiv1::__class_type_info". The only way to get around this is to use the compiler flag -fno-rtti, which prevents the error and makes it compile successfully. However, if I do that, the embedded program crashes when trying to run the overridden virtual function, and I'm thinking this is because the vtable does not exist.
I don't see why you shouldn't be able to implement vtables on an embedded platform, since all a vtable amounts to is extra space in memory before or after the member objects (depending on the exact compiler).
The reason I am trying to use virtual functions is because I am wanting to use FreeRTOS with C++, and other people have implemented this by using virtual functions (see http://www.freertos.org/FreeRTOS_Support_Forum_Archive/July_2010/freertos_Is_it_possible_create_freertos_task_in_c_3778071.html for the discussion, and https://github.com/yuriykulikov/Event-driven_Framework_for_Embedded_Systems for a well written embedded C++ FreeRTOS framework)
The fact that the error message refers to a class in the namespace __cxxabiv1 suggests that you are not linking against the correct C++ runtime for your platform. I don't know anything about PSoC, but on more "normal" platforms, this sort of error could happen if you used the gcc (resp. clang) command at link-time instead of g++ (resp. clang++); or under handwavey circumstances if you used -lc++ without -stdlib=libc++ or -lstdc++ without -stdlib=libstdc++.
Use the -v option to examine your linker command line, and try to find out exactly which C++ runtime library it's pulling in. It'll probably be named something like libcxxabi or libcxxrt.
This guy here gives step-by-step instructions for compiling C++ in PSoC Creator; but he never figured out how to link with a C++ runtime library, so all his tips are focused on how to remove C++isms from your code (-fno-rtti, -fno-exceptions,...). I agree that there doesn't seem to be any information online about how to actually use C++ with PSoC.
For this specific error, you could always try defining the missing symbol yourself:
// file "fix-link-errors.cpp"
namespace __cxxabiv1 {
class __class_type_info {
virtual void dummy();
};
void __class_type_info::dummy() { } // causes the vtable to get created here
};
Or many linkers have the ability to define undefined symbols as 0x0 through command-line options such as -C or --defsym. However, that's not only a Bad Idea but also inconvenient, because you'd have to figure out what the actual (mangled) name of the vtable object is, and the linker didn't tell you that. (This being GCC, it's probably something like __ZTVN10__cxxabiv117__class_type_infoE.)
Either of those "solutions" would result in horrible crashes if the program ever tried to do anything with the vtable; but they'd shut the linker up, if that's all you cared about and you knew the program would never actually use RTTI. But in that case, it should be sufficient to use -fno-rtti consistently on your entire project.
What, specifically, goes wrong when you use -fno-rtti?