Calling a native C function after making a variable with that name - c++

I have a pretty large application which holds most of the program data in a large container object called system.
So I access things all the time, e.g. printf("%s\n",system.constants.jobname);, and changing the name of system is undesirable.
I learned later that the function system exists as a native C function for running terminal commands, e.g. system("rm *.txt");
My problem is that I get compilation errors trying to use system as a function because it's already defined as an object.
Is there any way one can call a native C function explicitly ignoring programmatically defined variables? Or give the native system function an alias? (I'm using C++ so using it would be fine)

If you're using C++, system is in the global namespace. Assuming you've put your stuff in a proper namespace (you have, right?) you can refer to it as ::system.

Assuming using shared libraries is an acceptable solution, you can do this.
Create another C file which will not use your system container. Now write a function my_system that is a wrapper to system.
By wrapper I mean, it takes the same argument and calls system and returns what system returns.
Don't forget to export my_system
Now compile this as a dll (or .so on *NIX).
In your main project, load the dll and get a handle. Now query for address of my_system on the handle and make the call using function pointer.

Related

Wrapping c++ code in Matlab to access program state information in Matlab

I have a c++ program that updates a system. When I wrote everything in C++ it looked a little something like this
System S; //initialize a System object 'S'
while (notFinished)
{
S.update1(inputVars1);
S.update2(inputVars2);
}
Now I would like to call the individual update functions from matlab and be able to use access functions (written in c++) to view the state of the program at any time when debugging in matlab.
So matlab will need to call something to instantiate a "System" object, and then it will need to call individual System methods from the original system object.
Suppose I compile separate mex files to Initialize update1 update2 and some that get information about the current state getState. And then write some matlab code...
%matlab main
S = Initalize(); %mex file
while (notFinished)
update1(S); %mex file
keyboard; % access state information using "getState" mex function
update2(S); %mex file
keyboard; % access state information using "getState" mex function
end
Will this essentially allow me to call and debug my C++ program algorithms in Matlab, or is there another way to go about this whole problem?
The way I would do it is by creating a pointer for System in C++ in the Initialize mex function using "new". If you are on a 64 bit platform then cast this pointer to a 64-bit integer and create an mxArray with that type and value. Return this mxArray from your Initialize function.
For later calls to your other mex files you should pass this mxArray as an input. Inside those files you can cast it back as a pointer and call methods on the object.
I also would take one more step to wrap this whole thing inside a MATLAB System object or a regular object and not expose the pointer value S outside the object. You need methods on the object which would call your mex files. This is especially needed if you are planning to give this to other people for use. Others might accidentally overwrite or modify S causing crashes.
Finally you need a delete mex function which would delete the pointer S. If you create a handle class then you can do this in the destructor.

Changing what a function points to

I have been playing around with pointers and function pointers in c/c++. As you can get the adress of a function, can you change where a function call actually ends?
I tried getting the memory adress of a function, then writing a second functions adress to that location, but it gave me a access violation error.
Regards,
Function pointers are variables, just like ints and doubles. The address of a function is something different. It is the location of the beginning of the function in the .text section of the binary. You can assign the address of a function to a function pointer of the same type however the .text section is read only and therefore you can't modify it. Writing to the address of a function would attempt to overwrite the code at the beginning of the function and is therefore not allowed.
Note:
If you want to change, at runtime, where function calls end up you can create something called a vritual dispatch table, or vtable. This is a structure containing function pointers and is used in languages such as c++ for polymorphism.
e.g.:
struct VTable {
int (*foo)(void);
int (*bar)(int);
} vTbl;
At runtime you can change the values of vTbl.foo and vTbl.bar to point to different functions and any calls made to vTbl.foo() or .bar will be directed to the new functions.
If the function you're trying to call is inlined, then you're pretty much out of luck. However, if it's not inlined, then there may be a way:
On Unix systems there's a common feature of the dynamic linker called LD_PRELOAD which allows you to override functions in shared libraries with your own versions. See the question What is the LD_PRELOAD trick? for some discussion of this. If the function you're trying to hijack is not loaded from a shared library (i.e. if it's part of the executable or if it's coming from a statically linked library), you're probably out of luck.
On Windows, there are other attack vectors. If the function to be hooked is exported by some DLL, you could use Import Address Table Patching to hijack it without tinkering with the code of the function. If it's not exported by the DLL but you can get the address of it (i.e. by taking the address of a function) you could use something like the free (and highly recommended) N-CodeHook project.
In some environments, it is possible to "patch" the beginning instructions of a function to make the call go somewhere else. This is an unusual technique and is not used for normal programming. It is sometimes used if you have an existing compiled program and need to change how it interacts with the operating system.
Microsoft Detours is an example of a library that has the ability to this.
You can change what a function pointer points to, but you can't change a normal function nor can you change what the function contains.
You generally can't find where a function ends. There's no such standard functionality in the language and the compiler can optimize code in such ways that the function's code isn't contiguous and really has not a single point of end and in order to find where the code ends one would need to either use some non-standard tools or disassemble the code and make sense of it, which isn't something you can easily write a program for to do automatically.

MATLAB MEX interface to a class object with multiple functions

I am using the MEX interface to run C++ code in MATLAB. I would like to add several functions to MATLAB for handling a System object:
sysInit()
sysRefresh()
sysSetAttribute(name, value)
String = sysGetAttribute(value)
sysExit()
Since each MEX dll can contain one function, I need to find a way to store the pointer to the global System object which will exist until deleted by a call to sysExit.
How can I do this in MATLAB properly? Are there any ways to store global pointers across calls to MEX functions?
One common approach is to have several m-file functions that provide the public interface, e.g. sysInit.m, sysRefresh.m, etc.
Each of these m-files calls the mex function with some kind of handle, a string (or number) identifying the function to call, and any extra args. For example, sysRefresh.m might look like:
function sysRefresh(handle)
return sysMex(handle, 'refresh')
In your sysMex mex function, you can either have the handle be a raw heap pointer (easy, but not very safe), or you can maintain a mapping in C/C++ from the handle ID to the actual object pointers. This solution requires a little extra work, but it's much safer. This way someone can't accidentally pass an arbitrary number as a handle, which acts as a dangling pointer. Also, you can do fancier things like use an onCleanup function to release all memory and resources when you unload the mex function (e.g. so you don't have to restart matlab when you recompile the mex function).
You can go a bit further if you like and hide the handle behind a Matlab class. Read up on the OO features for Matlab in the docs if you're interested. If you're using a recent version, you can take advantage of their much cleaner handle objects.
Alternatively, you may get away with not using MEX at all. In matlab (on Windows) you can load any generic dll with loadlibrary and call any of its functions with callib. This is probably not portable across operating systems, though.

How to mimic the "multiple instances of global variables within the application" behaviour of a static library but using a DLL?

We have an application written in C/C++ which is broken into a single EXE and multiple DLLs. Each of these DLLs makes use of the same static library (utilities.lib).
Any global variable in the utility static library will actually have multiple instances at runtime within the application. There will be one copy of the global variable per module (ie DLL or EXE) that utilities.lib has been linked into.
(This is all known and good, but it's worth going over some background on how static libraries behave in the context of DLLs.)
Now my question.. We want to change utilities.lib so that it becomes a DLL. It is becoming very large and complex, and we wish to distribute it in DLL form instead of .lib form. The problem is that for this one application we wish to preserve the current behaviour that each application DLL has it's own copy of the global variables within the utilities library. How would you go about doing this? Actually we don't need this for all the global variables, only some; but it wouldn't matter if we got it for all.
Our thoughts:
There aren't many global variables within the library that we care about, we could wrap each of them with an accessor that does some funky trick of trying to figure out which DLL is calling it. Presumably we can walk up the call stack and fish out the HMODULE for each function until we find one that isn't utilities.dll. Then we could return a different version depending on the calling DLL.
We could mandate that callers set a particular global variable (maybe also thread local) prior to calling any function in utilities.dll. The utilities DLL could then use this global variable value to determine the calling context.
We could find some way of loading utilities.dll multiple times at runtime. Perhaps we'd need to make multiple renamed copies at build time, so that each application DLL can have it's own copy of the utilities DLL. This negates some of the advantages of using a DLL in the first place, but there are other applications for which this "static library" style behaviour isn't needed and which would still benefit from utilities.lib becoming utilities.dll.
You are probably best off simply having utilities.dll export additional functions to allocate and deallocate a structure that contains the variables, and then have each of your other worker DLLs call those functions at runtime when needed, such as in the DLL_ATTACH_PROCESS and DLL_DETACH_PROCESS stages of DllEntryPoint(). That way, each DLL gets its own local copy of the variables, and can pass the structure back to utilities.dll functions as an additional parameter.
The alternative is to simply declare the individual variables locally inside each worker DLL directly, and then pass them into utilities.dll as input/output parameters when needed.
Either way, do not have utilities.dll try to figure out context information on its own. It won't work very well.
If I were doing this, I'd factor out all stateful global variables - I would export a COM object or a simple C++ class that contains all the necessary state, and each DLL export would become a method on your class.
Answers to your specific questions:
You can't reliably do a stack trace like that - due to optimizations like tail call optimization or FPO you cannot determine who called you in all cases. You'll find that your program will work in debug, work mostly in release but crash occasionally.
I think you'll find this difficult to manage, and it also puts a demand that your library can't be reentrant with other modules in your process - for instance, if you support callbacks or events into other modules.
This is possible, but you've completely negated the point of using DLL's. Rather than renaming, you could copy into distinct directories and load via full path.

Compiling a DLL with gcc

Sooooo I'm writing a script interpreter. And basically, I want some classes and functions stored in a DLL, but I want the DLL to look for functions within the programs that are linking to it, like,
program dll
----------------------------------------------------
send code to dll-----> parse code
|
v
code contains a function,
that isn't contained in the DLL
|
list of functions in <------/
program
|
v
corresponding function,
user-defined in the
program--process the
passed argument here
|
\--------------> return value sent back
to the parsing function
I was wondering basically, how do I compile a DLL with gcc? Well, I'm using a windows port of gcc. Once I compile a .dll containing my classes and functions, how do I link to it with my program? How do I use the classes and functions in the DLL? Can the DLL call functions from the program linking to it? If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program? Thanks in advance, I really need to know how to work with DLLs in C++ before I can continue with this project.
"Can you add more detail as to why you want the DLL to call functions in the main program?"
I thought the diagram sort of explained it... the program using the DLL passes a piece of code to the DLL, which parses the code, and if function calls are found in said code then corresponding functions within the DLL are called... for example, if I passed "a = sqrt(100)" then the DLL parser function would find the function call to sqrt(), and within the DLL would be a corresponding sqrt() function which would calculate the square root of the argument passed to it, and then it would take the return value from that function and put it into variable a... just like any other program, but if a corresponding handler for the sqrt() function isn't found within the DLL (there would be a list of natively supported functions) then it would call a similar function which would reside within the program using the DLL to see if there are any user-defined functions by that name.
So, say you loaded the DLL into the program giving your program the ability to interpret scripts of this particular language, the program could call the DLLs to process single lines of code or hand it filenames of scripts to process... but if you want to add a command into the script which suits the purpose of your program, you could say set a boolean value in the DLL telling it that you are adding functions to its language and then create a function in your code which would list the functions you are adding (the DLL would call it with the name of the function it wants, if that function is a user-defined one contained within your code, the function would call the corresponding function with the argument passed to it by the DLL, the return the return value of the user-defined function back to the DLL, and if it didn't exist, it would return an error code or NULL or something). I'm starting to see that I'll have to find another way around this to make the function calls go one way only
This link explains how to do it in a basic way.
In a big picture view, when you make a dll, you are making a library which is loaded at runtime. It contains a number of symbols which are exported. These symbols are typically references to methods or functions, plus compiler/linker goo.
When you normally build a static library, there is a minimum of goo and the linker pulls in the code it needs and repackages it for you in your executable.
In a dll, you actually get two end products (three really- just wait): a dll and a stub library. The stub is a static library that looks exactly like your regular static library, except that instead of executing your code, each stub is typically a jump instruction to a common routine. The common routine loads your dll, gets the address of the routine that you want to call, then patches up the original jump instruction to go there so when you call it again, you end up in your dll.
The third end product is usually a header file that tells you all about the data types in your library.
So your steps are: create your headers and code, build a dll, build a stub library from the headers/code/some list of exported functions. End code will link to the stub library which will load up the dll and fix up the jump table.
Compiler/linker goo includes things like making sure the runtime libraries are where they're needed, making sure that static constructors are executed, making sure that static destructors are registered for later execution, etc, etc, etc.
Now as to your main problem: how do I write extensible code in a dll? There are a number of possible ways - a typical way is to define a pure abstract class (aka interface) that defines a behavior and either pass that in to a processing routine or to create a routine for registering interfaces to do work, then the processing routine asks the registrar for an object to handle a piece of work for it.
On the detail of what you plan to solve, perhaps you should look at an extendible parser like lua instead of building your own.
To your more specific focus.
A DLL is (typically?) meant to be complete in and of itself, or explicitly know what other libraries to use to complete itself.
What I mean by that is, you cannot have a method implicitly provided by the calling application to complete the DLLs functionality.
You could however make part of your API the provision of methods from a calling app, thus making the DLL fully contained and the passing of knowledge explicit.
How do I use the classes and functions in the DLL?
Include the headers in your code, when the module (exe or another dll) is linked the dlls are checked for completness.
Can the DLL call functions from the program linking to it?
Yes, but it has to be told about them at run time.
If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program?
Yes it will be available, however there are some restrictions you need to be aware about. Such as in the area of memory management it is important to either:
Link all modules sharing memory with the same memory management dll (typically c runtime)
Ensure that the memory is allocated and dealloccated only in the same module.
allocate on the stack
Examples!
Here is a basic idea of passing functions to the dll, however in your case may not be most helpfull as you need to know up front what other functions you want provided.
// parser.h
struct functions {
void *fred (int );
};
parse( string, functions );
// program.cpp
parse( "a = sqrt(); fred(a);", functions );
What you need is a way of registering functions(and their details with the dll.)
The bigger problem here is the details bit. But skipping over that you might do something like wxWidgets does with class registration. When method_fred is contructed by your app it will call the constructor and register with the dll through usage off methodInfo. Parser can lookup methodInfo for methods available.
// parser.h
class method_base { };
class methodInfo {
static void register(factory);
static map<string,factory> m_methods;
}
// program.cpp
class method_fred : public method_base {
static method* factory(string args);
static methodInfo _methoinfo;
}
methodInfo method_fred::_methoinfo("fred",method_fred::factory);
This sounds like a job for data structures.
Create a struct containing your keywords and the function associated with each one.
struct keyword {
const char *keyword;
int (*f)(int arg);
};
struct keyword keywords[max_keywords] = {
"db_connect", &db_connect,
}
Then write a function in your DLL that you pass the address of this array to:
plugin_register(keywords);
Then inside the DLL it can do:
keywords[0].f = &plugin_db_connect;
With this method, the code to handle script keywords remains in the main program while the DLL manipulates the data structures to get its own functions called.
Taking it to C++, make the struct a class instead that contains a std::vector or std::map or whatever of keywords and some functions to manipulate them.
Winrawr, before you go on, read this first:
Any improvements on the GCC/Windows DLLs/C++ STL front?
Basically, you may run into problems when passing STL strings around your DLLs, and you may also have trouble with exceptions flying across DLL boundaries, although it's not something I have experienced (yet).
You could always load the dll at runtime with load library