What can one reasonably do to try to contain catastrophic failures (such as crashes) within a shared library plugin written in C++, beyond catching all exceptions using catch (...) { /* handle here */ }?
My situation is a bit peculiar so let me describe it in detail. The "plugins" extend Mathematica (a programming language). They must be compiled into shared libraries which export a number of C (not C++) functions all of which have the same signature (similar to argc, argv, etc.). These functions can use a simple API to handle "arguments", return a value, or call back to Mathematica. After the "plugin" is loaded, these C functions will be exposed as Mathematica functions.
My contribution here is a tool that takes the interface description of a C++ class and automatically generates these standard C functions to call class members. I want to generate code to try to catch failures and report on them. So far the only protection I have is catching all exceptions.
Note 1: I am not interested in trying to run the plugin in a separate process for performance reasons. The API that Mathematica provides give direct access to large numerical matrices within Mathematica. I absolutely want to avoid copying these. Mathematica also provides a different API for out-of-process plugins, which I do use when performance is not critical.
Note 2: I cannot change Mathematica itself, e.g. control how it loads shared libraries.
Note 3: I looked a bit at signal handlers, but I worry that it's not a good idea to set once since it will affect the host application, which I have no control over. I only want to catch failures within the plugin itself, not other parts of the host.
I think this should be possible to some extent because MATLAB does it: if a MATLAB "plugin" (i.e. MEX file written in C) crashes, it won't immediately bring down the entire application. Instead it shows an error like this:
Notice the "Attempt to Continue" button. This is precisely the kind of thing I want to do: report some information about the problem (such as which plugin misbehaved, if this can be found out) and give a last chance to the user to try to save their work, while warning them about the potential consequences of continuing.
I expect some answers to be platform specific. I'm interested in Windows, OS X and GNU/Linux.
Related
I need to provide my users the ability to write mathematical computations into the program. I plan to have a simple text interface with a few buttons including those to validate the script grammar, save etc.
Here's where it gets interesting. These functions the user is writing need to execute at multi-megabyte line speeds in a communications application. So I need the speed of a compiled language, but the usage of a script. A fully interpreted language just won't cut it.
My idea is to precompile the saved user modules into objects at initialization of the C++ application. I could then use these objects to execute the code when called upon. Here are the workflows I have in mind:
1) Testing(initial writing) of script: Write code in editor, save, compile into object (testing grammar), run with test I/O, Edit Code
2) Use of Code (Normal operation of application): Load script from file, compile script into object, Run object code, Run object code, Run object code, etc.
I've looked into several off the shelf interpreters, but can't find what I'm looking for. I considered JAVA, as it is pretty fast, but I would need to load the JAVA virtual machine, which means passing objects between C and the virtual machine... The interface is the bottleneck here. I really need to create a native C++ object running C++ code if possible. I also need to be able to run the code on multiple processors effectively in a controlled manner.
I'm not looking for the whole explanation on how to pull this off, as I can do my own research. I've been stalled for a couple days here now, however, and I really need a place to start looking.
As a last resort, I will create my own scripting language to fulfill the need, but that seems a waste with all the great interpreters out there. I've also considered taking an existing open source complier and slicing it up for the functionality I need... just not saving the compiled results to disk... I don't know. I would prefer to use a mainline language if possible... but that's not required.
Any help would be appreciated. I know this is not your run of the mill idea I have here, but someone has to have done it before.
Thanks!
P.S.
One thought that just occurred to me while writing this was this: what about using a true C compiler to create object code, save it to disk as a dll library, then reload and run it inside "my" code? Can you do that with MS Visual Studio? I need to look at the licensing of the compiler... how to reload the library dynamically while the main application continues to run... hmmmmm I could then just group the "functions" created by the user into library groups. Ok that's enough of this particular brain dump...
A possible solution could be use gcc (MingW since you are on windows) and build a DLL out of your user defined code. The DLL should export just one function. You can use the win32 API to handle the DLL (LoadLibrary/GetProcAddress etc.) At the end of this job you have a C style function pointer. The problem now are arguments. If your computation has just one parameter you can fo a cast to double (*funct)(double), but if you have many parameters you need to match them.
I think I've found a way to do this using standard C.
1) Standard C needs to be used because when it is compiled into a dll, the resulting interface is cross compatible with multiple compilers. I plan to do my primary development with MS Visual Studio and compile objects in my application using gcc (windows version)
2) I will expose certain variables to the user (inputs and outputs) and standardize them across units. This allows multiple units to be developed with the same interface.
3) The user will only create the inside of the function using standard C syntax and grammar. I will then wrap that function with text to fully define the function and it's environment (remember those variables I intend to expose?) I can also group multiple functions under a single executable unit (dll) using name parameters.
4) When the user wishes to test their function, I dump the dll from memory, compile their code with my wrappers in gcc, and then reload the dll into memory and run it. I would let them define inputs and outputs for testing.
5) Once the test/create step was complete, I have a compiled library created which can be loaded at run time and handled via pointers. The inputs and outputs would be standardized, so I would always know what my I/O was.
6) The only problem with standardized I/O is that some of the inputs and outputs are likely to not be used. I need to see if I can put default values in or something.
So, to sum up:
Think of an app with a text box and a few buttons. You are told that your inputs are named A, B, and C and that your outputs are X, Y, and Z of specified types. You then write a function using standard C code, and with functions from the specified libraries (I'm thinking math etc.)
So now your done... you see a few boxes below to define your input. You fill them in and hit the TEST button. This would wrap your code in a function context, dump the existing dll from memory (if it exists) and compile your code along with any other functions in the same group (another parameter you could define, basically just a name to the user.) It then runs the function using a functional pointer, using the inputs defined in the UI. The outputs are sent to the user so they can determine if their function works. If there are any compilation errors, that would also be outputted to the user.
Now it's time to run for real. Of course I kept track of what functions are where, so I dynamically open the dll, and load all the functions into memory with functional pointers. I start shoving data into one side and the functions give me the answers I need. There would be some overhead to track I/O and to make sure the functions are called in the right order, but the execution would be at compiled machine code speeds... which is my primary requirement.
Now... I have explained what I think will work in two different ways. Can you think of anything that would keep this from working, or perhaps any advice/gotchas/lessons learned that would help me out? Anything from the type of interface to tips on dynamically loading dll's in this manner to using the gcc compiler this way... etc would be most helpful.
Thanks!
I am designing a system in C/C++ which is extendible with all sort of plugins. There is a well defined C public API which mostly works with (const) char* and other pointer types. The plugins are compiled into .so or .dll files, and the main application loads them upon startup, and later unloads or reloads them upon request.
The plugins might come in from various sources, trustable or not so :)
Now, I would like to make sure, that if one plugin does something stupid (such as tries to free a memory which he was not supposed to free), this action does not bring down the entire system, but merely notices the main system about the misbehaving plugin for it in order to remove it from the queue.
The code calls are being done in the following manner:
const char* data = get_my_data();
for(int i = 0; i<plugins; i++)
{
plugins[i]->execute(data);
}
but if plugin[0] frees "by accident" the data string or overwrites it or by mistake jumps to address 0x0 this would bring down the entire system, and I don't want this. How can I avoid this kind of catastrophe. (I know, I can duplicate the data string ... this does not solve my problem :) )
Make a wrapper process for plugin and communicate with that wrapper through IPC.
In case of plugin failure your main process would be untouched
Simply put, you can't do that in the same process. If your plugins are written in C or C++, they can contain numerous sources of undefined behavior, meaning sources for undetectable unavoidable crashes. So you should either launch the plugins in their own processes like kassak suggested and let them crash if they want to, or use another language for your plugins, e.g. some intepreted scripting language like lua.
Have a look at http://msdn.microsoft.com/en-us/library/1deeycx5(v=vs.90).aspx
I use /EHa in one of my projects to help me catch exceptions from libraries that do stupid things. If you compile your code with this setting a normal try catch block will catch exceptions like devide by zero, etc.
Not sure if there is some equivalent for this on Linux -- please let me know if there is..
I am about to delve into kernel land. My question relates to the programming language. I have seen most tutorials to be written in C. I currently program in C++ and Assembly. I also studied C before C++, but I didn't use it a lot. Would it be possible to program in kernel mode using simplistic C++without using any advanced constructs? Basically I am trying to avoid the minor differences that exist between the two languages(like no bool in C, no automatic returning of 0 from main, really minor differences). I won't be using templates, classes and the like. So would it be possible to program in kernel mode using simplistic C++ without any major annoyances?
Even if not officially supported, you can use C++ as the development language for Windows kernel development.
You should be aware of the following things :
you MUST define the new and delete operator to map to ExAllocatePoolWithTag and ExFreePool.
try to avoid virtual functions. It seems not possible to control the location of the vtable of the object and this may have unexpected results if it is in a pageable portion and you code is called with IRQL >= DISPATCH_LEVEL.
if you still need to use virtual methods table than lock .rdata segment before using it on IRQL >= DISPATCH_LEVEL.
Apart from these kinds of limitations, you can use C++ for your driver development.
Add two links if you want to do C++ in WDK. It's a one time setup effort.
The NT Insider:Guest Article: C++ in an NT Driver
The NT Insider:Global Relief Effort - C++ Runtime Support for the NT DDK
Have seen kernel codes use lots of auto-locks/smart-pointers; although they make the code neat, I feel it has a learning curve for beginner to fully understand, and if abused, lots of construct/destruct codes slow things down.
If you write your code carefully, knowing what exactly stands behind each definition, operator, call, etc, then there should be no problem writing kernel code in C++. The Microsoft document mentioned in the comments above is a good reading precisely because it describes situations in which C++ isn't as transparent as C or doesn't provide similar important guarantees and from that you know what to avoid.
Microsoft has written a guide. Basically they tell us to steer clear of anything but using C++'s relaxed rules of variable declarations...sigh. Anything else and you're on your own. Anyway it can't be all that bad but here are some examples of what you need to remember:
Memory allocated in the paged pool can get paged out. If you try to access it when IRQL is above PASSIVE_LEVEL you're screwed (or at least you will be every once in a while when your customer complains about your driver BSODding their system)! Test your driver on a low memory system under load!
The non-paged pool is limited, you most likely cannot allocate all your needs from it.
Stack is much smaller than in user mode ~12-24K.
Anything you do involving floating point path in the kernel must be protected by KeSaveFloatingPointState and KeRestoreFloatingPointState
C++ exceptions: No
Read the guide for more. Now if you can make sure that the generated code follows the rules, go ahead and use C++.
I want to know if exists an "evaluate" function in C++ like the Matlab one.
In practise, I need a function that can interprets a string like a command line.
thanks for the answers.
If you are actually trying to "evaluate" C++ source code within a running C++ application, then basically no - it's not a feature specified by the language.
There are interpreters for subsets of C++ (e.g. CInt, Ch and UnderC) - they may be able to run your C++ program if it's a relatively simple one. Alternatively, some can be embedded within a compiled C++ program to allow some run-time source code evaluation, but with limited access to and ability to change the pre-compiled code and its variables.
It's also possible for a running program to invoke the compiler and dynamically load/link a resultant library, but this is a very unusual practice and not without performance, security and interoperability issues:
creating a new process for the compiler, compiling and linking is a relatively resource-hungry and slow operation, but once the library's linked the new code can be executed at normal out-of-line function call speeds
the usual issues with executing an external process
ensuring the path and compiler executable name can't be changed by malicious inputs to the program
that no malware is substituted for or infecting the compiler
on-the-fly source code doesn't contain statements like system(), exec(), unlink() calls, abuse network connectivity, chew unwarranted CPU/memory/descriptors etc.
the pre-compiled C++ program can't be modified or easily/deeply probed by the newly linked code, so the main mechanisms for new behaviour must have been designed in to the pre-compiled application already: expectations for newly accessible variables, functions, and factory methods / virtual dispatch.
If you actually need something more limited, like the ability to evaluate mathematical expressions or logical predicates, possibly expressed in a C++-source style, perhaps reading or setting some of your values, then various more limited and specialised libraries and embedded interpreted are available. There are even libraries for creating such parsers, such as the boost spirit library.
Finally, interpreters for other languages - Lua, Ruby, Python, Perl, TCL etc. - may be embedded in the C++ application, sporting various approaches to interoperability and security.
You can use system(): http://linux.die.net/man/3/system
I was wondering if its possible / anyone knows any tools out there to compare the execution of two related programs (for example, assignments on a class) to see how similar they are. For example, not to compare the names of functions, but how they use syscalls. One silly case of this would be testing if a C string is printed as (see example below) in more than one case one separate program.
printf("%s",str)
Or as
for (i=0;i<len;i++) printf("%c",str[i]);
I haven´t put much thought into this, but i would imagine that strace / ltrace (maybe even oprofile) would be a good starting point. Particularly, this is for UNIX C / C++ programs.
Thanks.
If you have access to the source code of the two programs, you may build a graph of the functions (each function is a node, and there is an edge from A to B if A calls B()), and compute some graph similarity metrics. This will catch a source code copy made by renaming and reorganizing.
An initial idea would be to use ltrace and strace to log the calls and then use diff on the logs. This would obviously only cover the library an system calls. If you need a more fine granular logging, the oprofile might help.
If you have access to the source code you could instrument your code by compiling it with profiling information and then parse the gcov output after the runs. A pure static source code analysis may be sufficient if your code is not taking different routes depending on external data/state.
I think you can do this kind of thing using valgrind.
A finer-grained version (and depending on what is the access to the program source and what you exactly want in terms of comparison) would be to use kprobes.
Kernel Dynamic Probes (Kprobes) provides a lightweight interface for kernel modules to implant probes and register corresponding probe handlers. A probe is an automated breakpoint that is implanted dynamically in executing (kernel-space) modules without the need to modify their underlying source. Probes are intended to be used as an ad hoc service aid where minimal disruption to the system is required. They are particularly advocated in production environments where the use of interactive debuggers is undesirable. Kprobes also has substantial applicability in test and development environments. During test, faults may be injected or simulated by the probing module. In development, debugging code (for example a printk) may be easily inserted without having to recompile to module under test.