Is there a way to create a sandbox environment inside C++ where you can either:
- Run processes in
- Load Dynamic Libraries in (Linux)
Dynamic libraries are preferred because of the easy communication between the main process and the sand-boxed processes.
A sand-boxed process should not be able to put memory on the heap or interact with the Kernel / Operating System. Instead the main process will provide an interface to do these things.
Is there any way to do this? I could create a script interpreter but that'd take away a lot of the speed. I'd like to keep the speed loss minimal.
You can use existing software that provides a sandbox environment. You can give it a memory limit, a time limit, and other parameters for the application. I used a sandbox when I built an online judge and needed to execute other users' C++ files in a limited environment.
I develop a C++ framework that is used to run user code in a well defined environment (Linux boxes under our supervision).
I would like to prevent badly written modules from eating up all the memory of a machine. As I develop the framework, could I simply force the program to stop itself if its memory consumption gets too high? What API or tool should I use for this?
A simple mechanism for controlling a process's resource limits is provided by setrlimit. But that doesn't isolate the process (as you should for untrusted third-party code); it only puts some restrictions on it. To properly isolate a process from the rest of the system, you should either use a VM or make use of cgroups and kernel namespaces, preferably not by hand but via some existing library or framework (such as Docker).
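As a rough illustration of the setrlimit approach, here is a minimal sketch that applies the limit in a forked child, so the framework process itself stays unrestricted. The helper names `run_with_memory_limit` and `allocate_one_gib` are made up for this example, and it assumes a Linux/POSIX system.

```cpp
#include <sys/resource.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: runs `work` in a child process whose virtual address
// space is capped at `limit_bytes`. Returns the child's exit code, or -1 on
// error or if the child was killed by a signal.
int run_with_memory_limit(std::size_t limit_bytes, int (*work)()) {
    assert(work != nullptr);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: apply the limit to this process only, then run the work.
        rlimit lim;
        lim.rlim_cur = limit_bytes;
        lim.rlim_max = limit_bytes;
        if (setrlimit(RLIMIT_AS, &lim) != 0) _exit(127);
        _exit(work());
    }
    // Parent: wait for the child and report how it ended.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

// Example workload (an assumption for the demo): tries to allocate 1 GiB.
// Returns 0 if the allocation succeeded and 1 if it was refused.
int allocate_one_gib() {
    void* p = std::malloc(std::size_t{1} << 30);
    if (p == nullptr) return 1;
    static_cast<volatile char*>(p)[0] = 1;  // keep the allocation observable
    std::free(p);
    return 0;
}
```

Calling `run_with_memory_limit(256u << 20, allocate_one_gib)` should report that the allocation was refused. Note that RLIMIT_AS caps virtual address space rather than resident memory, so the limit usually needs some headroom.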
How can I make my program stop if its memory consumption exceeds a limit?
When you define the interface between the application and its modules, ensure that one of the first steps (probably the first) is to pass an allocator-like class instance from the application to the modules.
This instance should be used in the module to allocate and deallocate all necessary memory.
This allows implementations of the allocator instance to report memory allocations to the main application, which can then trigger an exception if a limit (per module or per application) is reached.
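One way such an allocator-like interface might look is sketched below. The class names `ModuleAllocator`, `LimitedAllocator`, and `MemoryLimitExceeded` are invented for this example, and the sketch is not thread-safe; a real framework would likely need locking or atomics.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical interface the application hands to each module.
class ModuleAllocator {
public:
    virtual ~ModuleAllocator() = default;
    virtual void* allocate(std::size_t bytes) = 0;
    virtual void deallocate(void* p, std::size_t bytes) noexcept = 0;
};

// Thrown when a module would exceed its budget (name is an assumption).
struct MemoryLimitExceeded : std::runtime_error {
    MemoryLimitExceeded() : std::runtime_error("module memory limit exceeded") {}
};

// Tracks the bytes handed out and refuses requests beyond `limit`.
class LimitedAllocator : public ModuleAllocator {
public:
    explicit LimitedAllocator(std::size_t limit) : limit_(limit) {}

    void* allocate(std::size_t bytes) override {
        // used_ never exceeds limit_, so the subtraction cannot wrap around.
        if (bytes > limit_ - used_) throw MemoryLimitExceeded{};
        void* p = std::malloc(bytes != 0 ? bytes : 1);
        if (p == nullptr) throw std::bad_alloc{};
        used_ += bytes;
        return p;
    }

    void deallocate(void* p, std::size_t bytes) noexcept override {
        if (p == nullptr) return;
        assert(bytes <= used_);  // caller must pass the size it allocated
        std::free(p);
        used_ -= bytes;
    }

    std::size_t used() const { return used_; }

private:
    std::size_t limit_;
    std::size_t used_ = 0;
};
```

The application would create one `LimitedAllocator` per module and pass it through the module's initialization call. This only accounts for memory the module requests through the interface.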
You can directly provide your own operator new. However, that won't protect you from calls to malloc, or direct OS calls. This would require patching or wrapping glibc (since you're on Linux). Doable but not nice.
What's your desired security level? Are you protecting against Murphy or Machiavelli? Might a plugin use a third-party library which allocates memory on behalf of the plugin? Do you need to keep track of the plugin that allocated the memory?
I know the question might seem a little vague but I will try to explain as clearly as I can.
In C++ there is a way to dynamically link code to your already running program. I am thinking about creating my own plugin system (For learning/research purposes) but I'd like to limit the plugins to specific system access for security purposes.
I would like to give the plugins limited access to, for example, disk writing, such that a plugin can only call functions from the API I pass from my application (and write through my predefined interface). Is there a way to enforce this kind of behaviour from the application side?
If not: are there other languages that support secure dynamically linked modules?
You should think of writing a plugin container (or sandbox) and then coordinate everything through the container. Also make sure to drop any privileges you do not need inside the container process before running the plugin. Because the container runs as a process, you can run it as a dedicated user rather than the one who started the application; once you limit that user, the process is limited automatically. Having a dedicated user for a process is the most common and easiest way. It is also the only cross-platform way to limit a process; even on Windows you can use this method.
Limiting access to shared resources that the OS provides, like disk, RAM, or CPU, depends heavily on the OS, and you have not specified which OS. While it is doable on most OSes, Linux is the prime choice because it is written with multi-seat and server use cases in mind. For example, in Linux you can use cgroups to limit CPU or RAM for each process, and you then only need to apply that to your plugin container process. There is blkio to control disk access, but you can still use the traditional quota mechanism in Linux to limit the per-process or per-user share of disk space.
Supporting plugins is an involved process, and the best way to start is to read code that does some of it. Chromium's sandboxing is the best place I can suggest: it is very cleanly written and has nice documentation. Fortunately the code is not very big.
If you prefer less involvement with actual cgroups, there is an even easier mechanism for limiting resources. Docker is fairly new, but it abstracts away the low-level OS constructs so you can easily contain applications without having to run them in virtual machines.
To block some calls, a first idea may be to hook the system calls that are forbidden, along with any other API calls you don't want. You can also hook the dynamic linking calls to prevent your plugins from loading other DLLs, and hook the disk read/write API to block reads and writes.
Take a look at this, it may give you an idea to how can you forbid function calls.
You can also try to sandbox your plugins. Look at some open source sandboxes and try to understand how they work. It should help you.
In this case you really have to sandbox the environment that the DLL runs in. Building such a sandbox is not easy at all, and it is probably not something you want to do yourself. System calls can be hidden in strings or generated through metaprogramming at execution time, so they are hard to detect by just analysing the binary. Luckily, people have already built solutions, for example Google's Native Client project, whose goal is to allow C++ code to run safely in the browser. If it is safe enough for a browser, it is probably safe enough for you, and it might work outside of the browser.
What are some ways to help identify issues in a large multi-threaded C++ application that may be encumbered by access to storage I/O?
I can analyze an application to find specific slowdowns for specific runs but I cannot seem to simulate a slow I/O to help identify specific problem areas.
Performance can be different when any of the main system components (CPU, memory, and I/O) is tweaked, and I would think it would be useful to see the difference between runs where this set of dependent components varies.
I am familiar with running tools such as VTune. If there is something inside this analyzer that can do this, I would like to know, but I would be open to using other tools.
You could create and mount a FUSE filesystem that just wraps regular filesystem calls in a delay: http://www.cs.nmsu.edu/~pfeiffer/fuse-tutorial/
Let's say I want to give my clients the ability to create plug-ins for their application, but I don't want them to write hacks that poke at the memory of my program. Is it possible to prevent this?
Or load the DLL in a kind of memory region where it won't have access to the main program memory?
You can let the plugins run in a separate process. Any information that is needed by the plugin is passed as a message to that process. Any result that is needed by the application is received as a message. You can have a separate process per plugin, or you can have all plugins run in the same process.
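A minimal sketch of this message-passing idea on Linux/POSIX, using fork and two pipes, might look like the following. `plugin_handle` is a stand-in for real plugin code and, like the other helper names here, is an assumption made for the example; errors are reported only as an empty reply.

```cpp
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <string>

// Reads from fd until end-of-file.
std::string read_all(int fd) {
    assert(fd >= 0);
    std::string out;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        out.append(buf, static_cast<std::size_t>(n));
    return out;
}

// Writes the whole string, looping over partial writes.
bool write_all(int fd, const std::string& data) {
    std::size_t off = 0;
    while (off < data.size()) {
        ssize_t n = write(fd, data.data() + off, data.size() - off);
        if (n <= 0) return false;
        off += static_cast<std::size_t>(n);
    }
    return true;
}

// Hypothetical "plugin": it only ever sees the request bytes it is sent.
std::string plugin_handle(const std::string& request) {
    return std::string(request.rbegin(), request.rend());
}

// Sends `request` to a freshly forked plugin process and returns its reply.
std::string call_plugin_in_child(const std::string& request) {
    int to_child[2];
    int to_parent[2];
    if (pipe(to_child) != 0) return {};
    if (pipe(to_parent) != 0) {
        close(to_child[0]); close(to_child[1]);
        return {};
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(to_child[0]); close(to_child[1]);
        close(to_parent[0]); close(to_parent[1]);
        return {};
    }
    if (pid == 0) {
        // Child: keep only the ends it needs, answer one request, exit.
        close(to_child[1]);
        close(to_parent[0]);
        const std::string reply = plugin_handle(read_all(to_child[0]));
        write_all(to_parent[1], reply);
        _exit(0);
    }
    // Parent: send the request, close the pipe to signal EOF, read the reply.
    close(to_child[0]);
    close(to_parent[1]);
    write_all(to_child[1], request);
    close(to_child[1]);
    std::string reply = read_all(to_parent[0]);
    close(to_parent[0]);
    int status = 0;
    waitpid(pid, &status, 0);
    return reply;
}
```

With this stand-in plugin, `call_plugin_in_child("hello")` returns the reversed string. A crash or stray write inside the plugin stays in the child's own address space, although the child process by itself is not otherwise restricted.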
As an aside, most modern versions of a plugin feature use an embedded runtime environment, such as the JVM. Then, the plugin is running in the same process as the application, but within the confines of a virtual environment, which effectively limits the havoc the plugin can wreak upon your program. In this scenario, there is no DLL, but script code or byte code.
The short answer is "no".
Long answer:
A DLL is loaded into memory, and will appear to be part of the executable file itself for all intents and purposes, both from the process's perspective and the OS's perspective. Sure, the DLL is (perhaps) shared between multiple executables, so the OS needs to track how many "users" there are of a particular DLL, but from one process's perspective, it's part of the executable. It's a separate address range, but the rights and permissions for the content of the DLL are exactly the same as for any other DLL or the main executable itself.
If you have plugins, you need to TRUST the plugins. If that's not what you want, then don't use the DLL model to make plugins (e.g. use a shared memory region and another executable to allow access to the shared memory only).
My web server has a lot of dependencies for sending back data when it gets a request. I am testing one of these dependency applications within the web server. The application is decoupled from the main web server, and only queries go to it, in the form of the APIs it exposes.
My question is: if I wish to check these APIs in a multithreaded environment (C++ functions on a machine with 2 quad-core processors), what is the best way to go about doing it?
Do I call each API in a separate thread or process? If so, how do I implement such code? From what I can figure out, I would be duplicating the functioning of the web server, but I can find no better way to figure out the performance improvements given by that component alone.
It depends on whether your app deals with data that is shared when it is run in parallel processes, because that will most likely determine where the speed bottleneck awaits.
E.g, if the app accesses a database or disk files, you'll probably have to simulate multiple threads/processes querying the app in order to see how they get along with each other, i.e. whether they have to wait for each other while accessing the shared resource.
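Simulating several threads querying the app could be sketched roughly as below. `fake_api` and `g_calls` are placeholders standing in for the real API call, and `time_concurrent_calls` is a name invented for this example.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Calls `api` `calls_per_thread` times from each of `threads` threads and
// returns the total wall-clock time in seconds.
double time_concurrent_calls(const std::function<void()>& api,
                             unsigned threads, unsigned calls_per_thread) {
    assert(threads > 0);
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    pool.reserve(threads);
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&api, calls_per_thread] {
            for (unsigned i = 0; i < calls_per_thread; ++i) api();
        });
    }
    for (auto& th : pool) th.join();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}

// Placeholder for the API under test (an assumption for this sketch):
// it just counts how often it was called.
std::atomic<long> g_calls{0};
void fake_api() { ++g_calls; }
```

Running `time_concurrent_calls(fake_api, 8, 1000)` with 1, 2, 4, and 8 threads and comparing the times gives a first impression of whether callers end up waiting on a shared resource.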
But if the app only does some internal calculation, all on its own, then it may scale well, as long as all its data fits into memory (i.e. no virtual memory access, such as paging to disk, is necessary). Then you can test the performance of just one instance and focus on optimizing its speed.
It also might help to state the OS you're planning to use. Mac OS X offers tools for performance testing and optimization that Windows and Linux may not, and vice versa.