Disabling system calls in C++

Is it possible to disable system calls when compiling C++ code? And if it is, how would I do that?
To extend the question a bit: I want the program to be unable to interact with the operating system, except for reading and writing files. Is this possible?
EDIT: By "unable to interact with the OS" I mean unable to change anything in the OS, such as creating, editing or deleting something. My main concern is system calls, which would almost always be intended to be harmful.
This is for grading programs, where I would be running other people's code. The programs would usually solve various algorithmic problems, so there is no need for very advanced features: basic (more or less) STL usage and classic code. There would be no external libraries (like Boost or anything like that) and no multiple files.

Yes, it's certainly possible.
Take a look at the source code for geordi to see how it does it. Geordi is an IRC bot that compiles, links and runs C++ code under an environment where most system calls are disabled.

#define system NO_SYSTEM_CALL
This works if you are OK with using a macro to turn any use of system into a compilation error.

You could use any combination of the following:
create your own library with a dummy function called system and link it with the student code (assuming you control the build steps)
grep the source code (though preprocessing hacks could get around that)
run the built binaries under an unprivileged user id, after chroot etc.
use a virtual machine
invoke the compiler with -Dsystem= (though the student could #undef)
(maybe - have to check the end-user agreement) upload their source to ideone or similar and let their security handle such issues
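The first option in the list (a dummy system function linked with the student code) might look like the sketch below. It assumes a glibc-style toolchain where a definition in the program takes precedence over the C library's symbol. The message text and the return values are illustrative choices, and a submission can still bypass the stub by issuing the underlying system calls itself.

```cpp
#include <cstdio>

// Hypothetical stub, linked into every submission so that calls to
// system() resolve here instead of to the C library's implementation.
extern "C" int system(const char* command) {
    if (command == nullptr) {
        return 0;  // system(NULL) == 0 means "no command processor available"
    }
    std::fprintf(stderr, "system() is disabled (ignored: %s)\n", command);
    return -1;     // the value the real system() uses to report failure
}
```

A submission that calls system("shutdown ...") then only gets -1 back and nothing is executed.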

A program can always invoke system calls, at least under *nix. You could, however, take a look at SELinux, AppArmor, or grsecurity: these are kernel safeguards that can block certain system calls for an application.

Related

How to Prevent I/O Access in C++ or Native Compiled Code

I know this may be impossible but I really hope there's a way to pull it off. Please tell me if there's any way.
I want to write a sandbox application in C++ and allow other developers to write native plugins that can be loaded right into the application on the fly. I'd probably want to do this via DLLs on Windows, but I also want to support Linux and hopefully Mac.
My issue is that I want to be able to prevent the plugins from doing I/O access on their own. I want to require them to use my wrapped routines so that I can ensure none of the plugins contain malicious code that starts harming the user's files on disk or doing undesirable things on the network.
My best guess on how to pull off something like this would be to include a compiler with the application and require the source code for the plugins to be distributed and compiled right on the end-user platform. Then I'd need a code scanner that could search the uncompiled plugin code for signatures that would show up in I/O operations for hard disk, network or other storage media.
My understanding is that standard library facilities like fstream wrap platform-specific functions, so I would think that simply scanning all the code that will be compiled for platform-specific functions would let me accomplish the task. Because ultimately, native code can't do any I/O unless it talks to the OS using one of the OS's provided methods, right?
If my line of thinking is correct on this, does anyone have a book or resource recommendation on where I could find the nuts and bolts of this stuff for Windows, Linux, and Mac?
If my line of thinking is incorrect and it's impossible for me to really prevent native code (compiled or uncompiled) from doing I/O operations on its own, please tell me so I don't create an application that I think is secure but really isn't.
In an absolutely ideal world, I don't want to require the plugins to distribute uncompiled code. I'd like to allow the developers to compile and keep their code to themselves. Perhaps I could scan the binaries for signatures that pertain to I/O access?
Sandboxing a program executing code is certainly harder than merely scanning the code for specific accesses! For example, the program could synthesize assembler statements doing system calls.
The original approach on UNIXes is to chroot() the program, but I think there are problems with that approach, too. Another approach is a secured environment like SELinux, possibly combined with chroot(). The modern approach seems to be running the program in a virtual machine: when the program starts, fire up a suitable snapshot of a VM; upon termination, just rewind to the snapshot. That merely requires that the allowed accesses are somehow channeled somewhere.
Even a VM doesn't block I/O. It can block network traffic very easily though.
If you want to make sure the plugin doesn't do I/O, you can scan its DLL for all of its imported functions and check that list against a blacklist of I/O functions.
Windows has the dumpbin utility and Linux has nm. Both can be run via a system() function call, with the output of the tools redirected to files.
Of course, you can write your own analyzer but it's much harder.
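A rough sketch of the blacklist check on Linux follows. It assumes the text passed in is the output of nm -D --undefined-only for the plugin (one symbol per line, name in the last column, optionally with a version suffix such as @GLIBC_2.2.5). The function name forbidden_imports and the blacklist contents are made up for illustration. As other answers point out, a plugin that issues system calls directly never shows up in the import list.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Returns the blacklisted symbols that appear as imports in `nm_output`.
// Assumption: `nm_output` holds lines such as "         U fopen@GLIBC_2.2.5".
std::vector<std::string> forbidden_imports(
        const std::string& nm_output,
        const std::set<std::string>& blacklist) {
    std::vector<std::string> hits;
    std::istringstream lines(nm_output);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string token, symbol;
        while (fields >> token) symbol = token;  // last field is the name
        // Drop a version suffix such as "@GLIBC_2.2.5", if present.
        symbol = symbol.substr(0, symbol.find('@'));
        if (blacklist.count(symbol)) hits.push_back(symbol);
    }
    return hits;
}
```

The host would refuse to load the plugin whenever the returned list is non-empty.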
User code can't do I/O on its own; only the kernel can. If you're worried about the plugin gaining ring 0 (kernel) privileges, then you need to scan the DLL's assembly for I/O instructions.

Calling external files (e.g. executables) in C++ in a cross-platform way

I know many have asked this question before, but as far as I can see, there's no clear answer that helps C++ beginners. So, here's my question (or request if you like),
Say I'm writing C++ code using Xcode or any text editor, and I want to use some of the tools provided by another C++ program, for instance an executable. How can I call that executable file in my code?
Also, can I exploit other functions/objects/classes provided in a C++ program and use them in my C++ code via this calling technique? Or is it just executables that I can call?
I hope someone could provide a clear answer that beginners can absorb.. :p
So, how can I call that executable file in my code?
The easiest way is to use system(). For example, if the executable is called tool, then:
system( "tool" );
However, there are a lot of caveats with this technique. This call just asks the operating system to do something, but each operating system can understand or answer the same command differently.
For example:
system( "pause" );
...will work on Windows, stopping the execution, but not on other operating systems. Also, the rules regarding spaces inside the path to the file are different. Finally, even the path separator can be different ('\' on Windows only).
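To illustrate the caveat about spaces in paths, here is a minimal sketch that quotes each part of the command before handing it to std::system. The helper names build_command and run_tool are invented for this example. The quoting is deliberately simplistic (embedded quotes and shell metacharacters are not escaped) and the exact quoting rules still differ between shells, so treat it as a starting point only.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Wraps the program path and each argument in double quotes so that
// spaces survive. Simplification: embedded quotes are not escaped.
std::string build_command(const std::string& program,
                          const std::vector<std::string>& args) {
    std::string cmd = "\"" + program + "\"";
    for (const std::string& a : args) cmd += " \"" + a + "\"";
    return cmd;
}

// Returns the implementation-defined status from std::system, or -1
// when the platform reports that no command processor is available.
int run_tool(const std::string& program,
             const std::vector<std::string>& args) {
    if (std::system(nullptr) == 0) return -1;
    return std::system(build_command(program, args).c_str());
}
```

Calling std::system with a null pointer is the standard way to ask whether a command processor exists at all.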
And can I also exploit other functions/objects/classes... from a c++
and use them in my c++ code via this calling technique?
Not really. If you want to use classes or functions created by others, you will have to get their source code and compile it with your program. This is probably one of the easiest ways to do it, provided that the source code is small enough.
Many times, people create libraries, which are collections of useful classes and/or functions. If the library is distributed in binary form, then you'll need the DLL file (or the equivalent for other OSes), and a header file describing the classes and functions provided by the library. This is a rich source of frustration for C++ programmers, since even libraries created with different compilers on the same operating system are potentially incompatible. That's why libraries are often distributed in source code form, with a list of instructions (a makefile or even worse) to obtain a binary version in a single file, and a header file, as described before.
This is because the C++ standard does not specify the low-level details of what happens inside a compiler. Lots of implementation details were left for compiler vendors to handle as they wanted, possibly to achieve better performance. This unfortunately means that it is difficult to distribute a simple library.
You can call another program easily - this will start an entirely separate copy of the program. See the system() or exec() family of calls.
This is common in unix where there are lots of small programs which take an input stream of text, do something and write the output to the next program. Using these you could sort or search a set of data without having to write any more code.
On Windows it's easy to start the default application for a file automatically, so you could write a PDF file and start the default app for viewing it. What is harder on Windows is controlling a separate GUI program: unless the program has deliberately been written to allow remote control (e.g. with COM/OLE on Windows), you can't control anything the user does in that program.
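For the Unix style of chaining small programs mentioned above, one possible sketch uses the POSIX popen call to run a command and read what it prints. This is an alternative to system() and exec(), not something the answers name. The helper run_and_capture is a made-up name, and the code assumes a POSIX system with a shell.

```cpp
#include <stdio.h>   // popen and pclose are POSIX, declared here
#include <stdexcept>
#include <string>

// Runs `command` through the shell and returns everything it wrote to
// its standard output.
std::string run_and_capture(const std::string& command) {
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe) throw std::runtime_error("popen failed");
    std::string output;
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe)) output += buffer;
    pclose(pipe);
    return output;
}
```

For example, run_and_capture("printf 'b\\na\\n' | sort") lets the existing sort program do the sorting without any extra code.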

Disable system() and exec() function in C and Pascal

Is there any way to disable the system() and exec() functions in C/C++ and Pascal, by using a compiler argument or by modifying a header/unit file? (The server runs Windows.)
I've tried using -Dsystem=NONEXIST for gcc and g++ but #include <cstdio> causes compile error.
EDIT: Of course I know they can use #undef system to bypass the defense, so I've tried commenting out the system function line in stdlib.h, but that doesn't work either.
EDIT2 (comment): It's a system to which users submit their programs. The server compiles and runs them with different input data, then compares the program output with pre-calculated standard output to see if the program is correct. Now some users send code like system("shutdown -s -t 0"); to shut down the server.
The server is running Windows, so I don't have a chroot environment. Also, the server application is closed-source, so I can do nothing to control how the program submitted by the user is executed. What I can do is modify the compiler command-line arguments and modify header files.
Well, you could try:
#define system DontEvenThinkAboutUsingThisFunction
#define exec OrThisOneYouClown
in a header file but I'm pretty certain any code monkey worth their salt could bypass such a "protection".
I'd be interested in understanding why you thought this was necessary (there may be a better solution if we understood the problem better).
The only thing that comes to mind is that you want to provide some online compiler/runner akin to the Euler project. If that was the case, then you could search the code for the string system<whitespace>( as an option but, even then, a determined party could just:
#define InoccuousFunction system
to get around your defenses.
If that is the case, you might want to think about using something like chroot so that no-one can even get access to any dangerous binaries like shutdown (and that particular beast shouldn't really be runnable by a regular user anyway) - in other words, restrict their environment so that the only things they can even see are gcc and its kin.
You need to do proper sandboxing since, even if you somehow prevented them from running external programs, they may still be able to do dangerous things like overwrite files or open socket connections to their own box to send through the contents of your precious information.
One possibility is to create your own version of such functions, and link them into every program you compile/link on the server. If the symbol is found in your objects, it'll take precedence.
Just make sure you get them all ;)
It would be much better to run the programs as a user with as few privileges as possible. Then you don't have to worry about them deleting/accessing system files, shutting down the system, etc.
EDIT: of course, by my logic, the user could provide their own version of the function also, which does dynamic library loading & symbol lookup to find the original function. You really just need to sandbox it.
For unixoid environments, there is Geordi, which uses a lot of help from the operating system to sandbox the code to be executed.
Basically you want to run the code in a very restricted environment; Linux provides a special process flag for that which disables any system calls that would give access to resources that the process did not have at the point where the flag was set (i.e. it disallows opening new files, but any files that are already open may be accessed normally).
I think Windows should have a similar mechanism.
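The process flag described above is presumably seccomp (an assumption; the answer does not name it). A minimal Linux-only sketch of its strict mode follows: after the prctl call the child may only use read, write, exit and sigreturn, and any other system call gets it killed with SIGKILL. The function names and the exit code 99 are choices made for this example. Strict mode cannot be enabled if the process already runs under a seccomp filter (as in some containers), which the sketch reports through that exit code.

```cpp
#include <csignal>
#include <fcntl.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child, switches it to seccomp strict mode, then runs `body`.
// Returns the child's wait status, or -1 if fork/waitpid failed.
// Exit code 99 is this sketch's marker for "strict mode unavailable".
int run_in_strict_mode(void (*body)()) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0) _exit(99);
        body();
        // glibc's _exit() uses exit_group, which strict mode forbids,
        // so issue the plain exit system call directly.
        syscall(SYS_exit, 0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return status;
}

// Allowed: writing to a descriptor that was already open.
void write_to_stdout() {
    const char msg[] = "still allowed\n";
    ssize_t n = write(STDOUT_FILENO, msg, sizeof msg - 1);
    (void)n;
}

// Forbidden: opening a new file; the kernel kills the child.
void open_new_file() {
    int fd = open("/etc/hostname", O_RDONLY);
    (void)fd;
}
```

A grader would open the input and output files first, enable strict mode, and only then jump into the submitted code.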
Not really (because of tricks like calling some library function which would call system itself, or because the functionality of spawning processes can be done with just fork & execve system calls, which remain available...).
But why do you ask that?
You can never (as you have found out) rely on user input to be safe. system and execXX are unlikely to be your only problems.
This means you have the following options:
1. Run the program in some kind of chrooted jail (not sure how to do this on Windows).
2. Scan the code before compiling to ensure there are no "illegal" functions.
3. Scan the executable binary after compiling to ensure that it is not using any "forbidden" library function.
4. Prevent the linker from linking to any external libraries, including the standard C library (libc) on Unix. You then create your own "libc" which explicitly allows certain functions.
For number 3 on Unix, utilities like readelf or objdump can check for linked-in symbols. This can probably also be done using the Binary File Descriptor library.
Number 4 will require fiddling with compiler flags but is probably the safest of the options listed above.
You could use something like this
#include<stdlib.h>
#include<unistd.h>
#define system <stdlib.h>
#define exec <unistd.h>
In this case even if the user wants to swap macro values they can't. If they try to swap macro values like this
#define <stdlib.h> system
#define <unistd.h> exec
they can't, because C doesn't allow a macro name of that form. Even if they somehow swapped these values, the header files we have already included would cause a compile-time error.

C++: Any way to 'jail function'?

Well, it's a kind of a web server.
I load .dll(.a) files and use them as program modules.
I recursively go through directories and put the '_main' functors from these libraries into a std::map under the name that is recorded in special '.m' files.
The main directory has few directories for each host.
The problem is that I need to prevent 'fopen' or any other filesystem function from working with directories outside of this host directory.
The only way I can see to do that is to write a wrapper for stdio.h (I mean, write an s_stdio.h that has a filename check).
Or maybe it could be a daemon that catches system calls and identifies something?
Addendum:
And what about this situation: I accept only sources and then compile them directly on my server after checking them? That's the only way I have found (still having everything inside one address space).
As C++ is a low-level language and the DLLs are compiled to machine code, they can do anything. Even if you wrap the standard library functions, the code can make the system calls directly, reimplementing the functionality you have wrapped.
Probably the only way to effectively sandbox such a DLL is some kind of virtualisation, so the code is not run directly but in a virtual machine.
The simpler solution is to use some higher-level language for the loadable modules that should be sandboxed. Some high-level languages are better at sandboxing (Lua, Java); others are not so good (e.g. AFAIK there is currently no official restricted environment implemented for Python).
If you are the one loading the module, you can perform a static analysis on the code to verify what APIs it calls, and refuse to link it if it doesn't check out (i.e. if it makes any kind of suspicious call at all).
Having said that, it's a lot of work to do this, and not very portable.
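For completeness, the filename check from the question's s_stdio.h idea could be sketched with C++17's std::filesystem as below. The name is_inside is invented, both paths are assumed to be absolute with no trailing slash, and symbolic links created after the check are not handled. As the answers explain, such a check only helps against modules that actually go through the wrapper.

```cpp
#include <filesystem>

namespace fs = std::filesystem;

// True if `requested` resolves to `host_root` itself or to something
// below it. Assumption: both arguments are absolute paths.
bool is_inside(const fs::path& host_root, const fs::path& requested) {
    // weakly_canonical resolves ".." and existing symlinks, and also
    // accepts paths whose trailing components do not exist yet.
    const fs::path root = fs::weakly_canonical(host_root);
    const fs::path target = fs::weakly_canonical(requested);
    const fs::path rel = target.lexically_relative(root);
    // A relative path that starts with ".." leaves the host directory.
    return !rel.empty() && *rel.begin() != fs::path("..");
}
```

A wrapped fopen would call is_inside first and fail without touching the file when it returns false.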

Finding very similar program executions

I was wondering if it's possible (or if anyone knows of any tools) to compare the execution of two related programs, for example assignments in a class, to see how similar they are. For example, not to compare the names of functions, but how they use syscalls. One silly case of this would be testing whether a C string is printed the same way (see the example below) in more than one separate program.
printf("%s",str)
Or as
for (i=0;i<len;i++) printf("%c",str[i]);
I haven't put much thought into this, but I would imagine that strace / ltrace (maybe even oprofile) would be a good starting point. In particular, this is for UNIX C / C++ programs.
Thanks.
If you have access to the source code of the two programs, you may build a graph of the functions (each function is a node, and there is an edge from A to B if A calls B()), and compute some graph similarity metrics. This will catch a source code copy made by renaming and reorganizing.
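One very simple name-independent metric, offered only as an illustration of the idea, compares the sorted (in-degree, out-degree) pairs of the two call graphs. The types and function names below are invented, and extracting the call edges from the source is assumed to happen elsewhere. Real graph similarity measures are considerably more robust than this.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Each edge is (caller, callee).
using CallGraph = std::vector<std::pair<std::string, std::string>>;
using Profile = std::vector<std::pair<int, int>>;

// Sorted list of (in-degree, out-degree), one entry per function.
Profile degree_profile(const CallGraph& g) {
    std::map<std::string, std::pair<int, int>> degree;
    for (const auto& e : g) {
        ++degree[e.first].second;   // caller gains an outgoing edge
        ++degree[e.second].first;   // callee gains an incoming edge
    }
    Profile profile;
    for (const auto& kv : degree) profile.push_back(kv.second);
    std::sort(profile.begin(), profile.end());
    return profile;
}

// Shared profile entries divided by the size of the larger profile.
double profile_similarity(const CallGraph& a, const CallGraph& b) {
    const Profile pa = degree_profile(a), pb = degree_profile(b);
    if (pa.empty() && pb.empty()) return 1.0;
    Profile common;
    std::set_intersection(pa.begin(), pa.end(), pb.begin(), pb.end(),
                          std::back_inserter(common));
    return static_cast<double>(common.size()) /
           static_cast<double>(std::max(pa.size(), pb.size()));
}
```

Because function names never enter the comparison, a copy that only renames functions gets the same profile as the original.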
An initial idea would be to use ltrace and strace to log the calls and then use diff on the logs. This would obviously only cover the library and system calls. If you need finer-grained logging, oprofile might help.
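As a variation on plain diff, the logs can be reduced to sequences of call names and scored with a longest-common-subsequence ratio. The sketch below assumes log lines of the form name(args) = result, without PID or timestamp prefixes. Both function names are made up for this example.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Extracts the call name (the text before the first '(') from each
// line of an strace/ltrace-style log. Other lines are skipped.
std::vector<std::string> call_names(const std::string& log) {
    std::vector<std::string> names;
    std::istringstream lines(log);
    std::string line;
    while (std::getline(lines, line)) {
        const std::size_t paren = line.find('(');
        if (paren != std::string::npos && paren > 0)
            names.push_back(line.substr(0, paren));
    }
    return names;
}

// Similarity in [0, 1]: length of the longest common subsequence of
// the two call sequences divided by the length of the longer one.
double trace_similarity(const std::vector<std::string>& a,
                        const std::vector<std::string>& b) {
    if (a.empty() && b.empty()) return 1.0;
    std::vector<std::vector<std::size_t>> lcs(
        a.size() + 1, std::vector<std::size_t>(b.size() + 1, 0));
    for (std::size_t i = 1; i <= a.size(); ++i)
        for (std::size_t j = 1; j <= b.size(); ++j)
            lcs[i][j] = (a[i - 1] == b[j - 1])
                            ? lcs[i - 1][j - 1] + 1
                            : std::max(lcs[i - 1][j], lcs[i][j - 1]);
    return static_cast<double>(lcs[a.size()][b.size()]) /
           static_cast<double>(std::max(a.size(), b.size()));
}
```

In the question's example, the printf("%s") version would produce one write call where the character loop might produce several (depending on buffering), so the score drops below 1 even though the output is identical.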
If you have access to the source code you could instrument your code by compiling it with profiling information and then parse the gcov output after the runs. A pure static source code analysis may be sufficient if your code is not taking different routes depending on external data/state.
I think you can do this kind of thing using valgrind.
A finer-grained approach (depending on what access you have to the program source and what exactly you want to compare) would be to use kprobes.
Kernel Dynamic Probes (Kprobes) provides a lightweight interface for kernel modules to implant probes and register corresponding probe handlers. A probe is an automated breakpoint that is implanted dynamically in executing (kernel-space) modules without the need to modify their underlying source. Probes are intended to be used as an ad hoc service aid where minimal disruption to the system is required. They are particularly advocated in production environments where the use of interactive debuggers is undesirable. Kprobes also has substantial applicability in test and development environments. During test, faults may be injected or simulated by the probing module. In development, debugging code (for example a printk) may be easily inserted without having to recompile the module under test.