Need to sandbox application that compiles C++ modules from untrusted sources online

I’m developing a C++ application where I want to compile C++ modules from potentially untrusted sources online, and have them operate on a specific bank of data within a single process. I’d like to sandbox these somehow. This is obviously a complex issue, but hoping to discover if there’s any potential approach or tool/library I haven’t yet thought of. The app will run on Windows & OSX at minimum, and (hopefully) Linux, iOS, Android.
My app would locally compile the C++ modules it downloads, and dynamically link the object code to a process in the app (not necessarily the “main” app process). The C++ modules would only have access to my API via the headers I provide, however the API (and any dependent libraries) need to be linked into the same process. The API’s dependent libraries are compute-based only, such as native SIMD-based math and possibly memory allocation. I don’t expect they will need to call any network, disk, or any other OS functionality, for that matter – except for needing to communicate their input data and computed results to the main process (maybe over shared memory ?)
I don’t care if the sandboxed process’ memory is corrupted or hollowed, as long as it’s contained in that process. I also want to avoid having any system API call addresses linked into the process memory space, to prevent compromised code from finding them.
I’ve done a review of the basic security issues (stack crashes, return oriented programming hacks, etc.). Also looked at some related projects:
I see Google has a sandbox subproject within the Chromium repo which might be useful, but I’m unsure of its utility in my case.
Windows Sandbox is a Microsoft tool for Windows only, and isn’t available on some versions anyway. Moreover, there are big performance issues with using it. The app runs in real time, with frame rate requirements similar to a video game.
I considered compiling to WebAssembly, but at the moment it seems too immature (no SIMD, hard to debug, and potentially vulnerable to hacks in the wrapping host or browser).
I thought there might be some kind of wrapper library already out there to intercept all OS calls and allow custom configuration of what calls get passed through (in my case, anything except what’s needed for the inter-process communication would be denied)
Any other ideas, architectural suggestions, or promising open source projects on the horizon for this ?
Thanks,
C

Compiling untrusted source code and linking it into your app sounds really unsafe. If I understand your problem correctly, you need to "provide a safe runtime environment for single-threaded user code with access only to your API". In that case, in my opinion, it's better to use a runtime interpreter instead. It will give you more control and sandboxing capabilities, safe API calls, and handling of exceptions in user code.
If you have doubts about interpreter performance, it's a good trade-off for safety, flexibility and control. The vast majority of interpreters compile source code to bytecode and run really fast. You can also get better performance by providing a fast API to the script.
In my Java enterprise projects I use the built-in Rhino JavaScript interpreter to run user scripts and provide an API, which gives me flexibility, the required performance and control. These scripts can call nothing but my API. It's safe, flexible and fully controllable.
I found these C/C++ (C like syntax) interpreter libraries:
JavaScript (ECMA)
https://v8.dev/
Lua
http://acamara.es/blog/2012/08/running-a-lua-5-2-script-from-c/
C++ interpreter
https://github.com/root-project/cling
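To illustrate why an interpreter gives this kind of control, here is a minimal sketch of a stack-based bytecode loop in which a script can reach the host only through a table of registered functions. The instruction set, the `Instr` layout and the `run` function are hypothetical and invented for this example; they are not how V8, Lua or cling work internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical instruction set, for illustration only.
enum class Op : std::uint8_t { Push, Add, Mul, CallHost, Halt };

struct Instr {
    Op op;
    double arg;  // literal for Push, host-function index for CallHost
};

// The only functions a script can reach are the ones the host registers here.
using HostFn = std::function<double(double)>;

// Executes the program and returns the value on top of the stack at Halt.
// Every malformed program ends in an exception, never in undefined behaviour.
double run(const std::vector<Instr>& code, const std::vector<HostFn>& host_api) {
    std::vector<double> stack;
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("stack underflow");
        double v = stack.back();
        stack.pop_back();
        return v;
    };
    for (std::size_t pc = 0; pc < code.size(); ++pc) {
        const Instr& in = code[pc];
        switch (in.op) {
            case Op::Push:
                stack.push_back(in.arg);
                break;
            case Op::Add: {
                double b = pop();
                double a = pop();
                stack.push_back(a + b);
                break;
            }
            case Op::Mul: {
                double b = pop();
                double a = pop();
                stack.push_back(a * b);
                break;
            }
            case Op::CallHost: {
                // Reject indices outside the registered API table.
                if (in.arg < 0 || in.arg >= static_cast<double>(host_api.size()))
                    throw std::runtime_error("unknown host function");
                double x = pop();
                stack.push_back(host_api[static_cast<std::size_t>(in.arg)](x));
                break;
            }
            case Op::Halt:
                return pop();
        }
    }
    throw std::runtime_error("program ended without Halt");
}
```

A script here is just data, so the worst it can do is trigger one of the exceptions above or call a function the host chose to expose. A real embedding would also need limits on memory and running time.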

Related

What is the best way to make use of native code in nodejs?

I'm developing an application using nodejs/electron which has to manipulate a huge amount of data, and have written the performance-critical part of it in c++. Now that the most parts of the application are done, I'm still not sure of the best way to call the c++ code. It seems that the most natural way is to compile it as a shared library and then call it from nodejs, but for me that process seems to be so complicated, so I thought about making the c++ code a standalone executable listening on a port number, and then sending tcp requests to it from my nodejs app.
The question is then: Would that affect performance? And is that a bad design choice?
Thanks

You have several options:
Put your code into its own app and work out a way to pass in what you want it to process (via cmdline arguments, file, stdio, networking, etc...) and then run it from node.js via the child_process module.
Put your code into a nodejs add-on, using the add-on SDK so you can directly call it from within nodejs.
Compile your code to WebAssembly and load/run that WebAssembly module directly from within nodejs. If you haven't heard of WebAssembly before, it's a fairly new capability in Javascript engines. A language like Rust, C or C++ compiles its code to a WebAssembly target. It's kind of like a low-level, generic (non-CPU-specific) assembly language that Javascript engines can run directly and that typed languages such as Rust, C or C++ can compile to, just as they would compile to native machine language for a particular target. The JS engine then takes that WebAssembly and compiles it on the fly to the local machine language while adding a few memory access safeguards.
so I thought about making the c++ code a standalone executable listening on a port number, and then sending tcp requests to it from my nodejs app.
That would work just fine.
The question is then: Would that affect performance? And is that a bad design choice?
It's kind of hard to know what your performance target is. You probably wouldn't want some operation that you call thousands of times in rapid fashion to be in a separate server, but if you're calling it less often than that, then it could be just fine.
The nodejs add-on SDK allows you a fairly high bandwidth interface to Javascript as it's the same way that a lot of the nodejs built-in libraries are implemented. It is more work to learn how it works because if you're going to play in-process within node.js and play with garbage collection, you have to do a lot of things a certain way (particularly anything that deals with memory or passing data to/from Javascript). But, it's ultimately the tightest connection to your nodejs Javascript.

Communicate with CoDeSys program on a Linux-based WAGO PFC200 PLC

I'm currently getting familiar with PLC's, the WAGO 750-8206 PLC in particular. It offers a linux OS and can run CoDeSys programs. There are some I/O modules attached to the controller: 750-530, 750-430 and 750-600. What I would like to know is this:
Is it possible to write a C++ linux application that runs on the PLC and gets/sets the digital inputs and outputs?
Even better: can I write a CoDeSys program that "talks to the I/O's" and handles all the logic and at the same time can be accessed by a C++ linux program? The idea is this: I would like the CoDeSys program to check, let's say, two digital inputs. If both are high, a variable should be set to a defined value. The linux application should be able to read that variable and conduct further processing (such as sending JSON data to a server or similar).
Also, I would need to be able to send commands from the linux application to the CoDeSys program in order to switch digital outputs (or set values on analog outputs etc) when the linux application receives a message that triggers the command.
Any thoughts and comments on this topic are greatly appreciated as I am completely new to this topic. Thanks in advance!

The answer you might want
The actual situation has changed into the opposite of the previous answer.
WAGO's recent Board Support Packages and documentation actively support you in making changes and extensions to the PFC200 line. Specifically the WAGO 750-8206 and 17 (as of March 2016) other PLCs:
wago.us -> Products -> Components for Automation -> Modular WAGO-I/O-SYSTEM, IP 20 (750/753 Series)
What you have to do is get in touch with them and ask for their latest Board Support Package (BSP) for the PFC200 line.
I quote the synopsis of the previous answer and give my updated answers alongside the old ones.
Synopsis
Could you hack a PFC200 and get custom binaries executed? Previously: probably. Now: absolutely yes. As long as the program is content to run on the Linux-3.6.11 kernel and glibc-2.16 and is compiled for the "armhf" ABI, any existing ARM application will just run, provided you also copy the libraries it uses, without even being compiled specifically for the PFC200.
Would it be easy or quick? Previously: no. Now: yes, if you have no fear of the Linux command line. It is as easy as using the cross compiler provided by the Board Support Package (BSP) with the provided C libraries, and then running these commands to transfer your program to the PFC's flash and run it:
scp your-program root@PFC200:/usr/bin
ssh root@PFC200 /usr/bin/your-program
Of course, you can also use Eclipse CDT with the cross toolchain for the PFC200 and configure Eclipse to do remote run and debug.
Will this change in the future? Previously: maybe; remember that the PFC200 is fairly new in North America. Now: it has; the PFC200 appeared in September 2014.
The public HOWTO Building FORTE for Wago describes how to use the initial BSP to run FORTE, which is the IEC 61499 run-time environment of 4DIAC (link: sf.net/projects/fordiac ), an open source PLC environment allowing to implement industrial control solutions in a vendor neutral way. 4DIAC implements IEC 61499 extending IEC 61131-3 with better support for controller to controller communication and dynamic reconfiguration.
In case you want to access the KBUS (which talks to the I/Os) directly, you have to know that currently only one application can be in charge of KBUS.
So either CODESYS, or FORTE, or your own KBUS application can be in charge of the KBUS.
The BSP from 2015 has many examples and demos showing how to use all the I/O of the PFC200 (KBUS, CAN, MODBUS, PROFIBUS, as well as the switches and LEDs on the PFC200 directly). Sources for the kernel, all kernel drivers and the other open source components are provided and compiled in the Board Support Package (BSP).
However, the sources for the libraries and tools that WAGO developed from scratch, and that are not based on GPL/open source code, are not provided. These include the Application Device Interface (ADI)/Device Abstraction Layer (DAL) libraries, which handle CANopen, PROFIBUS-Slave and KBus (which is used by all PLC I/O modules connected to the main PLC unit).
CANopen uses the standard Linux SocketCAN API to talk to the kernel, so you could just write a normal SocketCAN program using the provided libsocketcan. The KBus API, however, is a WAGO-specific invention, and you would have to do some reverse engineering if you did not want to use WAGO's DAL to access the electrical I/O of the PLC. That said, the DAL is documented and examples of how to use it are provided in the BSP.
If you use CODESYS, however, there is a "codesys_lib_demo-0.1" example library which shows how to provide a library for CODESYS to use.

Outdated Answer
This answer was very specific to circumstances in 2014 and 2015. As of 2016, it contains incorrect information. Still going to leave as-is for now to provide background.
The quick answer you probably don't want
You could very reasonably write code using Codesys that put together a JSON packet and sent it off to a server elsewhere. JSON is just text, and Codesys can manipulate text in a fashion very similar to C. And there are many ethernet protocols available from within Codesys using addon libraries provided by Wago.
Now the long Answers
First some background
Since you seem to be new to Wago and the philosophy of Codesys in general... a short history.
Codesys is used to build and deploy Hard Realtime execution environments, and it is important to understand that utilizing libraries without fully understanding the consequences can destabilize performance of the entire system (bringing Codesys to its knees and throwing watchdog errors in the program). Remember, many PLC's are controlling equipment that could kill someone if it ever crashed.
Wago is fond of using Linux to provide the preemptive RT kernel for the low level task scheduling and then configuring Codesys to utilize much of the standard C-libraries that often accompany linux. Wago has been doing this for quite some time, but they would never allow you to peel back the covers without going through Codesys (which means using IEC 61131 languages, of which C++ is not included), and this was for your own safety (and their product image). If you wanted the power of linux on a Wago, you had to get a special PLC with a completely naked OS, practically no manual or support, and forfeit the entire Codesys runtime.
The new PFC200's have much more RAM and memory available than recent models, allowing for more of the standard linux userland stack (ssh, ftp, http,...) to be included without compromising the Codesys runtime, and they advertise this. BUT... they are still keeping a lid on compilation tools and required header files needed to compile and link to Codesys libraries or access specialized hardware (the Wago KBUS, which interfaces your I/O modules).
The Synopsis
Could you hack a PFC200 and get custom binaries executed? Probably yes.
Would it be easy or quick? No.
Will this change in the future? Maybe. Remember that PFC200 is fairly new in North America.
Things you may not know
Codesys does not necessarily know or care about Wago. You can get Target Platforms for Codesys that target Intel processors running a linux OS. Codesys DOES SUPPORT accessing external libraries (communication in the reverse direction is dangerous), but it often expects a C-style interface, and you can only access those libraries by defining C headers that Codesys will analyze, so you may need to do some magic to get C++ working seamlessly. What you can do is create a segment of shared memory that both C++ and Codesys access, and use that to pass information (synchronization is another problem).
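The shared-memory idea can be sketched on the C++ side roughly as follows. This is only an illustration under several assumptions: the `SharedBlock` layout, the function names and the use of a file-backed mapping are all hypothetical, how the Codesys side would attach to the same region is not shown, and a single atomic word sidesteps the synchronization problem only for a payload this small.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical layout; both sides must agree on it byte for byte.
// Assumption: a zero-filled mapping is a valid initial state for the atomic,
// which holds on mainstream platforms but is not guaranteed by the standard.
struct SharedBlock {
    std::atomic<std::int32_t> inputs_state;  // written by one side, read by the other
};

// Maps (and creates if needed) a file-backed shared region.
// Returns nullptr on failure.
SharedBlock* map_shared(const char* path) {
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return nullptr;
    if (ftruncate(fd, static_cast<off_t>(sizeof(SharedBlock))) != 0) {
        close(fd);
        return nullptr;
    }
    void* p = mmap(nullptr, sizeof(SharedBlock), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) return nullptr;
    return static_cast<SharedBlock*>(p);
}

void unmap_shared(SharedBlock* block) {
    if (block != nullptr) munmap(block, sizeof(SharedBlock));
}

// Writer side: publish the latest state.
void publish(SharedBlock* block, std::int32_t value) {
    assert(block != nullptr);
    block->inputs_state.store(value, std::memory_order_release);
}

// Reader side: fetch the most recently published state.
std::int32_t read_state(const SharedBlock* block) {
    assert(block != nullptr);
    return block->inputs_state.load(std::memory_order_acquire);
}
```

Anything larger than one machine word would need a real protocol on top, for example a sequence counter or a cross-process lock, so that a reader never sees a half-written update.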
You can get an Open Wago PLC, running Codesys on Linux. Wago's IPC are made specifically for this purpose. They have more power, memory, and communication capabilities in general; but they do cost more than double your typical Wago PLC.
If you feel like toying with the idea of hacking a Wago, you will need to tear apart the manuals for Codesys (it has its own), the manuals for the Wago IPC's, and already be familiar with linux style inter-process-communication and/or dynamic libraries.
Also, there is an older Wago PLC that had the naked Linux on it 750-8??. It also has a very good manual on how to access the Wago hardware using supplied headers.
You must first understand how Codesys expects to talk to its target operating system. Then you work backwards to make it talk to Wago specific libraries living on that operating system. You must be careful not to hijack Codesys.
Your extra C++ libs should assist Codesys, not take it over. For instance, host a sqlite database on the same device, and use C++ to manage the database and provide a very simple interface that Codesys can utilize. All Codesys would do is call a function and pass some values, but your C++ would actually build an SQL query and issue it to the database (Codesys doesn't need to know why or how this is happening).
I hope at least one paragraph is helpful in some way.

Innerprocess communication between independent DLLs

I'm developing and maintaining a set of DLLs that are used as plugins for a host application. The host application has a plugin API which my plugins implement.
The host application is developed by another company and I have no control over how the plugins are used: host application might load/unload any of the plugins at any time and in any order. A plugin can run in any thread and also might be called from different threads.
I need a way for these plugins to share a common resource. This resource should be initialized by the first plugin that is loaded and uninitialized by the last plugin that is unloaded. First and last might be different plugins. Thread safety is an important issue.
You can think of this as a singleton that is shared between all the currently loaded plugins.
A possible solution could be that all my plugins share a common DLL that initializes the singleton when it is loaded and destroys it when it is unloaded.
However, I would like to keep my plugins self-contained if at all possible, to ease deployment on users' machines.
Because the host application is cross-platform, the solution should be cross-platform and work in the same way on Windows, Mac OS and Linux (if at all possible). To that effect I looked at boost but was overwhelmed by the number of classes and options in the boost inter-process code.
I am not asking for a complete coded solution, but rather for advice about the best way to approach this issue.
More information and answers to questions:
The issue here is that I cannot expect any help from the host application, so it does not really matter what it is. There are actually a few applications that use the plugins and so I cannot rely on any specific features of any single application.
I can say that the host application is a normal desktop application, e.g. a plain old .exe on Windows, .app on Mac OS. No iOS or Android apps.
The plugin interface is a set of functions the host can call. The API is one way: the host can call the plugin but the plugin cannot call the host. Each plugin has an initialization function that the host must call once upon loading and an uninitialization function that the host must call once before unloading the DLL.
Plugins are implemented in C++, but not C++11. Compilers are VisualStudio 2005 on Windows and Xcode 3.2 with gcc 4.2.1 on Mac.
That said, I would like to again emphasize that I'm looking for a general design for approaching the issue not for specific code.
Thanks for any help!

Remember that every program that uses your DLL has its own address space, so the copies cannot interact through normal memory (as opposed to special OS-supplied shared memory). The best way to let the different processes share the resource is for your DLL to launch a separate process that contains the shared resource. You will then need to implement some sort of (local) socket API that allows data to be shared.

You could use Qt (actually QtCore), Gtk (actually its GLib, perhaps through Gtkmm, which is C++ glue on top of them), Poco, or perhaps the Apache Portable Runtime.
All of them are free software, cross-platform frameworks with powerful IPC and multi-threading (and plugin) abilities.
We cannot help more unless you tell much more about your (third party) host application, its plugin interface, and your own plugins. Perhaps the host application does already provide some portable ways to do inter-process communication, or thread-safe singletons... (this is why you should tell us more about that host application; it probably uses, or at least provides, some cross-platform library or API like the ones I listed).
Perhaps using C++11 might help. I guess you want some singleton pattern.
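The "first plugin in, last plugin out" lifetime from the question can be sketched as a reference-counted singleton. This is a minimal sketch under two loud assumptions: it uses C++11 `std::mutex`, which the compilers named in the question do not provide (a platform mutex would be needed there), and it only works if this code lives in one common library loaded once per process. If it were compiled separately into each self-contained plugin, every plugin would get its own counter and its own resource. `SharedResource`, `acquire_shared`, `release_shared` and `shared_user_count` are hypothetical names.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical shared resource; stands in for whatever the plugins need to share.
struct SharedResource {
    int open_handles = 0;
};

namespace {
std::mutex g_mutex;                    // guards the two variables below
int g_users = 0;                       // number of plugins currently initialized
SharedResource* g_resource = nullptr;  // non-null exactly while g_users > 0
}  // namespace

// Call from each plugin's initialization function.
// The first caller creates the resource; later callers share it.
SharedResource* acquire_shared() {
    std::lock_guard<std::mutex> lock(g_mutex);
    if (g_users == 0) g_resource = new SharedResource();
    ++g_users;
    return g_resource;
}

// Call from each plugin's uninitialization function.
// The last caller destroys the resource; unmatched calls are ignored.
void release_shared() {
    std::lock_guard<std::mutex> lock(g_mutex);
    if (g_users == 0) return;
    if (--g_users == 0) {
        delete g_resource;
        g_resource = nullptr;
    }
}

// For diagnostics: how many plugins currently hold the resource.
int shared_user_count() {
    std::lock_guard<std::mutex> lock(g_mutex);
    assert(g_users >= 0);
    return g_users;
}
```

The mutex makes acquisition and release safe when the host loads and unloads plugins from different threads; access to the resource itself still needs its own synchronization.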

Advantages of implementing a front-end over linking against a library

I want to write a C++ program that plays MP3. Among available MP3 decoding libraries, I chose mpg123.
I noticed that, besides being able to link against libmpg123 and make the necessary function calls in my code, the library includes a back-end/front-end interface that lets me communicate with its executable, so I don't have to include its code in my program.
What are the advantages of writing a front-end rather than simply linking against the library?

Most of the advantages come from process separation between your executable and the library executable:
Increased safety & security: if the library is crashing, this will not crash your application.
Implicit multi-processing: since both are running on separate processes, this is almost for free.
Predisposition to networking: if communication between processes is done with pipes or stdin/stdout, you can easily forward them to sockets and run your executable on a separate machine.
Language neutral: you can use whatever programming language you want.
Of course, there is a performance penalty by using an external communication channel. But the benefits of having such decoupling can be quite impressive.
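As a rough illustration of the stdin/stdout style of decoupling, the sketch below shows a backend loop that reads one request per line and writes one reply per line. The command names and the `OK`/`ERR` reply format are invented for this example and are not mpg123's actual remote-control protocol. Because the loop is written against generic streams, the same function can be driven by pipes, by a socket wrapper, or by string streams in a test.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical line protocol: each request is "<command> <integer>",
// each reply is "OK <result>" or "ERR <reason>".
void serve(std::istream& in, std::ostream& out) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream request(line);
        std::string command;
        long value = 0;
        if (!(request >> command >> value)) {
            out << "ERR malformed request\n";
        } else if (command == "double") {
            out << "OK " << value * 2 << '\n';
        } else if (command == "square") {
            out << "OK " << value * value << '\n';
        } else {
            out << "ERR unknown command\n";
        }
        out.flush();  // the front-end is waiting for this reply
    }
}
```

A standalone backend would simply call `serve(std::cin, std::cout)` from its `main`, and the front-end would write requests to the child's stdin and read replies from its stdout.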

You can upgrade the backend without recompiling your program.
If the backend crashes, it probably doesn't take your program with it.

As far as I can see, the only use of the executable would be for testing purposes. You would run this third-party lib as an executable to understand the behavior of the various APIs it offers, so you can better understand how to use them from your code and see how they work with various inputs. After that you would link it into your process so that the library calls are inside your process's address space. If you just run the two executables concurrently you would also have the IPC overhead.

Testing framework for functional/system testing for C/C++?

For C++, there are lots of good unit test frameworks out there, but I couldn't find a good one for functional testing. With functional testing, I mean stuff which touches the disk, requires the whole application to be in place etc.
Case in point: What framework helps with testing things like whether your I/O works? I've got a hand-rolled system in place, which creates temporary folders, copies around a bunch of data, so the tests are always done in the same environment, but before I spend more time on my custom framework -- is there a good one out there already?

I wrote one from scratch three times already - twice for testing C++ apps that talked to exchanges using FIX protocol, once for a GUI app.
The problem is, you need to emulate the outside world to do proper system testing. I don't mean "outside of your code" - outside of your application. This involves emulating end users, outside entities, the Internet and so on.
I usually use perl to write my system testing framework and tests, mostly because it's good with accessing all sorts of OS facilities and regexps are first-class citizens.
Some tips: make sure your logs are easy to parse, detailed but not too verbose. Have a sane default configuration. Make it easy to "reset" the application - you need to do it after each test.
The approach I usually use is to have some sort of "adapter" that turns the application's communications with the outside world into stdin/stdout of some executable. Then I build a perl framework on top of that, and then the test cases use the framework.
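For the tip about making the application easy to reset after each test, one option on the C++ side is a per-test scratch directory that is created empty and deleted automatically. This is a minimal sketch assuming a C++17 compiler with `std::filesystem`; the `ScratchDir` class and its method names are hypothetical and not part of any existing framework.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Creates an empty directory under the system temp directory for one test
// and removes it, with everything inside, when the object goes out of scope.
class ScratchDir {
public:
    explicit ScratchDir(const std::string& name) {
        // An empty name would resolve to the temp directory itself.
        if (name.empty()) throw std::invalid_argument("scratch name must not be empty");
        path_ = fs::temp_directory_path() / name;
        fs::remove_all(path_);  // discard leftovers from a crashed earlier run
        fs::create_directories(path_);
    }

    ~ScratchDir() {
        std::error_code ec;  // never throw from a destructor
        fs::remove_all(path_, ec);
    }

    ScratchDir(const ScratchDir&) = delete;
    ScratchDir& operator=(const ScratchDir&) = delete;

    // Copies the contents of a fixture directory into the scratch directory.
    void seed_from(const fs::path& fixture_dir) {
        fs::copy(fixture_dir, path_, fs::copy_options::recursive);
    }

    const fs::path& path() const { return path_; }

private:
    fs::path path_;
};
```

Each test constructs a `ScratchDir`, optionally seeds it from a fixture tree, points the application at `path()`, and gets a clean slate again as soon as the object is destroyed.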

Below I list a couple of tools and larger testing applications of which I am aware. If you provide more information on your platform (OS, etc.) we can probably provide better answers.
For part of what you require, Microsoft provides the Application Verifier:
Application Verifier (AppVerifier) is a runtime verification tool used in testing applications for compatibility with Microsoft Windows XP. This tool can be used to test for a wide variety of known compatibility issues while the application is running. This article details the steps for using AppVerifier as an effective addition to the application development and testing cycles.
Application Verifier can be useful for testing out low memory conditions, other low resources, and other API usage.
Another part of the puzzle is the Microsoft Detours package, which can be used to replace API calls with your own code (useful for, say, returning error codes for tests that are hard to set up).
Detours is a library for instrumenting arbitrary Win32 functions on x86, x64, and IA64 machines. Detours intercepts Win32 functions by re-writing the in-memory code for target functions. The Detours package also contains utilities to attach arbitrary DLLs and data segments (called payloads) to any Win32 binary.
There are other, larger (and more expensive) comprehensive packages available too. Borland makes Silk.
Automated Software makes TestComplete. The selection of one of these tools would be up to your needs for your applications.
IBM/Rational provides the Rational Functional Tester, which is available across many platforms, and feature-rich.

Hi, I am not sure if the framework we have helps in your situation, but it hooks into Rational Functional Tester and allows the user to create various datasets to be attached to different tests and to change the environments without changing the scripting, and it reuses the automation in an efficient way.
Have a look if you're interested:
http://www.testpro.com.au/Test-Automation-Framework.html
