I know this may be impossible but I really hope there's a way to pull it off. Please tell me if there's any way.
I want to write a sandbox application in C++ and allow other developers to write native plugins that can be loaded right into the application on the fly. I'd probably want to do this via DLLs on Windows, but I also want to support Linux and hopefully Mac.
My issue is that I want to be able to prevent the plugins from doing I/O access on their own. I want to require them to use my wrapped routines so that I can ensure none of the plugins contain malicious code that harms the user's files on disk or does undesirable things on the network.
My best guess on how to pull off something like this would be to include a compiler with the application and require the source code for the plugins to be distributed and compiled right on the end-user platform. Then I'd need a code scanner that could search the plugin's uncompiled code for signatures that would show up in I/O operations for hard disk, network, or other storage media.
My understanding is that standard library facilities like fstream wrap platform-specific functions, so I would think that simply scanning all the code that will be compiled for platform-specific functions would let me accomplish the task. Because ultimately, native C or C++ code can't do any I/O unless it talks to the OS using one of the OS's provided methods, right?
If my line of thinking is correct on this, does anyone have a book or resource recommendation on where I could find the nuts and bolts of this stuff for Windows, Linux, and Mac?
If my line of thinking is incorrect and it's impossible for me to really prevent native code (compiled or uncompiled) from doing I/O operations on its own, please tell me so I don't create an application that I think is secure but really isn't.
In an absolutely ideal world, I don't want to require the plugins to distribute uncompiled code. I'd like to allow the developers to compile and keep their code to themselves. Perhaps I could scan the binaries for signatures that pertain to I/O access?
Sandboxing a program executing code is certainly harder than merely scanning the code for specific accesses! For example, the program could synthesize assembler statements doing system calls.
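To make that point concrete, here is a minimal sketch, assuming Linux on x86-64 with GCC or Clang (the name raw_getpid is made up for illustration). It issues a system call through inline assembly, so no I/O-related function name ever appears in the source or in the import table. A harmless call is used on purpose; the same mechanism would reach open, write, or socket.

```cpp
#include <sys/syscall.h>  // SYS_getpid (the system call number)
#include <unistd.h>       // getpid(), used only by the fallback path

// Hypothetical helper: asks the kernel for the process ID without going
// through any library wrapper that a name-based scanner could spot.
long raw_getpid() {
#if defined(__linux__) && defined(__x86_64__)
    long result;
    // x86-64 Linux convention: call number in rax, result returned in rax;
    // the syscall instruction itself clobbers rcx and r11.
    asm volatile("syscall"
                 : "=a"(result)
                 : "a"(static_cast<long>(SYS_getpid))
                 : "rcx", "r11", "memory");
    return result;
#else
    // Other platforms: use the normal wrapper so the sketch still builds.
    return static_cast<long>(getpid());
#endif
}
```

Scanning for the syscall instruction itself does not settle the matter either, since a plugin could generate those bytes at run time.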
The original approach on UNIXes is to chroot() the program, but I think there are problems with that approach, too. Another approach is a secured environment like SELinux, possibly combined with chroot(). The modern approach seems to be to run the program in a virtual machine: when the program starts, fire up a suitable snapshot of a VM, and upon termination just rewind to the snapshot. That merely requires that the allowed accesses are somehow channeled somewhere.
Even a VM doesn't block I/O. It can block network traffic very easily though.
If you want to make sure the plugin doesn't do I/O, you can scan its DLL for all of its imported functions and run the function list against a blacklist of I/O functions.
Windows has the dumpbin util and Linux has nm. Both can be run via a system() function call and the output of the tools be directed to files.
Of course, you can write your own analyzer but it's much harder.
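A rough sketch of the blacklist check on the Linux side, assuming the plugin's imports were captured beforehand with something like `nm -D --undefined-only plugin.so` and that the output has the usual "U name" layout. The function name and the blacklist contents are made up for illustration.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: given the text printed by nm for undefined (imported)
// symbols, return every symbol that also appears in the blacklist.
std::vector<std::string> find_blacklisted_imports(
        const std::string& nm_output,
        const std::set<std::string>& blacklist) {
    std::vector<std::string> hits;
    std::istringstream lines(nm_output);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string type, symbol;
        // Assumed layout for undefined symbols: no address, then "U name".
        if (!(fields >> type >> symbol) || type != "U") continue;
        // Drop a symbol-version suffix such as "@GLIBC_2.2.5".
        symbol = symbol.substr(0, symbol.find('@'));
        if (blacklist.count(symbol)) hits.push_back(symbol);
    }
    return hits;
}
```

An empty result only shows that the plugin does not import those names directly. Functions resolved at run time (for example through dlsym) never show up in this list.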
User code can't do I/O on its own; only the kernel can. If you're worried about the plugin gaining ring 0/kernel privileges, then you need to scan the assembly of the DLL for I/O instructions.
I am working on a huge program that employs a (custom built) micro-threading solution. It sometimes happens that I need to debug a crash. During such times, it is useful to be able to switch from one micro-thread to another.
If I'm doing live debugging, I can replace all of the registers with those that came from the micro-thread context. I have written a macro to do just that, and it works really well.
The problem is that I cannot change the register values if I am doing post-mortem debugging (from a core file). In such a case, I have no way to tell GDB to change its concept of what the current frame is, as all registers are considered read-only in that case.
Is there a way to tell GDB about my custom context management?
Shachar
There's not a simple, built-in way to do this in gdb.
I think probably the simplest way would be to write a version of gdbserver that can read your core files and that presents your micro-threads to gdb as real threads. There's been at least one gdbserver out there that can read core files already, so maybe it isn't crazily hard. However, I couldn't really say for sure.
What are some advantages of a GUI debugger like the one in Eclipse, and what are some advantages of using a command line debugger such as gdb? Does industry use command line debuggers? If so, in what situations do people use them?
I usually use gdb, but some advantages I can think of off the top of my head:
Being command line, debugging binaries on remote systems is as easy as opening an ssh connection.
Great scripting support, and the ability to run many commands per breakpoint (see the commands keyword; the command list can end with continue to resume automatically)
Much shorter start-up time and a faster development cycle.
Copy-and-pasteable commands and definable functions that let you repeat common commands more easily
gdb also speaks a well-defined protocol, so you can debug code running on lots of obscure hardware and kernels.
Typing short commands is shorter and more efficient in the long run than working around a GUI (in my opinion).
However, if you're next to a system or runtime you've never used before, using a visual debugger can be easier to get started from the get-go. Also, having your debugger be tightly integrated with your IDE (if you use one) can be a big boost in productivity.
Visual debuggers and command line ones don't have to be completely separate: there are visual front ends for gdb, such as DDD. (I don't use DDD, however, since it feels ultra kludgy and outdated. It does exist, though. Xcode also wraps gdb for debugging support.)
A command line debugger is good for debugging a remote system (especially when the connection is slow). It is also useful for low performance systems or systems without an X server or graphics card. CLI debuggers are also used for quick analysis of core dumps and SIGSEGVs (they are faster to start). Command-line debuggers are more portable: they are installed on almost every system (or they can be easily installed, or even started from a network or flash drive).
I think that command-line debuggers can be used for programs without source, and that graphical debuggers are better for projects with complex data structures/classes.
Another point is that command-line debuggers are easier to automate, e.g. I have a shell script which does full call graph logging of a program using gdb. It would be very hard to automate a graphical debugger in the same way.
It's essentially impossible to compare meaningfully based on the debugger's display. People who like command lines are likely to use text mode, command-driven debuggers. People who like GUIs are likely to use graphical, menu-driven debuggers.
Nearly the only time there's a really strong technical motivation toward one or the other is if you're debugging a windowing system. For example, using a debugger that depends on having a functional X Server doesn't work very well if what you're trying to debug is the X Server itself.
I'm thinking about adding code to my application that would gather diagnostic information for later examination. Is there any C++ library created for such purpose? What I'm trying to do is similar to profiling, but it's not the same, because gathered data will be used more for debugging than profiling.
EDIT:
Platform: Linux
Diagnostic information to gather: information resulting from application logic, various asserts and statistics.
You might also want to check out libcwd:
Libcwd is a thread-safe, full-featured debugging support library for C++
developers. It includes ostream-based debug output with custom debug
channels and devices, powerful memory allocation debugging support, as well
as run-time support for printing source file:line number information
and demangled type names.
List of features
Tutorial
Quick Reference
Reference Manual
Also, another interesting logging library is pantheios:
Pantheios is an Open Source C/C++ Logging API library, offering an
optimal combination of 100% type-safety, efficiency, genericity
and extensibility. It is simple to use and extend, highly-portable (platform
and compiler-independent) and, best of all, it upholds the C tradition of you
only pay for what you use.
I tend to use logging for this purpose. Log4cxx works like a charm.
If debugging is what you're doing, perhaps use a debugger. GDB scripts are pretty easy to write up and use. Maintaining them in parallel to your code might be challenging.
Edit - Appending Anecdote:
The software I maintain includes a home-grown instrumentation system. Macros are used to queue log messages, and configuration options control what classes of messages are logged and the level of detail to be logged. A thread processes the logging queue, flushing messages to file and rotating files as they become too large (which they commonly do). The system provides a lot of detail, but all too often it produces huge files that our support engineers must wade through for hours to find anything useful.
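For readers who have not seen such a system, here is a minimal sketch of the general shape. It is not the actual code described above, and every name in it (LogQueue, APP_LOG, the LOG_* classes) is invented for illustration. File rotation is left out, and drain() is what a dedicated logging thread would call in a loop.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Illustrative message classes; a real system would define its own.
enum LogClass : unsigned { LOG_NET = 1u << 0, LOG_DB = 1u << 1, LOG_UI = 1u << 2 };

class LogQueue {
public:
    // Configuration: which classes are enabled and the maximum detail level.
    void configure(unsigned enabled_classes, int max_level) {
        std::lock_guard<std::mutex> lock(mutex_);
        enabled_ = enabled_classes;
        max_level_ = max_level;
    }
    bool wants(LogClass cls, int level) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return (enabled_ & cls) != 0 && level <= max_level_;
    }
    void push(std::string message) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(message));
    }
    // Writes out and clears everything queued so far; returns the count.
    std::size_t drain(std::ostream& out) {
        std::deque<std::string> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            batch.swap(pending_);
        }
        for (const std::string& m : batch) out << m << '\n';
        return batch.size();
    }
private:
    mutable std::mutex mutex_;
    unsigned enabled_ = 0;
    int max_level_ = 0;
    std::deque<std::string> pending_;
};

// The message expression is only evaluated when its class and level are enabled.
#define APP_LOG(queue, cls, level, expr)                 \
    do {                                                 \
        if ((queue).wants((cls), (level))) {             \
            std::ostringstream app_log_stream_;          \
            app_log_stream_ << #cls << ": " << expr;     \
            (queue).push(app_log_stream_.str());         \
        }                                                \
    } while (0)
```

A call such as APP_LOG(queue, LOG_NET, 1, "connected to " << host) costs little more than the wants() check when that class is switched off, which is the main reason for routing it through a macro.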
Now, I've only used GDB to diagnose bugs a few times, but for those issues it had a few nice advantages over the logging system. GDB scripting allowed me to gather new instrumentation data without adding new instrumentation lines and deploying a new build of my software to the client. GDB can generate messages from third-party libraries (needed to debug into openssl at one point). GDB adds no run-time impact to the software when not in use. GDB does a pretty good job of printing the contents of objects; the code-level logging system requires new macros to be written when new objects need to have their states logged.
One of the drawbacks was that the gdb scripts I generated had no explicit relationship to the source code; the source file and the gdb script were developed independently. Ideally, changes to the source file should impact and update the gdb script. One thought is to put specially-formatted comments in code and have a scripting language make a pass on the source files to generate the debugger script file for the source file. Finally, have the makefile execute this script during the build cycle.
It's a fun exercise to think about the potential of using GDB for this purpose, but I must admit that there are probably better code-level solutions out there.
If you run your application on Linux, you can use "ulimit -c unlimited" to have a core file generated when your application crashes (or hits assert(false), or receives kill -6). Later, you can debug it with gdb (gdb -c core_file binary_file) and analyze the stack.
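If changing the shell environment is inconvenient, the application can raise its own limit at startup. A minimal sketch using the POSIX getrlimit/setrlimit calls follows; the function name enable_core_dumps is made up, and whether a core file is actually written still depends on system settings outside the program.

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_CORE

// Hypothetical helper: raise the soft core-file size limit up to the hard
// limit, roughly what "ulimit -c" does from the shell for this process.
bool enable_core_dumps() {
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0) return false;
    limit.rlim_cur = limit.rlim_max;  // the soft limit may go up to the hard limit
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Calling it early in main() means later crashes in that process are eligible for a core file without the user having to touch ulimit.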
Salu2.
P.S. For profiling, use gprof.
We are producing portable code (Windows + macOS) and we are looking at how to make the code more robust, as it crashes every so often (usually because of overflows or bad initializations) :-(
I was reading that Google Chrome uses a process for every tab, so if something goes wrong the program does not crash completely, only that tab. I think that is quite neat, so I might give it a go!
So I was wondering if someone has some tips, help, a reading list, comments, or anything else that can help me build more robust C++ code (portable is always better).
On the same topic, I was also wondering if there is a portable library for processes (like Boost)?
Many thanks.
I've developed on numerous multi-platform C++ apps (the largest being 1.5M lines of code and running on 7 platforms -- AIX, HP-UX PA-RISC, HP-UX Itanium, Solaris, Linux, Windows, OS X). You actually have two entirely different issues in your post.
Instability. Your code is not stable. Fix it.
Use unit tests to find logic problems before they kill you.
Use debuggers to find out what's causing the crashes if it's not obvious.
Use boost and similar libraries. In particular, the pointer types will help you avoid memory leaks.
Cross-platform coding.
Again, use libraries that are designed for this when possible. Particularly for any GUI bits.
Use standards (e.g. ANSI vs gcc/MSVC, POSIX threads vs Unix-specific thread models, etc) as much as possible, even if it requires a bit more work. Minimizing your platform specific code means less overall work, and fewer APIs to learn.
Isolate, isolate, isolate. Avoid in-line #ifdefs for different platforms as much as possible. Instead, stick platform specific code into its own header/source/class and use your build system and #includes to get the right code. This helps keep the code clean and readable.
Use the C99 integer types (int32_t, uint64_t, etc.) if at all possible instead of "long", "int", "short", etc -- otherwise it will bite you when you move from a 32-bit platform to a 64-bit one and longs suddenly change from 4 bytes to 8 bytes (as they do on 64-bit Linux and OS X, though not on 64-bit Windows). And if such a value is ever written to the network/disk/etc then you'll run into incompatibility between platforms.
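As a small illustration of the last point, here is a sketch of writing a value to disk or the network with a fixed width and a fixed byte order, using the C++ header <cstdint> for the C99 types. The function names are invented for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Store a 32-bit value as exactly four bytes, least significant byte first,
// regardless of the host's endianness or the size of int and long.
void store_u32_le(std::uint32_t value, unsigned char* out) {
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
}

// Read the value back from the same four-byte layout.
std::uint32_t load_u32_le(const unsigned char* in) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i)
        value |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    return value;
}
```

A file written this way on one platform reads back identically on another, which is not guaranteed if a raw long is dumped to disk.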
Personally, I'd stabilize the code first (without adding any more features) and then deal with the cross-platform issues, but that's up to you. Note that Visual Studio has an excellent debugger (the code base mentioned above was ported to Windows just for that reason).
The Chrome answer is more about failure mitigation and not about code quality. Doing what Chrome is doing is admitting defeat.
Better QA that is more than just programmer testing their own work.
Unit testing
Regression testing
Read up on best practices that other companies use.
To be blunt, if your software is crashing often due to overflows and bad initializations, then you have a very basic programming quality problem that isn't going to be easily fixed. That sounds harsh and mean, but that isn't my intent. My point is that the problem with the bad code has to be your primary concern (which I'm sure it is). Things like Chrome's approach or liberal use of exception handling to catch program flaws only distract you from the real problem.
You don't mention what the target project is; having a process per-tab does not necessarily mean more "robust" code at all. You should aim to write solid code with tests regardless of portability - just read about writing good C++ code :)
As for the portability section, make sure you are testing on both platforms from day one and ensure that no new code is written until platform-specific problems are solved.
You really, really don't want to do what Chrome is doing, it requires a process manager which is probably WAY overkill for what you want.
You should investigate using smart pointers from Boost or another tool that will provide reference counting or garbage collection for C++.
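A short sketch of what that looks like in practice, using std::unique_ptr and std::shared_ptr from C++11 (the standard descendants of the Boost smart pointers). Widget and build_and_discard are invented for the example; the instance counter is only there to make the cleanup visible.

```cpp
#include <memory>
#include <vector>

// Illustrative type that counts live instances so a leak would be visible.
struct Widget {
    static int live;
    Widget() { ++live; }
    ~Widget() { --live; }
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
};
int Widget::live = 0;

// Creates `count` uniquely owned Widgets plus one shared Widget and returns
// how many were alive just before the function returned.
int build_and_discard(int count) {
    std::vector<std::unique_ptr<Widget>> owned;
    for (int i = 0; i < count; ++i)
        owned.push_back(std::unique_ptr<Widget>(new Widget));
    std::shared_ptr<Widget> shared = std::make_shared<Widget>();
    std::shared_ptr<Widget> alias = shared;  // reference count is now 2
    return Widget::live;
    // owned, shared and alias go out of scope here; every Widget is freed
    // without a single explicit delete.
}
```

There is no delete anywhere, so an early return or an exception cannot skip the cleanup.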
Alternatively, if you are frequently crashing, you might want to consider writing the non-performance-critical parts of your application in a scripting language that has C++ bindings.
Scott Meyers' Effective C++ and More Effective C++ are very good, and fun to read.
Steve McConnell's Code Complete is a favorite of many, including Jeff Atwood.
The Boost libraries are probably an excellent choice. One project where I work uses them. I've only used WIN32 threading myself.
I agree with Torlack.
Bad initialization or overflows are signs of poor quality code.
Google did it that way because sometimes there was no way to control the code that was executed in a page (because of faulty plugins, etc.). So if you're using low quality plugins (it happens), perhaps the Google solution will be good for you.
But a program without plugins that crashes often is just badly written, or very very complex, or very old (and missing a lot of maintenance time). You must stop the development, and investigate each and every crash. On Windows, compile the modules with PDBs (program databases), and each time it crashes, attach a debugger to it.
You must add internal tests, too. Avoid the pattern:
void doSomethingBad(T * t)
{
   if(t == NULL) return ; // silently hides the caller's bug
   // do the processing.
}
This is very bad design because the error is still there and you have merely avoided it this time. The next function without this guard will crash. It is better to crash sooner, so that you are nearer to the error.
Instead, on Windows (there must be a similar API on MacOS)
void doSomethingBad(T * t)
{
   if(t == NULL) ::DebugBreak() ; // it will call the debugger
   // do the processing.
}
(don't use this code directly... Put it in a define to avoid delivering it to a client...)
You can choose the error API that suits you (exceptions, DebugBreak, assert, etc.), but use it to stop the moment the code knows something's wrong.
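One possible shape for such a define, as a sketch only. The macro names APP_TRAP and APP_CHECK and the function doubled are invented here. On Windows it uses DebugBreak as above; elsewhere it assumes raising SIGTRAP is an acceptable substitute, and it falls back to abort if that signal is not available. Builds with NDEBUG defined compile the check away, so nothing is delivered to the client.

```cpp
#include <csignal>
#include <cstdio>
#include <cstdlib>

#if defined(_WIN32)
  #include <windows.h>
  #define APP_TRAP() ::DebugBreak()        // hands control to the debugger
#elif defined(SIGTRAP)
  #define APP_TRAP() std::raise(SIGTRAP)   // stops in a debugger if attached
#else
  #define APP_TRAP() std::abort()
#endif

#ifndef NDEBUG
  // Report where the check failed, then stop right there.
  #define APP_CHECK(cond)                                          \
      do {                                                         \
          if (!(cond)) {                                           \
              std::fprintf(stderr, "%s:%d: check failed: %s\n",    \
                           __FILE__, __LINE__, #cond);             \
              APP_TRAP();                                          \
          }                                                        \
      } while (0)
#else
  #define APP_CHECK(cond) ((void)0)        // removed from release builds
#endif

// Example use: stop at the first sign of a bad pointer instead of returning.
int doubled(const int* value) {
    APP_CHECK(value != nullptr);
    return *value * 2;
}
```

Without a debugger attached, the non-Windows branch normally terminates the process at the failing line, which still satisfies the "crash sooner" goal.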
Avoid the C API whenever possible. Use C++ idioms (RAII, etc.) and libraries.
Etc..
P.S.: If you use exceptions (which is a good choice), don't hide them inside a catch. You'll only make your problem worse, because the error is still there but the program will try to continue. It will probably crash some time later and corrupt anything it touches in the meantime.
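If a catch block is needed at all, for example to record diagnostics, it can rethrow rather than swallow. A minimal sketch follows; run_and_report is a hypothetical name and the report vector stands in for whatever logging is in use.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical boundary wrapper: records what went wrong, then rethrows
// so the failure is not hidden from the caller.
void run_and_report(const std::function<void()>& task,
                    std::vector<std::string>& report) {
    try {
        task();
    } catch (const std::exception& e) {
        report.push_back(std::string("task failed: ") + e.what());
        throw;  // propagate the original exception unchanged
    }
}
```

The bare throw; keeps the original exception object and type, so callers further up can still decide to stop the program.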
You can always add exception handling to your program to catch these kinds of faults and ignore them (though the details are platform specific) ... but that is very much a two edged sword. Instead consider having the program catch the exceptions and create dump files for analysis.
If your program has behaved in an unexpected way, what do you know about your internal state? Maybe the routine/thread that crashed has corrupted some key data structure? Maybe if you catch the error and try to continue the user will save whatever they are working on and commit the corruption to disk?
Beside writing more stable code, here's one idea that answers your question.
Whether you are using processes or threads, you can write a small, simple watchdog program. Your other programs then register with that watchdog. If any process or thread dies, it can be restarted by the watchdog. Of course, you'll want to put in some test to make sure you don't keep restarting the same buggy thread, e.g. restart it 5 times, then after the 5th failure shut down the whole program and log to file / syslog.
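A bare-bones sketch of the restart logic for the process case, assuming a POSIX system (fork and waitpid). The function name supervise and the convention that the worker receives its attempt number are made up for this example; registration and logging are left out.

```cpp
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical supervisor: runs `worker` in a child process and restarts it
// whenever it exits abnormally, giving up after `max_attempts` tries.
// `worker` receives the 1-based attempt number and returns an exit code.
// Returns the attempt that succeeded, or -1 after giving up.
int supervise(const std::function<int(int)>& worker, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        pid_t pid = fork();
        if (pid < 0) return -1;                   // could not create a process
        if (pid == 0) _exit(worker(attempt));     // child: run the job and exit
        int status = 0;
        if (waitpid(pid, &status, 0) < 0) return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return attempt;
        // Non-zero exit code or death by signal (a crash): loop and restart.
    }
    return -1;  // still failing: the caller should log this and shut down
}
```

With max_attempts set to 5 this matches the policy described above. Because the worker runs in its own process, even a segmentation fault only takes down the child.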
Build your app with debug symbols, then either add an exception handler or configure Dr Watson to generate crash dumps (run drwtsn32.exe /i to install it as the debugger, without the /i to pop the config dialog). When your app crashes, you can inspect where it went wrong in windbg or visual studio by seeing a callstack and variables.
Search for "symbol server" for more info.
Obviously you can use exception handling to make it more robust and use smart pointers, but fixing the bugs is best.
I would recommend that you compile a Linux version and run it under Valgrind.
Valgrind will track memory leaks, uninitialized memory reads and many other code problems. I highly recommend it.
After over 15 years of Windows development I recently wrote my first cross-platform C++ app (Windows/Linux). Here's how:
STL
Boost. In particular the filesystem and thread libraries.
A browser based UI. The app 'does' HTTP, with the UI consisting of XHTML/CSS/JavaScript (Ajax style). These resources are embedded in the server code and served to the browser when required.
Copious unit testing. Not quite TDD, but close. This actually changed the way I develop.
I used NetBeans C++ for the Linux build and had a full Linux port in no time at all.
Build it with the idea that the only way to quit is for the program to crash and that it can crash at any time. When you build it that way, crashing will never/almost never lose any data. I read an article about it a year or two ago. Sadly, I don't have a link to it.
Combine that with some sort of crash dump and have the program email the dump to you so you can fix the problem.
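One common building block for the "it can crash at any time" idea is to never overwrite the user's file in place. The sketch below writes to a temporary file and then renames it over the original; save_atomically and the ".tmp" suffix are assumptions made for the example. It relies on rename() replacing the destination atomically, which holds on POSIX systems. On Windows, std::rename fails if the destination already exists, so a different call would be needed there.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: replace `path` with `contents` so that a crash of the
// program at any point leaves either the complete old file or the complete
// new one, never a half-written mix.
bool save_atomically(const std::string& path, const std::string& contents) {
    const std::string temp = path + ".tmp";
    {
        std::ofstream out(temp.c_str(), std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out << contents;
        out.flush();
        if (!out) {
            out.close();
            std::remove(temp.c_str());
            return false;
        }
    }  // the temporary file is closed here
    if (std::rename(temp.c_str(), path.c_str()) != 0) {
        std::remove(temp.c_str());
        return false;
    }
    return true;
}
```

Surviving a power failure as well would additionally require syncing the file to disk before the rename, which this sketch does not do.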