Using RedGate's Reflector, you can easily recover the full source of a .NET application. Obfuscators exist to reduce the chances of an algorithm being found out this way.
My questions are:
How effective are obfuscators?
What is safer in terms of hiding your algorithms: C++ without .NET libraries or obfuscated .NET programs?
Are there any other ways to make .NET source code even more secure?
If you definitely want to secure something, go for C++, as in the .NET world there is a powerful deobfuscator named de4dot (see here: https://github.com/0xd4d/de4dot) that deobfuscates the output of most obfuscators on the market, even the ones it does not explicitly know.
However, this will only raise the bar, as even in the C++ world there are powerful tools too (IDA: http://www.hex-rays.com/products/ida/index.shtml).
There are other solutions, like mixed-code assemblies, where you can have the part you want to hide in native code and the rest in managed code. See here for more: Mixed (Native and Managed) Assemblies
"What is safer in terms to get source code: c++ program without .net libraries or .net obfuscated program." Without .NET, of course.
Obfuscated .NET and Java code is still easy to decompile. There are professional obfuscators that make the decompiled code impossible to recompile; those slow down the attacker a bit.
Have you ever heard of anything that can be locked but cannot be opened?
That holds even if it is written in assembly.
Usually it is beginner programmers who are scared of this kind of theft. I would suggest first creating software that is worth stealing in other people's eyes (not just in yours).
Because .NET is designed to be self-descriptive, using an obfuscator will only hinder an attacker's progress. Although decompiler output will be less readable, anyone who understands MSIL will still have a good chance. Even C++ applications can be reverse engineered at some stage, as the program ultimately gets executed step by step in memory. C++ applications will take longer to work out, but if an attacker knows assembly language (which they probably do if they are taking your application apart to get at the algorithm), it's just a matter of time.
Obfuscation is really about making the job as difficult as possible within a reasonable timespan, rather than making it impossible. The same principle applies to encryption. Encryption isn't impossible to break; it just takes so long that the content may not be of any use in 70-80 years' time.
There are 2 alternatives I can think of apart from the ones covered here:
Host the algorithm at a remote location
Host the algorithm in a hardware component - very very expensive
The first option is better suited if you have a network connection available. The processing is done on a separate server and the algorithm is not exposed to the public. This is how activation codes work nowadays: a serial code of some sort is generated, encrypted with public-key encryption, and sent to a server that decrypts and validates it. The response is also encrypted and sent back.
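A minimal sketch of the server-side validation idea, in Python for brevity. It assumes an HMAC over the customer ID in place of the public-key encryption mentioned above (Python's standard library has no public-key primitives), and it leaves out the network transport entirely. `SERVER_SECRET`, `issue_serial` and `validate_serial` are hypothetical names. The point is only that the secret and the check live on the server, so the client binary holds nothing worth reverse engineering.

```python
import hashlib
import hmac

# Hypothetical secret that exists only on the server, never in the client.
SERVER_SECRET = b"hypothetical-server-side-secret"


def _tag_for(customer_id: str) -> str:
    """Derive a short authentication tag for a customer ID."""
    mac = hmac.new(SERVER_SECRET, customer_id.encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()[:16]


def issue_serial(customer_id: str) -> str:
    """Server side: create a serial that only the server can validate."""
    return f"{customer_id}-{_tag_for(customer_id)}"


def validate_serial(serial: str) -> bool:
    """Server side: check a serial sent in by a client."""
    customer_id, separator, tag = serial.rpartition("-")
    if not separator:
        return False  # malformed: no tag part at all
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(tag, _tag_for(customer_id))
```

The client would only ever send the serial and receive a yes/no answer, so decompiling the client reveals the protocol but not the validation secret.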
Digitally signing your application and its dependencies also helps, as attackers cannot swap in their own components as easily. If they tried to substitute a DLL for one of yours (to fake a call to a service and return "success"), your code would check the digital signature before using the DLL.
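A rough sketch of the check-before-load idea. Real code signing (Authenticode, .NET strong names) verifies a certificate-backed signature. This sketch swaps in a much simpler pinned SHA-256 digest, which is my own simplification and not what the answer describes. `TRUSTED_DIGESTS`, `digest_of` and `is_trusted` are hypothetical names, and the component is passed in as bytes to keep the example self-contained.

```python
import hashlib
import hmac


def digest_of(component: bytes) -> str:
    """Hex SHA-256 digest of a component's raw bytes."""
    return hashlib.sha256(component).hexdigest()


# Hypothetical allow-list, fixed at build time from the genuine components.
TRUSTED_DIGESTS = {
    "payments.dll": digest_of(b"original component bytes"),
}


def is_trusted(name: str, component: bytes) -> bool:
    """Return True only if the bytes match the digest pinned for `name`."""
    expected = TRUSTED_DIGESTS.get(name)
    if expected is None:
        return False  # unknown components are rejected outright
    return hmac.compare_digest(digest_of(component), expected)
```

As with obfuscation, this only raises the bar: someone who can patch the application can also patch out the check.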
So in summary, obfuscating will slow down the process but not prevent it. The only way I can think of is to host that algorithm at a secure location and send requests to it. Yes there is the problem of the hosting scenario, DoS etc, but your algorithm is protected, which is what you wanted.
Try something like this. However, it will only obfuscate the strings. To obfuscate function calls, variables and other elements, look for commercial products and services built for that purpose.
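To give an idea of what string-only obfuscation amounts to, here is a minimal sketch using a single-byte XOR. This is an assumption on my part for illustration, not the scheme used by the tool referred to above, and `xor_encode` / `xor_decode` are hypothetical names. It keeps readable strings out of the binary but is trivial to reverse once someone finds the decode routine.

```python
def xor_encode(text: str, key: int) -> bytes:
    """Encode a string so it no longer appears verbatim in the binary."""
    if not 0 < key < 256:
        raise ValueError("key must be a non-zero single byte")
    return bytes(b ^ key for b in text.encode("utf-8"))


def xor_decode(blob: bytes, key: int) -> str:
    """Recover the original string at run time, just before it is used."""
    if not 0 < key < 256:
        raise ValueError("key must be a non-zero single byte")
    return bytes(b ^ key for b in blob).decode("utf-8")
```

In practice the encoded bytes would be generated at build time and only `xor_decode` would ship with the program.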
How effective are obfuscators?
I find ConfuserEx's name, constants, and control flow protections quite effective at making .NET code difficult to read.
Using Unicode mode for the name protection renders class, method, and other names as unreadable Unicode characters.
The constants protection encodes constant strings, such as debug log strings, which otherwise give an attacker excellent hints about what the code is doing.
The control flow protection scrambles your code into a lot of switch-cases.
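To illustrate what "scrambled into switch-cases" means, here is a hand-made sketch of control flow flattening, written in Python rather than C# and not produced by ConfuserEx. Both functions compute the same result, but in the second one the loop structure is replaced by a dispatcher over arbitrary state numbers, which is roughly the shape a decompiler shows after this kind of protection.

```python
def gcd_plain(a: int, b: int) -> int:
    """Euclid's algorithm with its natural loop structure."""
    while b:
        a, b = b, a % b
    return a


def gcd_flattened(a: int, b: int) -> int:
    """Same algorithm, flattened into a state machine with opaque states."""
    state = 2
    while True:
        if state == 2:    # was: the loop condition
            state = 7 if b else 5
        elif state == 7:  # was: the loop body
            a, b = b, a % b
            state = 2
        elif state == 5:  # was: the return statement
            return a
```

A real obfuscator also reorders the cases and computes the next state indirectly, which is what makes the output tedious to follow by hand.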
[Screenshot of ConfuserEx-processed C# code not reproduced here.]
What is safer in terms of hiding your algorithms: C++ without .NET libraries or obfuscated .NET programs?
If a .NET assembly is not obfuscated, then it's like giving away the source code.
Comparing obfuscated .NET and native x86/x64 code, IMO, the difficulty of reading them is about the same. Native x86 code can be turned into quite readable C-like code using software like IDA Pro. There are also people who can read and understand x86 assembly really fast.
Are there any other ways to make .NET source code even more secure?
There is Microsoft .NET Native (still at an early stage, with limitations), which compiles C# code into native x86/x64 code. This is not actually a protection; it just means people read the resulting x86 code more slowly.
I’m developing a C++ application where I want to compile C++ modules from potentially untrusted sources online, and have them operate on a specific bank of data within a single process. I’d like to sandbox these somehow. This is obviously a complex issue, but hoping to discover if there’s any potential approach or tool/library I haven’t yet thought of. The app will run on Windows & OSX at minimum, and (hopefully) Linux, iOS, Android.
My app would locally compile the C++ modules it downloads, and dynamically link the object code to a process in the app (not necessarily the “main” app process). The C++ modules would only have access to my API via the headers I provide, however the API (and any dependent libraries) need to be linked into the same process. The API’s dependent libraries are compute-based only, such as native SIMD-based math and possibly memory allocation. I don’t expect they will need to call any network, disk, or any other OS functionality, for that matter – except for needing to communicate their input data and computed results to the main process (maybe over shared memory ?)
I don’t care if the sandboxed process’ memory is corrupted or hollowed, as long as it’s contained in that process. I also want to avoid having any system API call addresses linked into the process memory space, to prevent compromised code from finding them.
I’ve done a review of the basic security issues (stack crashes, return oriented programming hacks, etc.). Also looked at some related projects:
I see Google has a sandbox subproject within the Chromium repo which might be useful, but I'm unsure of its utility in my case.
Windows Sandbox is a Microsoft tool for Windows only, and isn’t available on some versions anyway. Moreover, there are big performance issues with using it. The app runs in real time, with frame rate requirements similar to a video game.
I considered compiling to WebAssembly, but at the moment it seems too immature (no SIMD, hard to debug, and potentially vulnerable to hacks in the wrapping host or browser).
I thought there might be some kind of wrapper library already out there to intercept all OS calls and allow custom configuration of what calls get passed through (in my case, anything except what’s needed for the inter-process communication would be denied)
Any other ideas, architectural suggestions, or promising open source projects on the horizon for this ?
Thanks,
C
Compiling untrusted source code and linking it into your app sounds really unsafe. If I understand your problem correctly, you need to "provide a safe runtime environment for single-threaded user code with access only to your API". In that case, in my opinion it's better to use a runtime interpreter instead. It will give you more control and sandboxing capabilities, safe API calls, and handling of exceptions in user code.
If you have doubts about interpreter performance, it's a good trade-off for safety, flexibility and control. The vast majority of interpreters compile source code to bytecode and run really fast. You can also get better performance by providing a fast API to the script.
In my Java enterprise projects I use the built-in Rhino JavaScript interpreter to run user scripts, and I provide an API to get the flexibility, performance and control I need. These scripts can call nothing but my API. It's safe, flexible and fully controllable.
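A minimal sketch of the "scripts can call nothing but my API" idea, in Python instead of Rhino. It evaluates a single arithmetic expression by walking the parsed syntax tree and rejects everything that is not a number, a basic operator, or a call to a function the host explicitly exposes. `run_script` and the `api` dictionary are hypothetical names, and this is far from a complete sandbox (no limits on CPU time or memory, for instance).

```python
import ast
import operator

# The only operators a script may use.
_BINOPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node, api):
    """Evaluate one syntax-tree node, refusing anything not whitelisted."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
        left = _eval(node.left, api)
        right = _eval(node.right, api)
        return _BINOPS[type(node.op)](left, right)
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id in api
        and not node.keywords
    ):
        args = [_eval(arg, api) for arg in node.args]
        return api[node.func.id](*args)
    raise ValueError(f"disallowed syntax: {type(node).__name__}")


def run_script(source: str, api: dict):
    """Run a user expression that can reach only the functions in `api`."""
    tree = ast.parse(source, mode="eval")
    return _eval(tree.body, api)
```

For example, with `api = {"scale": lambda x: x * 10}` the script `scale(2) + 1` is accepted, while anything that tries to reach `__import__` or an attribute is rejected before it runs.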
I found these interpreter libraries that can be embedded in a C/C++ application:
JavaScript (ECMA)
https://v8.dev/
Lua
http://acamara.es/blog/2012/08/running-a-lua-5-2-script-from-c/
C++ interpreter
https://github.com/root-project/cling
Hope this question isn't going to be too vague. Reading through the COM spec and Don Box's Essential COM book, there is plenty of talk of the "problems that COM solves" - and they all sound important, relevant and current.
So how are the problems that COM addresses dealt with on other systems (Linux, Unix, OS X, Android)? I'm thinking of things like:
binary compatibility across compilers and compiler versions
binary component reuse
compiling an application such that it has run-time dependencies rather than load-time ones (so that it runs even when a dependency is missing)
access to library functionality from languages other than the library's own
reasonably low-overhead remote procedure calls to components loaded in the address space of a different process
etc (I'm sure the list goes on)
I'm basically just trying to understand why for instance on Linux CORBA isn't a thing like COM is a thing on Windows (if that makes any sense). Does maybe software development on Linux subscribe to a different philosophy than the component-based model proposed by COM?
And finally, is COM a C/C++ thing? Several times I've come across comments from people saying COM is made "obsolete" by .NET but without really explaining what they meant by that.
For the remainder of this post, I'm going to use Linux as an example of open-source software. Where I mention "Linux" it's mostly a short/simple way to refer to open source software in general though, not anything specific to Linux.
COM vs. .NET
COM isn't actually restricted to C and C++, and .NET doesn't actually replace COM. However, .NET does provide alternatives to COM for some situations. One common use of COM is to provide controls (ActiveX controls). .NET provides/supports its own protocol for controls that allows somebody to write a control in one .NET language, and use that control from any other .NET language--more or less the same sort of thing that COM provides outside the .NET world.
Likewise, .NET provides Windows Communication Foundation (WCF). WCF implements SOAP (Simple Object Access Protocol)--which may have started out simple, but grew into something a lot less simple at best. In any case, WCF provides many of the same kinds of capabilities as COM does. Although WCF itself is specific to .NET, it implements SOAP, and a SOAP server built using WCF can talk to one implemented without WCF (and vice versa). Since you mention overhead, it's probably worth mentioning that WCF/SOAP tend to add more overhead than COM (I've seen anywhere from nearly equal to about double the overhead, depending on the situation).
Differences in Requirements
For Linux, the first two points tend to have relatively low relevance. Most software is open source, and many users are accustomed to building from source in any case. For such users, binary compatibility/reuse is of little or no consequence (in fact, quite a few users are likely to reject all software that isn't distributed in source code form). Although binaries are commonly distributed (e.g., with apt-get, yum, etc.) they're basically just caching a binary built for a specific system. That is, on Windows you might have a single binary for use on anything from Windows XP up through Windows 10, but if you use apt-get on, say, Ubuntu 18.04, you're installing a binary built specifically for Ubuntu 18.04, not one that tries to be compatible with everything back to Ubuntu 10 (or whatever).
Being able to load and run (with reduced capabilities) when a component is missing is also most often a closed-source problem. Closed source software typically has several versions with varying capabilities to support different prices. It's convenient for the vendor to be able to build one version of the main application, and give varying levels of functionality depending on which other components are supplied/omitted.
That's primarily to support different price levels though. When the software is free, there's only one price and one version: the awesome edition.
Access to library functionality between languages again tends to be based more on source code instead of a binary interface, such as using SWIG to allow use of C or C++ source code from languages like Python and Ruby. Again, COM is basically curing a problem that arises primarily from lack of source code; when using open source software, the problem simply doesn't arise to start with.
Low-overhead RPC to code in other processes again seems to stem primarily from closed source software. When/if you want Microsoft Excel to be able to use some internal "stuff" in, say, Adobe Photoshop, you use COM to let them communicate. That adds run-time overhead and extra complexity, but when one of the pieces of code is owned by Microsoft and the other by Adobe, it's pretty much what you're stuck with.
Source Code Level Sharing
In open source software, however, if project A has some functionality that's useful in project B, what you're likely to see is (at most) a fork of project A to turn that functionality into a library, which is then linked into both the remainder of project A and into Project B, and quite possibly projects C, D, and E as well--all without imposing the overhead of COM, cross-process RPC, etc.
Now, don't get me wrong: I'm not trying to act as a spokesperson for open source software, nor to say that closed source is terrible and open source is always dramatically superior. What I am saying is that COM is defined primarily at a binary level, but for open source software, people tend to deal more with source code instead.
Of course SWIG is only one example among several of tools that support cross-language development at a source-code level. While SWIG is widely used, COM is different from it in one rather crucial way: with COM, you define an interface in a single, neutral language, and then generate a set of language bindings (proxies and stubs) that fit that interface. This is rather different from SWIG, where you're matching directly from one source to one target language (e.g., bindings to use a C library from Python).
Binary Communication
There are still cases where it's useful to have at least some capabilities similar to those provided by COM. These have led to open-source systems that resemble COM to a rather greater degree. For example, a number of open-source desktop environments use/implement D-bus. Where COM is mostly an RPC kind of thing, D-bus is mostly an agreed-upon way of sending messages between components.
D-bus does, however, specify things it calls objects. Its objects can have methods, which you invoke by sending them messages, and they can emit signals. Although D-bus itself defines this primarily in terms of a messaging protocol, it's fairly trivial to write proxy objects that make invoking a method on a remote object look pretty much like invoking one on a local object. The big difference is that COM has a "compiler" that can take a specification of the protocol, and automatically generate those proxies for you (and corresponding stubs in the far end to receive the message, and invoke the proper function based on the message it received). That's not part of D-bus itself, but people have written tools to take (for example) an interface specification and automatically generate proxies/stubs from that specification.
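To make the proxy/stub idea concrete, here is a small hand-written sketch. It does not use D-bus or COM. The message format (a JSON object with `method` and `args`) and the names `Calculator`, `Stub` and `Proxy` are assumptions for illustration, and the "transport" is just a direct function call where a real system would have a socket or message bus.

```python
import json


class Calculator:
    """The real object, living on the far side of the connection."""

    def add(self, a, b):
        return a + b


class Stub:
    """Receiving end: decodes a message and invokes the named method."""

    def __init__(self, target):
        self._target = target

    def handle(self, message: str) -> str:
        call = json.loads(message)
        name = call["method"]
        if name.startswith("_"):
            raise ValueError("private members are not remotely callable")
        result = getattr(self._target, name)(*call["args"])
        return json.dumps({"result": result})


class Proxy:
    """Calling end: makes a remote method look like a local one."""

    def __init__(self, send):
        self._send = send  # callable that carries a message and returns the reply

    def __getattr__(self, name):
        # Only reached for attributes that do not exist locally,
        # so every unknown name is treated as a remote method.
        def remote_method(*args):
            request = json.dumps({"method": name, "args": list(args)})
            return json.loads(self._send(request))["result"]

        return remote_method
```

With `proxy = Proxy(Stub(Calculator()).handle)`, a call such as `proxy.add(2, 3)` reads like a local call even though it travels as a message. A generator like COM's IDL compiler emits this kind of boilerplate from an interface description instead of relying on dynamic lookup.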
As such, although the two aren't exactly identical, there's enough similarity that D-bus can be (and often is) used for many of the same sorts of things as COM.
Systems Similar to DCOM
COM also allows you to build distributed systems using DCOM (Distributed COM). That is, a system where you invoke a method on one machine, but (at least potentially) execute that invoked method on another machine. This adds more overhead, but since (as pointed out above with respect to D-bus) RPC is basically communication with proxies/stubs attached to the ends, it's pretty easy to do the same thing in a distributed fashion. The difference in overhead, however, tends to lead to differences in how systems need to be designed to work well, so the practical advantage of using exactly the same system for distributed systems as local systems tends to be fairly minimal.
As such, the open source world provides tools for doing distributed RPC, but doesn't usually work hard at making them look the same as non-distributed systems. CORBA is well known, but generally viewed as large and complex, so (at least in my experience) current use is fairly minimal. Apache Thrift provides some of the same general type of capabilities, but in a rather simpler, lighter-weight fashion. In particular, where CORBA attempts to provide a complete set of tools for distributed computing (complete with everything from authentication to distributed time keeping), Thrift follows the Unix philosophy much more closely, attempting to meet exactly one need: generate proxies and stubs from an interface definition (written in a neutral language). If you want to do those CORBA-like things with Thrift you undoubtedly can, but in a more typical case of building internal infrastructure where the caller and callee trust each other, you can avoid a lot of overhead and just get on with the business at hand. Likewise, gRPC (Google's RPC framework) provides roughly the same sorts of capabilities as Thrift.
OS X Specific
Cocoa provides distributed objects that are fairly similar to COM. This is based on Objective-C though, and I believe it's now deprecated.
Apple also offers XPC. XPC is more about inter-process communication than RPC, so I'd consider it more directly comparable to D-bus than to COM. But, much like D-bus, it has a lot of the same basic capabilities as COM, but in different form that places more emphasis on communication, and less on making things look like local function calls (and many now prefer messaging to RPC anyway).
Summary
Open source software has enough different factors in its design that there's less demand for something providing the same mix of capabilities as Microsoft's COM provides on Windows. COM is largely a single tool that tries to meet all needs. In the open-source world, there's less drive to provide that single, all-encompassing solution, and more tendency toward a kit of tools, each doing one thing well, that can be put together into a solution for a specific need.
Being more commercially oriented, Apple OS X probably has what are (at least arguably) closer analogs to COM than most of the more purely open-source world.
A quick answer on the last question: COM is far from being obsolete. Almost everything in the Microsoft world is COM-based, including the .NET engine (the CLR), and including the new Windows 8.x's Windows Runtime.
Here is what Microsoft says about .NET in its latest C++ pages, Welcome Back to C++ (Modern C++):
C++ is experiencing a renaissance because power is king again.
Languages like Java and C# are good when programmer productivity is
important, but they show their limitations when power and performance
are paramount. For high efficiency and power, especially on devices
that have limited hardware, nothing beats modern C++.
PS: which is a bit of a shock for a developer who has invested more than 10 years on .NET :-)
In the Linux world, it is more common to develop components that are statically linked, or which run in separate processes and communicate by piping text (maybe JSON or XML) back and forth.
Some of this is due to tradition. UNIX developers have been doing stuff like this long before CORBA or COM existed. It's "the UNIX way".
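A small sketch of that style: one component runs as a separate process and the caller talks to it by piping JSON text through stdin and stdout. The child program is inlined as a string only to keep the example self-contained. `CHILD_SOURCE` and `call_component` are hypothetical names, and the message layout is an assumption.

```python
import json
import subprocess
import sys

# Stand-in for a separate executable: reads one JSON request on stdin,
# writes one JSON reply on stdout.
CHILD_SOURCE = """
import json, sys
request = json.load(sys.stdin)
json.dump({"sum": sum(request["values"])}, sys.stdout)
"""


def call_component(values):
    """Send a request to the child process and return its answer."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        input=json.dumps({"values": values}),
        capture_output=True,
        text=True,
        check=True,  # raise if the component exits with an error
    )
    return json.loads(completed.stdout)["sum"]
```

Neither side needs to know what language or compiler the other was built with, which is how this approach sidesteps the binary compatibility problem that COM addresses.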
As Jerry Coffin says in his answer, when you have the source code for everything, binary interfaces are not as important, and in fact just make everything more difficult.
COM was invented back when personal computers were a lot slower than they are today. In those days, loading components into your app's process space and invoking native code was often necessary to achieve reasonable performance. Now, parsing text and running interpreted scripts aren't things to be afraid of.
CORBA never really caught on in the open-source world because the initial implementations were proprietary and expensive, and by the time high-quality free implementations were available, the spec was so complicated that nobody wanted to use it if they weren't required to do so.
To a large extent, the problems solved by COM are simply ignored on Linux.
It is true that binary compatibility is less important when you have the source code available. However, you still have to worry about modularisation and versioning. If two different programs depend on different versions of the same library, you need to somehow support that.
Then there is the case of the same program using different versions of the same library. This is often useful when working on large legacy programs, where upgrading everything can be prohibitively expensive but you would like to use new features anyway. With COM, the old parts of the program can just be left alone, since new library versions can more easily be made backwards compatible.
In addition, having to compile from source instead of binary compatibility is a huge hassle. Especially if you are actually developing software, since binary incompatibility means you have to recompile much more often. If one tiny part changes in a large C++ program, you may have to wait for a 30 minute recompile. If the different pieces are compatible, only the part which changed has to be recompiled.
COM, and DCOM in particular, have been around in Windows for a considerable time now, and naturally Windows developers have made use of this powerful framework.
We are now in the cross platform age and when porting such applications to other platforms we are faced with challenges which in many cases can be mitigated or eliminated altogether unless the application we are porting is more than just one simple standalone app.
If you're dealing with a whole suite of modules running on different machines, all communicating using Windows-specific technologies such as DCE/RPC, DCOM or even Windows named pipes, then your job just became an order of magnitude harder.
DCE/RPC, DCOM and Windows named pipes are all very Windows-specific, non-portable and of course subject to Windows security access control.
For instance, anyone familiar with OPC DA (an industrial automation protocol based on DCOM, still very much in use but now superseded by OPC UA, which avoids DCOM) will know that there are no elegant solutions here if the client (or server) needs to be available for Linux!
Sure there appear to be some technical hurdles here given that the MS code is not in the public domain but projects such as Wine have a partly ok DCE/RPC implementation and MS do publish some of the protocol docs. Try searching and you will probably find little information and few products open source or otherwise to help you.
Perhaps the lack of open source or affordable options here is more due to legal concerns - I wonder!
Some simpler solutions involve installing a "gateway service" on the Windows machines to allow an alternative means of access to the DCOM interfaces on that machine. This is fine if the Windows machine does not belong to an unwilling third party, which unfortunately is sometimes the case! I know: "we'll just chuck another Windows machine in the middle as the gateway" is the usual global-warming-enhancing solution to that problem.
I would conclude that Linux to Windows DCOM interoperability is certainly not impossible but it does appear to be a topic that few are interested in talking about unless you get your wallet out!
I've made a bunch of games from my own homegrown C++/DirectX 2D engine. I was thinking some of them would be more fun with the introduction of multi-player and at the very least it would be easier to distribute them and get people to play if they could run in a browser.
I'm looking to port my games into a web format and I don't think there is anything I'm doing that Flash or Silverlight can't handle. However I don't know either of those so while I could learn something new it would save time and make porting easier if I could find something in C++. Does anyone know of a preferably open source, or otherwise freely available, library I could use to give myself a leg up?
I've heard of Haxe and it seems to be similar to what I want although it introduces a new language that can be converted to C++, ActionScript, etc. I'd prefer C++ so I can reuse some code without much of a fuss.
I also found something called RakNet, which may be useful as a networking layer for my existing C++/DirectX games but less useful for browser-based games. Has anyone used this with success? How was it to implement and integrate with existing projects?
The short answer is no. C++ requires the code to be compiled into a binary executable, and for various reasons, such code is not allowed to be run in the browser.
The long answer: The native client by Chromium/Google allows you to write native C++ code and run it in the browser. However, support is very limited, in the sense that almost no browser allows it (beyond some experimental nightly builds of Chromium and such), and you're likely going to face the same issues when porting C++ code to a different OS (aka, just because it's in the browser doesn't mean it's going to run on that obscure Linux OS).
If you want to port your games to the web, your options are either to rewrite them for the web, or to wait a few years/decades for the native client to become widespread.
Fun fact: Most mobile devices allow C++ code to be run with a minimal wrapper. It's not the web, but it's an option if your goal is to get more people playing your games.
I would like to ask about the available (free or not) static and dynamic code analysis tools that can be used on C++ applications, especially COM and ActiveX.
I am currently using Visual Studio's /analyze compiler option, which is good and all but I still feel there is lots of analysis to be done.
I'm talking about a C++ application where memory management and code security is of utmost importance.
I'm trying to check for problems relating to security such as memory management, input validation, buffer overflows, exception handling... etc
I'm not interested in, say, inheritance depth or lines of executable code.
Without a doubt you want to use Axman. This is by far the best ActiveX/COM security testing tool available, and it's open source. This was one of the leading tools used in the Month of Browser Bugs by H.D. Moore, who is also the creator of Metasploit. I have personally used Axman to find vulnerabilities and write exploit code.
Axman uses TypeLib to identify all of the components that make up a COM object. This is a form of type reflection, and it means that source code is not required. Axman uses reflection to automatically generate fuzz test cases against a COM object.
There is a security tool category called the fuzzers which were used in the recent Pwn2Own 2010 contest in Vancouver. The winning guy said that he's not going to tell software makers which bug he found but instead how to create a good fuzzer that will allow them to find the bugs. This was covered by computerworld.
Basically, it finds every place that the software can take input and tries to inject random data until the application crashes. Starting from there, the user attempts to understand what went wrong and develops an effective attack.
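A toy sketch of that loop. The target here is a made-up parser with a deliberate bug (it trusts a length byte), and `fragile_parser` and `fuzz` are hypothetical names. Real fuzzers are far smarter about generating inputs, but the overall shape is the same: generate input, run the target, record whatever makes it fall over.

```python
import random


def fragile_parser(data: bytes) -> int:
    """Hypothetical target: trusts the first byte as a payload length."""
    if not data:
        return 0
    length = data[0]
    payload = data[1:]
    if length == 0:
        return 0
    # Bug: no check that `length` fits inside the payload.
    return payload[length - 1]


def fuzz(target, runs: int = 1000, seed: int = 0, max_len: int = 8):
    """Feed random byte strings to `target` and collect the crashing inputs."""
    rng = random.Random(seed)  # seeded so that crashes are reproducible
    crashes = []
    for _ in range(runs):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except Exception as exc:  # any exception counts as a "crash" here
            crashes.append((data, exc))
    return crashes
```

Each recorded input is then the starting point for the manual work described above: understanding why it crashed and whether the crash is exploitable.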
I don't know any particular fuzzers, but there are many kinds of them for various uses (buffer overflows vs. SQL injections: two very different problems, two different fuzzers).
We use Coverity Prevent which is a very sophisticated static analysis tool that stores defects in a database that has a web interface. It works for C, C++, and Java.
We also use open source tools like Valgrind.
Start here and work your way through: http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
I don't mean to be rude when I say "google it". Personally (ymmv) I learn much more along the way from googling than from just having someone give me "the" answer.
Also, when I look for tools, I go to SourceForge and search for, in this case, "static code analysis".
Btw, check out Valgrind
I'm interested in using API spying/hijacking to implement some core features of a project I'm working on. It's been mentioned in this question as well, but that wasn't really on topic, so I figured it'd be better to give it a question of its own.
I'd like to gather as much information as possible on this, different techniques/libraries (MS Detours, IAT patching) or other suggestions.
Also, it'd be especially interesting to know if someone has any real production experience of using such techniques -- can they be made stable enough for production code or is this strictly a technique for research? Does it work properly over multiple versions of windows? How bug prone is it?
Personal experiences and external links both appreciated.
I implemented syringe.dll (LGPL) instead of MS Detours (we did not like the license requirements or the huge payment for x64 support). It works fantastically well; I ported it from Win32 to Win64, and we have been using it in our off-the-shelf commercial applications for around 2 years now.
We use it for very simple reasons, really: to provide a presentation framework for repackaging and rebranding the same compiled application as many different products. We do general filtering and replacement for strings, general resources, toolbars, and menus.
Since it is LGPL'd, we supply the source, copyright notices etc., and only dynamically link to the library.
Hooking standard WinAPI functions is relatively safe since they're not going to change much in the near future, if at all, since Microsoft does its best to keep the WinAPI backwards compatible between versions.
Standard WinAPI hooking, I'd say, is generally stable and safe.
Hooking anything else, as in the target program's internals, is a different story.
Regardless of the target program, the hooking itself is usually a solid practice. The weakest link of the process is usually finding the correct spot, and hanging on to it.
The smallest change in the application can and will change the addresses of functions, not to mention dynamic libraries and so forth.
In gamehacking, where hooking is standard practice, this has been defeated to some degree with "sigscanning", a technique first developed by LanceVorgin on the somewhat infamous MPC boards. It works by scanning the executable image for the static parts of a function, the actual instruction bytes that won't change unless the function's action is modified.
Sigscanning is obviously better than using static address tables, but it will also fail eventually, when the target application is changed enough.
Example implementation of sigscanning in c++ can be found here.
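For readers without access to that implementation, here is a rough sketch of the idea, in Python rather than C++ and scanning a plain byte buffer instead of a loaded module. The pattern syntax (hex bytes with `??` as a wildcard) and the name `find_signature` are assumptions for illustration.

```python
def find_signature(image: bytes, pattern: str) -> int:
    """Return the offset of the first match of `pattern` in `image`, or -1.

    `pattern` is a space-separated list of hex bytes, where "?" or "??"
    stands for a byte that may differ between builds (an address or an
    immediate value, for example).
    """
    signature = [
        None if token in ("?", "??") else int(token, 16)
        for token in pattern.split()
    ]
    last_start = len(image) - len(signature)
    for offset in range(last_start + 1):
        if all(
            wanted is None or image[offset + i] == wanted
            for i, wanted in enumerate(signature)
        ):
            return offset
    return -1
```

The same pattern keeps matching after a rebuild moves the function or changes the wildcarded bytes, which is the advantage over a static address table. It stops matching once the fixed bytes themselves change.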
I've been using standard IAT hooking techniques for a few years now; it works well, has been nice and stable, and ported to x64 with no problems. The main problems I've had have been more to do with how I inject the hooks in the first place. It took a fair while to work out how best to suspend managed processes at the 'right' point in their start-up so that injection was reliable and early enough for me. My injector uses the Win32 debug API, and whilst this made it easy to suspend unmanaged processes, it took a bit of trial and error to get managed processes suspended at an appropriate time.
My uses for IAT hooking have mostly been for writing test tools. I have a deadlock detection program which is detailed here: http://www.lenholgate.com/blog/2006/04/deadlock-detection-tool-updates.html, a GetTickCount() controlling program which is available for download from here: http://www.lenholgate.com/blog/2006/04/tickshifter-v02.html, and a time shifting application which is still under development.
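As a loose analogy for readers who have not seen IAT hooking: the import address table is essentially a table of function pointers that the program calls through, and a hook overwrites one entry while keeping the original so it can still be called. The sketch below models that with a plain dictionary. It does not touch a real PE import table, and `install_hook`, `shift_time` and the `get_tick_count` entry are hypothetical names.

```python
def install_hook(table: dict, name: str, make_hook):
    """Replace `table[name]` with a hook that wraps the original.

    `make_hook` receives the original function and returns the replacement.
    The original is returned so the caller can restore it later.
    """
    original = table[name]
    table[name] = make_hook(original)
    return original


def shift_time(offset_ms: int):
    """Build a hook factory that adds a fixed offset to a tick counter."""
    def make_hook(original):
        def hooked():
            return original() + offset_ms
        return hooked
    return make_hook
```

For example, `install_hook(table, "get_tick_count", shift_time(5000))` makes every caller that goes through the table see a clock five seconds ahead, without the calling code changing at all.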
Something a lot of people forget is that Windows DLLs are compiled as hot-patchable images (MSDN).
Hot-patching is the best way to do WinAPI detours, as it's clean and simple and preserves the original function, meaning no inline assembly needs to be used, only slightly adjusted function pointers.
A small hot patching tutorial can be found here.