I need help developing a polymorphism engine - instruction dependency trees - C++

I am currently trying to write a polymorphism engine in C++ to toy around with a neat anti-hacking stay-alive checking idea I have. However, writing the polymorphism engine is proving rather difficult - I haven't even established how I should go about doing it. The idea is to stream executable code to the user (that is, the application I am protecting) and occasionally send them some code that runs some checksums on the memory image and returns the result to the server. The problem is that I don't want someone to simply hijack or programmatically crack the stay-alive check; instead, each one would be generated on the server by taking a simple stub of code and running it through a polymorphism engine. Each stay-alive check would return a value dependent on the checksum of the data and a random algorithm sneaked inside the stay-alive check. If the stub returns incorrectly, I know that the stay-alive check has been tampered with.
What I have to work with:
*The executable image's PDB file.
*An assembler & disassembler engine, between which I have implemented an interface that allows me to relocate code, etc.
Here are the steps I was thinking of taking and how I might do them. I am using the x86 instruction set on a Windows PE executable.
Steps I plan on taking (my problem is with step 2):
Expand instructions
Find simple instructions like mov or push and replace them with a couple of instructions that achieve the same end, though with more instructions. In this step I'll also add loads of junk code.
I plan on doing this just by using a series of translation tables in a database. This shouldn't be very difficult to do.
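Such a translation table could be sketched like this (the entries and names here are purely illustrative; a real engine would operate on the decoded instruction structures from your disassembler, not on mnemonic strings):

```cpp
#include <map>
#include <random>
#include <string>
#include <vector>

// Hypothetical expansion table: each simple instruction maps to one or
// more equivalent multi-instruction sequences to pick from at random.
using Sequence = std::vector<std::string>;

const std::map<std::string, std::vector<Sequence>> kExpansions = {
    {"mov eax, 0", {{"xor eax, eax"},
                    {"sub eax, eax"},
                    {"push 0", "pop eax"}}},
    {"push eax",   {{"sub esp, 4", "mov [esp], eax"}}},
};

// Replace an instruction with a randomly chosen equivalent sequence,
// or return it unchanged if no expansion is known.
Sequence expand(const std::string& insn, std::mt19937& rng) {
    auto it = kExpansions.find(insn);
    if (it == kExpansions.end()) return {insn};
    std::uniform_int_distribution<std::size_t> pick(0, it->second.size() - 1);
    return it->second[pick(rng)];
}
```

Picking among several equivalents per entry is what makes two generated stubs differ even for identical input code.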
Shuffling
This is the part I have the most trouble with. I need to isolate the code into functions. Then I need to establish a series of instruction dependency trees, and then I need to relocate instructions based upon which ones depend on the others. I can find functions by parsing the PDB files, but creating instruction dependency trees is the tricky part I am totally lost on.
Compress instructions
Compress instructions and implement a series of uncommon & obscure instructions in the process. And, like the first step, do this by using a database of code signatures.
To clarify again: I need help performing step 2 and am unsure how I should even begin. I have tried making some diagrams, but they become very confusing to follow.
Oh, and obviously the protected code is not going to be very optimal - but this is just a security project I wanted to play with for school.

I think what you are after for "instruction dependency trees" is data flow analysis. This is classic compiler technology that determines for each code element (primitive operations in a programming language), what information is delivered to it from other code elements. When you are done, you end up with what amounts to a graph with nodes being code elements (in your case, individual instructions) and directed arcs between them showing what information has to flow so that the later elements can execute on results produced by "earlier" elements in the graph.
You can see some examples of such flow analysis at my website (focused on a tool that does program analysis; this tool is probably not appropriate for binary analysis, but the examples should be helpful).
There's a ton of literature in the compiler books on doing data flow analysis. See any compiler text book.
There are a number of issues you need to handle:
Parsing the code to extract the code elements. You sound like you have access to all the instructions already.
Determining the operands required by, and the values produced by, each code element. This is pretty simple for "ADD register,register", but you may find it daunting for a production x86 CPU, which has an astonishingly big and crazy instruction set. You have to collect this for every instruction the CPU might execute, and that means pretty much all of them. Nonetheless, this is just sweat and a lot of time spent looking at the instruction reference manuals.
Loops. Values can flow from an instruction, through other instructions, back to that same instruction, so the dataflows can form cycles (lots of cycles for complex loops). The dataflow literature will tell you how to handle this in terms of computing the dataflow arcs in the graph. What these mean for your protection scheme I don't know.
Conservative Analysis: You can't get the ideal data flow, because in fact you are analyzing arbitrary algorithms (e.g., a Turing machine); pointers aggravate this problem pretty severely and machine code is full of pointers. So what the data flow analysis engines often do when unable to decide if "x feeds y", is to simply assume "x (may) feed y". The dataflow graph turns conceptually from "x (must) feed y" into the pragmatic "x (may) feed y" type arcs; the literature in fact is full of "must" and "may" algorithms because of this.
Again, the literature tells you many ways to do [conservative] flow analysis (mostly having different degrees of conservatism; in fact, the most conservative data flow analysis simply says "every x feeds every y"!). What this means in practice for your scheme, I don't know.
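The core of the def-use bookkeeping can be sketched as follows, assuming a simplified instruction model where each instruction just lists the registers it reads and writes: within a straight-line block, an instruction depends on the most recent earlier instruction that wrote each register it reads. (Real x86 adds flags, memory, and partial registers, which is where the "may" edges come in.)

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

struct Insn {
    std::string text;
    std::set<std::string> reads;
    std::set<std::string> writes;
};

// For a straight-line block, build "i depends on j" edges: instruction i
// reads a register last written by instruction j (a def-use chain).
std::multimap<std::size_t, std::size_t> buildDeps(const std::vector<Insn>& block) {
    std::multimap<std::size_t, std::size_t> deps;   // consumer -> producer
    std::map<std::string, std::size_t> lastWriter;  // register -> instruction index
    for (std::size_t i = 0; i < block.size(); ++i) {
        for (const auto& r : block[i].reads) {
            auto it = lastWriter.find(r);
            if (it != lastWriter.end()) deps.emplace(i, it->second);
        }
        for (const auto& w : block[i].writes) lastWriter[w] = i;
    }
    return deps;
}
```

Note that for safe reordering you also need write-after-read and write-after-write edges, not just the read-after-write chains shown here; instructions with no edges between them are the ones you are free to shuffle.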
There are a lot of people interested in binary code analysis (e.g., the NSA), and they do data flow analysis on machines instructions complete with pointer analysis. You might find this presentation interesting: http://research.cs.wisc.edu/wisa/presentations/2002/0114/gogul/gogul.1.14.02.pdf

I'm not sure that what you are trying helps to prevent tampering with a process. If someone attaches a debugger and breaks on the send/receive functions, the checksum of the memory image stays intact, all the shuffling stays as it is, and the client will be seen as valid even when it isn't. The debugger or injected code can also manipulate the answers when you ask which pages are used by your process (so you won't see injected code, since it wouldn't report the pages in which it resides).
To your actual question:
Couldn't the shuffling be implemented by relinking the executable? The linker keeps track of all the symbols that a .o file exports and imports. When all the .o files are read, the real addresses of the functions are put into the imported placeholders. If you put every function in a separate .cpp file, compile each one to its own .o file, and then reorder the .o files in the linker call, all the functions will end up at different addresses and the executable will still run fine.
I tested this with gcc on Windows - and it works. By reordering the .o files when linking, all functions are put at different addresses.

I can find functions by parsing the PDB files, but creating
instruction dependency trees is the tricky part I am totally lost on.
Impossible. Welcome to the Halting Problem.

Related

How to extract the active code path from a complex algorithm

I have been puzzled lately by an intriguing idea.
I wonder if there is a (known) method to extract the executed source code from a large, complex algorithm. I will try to elaborate on this question:
Scenario:
There is this complex algorithm that a large number of people have worked on for many years. The algorithm creates measurement descriptions for a complex measurement device.
The input for the algorithm is a large set of input parameters; let's call this the recipe.
Based on this recipe, the algorithm is executed, and the recipe determines which functions, loops and if-then-else constructions are followed within the algorithm. When the algorithm is finished, a set of calculated measurement parameters will form the output. And with these output measurement parameters the device can perform its measurement.
Now, there is a problem. Since the algorithm has become so complex and large over time, it is very, very difficult to find your way in the algorithm when you want to add new functionality for the recipes. Basically, a person wants to modify only the functions and code blocks that are affected by his or her recipe, but he or she has to dig into the whole algorithm and analyze the code to see which parts are relevant for that recipe; only after that process can new functionality be added in the right place. Even for simple additions, people tend to get lost in the huge amount of complex code.
Solution: Extract the active code path?
I have been brainstorming on this problem, and I think it would be great if there was a way to process the algorithm with the input parameters (the recipe), and to extract only the active functions and code blocks into a new set of source files or code structure. I'm actually talking about extracting real source code here.
When the active code is extracted and isolated, this will result in a subset of source code that is only a fraction of the original source code structure, and it will be much easier for the person to analyze, understand, and modify the code. Eventually the changes could be merged back into the original source code of the algorithm, or maybe the modified extracted source code can also be executed on its own, as if it were a 'lite' version of the original algorithm.
Extra information:
We are talking about an algorithm with C and C++ code, about 200 files, and maybe 100K lines of code. The code is compiled and built with a custom Visual Studio based build environment.
So...:
I really don't know if this idea is just naive and stupid, or if it is feasible with the right amount of software engineering. I can imagine that there have been more similar situations in the world of software engineering, but I just don't know.
I have quite some experience with software engineering, but definitely not on the level of designing large and complex systems.
I would appreciate any kind of answer, suggestion or comment.
Thanks in advance!
Other naysayers say you can't do this. I disagree.
A standard static analysis is to determine control and data flow paths through code. Sometimes such a tool must make assumptions about what might happen, so such analyses tend to be "conservative" and can include more code than the true minimum. But any elimination of irrelevant code sounds like it will help you.
Furthermore, you could extract the control and data flow paths for a particular program input. Then where the extraction algorithm is unsure about what might happen, it can check what the particular input would have caused to happen. This gives a more precise result at the price of having to provide valid inputs to the tool.
Finally, using a test coverage tool, you can relatively easily determine the code exercised by a particular input of interest, and the code exercised by another input for a case that is not so interesting, and compute the set difference. This gives the code exercised by the interesting case that is not in common with the uninteresting case.
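The set-difference step is mechanical once you have the two coverage sets; a sketch, assuming coverage is reported as sets of (file, line) locations (a made-up format for illustration):

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <utility>

using Location = std::pair<std::string, int>;  // (file, line)

// Lines exercised by the interesting input but not by the uninteresting one.
std::set<Location> interestingOnly(const std::set<Location>& interesting,
                                   const std::set<Location>& uninteresting) {
    std::set<Location> diff;
    std::set_difference(interesting.begin(), interesting.end(),
                        uninteresting.begin(), uninteresting.end(),
                        std::inserter(diff, diff.begin()));
    return diff;
}
```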
My company builds program analysis tools (see my bio). We do static analysis to extract control and data flow paths on C++ source code, and could fairly easily light up the code involved. We also make C++ test coverage tools that can collect the interesting and uninteresting sets, and show you the difference superimposed over the source code.
I'm afraid what you are trying is mathematically impossible. The problem is that this
When the algorithm is finished, a set of calculated measurement parameters will form the output.
is impossible to determine by static code analysis.
What you're running into is essentially a variant of the Halting Problem, for which it has been proven that there cannot be an algorithm/program that can determine whether an algorithm passed to it will yield a result in finite time.

Unit tests in systems programming?

I would like to start learning/using unit tests in C++. However, I'm having a hard time applying the concept of tests to my field of programming.
I'm usually not writing functions which follow a predefined input/output pattern, instead, my programming is usually on a level rather close to the operating system.
Some examples are: find out Windows version, create a system restore point, query registry for installed drives, compress a file, or recursively find all .log files older than X days.
I don't see how I could hard-code "results" into a testing function. Are unit tests even possible in my case?
The "result" doesn't have to be a CONSTANT value; it could be something that the code finds out. For example, if you are compressing a file, the result would be a file that, when uncompressed, gives you the original file. So the test would be to take an existing test file, compress it, then uncompress the resulting compressed file, and compare the two files. If the result is "no difference", it's a pass. If the files are not the same, you have a problem of some sort.
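A round-trip test of that shape might look like this, with a toy run-length encoder standing in for whatever real compression routine you are testing (all names here are illustrative, not any particular API):

```cpp
#include <cassert>
#include <string>

// Toy run-length "compressor" standing in for the real routine under test.
// Encodes runs as a digit (1-9) followed by the repeated character.
std::string compress(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t j = i;
        while (j < in.size() && in[j] == in[i] && j - i < 9) ++j;
        out += static_cast<char>('0' + (j - i));
        out += in[i];
        i = j;
    }
    return out;
}

std::string decompress(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2)
        out.append(static_cast<std::size_t>(in[i] - '0'), in[i + 1]);
    return out;
}

// The unit test: compress-then-decompress must reproduce the original.
void testRoundTrip(const std::string& original) {
    assert(decompress(compress(original)) == original);
}
```

The test never hard-codes the compressed bytes; it only asserts the round-trip property, so it keeps working even if the compression format changes.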
The same principle can be applied to any of your other methods. Finding log files would of course require that you prepare a number of files and give them different timestamps (using SetFileTime or some such).
Getting Windows version should give you the version of the Windows you are currently using.
And so on.
Of course, you should also have "negative" tests whenever possible. If you are compressing a file, what happens if you try to compress a file that doesn't exist? What if the disk is full (using a virtual harddisk or similar can help here, as filling your entire disk may not result in something great!). If the specification says the code should behave in a certain way, then verify that it gives the correct error message. Otherwise, at least ensure it doesn't "crash", or fail without an error message of some sort.
You can also have some mocking objects which will fake the OS calls:
You could have a class OS that has methods which mimic the system calls,
so your algorithm doesn't call the global OS functions directly.
Then you can construct a fake OS class that returns some sort of "hard-coded" values for the testing.
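A minimal sketch of that mocking approach (the interface, class names and paths are all made up for illustration):

```cpp
#include <string>

// Interface that wraps the OS calls the algorithm needs. The production
// implementation would call GetVersionEx, GetFileAttributes, and so on.
struct OS {
    virtual ~OS() = default;
    virtual std::string windowsVersion() const = 0;
    virtual bool fileExists(const std::string& path) const = 0;
};

// Fake used in tests: returns hard-coded values instead of touching the OS.
struct FakeOS : OS {
    std::string windowsVersion() const override { return "10.0"; }
    bool fileExists(const std::string& path) const override {
        return path == "C:\\app\\config.ini";
    }
};

// Code under test takes the interface, not the global functions, so the
// test can feed it any environment it likes.
bool needsConfigMigration(const OS& os) {
    return os.windowsVersion() >= "10" &&
           !os.fileExists("C:\\app\\config.ini");
}
```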

Compiling on-demand executables

GoToMeeting's gotomeeting.com/join has an interesting behavior - when you visit a meeting URL directly you're required to download a new exe binary file which, when executed, has the meeting ID already integrated and will auto-launch the program without you needing to input a meeting ID.
My first thought is that this was incorporated into the metadata of the executable, but closer inspection leads me to believe that these exes are compiled with the meeting ID.
So here are a few questions:
Are they building/compiling on the fly?
If so, isn't there massive overhead to implementing this?
This has to be a massive security risk, right?
So assuming I am silly enough to attempt something like this - is there a safe way to be issuing make, etc. from my web-based framework? My gut tells me there isn't.
I've read the following SO questions which tell me that this kind of question is typically met with much ire:
fast on-demand c++ compilation
Silverlight on-demand compilation/Build
I don't see why you say that issuing make from within your web-based framework must necessarily be insecure. It may, but it may not. It will almost certainly be slow and will probably result in unacceptable delays.
The more sensible approach, in my opinion, is to have the executable already compiled, with a "blob" of reserved data in the file into which you substitute the actual data you want, and then sign the resulting file.
This will likely be significantly faster than compiling and easier to implement to boot!
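The "blob" technique can be sketched like this: the prebuilt executable contains a reserved, magic-prefixed buffer, and the server patches the bytes after the magic before serving the file. (The marker and sizes are made up for illustration; a real deployment would also re-sign the file afterwards.)

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// In the prebuilt client executable: a reserved slot the server overwrites.
// The magic prefix lets the patch tool find it in the raw file bytes.
constexpr char kMagic[] = "@@MEETING_ID@@";
char g_meetingSlot[64] = "@@MEETING_ID@@";  // rest is zero-filled padding

// Client side: read whatever the server wrote after the magic.
const char* meetingId() { return g_meetingSlot + sizeof(kMagic) - 1; }

// Server-side patch step: find the magic in the file image and write the
// meeting ID right after it (NUL-terminated).
bool patchMeetingId(std::vector<char>& image, const std::string& id) {
    const std::size_t magicLen = sizeof(kMagic) - 1;
    auto it = std::search(image.begin(), image.end(), kMagic, kMagic + magicLen);
    if (it == image.end()) return false;
    std::size_t offset = static_cast<std::size_t>(it - image.begin()) + magicLen;
    if (offset + id.size() + 1 > image.size()) return false;
    std::memcpy(image.data() + offset, id.c_str(), id.size() + 1);
    return true;
}
```

Patching raw bytes this way leaves the PE layout untouched, which is why it is so much cheaper than recompiling per download.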

Embedded Lua - timing out rogue scripts (e.g. infinite loop) - an example anyone?

I have embedded Lua in a C++ application. I need to be able to prevent rogue (i.e. badly written) scripts from hogging resources.
I know I will not be able to cater for EVERY type of condition that causes a script to run indefinitely, so for now, I am only looking at the straightforward Lua side (i.e. scripting side problems).
I also know that this question has been asked (in various guises) here on SO. Probably the reason why it is constantly being re-asked is that as yet, no one has provided a few lines of code to show how the timeout (for the simple cases like the one I described above), may actually be implemented in working code - rather than talking in generalities, about how it may be implemented.
If anyone has actually implemented this type of functionality in a C++ with embedded Lua application, I (as well as many other people - I'm sure), will be very grateful for a little snippet that shows:
How a timeout can be set (in the C++ side) before running a Lua script
How to raise the timeout event/error (C++ /Lua?)
How to handle the error event/exception (C++ side)
Such a snippet (even pseudocode) would be VERY, VERY useful indeed
You need to address this with a combination of techniques. First, you need to establish a suitable sandbox for the untrusted scripts, with an environment that provides only those global variables and functions that are safe and needed. Second, you need to provide for limitations on memory and CPU usage. Third, you need to explicitly refuse to load pre-compiled bytecode from untrusted sources.
The first point is straightforward to address. There is a fair amount of discussion of sandboxing Lua available at the Lua users wiki, on the mailing list, and here at SO. You are almost certainly already doing this part if you are aware that some scripts are more trusted than others.
The second point is the question you are asking. I'll come back to that in a moment.
The third point has been discussed at the mailing list, but may not have been made very clearly in other media. It has turned out that there are a number of vulnerabilities in the Lua core that are difficult or impossible to address, but which depend on "incorrect" bytecode to exercise. That is, they cannot be exercised from Lua source code, only from pre-compiled and carefully patched byte code. It is straightforward to write a loader that refuses to load any binary bytecode at all.
With those points out of the way, that leaves the question of a denial of service attack either through CPU consumption, memory consumption, or both. First, the bad news. There are no perfect techniques to prevent this. That said, one of the most reliable approaches is to push the Lua interpreter into a separate process and use your platform's security and quota features to limit the capabilities of that process. In the worst case, the run-away process can be killed, with no harm done to the main application. That technique is used by recent versions of Firefox to contain the side-effects of bugs in plugins, so it isn't necessarily as crazy an idea as it sounds.
One interesting complete example is the Lua Live Demo. This is a web page where you can enter Lua sample code, execute it on the server, and see the results. Since the scripts can be entered anonymously from anywhere, they are clearly untrusted. This web application appears to be as secure as can be asked for. Its source kit is available for download from one of the authors of Lua.
Snippet is not a proper use of terminology for what an implementation of this functionality would entail, and that is why you have not seen one. You could use debug hooks to provide callbacks during execution of Lua code. However, interrupting that process after a timeout is non-trivial and dependent upon your specific architecture.
You could consider using a longjmp to a jump buffer set just prior to the lua_call or lua_pcall after catching a time out in a luaHook. Then close that Lua context and handle the exception. The timeout could be implemented numerous ways and you likely already have something in mind that is used elsewhere in your project.
The best way to accomplish this task is to run the interpreter in a separate process. Then use the provided operating system facilities to control the child process. Please refer to RBerteig's excellent answer for more information on that approach.
A very naive and simple, but all-Lua, method of doing it is:
-- Limit may be in the millions range depending on your needs
setfenv(code, sandbox)
pcall(function()
    debug.sethook(function() error("Timeout!") end, "", limit)
    code()
    debug.sethook()
end)
I expect you can achieve the same through the C API.
However, there's a good number of problems with this method. Set the limit too low, and it can't do its job. Too high, and it's not really effective. (Can the chunk get run repeatedly?) Allow the code to call a function that blocks for a significant amount of time, and the above is meaningless. Allow it to do any kind of pcall, and it can trap the error on its own. And whatever other problems I haven't thought of yet. Here I'm also plain ignoring the warnings against using the debug library for anything (besides debugging).
Thus, if you want it reliable, you should probably go with RB's solution.
I expect it will work quite well against accidental infinite loops, the kind that beginning Lua programmers are so fond of :P
For memory overuse, you could do the same with a function checking for increases in collectgarbage("count") at far smaller intervals; you'd have to merge them to get both.

Updating a codebase to meet standards

If you've got a codebase which is a bit messy in respect to coding standards - a mix of different conventions from different people - is it reasonable to give one person the task of going through every file and bringing it up to meet standards?
As well as being tremendously dull, you're going to get a mass of changes in SVN (or whatever) which can make comparing versions harder. Is it sensible to set someone on the whole codebase, or is it considered stupid to touch a file only to make it meet standards? Should files be left alone until some 'real' change is needed, and then updated?
Tagged as C++ since I think different languages have different automated tools for this.
Should files be left alone until some 'real' change is needed, and then updated?
This is what I would do.
Even if it's primarily text layout changes, doing it by a manual process on a large scale risks breaking code that was working.
Treat it as a refactor and do it locally whenever code has to be touched for some other reason. Add tests if they're missing to improve your chances of not breaking the code.
If your code is already well covered by tests, you might get away with something global, but I still wouldn't advocate it.
I also think this is pretty much language-agnostic.
It also depends on what kind of changes you are planning to make in order to bring it up to your coding standard. Everyone's definition of coding standard is different.
More specifically:
Can your proposed changes be made to the project with a 100% guarantee that the entire project will work exactly as before? For example, changes that only affect comments, line breaks and whitespace should be fine.
If you do not have 100% guarantee, then there is a risk that should not be taken unless it can be balanced with a benefit. For example, is there a need to gain a deeper understanding of the current code base in order to continue its development, or fix its bugs? Is the jumble of coding conventions preventing these initiatives? If so, evaluate the costs and benefits and decide whether a makeover is justified.
If you need to understand the current code base, here is a technique: tracing.
Make a copy of the code base. Note that tracing involves adding code, so it should not be performed on the production copy.
In the new copy, insert many fprintf (trace) statements into any functions considered critical. It may be possible to automate this.
Run the project with various inputs and collect those tracing results. This will help everyone understand the current project's design.
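One way to do those trace statements in C/C++ is with a macro, so that inserting them can be scripted; everything here (macro name, counter, the sample function) is illustrative:

```cpp
#include <cstdio>

// Number of trace hits, handy for a quick sanity check after a run.
static int g_traceHits = 0;

// Logs file, line and function to stderr; the greppable "TRACE" prefix
// makes it easy to collect the executed path from the log afterwards.
#define TRACE() \
    (++g_traceHits, \
     std::fprintf(stderr, "TRACE %s:%d %s\n", __FILE__, __LINE__, __func__))

// Example of an instrumented function: one TRACE() per branch reveals
// which path a given recipe actually takes.
int computeStep(int recipe) {
    TRACE();
    if (recipe > 10) {
        TRACE();
        return recipe * 2;
    }
    TRACE();
    return recipe + 1;
}
```

Running the instrumented build and sorting the unique "TRACE" lines gives you exactly the set of functions and branches the recipe exercised.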
Another technique for understanding the current code base is to document the dependencies in the project.
Some kinds of dependencies (interface dependency, C++ include dependency, C++ typedef / identifier dependency) can be extracted by automated tools.
Run-time dependency can only be extracted through tracing, or by profiling tools.
I was thinking it's a task you might give a work-experience kid or put out onto RentaCoder
This depends mainly on the codebase's size.
I've seen three trainees given the task to go through a 2MLoC codebase (several thousand source files) in order to insert one new line into the standard disclaimer at the top of all the source files (with the line's content depending on the file's name and path). It took them several days. One of the three used most of that time to write a script that would do it and later only fixed the files where the script had failed to insert the line correctly, the other two just ploughed through the files. (The one who wrote the script later got a job at that company.)
The job of manually adapting all those files in that codebase to certain coding standards would probably have to be measured in man-years.
OTOH, if it's just a few dozen files, it's certainly doable.
Your codebase is very likely somewhere in between, so your best bet might be to set a "work-experience kid" to find out whether there's a tool that can do this to your satisfaction and, if so, make it work.
Should files be left alone until some 'real' change is needed, and then updated?
I'd strongly advise against this. If you do this, you will have "real" changes intermingled with whatever reformatting took place, making it nigh impossible to see the "real" changes in the diff.
You can address the formatting aspect of coding style fairly easily. There are a number of tools that can auto-format your code. I recommend hooking one of these up to your version control tool's "check in" feature. This way, people can use whatever format they want while editing their code, but when it gets checked in, it's reformatted to the official style.
In general, I think it's best if you can do the big change all at once. In the past, we've done the following:
1. have a time dedicated to the reformatting when most people aren't working (e.g. at night or on the weekend)
2. have a person check out as many files as possible at that time, reformat them, and check them in again
With a reformatting-only revision, you don't have to figure out what has changed in addition to the formatting.