C++ execute function from memory [closed]

I recently thought about a precompilable scripting language, which would be translated to machine code during program loading.
Let's say that I can generate this binary function myself. Now I need to somehow execute it. The general scheme would look like this:
char* binary = compile("script.sc");
pushArgsToStack(1,7);
memexec(binary);
int ret = getEax();
Is there any chance of getting this to work?
Also, would calling jmp to a C++ function address work as planned? I mean, after pushing the args, the return address and so on, I want to somehow call that function from my compiled script.
Thanks for any answers

This certainly can be done.
The biggest part will be the compile function, which, unless your ".sc" language is VERY trivial, will require quite a bit of work. You may want to have a look at, for example, LLVM, which allows you to generate code from an intermediate virtual machine instruction set. It adds a lot of code, but makes your life a bit easier when it comes to generating (reasonably good) instructions.
You can't really push arguments from a separate function - whatever that function pushes is removed again when it returns. You would have to generate the push instructions as part of the "compile" process.
And you should be able to do:
int ret = memexec(binary);
You probably want to write memexec in assembler, and perhaps have it take multiple arguments (but you'd still have the problem of what type those arguments are, so some sort of list of arguments with type information is probably really what is required - or always pass arguments as strings, or some such).
Assuming you have an operating system made in the last 15-20 years, you will also need to allocate memory with "execute rights", which requires something other than malloc. Depending on the OS, the calls you need will vary.
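For example, on Linux a minimal sketch of memexec could use mmap/mprotect (the two-int signature of the generated code and the buffer layout are assumptions here; on Windows the equivalents are VirtualAlloc/VirtualProtect):
#include <sys/mman.h>
#include <cstring>

// binary: machine code for a function that takes two ints and returns an int
int memexec(const char* binary, std::size_t size, int a, int b) {
    // allocate a writable page first...
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    std::memcpy(mem, binary, size);
    // ...then flip it to read+execute before jumping into it
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) { munmap(mem, size); return -1; }
    using Fn = int (*)(int, int);
    int ret = reinterpret_cast<Fn>(mem)(a, b);
    munmap(mem, size);
    return ret;
}
Calling the buffer through a function pointer like this also replaces the separate pushArgsToStack/getEax steps: the arguments and the return value go through the normal calling convention.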


Is it more "correct" to disable a function with a special value or a second parameter? [closed]

Say I have some function that I may or may not want to execute during a particular run of the code, and this function also takes some argument that is passed in as a parameter at runtime. Is there some guidance as to whether a second variable to enable/disable the execution of the function is warranted, or is a "special value" of the argument just as good? I like the idea of reducing the number of parameters but I can understand how the former might be more readable.
For example, consider a 'Delay' function (which we may or may not want to run and) which accepts 1 floating point argument for the length of the delay. We can then check first that the argument-parameter is positive, and if it is not, we can not bother calling the function at all. Is this bad code?
I generally write in C/C++ if that matters.
Given a predicate p, the first option is to conditionally call the function:
if(p) Delay(42);
This makes the calling code a bit more verbose, but you don't have the overhead of the function call. If the code uses lots of conditions along those lines, it may obscure what is going on. Error handling, or ported code with lots of conditionally enabled blocks behind macros, comes to mind.
The 2nd option is to have a special value indicating that Delay() shouldn't do anything:
float t = 42;
if(!p) {
t = 0;
}
...
Delay(t);
This means that disabling the feature now lives both in the caller and in the callee, which I would consider a negative. On the other hand, the special case may be an implementation detail of Delay(): say you figure out that the function call overhead is delta, then you might have logic along these lines:
if(arg <= delta) return;
Now you are merely making use of that particular implementation detail.
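A tiny sketch of that (the delta value below is a made-up placeholder for whatever overhead you actually measure):
void Delay(float seconds) {
    const float delta = 1e-6f;      // assumed measured call overhead
    if (seconds <= delta) return;   // the "special value" is handled inside the callee
    // ... perform the actual delay ...
}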
check first that the argument-parameter is positive, and if it is not, we can not bother calling the function at all
This seems like a poor design: we call the function, but some extra logic is there to decide, based on the current state, whether we should actually call it.
However, if you want a single argument that defines the delay value but can also mean there is no delay at all, you could use std::optional.
Still, it seems like a bad idea to mix extra logic into this quite straightforward delay function. I'd focus on calling the delay function only when it's really needed.
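A minimal sketch of the std::optional variant (the body of Delay is left as a placeholder):
#include <optional>

void Delay(std::optional<float> seconds) {
    if (!seconds) return;           // "no delay" is explicit at the call site
    // ... perform the actual delay using *seconds ...
}

// Callers:
//   Delay(42.0f);          // delay requested
//   Delay(std::nullopt);   // explicitly no delay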

Authentication via command line [closed]

I want to provide a binary-only program, written in C or C++, where the user has to pass a valid code via the command line to be able to use some extra features of the program. The idea is to implement a verification strategy in the program which compares the passed code against a run-time-generated code that uniquely identifies the system or hardware the program is being run on.
In other words, if and only if the run-time check:
f(<sysinfo>) == <given code>
is true, then the user is allowed to use the extra features of the program. f is the function generating the code at run time and sysinfo is an appropriate piece of information identifying the current system/hardware (e.g. the MAC address of the first ethernet card, the serial number of the processor, etc.).
The aim is to make it as difficult as possible for the user to guess a valid code (or guess how to calculate one) without knowing f and sysinfo a priori. More importantly, I want it to be difficult to re-implement f by analyzing the disassembled code of the program.
Assuming the above is a strong strategy, how could I implement f in C or C++, and what can I choose as its argument? Also, what GCC compiler flags could I turn on to obfuscate f specifically? Note that, for example, things like MD5(MAC) or MD5(SHA(MAC)) would be too simple, for obvious reasons.
EDIT: Another interesting point is how to make it difficult for the user to attack the code directly by removing or bypassing the portion of the code doing the check.
If you are on Windows, a standard strategy is to hash the value of the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Cryptography\MachineGuid
If you're worried that a user might "guess" the hash function, take a standard SHA-256 implementation and do something sneaky like changing the algorithm's initialization values (one of the two groups of constants is initialized from binary representations of the cube roots of the primes - change it to 5th or 7th or whatever roots, starting at the nth place, so that you chop off the "all-zero" parts, etc.).
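As a rough Windows-only sketch of the overall shape (read_machine_guid and code_is_valid are made-up names, and std::hash is only a stand-in for your modified SHA-256; link against advapi32):
#include <windows.h>
#include <functional>
#include <string>

std::string read_machine_guid() {
    char buf[64] = {0};
    DWORD size = sizeof(buf);
    if (RegGetValueA(HKEY_LOCAL_MACHINE, "SOFTWARE\\Microsoft\\Cryptography",
                     "MachineGuid", RRF_RT_REG_SZ, nullptr, buf, &size) != ERROR_SUCCESS)
        return {};
    return buf;
}

bool code_is_valid(const std::string& givenCode) {
    // placeholder f(): swap std::hash for the tweaked SHA-256 described above
    const auto h = std::hash<std::string>{}(read_machine_guid() + "per-product-salt");
    return std::to_string(h) == givenCode;   // f(<sysinfo>) == <given code>
}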
But really, if someone is going to take the time to RE your code, it's much easier to attack the branch that does the if (codeValid) { allowExtraFeatures(); } than to mess with the hashes... so don't worry too much about it.

Where to store code constants when writing a JIT compiler? [closed]

I am writing a JIT compiler for x86-64 and I have a question regarding best practice for inclusion of constants into the machine code I am generating.
My approach thus far is straightforward:
Allocate a chunk of RW memory with VirtualAlloc or mmap
Load the machine code into said memory region.
Mark the page executable with VirtualProtect or mprotect (and remove the write privilege for security).
Execute.
When I am generating the code, I have to include constants (numerical, strings) and I am not sure what is the best way to go about it. I have several approaches in mind:
Store all constants as immediate values in the instructions' opcodes. This seems like a bad idea for everything except maybe small scalar values.
Allocate a separate memory region for constants. This seems to me like the best idea, but it slightly complicates memory management and the compilation workflow - I have to know the memory location before I can start writing the executable code. I am also not sure whether this hurts performance due to worse memory locality.
Store the constants in the same region as the code and access them with RIP-relative addressing. I like this approach since it keeps related parts of the program together, but I feel slightly uneasy about mixing instructions and data.
Something completely different?
What is the preferable way to go about this?
A lot depends on how you are generating your binary code. If you use a JIT assembler that handles labels and figures out offsets, things are pretty easy. You can stick the constants in a block after the end of the code, using pc-relative references to those labels, and end up with a single block of bytes containing both the code and the constants (easy management). If you're trying to generate binary code on the fly, you already have the problem of figuring out how to handle forward pc-relative references (e.g. for forward branches). If you use back-patching, you need to extend that to support references to your constants block.
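As a sketch of what one such pc-relative reference might look like when emitted by hand, assuming you already know the constant's final offset inside the same buffer (the helper name and the vector-of-bytes buffer are my own; if the constant comes later, the disp32 would be back-patched just like a forward branch):
#include <cstdint>
#include <cstring>
#include <vector>

// Emit `lea rax, [rip + disp32]` so that rax ends up pointing at a constant
// stored at const_offset within the same code buffer.
void emit_lea_rax_const(std::vector<uint8_t>& code, std::size_t const_offset) {
    const uint8_t opcode[] = {0x48, 0x8D, 0x05};                  // REX.W lea r64, [rip+disp32]
    std::size_t end_of_instr = code.size() + sizeof(opcode) + 4;  // disp is relative to the next instruction
    int32_t disp = static_cast<int32_t>(const_offset - end_of_instr);
    code.insert(code.end(), opcode, opcode + sizeof(opcode));
    uint8_t bytes[4];
    std::memcpy(bytes, &disp, sizeof(bytes));                     // x86-64 is little-endian
    code.insert(code.end(), bytes, bytes + sizeof(bytes));
}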
You can avoid the pc-relative offset calculations by putting the constants in a separate block and passing the address of that block as a parameter to your code. This is pretty much the "Allocate a separate region for constants" you propose. You don't need to know the address of the block if you pass it in as an argument.
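Conceptually, that second approach boils down to giving the generated function one extra parameter (the pool layout and names below are invented; an ordinary C++ function stands in for the JITted code just to show the calling convention):
#include <cstdio>

// Layout agreed between the compiler and the generated code.
struct ConstPool {
    double pi;
    const char* greeting;
};

// The generated code only ever uses [pool + offset] addressing.
double fake_jitted(const ConstPool* pool, double x) {
    std::printf("%s\n", pool->greeting);
    return pool->pi * x;
}

int main() {
    ConstPool pool{3.141592653589793, "hello from the constant pool"};
    using JitEntry = double (*)(const ConstPool*, double);
    JitEntry entry = fake_jitted;        // in the real thing: the address of the JITted buffer
    std::printf("%f\n", entry(&pool, 2.0));
}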

Produce Large Object file from Smaller source file [closed]

I need to stress test my program with large object files. I have looked into C++ templates and inline functions/templates, but have not been able to get the desired object/source size ratio I want (about 50). I want a single source file compiled to a single object file, with the latter maxing out around 200 MB. Any high-level ideas would be greatly appreciated.
Additional edit: I have created large/complex and diverse (random) template functions and have started calling them (creating instantiations) with unique types/parameters. This increased the object/source ratio, as expected, up to a certain point (around 12). Then the ratio dropped significantly (to about 1), which I assume is gcc outsmarting me and optimizing my methods away. I have also looked into forcing gcc to make all functions inline, but my tests haven't shown improvements from that yet either.
Using the preprocessor to create code bloat is not a valid technique for what I wish to accomplish.
You could use the preprocessor to generate lots and lots of code at preprocessing time, but that might count, for your purposes, as the source file itself being large.
Generally speaking, the machine code that a C++ compiler produces is relatively small (in bytes) compared to the source code that produced it.
One thing that does hog up space is string resources. Now, if you use the same string over and over again the compiler will be smart enough to realise that it only needs to store it once, but if you change the resource a little bit each time then each variation will probably be stored separately. This changing can be done using the preprocessor.
Another idea is like you said, using templates to generate lots of functions for a lot of different types.
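As a sketch of the template idea (the recursion depth, the body, and the compiler knobs are all things you would have to tune; gcc may still fold much of this at higher optimization levels, as you observed):
#include <cstdio>

// Each instantiation is a distinct function, and the N-dependent arithmetic
// makes the bodies harder to merge.
template <int N>
long bloat(long x) {
    long acc = x;
    for (int i = 0; i < N % 17 + 1; ++i)
        acc = acc * 31 + N + i;
    return acc + bloat<N - 1>(acc);
}
template <>
long bloat<0>(long x) { return x; }

int main() {
    // Deeper chains may need e.g. -ftemplate-depth=2000 on gcc/clang,
    // and -O0 to keep the compiler from collapsing the instantiations.
    std::printf("%ld\n", bloat<800>(1));
}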
What is it that you want to accomplish? There might be better ways.

What is the best way of replacing a system in a very complex program? [closed]

I have been given the task of replacing a library of classes that are all associated with the same thing. However, this thing permeates the rest of the code to a huge degree. I have been trying to simply comment it all out, but it is taking forever!
Is there a better way? The new system is somewhat similar but not nearly similar enough to just replace the old one.
What's the best plan of attack?
edit - My main concern is this -
what if I comment out every reference to the old code, and then find that, because of the complexity of the system, it still doesn't run? Have I then wasted all that time?
If you're worried that the code won't run after all this surgery, then the goal must be to modify the system gradually and reversibly, verifying that it's still working at every step. Primum non nocere.
If you have a good set of unit tests (which I doubt very much, from the sound of this project), you should be in the habit of running them every few minutes. Otherwise you can at least cobble up a regression test of your own: run the code on a typical set of input data and take a checksum of the output - if the checksum changes, then you broke something since the last time you ran the test, so rewind to that point (you do use version control, don't you?) and proceed with care. The longer the test takes to run, the less often you can afford to run it, but it should run nightly at least.
The old Thing has not remained encapsulated (if it ever was to begin with). The rest of the code knows too much about the implementation of oldThing, making a simple swapout with newThing impossible. So clean up the interface. Look over the public declarations of oldThing (including whatever base classes are exposed) and consider whether each one is something the world really needs to know about-- if not, put in an accessor/mutator, or revise the class tree, or whatever. Isolate the implementation from the interface.
While you're doing that, look at the public interface of newThing; it should be clean and abstract, like what you're trying to achieve with oldThing (if it's a mess, then you have a whole other set of problems). With some effort you can guide the changes in the oldThing interface to match what newThing has.
As that starts to come together, the task of swapping out old for new will start to look feasible. In the end you'll be able to do it by changing a single #include statement and a single word in the makefile, if you want to go that far.
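A toy illustration of that end state (the class names and the switch mechanism are invented; in a real project the switch would be the single #include or makefile word described above):
#include <cstdio>

struct OldThing { void frob() { std::printf("old implementation\n"); } };
struct NewThing { void frob() { std::printf("new implementation\n"); } };

// The rest of the code only ever names Thing; the concrete class is chosen
// in exactly one place.
#ifdef USE_NEW_THING
using Thing = NewThing;
#else
using Thing = OldThing;
#endif

int main() {
    Thing t;
    t.frob();
}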
You could throw away all the base libraries you don't want to use anymore, and walk through the resulting error list.
The unresolved references will lead you to the places where you need to get active. If you have replacement patterns for the old calling patterns, that helps.
If you don't want to shuffle through your 60k files, I'd suggest implementing a dummy version of the existing classes: remove all the code from the original classes and replace every class member with this kind of macro:
#include <cstdio>

#define DEPRECATED( function, file, line ) std::printf("Unsupported %s call in %s line %d\n", function, file, line )
#define DEPRECATED_METHOD_WRAPPER(type, X) type X { DEPRECATED( #X, __FILE__, __LINE__ ); return (type)0; }

class OldClass
{
public:
    OldClass() { DEPRECATED( "OldClass", __FILE__, __LINE__ ); }
    // original method:
    // int doSomeStuff(int a, void *b);
    // deprecated stub:
    DEPRECATED_METHOD_WRAPPER(int, doSomeStuff(int a, void *b) );
};
Now, when your big program calls your deprecated library, you'll see in the traces:
the place where it's called
the method that is called
AND, you don't have to touch the original files for the moment.
Your program will run without failing, but now your work will be to remove the references to your old classes... at least you'll get a nice reminder of where they are, without perturbing the flow too much.