Dynamic load of object file to execute - c++

I was wondering what the steps are to take an object file (generated from a single source file by the MSVC compiler), load it into the memory of my already running program (in a buffer, for example), and then run the code inside it.
The use case is that I have a large program which takes a minute to load, and I want to make real-time modifications from source code: just load the object file, fix up some addresses in it, and use the -hotpatch feature to intercept calls in my already running process and redirect them to my object file.
Seems to me that I should just resolve the import table of the object file to point to my already loaded programs and intercept the call of the functions which have been modified.
Am I missing something ? I would like to ask before trying it to not waste time on something that may be impossible !
Thanks !

To answer the direct question (about loading and executing an obj file): this essentially amounts to rewriting a linker, which is all but impossible.
As for (what I can figure out of) your intended usage: dynamically loading and executing an obj file wouldn't get you any closer to intercepting calls in your already running process. What you want is probably hooking. There are a lot (no, seriously, a lot) of ways to do so. Detours is the more-or-less official way to achieve this, here's a presentation of a few ways from the exotic side of the spectrum.
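To make the idea of hooking concrete, here is a minimal, portable sketch of its simplest form: every caller goes through a function pointer, and swapping that pointer redirects all later calls. This is not how Detours does it (as far as I know, Detours patches the first instructions of the target function in place); the names greet_v1, greet_v2 and greet_hook are made up for illustration.

```cpp
#include <string>

// Original implementation and a replacement (e.g. freshly rebuilt code).
// Both names are hypothetical.
std::string greet_v1() { return "old"; }
std::string greet_v2() { return "new"; }

// All callers go through this pointer, so re-pointing it redirects every call.
std::string (*greet_hook)() = greet_v1;

// The stable entry point the rest of the program calls.
std::string greet() { return greet_hook(); }
```

Assigning `greet_hook = greet_v2;` at run time makes every later `greet()` call run the replacement, without touching the callers.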


Do Memory Mapped Files need Mutex when they are read only?

Recently, something happened with our windows c/c++ applications.
We use a DLL to map files to page file, and our applications read these shared files through memory mapping.
Everything is OK when we just run a single instance of application.
Sometimes we get nothing (just zeros), but no error or exception, from the mapped memory when we run 24 instances at the same time.
It seems that this problem happens more often on slower storage devices.
If the files are stored on a slower device (say, EFS on AWS), about 6 of the 24 instances hit this problem every time.
But if we move the files to EBS on AWS, only 1 or 2 of the 24 instances hit it, and not every time.
I guess there may be some conflicts under heavy concurrent access?
Do I need mutex for these read only files?
The mutex is just for protecting writable objects, am I right?
More information:
Everything happened INSIDE that DLL.
EXEs just use this DLL to get TRUE or FALSE.
The DLL is used to judge whether some given data belong to a certain file.
Some structs describe the layout of the files. The problem is that a certain struct reads as 0 when it should not, but not every time.
I logged the parameters inside the DLL; they are passed to the DLL correctly every time.
I still don't know how or why this happens, but I found that I can avoid the problem simply by adding a retry to that judge function.
I still think this is some kind of I/O problem because a retry avoids it, but I have no further evidence.
And, maybe the title is not very proper to this problem so I think it's time to close it.
Finally, I figured it out.
This is NOT a memory mapped file problem, it is a LOGICAL problem.
Our DLL does not have sufficient privileges, so when we shared our data into memory, NOBODY could see it!
And our applications are designed to load the data themselves if they cannot find any shared data, which explains the difference between EFS and EBS!
These applications are very old, no documents left, and nobody knows how they are working, so I had to dig information from source code ...

A Function in an application(.exe) should be called only once regardless of how many times I run the same application

Suppose there are two functions, one that prints "hello" and another that prints "world", and I call both from the main function. Compiling this creates a .exe file. When I run this .exe for the first time, the two functions print "hello world" and the program terminates.
But if I run the same .exe a second time, or any number of times after that, only one function must execute, i.e. it should print only "world". I want a piece of code or a function that runs only once and after that destroys itself, so that it is never executed again regardless of how many times I run the application (.exe).
I can achieve this by writing a value once to a local file or to the Windows registry and checking whether that value is present; if it is, there is no need to execute this piece of code or function.
Can I achieve it without any external help that the application itself should be capable of performing this behaviour?
Any ideas are appreciated. thanks for reading
There is no coherent or portable way [1] to do this from software without using an external resource of some kind.
The issue is that you want each invocation of this process to know how many times it has been executed, but that count is not recorded anywhere [2]. A program has no memory of its previous executions unless you program it to keep one.
Your best bet is to write this information to some canonical location so that it can be read on later executions. This could be a file in the filesystem (such as a hidden .firstrun file), the registry (Windows-specific), or some other environment-specific form of communication.
The main thing is that this must persist between executions and be available to your process.
[1] You could potentially write code that overwrites the executable itself after the first invocation, but this is extraordinarily brittle and highly specific to the executable format. It is neither an ideal nor a recommended approach to this problem.
[2] This is not a capability defined in the C or C++ standard. It's possible that some specialized operating systems or flavors of Linux allow querying this, but it is not something seen in most general-purpose operating systems. Generally the approach is to communicate via an external resource.
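As a rough illustration of the marker-file idea, here is a minimal sketch in standard C++. The function names and the marker path are assumptions made up for this example; a real program would pick a per-user location and handle write failures.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Returns true only the first time it is called for a given marker path.
// The marker file is the "external resource" that persists between runs.
bool first_run(const std::string& marker_path) {
    if (std::ifstream(marker_path).good()) {
        return false;  // marker already exists: not the first run
    }
    std::ofstream marker(marker_path);  // create the marker for later runs
    marker << "done\n";
    return true;
}

// Prints "hello " only on the first run, and "world" on every run.
void greet(const std::string& marker_path) {
    if (first_run(marker_path)) {
        std::cout << "hello ";
    }
    std::cout << "world\n";
}
```

Calling `greet(".firstrun")` from main would give the behaviour asked for, as long as the marker file is not deleted between runs.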
"Can I achieve it without any external help that the application itself should be capable of performing this behaviour?"
Not by any means defined by C or C++, and probably not on Windows at all.
You have to somehow, somewhere memorialize the fact that the one-time function has been called. If you have nothing but the compiled program to use for that, then the only alternative is to modify the program. Neither C nor C++ provides for live modification of the running program, much less for writing it back to the executable file containing its image.
Conceivably, if the program knows where to find or how to recreate its own source code, and if it knows how to run the compiler, then it could compile a modified version of itself. On Windows, however, it very likely could not overwrite its own executable file while it was running (though that would be possible on various other operating systems), so that would not solve the problem.
Moreover, note that any approach that involves modifying the executable would be at least a bit wonky, for different copies of the program would have their own, semi-independent idea of whether the one-time function had been run.
Basically, then, no.

fork() and free all allocated memory

I'm writing a service (i.e. background process) and want to offer starting it via a shared library. That is, someone wanting to use the service would link to the shared library and call its start() method, which would fork and return. The fork would then run the service.
The problem with this approach is that the service process might now hold a lot of inherited allocated memory it doesn't actually need. Is there a way to get rid of that and have the forked process allocate its own stuff? I know about exec() of course, but the problems with that are:
1. I need an executable, which might not be in the location I expect due to different operating system folder layouts.
2. I'd have to convert all potential parameters to strings to pass them as program arguments to exec().
So basically, I'm looking for a way to call an arbitrary function func() with some parameters that should run in a new process, and everything not passed into that function shouldn't be in the new process. Is there a way to achieve this or something similar?
This is an interesting question that I sadly don't have a good answer for. I doubt any cleanup strategies like sbrk+close+munmap will reliably allow any libc based code to continue to function, so I'd try to make exec'ing better:
For any kind of exec based solution, you should be able to deep-copy data into shm to pass non-strings. This should take care of your second problem.
Here are some wild suggestions for your first issue:
1. Don't: just require that an executable is in PATH or a compile-time directory. This is transparent and follows the UNIX philosophy. An error message "Can't find myhelper in PATH" will not slow anyone down. Most tools depending on helper executables do this, and it's fine.
2. Make your library executable, and use that as your exec target. You can try finding its name with some kind of introspection, perhaps /proc/self/maps or whatever glibc offers.
3. Like above, but exec python or something you can be reasonably sure exists, and use a foreign pointer interface to run a function on your library.
4. As part of your build process, compile a tiny executable and include it as binary data in your library. Write it to /tmp and execute.
Out of these, I prefer the simplicity and transparency of #1, even if that's the most boring solution.
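A minimal POSIX sketch of suggestion #1 might look like the following. The function name run_helper and the single string argument are assumptions for illustration. The sketch waits for the helper so that its exit status can be observed, which a real background service would not do.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

// Forks, then replaces the child with `helper`, which is looked up in PATH.
// Returns the helper's exit status, 127 if it could not be executed,
// or -1 if fork/wait failed or the child was killed by a signal.
int run_helper(const char* helper, const char* arg) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // Child: exec discards the inherited heap and mappings of the parent.
        execlp(helper, helper, arg, static_cast<char*>(nullptr));
        // Only reached if exec failed, e.g. the helper is not in PATH.
        std::fprintf(stderr, "Can't find %s in PATH\n", helper);
        _exit(127);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child calls exec right away, none of the caller's allocations survive in the service process, which is the point of the exercise.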

Store data in executable

I've been curious about this for a long time.
Is it possible for an application to store some changeable data (like configurations and options) inside its own executable?
For example: is it possible to design a single executable such that if a user runs it, sets some configuration, and copies it to another PC, the application runs with its last-set configuration on the new PC?
Is this possible by any means?
Update: it seems that it's possible. Then how?
Yes and no -
Yes, there's plenty of space in an executable image you can put data. You can add a pre-initialised data segment for this, say, and write the data into there; or a resource, or you can abuse some of the segment padding space to store values in. You control the linker settings so you can guarantee there will be space.
No, you probably can't do this at run-time:
Windows' caching mechanism will lock the files on disk of any executable loaded. This is so that it doesn't need to worry about writing out the data into cache if it ever needs to unload a segment - it can guarantee that it can get the same data back from the same location on disk. You may be able to get around this by running with one of the .exe load copy-to-temp flags (from CD, from Network) if the OS actually respects that, or you can write out a helper exe to temp to transfer control to, unload the original and then modify the unloaded file. (This is much easier on Linux etc. where inodes are effectively a reference count - even if they have the same default locking strategy you can copy your executable, edit the settings into the copy and then move it over the original whilst still executing.)
Virus checkers will almost certainly jump on you for this.
In general I think it's a much better idea to just write settings to the registry or somewhere similar, and provide an import/export settings option if you think it's needed.
Expanding on the 'how' part -
In order to know where to write the data into your file you've got two or three options really:
Use a magic string, e.g. declare a global static variable with a known sequence at the start, e.g. "---my data here---", followed by enough empty space to store your settings in. Open the file on disk, scan it for that sequence (taking care that the scanning code doesn't actually contain the string in one piece, i.e. so you don't find the scanning code instead) - then you've found your buffer to write to. When the modified copy is executed it'll have the data already in your global static.
Understand and parse the executable header data in your binary to find the location you've used. One way would be to add a named section to your binary in the linker, e.g. a 4K section called 'mySettings', flagged as initialised data. You can (although this is beyond my knowledge) wire this up as an external buffer that you refer to by name in your code to read from. To write, find the section table in the executable headers, find the entry called 'mySettings', and you'll have the offset in the binary that you need to modify.
Hard-code the offset of the buffer that you need to read / write. Build the file once, find the offset in a hex editor and then hard-code it into your program. Since program segments are usually rounded up to 4K you'll probably get away with the same hard-coded value through minor changes, though it may well just change underneath you.
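The magic-string option can be sketched roughly as follows. This operates on an in-memory copy of the file's bytes; reading the executable from disk and writing the modified copy back are left out. The function names, and the capacity parameter for the reserved space, are assumptions for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Builds the marker from two pieces so that this scanning code does not
// itself contain the marker as one contiguous string.
std::string make_marker() {
    return std::string("---my data") + " here---";
}

// Overwrites the `capacity` bytes that follow the marker with `settings`,
// zero-padded. Returns false if the marker is missing or the space is too small.
bool patch_settings(std::vector<char>& image, const std::string& settings,
                    std::size_t capacity) {
    const std::string marker = make_marker();
    auto it = std::search(image.begin(), image.end(),
                          marker.begin(), marker.end());
    if (it == image.end()) {
        return false;  // marker not found
    }
    auto payload = it + static_cast<std::ptrdiff_t>(marker.size());
    const std::size_t available = static_cast<std::size_t>(image.end() - payload);
    if (available < capacity || settings.size() > capacity) {
        return false;  // not enough reserved space
    }
    std::fill(payload, payload + static_cast<std::ptrdiff_t>(capacity), '\0');
    std::copy(settings.begin(), settings.end(), payload);
    return true;
}
```

The image keeps its size, so nothing after the reserved buffer moves; only the bytes inside the buffer change.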
Ya, you can do it.
It's risky.
You could screw up and make the app unrunnable.
Modifying executables is something that viruses and trojans tend to do.
It is likely that the user's virus scanner will notice, stop it, and brand you as an evildoer.
I know a little bit about evil :)
In the case of Windows PE files, you can write data at the end of the file. You need to know the original EXE size before writing your own data so that, from the second write onwards, you know the position in the EXE file at which to start writing.
Also, you can't modify the file while it's running. Your main program needs to extract and run a temporary EXE somewhere so that, once the main program has finished, the temporary EXE writes the configuration to the main EXE file.
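A rough sketch of the append-at-the-end idea, using plain fstream in binary mode. It assumes the original size of the executable is known to the caller (original_size is a parameter here), and it must be run against a file that is not currently executing. The function names are made up for illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Reads the whole file as raw bytes (empty string if it cannot be opened).
std::string read_all(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Keeps the first `original_size` bytes (the real executable) and replaces
// whatever follows them with `config`.
bool write_tail(const std::string& path, std::size_t original_size,
                const std::string& config) {
    std::string bytes = read_all(path);
    if (bytes.size() < original_size) {
        return false;  // file is shorter than the executable should be
    }
    bytes.resize(original_size);  // drop any previously appended data
    bytes += config;
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return out.good();
}

// Returns the data stored after the first `original_size` bytes, if any.
std::string read_tail(const std::string& path, std::size_t original_size) {
    const std::string bytes = read_all(path);
    return bytes.size() > original_size ? bytes.substr(original_size)
                                        : std::string();
}
```

Rewriting the tail from original_size each time means a shorter second configuration does not leave stale bytes from the first one behind.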
Yes, it's possible. You probably shouldn't do it.
Mac OS X does have the concept of "bundles", where an executable and its resources are combined into one "package" (a file ending in .app), but I'm not sure it's typical for applications to modify their own bundles, and most other operating systems don't work that way either as far as I know. It's more of a facility to store images, audio and so forth along with the code, as opposed to storing configuration data that is going to be modified when the program runs.
Modifying the executable file while it's running is a pain. The task is further complicated by any compiler optimizations your compiler may apply since it changes the structure of the program and might not allow you to have an "empty space" in which to write.
Difficult. Difficult. Difficult.
But to do this you basically have to read the file into a buffer, or into another file. You can use fstream directly, but make sure you use the ios::binary flag. Then append to the buffer or file. Actually appending the data is horribly simple; the problem lies in a program adding to itself.
Here's what I'd do:
First write a program to pack programs into other programs. You probably possess the knowledge already. Once you have that, have it pack itself into another program; be sure you've arranged for outside messaging or passing of arguments. Then your main program can simply unpack that program and pass it the path of a (temporary) file you create containing the data you would like to append to yourself. Kill your current program. Let the slave append the data and call your program again.
blam appended executable.

Dynamic binary from file

This is a bit of a weird problem.
Say I have some C++ code running on a specific platform. The only purpose of this code is to run files containing binary code NATIVE to that platform.
Now, my question is: HOW would I get the data from these files (it could even be in chunks, like 1024 bits per cycle) into the memory of the machine running my code, so that this data ends up in an executable region?
In other words, can I get the data to somewhere where I can point the instruction pointer?
If yes, how?
I don't mind if I have to use assembler for this - just so it would work.
EDIT: Sorry, I wasn't specific enough.
Basically - the file data I want to load is in no format like .exe, Mach-O executable or ELF. It is just raw binary data (with some custom headers which my code removes). This binary data is machine code, specific and suited for the processor running on the current machine.
Technically this means I want to go around normal OS executors and load the binary directly. I could interpret it but that would be around 3x slower at best.
I would not mind at all if I need to run the data in another child process - but again, I can not use normal process openers, because the data is not in any format that the OS could run automatically (again - like .exe, Mach-O, ELF).
You should just read the file into memory, mark this memory as executable (this is platform-specific, for example VirtualProtect() on Windows), and call it as a function: ((void(*)())ptr)();
Of course, the code in the file should be position-independent.
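Here is a minimal sketch of those steps for Linux-like POSIX systems, using mmap/mprotect where the answer mentions VirtualProtect() for Windows. Two things are assumptions made for the example: the code is called as int(*)() rather than void(*)() so that a result can be observed, and the sample bytes in kReturn42 are a tiny hand-assembled routine that returns 42 (given only for x86-64 and AArch64). Hardened systems may refuse to make the memory executable.

```cpp
#include <sys/mman.h>
#include <cstddef>
#include <cstring>

// Sample position-independent machine code that returns 42 (illustrative only).
#if defined(__x86_64__)
const unsigned char kReturn42[] = {0xB8, 0x2A, 0x00, 0x00, 0x00,  // mov eax, 42
                                   0xC3};                          // ret
#elif defined(__aarch64__)
const unsigned char kReturn42[] = {0x40, 0x05, 0x80, 0x52,   // mov w0, #42
                                   0xC0, 0x03, 0x5F, 0xD6};  // ret
#endif

// Copies `size` bytes of raw machine code into fresh memory, marks that
// memory executable, and calls it as `int(*)()`. Returns -1 on failure.
int run_raw_code(const unsigned char* code, std::size_t size) {
    void* mem = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }
    std::memcpy(mem, code, size);
    // Switch the pages from writable to executable.
    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return -1;
    }
    // Needed on some architectures so the CPU sees the freshly written code.
    __builtin___clear_cache(static_cast<char*>(mem),
                            static_cast<char*>(mem) + size);
    int result = reinterpret_cast<int (*)()>(mem)();
    munmap(mem, size);
    return result;
}
```

In the question's setting, the bytes would come from the file after the custom headers have been stripped, instead of from kReturn42.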
How generic does your solution need to be?
The OS loader usually knows how to handle specific file formats and how to load them into memory. You can usually use the OS loader and call the entry point, but for that you need to know where the entry point is, and for that you need to understand the file format anyway. I can't think of a portable solution for that.
If you explain more about what you want to do, maybe it will be possible to suggest a solution.