Get Memory Address of Binary Instructions - c++

I'm currently working on some system-level code where I would like to identify the memory section(s) that come from the loaded binary, in order to detect things like corrupted or modified instructions.
Essentially what I'm after is a way, in Win32 using C++, to get a pointer to the range of instructions. This is somewhat similar to asking for pointers to the .text section's start and end. My understanding of the exe format is that the .text section is where instructions are stored, versus the .data section, which holds things like global variables. Unfortunately I've found no hints on where this might be (I've seen no Win32 function calls, nothing in the TIB, etc.).
Can anyone direct me to where I could find/calculate this information?
P.S. I do understand that if anyone changes code maliciously that they may find this code and change it; I'm still interested in the details of how to get at this information for my own curiosity.

You can't really expect this to work with an in-memory binary. Any calls to functions imported from DLLs will get modified by the loader to point to the actual locations of the target procedures in the loaded DLL.
For example, suppose you call a function in kernel32.dll. Then a Windows update happens which changes kernel32.dll. The next time you run your app, the jump to the function in kernel32.dll is going to be to a different memory address than before the Windows update was applied.
And of course this all assumes that DLLs load at their preferred address. And then you may have some self-modifying code.
And so on, and so on.

You can find the entry-point to your code in the PE header. Download the PE (Portable Executable) file definition from MSDN - it has all the information. The format of the program in memory is virtually the same as it is on disk. From within the code, you can get a pointer to the PE header in memory via the GetModuleHandle() function (the handle is really a pointer to the first page).
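As a rough illustration of walking those headers, here is a minimal sketch that looks up a section by name in a PE image held in a byte buffer. The names find_section, SectionRange and read_le are made up for this example, the field offsets are the ones from the published PE/COFF layout, and the code assumes a little-endian host. It deliberately avoids windows.h so it can be tried on any platform.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Location of one section, relative to the image base (hypothetical helper type).
struct SectionRange {
    std::uint32_t rva;   // VirtualAddress field of the section header
    std::uint32_t size;  // VirtualSize field of the section header
};

// Reads an integer without assuming alignment (little-endian host assumed).
template <typename T>
static T read_le(const std::uint8_t* p) {
    T v{};
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Walks the PE headers at `base` and fills `out` for the section called `name`.
// `size` bounds every read, so a truncated buffer is rejected instead of overrun.
bool find_section(const std::uint8_t* base, std::size_t size,
                  const char* name, SectionRange& out) {
    if (size < 0x40 || base[0] != 'M' || base[1] != 'Z') return false;
    const std::uint32_t pe = read_le<std::uint32_t>(base + 0x3C);  // e_lfanew
    if (pe > size || size - pe < 24) return false;
    if (std::memcmp(base + pe, "PE\0\0", 4) != 0) return false;
    const std::uint16_t count = read_le<std::uint16_t>(base + pe + 6);   // NumberOfSections
    const std::uint16_t opt = read_le<std::uint16_t>(base + pe + 20);    // SizeOfOptionalHeader
    const std::size_t table = static_cast<std::size_t>(pe) + 24 + opt;   // first section header
    for (std::uint16_t i = 0; i < count; ++i) {
        const std::size_t hdr = table + static_cast<std::size_t>(i) * 40;  // 40 bytes per header
        if (hdr + 40 > size) return false;
        char raw[9] = {};                  // section names are 8 bytes, not always NUL-terminated
        std::memcpy(raw, base + hdr, 8);
        if (std::strcmp(raw, name) == 0) {
            out.size = read_le<std::uint32_t>(base + hdr + 8);   // VirtualSize
            out.rva = read_le<std::uint32_t>(base + hdr + 12);   // VirtualAddress
            return true;
        }
    }
    return false;
}
```

In a live process, base would presumably be the value returned by GetModuleHandle() cast to a byte pointer, and the section's bytes would then run from base + rva to base + rva + size. The caller still has to supply a size bound it trusts.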

This doesn't directly answer your question, but for your overall solution, you could look into Code Signing. If you like this solution, there are existing implementations on Windows.
As you said, binary verification alone won't solve your problem. You should also look into installing your application in an area of the file system that requires elevation/admin rights to write to, such as Program Files, or deploy it somewhere a user can't directly modify it, like a web server.

Related

how do i get executable code from my source code cpp?

I watched some videos on YouTube where the bytes for C++ or C# code get hardcoded in an unsigned char*, then get injected into memory and executed.
How can I do that with my source code? I only found a way to inject the bytes from an exe, in a somewhat complicated way which caused me some problems when executing.
I also found this page, where they use some kind of pentesting tool to generate executable code (bytes) that can simply be injected into memory.
https://www.ired.team/offensive-security/code-execution/using-msbuild-to-execute-shellcode-in-c
In short: give up until you understand enough assembly language to inject assembly code. Blindly copying executable code won't work.
Compiling C++ or C# produces machine code which:
1) May contain external references. A function may call other functions, use global variables, etc. Even if you don't explicitly do this, the language may call its runtime. At program load time this is fixed by having all statically linked objects in the executable, and by loading dynamically imported modules.
2) Isn't necessarily position independent. That is, it may not behave well in another memory location. It may contain absolute references to itself that should be adjusted, or relative external references that should also be adjusted. At program load time this is fixed by processing the relocation table.
3) Actually a specific case of 1, but can be viewed separately. Besides code and data, an executable contains some annotations to the code, most notably exception handlers. Without exception handlers, it may not execute as expected either.
That is, arbitrarily copied bytes of an executable may or may not work in another location. If you try to copy an entire program, most likely it will not work.
For tricks like injecting code one would use assembly or machine code, not high-level languages. Sorry.
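To make the relocation point concrete, here is a small sketch of what "adjusting absolute references" amounts to. It is not the real loader algorithm: rebase_slots is a hypothetical helper, and the list of slot offsets is assumed to be known already (in a real PE file it would come from the relocation table). Each listed slot holds a 32-bit absolute address that was computed for the preferred base.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Adds (actual_base - preferred_base) to every 32-bit absolute address stored
// at the listed offsets. Returns false, changing nothing, if any slot is out of range.
bool rebase_slots(std::vector<std::uint8_t>& code,
                  const std::vector<std::size_t>& slot_offsets,
                  std::uint32_t preferred_base, std::uint32_t actual_base) {
    const std::uint32_t delta = actual_base - preferred_base;  // wraps modulo 2^32
    for (std::size_t off : slot_offsets) {
        if (off > code.size() || code.size() - off < sizeof(std::uint32_t)) return false;
    }
    for (std::size_t off : slot_offsets) {
        std::uint32_t value;
        std::memcpy(&value, code.data() + off, sizeof(value));  // slots may be unaligned
        value += delta;
        std::memcpy(code.data() + off, &value, sizeof(value));
    }
    return true;
}
```

Bytes copied out of a module without applying this kind of fix-up keep pointing at the old location, which is one reason blindly copied code tends to crash.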
To get machine code for your instructions generated by compiling C++ code from VS:
During debugging - copy or drag and drop the address from Disassembly window to Memory window.
During compilation - use /FAc option

On Windows, Using C++ is it possible to programmatically obtain a string name for the Assembly from a given function ptr

If you set a breakpoint in the debugger over a function ptr, you will see the name of the assembly included in the inspector panel. This seems to work for all function objects including lambdas.
You may see something like this;
Func=0x00FF00FF00{UE4Editor-Game.dll!<lambda_4b5336d9060965465490645>::<lambda_invoker_cdecl>}
Question: How would one programmatically obtain a string containing the assembly information given here, using the function pointer Func and functions that are available in a Windows development environment?
For the given example I would call something like this:
const char* details = GetFunctionAssemblyString(Func);
The most important part I would like to obtain is this: UE4Editor-Game.dll. However, the full string might also be interesting...
This is for development tools only and is not intended to be cross-platform, so using Windows-specific functions is acceptable. I have access to the debug database (.pdb).
Cheers
Note that 'assembly' is a term limited to .NET; it denotes an image file (whether exe, dll or otherwise) that has attached metadata. In native contexts the analogous term is 'module'. A module may or may not have symbol names available, but in the usual case it is unlikely to have more than that. There may also be debugging information, but that can be removed and the module will continue to work; the same is not true if the metadata were removed from a .NET assembly.
All of that being said, you can use the Debug Help Library to get as much information about a native process as is available. Note that comments in SymInitialize make it sound like it is not feasible for a process to load information about itself. Once you have initialized dbghelp for a particular process you could use SymFromAddr to get the name associated with a particular address and then SymGetModuleInfo64 to get information for the module containing that address.

How to modify a function in a compiled DLL

I want to know if it is possible to "edit" the code inside an already compiled DLL.
I.E. imagine that there is a function called sum(a,b) inside Math.dll which adds the two numbers a and b
Let's say I've lost the source code of my DLL, so the only thing I have is the binary DLL file.
Is there a way I could open that binary file, locate where my function resides, and replace the sum(a,b) routine with, for example, another routine that returns the product of a and b (instead of the sum)?
In summary, is it possible to edit binary code files?
Maybe using reverse engineering tools like OllyDbg?
Yes it is definitely possible (as long as the DLL isn't cryptographically signed), but it is challenging. You can do it with a simple Hex editor, though depending on the size of the DLL you may have to update a lot of sections. Don't try to read the raw binary, but rather run it through a disassembler.
Inside the compiled binary you will see a bunch of esoteric bytes. All of the opcodes that are normally written in assembly as instructions like "call," "jmp," etc. will be translated to the machine architecture dependent byte equivalent. If you use a disassembler, the disassembler will replace these binary values with assembly instructions so that it is much easier to understand what is happening.
Inside the compiled binary you will also see a lot of references to hard coded locations. For example, instead of seeing "call add()" it will be "call 0xFFFFF." The value here is typically a reference to an instruction sitting at a particular offset in the file. Usually this is the first instruction belonging to the function being called. Other times it is stack setup/cleanup code. This varies by compiler.
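One detail worth seeing in code: with the common x86 near call encoding (opcode E8 followed by a 32-bit displacement), the raw bytes do not hold the destination itself but a distance measured from the end of the call instruction, and the disassembler works out the absolute target for display. A minimal sketch of that calculation follows. call_target is a made-up helper, it only handles this one encoding, and it assumes a little-endian host.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Decodes an x86 near call (0xE8 + rel32) found at `offset` in `code`.
// `address` is the address the call instruction itself has when loaded.
// On success, `target` receives the address of the called instruction.
bool call_target(const std::uint8_t* code, std::size_t size, std::size_t offset,
                 std::uint32_t address, std::uint32_t& target) {
    if (offset > size || size - offset < 5) return false;  // instruction is 5 bytes long
    if (code[offset] != 0xE8) return false;                // not a near relative call
    std::int32_t rel;
    std::memcpy(&rel, code + offset + 1, sizeof(rel));     // signed displacement
    target = address + 5 + static_cast<std::uint32_t>(rel);  // relative to the next instruction
    return true;
}
```

This is also why moving or resizing code between a call and its target breaks things: the displacement stored in the call no longer matches the distance.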
As long as the instructions you replace are the exact same size as the original instructions, your offsets will still be correct and you won't need to update the rest of the file. However if you change the size of the instructions you replace, you'll need to manually update all references to locations (this is really tedious btw).
Hint: If the instructions you're adding are smaller than what you replaced, you can pad the rest with NOPs to keep the locations from getting off.
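The NOP padding hint could look roughly like this when applied to a file image already loaded into a buffer. patch_with_nops is a hypothetical helper, and 0x90 is the single-byte x86 NOP. The function refuses any replacement longer than the bytes it overwrites, so nothing after the patched range moves.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kNop = 0x90;  // single-byte x86 NOP

// Overwrites `original_len` bytes at `offset` with `replacement`, padding the
// remainder with NOPs so every later byte keeps its offset.
bool patch_with_nops(std::vector<std::uint8_t>& image, std::size_t offset,
                     std::size_t original_len,
                     const std::vector<std::uint8_t>& replacement) {
    if (replacement.size() > original_len) return false;  // would shift later code
    if (offset > image.size() || image.size() - offset < original_len) return false;
    auto dst = image.begin() + static_cast<std::ptrdiff_t>(offset);
    auto pad = std::copy(replacement.begin(), replacement.end(), dst);
    std::fill(pad, dst + static_cast<std::ptrdiff_t>(original_len), kNop);
    return true;
}
```

Note that original_len has to end on an instruction boundary; overwriting half of an instruction leaves garbage for the CPU to decode.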
Hope that helps, and happy hacking :-)
Have a look at Detours, a library for instrumenting arbitrary Win32 functions on x86 machines. Detours intercepts Win32 functions by rewriting target function images. The Detours package also contains utilities to attach arbitrary DLLs and data segments (called payloads) to any Win32 binary.
You can, of course, hex-edit the DLL to your heart's content and do all sorts of fancy things. But the question is why go to all that trouble if your intention is to replace the function to begin with?
Create a new DLL with the new function, and change the code that calls the function in the old DLL to call the function in the new DLL.
Or did you lose the source code to the application as well? ;)

Calling an executable's function code

I have the location/offset of a particular function present inside an executable. Would it be possible to call such a function (while suppressing the CRT's execution of the executable's entry point, hopefully) ?
In effect, you can simulate the Windows loader, assuming you run under Windows, but the basics should be the same on any platform. See e.g. http://msdn.microsoft.com/en-us/magazine/cc301805.aspx.
Load the file into memory,
Replace all relative addresses of functions that are called by the loaded executable with the actual function addresses.
Change the memory page to "executable" (this is the difficult and platform-dependent part)
Initialize the CRT in order to, e.g., initialize static variables.
Call.
However, as the commenters point out correctly, this might only be practical as an exercise using very simple functions. There are many, many things that can go wrong if you don't manage to emulate the complete OS loader.
PS: You could also ask the Google: http://www.cultdeadcow.com/tools/pewrap.html
PPS: You may also find helpful advice in the "security" community: https://www.blackhat.com/presentations/bh-usa-07/Harbour/Whitepaper/bh-usa-07-harbour-WP.pdf
Yes, you can call it, provided you initialize all the global variables that this function uses, probably including the CRT's global variables. Alternatively, you can hook and replace all the CRT functions that the callee uses. Look at the disassembly of that function to work out the right approach.
1) Take a look at the LoadLibraryEx() API. It has some flags that may do all the dirty work described by Sebastian.
2) Edit the executable. Several modified bytes will do the job. Here is some documentation on the file format: http://docsrv.sco.com:507/en/topics/COFF.html

Loading DLL from a location in memory

As the question says, I want to load a DLL from a location in memory instead of a file, similarly to LoadLibrary(Ex). I'm no expert in WinAPI, so googled a little and found this article together with MemoryModule library that pretty much meets my needs.
On the other hand the info there is quite old and the library hasn't been updated for a while too. So I wanted to know if there are different, newer and better ways to do it. Also if somebody has used the library mentioned in the article, could they provide insight on what I might be facing when using it?
Just for the curious ones, I'm exploring the concept of encrypting some plug-ins for applications without storing the decrypted version on disk.
Implementing your own DLL loader can get really hairy really fast. Reading this article it's easy to miss what kind of crazy edge cases you can get yourself into. I strongly recommend against it.
Just for a taste - consider you can't use any conventional debugging tools for the code in the DLL you're loading since the code you're executing is not listed in the region of any DLL known by the OS.
Another serious issue is dealing with DEP in Windows.
Well, you can create a RAM drive according to these instructions, then copy the DLL you have in memory to a file there and then use LoadLibrary().
Of course this is not very practical if you plan to deploy this as some kind of product, because people are going to notice a driver being installed, a reboot after the installation, and a new drive letter under My Computer. Also, this does nothing to actually hide the DLL, since it's just sitting there in the RAM drive for everybody to see.
I'm also curious why you actually want to do this. Perhaps your end result can be achieved by some means other than loading the DLL from memory. For instance, when using a binary packer such as UPX, the DLL that you have on disk is different from the one that is eventually executed. Immediately after the DLL is loaded normally with LoadLibrary, the unpacker kicks in and overwrites the memory the DLL was loaded into with the uncompressed binary (the DLL header makes sure that enough space is allocated).
A similar question was raised here:
Load native C++ .dll from RAM in debugger friendly manner
One of the answers proposes the dllloader sample application on GitHub:
https://github.com/tapika/dllloader
It supports .dll debugging out of the box.