How to modify a function in a compiled DLL - c++

I want to know if it is possible to "edit" the code inside an already compiled DLL.
For example, imagine that there is a function called sum(a,b) inside Math.dll which adds the two numbers a and b.
Let's say I've lost the source code of my DLL, so the only thing I have is the binary DLL file.
Is there a way I could open that binary file, locate where my function resides, and replace the sum(a,b) routine with, for example, another routine that returns the product of a and b (instead of the sum)?
In summary, is it possible to edit binary code files, maybe using reverse engineering tools like OllyDbg?

Yes, it is definitely possible, but it is challenging. (If the DLL is cryptographically signed, any edit invalidates the signature, so anything that verifies the signature will reject the patched file.) You can do it with a simple hex editor, though depending on the size of the DLL you may have to update a lot of sections. Don't try to read the raw binary; run it through a disassembler instead.
Inside the compiled binary you will see a bunch of esoteric bytes. All of the opcodes that are normally written in assembly as instructions like "call," "jmp," etc. will be translated to the machine architecture dependent byte equivalent. If you use a disassembler, the disassembler will replace these binary values with assembly instructions so that it is much easier to understand what is happening.
Inside the compiled binary you will also see a lot of references to hard coded locations. For example, instead of seeing "call add()" it will be "call 0xFFFFF." The value here is typically a reference to an instruction sitting at a particular offset in the file. Usually this is the first instruction belonging to the function being called. Other times it is stack setup/cleanup code. This varies by compiler.
As long as the instructions you replace are exactly the same size as the original instructions, your offsets will still be correct and you won't need to update the rest of the file. However, if you change the size of the instructions you replace, you'll need to manually update all references to locations (this is really tedious btw).
Hint: If the instructions you're adding are smaller than what you replaced, you can pad the rest with NOPs to keep the locations from shifting.
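A minimal sketch of the same-size patching rule and the NOP hint above, assuming the DLL has already been read into a byte vector and that the offset and length of the original instructions were found with a disassembler. The name patch_with_nops and its parameters are made up for illustration; 0x90 is the single-byte NOP on x86.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Overwrite `original_len` bytes at `offset` with `replacement`, filling any
// leftover bytes with NOP (0x90 on x86) so that nothing after the patch moves.
// Returns false if the replacement is too long or the range is out of bounds.
bool patch_with_nops(std::vector<std::uint8_t>& image, std::size_t offset,
                     std::size_t original_len,
                     const std::vector<std::uint8_t>& replacement) {
    constexpr std::uint8_t kNop = 0x90;
    if (replacement.size() > original_len) return false;
    if (offset > image.size() || original_len > image.size() - offset) return false;
    for (std::size_t i = 0; i < original_len; ++i) {
        image[offset + i] = i < replacement.size() ? replacement[i] : kNop;
    }
    return true;
}
```

For instance, overwriting a 5-byte instruction sequence with the 3-byte imul eax, ecx (0F AF C1) leaves two trailing NOPs, and every offset after the patch stays where it was. Whether the patched function still behaves correctly depends on the calling convention and on what the original bytes did, which this sketch does not check.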
Hope that helps, and happy hacking :-)

Microsoft Detours is a library for instrumenting arbitrary Win32 functions on x86 machines. Detours intercepts Win32 functions by re-writing target function images. The Detours package also contains utilities to attach arbitrary DLLs and data segments (called payloads) to any Win32 binary.

You can, of course, hex-edit the DLL to your heart's content and do all sorts of fancy things. But the question is why go to all that trouble if your intention is to replace the function to begin with?
Create a new DLL with the new function, and change the code that calls the function in the old DLL to call the function in the new DLL.
Or did you lose the source code to the application as well? ;)

Related

how do i get executable code from my source code cpp?

I watched some videos on YouTube where the bytes for C++ or C# code get hardcoded in an unsigned char* and are then injected into memory and executed.
How can I do that with my source code? I only found a somewhat complicated way to inject the bytes from an exe, which caused me some problems when executing.
I also found this page where they use some kind of pentesting tool to generate executable code (bytes) that can simply be injected into memory.
https://www.ired.team/offensive-security/code-execution/using-msbuild-to-execute-shellcode-in-c
In short: give up until you understand enough assembly language to inject assembly code. Blindly copying executable code won't work.
Compiling C++ or C# produces machine code which:
1. May contain external references. A function may call other functions, use global variables, etc. Even if you don't do this explicitly, the language may call into its runtime. At program load time this is resolved by having all statically linked objects in the executable and by loading the dynamically imported modules.
2. Isn't necessarily position independent. That is, it may not behave well at another memory location. It may contain absolute references to itself, or relative external references, both of which need adjusting. At program load time this is fixed by processing the relocation table.
3. Depends on annotations outside the code itself. This is actually a specific case of 1, but can be viewed separately. Besides code and data, an executable contains annotations to the code, most notably exception handlers. Without them, the code may not execute as expected either.
That is, arbitrarily copied bytes of an executable may or may not work in another location. If you try to copy an entire program, most likely it will not work.
For tricks like injecting code, one would use assembly or machine code, not high-level languages. Sorry.
To get the machine code that Visual Studio generates for your C++ code:
During debugging - copy or drag and drop the address from the Disassembly window to the Memory window.
During compilation - use the /FAc option, which emits a listing file containing the machine code alongside the assembly.

Hide or remove unwanted strings from windows executable release

I have a habit: whenever a C++ project is compiled and the release is built, I open the .EXE with a hexadecimal editor (usually HxD) and have a look at the binary information.
What I hate most, and am trying to find a solution for, is that somewhere in the string table, relevant (at least from my point of view) information is exposed. Maybe to other people this sounds like an obsession, but I just don't like it when my executable contains, for example, the names of all the Windows functions used in the application.
I have tried many compilers to see which of them exposes the least information. For example, GCC leaves all of this in every exe it produces:
libgcj_s.dll._Jv_RegisterClasses....\Data.ald.rb.Error.Data file is corrupt!
....Data for the application not found!.€.#.ř.#.0.#.€.#.°.#.p.#.p.#.p.#.p.#.
¸.#.$.#.€.#°.#.std::bad_alloc..__gnu_cxx::__concurrence_lock_error.__gnu_cxx
::__concurrence_unlock_error...std::exception.std::bad_exception...pure virt
ual method called..../../runtime/pseudo-reloc.c....VirtualQuery (addr, &b, s
ize of(b))............................/../../../gcc-4.4.1/libgcc/../gcc/conf
ig/i386/cygming-shared-data.c...0 && "Couldn't retrieve name of GCClib share
d data atom"....ret->size == sizeof(__cygming_shared) && "GCClib shared data
size mismatch".0 && "Couldn't add GCClib shared data atom".....-GCCLIBCYGMI
NG-EH-TDM1-SJLJ-GTHR-MINGW32........
Here, you can see what compiler I used, and what version. A few lines below that you can see a list of every Windows function I used, like CreateMainWindow, GetCurrentThreadId, etc.
I wonder if there are ways of not displaying this, or of encrypting or obfuscating it.
With Visual C++ this information is not included. On the other hand, VC++ is not as portable as GCC, whose executables run on different Windows systems such as 7 and XP without needing the C++ runtime, frameworks, or whatever else programs compiled with VC++ need. Moreover, VC++ executables also contain the entry-point names of the Windows functions used in the application.
I know that even NASM, for example, saves the names of the called Windows functions, so it looks like it's a Windows issue. But maybe they can be encrypted, or there's some trick to not show them.
I will have a look through the GCC source code to see where those strings are specified to be saved in the executables; maybe that step can be skipped or something.
Well, this is one of my latest paranoias, and maybe it can be treated in some way. Thanks for your opinions and answers.
If you compile with -nostdlib then the GCC stuff should go away but you also lose some of the C++ support and std::*.
On Windows you can create an application that links only to LoadLibrary and GetProcAddress and resolves the rest of the functions it needs at runtime. (The function names can be stored in encrypted form and decrypted just before they are passed to GetProcAddress.) Doing this is a lot of work, and the Windows loader is probably faster at this than your code is going to be, so it seems pointless to me to obfuscate the fact that you are calling simple functions like GetLastError and CreateWindow.
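A rough sketch of that idea, assuming a trivial single-byte XOR as the "encryption" (the key, xor_transform and resolve_obfuscated are all hypothetical names chosen for illustration; LoadLibraryA and GetProcAddress are the real Win32 calls). The Windows-only part is guarded so the encoding helper can be tried anywhere.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical single-byte key; a real scheme would likely use something stronger.
constexpr unsigned char kKey = 0x5A;

// XOR is its own inverse, so the same routine both encodes and decodes.
std::string xor_transform(const std::string& in) {
    std::string out = in;
    for (char& c : out) {
        c = static_cast<char>(static_cast<unsigned char>(c) ^ kKey);
    }
    return out;
}

#ifdef _WIN32
// Decode an obfuscated export name and look it up in the given DLL.
// Returns nullptr if the DLL cannot be loaded or the export is missing.
FARPROC resolve_obfuscated(const char* dll_name, const std::string& encoded_name) {
    HMODULE module = LoadLibraryA(dll_name);
    if (module == nullptr) return nullptr;
    const std::string name = xor_transform(encoded_name);
    return GetProcAddress(module, name.c_str());
}
#endif
```

The returned FARPROC has to be cast to the right function pointer type before the call, for example DWORD (WINAPI*)() for GetLastError. For the plain name to stay out of the binary, the encoded bytes would have to be generated at build time and embedded as data; calling xor_transform on a string literal at runtime would defeat the purpose. The DLL name is passed in plain text here and would need the same treatment.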
Windows API functions are loaded from dlls, like kernel32.dll. In order to get the loaded API function's memory address, a table of exported function names from the dll is searched. Thus the presence of these names.
You could manually load the DLL for any Windows API functions you reference with LoadLibrary. Then you could look up the functions' addresses with GetProcAddress, using function names stored in some obfuscated form. Alternatively, you could use each function's "ordinal" (a numeric value that identifies each function in a DLL). This way, you could create a set of function pointers that you use to call API functions.
But, to really make it clean, you would probably have to turn off linking of the default libraries and replace the components of the C runtime library that are implicitly used by the compiler. Doing this is a hassle, though.

How to figure out which methods increases size of 'exe'

I'm trying to write my first 'demoscene' application in MS Visual Studio Express 2010. Suddenly I realized that my binary had grown from 16kb to ~100kb in the fully-optimized-for-size release version. My target size is 64k. Is there any way to somehow "browse" the binary to figure out which methods consume a lot of space and which I should rewrite? I really want to know what my binary consists of.
From what I found on the web, VS2010 is not the best compiler for demoscenes, but I still want to understand what's happening inside my .exe file.
I think you should have MSVC generate a map file for you. This is a file that will tell you the addresses of most of the different functions in your executable. The difference between consecutive addresses should tell you how much space the function takes. To generate a map file, add the /MAP linker option. For more info, see:
http://msdn.microsoft.com/en-us/library/k7xkk3e2(v=VS.100).aspx
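The "difference between consecutive addresses" estimate can be sketched as follows. This assumes the (name, address) pairs have already been extracted from the map file; the parsing itself is left out, and Symbol and estimate_sizes are hypothetical names, not anything the linker provides.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// One entry taken from the map file (hypothetical, simplified layout).
struct Symbol {
    std::string name;
    std::uint64_t address;
};

// Estimate each symbol's size as the gap to the next symbol by address and
// return the results largest first. The last symbol has no successor, so it
// is left out of the result.
std::vector<std::pair<std::string, std::uint64_t>>
estimate_sizes(std::vector<Symbol> symbols) {
    std::sort(symbols.begin(), symbols.end(),
              [](const Symbol& a, const Symbol& b) { return a.address < b.address; });
    std::vector<std::pair<std::string, std::uint64_t>> sizes;
    for (std::size_t i = 0; i + 1 < symbols.size(); ++i) {
        sizes.emplace_back(symbols[i].name,
                           symbols[i + 1].address - symbols[i].address);
    }
    std::sort(sizes.begin(), sizes.end(),
              [](const std::pair<std::string, std::uint64_t>& a,
                 const std::pair<std::string, std::uint64_t>& b) {
                  return a.second > b.second;
              });
    return sizes;
}
```

The numbers are only estimates: a gap also includes alignment padding and anything that sits between two listed symbols without appearing in the map file.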
You can strip off lots of unnecessary stuff from the executable and compress it with utilities such as mew.
I've found this useful for examining executable sizes (although not for demoscene type things): http://aras-p.info/projSizer.html
I will say this: if you are using the standard library at all then stop immediately. It is a huge code bloater. For example, each unique use of std::sort adds around 5KB, and the numbers are similar for many of the standard containers (of course, it depends on which functions you use, but in general they add lots of code).
Also, I'm not into the demo scene, but I believe people use Crinkler to compress their executables.
Use your version control system to see what caused the increase. Going forward, I'd log the built exe size during the nightly builds. And don't forget you can optimize for minimal size with the compiler settings.

Get Memory Address of Binary Instructions

I'm currently working on some system level code where I would like to be able to identify the memory section(s) that come from the loaded binary, in order to detect things like corrupted or modified instructions.
Essentially what I'm after is a way, in Win32 using C++, to get a pointer to the range of instructions. This is somewhat similar to asking for a function pointer to the .text section's start and end. My understanding of the exe format is that the .text section is where instructions are stored, versus the .data section which holds things like global variables. Unfortunately I've found no hints on where this might be (I've seen no Win32 function calls, nothing in the TIB, etc.)
Can anyone direct me to where I could find/calculate this information?
P.S. I do understand that if anyone changes code maliciously that they may find this code and change it; I'm still interested in the details of how to get at this information for my own curiosity.
You can't really expect this to work with an in memory binary. Any function calls to imported DLLs will get modified by the loader to point to the actual locations of the target procedures in the DLL that is loaded.
For example, suppose you call a function in kernel32.dll. Then a Windows update happens which changes kernel32.dll. The next time you run your app, the jump to the function in kernel32.dll is going to be to a different memory address than before the Windows update was applied.
And of course this all assumes that DLLs load at their preferred address. And then you may have some self-modifying code.
And so on, and so on.
You can find the entry-point to your code in the PE header. Download the PE (Portable Executable) file definition from MSDN - it has all the information. The format of the program in memory is virtually the same as it is on disk. From within the code, you can get a pointer to the PE header in memory via the GetModuleHandle() function (the handle is really a pointer to the first page).
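As a sketch of what walking those headers looks like, the following reads the section table from a module's base address using the documented PE field offsets rather than the windows.h structs, so it can be tried on any byte buffer. SectionRange, read_at and find_section are hypothetical names; the code assumes a little-endian host and a well-formed image, and does no bounds checking.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct SectionRange {
    const std::uint8_t* begin;  // first byte of the section in memory
    std::size_t size;           // VirtualSize from the section header
};

// Read a value of type T from a possibly unaligned address (little-endian host assumed).
template <typename T>
T read_at(const std::uint8_t* p) {
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// `base` is the start of a loaded image, e.g. what GetModuleHandle returns.
// Returns {nullptr, 0} if the headers look wrong or the section is not found.
SectionRange find_section(const std::uint8_t* base, const char* wanted) {
    if (read_at<std::uint16_t>(base) != 0x5A4D) return {};                // "MZ"
    const std::uint8_t* pe = base + read_at<std::uint32_t>(base + 0x3C);  // e_lfanew
    if (read_at<std::uint32_t>(pe) != 0x00004550) return {};              // "PE\0\0"
    const std::uint16_t section_count = read_at<std::uint16_t>(pe + 6);
    const std::uint16_t optional_size = read_at<std::uint16_t>(pe + 20);
    const std::uint8_t* section = pe + 24 + optional_size;  // first section header
    for (std::uint16_t i = 0; i < section_count; ++i, section += 40) {
        // Section names are at most 8 bytes and not necessarily NUL-terminated.
        if (std::strncmp(reinterpret_cast<const char*>(section), wanted, 8) == 0) {
            return {base + read_at<std::uint32_t>(section + 12),   // VirtualAddress
                    read_at<std::uint32_t>(section + 8)};          // VirtualSize
        }
    }
    return {};
}
```

On Windows the call would look something like find_section(reinterpret_cast<const std::uint8_t*>(GetModuleHandle(nullptr)), ".text"), assuming the toolchain put the code in a section with that name. As the other answers point out, parts of the loaded image differ from the file on disk, so comparing these bytes against a stored copy is not straightforward.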
This doesn't directly answer your question, but for your overall solution, you could look into Code Signing. If you like this solution, there are existing implementations on Windows.
As you said, binary verification alone won't solve your problem. You should also look into installing your application in an area of the file system that requires elevation/admin rights to write to, such as Program Files, or deploy it somewhere a user can't directly modify it, like a web server.

Edit strings vars in compiled exe? C++ win32

I want to have a few strings in my C++ app, and I want to be able to edit them later in the deployed applications (the compiled exe). Is there a way to make the exe edit itself or its resources so I can update the strings' values?
The app checks for updates on start, so I'm thinking about using that to also send the command when I need to edit the strings (for example, the string that contains the URL used to check for updates).
I don't want to use anything external to the exe, I could simply use the registry but I prefer to keep everything inside the exe.
I am using visual studio 2010 c++ (or any other version of ms visual c++).
I know you said you don't want to use anything external to the program, but I think what you really want in this case is a resource-only DLL. The executable can load whichever DLL has the strings that you need in a given invocation.
Another idea is to move the strings into a "configuration" file, such as in XML or INI format.
Modifying the EXE without recompiling is hacking and highly discouraged. You could use a hex editor, find the string and modify it. The new text must have a length less than or equal to that of the original string.
Note, some virus checkers perform CRCs or checksums on the executables. Altering the executables is a red flag to these virus checkers.
It is impossible, unless your strings won't change in position & length.
So, to make it possible: reserve a generously sized field for the string (in your example, the URL that is used to get updates), say 512 characters, null-filled at the end. This way, you have some space to update the string.
Why is it impossible to use variable-sized strings? Well I can explain this with a small x86 Assembler snippet:
PUSH OFFSET test.004024F0
Let's say your variable-sized string is at the offset test.004024F0. Now consider the change:
Suppose I want to insert a string that is longer than the original one, and it is stored before the string at 004024F0. This moves 004024F0 to a new value, let's say 004024F5 (the new string, before this entry, is 5 characters longer than its original).
You think it's simple: search for all occurrences of 004024F0 and replace them with 004024F5? Wrong. The bytes 004024F0 can also be a regular "instruction" (to be precise: ADD BYTE PTR DS:[EAX+24],AL; LOCK ...). If this instruction happens to be in your code, it'll be replaced by something wrong.
Well, you might think, what about searching for that PUSH instruction? Wrong: there are virtually unlimited ways to "PUSH". For instance, MOV EAX, 004024F0; SUB ESP, 4; MOV [ESP], EAX. There is also the possibility that the address is calculated: MOV EAX, 00402000; ADD EAX, 4F0; .... So this makes it "virtually unlimited".
However, if you use statically sized fields, you don't have to change the code referring to the strings. If you reserve enough space for a specific field, you can easily write a "longer" string than the original, because the size of a string is determined by finding the first null byte; pad the rest of the field with nulls.
With statically sized fields it is, however, very hard to know the "position in the file" at compile time. If you are willing to spend a lot of time hacking your own app, you can write code that modifies the .exe and stores a new string value at a specified offset. This file offset isn't known at compile time, so you patch it in yourself later, using a tool like OllyDbg. This enables the executable to patch itself :-)
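A small sketch of the fixed-size-field approach on an in-memory copy of the file. Instead of a hand-patched file offset, it assumes (this is not from the answer above) that the field is preceded by a unique marker string that can be searched for; patch_fixed_field, the marker and the field size are all hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Find `marker` in the image and overwrite the fixed-size field that follows
// it with `new_value`, padding the rest of the field with null bytes.
// Returns false if the marker is missing, the field would run past the end of
// the image, or the value leaves no room for the terminating null.
bool patch_fixed_field(std::vector<std::uint8_t>& image, const std::string& marker,
                       std::size_t field_size, const std::string& new_value) {
    if (marker.empty() || new_value.size() >= field_size) return false;
    auto it = std::search(image.begin(), image.end(), marker.begin(), marker.end(),
                          [](std::uint8_t a, char b) {
                              return a == static_cast<std::uint8_t>(b);
                          });
    if (it == image.end()) return false;
    const std::size_t start =
        static_cast<std::size_t>(it - image.begin()) + marker.size();
    if (image.size() - start < field_size) return false;
    std::fill_n(image.begin() + start, field_size, std::uint8_t{0});
    std::copy(new_value.begin(), new_value.end(), image.begin() + start);
    return true;
}
```

Because the field never changes size, no address elsewhere in the file has to be touched. Writing the buffer back over a running exe is a separate problem, since the file is normally locked while the program runs.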
Creating a self-editing exe is a very ill-advised approach to solving this problem. You are much better off storing and reading the strings from an external file. Maybe if you provide some background as to why you don't want to use anything but an exe, we can address those issues?
In theory, BeginUpdateResource, UpdateResource and EndUpdateResource are intended for this purpose. In reality, getting these to work at all is pretty tricky. I'm not at all sure they'll work for updating resources in a running executable.
Not wanting to chastise, but this doesn't sound like a great idea. Having the URL for checking for updates baked inside the program makes it inflexible.
You're trying to mitigate the inflexibility by rewriting the strings in your exe. This is really asking for trouble:
Are you sure users that run your program have write permission to be able to update the exe? Default users have no write access to files installed in the program folder.
If the program is run by multiple users, or simply multiple times by the same user, the exe will be locked and unmodifiable.
Systems administrators will have a hard time tweaking the URL.
There is a real risk you will corrupt your exe. The rewrite process is likely to be complex, especially if you want to make the URL longer than is presently allocated.
By modifying your exe, you remove the possibility of using code signing, which can be useful in a networked environment.
The registry (despite all its weaknesses) is really where this kind of configuration data should go. You can put a default value in your EXE, but if you need to make changes, put them in the registry. This makes the changes transparent, saving you a lot of grief later.
Your algorithm that wants to write a new URL for updates should do this by writing it to the registry. Alternatively, have a config file that ships alongside your exe, and update that. (But bear in mind user permissions: you may not have write access to that file, but you can always write to the user hive of the registry.)