CreateProcess from memory buffer - c++

I can use CreateProcess to launch an EXE. I want to have the contents of an EXE in a memory buffer and do CreateProcess (or an equivalent) on it without having to write it to a file. Is there any way to do that?
The backstory : we make games. We send a plain EXE to our distributors, which then wrap it using their favorite DRM and sell it to their users. There have been instances where users find crashes. Most of the crashes take 5 minutes to fix, but the patch must go through the distributor and it may take several days, even weeks. I can't just send the patched EXE to the players because it wouldn't have the distributor's DRM. I'm thinking of distributing the real game EXE inside an encrypted datafile so what gets wrapped (the external EXE) just decrypts and launches the real EXE. This way I could safely distribute a fix without disabling the DRM.

It's actually quite easy. A similar technique was described in a paper I read about 3 years ago.
Windows allows you to call the CreateProcess function with the CREATE_SUSPENDED flag, which tells the API to keep the process suspended until the ResumeThread function is called.
This gives us time to grab the suspended thread's context using the GetThreadContext function. The EBX register will then hold a pointer to the PEB (Process Environment Block) structure, which we need to determine the base address.
From the layout of the PEB structure we can see that ImageBaseAddress is stored at offset 8, therefore [EBX+8] will give us the actual base address of the suspended process.
Now we need the in-memory EXE, and we need to apply the appropriate alignment if the alignment of memory and of the in-memory EXE differ.
If the base address of the suspended process matches that of the in-memory EXE, and the imageSize of the in-memory EXE is less than or equal to that of the suspended process, we can simply use WriteProcessMemory to write the in-memory EXE into the memory space of the suspended process.
But if those conditions aren't met, we need a little more magic.
First, we need to unmap the original image using ZwUnmapViewOfSection, and then allocate enough memory using VirtualAllocEx within the memory space of the suspended process. Now we need to write the in-memory EXE into the memory space of the suspended process using the WriteProcessMemory function.
Next, patch the BaseAddress of the in-memory EXE into the PEB->ImageBaseAddress of the suspended process.
The EAX register of the thread context holds the EntryPoint address, which we need to overwrite with the EntryPoint address of the in-memory EXE. Now we need to save the altered thread context using the SetThreadContext function.
Voila! We're ready to call the ResumeThread function on the suspended process to execute it!
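The checks in this answer depend on three values from the in-memory EXE's headers: its preferred base address, its image size and its entry point. As a minimal sketch (not the code from the paper mentioned above), the following reads them from a buffer holding a 32-bit PE file and evaluates the "same base and small enough" condition. The names Pe32Info, readLe, parsePe32 and canOverwriteInPlace are hypothetical; the field offsets are those of the documented PE32 optional header, and 64-bit (PE32+) images are deliberately rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Header fields the answer's checks depend on (32-bit PE only).
struct Pe32Info {
    std::uint32_t entryPointRva;  // AddressOfEntryPoint, relative to the base
    std::uint32_t imageBase;      // ImageBase, the preferred load address
    std::uint32_t sizeOfImage;    // SizeOfImage, bytes needed once mapped
};

// Read an n-byte little-endian value (n <= 4) with bounds checking.
inline bool readLe(const std::vector<std::uint8_t>& buf, std::size_t off,
                   std::size_t n, std::uint32_t& out) {
    if (off > buf.size() || buf.size() - off < n) return false;
    out = 0;
    for (std::size_t i = 0; i < n; ++i)
        out |= static_cast<std::uint32_t>(buf[off + i]) << (8 * i);
    return true;
}

// Hypothetical helper: extract the fields above from an EXE held in memory.
inline bool parsePe32(const std::vector<std::uint8_t>& exe, Pe32Info& info) {
    std::uint32_t mz = 0, ntOff = 0, sig = 0, optMagic = 0;
    if (!readLe(exe, 0, 2, mz) || mz != 0x5A4D) return false;            // "MZ"
    if (!readLe(exe, 0x3C, 4, ntOff)) return false;                      // e_lfanew
    if (!readLe(exe, ntOff, 4, sig) || sig != 0x00004550) return false;  // "PE\0\0"
    // The optional header follows the 4-byte signature and 20-byte file header.
    const std::size_t opt = static_cast<std::size_t>(ntOff) + 24;
    if (!readLe(exe, opt, 2, optMagic) || optMagic != 0x010B) return false;  // PE32
    return readLe(exe, opt + 16, 4, info.entryPointRva) &&
           readLe(exe, opt + 28, 4, info.imageBase) &&
           readLe(exe, opt + 56, 4, info.sizeOfImage);
}

// The "simple case" from the answer: same base address, and the payload
// fits inside the image that is already mapped in the suspended process.
inline bool canOverwriteInPlace(const Pe32Info& payload,
                                std::uint32_t targetBase,
                                std::uint32_t targetSizeOfImage) {
    return payload.imageBase == targetBase &&
           payload.sizeOfImage <= targetSizeOfImage;
}
```

If canOverwriteInPlace returns false, the unmap-and-allocate path described above applies. The entry point to place in the thread context would be the chosen base address plus entryPointRva.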

You can compile the game as a DLL and put the DLL in the encrypted data file. A DLL can be loaded from memory without writing it to disk. Please see this tutorial (with sample code at the end): Loading a DLL From Memory

What you want to do requires NtCreateProcess, but it's undocumented and therefore brittle. This book apparently covers its use.
Perhaps you could build a patch system? E.g. on launch, program checks for patch DLL in same directory, and loads it if it exists.

Why do you need to create a new process? I would have thought you could run in the context of the process that does the unpacking/decryption.

What you want can be achieved with something called a "Packer". Actually launching an exe from memory might be possible, but it's a lot harder than a packer ;)
One of the best known packers is UPX (google it). There are tools to decrypt it, but it should at least give you a starting point to work from. I'm also fairly certain UPX is open-source.

Look at BoxedAppSDK
It supports launching exe from a memory buffer.
hope it helps.

Related

PE injection image relocation

All,
I have been trying to figure this out for a couple of days now and I need some help.
For a research project for work I have written some custom malware, the malware itself is not the issue here and I won't share any code, but I do need some help on the actual injector.
I have some problems trying to fully understand how and when I need to perform manual relocations. I am not relocating at the moment and am using a random address from VirtualAllocEx, and everything just works. My malware exe runs and I have no issues UNLESS the memory location where the remote process PE is loaded overlaps with my malware PE's preferred base address.
I am not using NtUnmapViewOfSection as it gets detected by antivirus and basically is just a crap function that randomly doesn't work, so my plan is to just use a random address provided by VirtualAllocEx and relocate if need be (which I don't understand, see the questions below).
This is my current working method (unless target process overlaps with preferredbase):
Download malware exe and place in buffer
CreateProcess to start victim process
Suspend the thread right after (I'm not using CREATE_SUSPENDED flag as this does not work in win10)
Get necessary header info from the buffer (PIMAGE_DOS_HEADER, PIMAGE_NT_HEADERS), also get the ImageBase address from remote process PEB
Allocate memory in target process (VirtualAllocEx), using NULL for lpAddress so VirtualAllocEx can choose the location
Write PE headers and sections to the memory location
Protect Memory
Change EAX to new entrypoint
Change PEB to new baseAddress
ResumeThread
Profit
So please help me understand the following:
Why does this work without doing any manual relocations, is there some magic PE loader in the background that does this even though I'm injecting?
Why doesn't it work when the target process overlaps with the preferred base address. The PE image itself is copied in a non-overlapping memory location, so how in the hell is that any different from my working solution when the target process doesn't overlap. It should just do the magic relocation thing from my first question.
Why do I see so many people change the preferredBaseAddress in the image before writing it to memory? To my knowledge this field is only used to map PE to their preferredbaseaddress, if they can't do that the PE loader performs the relocations. Seeing as injection code usually performs its own manual relocations I have no idea why they would change this.
Hopefully somebody can help me understand, because this is driving me nuts :).
Best regards!
1: It is because of the way assembly code works. Most jmps are relative to the current address, and thus will work no matter where the code is located. The problems arise as soon as you want to look up variables or resolve DLL import addresses from the IAT. This is because these operations require MOV instructions, and compilers will generally hard-code an address as the source operand for these instructions. That address then points to some random location, and thus results in either an access violation or undefined behaviour.
What I think is the case for you is that both the host process and the payload have the same preferred base address. This means that it doesn't fail with an access violation, because there just happens to be some random data at that location.
If you always load the payload at its preferred base address, you won't need to do manual relocations.
2: Not sure what you mean
3: For normal simple applications, you won't have to change the preferred base address. The problem arises when your payload needs to access its relocation table (for example, when deploying a rootkit or w/e). It kind of depends on how the virus was built. There shouldn't be any problems if the preferred base address is compared to the actual base address.
To your CREATE_SUSPENDED problem: I had this exact same problem a few weeks back. It seems that as soon as you resume a thread created with CREATE_SUSPENDED, it overwrites its registers or w/e. I couldn't figure out why this problem arises. What you can do to overcome this, though, is never resume the main thread at all. Instead, simply create a new thread with CreateRemoteThreadEx.
EDIT: After reading one of your other questions, you actually solved this problem for me. I was changing EIP instead of EAX. I didn't know the PE loader called eax instead of just resuming the code at the instruction pointer.
If you ever need any help with this, HMU. I've done a whole ton of research on malware development, and love to share the knowledge.
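For reference, the "manual relocations" asked about in the question usually mean walking the image's base relocation table and adding the load delta to every recorded absolute address. The following is a minimal, portable sketch of that walk over an in-memory copy of an image, not the poster's code. It assumes a 32-bit image already laid out by RVA, handles only IMAGE_REL_BASED_HIGHLOW (type 3) and padding entries (type 0), and the names applyBaseRelocations, load16, load32 and store32 are hypothetical helpers.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Little-endian accessors; callers must bounds-check before calling.
inline std::uint16_t load16(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint16_t>(b[off] | (b[off + 1] << 8));
}
inline std::uint32_t load32(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}
inline void store32(std::vector<std::uint8_t>& b, std::size_t off, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        b[off + i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Walk the base relocation directory located at [relocRva, relocRva + relocSize)
// and rebase every 32-bit absolute address from preferredBase to actualBase.
// Returns false on a malformed table or an unsupported relocation type.
inline bool applyBaseRelocations(std::vector<std::uint8_t>& image,
                                 std::size_t relocRva, std::size_t relocSize,
                                 std::uint32_t preferredBase,
                                 std::uint32_t actualBase) {
    const std::uint32_t delta = actualBase - preferredBase;  // wraps modulo 2^32
    if (relocRva > image.size() || image.size() - relocRva < relocSize) return false;
    std::size_t pos = relocRva;
    const std::size_t end = relocRva + relocSize;
    while (end - pos >= 8) {
        // Each block: page RVA, block size, then 16-bit entries.
        const std::uint32_t pageRva = load32(image, pos);
        const std::uint32_t blockSize = load32(image, pos + 4);
        if (blockSize < 8 || blockSize > end - pos) return false;
        for (std::size_t e = pos + 8; e + 2 <= pos + blockSize; e += 2) {
            const std::uint16_t entry = load16(image, e);
            const unsigned type = entry >> 12;  // high 4 bits
            const std::size_t target =
                static_cast<std::size_t>(pageRva) + (entry & 0x0FFF);
            if (type == 0) continue;       // IMAGE_REL_BASED_ABSOLUTE: padding
            if (type != 3) return false;   // only IMAGE_REL_BASED_HIGHLOW here
            if (target > image.size() || image.size() - target < 4) return false;
            store32(image, target, load32(image, target) + delta);
        }
        pos += blockSize;
    }
    return true;
}
```

When actualBase equals preferredBase the delta is zero and the image is left unchanged, which matches the answer's point that a payload loaded at its preferred base address needs no relocation.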

identify memory code injection by memory dumping of process or dll

In order to identify memory code injection (on Windows systems), I want to hash the memory of all processes on the system. For example, if the memory of calc.exe is always x and now it is y, I know that someone injected code into calc.exe.
1: Is this thinking correct? What part of the process memory always stays the same and what part is changing?
2: Does a DLL have separate memory, or is it in the memory of the EXE? In other words, can I generate a hash for the memory of a DLL?
3: How can I dump the memory of a process or of a dll in c++?
Code is continually being injected in processes when running windows.
One example is delay-loaded DLLs. When a process starts up, only the core DLLs are loaded. When certain features get exercised, the code first loads the new DLLs (code) from disk and then executes it.
Another example is .NET managed applications. Most code sits as uncompiled code on disk. When new parts of the application need to be run, the .NET runtime loads that uncompiled code, compiles it (aka JITs it) and then executes it.
The problem you are trying to solve is worthwhile, but extremely hard. The OS itself tries to solve this problem to protect your processes.
If you are trying to do something more advanced than what windows is doing for you behind the scenes, the first thing to do will be to understand all the steps windows takes to protect process and validate the code being injected in them, while still enabling processes to load code dynamically (which is a necessity).
Good luck.
Or maybe you have a more specific problem you are trying to solve?
1) The idea is nice. But as long as a process runs, it changes its memory (unless it does nothing), so hashing everything won't work. What you could do is hash the code part of the memory.
2) No, DLLs are libraries linked to your code, not separate processes. They are just loaded dynamically instead of statically (http://msdn.microsoft.com/en-us/library/windows/desktop/ms681914%28v=vs.85%29.aspx)
3) Normally your OS prohibits you from accessing the memory of neighbouring processes. If it allowed that for any process, it would be very easy for malware to propagate, and your system would be very unstable, as one crashing process could crash all the others. So it'll be very tricky to do this kind of dump! But if your process has the right privileges, you could have a look at ReadProcessMemory()
I have just done something similar. I basically ported these scripts to C#:
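To illustrate the "hash the code part" idea from point 1, here is a minimal sketch that hashes a byte range and compares it against a recorded baseline. It assumes the bytes of the code region have already been copied into a buffer (for example with ReadProcessMemory, as mentioned in point 3). FNV-1a is used only because it fits in a few lines; it is not a cryptographic hash, and a real integrity check would want one. The names fnv1a64 and regionChanged are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 64-bit FNV-1a over a byte range (standard offset basis and prime).
inline std::uint64_t fnv1a64(const std::uint8_t* data, std::size_t len) {
    std::uint64_t h = 0xcbf29ce484222325ULL;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

// Returns true when the bytes no longer match the recorded baseline hash.
inline bool regionChanged(std::uint64_t baseline,
                          const std::vector<std::uint8_t>& bytes) {
    return fnv1a64(bytes.data(), bytes.size()) != baseline;
}
```

A baseline taken from one run is not necessarily valid for the next: if the module is loaded at a different address, the loader patches absolute addresses inside the code, so the same unmodified program may hash differently.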
http://www.exploit-monday.com/2012/03/powershell-live-memory-analysis-tools.html

Memory Scan in c++

I am working on a Win32 project in Visual Studio 2010. I am trying to scan my main memory with the ClamAV (open source antivirus) engine to search for a malicious/infected file in main memory.
The code I have written so far creates a snapshot of the whole main memory by using the Windows function CreateToolhelp32Snapshot.
Then I open a specific process from the snapshot and pass it to the ClamAV engine, which decides whether it is malicious or not. I repeat the whole mechanism every 10 seconds, but I think this is not an efficient way.
What I want to do is scan my whole memory once, and after that scan only those processes which are newly created in main memory. Is there any way to get only the newly created processes from memory, rather than the whole memory?
I think it cannot be done purely in user space program.
I am not very familiar with Windows API but I can give you some rough hint on how to do it.
You should scan every program BEFORE it is executed.
Make all the memory a program allocated later non-executable (NX bit)
The program will trigger a page fault when it tries to execute that memory; scan it now!
Make the scanned memory read-only, and then go on to execute it
Make it non-executable again once a program tries to write this area.
NOTE: If you want to ensure security, a memory area can NEVER become executable and writable at the same time.
This way you only need to check memory that is actually being executed, and it should be very efficient.
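The steps above amount to a two-state machine per memory page. The following is a small user-space model of that logic only; it does not touch real page protections, which would need OS or driver support as the answer says. TrackedPage, onExecute, onWrite and the scanner callback are hypothetical names.

```cpp
#include <cstddef>
#include <functional>

// A page is either writable and non-executable, or scanned and
// read-only/executable. It is never writable and executable at once.
enum class PageState { WritableNoExec, ScannedReadExec };

struct TrackedPage {
    PageState state = PageState::WritableNoExec;
    std::size_t scans = 0;  // how many times the scanner ran on this page
};

// Models an execute attempt. Returns true if execution may proceed.
inline bool onExecute(TrackedPage& page, const std::function<bool()>& scanIsClean) {
    if (page.state == PageState::ScannedReadExec) return true;  // no fault
    ++page.scans;                          // page fault: scan before running
    if (!scanIsClean()) return false;      // block execution of flagged code
    page.state = PageState::ScannedReadExec;
    return true;
}

// Models a write attempt: the page becomes writable again and loses its
// executable status, so the next execute attempt triggers a fresh scan.
inline void onWrite(TrackedPage& page) {
    page.state = PageState::WritableNoExec;
}
```

In this model a page is rescanned only after it has been written to, which is the efficiency gain the answer describes.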

Accessing Memory of other applications C++

I am thinking about a problem I have been having for some time now.. I would like to write a C/C++ program (under windows first) that can access(read/change values) the memory(stack, heap, everything) of other running programs. (Not like shared memory but any memory the computer has..) Without having to start the application from my own application..
I have seen something like this before but I just can't figure out how it's done.. If I were to access the memory of any running program I would get errors from the OS right?
Any help is appreciated!
As @sharptooth said, this requires support from the OS. Different OSes do it differently. Since you are on Windows, there are a few steps you could follow:
Call OpenProcess to access an existing process, or CreateProcess to launch a new one. In this call, you must request PROCESS_VM_READ access.
Call ReadProcessMemory to read a chunk of memory in that opened process.
If you want to change memory of another process, you then need PROCESS_VM_WRITE access and use WriteProcessMemory to achieve that.
In Linux, for example, you'd use ptrace to attach to a process and peek, poke its memory.
You can start a process (another program) from your own application, and access some of its information (especially shared memory). The contrary is very difficult, the CPU fakes the memory addresses so each process believes that it has the whole memory available...
You might be interested in taking a look at the Toolhelp32ReadProcessMemory function.

How to create binary/hex dump of another process's memory?

I am having trouble finding a reasonable way to dump another process's memory to a file.
After extensive searching, I've been able to find a nice article at CodeProject that has *most* of the functionality I want:
Performing a hex dump of another process's memory. This does a good job of addressing permission issues and sets a good foundation.
However, with this utility I've seen that even a small process, such as a clean Notepad.exe or Calc.exe instance, can generate a dump file over 24MB in size, while the process itself runs under 20KB in memory according to Task Manager.
The article has led me to believe that perhaps it is also dumping things in shared memory, possibly DLL space and the like. For example, a dump of Calc.exe will include sections that include method names (and presumably memory) from Kernel32.dll:
²³´µKERNEL32.dll ActivateActCtx AddAtomA AddAtomW AddConsoleAliasA AddConsoleAliasW AddLocalAlternateComputerNameA AddLocalAlternateComputerNameW AddRefActCtx AddVectoredExceptionHandler AllocConsole AllocateUserPhysicalPages AreFileApisANSI AssignProcessToJobObject AttachConsole BackupRead BackupSeek BackupWrite BaseCheckAppcompatCache BaseCleanupAppcompatCache
Is there a better way to dump the memory of another process that doesn't lead to this overhead, or perhaps an improvement upon the linked article's code that solves this problem? I want to get the memory that actually belongs to the process itself. I'd be okay with dumping the memory space of functions that are actually used in DLLs, but it seems unnecessary to dump the *entire* contents of multiple DLLs to get the running memory of the process.
I'm looking for a way to get the 30-60KB of a 30KB process, rather than 25MB for a 30KB process. Or at least closer than I can get currently.
Thanks in advance for your suggestions and guidance, it is appreciated.
Note: This is for a console utility, so GUI elements like the ones in the CodeProject article are unimportant.
You're basically asking for a user process minidump. The Windows Debug Helper library has a ready made function for this, MiniDumpWriteDump.
The MINIDUMP_TYPE parameter passed to the function gives coarse control over the amount of detail contained in the minidump. The most basic, MiniDumpNormal, will only capture the call stack of each thread in the process. The amount of memory captured gets progressively more detailed with the other minidump types.
You can also finely control the amount of information written into the minidump by providing a callback to the MiniDumpWriteDump function and setting the flags on the MINIDUMP_CALLBACK_OUTPUT structure in the callback.
The resulting minidumps can be read with a debugger like WinDbg or Visual Studio, or they can be processed by the various functions in the dbghelp.dll library.
Not really a "how to program it" answer, but I just found your question while looking for a tool that could do that, when I ran into PMDump:
http://ntsecurity.nu/toolbox/pmdump/
It's dead easy and simple to use, and creates correct dumps (I just tried it with some programs).