fork() and free all allocated memory - c++

I'm writing a service (i.e. a background process) and want to offer starting it via a shared library. That is, someone wanting to use the service would link to the shared library and call its start() method, which would fork and return. The fork would then run the service.
The problem with this approach is that the service process now might have a lot of legacy allocated memory it doesn't actually need. Is there a way to get rid of that and have the forked process allocate only its own stuff? I know about exec() of course, but the problems with that are
that I need an executable, which might not be in the location I expect due to different operating systems' folder layouts;
that I'd have to convert all potential parameters to strings to pass them as program arguments to exec().
So basically, I'm looking for a way to call an arbitrary function func() with some parameters that should run in a new process, and everything not passed into that function shouldn't be in the new process. Is there a way to achieve this or something similar?

This is an interesting question that I sadly don't have a good answer for. I doubt any cleanup strategy like sbrk+close+munmap will reliably allow libc-based code to continue functioning, so I'd try to make exec'ing work better:
For any kind of exec based solution, you should be able to deep-copy data into shm to pass non-strings. This should take care of your second problem.
Here are some wild suggestions for your first issue:
1. Don't solve it in code at all: just require that an executable is in PATH or in a compile-time-configured directory. This is transparent and follows the UNIX philosophy. An error message like "Can't find myhelper in PATH" will not slow anyone down. Most tools that depend on helper executables do this, and it's fine.
2. Make your library itself executable, and use it as your exec target. You can try finding its name with some kind of introspection, perhaps /proc/self/maps or whatever glibc offers.
3. Like the above, but exec python or something else you can be reasonably sure exists, and use a foreign function interface to run a function in your library.
4. As part of your build process, compile a tiny executable and include it as binary data in your library. Write it to /tmp and execute it.
Out of these, I prefer the simplicity and transparency of #1, even if that's the most boring solution.

Related

How can I change an address in another process with a value that can also change?

I am using C++ with Qt and I am struggling to find a way to achieve something I have never done before.
Here is what I want to achieve:
I have a client (let's call it Client.exe) which I don't have the source of, and a launcher (let's call it... Launcher.exe) which I do have the source of.
Client.exe needs a password and a username, which are supposed to come from Launcher.exe.
If I had only one password/username pair, I know I could make a .dll and inject it, but since I can have a lot of combinations, that is impossible.
So here is my question: what is the way to make a link allowing me to send the password and username from Launcher.exe to Client.exe?
A second question would be: is there a way to use VirtualProtect and that kind of thing (in order to modify some instructions in memory) on an executable, meaning without any injection? (I guess the answer is no, but I want to be sure.)
Your "Launcher.exe" and your DLL injected into "Client.exe" can communicate with each other via interprocess communication, for example through file mapping. This could be used for "Launcher.exe" to pass any desired username and password to "Client.exe".
However, the main problem I see is how to get "Client.exe" to use this data, if you do not have access to the source code and if it also does not provide an API for this.
If you want to trick "Client.exe" into using the data provided by you (or by your injected DLL) instead of the intended data, then you must reverse engineer the program and change the appropriate instructions so that they load your data instead of the original data. Since you do not have access to the C/C++ source code, you will have to understand the assembly language instructions to accomplish this.
In order to find out which instructions to change, you will likely need a debugging tool such as x64dbg, which is designed to debug applications that you haven't written yourself (and have no source code for) and possibly also a static analysis tool, such as IDA or Ghidra. Furthermore, if the program deliberately protects itself from reverse-engineering, you will have to learn how to overcome this (which can be very hard).
You could also accomplish this without injecting a DLL, by using WriteProcessMemory. You may need to also use VirtualAllocEx if you need extra memory inside the target process, for example for injecting instructions or data.
In any case, before tampering with another process's instructions or data, it may be advisable to suspend all of its threads using SuspendThread, and then resume them afterwards with ResumeThread. Otherwise, if the program runs while its instructions or data are in an inconsistent state, it may crash.

Detecting process memory injection on windows (anti-hack)

Standard hacking case: a hack injects into a started process and writes over process memory using the WriteProcessMemory call. In games this is not something you would want, because it can allow the hacker to change portions of the game and give himself an advantage.
There is a possibility to force the user to run a third-party program along with the game, and I would need to know what would be the best way to prevent such injection. I already tried to use the function EnumProcessModules, which lists all process DLLs, with no success. It seems to me that the hacks inject directly into process memory (end of stack?), and therefore go undetected. At the moment I have come down to a few options.
Create a blacklist of files, file patterns, process names and memory patterns of the best-known public hacks and scan for them with the program. The problem with this is that I would need to maintain the blacklist and also ship updates of the program to cover all available hacks. I also found this useful answer, Detecting memory access to a process, but it is possible that some existing DLL already uses those calls, so there could be false positives.
Using ReadProcessMemory to monitor changes at well-known memory offsets (hacks usually use the same offsets to achieve something). I would need to run a few hacks, monitor their behaviour and collect samples of hack behaviour compared to a normal run.
Would it be possible to somehow rearrange the process memory after it starts? Maybe just pushing the process memory down the stack could confuse the hack.
This is an example of the hack call:
WriteProcessMemory(phandler,0xsomeoffset,&datatowrite,...);
So unless the hack is a little smarter and searches for the actual start of the process, it would already be a great success. I wonder if there is a system call that could relocate the memory to another location or somehow insert some null data in front of the stack.
So, what would be the best way to go about this? It is a really interesting and dark area of programming, so I would like to hear as many ideas as possible. The goal is to either prevent the hack from working or detect it.
Best regards
Periodically compute the hash or CRC of the application's image stored in memory and compare it with a known hash or CRC.
Our service http://activation-cloud.com provides the ability to check the integrity of an application against a signature stored in a database.

On Sandboxing a memory-leaky 3rd-Party DLL

I am looking for a way to cure at least the symptoms of a leaky DLL I have to use. While the library (OpenCascade) claims to provide a memory manager, I have as of yet been unable to make it release any memory it allocated.
I would at least wish to put the calls to this module in a 'sandbox', in order to keep my application from losing memory while the OCC module isn't even running any more.
My question is: while I realise that it would be an UGLY HACK (TM) to do so, is it possible to preallocate a stretch of memory to be used specifically by the libraries, or to build some kind of sandbox around them so I can track what areas of memory they used, in order to release those areas myself when I am finished?
Or would that be too ugly a hack, and should I try to resolve the issues otherwise?
The only reliable way is to separate use of the library into a dedicated process. You will start that process, pass data and parameters to it, run the library code, retrieve results. Once you decide the memory consumption is no longer tolerable you restart the process.
Using a library that isn't broken would probably be much easier, but if a replacement isn't available you could try intercepting the allocation calls. If the library isn't too badly 'optimized' (specifically, no function inlining), you could disassemble it and locate the malloc and free calls; on loading, you could replace every 4-byte (or 8-byte on a 64-bit system) sequence that encodes those addresses with ones that point to your own memory allocator. This is almost guaranteed to be a buggy, unreadable timesink, though, so don't do it if you can find a working replacement.
Edit:
Saw @sharptooth's answer, which has a much better chance of working. I'd still advise trying to find a replacement, though.
You should ask Roman Lygin's opinion - he used to work at OCC. He has at least one post that mentions memory management: http://opencascade.blogspot.com/2009/06/developing-parallel-applications-with_23.html.
If you ask nicely, he might even write a post that explains mmgt's internals.

How can I log which thread called which function from which class and at what time throughout my whole project?

I am working on a fairly large project that runs on embedded systems. I would like to add the capability of logging which thread called which function from which class and at what time. E.g., here's what a typical line of the log file would look like:
Time - Thread Name - Function Name - Class Name
I know that I can do this by using the _penter hook function, which would execute at the beginning of every function called within my project (Source: http://msdn.microsoft.com/en-us/library/c63a9b7h%28VS.80%29.aspx). I could then find a library that would help me find the function, class, and thread from which _penter was called. However, I cannot use this solution since it is VC++ specific.
Is there a different way of doing this that would be supported by non-VC++ implementations? I am using the ARM/Thumb C/C++ Compiler, RVCT3.1. Additionally, is there a better way of tracking problems that may arise from multithreading?
Thank you,
Borys
I've worked with a system that had similar requirements (ARM embedded device). We had to build much of it from scratch, but we used some CodeWarrior stuff to do it, and then the map file for the function name lookup.
With CodeWarrior, you can get some code inserted into the start and end of each function, and using that, you can track when you enter each function and when you switch threads. We used assembly, and you might have to as well, but it's easier than you think. One of your registers will hold your return address, which is a hex value. If you compile with a map file, you can then use that hex value to look up the (mangled) name of the function. You can find the class name in the function name.
But, basically, get yourself a stream to somewhere (ideally to a desktop), and yell to the stream:
Entered Function #####
Left Function #####
Switched to Thread #
(PS - Actual encoding should be more like 1 21361987236, 2 1238721312, since you don't actually want to send characters)
If you're only ever processing one thread at a time, this should give you an accurate record of where you went, in the order you went there. Attach clock tick info for function profiling, add a message for allocations (and deallocations) and you get memory tracking.
If you're actually running multiple threads, it could get substantially harder, or be more of the same - I don't know. I'd put timing information on everything, and then have a separate stream for each thread. Although you might just be able to detect which processor you're running on, and report that, for which thread.... I don't, however, know if any of that will work.
Still, the basic idea was: Report on each step (function entry/exit, thread switching, and allocation), and then re-assemble the information you care about on the desktop side, where you have processing to spare.
gcc has the __PRETTY_FUNCTION__ define. With regard to the thread, you can always call gettid or similar.
I've written a few log systems that just increment a thread number and store it in thread-local data. That helps with attributing log statements to threads. (Time is easy to print out.)
For tracing all function calls automatically, I'm not so sure. If it's just a few, you can easily write an object and macro that log entry/exit using the __FUNCTION__ define (or something similar for your compiler).

Using dlopen, how can I cope with changes to the library file I have loaded?

I have a program written in C++ which uses dlopen to load a dynamic library (Linux, i386, .so). When the library file is subsequently modified, my program tends to crash. This is understandable, since presumably the file is simply mapped into memory.
My question is: other than simply creating a copy of the file myself and dlopening that, is there any way for me to load a shared object that is safe against subsequent modifications, or any way to recover from modifications to a shared object that I have loaded?
Clarification: The question is not "how can I install a new library without crashing the program", it is "if someone who I don't control is copying libraries around, is it possible for me to defend against that?"
If you rm the library prior to installing the new one, I think your system will keep the inode allocated, the file open, and your program running. (And when your program finally exits, then the mostly-hidden-but-still-there file resources are released.)
Update: OK, post-clarification. The dynamic linker actually completely "solves" this problem by passing the MAP_COPY flag, if available, to mmap(2). However, MAP_COPY does not exist on Linux and is not a planned future feature. Second best is MAP_DENYWRITE, which I believe the loader does use, which is in the Linux API, and which Linux used to honour. It makes writes fail while a region is mapped. It should still allow an rm-and-replace. The problem here is that anyone with read access to a file can map it and block writes, which opens a local DoS hole. (Consider /etc/utmp. There is a proposal to use the execute permission bit to fix this.)
You aren't going to like this, but there is a trivial kernel patch that will restore MAP_DENYWRITE functionality. Linux still has the feature, it just clears the bit in the case of mmap(2). You have to patch it in code that is duplicated per-architecture, for ia32 I believe the file is arch/x86/ia32/sys_ia32.c.
asmlinkage long sys32_mmap2(unsigned long addr, unsigned long len,
                            unsigned long prot, unsigned long flags,
                            unsigned long fd, unsigned long pgoff)
{
        struct mm_struct *mm = current->mm;
        unsigned long error;
        struct file *file = NULL;

        flags &= ~(MAP_EXECUTABLE | MAP_DENYWRITE); /* fix this line to not clear MAP_DENYWRITE */
This should be OK as long as you don't have any malicious local users with credentials. It's not a remote DoS, just a local one.
If you install a new version of the library, the correct procedure is to create a new file in the same directory, then rename it over the old one. The old file will remain while it's open, and continue to be used.
Package managers like RPM do this automatically - so you can update shared libraries and executables while they're running - but the old versions keep running.
In the case where you need to take a new version, restart the process or reload the library - restarting the process sounds better - your program can exec itself. Even init can do this.
It is not possible to defend against someone overwriting your library if they have file write permission.
Because dlopen memory maps the library file, all changes to the file are visible in every process that has it open.
The dlopen function uses memory mapping because it is the most memory efficient way to use shared libraries. A private copy would waste memory.
As others have said, the proper way to replace a shared library on a Unix system is to use unlink or rename, not to overwrite the library with a new copy. The install command will do this properly.
This is an intriguing question. I hate finding holes like this in Linux, and love finding ways to fix them.
My suggestion is inspired by @Paul Tomblin's answer to this question about temporary files on Linux. Some of the other answers here have suggested the existence of this mechanism, but have not described a method of exploiting it from the client application, as you requested.
I have not tested this, so I have no idea how well it will work. Also, there may be minor security concerns associated with a race condition related to the brief period of time between when the temporary file is created and when it is unlinked. Also, you already noted the possibility of creating a copy of the library, which is what I am proposing. My twist on this is that your temporary copy exists as an entry in the file system for only an instant, regardless of how long you actually hold the library open.
When you want to load a library follow these steps:
copy the file to a temporary location, probably starting with mkstemp()
load the temporary copy of the library using dlopen()
unlink() the temporary file
proceed as normal, the file's resources will be automatically removed when you dlclose()
It would be nice if there were a really easy way to achieve the "copy the file" step without requiring you to actually copy the file. Hard-linking comes to mind, but I don't think that it would work for these purposes. It would be ideal if Linux had a copy-on-write mechanism which was as easy to use as link(), but I am not aware of such a facility.
Edit: @Zan Lynx's answer points out that creating custom copies of dynamic libraries can be wasteful if they are replicated into multiple processes. So my suggestion probably only makes sense if it is applied judiciously -- to only those libraries which are at risk of being stomped (presumably a small subset of all libraries which does not include files in /lib or /usr/lib).
If you can figure out where your library is mapped into memory, then you might be able to mprotect it writeable and do a trivial write to each page (e.g. read and write back the first byte of each page). That should get you a private copy of every page.
If 'mprotect' doesn't work (it may not, the original file was probably opened read-only), then you can copy the region out to another location, remap the region (using mmap) to a private, writeable region, then copy the region back.
I wish the OS had a "transform this read-only region to a copy-on-write region". I don't think something like that exists, though.
In any of these scenarios, there is still a window of vulnerability - someone can modify the library while dlopen is calling initializers, or before your remap call happens. You're not really safe unless you can fix the dynamic linker as @DigitalRoss describes.
Who is editing your libraries out from under you, anyway? Find that person and hit him over the head with a frying pan.