fixing the recursive call to malloc with LD_PRELOAD - c++

I am using LD_PRELOAD to log malloc calls from an application and map out its virtual address space. However, malloc is also used internally by fopen/printf. Is there a way to fix this?
I know about glibc's hooks but I want to avoid changing the source code of the application.

My issue was caused by the fact that glibc uses malloc internally, so when I override malloc with LD_PRELOAD, any attempt to log from inside the override calls malloc again, which recurses into the override itself.
Solution:
Use a thread-local flag to detect re-entry, and call the original malloc directly whenever the override is entered recursively.
The code:
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // needed for RTLD_NEXT
#endif
#include <dlfcn.h>
#include <stdio.h>
static __thread int no_hook; // nonzero while this thread is inside the hook
static void *(*real_malloc)(size_t) = NULL;
static void __attribute__((constructor)) init(void) {
    real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
}
void *malloc(size_t len) {
    void *ret;
    void *caller;
    if (no_hook) {
        // Re-entered from the logging call below: skip logging.
        return (*real_malloc)(len);
    }
    no_hook = 1;
    caller = __builtin_return_address(0);
    printf("malloc call %zu from %p\n", len, caller);
    ret = (*real_malloc)(len);
    // fprintf(logfp, ") -> %p\n", ret);
    no_hook = 0;
    return ret;
}
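The same re-entrancy guard can be written as a small RAII helper in C++. The following is a minimal sketch, not the poster's code: `HookGuard`, `traced_alloc`, and `g_logged_calls` are hypothetical names, and `std::malloc` stands in for the pointer obtained from `dlsym(RTLD_NEXT, "malloc")` so that the example runs without LD_PRELOAD.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical counter standing in for the real logging call.
static int g_logged_calls = 0;

// Thread-local flag: true while this thread is already inside the hook.
static thread_local bool t_in_hook = false;

// RAII guard: marks the hook as active and clears the mark on scope exit,
// so an early return cannot leave the flag set.
class HookGuard {
public:
    HookGuard() : outermost_(!t_in_hook) { t_in_hook = true; }
    ~HookGuard() { if (outermost_) t_in_hook = false; }
    HookGuard(const HookGuard &) = delete;
    HookGuard &operator=(const HookGuard &) = delete;
    bool outermost() const { return outermost_; }
private:
    bool outermost_;
};

// Stand-in for the interposed malloc; std::malloc plays the role of real_malloc.
// nested_calls simulates a logger that itself allocates memory.
void *traced_alloc(std::size_t len, int nested_calls = 0) {
    HookGuard guard;
    if (guard.outermost()) {
        ++g_logged_calls;  // "log" only the outermost call
        for (int i = 0; i < nested_calls; ++i) {
            std::free(traced_alloc(16));  // re-enters the hook, is not logged
        }
    }
    return std::malloc(len);
}
```

Calling `traced_alloc(32, 3)` increments the counter once, because the three nested allocations see the flag already set and go straight to the underlying allocator.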

Related

What is the best memory management for GCC C++? [closed]

What is wrong with my memory management? It crashes at the point marked by a comment in the code below ("A memory block could not be found when trying to free."). I know that my memory management is not thread-safe, because it uses the global variables g_numBlocks and g_blocks, which is risky with multiple threads.
Since my memory management code seems too complex, can anyone suggest a stable, better approach to memory management in C++ that avoids memory leaks?
The code that contains the bug:
#include "emc-memory.h" // <-- Declare the functions MALLOC() and FREE() from other library.
#include <cstdio>
#include <string>
#include <vector>
int main() {
printf("HERE(1)\n");
{
std::vector<std::string> paths = { // <-- Problem, 'std::vector' & 'std::string' use internal malloc/free & operator new/delete that are overwritten with my own custom memory management.
"/foo/bar.txt",
"/foo/bar.",
"/foo/bar",
"/foo/bar.txt/bar.cc",
"/foo/bar.txt/bar.",
"/foo/bar.txt/bar",
"/foo/.",
"/foo/..",
"/foo/.hidden",
"/foo/..bar",
};
} // <-- It crashes here, error in FREE(): "A memory block could not be found when trying to free.".
printf("HERE(2)\n"); // The reason I know it crashes above is this line is not evaluated, only "HERE(1)" is printed. I'm using [RelWithDebInfo] with blurry debugging info.
return 0;
}
Compilers:
[Visual Studio 2015] [Debug]: No problem.
[Visual Studio 2015] [RelWithDebInfo]: No problem.
[GCC 12.1.0 x86_64-w64-mingw32] [Debug]: No problem.
[GCC 12.1.0 x86_64-w64-mingw32] [RelWithDebInfo]: Broken which means there's a bug.
In "emc-memory.h" in other library .so file
extern const char* __file;
extern int __line;
#define new (__file = __FILE__, __line = __LINE__, 0) ? 0 : new
enum MEMORYBLOCKTYPE {
MEMORYBLOCKTYPE_MALLOC,
MEMORYBLOCKTYPE_NEW,
};
void *MALLOC(size_t size, MEMORYBLOCKTYPE type);
void *REALLOC(void *block, size_t newSize);
void FREE(void *block, MEMORYBLOCKTYPE type);
#define malloc(size) ((__file = __FILE__, __line = __LINE__, 0) ? 0 : MALLOC(size, MEMORYBLOCKTYPE_MALLOC))
#define realloc(block, newSize) REALLOC(block, newSize)
#define free(block) FREE(block, MEMORYBLOCKTYPE_MALLOC)
In "emc-memory.cpp" in other library .so file
I use the code from this link to override operator new and delete: https://codereview.stackexchange.com/questions/7216/custom-operator-new-and-operator-delete
typedef unsigned long long BlockId; // The reason it's 64-bit is a memory block can be freed and reallocated multiple times, which means that there can be a lot of ids.
BlockId g_blockId = 0;
BlockId newBlockId() {
return g_blockId++;
}
struct Block {
const char *file;
int line;
const char *scope;
char *hint;
size_t size;
BlockId id; // That id is used for comparison because it will never be changed but the block pointer can be changed.
void *block;
MEMORYBLOCKTYPE type;
};
bool g_blocks_initialized = false;
int g_numBlocks;
Block **g_blocks;
void *MALLOC(size_t size, MEMORYBLOCKTYPE type) {
if (g_blocks_initialized == false) {
g_blocks_initialized = true;
_initializeList(g_numBlocks, g_blocks);
}
Block *b = (Block *)malloc(sizeof(*b));
b->file = __file ; __file = nullptr;
b->line = __line ; __line = 0;
b->scope = __scope; __scope = nullptr;
b->hint = allocateMemoryHint(__hint);
b->size = size;
b->id = newBlockId();
b->block = malloc(size);
b->type = type;
_addListItem(g_numBlocks, g_blocks, b);
return b->block;
}
void FREE(void *block, MEMORYBLOCKTYPE type) {
if (block == nullptr) {
return; // 'free' can free a nullptr.
}
for (int i = 0; i < g_numBlocks; i++) {
Block *b = g_blocks[i];
if (b->block == block) {
if (b->type != type) {
switch (type) {
case MEMORYBLOCKTYPE_MALLOC: EMC_ERROR("The memory block type must be MALLOC."); break;
case MEMORYBLOCKTYPE_NEW: EMC_ERROR("The memory block type must be NEW."); break;
default: EMC_ERROR("Error"); break;
}
}
_removeListItem(g_numBlocks, g_blocks, b);
freeMemoryHint(b->hint); b->hint = nullptr;
SAFE_FREE(b->block);
SAFE_FREE(b);
return;
}
}
EMC_ERROR("A memory block could not be found when trying to free.\n\nExamples:\n - Calling free(pointer) where pointer was not set to zero after it's been called twice, the solution was to use SAFE_FREE(). And if possible, replace any free() with SAFE_FREE(). For example, see Lexer::read0() on the original line \"free(out.asIdentifier);\".\n - If an 'Engine' object is destroyed before destroying a Vulkan object then it can cause this error (It can happen with 'Release' or 'RelWithDebInfo' configuration but not with 'Debug' configuration), that problem happened to me and I stuck there for hours until I realized it.");
}
I would humbly suggest that without a very clear reason to think otherwise the best memory management for GCC C++ is the out-of-the-box default memory management for GCC C++.
That would mean your best solution would have been to do nothing or, as things stand now, to strip out your overrides of the global operators.
You may find that in some area of a system the default memory management is sub-optimal, but in 2022 the default options are very effective, and if you find a general-purpose strategy that is better, it's a publishable paper.
However, your question tells us nothing about the application or about why you think you should change the memory management at all, so it is hard to advise on what to change it to.
Sure, you can add a global allocation mutex to serialize memory management and make it thread-safe. I would be surprised if that didn't throw away more than whatever advantage you're hoping to gain.
Not sure where SAFE_FREE is coming from.
If you look at the MALLOC function, it uses the C runtime malloc, which means that to free the block you need to use the corresponding free() function.
Make sure that SAFE_FREE is indeed using the C runtime free with the correct parameters.
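To make the pairing concrete, here is a minimal sketch of a tracker that always releases a block with the same C runtime that allocated it, and that replaces the linear scan with a hash map guarded by a mutex. It is an illustration under assumptions, not a fix for the code in the question: `TrackedHeap` and its members are hypothetical names, and the sketch assumes the global operator new/delete are not replaced with functions that call back into the tracker (otherwise the map's own allocations would recurse).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>
#include <unordered_map>

class TrackedHeap {
public:
    // Allocate with std::malloc and remember the block's size.
    void *allocate(std::size_t size) {
        void *block = std::malloc(size);
        if (block != nullptr) {
            std::lock_guard<std::mutex> lock(mutex_);
            blocks_[block] = size;
        }
        return block;
    }

    // Release with std::free, the counterpart of std::malloc.
    // Returns false when the pointer is unknown (never tracked or already freed).
    bool release(void *block) {
        if (block == nullptr) return true;  // free(nullptr) is a no-op
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = blocks_.find(block);
            if (it == blocks_.end()) return false;
            blocks_.erase(it);
        }
        std::free(block);
        return true;
    }

    // Number of blocks that are currently allocated and tracked.
    std::size_t live_blocks() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return blocks_.size();
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<void *, std::size_t> blocks_;
};
```

Returning `false` for an unknown pointer lets the caller decide how to report the problem, instead of raising an error from inside the allocator.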

Organization of Initialization code

Good day,
I have been working on memory allocations for portable use.
I have a header file which contains function prototypes and operator overloads for new and delete.
void* mem_align16(size_t size);
void mem_delete16(void* memory);
I then have .cpp files for each operating system implementation, for example sbrk for Linux and HeapAlloc for Windows.
When working with Windows I must use a handle to do memory allocations.
HANDLE heap_handle = HeapCreate(0, 0, 0);
How would I use the handle in a clean, organized manner, given that it needs to be created at startup inside main? The only way I can think of is to keep it as a static variable in the .cpp file where the functions are defined, provide a function that sets that variable, and forward-declare that function in main.cpp:
//inside of win32_heap.cpp
static HANDLE heaphandle = 0;
void make_heap_handle(void) {
heaphandle = HeapCreate(0, 0, 0);
}
//inside of win32_main.cpp
void make_heap_handle(void);
int main(int argc, char** argv) {
make_heap_handle();
return (0);
}
For some reason this feels like the wrong thing to do in order to get a usable handle for use with my functions.
You don't need to use HeapCreate() in order to use HeapAlloc()/HeapFree(); you could just use GetProcessHeap() instead:
void* mem_align16(size_t size)
{
// adjust size as needed...
void *memory = HeapAlloc(GetProcessHeap(), 0, size);
if (!memory) return nullptr;
// adjust memory as needed ...
return memory;
}
void mem_delete16(void* memory)
{
// adjust memory as needed...
HeapFree(GetProcessHeap(), 0, memory);
}
However, if you do want to create a private heap of your own, you can create a singleton class that is hidden inside your win32_heap.cpp file to create and destroy your private heap. The rest of your code outside of win32_heap.cpp will not be privy to its existence:
class MyHeap
{
public:
HANDLE hHeap;
MyHeap() {
hHeap = HeapCreate(0, 0, 0);
}
~MyHeap() {
HeapDestroy(hHeap);
}
};
static MyHeap gMyHeap;
void* mem_align16(size_t size)
{
// adjust size as needed...
void *memory = HeapAlloc(gMyHeap.hHeap, 0, size);
if (!memory) return 0;
// adjust memory as needed ...
return memory;
}
void mem_delete16(void* memory)
{
// adjust memory as needed...
HeapFree(gMyHeap.hHeap, 0, memory);
}
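The "adjust size" and "adjust memory" comments above leave the 16-byte alignment step open. One common way to fill it in is to over-allocate, round the address up, and store the original pointer just before the aligned block. The sketch below is an assumption about what that adjustment could look like, not the answerer's code; `raw_alloc` and `raw_free` are hypothetical stand-ins (implemented with `std::malloc`/`std::free` so the example is portable) for `HeapAlloc` and `HeapFree`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-ins for HeapAlloc(heap, 0, size) / HeapFree(heap, 0, p).
static void *raw_alloc(std::size_t size) { return std::malloc(size); }
static void raw_free(void *memory) { std::free(memory); }

void *mem_align16(std::size_t size) {
    const std::size_t alignment = 16;
    // Room for worst-case padding plus the saved raw pointer.
    const std::size_t overhead = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - overhead) return nullptr;  // request would overflow
    void *raw = raw_alloc(size + overhead);
    if (!raw) return nullptr;

    // Skip past the slot for the saved pointer, then round up to 16.
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void *);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~static_cast<std::uintptr_t>(alignment - 1);

    // Remember the raw pointer immediately before the aligned block.
    reinterpret_cast<void **>(aligned)[-1] = raw;
    return reinterpret_cast<void *>(aligned);
}

void mem_delete16(void *memory) {
    if (!memory) return;
    // Recover the pointer that raw_alloc actually returned and free that.
    raw_free(static_cast<void **>(memory)[-1]);
}
```

The pointer handed to the underlying free function must be the one the underlying allocator returned, which is why the raw pointer is stored next to the block rather than recomputed.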

Override global new/delete and malloc/free with tcmalloc library

I want to override new/delete and malloc/free. My application links against the tcmalloc library. My aim is to add stats.
I call malloc from new. Below is an example; it is global.
void* my_malloc(size_t size, const char *file, int line, const char *func)
{
void *p = malloc(size);
....
....
....
return p;
}
#define malloc(X) my_malloc(X, __FILE__, __LINE__, __FUNCTION__)
void *
operator new(size_t size)
{
auto new_addr = malloc(size);
....
...
return new_addr;
}
The new/delete override is working fine.
My question is what happens in other files where I call malloc directly, for example:
first.cpp
malloc(sizeof(..))
second.cpp
malloc(sizeof(..))
How are these malloc calls interpreted, given that my macro is not in a header file?
tcmalloc provides new/delete hooks that can be used to implement any kind of tracking/accounting of memory usage. See e.g. AddNewHook in https://github.com/gperftools/gperftools/blob/master/src/gperftools/malloc_hook.h
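As for how the calls in first.cpp and second.cpp are treated: a macro is expanded by the preprocessor only in translation units where its definition is visible, and only below the point of definition. A file that never sees the `#define` calls the library malloc directly and bypasses `my_malloc`. The sketch below demonstrates this within a single file. `raw_alloc`, `my_alloc`, and `g_tracked_calls` are hypothetical names; `raw_alloc` stands in for malloc because the C++ standard does not permit defining a macro with the name of a standard library function in a file that includes the standard headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

static int g_tracked_calls = 0;  // hypothetical statistic

// Stand-in for the library allocator.
void *raw_alloc(std::size_t size) { return std::malloc(size); }

// Tracking wrapper, analogous to my_malloc in the question.
void *my_alloc(std::size_t size, const char *file, int line) {
    (void)file;
    (void)line;
    ++g_tracked_calls;
    return raw_alloc(size);  // macro not defined yet: plain function call
}

// Compiled before the macro exists: this is a plain call to raw_alloc,
// just like malloc in a .cpp file that never sees the #define.
void *alloc_without_macro(std::size_t size) { return raw_alloc(size); }

#define raw_alloc(X) my_alloc((X), __FILE__, __LINE__)

// Compiled after the macro: the preprocessor rewrites this call to my_alloc.
void *alloc_with_macro(std::size_t size) { return raw_alloc(size); }

#undef raw_alloc
```

To cover every file, the macro has to live in a header that each of them includes, or the tracking has to happen below the source level, for example through the allocator hooks mentioned above.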

Track memory allocation per function

So I know I can track memory allocation with methods of overloading new globally like so: http://www.almostinfinite.com/memtrack.html
However, I was wondering if there was a good way to do this per function so I can get a report of how much is allocated per function. Right now I can get file and lines and what the typeid is as in the link I provided but I would like to find which function is allocating the most.
What about doing something like: http://ideone.com/Wqjkrw
#include <iostream>
#include <cstdlib> // malloc, free
#include <cstring>
class MemTracker
{
private:
static char func_name[100];
static size_t current_size;
public:
MemTracker(const char* FuncName) {strcpy(&func_name[0], FuncName);}
static void inc(size_t amount) {current_size += amount;}
static void print() {std::cout<<func_name<<" allocated: "<<current_size<<" bytes.\n";}
static void reset() {current_size = 0; memset(&func_name[0], 0, sizeof(func_name)/sizeof(char));}
};
char MemTracker::func_name[100] = {0};
size_t MemTracker::current_size = 0;
void* operator new(size_t size)
{
MemTracker::inc(size);
return malloc(size);
}
void operator delete(void* ptr)
{
free(ptr);
}
void FuncOne()
{
MemTracker(__func__);
int* i = new int[100];
delete[] i;
i = new int[200];
delete[] i;
MemTracker::print();
MemTracker::reset();
}
void FuncTwo()
{
MemTracker(__func__);
char* c = new char[1024];
delete[] c;
c = new char[2048];
delete[] c;
MemTracker::print();
MemTracker::reset();
}
int main()
{
FuncOne();
FuncTwo();
FuncTwo();
FuncTwo();
return 0;
}
Prints:
FuncOne allocated: 1200 bytes.
FuncTwo allocated: 3072 bytes.
FuncTwo allocated: 3072 bytes.
FuncTwo allocated: 3072 bytes.
What platform are you using? There might be platform specific solutions without changing the functions in your code base.
If you are using Microsoft Visual Studio, you can use compiler switches /Gh and /GH to let the compiler call functions _penter and _pexit that you can define. In those functions, you can query how much memory the program is using. There should be enough information in there to figure out how much memory is allocated in each function.
Example code for checking memory usage is provided in this MSDN article.
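The MemTracker example above keeps a single function name, so a tracked function that calls another tracked function loses its own name. A possible refinement, sketched here under assumptions, keeps one running total per function and restores the caller's scope when the callee returns. `AllocScope`, `note_allocation`, and `bytes_for` are hypothetical names; `note_allocation` is what a replaced operator new would call with the requested size. The totals table is a fixed array so that the tracker itself never allocates, and it is not synchronized, so the sketch is single-threaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// One table entry per tracked function.
struct ScopeTotal {
    const char *name;
    std::size_t bytes;
};

static ScopeTotal g_totals[64] = {};
static std::size_t g_total_count = 0;
static thread_local ScopeTotal *t_current = nullptr;

// Find or create the entry for a function name.
static ScopeTotal *entry_for(const char *name) {
    for (std::size_t i = 0; i < g_total_count; ++i) {
        if (std::strcmp(g_totals[i].name, name) == 0) return &g_totals[i];
    }
    if (g_total_count == 64) return nullptr;  // table full: stop tracking
    g_totals[g_total_count] = ScopeTotal{name, 0};
    return &g_totals[g_total_count++];
}

// RAII scope: charges allocations to `name` until it goes out of scope,
// then restores the caller's scope so nested calls are attributed correctly.
class AllocScope {
public:
    explicit AllocScope(const char *name) : previous_(t_current) {
        t_current = entry_for(name);
    }
    ~AllocScope() { t_current = previous_; }
    AllocScope(const AllocScope &) = delete;
    AllocScope &operator=(const AllocScope &) = delete;
private:
    ScopeTotal *previous_;
};

// Would be called from a replaced operator new with the requested size.
void note_allocation(std::size_t size) {
    if (t_current != nullptr) t_current->bytes += size;
}

std::size_t bytes_for(const char *name) {
    ScopeTotal *entry = entry_for(name);
    return entry != nullptr ? entry->bytes : 0;
}

void inner() {
    AllocScope scope(__func__);
    note_allocation(100);
}

void outer() {
    AllocScope scope(__func__);
    note_allocation(10);
    inner();              // charged to "inner"
    note_allocation(20);  // charged to "outer" again
}
```

Because the totals accumulate across calls, a report printed at program exit shows which function requested the most memory overall.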

Problems with LD_PRELOAD and calloc() interposition for certain executables

Relating to a previous question of mine
I've successfully interposed malloc, but calloc seems to be more problematic.
That is, with certain host programs, calloc gets stuck in an infinite loop, possibly because of an internal calloc call inside dlsym. A basic test host does not exhibit this behaviour, but my system's "ls" command does.
Here's my code:
// build with: g++ -O2 -Wall -fPIC -shared -o libnano.so Main.cc -ldl
#include <stdio.h>
#include <dlfcn.h>
bool gNanoUp = false;// global
// Function types
typedef void* (*MallocFn)(size_t size);
typedef void* (*CallocFn)(size_t elements, size_t size);
struct MemoryFunctions {
MallocFn mMalloc;
CallocFn mCalloc;
};
MemoryFunctions orgMemFuncs;
// Save original methods.
void __attribute__((constructor)) __nano_init(void) {
fprintf(stderr, "NANO: init()\n");
// Get address of original functions
orgMemFuncs.mMalloc = (MallocFn)dlsym(RTLD_NEXT, "malloc");
orgMemFuncs.mCalloc = (CallocFn)dlsym(RTLD_NEXT, "calloc");
fprintf(stderr, "NANO: malloc() found #%p\n", orgMemFuncs.mMalloc);
fprintf(stderr, "NANO: calloc() found #%p\n", orgMemFuncs.mCalloc);
gNanoUp = true;
}
// replacement functions
extern "C" {
void *malloc(size_t size) {
if (!gNanoUp) __nano_init();
return orgMemFuncs.mMalloc(size);
}
void* calloc(size_t elements, size_t size) {
if (!gNanoUp) __nano_init();
return orgMemFuncs.mCalloc(elements, size);
}
}
Now, when I do the following, I get an infinite loop followed by a segfault:
% setenv LD_PRELOAD "./libnano.so"
% ls
...
NANO: init()
NANO: init()
NANO: init()
Segmentation fault (core dumped)
However if I comment out the calloc interposer, it almost seems to work:
% setenv LD_PRELOAD "./libnano.so"
% ls
NANO: init()
NANO: malloc() found #0x3b36274dc0
NANO: calloc() found #0x3b362749e0
NANO: init()
NANO: malloc() found #0x3b36274dc0
NANO: calloc() found #0x3b362749e0
<directory contents>
...
So something's up with "ls" that means init() gets called twice.
EDIT
Note that the following host program works correctly - init() is only called once, and calloc is successfully interposed, as you can see from the output.
// build with: g++ test.cc -o test
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char* argv[]) {
void* p = malloc(123);
printf("HOST p=%p\n", p);
free(p);
char* c = new char;
printf("HOST c=%p\n", c);
delete c;
void* ca = calloc(10,10);
printf("HOST ca=%p\n", ca);
free(ca);
}
% setenv LD_PRELOAD "./libnano.so"
% ./test
NANO: init()
NANO: malloc() found #0x3b36274dc0
NANO: calloc() found #0x3b362749e0
HOST p=0x601010
HOST c=0x601010
HOST ca=0x601030
I know I am a bit late (6 years). But I wanted to override calloc() today and faced a problem because dlsym() internally uses calloc(). I solved it using a simple technique and thought of sharing it here:
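static unsigned char buffer[8192]; // zero-initialized scratch memory for dlsym()
static int initializing = 0;       // nonzero while init() is running
void *calloc(size_t nmemb, size_t size)
{
    if (calloc_ptr == NULL) { // not yet obtained from dlsym
        if (initializing)
            return buffer; // nested call made by dlsym() itself
        initializing = 1;
        init(); // uses dlsym() to find address of the real calloc()
        initializing = 0;
    }
    return calloc_ptr(nmemb, size);
}
void free(void *in)
{
    if (in == buffer)
        return; // the static buffer never came from the real allocator
    free_ptr(in);
}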
buffer satisfies the need of dlsym() till the real calloc() has been located and my calloc_ptr function pointer initialized.
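A single shared buffer works as long as dlsym() asks for memory only once and for no more than 8192 bytes. A slightly more defensive variant, sketched below under the assumption that early requests are few and small, hands out consecutive chunks from a static arena. `bootstrap_calloc` and `is_bootstrap_pointer` are hypothetical names: the interposed calloc() would call the first while the real function is still unknown, and the interposed free() would use the second to skip arena pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Static storage is zero-initialized and chunks are never reused,
// so memory handed out here is already zeroed, as calloc requires.
alignas(16) static unsigned char g_arena[8192];
static std::size_t g_arena_used = 0;

// Bump allocator for requests that arrive before the real calloc is known.
// Not thread-safe: intended only for the short bootstrap window.
void *bootstrap_calloc(std::size_t nmemb, std::size_t size) {
    if (size != 0 && nmemb > SIZE_MAX / size) return nullptr;  // overflow
    std::size_t bytes = nmemb * size;
    if (bytes == 0) bytes = 1;  // give zero-size requests a distinct chunk
    // Round up to a multiple of 16 so that every chunk stays aligned.
    std::size_t rounded = (bytes + 15) & ~static_cast<std::size_t>(15);
    if (rounded < bytes || rounded > sizeof(g_arena) - g_arena_used) {
        return nullptr;  // wrapped around, or arena exhausted
    }
    void *chunk = g_arena + g_arena_used;
    g_arena_used += rounded;
    return chunk;
}

// True when the pointer came from the arena and must not reach the real free().
bool is_bootstrap_pointer(const void *p) {
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
    std::uintptr_t begin = reinterpret_cast<std::uintptr_t>(g_arena);
    return addr >= begin && addr < begin + sizeof(g_arena);
}
```

Memory taken from the arena is never returned, which is acceptable only because the bootstrap window is short.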
With regard to __nano_init() being called twice: You've declared the function as a constructor, so it's called when the library is loaded, and it's called a second time explicitly when your malloc() and calloc() implementations are first called. Pick one.
With regard to the calloc() interposer crashing your application: Some of the functions you're using, including dlsym() and fprintf(), may themselves be attempting to allocate memory, calling your interposer functions. Consider the consequences, and act accordingly.
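One way to act on this is to keep the logging path free of heap allocations altogether. The sketch below formats a number into a stack buffer by hand and emits it with the POSIX `write()` call instead of `fprintf()`. `format_size` and `log_alloc_size` are hypothetical helper names, and the sketch assumes a POSIX system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <unistd.h>  // write(), STDERR_FILENO (POSIX)

// Write `value` in decimal into `out` (capacity `cap`, including the
// terminating '\0'). Returns the number of characters written, or 0 if the
// buffer is too small. Uses no library formatting and no heap memory.
std::size_t format_size(char *out, std::size_t cap, std::size_t value) {
    char digits[32];  // a 64-bit value has at most 20 decimal digits
    std::size_t n = 0;
    do {
        digits[n++] = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    if (n + 1 > cap) return 0;
    for (std::size_t i = 0; i < n; ++i) out[i] = digits[n - 1 - i];
    out[n] = '\0';
    return n;
}

// Log one allocation size to stderr using only stack memory and write().
void log_alloc_size(std::size_t len) {
    char line[64] = "malloc ";
    std::size_t used = std::strlen(line);
    // Reserve one byte for the trailing newline.
    used += format_size(line + used, sizeof(line) - used - 1, len);
    line[used++] = '\n';
    ssize_t ignored = write(STDERR_FILENO, line, used);
    (void)ignored;  // nothing useful can be done if logging fails
}
```

An interposed malloc() or calloc() can call `log_alloc_size()` without triggering a nested allocation, because nothing on this path requests heap memory.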
Using dlsym based hooking can result in crashes, as dlsym calls back into the memory allocator. Instead use malloc hooks, as I suggested in your prior question; these can be installed without actually invoking dlsym at all.
You can get away with a preliminary poor calloc that simply returns NULL. This actually works on Linux, YMMV.
static void* poor_calloc(size_t nmemb, size_t size)
{
// So dlsym uses calloc internally, which will lead to infinite recursion, since our calloc calls dlsym.
// Apparently dlsym can cope with slightly wrong calloc, see for further explanation:
// http://blog.bigpixel.ro/2010/09/interposing-calloc-on-linux
return NULL; // This is a poor implementation of calloc!
}
You can also use sbrk to allocate the memory for "poor calloc".