I'm making an ASCII game and I need performance, so I decided to go with printf(). But there is a problem: I designed my character buffer as a multidimensional char ** array, and printing it outputs memory garbage instead of the data. I know it's possible to print it with a for loop, but performance drops rapidly that way. I need to printf it like a static array[][]. Is there a way?
I made an example with a working and a notWorking array. I only need printf() to work with the notWorking array.
edit: using Visual Studio 2015 on Win 10, and yes, I tested performance: cout is much slower than printf (though I don't really know why that is)
#include <iostream>
#include <cstdio>

int main()
{
    const int X_SIZE = 40;
    const int Y_SIZE = 20;

    char works[Y_SIZE][X_SIZE];
    char ** notWorking;

    notWorking = new char*[Y_SIZE];
    for (int i = 0; i < Y_SIZE; i++) {
        notWorking[i] = new char[X_SIZE];
    }

    for (int i = 0; i < Y_SIZE; i++) {
        for (int j = 0; j < X_SIZE; j++) {
            works[i][j] = '#';
            notWorking[i][j] = '#';
        }
        works[i][X_SIZE - 1] = '\n';
        notWorking[i][X_SIZE - 1] = '\n';
    }
    works[Y_SIZE - 1][X_SIZE - 1] = '\0';
    notWorking[Y_SIZE - 1][X_SIZE - 1] = '\0';

    printf("%s\n\n", works);
    printf("%s\n\n", notWorking);

    system("PAUSE");
}
Note: I think I could make some kind of a buffer or static array for just copying and displaying data, but I wonder if that can be done without it.
If you want to print a 2D structure with printf without a loop, you need to present it to printf as a contiguous, one-dimensional C string. Since your game needs access to the data as a 2D structure, you can make an array of pointers into this flat buffer, like this:
The array of pointers partitions the buffer for use as a 2D structure, while the buffer itself remains a contiguous C string that printf can print in one call.
Here is the same structure in code:
// X_SIZE+1 is for '\n's; overall +1 is for '\0'
char buffer[Y_SIZE * (X_SIZE + 1) + 1];
char *array[Y_SIZE];

// Setup the buffer and the array
for (int r = 0 ; r != Y_SIZE ; r++) {
    array[r] = &buffer[r * (X_SIZE + 1)];
    for (int c = 0 ; c != X_SIZE ; c++) {
        array[r][c] = '#';
    }
    array[r][X_SIZE] = '\n';
}
buffer[Y_SIZE * (X_SIZE + 1)] = '\0';

printf("%s\n", buffer);
Some things you can do to increase performance:
There is absolutely no reason to have an array of pointers, each pointing at a separately allocated array. That causes heap fragmentation, as your data ends up scattered all over the heap. Allocating the memory in adjacent cells has many benefits in terms of speed; for example, it improves the use of the data cache.
Instead, allocate a true 2D array:
char (*array2D) [Y] = new char [X][Y];
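A short sketch of what using such an allocation could look like; the sizes here are just examples, and the second dimension must be a compile-time constant:
const int X = 20, Y = 41;                  // example sizes; Y must be a compile-time constant
char (*array2D)[Y] = new char[X][Y];       // one contiguous block of X*Y chars
array2D[3][7] = '#';                       // normal 2D indexing still works
// Because the block is contiguous, it can (once '\0'-terminated) be printed in one call:
// printf("%s", &array2D[0][0]);
delete[] array2D;                          // a single delete[] frees the whole block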
printf and cout are both quite slow, as they come with a lot of overhead and extra features you don't need. Since they are just advanced wrappers around the system-specific console functions, you should consider using those functions directly, for example the Windows console API. It will, however, make your program non-portable.
If that's not an option, you could try to use puts instead of printf, since it has far less overhead.
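For instance, something along these lines (note that puts appends its own newline, while fputs writes the buffer as-is):
puts(buffer);            // writes buffer plus a trailing '\n'
fputs(buffer, stdout);   // writes buffer exactly as stored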
The main performance issue with printf/cout is that they write to the end of the standard output stream, meaning you can't write wherever you like, only at the bottom of the screen. That forces you to redraw the whole screen every time something changes, which is slow and can cause flicker.
Old DOS/Turbo C programs solved this with a non-standard function called gotoxy which allowed you to move the "cursor" and print where you liked. In modern programming, you can do this with the console API functions. Example for Windows.
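A minimal sketch of that approach on Windows (the function name drawFrame is mine): move the cursor back to the top-left with SetConsoleCursorPosition and overwrite the frame in place instead of printing at the bottom.
#include <windows.h>

void drawFrame(HANDLE console, const char* buffer, DWORD length)
{
    COORD topLeft = { 0, 0 };                         // the gotoxy(0, 0) equivalent
    SetConsoleCursorPosition(console, topLeft);
    DWORD written = 0;
    WriteConsoleA(console, buffer, length, &written, nullptr);
}

// Usage: drawFrame(GetStdHandle(STD_OUTPUT_HANDLE), buffer, strlen(buffer));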
You could/should separate graphics from the rest of the program. If you have one thread handling graphics only and the main thread handling the game logic, the graphics updates will run more smoothly, without waiting for whatever else the program is doing. It makes the program considerably more advanced, though, as you have to consider thread-safety issues.
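A rough sketch of that split (my own illustration, not from the answer), assuming the flat buffer from the first answer: the main thread updates the frame under a mutex and a render thread redraws it at a fixed rate.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <mutex>
#include <thread>

const int X_SIZE = 40, Y_SIZE = 20;           // same sizes as in the question

std::mutex frameMutex;
std::atomic<bool> running{ true };
char frame[Y_SIZE * (X_SIZE + 1) + 1];        // the flat buffer described above

void renderLoop()
{
    while (running)
    {
        {
            std::lock_guard<std::mutex> lock(frameMutex);
            printf("%s", frame);              // or the console-API drawFrame sketched above
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(16));   // roughly 60 FPS
    }
}
// The main thread locks frameMutex while writing to frame, and sets running = false to stop.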
This is code that checks whether a wide string is either L"false" or L"true", but when I run it and it tries to free the duplicated string pointer, I get this error: "HEAP CORRUPTION DETECTED: after Normal block(#135756) at 0x00000000002EB3A0. CRT detected that the application wrote to memory after end of heap buffer."
Here is the inline code:
const wchar_t* sequence = L"false";
wchar_t* duplicate;
size_t length = wcslen(sequence) + 1;
duplicate = static_cast<wchar_t*>(malloc(length));
wcscpy_s(duplicate, length, sequence);

int boolean = -1;
if (wcscmp(duplicate, L"false") == 0) {
    boolean = 0;
}
else if (wcscmp(duplicate, L"true") == 0) {
    boolean = 1;
}

free(duplicate);
All the string pointers seem to be OK right before the free statement. I am sure I have made some serious mistake, simply because I was able to corrupt the heap.
Compiler: Microsoft Visual Studio 2015 RC
Processor: Intel Core i5-3450 3.10 GHz
Use
duplicate = static_cast<wchar_t*>(malloc(length * sizeof(wchar_t)));
otherwise you do not have enough space for the wide string, since each wchar_t is larger than one byte (malloc counts bytes, not characters).
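For completeness, a corrected version of the snippet from the question (same logic, only the allocation size fixed and a null check added):
const wchar_t* sequence = L"false";
size_t length = wcslen(sequence) + 1;                  // +1 for the terminating L'\0'
wchar_t* duplicate = static_cast<wchar_t*>(malloc(length * sizeof(wchar_t)));
if (duplicate != NULL)
{
    wcscpy_s(duplicate, length, sequence);             // length is in wchar_t elements
    int boolean = -1;
    if (wcscmp(duplicate, L"false") == 0)
        boolean = 0;
    else if (wcscmp(duplicate, L"true") == 0)
        boolean = 1;
    free(duplicate);
}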
I wonder why this code doesn't work:
#include <iostream>
using namespace std;
int main()
{
    int *pointer = (int*)0x02F70BCC;
    cout << *pointer;
    return 0;
}
In my opinion it should print the value stored at 0x02F70BCC;
instead, my program crashes.
I know that the memory at address 0x02F70BCC stores the value 20.
But like I said, no matter what, it just doesn't show the correct number.
Please help me guys, a detailed explanation would be very nice of you.
It doesn't work because you don't get access to every memory location you want. Not every location in memory is valid; you may want to read about the Virtual Address Space.
Some address ranges are reserved for device drivers and kernel-mode operations. Other values (for example 0xCCCCCCCC) are fill patterns used for uninitialized memory in debug builds, so a pointer holding such a value almost never refers to anything valid.
Even if a location is valid, the operating system may still deny access to read from or write to it if that would violate memory protection or system safety.
EDIT
I think you might be interested in creating some kind of "game hack" that lets you modify the amount of resources, the number of units, experience levels, attributes or anything else.
Memory access is not a simple topic. Different OSes use different strategies to prevent security violations, but many things can be done here; after all, there is a lot of software for doing this kind of thing.
First of all, do you really need to write your own tool? If you just want to cheat, use ArtMoney - it is a great memory editor that I have been using for years.
But if you really have to write it manually, you need to do some research first.
On Windows, for example, I would start from these:
ReadProcessMemory
WriteProcessMemory
Also, I am quite certain that one possible technique is to pretend that you are a debugger:
DebugActiveProcess.
EDIT 2
I have done some research, and it looks like on Windows (I assume this is your platform, since you mentioned gaming) the steps required to write to another process's memory are:
1. Enumerate processes: (EnumProcesses)
const size_t MAX_PROC_NUM = 512;
DWORD procIDs[MAX_PROC_NUM] = { 0 };
DWORD idsNum = 0;
if (!EnumProcesses(procIDs, sizeof(DWORD) * MAX_PROC_NUM, &idsNum))
{
    //handle error here
}
idsNum /= sizeof(DWORD); //After EnumProcesses(), idsNum contains the number of BYTES!
2. Open the required process (OpenProcess, GetModuleFileNameEx).
// Note: EnumProcesses and GetModuleFileNameEx are declared in <Psapi.h> (link against Psapi.lib)
const char* game_exe_path = "E:\\Games\\Spellforce\\Spellforce.exe"; //Example
HANDLE game_proc_handle = nullptr;
DWORD proc_access = PROCESS_QUERY_INFORMATION | PROCESS_VM_READ | PROCESS_VM_WRITE; //read & write memory, query info needed to get .exe name
const DWORD MAX_EXE_PATH_LEN = 1024;

for (DWORD n = 0 ; n < idsNum ; ++n)   // note: ++n, not ++idsNum
{
    DWORD current_id = procIDs[n];
    HANDLE current_handle = OpenProcess(proc_access, false, current_id);
    if (!current_handle)
    {
        //handle error here
        continue;
    }

    char current_path[MAX_EXE_PATH_LEN];
    DWORD length = GetModuleFileNameEx(current_handle, nullptr, current_path, MAX_EXE_PATH_LEN);
    if (length > 0)
    {
        if (strcmp(current_path, game_exe_path) == 0) //that's our game!
        {
            game_proc_handle = current_handle;
            break;
        }
    }
    CloseHandle(current_handle); //don't forget this!
}

if (!game_proc_handle)
{
    //sorry, game not found
}
3. Write memory (WriteProcessMemory)
void* pointer = reinterpret_cast<void*>(0x02F70BCC);
int new_value = 5000; //value to be written
BOOL success = WriteProcessMemory(game_proc_handle, pointer, &new_value, sizeof(int), nullptr);
if (success)
{
    //data successfully written!
}
else
{
    //well, that's... em...
}
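For reading the value back (which is what the original cout<<*pointer attempt was really after), a small follow-up sketch of mine using ReadProcessMemory, not part of the answer's original steps:
int read_value = 0;
SIZE_T bytes_read = 0;
if (ReadProcessMemory(game_proc_handle, pointer, &read_value, sizeof(int), &bytes_read))
    printf("current value: %d\n", read_value);
else
    printf("ReadProcessMemory failed: %lu\n", (unsigned long)GetLastError());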
The code in the steps above is written just 'as is', but I see no errors, so you can use it as a starting point. I also provided links for all the functions I used, so with some additional research (if necessary) you can achieve what you are trying to do.
Cheers.
When you use,
cout<<*pointer;
the program dereferences the pointer and prints the value stored at that address.
If you want to print just the pointer, use:
cout << pointer;
Example:
#include <iostream>

int main()
{
    int i = 20;
    int* p = &i;

    std::cout << *p << std::endl; // print the value stored at the address
                                  // pointed to by p. In this case, it will
                                  // print the value of i, which is 20
    std::cout << p << std::endl;  // print the address that p points to
                                  // It will print the address of i.
}
So I used this example of the HeapWalk function to implement it in my app. I played around with it a bit and saw that when I added
HANDLE d = HeapAlloc(hHeap, 0, sizeof(int));
int* f = new(d) int;
after creating the heap, some new output would be logged:
Allocated block Data portion begins at: 0X037307E0
Size: 4 bytes
Overhead: 28 bytes
Region index: 0
Seeing this, I thought I could check Entry.wFlags to see if it was set to PROCESS_HEAP_ENTRY_BUSY, to keep track of how much allocated memory I'm using on the heap. So I have:
HeapLock(heap);
int totalUsedSpace = 0, totalSize = 0, largestFreeSpace = 0, largestCounter = 0;

PROCESS_HEAP_ENTRY entry;
entry.lpData = NULL;
while (HeapWalk(heap, &entry) != FALSE)
{
    int entrySize = entry.cbData + entry.cbOverhead;
    if ((entry.wFlags & PROCESS_HEAP_ENTRY_BUSY) != 0)
    {
        // We have allocated memory in this block
        totalUsedSpace += entrySize;
        largestCounter = 0;
    }
    else
    {
        // We do not have allocated memory in this block
        largestCounter += entrySize;
        if (largestCounter > largestFreeSpace)
        {
            // Save this value as we've found a bigger space
            largestFreeSpace = largestCounter;
        }
    }

    // Keep a track of the total size of this heap
    totalSize += entrySize;
}
HeapUnlock(heap);
And this appears to work when built in Debug mode (totalSize and totalUsedSpace end up with different values). However, when I run it in Release mode, totalUsedSpace is always 0.
I stepped through it with the debugger while in Release mode and for each heap it loops three times and I get the following flags in entry.wFlags from calling HeapWalk:
1 (PROCESS_HEAP_REGION)
0
2 (PROCESS_HEAP_UNCOMMITTED_RANGE)
It then exits the while loop and GetLastError() returns ERROR_NO_MORE_ITEMS as expected.
From here I found that a flag value of 0 is "the committed block which is free, i.e. not being allocated or not being used as control structure."
Does anyone know why it does not work as intended when built in Release mode? I don't have much experience with how memory is handled by the computer, so I'm not sure where the error might be coming from. Searching on Google didn't come up with anything, so hopefully someone here knows.
UPDATE: I'm still looking into this myself and if I monitor the app using vmmap I can see that the process has 9 heaps, but when calling GetProcessHeaps it returns that there are 22 heaps. Also, none of the heap handles it returns matches to the return value of GetProcessHeap() or _get_heap_handle(). It seems like GetProcessHeaps is not behaving as expected. Here is the code to get the list of heaps:
// Count how many heaps there are and allocate enough space for them
DWORD numHeaps = GetProcessHeaps(0, NULL);
HANDLE* handles = new HANDLE[numHeaps];
// Get a handle to known heaps for us to compare against
HANDLE defaultHeap = GetProcessHeap();
HANDLE crtHeap = (HANDLE)_get_heap_handle();
// Get a list of handles to all the heaps
DWORD retVal = GetProcessHeaps(numHeaps, handles);
And retVal is the same value as numHeaps, which indicates that there was no error.
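For reference, a small sketch (my addition, not part of the original question) of the comparison described above: check whether the well-known handles actually appear in the list GetProcessHeaps returned.
bool foundDefault = false, foundCrt = false;
for (DWORD i = 0; i < retVal; ++i)
{
    if (handles[i] == defaultHeap) foundDefault = true;
    if (handles[i] == crtHeap)     foundCrt = true;
}
printf("default heap in list: %d, CRT heap in list: %d\n", foundDefault, foundCrt);
delete[] handles; // don't leak the handle array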
Application Verifier had been set up previously to do full page heap verification of my executable, and it was interfering with the heaps returned by GetProcessHeaps. I'd forgotten it was set up, as it was done for a different issue several days earlier and then closed without clearing the tests. It wasn't happening in the debug build because the application builds to a different file name for debug builds.
We managed to detect this by adding a breakpoint and looking at the call stack of the thread. We could see that the Application Verifier DLL had been injected, and that told us where to look.
What would be a good way to detect a C++ memory leak in an embedded environment? I tried overloading the new operator to log every allocation, but I must have done something wrong, because that approach isn't working. Has anyone else run into a similar situation?
This is the code for the new and delete operator overloading.
EDIT:
Full disclosure: I am looking for a memory leak in my program, and I am using this code, which someone else wrote, to overload the new and delete operators. Part of my problem is that I don't fully understand what it does. I know the goal is to log the address of the caller and the previous caller, the size of the allocation, a 1 if we are allocating, a 2 if we are deallocating, plus the name of the thread that is running.
Thanks for all the suggestions, I am going to try a different approach that someone here at work suggested. If it works, I will post it here.
Thanks again to all you first-rate programmers for taking the time to answer.
StackOverflow rocks!
Conclusion
Thanks for all the answers. Unfortunately, I had to move on to a different more pressing issue. This leak only occurred under a highly unlikely scenario. I feel crappy about just dropping it, I may go back to it if I have more time. I chose the answer I am most likely to use.
#include <stdlib.h>
#include "stdio.h"
#include "nucleus.h"
#include "plus/inc/dm_defs.h"
#include "plus/inc/pm_defs.h"
#include "posix\inc\posix.h"
extern void* TCD_Current_Thread;
extern "C" void rd_write_text(char * text);
extern PM_PCB * PMD_Created_Pools_List;
typedef struct {
void* addr;
uint16_t size;
uint16_t flags;
} MemLogEntryNarrow_t;
typedef struct {
void* addr;
uint16_t size;
uint16_t flags;
void* caller;
void* prev_caller;
void* taskid;
uint32_t timestamp;
} MemLogEntryWide_t;
//size lookup table
unsigned char MEM_bitLookupTable[] = {
0,1,1,2,1,2,2,3,1,2,2,3,1,3,3,4
};
//#pragma CODE_SECTION ("section_ramset1_0")
void *::operator new(unsigned int size)
{
asm(" STR R14, [R13, #0xC]"); //save stack address temp[0]
asm(" STR R13, [R13, #0x10]"); //save pc return address temp[1]
if ( loggingEnabled )
{
uint32_t savedInterruptState;
uint32_t currentIndex;
// protect the thread unsafe section.
savedInterruptState = NU_Local_Control_Interrupts(NU_DISABLE_INTERRUPTS);
// Note that this code is FRAGILE. It peeks backwards on the stack to find the return
// address of the caller. The location of the return address on the stack can be easily changed
// as a result of other changes in this function (i.e. adding local variables, etc).
// The offsets may need to be adjusted if this function is touched.
volatile unsigned int temp[2];
unsigned int *addr = (unsigned int *)temp[0] - 1;
unsigned int count = 1 + (0x20/4); //current stack space ***
//Scan for previous store
while ((*addr & 0xFFFF0000) != 0xE92D0000)
{
if ((*addr & 0xFFFFF000) == 0xE24DD000)
{
//add offset in words
count += ((*addr & 0xFFF) >> 2);
}
addr--;
}
count += MEM_bitLookupTable[*addr & 0xF];
count += MEM_bitLookupTable[(*addr >>4) & 0xF];
count += MEM_bitLookupTable[(*addr >> 8) & 0xF];
count += MEM_bitLookupTable[(*addr >> 12) & 0xF];
addr = (unsigned int *)temp[1] + count;
// FRAGILE CODE ENDS HERE
currentIndex = currentMemLogWriteIndex;
currentMemLogWriteIndex++;
if ( memLogNarrow )
{
if (currentMemLogWriteIndex >= MEMLOG_SIZE/2 )
{
loggingEnabled = false;
rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
}
// advance the read index if necessary.
if ( currentMemLogReadIndex == currentMemLogWriteIndex )
{
currentMemLogReadIndex++;
if ( currentMemLogReadIndex == MEMLOG_SIZE/2 )
{
currentMemLogReadIndex = 0;
}
}
NU_Local_Control_Interrupts(savedInterruptState);
//Standard operator
//(For Partition Analysis we have to consider that if we alloc size of 0 always as size of 1 then are partitions must be optimized for this)
if (size == 0) size = 1;
((MemLogEntryNarrow_t*)memLog)[currentIndex].size = size;
((MemLogEntryNarrow_t*)memLog)[currentIndex].flags = 1; //allocated
//Standard operator
void * ptr;
ptr = malloc(size);
((MemLogEntryNarrow_t*)memLog)[currentIndex].addr = ptr;
return ptr;
}
else
{
if (currentMemLogWriteIndex >= MEMLOG_SIZE/6 )
{
loggingEnabled = false;
rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
}
// advance the read index if necessary.
if ( currentMemLogReadIndex == currentMemLogWriteIndex )
{
currentMemLogReadIndex++;
if ( currentMemLogReadIndex == MEMLOG_SIZE/6 )
{
currentMemLogReadIndex = 0;
}
}
((MemLogEntryWide_t*)memLog)[currentIndex].caller = (void *)(temp[0] - 4);
((MemLogEntryWide_t*)memLog)[currentIndex].prev_caller = (void *)*addr;
NU_Local_Control_Interrupts(savedInterruptState);
((MemLogEntryWide_t*)memLog)[currentIndex].taskid = (void *)TCD_Current_Thread;
((MemLogEntryWide_t*)memLog)[currentIndex].size = size;
((MemLogEntryWide_t*)memLog)[currentIndex].flags = 1; //allocated
((MemLogEntryWide_t*)memLog)[currentIndex].timestamp = *(volatile uint32_t *)0xfffbc410; // for arm9
//Standard operator
if (size == 0) size = 1;
void * ptr;
ptr = malloc(size);
((MemLogEntryWide_t*)memLog)[currentIndex].addr = ptr;
return ptr;
}
}
else
{
//Standard operator
if (size == 0) size = 1;
void * ptr;
ptr = malloc(size);
return ptr;
}
}
//#pragma CODE_SECTION ("section_ramset1_0")
void ::operator delete(void *ptr)
{
uint32_t savedInterruptState;
uint32_t currentIndex;
asm(" STR R14, [R13, #0xC]"); //save stack address temp[0]
asm(" STR R13, [R13, #0x10]"); //save pc return address temp[1]
if ( loggingEnabled )
{
savedInterruptState = NU_Local_Control_Interrupts(NU_DISABLE_INTERRUPTS);
// Note that this code is FRAGILE. It peeks backwards on the stack to find the return
// address of the caller. The location of the return address on the stack can be easily changed
// as a result of other changes in this function (i.e. adding local variables, etc).
// The offsets may need to be adjusted if this function is touched.
volatile unsigned int temp[2];
unsigned int *addr = (unsigned int *)temp[0] - 1;
unsigned int count = 1 + (0x20/4); //current stack space ***
//Scan for previous store
while ((*addr & 0xFFFF0000) != 0xE92D0000)
{
if ((*addr & 0xFFFFF000) == 0xE24DD000)
{
//add offset in words
count += ((*addr & 0xFFF) >> 2);
}
addr--;
}
count += MEM_bitLookupTable[*addr & 0xF];
count += MEM_bitLookupTable[(*addr >>4) & 0xF];
count += MEM_bitLookupTable[(*addr >> 8) & 0xF];
count += MEM_bitLookupTable[(*addr >> 12) & 0xF];
addr = (unsigned int *)temp[1] + count;
// FRAGILE CODE ENDS HERE
currentIndex = currentMemLogWriteIndex;
currentMemLogWriteIndex++;
if ( memLogNarrow )
{
if ( currentMemLogWriteIndex >= MEMLOG_SIZE/2 )
{
loggingEnabled = false;
rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
}
// advance the read index if necessary.
if ( currentMemLogReadIndex == currentMemLogWriteIndex )
{
currentMemLogReadIndex++;
if ( currentMemLogReadIndex == MEMLOG_SIZE/2 )
{
currentMemLogReadIndex = 0;
}
}
NU_Local_Control_Interrupts(savedInterruptState);
// finish logging the fields. these are thread safe so they dont need to be inside the protected section.
((MemLogEntryNarrow_t*)memLog)[currentIndex].addr = ptr;
((MemLogEntryNarrow_t*)memLog)[currentIndex].size = 0;
((MemLogEntryNarrow_t*)memLog)[currentIndex].flags = 2; //unallocated
}
else
{
((MemLogEntryWide_t*)memLog)[currentIndex].caller = (void *)(temp[0] - 4);
((MemLogEntryWide_t*)memLog)[currentIndex].prev_caller = (void *)*addr;
if ( currentMemLogWriteIndex >= MEMLOG_SIZE/6 )
{
loggingEnabled = false;
rd_write_text( "Allocation Logging is complete and DISABLED!\r\n\r\n");
}
// advance the read index if necessary.
if ( currentMemLogReadIndex == currentMemLogWriteIndex )
{
currentMemLogReadIndex++;
if ( currentMemLogReadIndex == MEMLOG_SIZE/6 )
{
currentMemLogReadIndex = 0;
}
}
NU_Local_Control_Interrupts(savedInterruptState);
// finish logging the fields. these are thread safe so they dont need to be inside the protected section.
((MemLogEntryWide_t*)memLog)[currentIndex].addr = ptr;
((MemLogEntryWide_t*)memLog)[currentIndex].size = 0;
((MemLogEntryWide_t*)memLog)[currentIndex].flags = 2; //unallocated
((MemLogEntryWide_t*)memLog)[currentIndex].taskid = (void *)TCD_Current_Thread;
((MemLogEntryWide_t*)memLog)[currentIndex].timestamp = *(volatile uint32_t *)0xfffbc410; // for arm9
}
//Standard operator
if (ptr != NULL) {
free(ptr);
}
}
else
{
//Standard operator
if (ptr != NULL) {
free(ptr);
}
}
}
If you're running Linux, I suggest trying Valgrind.
There are several forms of operator new:
void *operator new (size_t);
void *operator new [] (size_t);
void *operator new (size_t, void *);
void *operator new [] (size_t, void *);
void *operator new (size_t, /* parameters of your choosing! */);
void *operator new [] (size_t, /* parameters of your choosing! */);
All the above can exist at both global and class scope. For each operator new, there is an equivalent operator delete. You need to make sure you are adding logging to all versions of the operator, if that is the way you want to do it.
Ideally, you would want the system to behave the same regardless of whether the memory logging is present or not. For example, the MS VC run-time library allocates more memory in debug than in release because it prefixes each allocation with a bigger information block and adds guard blocks to the start and end of the allocation. The best solution is to keep all the memory logging information in a separate chunk of memory and use a map to track the allocations. This can also be used to verify that the memory passed to delete is valid.
new:
    allocate memory
    add entry to logging table
delete:
    check address exists in logging table
    free memory
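A minimal sketch of that separate-table idea (my own illustration, not the answer's code): only the single-object operator new/delete, no thread safety, and a fixed-size table so the logger itself never allocates and cannot recurse.
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

namespace {
    struct AllocRecord { void* addr; std::size_t size; };
    AllocRecord g_table[4096];          // table capacity is an arbitrary assumption
    std::size_t g_count = 0;
}

void* operator new(std::size_t size)
{
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    if (g_count < sizeof g_table / sizeof g_table[0])
        g_table[g_count++] = { p, size };           // "add entry to logging table"
    return p;
}

void operator delete(void* p) noexcept
{
    if (!p) return;
    for (std::size_t i = 0; i < g_count; ++i) {     // "check address exists in logging table"
        if (g_table[i].addr == p) {
            g_table[i] = g_table[--g_count];        // remove the entry
            break;
        }
    }
    std::free(p);                                   // "free memory"
}

// Anything still in the table at shutdown is a leak candidate.
void reportLeaks()
{
    for (std::size_t i = 0; i < g_count; ++i)
        std::printf("possible leak: %p (%zu bytes)\n", g_table[i].addr, g_table[i].size);
}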
However, you're writing embedded software, where memory is usually a limited resource. On such systems it is generally preferable to avoid dynamic memory allocation, for several reasons:
You know how much memory there is so you know in advance how many objects you can allocate. Allocation should never return null as that is usually terminal with no easy way of getting back to a healthy system.
Allocating and freeing memory leads to fragmentation. The number of objects you can allocate will decrease over time. You could write a memory compactor to move allocated objects around to free up bigger chunks of memory but that will affect performance. As in point 1, once you get a null, things get tricky.
So, when doing embedded work, you usually know up front how much memory can be allocated to various objects and, knowing this, you can write more efficient memory managers for each object type that can take appropriate action when memory runs out - discarding old items, crashing, etc.
EDIT
If you want to know what called the memory allocation, the best thing to do is use a macro (I know, macros are generally bad):
#define NEW new (__FILE__, __LINE__, __FUNCTION__)
and define an operator new:
void *operator new (size_t size, char *file, int line, char *function)
{
// log the allocation somewhere, no need to strcpy file or function, just save the
// pointer values
return malloc (size);
}
and use it like this:
SomeObject *obj = NEW SomeObject (parameters);
Your compiler might not have the __FUNCTION__ preprocessor definition, so you can safely omit it if needed.
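One caveat worth adding (my note, not the answer's): if a constructor throws during NEW SomeObject(...), the compiler frees the memory through a placement delete taking the same extra parameters, so it is worth providing one alongside the logging operator new:
void operator delete (void *ptr, const char *file, int line, const char *function)
{
    // remove or ignore the corresponding log entry here if desired
    free (ptr);
}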
http://www.linuxjournal.com/article/6059
Actually, from my experience it is always better to create memory pools for embedded systems and use a custom allocator/deallocator; that makes leaks easy to identify. For example, we had a simple custom memory manager for VxWorks where we stored the task ID and a timestamp in each allocated memory block.
One way is to insert pointers to the file name and line number strings of the module allocating memory into the allocated block of data. The file name and line number are provided by the standard __FILE__ and __LINE__ macros. When the memory is de-allocated, that information is removed.
One of our systems has this feature and we call it a "memory hog report". So anytime from our CLI we can print out all the allocated memory along with a big list of information of who has allocated memory. This list is sorted by which code module has the most memory allocated. Many times we'll monitor memory usage this way over time, and eventually the memory hog (leak) will bubble up to the top of the list.
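A rough sketch of that idea (my own, not the poster's system; names like trackedAlloc and memoryHogReport are hypothetical): every allocation is prefixed with a header holding the __FILE__/__LINE__ pointers and linked into a list, so a report can walk the list and show who owns what. A real version would also pad the header to the maximum alignment.
#include <cstddef>
#include <cstdio>
#include <cstdlib>

struct AllocHeader {
    const char*  file;
    int          line;
    std::size_t  size;
    AllocHeader* next;
    AllocHeader* prev;
};
static AllocHeader* g_allocList = nullptr;

void* trackedAlloc(std::size_t size, const char* file, int line)
{
    AllocHeader* h = static_cast<AllocHeader*>(std::malloc(sizeof(AllocHeader) + size));
    if (!h) return nullptr;
    *h = { file, line, size, g_allocList, nullptr };
    if (g_allocList) g_allocList->prev = h;
    g_allocList = h;
    return h + 1;                       // user data starts right after the header
}

void trackedFree(void* p)
{
    if (!p) return;
    AllocHeader* h = static_cast<AllocHeader*>(p) - 1;
    if (h->prev) h->prev->next = h->next; else g_allocList = h->next;   // unlink
    if (h->next) h->next->prev = h->prev;
    std::free(h);
}

// "Memory hog report": everything still linked at this point is live (or leaked).
void memoryHogReport()
{
    for (AllocHeader* h = g_allocList; h; h = h->next)
        std::printf("%zu bytes from %s:%d\n", h->size, h->file, h->line);
}

#define TRACKED_ALLOC(sz) trackedAlloc((sz), __FILE__, __LINE__)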
overloading new and delete should work if you pay close attention.
Maybe you can show us what isn't working about that approach?
I'm not an embedded environment expert, so the only advice I can give is to test as much code as you can on your development machine using your favorite free or proprietary tools. Tools for a particular embedded platform may also exist, and you can use them for final testing, but the most powerful tools are for desktops.
On a desktop environment I like the job DevPartner Studio does. This is for Windows and proprietary. There are free tools available for Linux, but I don't have much experience with them. As an example, there's EFence.
If you override the constructor and destructor of your classes, you can print to the screen or a log file. Doing this will give you an idea of when things are being created and what is being created, as well as the same information for deletion.
For easy browsing, you can add a temporary global variable, "INSTANCE_ID", and print (and increment) it on every constructor/destructor call. Then you can browse by ID, and it should make things a little easier.
The way we did it with our C 3D toolkit was to create custom new/malloc and delete macros that logged each allocation and deallocation to a file. We had to ensure that all the code called our macros, of course. Writing to the log file was controlled by a run-time flag and only happened under debug, so we didn't have to recompile.
Once the run was complete, a post-processor ran over the file, matching allocations to deallocations and reporting any unmatched allocations.
It had a performance hit, but we only needed to do it once in a while.
Is there a real need to roll your own memory leak detection?
Assuming you can't use dynamic memory checkers, like the open-source valgrind tool on Linux, static analysis tools like the commercial products Coverity Prevent and Klocwork Insight may be of use. I've used all three, and have had very good results with all of them.
Lots of good answers.
I would just point out that if the program is one that, like a small command-line utility, runs for a short period of time and then releases all its memory back to the OS, memory leaks probably do no harm.
You can use a third party tool to do this.
You can detect leaks within your own class structures by adding memory counters in your new and delete calls to increment/decrement a counter, and printing out a report when your application closes. However, this won't detect leaks for memory allocated outside your class system - a third-party tool can do that, though.
Can you describe what is "not working" with your log methods?
Do you not get the expected logs? Or are they showing everything is fine, but you still have leaks?
How have you confirmed that this is definitely a leak and not some other type of corruption?
One way to check that your overloading is correct: instantiate a counter per class, increment it in the class's new and decrement it in its delete. If the count keeps growing, you have a leak. You would then expect your log lines to coincide with the increment and decrement points.
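A minimal sketch of that per-class counter idea (my own illustration; the class name Widget is hypothetical): the count goes up in the class-scope operator new and down in operator delete, so a nonzero value at shutdown means instances of this class leaked.
#include <cstdio>
#include <cstdlib>
#include <new>

class Widget {
public:
    static void* operator new(std::size_t size)
    {
        void* p = std::malloc(size);
        if (!p) throw std::bad_alloc();
        ++liveCount;                        // increment only on successful allocation
        return p;
    }
    static void operator delete(void* p) noexcept
    {
        if (p) { --liveCount; std::free(p); }   // decrement on deallocation
    }
    static int liveCount;
};
int Widget::liveCount = 0;

int main()
{
    Widget* a = new Widget;
    Widget* b = new Widget;
    delete a;                               // "forget" to delete b
    std::printf("Widget instances still live: %d\n", Widget::liveCount);   // prints 1
}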
Not specifically for embedded development, but we used to use BoundsChecker for that.
Use smart pointers and never think about it again; there are plenty of official types around, but they are also pretty easy to roll your own:
class SmartCoMem
{
public:
    SmartCoMem() : m_size( 0 ), m_ptr64( 0 ) {
    }
    ~SmartCoMem() {
        if( m_size )
            CoTaskMemFree((LPVOID)m_ptr64);
    }
    void copyin( LPCTSTR in, const unsigned short size )
    {
        LPVOID ptr;
        ptr = CoTaskMemRealloc( (LPVOID)m_ptr64, size );
        if( ptr == NULL )
            throw std::exception( "SmartCoMem: CoTaskMemAlloc Failed" );
        else
        {
            m_size = size;
            m_ptr64 = (__int64)ptr;
            memcpy( (LPVOID)m_ptr64, in, size );
        }
    }
    std::string copyout( ) {
        std::string out( (LPCSTR)m_ptr64, m_size );
        return out;
    }
    __int64* ptr() {
        return &m_ptr64;
    }
    unsigned short size() {
        return m_size;
    }
    unsigned short* sizePtr() {
        return &m_size;
    }
    bool loaded() {
        return m_size > 0;
    }
private:
    //don't allow copying as this is a wrapper around raw memory
    SmartCoMem (const SmartCoMem &);
    SmartCoMem & operator = (const SmartCoMem &);

    __int64 m_ptr64;
    unsigned short m_size;
};
There's no encapsulation in this example due to the API I was working with but still better than working with completely raw pointers.
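For comparison, a hedged sketch of the standard-library route mentioned above: std::unique_ptr with a custom deleter (CoTaskMemDeleter is my own name) releases CoTaskMem allocations automatically, even when exceptions are thrown.
#include <memory>
#include <objbase.h>   // CoTaskMemAlloc / CoTaskMemFree

struct CoTaskMemDeleter {
    void operator()(void* p) const { if (p) CoTaskMemFree(p); }
};
using CoTaskMemPtr = std::unique_ptr<void, CoTaskMemDeleter>;

void example()
{
    CoTaskMemPtr mem(CoTaskMemAlloc(64));   // freed automatically when mem goes out of scope
}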
For testing like this, try compiling your embedded code natively for Linux (or whatever OS you use), and use a well-established tool like Valgrind to test for memory leaks. It can be a challenge to do this, but you just need to replace any code that directly accesses hardware with code that simulates something suitable for your testing.
I've found that using SWIG to convert my embedded code into a Linux-native library and running the code from a Python script is really effective. You can use all the tools that are available for non-embedded projects and test all of your code except the hardware drivers.