C++ syntax I don't understand

C++ syntax I don't understand - c++

I've found a C++ code that has this syntax:
void MyClass::method()
{
beginResetModel();
{
// Various stuff
}
endResetModel();
}
I've no idea why there are { } after a line ending with ; but it seems there is no problem to make it compile and run. Is it possible this as something to do with the fact that the code may be asynchronous (I'm not sure yet)? Or maybe the { } are only here to delimit a part of the code and don't really make a difference but honestly I doubt this. I don't know, does someone has any clue what this syntax mean ?
More info: There is no other reference to beginResetModel, resetModel or ResetModel in the whole project (searched with grep). Btw the project is a Qt one. Maybe it's another Qt-related macro I haven't heard of.

Using {} will create a new scope. In your case, any variable created in those braces will cease to exist at the } in the end.

beginResetModel();
{
// Various stuff
}
endResetModel()
The open and close braces in your code are a very important feature in C++, as they delimit a new scope. You can appreciate the power of this in combination with another powerful language feature: destructors.
So, suppose that inside those braces you have code that creates various objects, like graphics models, or whatever.
Assuming that these objects are instances of classes that allocate resources (e.g. textures on the video card), and those classes have destructors that release the allocated resources, you are guaranteed that, at the }, these destructors are automatically invoked.
In this way, all the allocated resources are automatically released, before the code outside the closing curly brace, e.g. before the call to endResetModel() in your sample.
This automatic and deterministic resource management is a key powerful feature of C++.
Now, suppose that you remove the curly braces, and your method looks like this:
void MyClass::method()
{
beginResetModel();
// {
// Various stuff
// }
endResetModel();
}
Now, all the objects created in the Various stuff section of code will be destroyed before the } that terminates the MyClass::method(), but after the call to endResetModel().
So, in this case, you end up with the endResetModel() call, followed by other release code that runs after it. This may cause bugs.
On the other hand, the curly braces that define a new scope enclosed in begin/endResetModel() do guarantee that all the objects created inside this scope are destroyed before endResetModel() is invoked.

{} delimits a scope. That means that any variable declared inside there is not accessible outside of it and is erased from memory once the } is reached. Here is an example:
#include <iostream>
using namespace std;
class MyClass{
public:
~MyClass(){
cout << "Destructor called" << endl;
}
};
int main(){
{
int x = 3;
MyClass foo;
cout << x << endl; //Prints 3
} //Here "Destructor called" is printed since foo is cleared from the memory
cout << x << endl; //Compiler error, x isn't defined here
return 0;
}
Usually scopes are used for functions, loops, if-statements, etc, but you're perfectly allowed to use scopes without any statement before them. This can be particularly useful to declare variables inside a switch (this answer explains why).

As others have pointed out, the curly braces create a new scope, but maybe the interesting thing is why would you want to do that - that is, what is the difference between using it and not using it. There are cases where scopes are obviously necessary, such as with if or for blocks; if you don't create a scope after them you can only have one statement. Another possible reason is that maybe you use one variable in one part of the function and do not one it to be used outside of that part, so you put it into its own scope. However, the main use of scopes out of control statements has to do with RAII. When you declare an instance variable (not a pointer or reference), it is always initialized; when it goes out of scope, it is always destroyed. This can be used to define blocks that require some setup at the beginning and some tear down at the end (if you are familiar with Python, similar to with blocks).
Take this example:
#include <mutex>
void fun(std::mutex & mutex) {
// 1. Perform some computations...
{
std::lock_guard<std::mutex> lock(mutex);
// 2. Operations in this scope are performed with the mutex locked
}
// 3. More computations...
}
In this example, part 2 is only run after the mutex has been acquired, and is released before part 3 starts. If you remove the additional scope:
#include <mutex>
void fun(std::mutex & mutex) {
// 1. Perform some computations...
std::lock_guard<std::mutex> lock(mutex);
// 2. Operations in this scope are performed with the mutex locked
// 3. More computations...
}
In this case the mutex is acquired before starting part 2, but it is held until part 3 is complete (possibly producing more interlocking between threads than necessary). Note, however, that in both cases there was no need to specify when the lock is released; std::lock_guard is responsible for both acquiring the lock on construction and releasing it on destruction (i.e. when it goes out of scope).

Related

Multi Threading in c++

I have a class called MatrixAlt and i'm trying to multi thread a function to do some work on that matrix.
My general method worked when I just implemented it in a couple of functions. But when I try to bring it into the class methods, I get an error.
The problematic line (or where it highlights anyway) is 4 lines from the end and the error message is in the comments just above it.
#include <vector>
#include <future>
#include <thread>
class MatrixAlt
{
public:
MatrixAlt();
// initilaise the matrix to constant value for each entry
void function01(size_t maxThreads);
void function02(size_t threadIndex);
};
MatrixAlt::MatrixAlt()
{
}
void MatrixAlt::function02(size_t threadIndex)
{
// do some stuff
return;
}
void MatrixAlt::function01(size_t maxThreads)
{
// To control async threads and their results
std::vector<std::future<bool>> threadsIssued;
// now loop through all the threads and orchestrate the work to be done
for (size_t threadIndex = 0; threadIndex < maxThreads; ++threadIndex)
{
// line 42 gives error:
// 'MatrixAlt::function02': non-standard syntax; use '&' to create a pointer to member
// 'std::async': no matching overloaded function found
threadsIssued.push_back(std::async(function02, threadIndex));
}
return;
}

Your first problem is solved like this
threadsIssued.push_back(std::async(&MatrixAlt::function02, this, threadIndex));
You need to specify the exact class::function and take its address and which instance of the class your doing it for, and then the parameters.
The second problem which you haven't see yet is this line
std::vector<std::future<bool>> threadsIssued;
All those futures will be lost in scope exit, like tears in rain. Time to destroy.
Freely after Blade runner.
All those moments will be lost in time, like tears in rain. Time to
die.

Whenever you have a member function in C++, that function takes the object itself as an implicit first argument. So you need to pass the object as well, but even then, it can't be called with the same syntax as a normal function that takes the object.
The simplest way to setup an asynchronous job in C++ is typically just to use lambdas. They've very clear and explicit. So, for example, you could change your call to:
threadsIssued.push_back(std::async([this] (size_t t) { this->function02(t);}, threadIndex));
This lambda is explicitly capturing the this pointer, which tells us that all of the function02 calls will be called on the same object that the calling function01 is called on.
In addition to being correct, and explicit, this also helps highlight an important point: all of the function02 objects will be running with mutable access to the same MatrixAlt object. This is very dangerous, so you need to make sure that function02 is thread safe, one way or another (usually easy if its conceptually const, otherwise perhaps need a mutex, or something else).

Is the main thread allowed to spawn a POSIX thread before it enters main()?

I have this object that contains a thread. I want the fate of the object and the fate of the thread to be one in the same. So the constructor creates a thread (with pthread_create) and the destructor performs actions to cause the thread to return in a reasonable amount of time and then joins the thread. This is working fine as long as I don't instantiate one of these objects with static storage duration. If I instantiate one of these objects at global or namespace or static class scope the program compiles fine (gcc 4.8.1) but immediately segfaults upon running. With print statements I have determined that the main thread doesn't even enter main() before the segfault. Any ideas?
Update: Also added a print statement to the first line of the constructor (so before pthread_create is called), and not even that gets printed before the segfault BUT the constructor does use an initialization list so it is possible something there is causing it?
Here is the constructor:
worker::worker(size_t buffer_size):
m_head(nullptr),m_tail(nullptr),
m_buffer_A(operator new(buffer_size)),
m_buffer_B(operator new(buffer_size)),
m_next(m_buffer_A),
m_buffer_size(buffer_size),
m_pause_gate(true),
m_worker_thread([this]()->void{ thread_func(); }),
m_running(true)
{
print("this wont get printed b4 segfault");
scoped_lock lock(worker_lock);
m_worker_thread.start();
all_workers.push_back(this);
}
And destructor:
worker::~worker()
{
{
scoped_lock lock(worker_lock);
auto w=all_workers.begin();
while(w!=all_workers.end())
{
if(*w==this)
{
break;
}
++w;
}
all_workers.erase(w);
}
{
scoped_lock lock(m_lock);
m_running=false;
}
m_sem.release();
m_pause_gate.open();
m_worker_thread.join();
operator delete(m_buffer_A);
operator delete(m_buffer_B);
}
Update 2:
Okay I figured it out. My print function is atomic and likewise protects cout with an extern namespace scope mutex defined elsewhere. I changed to just plain cout and it printed at the beginning of the ctor. Apparently none of these static storage duration mutexes are getting initialized before things are trying to access them. So yeah it is probably Casey's answer.
I'm just not going to bother with complex objects and static storage duration. It's no big deal anyway.

Initialization of non-local variables is described in C++11 §3.6.2, there's a ton of scary stuff in paragraph 2 that has to do with threads:
If a program starts a thread (30.3), the subsequent initialization of a variable is unsequenced with respect to the initialization of a variable defined in a different translation unit. Otherwise, the initialization of a variable is indeterminately sequenced with respect to the initialization of a variable defined in a different translation unit. If a program starts a thread, the subsequent unordered initialization of a variable is unsequenced with respect to every other dynamic initialization.
I interpret "the subsequent unordered initialization of a variable is unsequenced with respect to every other dynamic initialization" to mean that the spawned thread cannot access any variable with dynamic initialization that was not initialized before the thread was spawned without causing a data race. If that thread doesn't somehow synchronize with main, you're basically dancing through a minefield with your hands over your eyes.
I'd strongly suggest you read through and understand all of 3.6; even without threads it's a huge PITA to do much before main starts.

What happens before entering main will be platform specific, but here is a link on how main() executes on Linux
http://linuxgazette.net/84/hawk.html
The useful snipet is
__libc_start_main initializes necessary stuffs, especially C library(such as malloc) and thread environment and calls our main.
For more information look up __libc_start_main
Not sure how this behaves on Windows, but it seems like any standard C library call before entering main is not a good idea

There may be many ways to do that. See the snippet below where the constructor of class A called before main because we have declared an object of class A at global scope: (I have expanded the example to demonstrate how a thread can be created before main executes)
#include <iostream>
#include <stdlib.h>
#include <pthread.h>
using namespace std;
void *fun(void *x)
{
while (true) {
cout << "Thread\n";
sleep(2);
}
}
pthread_t t_id;
class A
{
public:
A()
{
cout << "Hello before main \n " ;
pthread_create(&t_id, 0, fun, 0);
sleep(6);
}
};
A a;
int main()
{
cout << "I am main\n";
sleep(40);
return 0;
}

I found this question after I posted my own question about threads. Reviewing my question might be helpful to others. I found that when I allocated an object creating a thread in the constructor at global scope I got strange behavior, but if I moved the objection creation just inside main() things worked as I expected. That seems to be consistent with comments on this question.

Extra brace brackets in C++ code

Sometimes you run into code that has extra brace brackets, that have nothing to do with scope, only are for readability and avoiding mistakes.
For example:
GetMutexLock( handle ) ;
{
// brace brackets "scope" the lock,
// must close block / remember
// to release the handle.
// similar to C#'s lock construct
}
ReleaseMutexLock( handle ) ;
Other places I have seen it are:
glBegin( GL_TRIANGLES ) ;
{
glVertex3d( .. ) ;
glVertex3d( .. ) ;
glVertex3d( .. ) ;
} // must remember to glEnd!
glEnd() ;
This introduces a compiler error if the mutex isn't freed (assuming you remember both the } and the Release() call).
Is this a bad practice? Why?
If it isn't one, could it change how the code is compiled or make it slower?

The braces themselves are fine, all they do is limit scope and you won't slow anything down. It can be seen as cleaner. (Always prefer clean code over fast code, if it's cleaner, don't worry about the speed until you profile.)
But with respect to resources it's bad practice because you've put yourself in a position to leak a resource. If anything in the block throws or returns, bang you're dead.
Use Scope-bound Resource Management (SBRM, also known as RAII), which limits a resource to a scope, by using the destructor:
class mutex_lock
{
public:
mutex_lock(HANDLE pHandle) :
mHandle(pHandle)
{
//acquire resource
GetMutexLock(mHandle);
}
~mutex_lock()
{
// release resource, bound to scope
ReleaseMutexLock(mHandle);
}
private:
// resource
HANDLE mHandle;
// noncopyable
mutex_lock(const mutex_lock&);
mutex_lock& operator=(const mutex_lock&);
};
So you get:
{
mutex_lock m(handle);
// brace brackets "scope" the lock,
// AUTOMATICALLY
}
Do this will all resources, it's cleaner and safer. If you are in a position to say "I need to release this resource", you've done it wrong; they should be handled automatically.

Braces affect variable scope. As far as I know that is all they do.
Yes, this can affect how the program is compiled. Destructors will be called at the end of the block instead of waiting until the end of the function.
Often this is what you want to do. For example, your GetMutexLock and ReleaseMutexLock would be much better C++ code written like this:
struct MutexLocker {
Handle handle;
MutexLocker(handle) : handle(handle) { GetMutexLock(handle); }
~MutexLocker() { ReleaseMutexLock(handle); }
};
...
{
MutexLocker lock(handle);
// brace brackets "scope" the lock,
// must close block / remember
// to release the handle.
// similar to C#'s lock construct
}
Using this more C++ style, the lock is released automatically at the end of the block. It will be released in all circumstances, including exceptions, with the exceptions of setjmp/longjmp or a program crash or abort.

It's not bad practice. It does not make anything slower; it's just a way of structuring the code.
Getting the compiler to do error-checking & enforcing for you is always a good thing!

The specific placement of { ... } in your original example serves purely as formatting sugar by making it more obvious where a group of logically related statements begins and where it ends. As shown in your examples, it has not effect on the compiled code.
I don't know what you mean by "this introduces a compiler error if the mutex isn't freed". That's simply not true. Such use of { ... } cannot and will not introduce any compiler errors.
Whether it is a good practice is a matter of personal preference. It looks OK. Alternatively, you can use comments and/or indentation to indicate logical grouping of statements in the code, without any extra { ... }.
There are various scoping-based techniques out there, some of which have been illustrated by the other answers here, but what you have in your OP doesn't even remotely look anything like that. Once again, what you have in your OP (as shown) is purely a source formatting habit with superfluous { ... } that have no effect on the generated code.

It will make no difference to the compiled code, apart from calling any destructors at the end of that block rather than the end of the surrounding block, unless the compiler is completely insane.
Personally, I would call it bad practice; the way to avoid the kind of mistakes you might make here is to use scoped resource management (sometimes called RAII), not to use error-prone typographical reminders. I would write the code as something like
{
mutex::scoped_lock lock(mutex);
// brace brackets *really* scope the lock
} // scoped_lock destructor releases the lock
{
gl_group gl(GL_TRIANGLES); // calls glBegin()
gl.Vertex3d( .. );
gl.Vertex3d( .. );
gl.Vertex3d( .. );
} // gl_group destructor calls glEnd()

If you're putting code into braces, you should probably break it out into its own method. If it's a single discrete unit, why not label it and break it out functionally? That will make it explicit what the block does, and people who later read the code won't have to figure out.

Anything that improves readablity IMHO is good practice. If adding braces help with the readability, then go for it!
Adding additional braces will not change how the code is compiled. It won't make the running of the program any slower.

This is much more useful (IMHO) in C++ with object destructors; your examples are in C.
Imagine if you made a MutexLock class:
class MutexLock {
private:
HANDLE handle;
public:
MutexLock() : handle(0) {
GetMutexLock(handle);
}
~MutexLock() {
ReleaseMutexLock(handle);
}
}
Then you could scope that lock to just the code that needed it by providing an new scope with the braces:
{
MutexLock mtx; // Allocated on the stack in this new scope
// Use shared resource
}
// When this scope exits the destructor on mtx is called and the stack is popped

Thread-Safe implementation of an object that deletes itself

I have an object that is called from two different threads and after it was called by both it destroys itself by "delete this".
How do I implement this thread-safe? Thread-safe means that the object never destroys itself exactly one time (it must destroys itself after the second callback).
I created some example code:
class IThreadCallBack
{
virtual void CallBack(int) = 0;
};
class M: public IThreadCallBack
{
private:
bool t1_finished, t2_finished;
public:
M(): t1_finished(false), t2_finished(false)
{
startMyThread(this, 1);
startMyThread(this, 2);
}
void CallBack(int id)
{
if (id == 1)
{
t1_finished = true;
}
else
{
t2_finished = true;
}
if (t1_finished && t2_finished)
{
delete this;
}
}
};
int main(int argc, char **argv) {
M* MObj = new M();
while(true);
}
Obviously I can't use a Mutex as member of the object and lock the delete, because this would also delete the Mutex. On the other hand, if I set a "toBeDeleted"-flag inside a mutex-protected area, where the finised-flag is set, I feel unsure if there are situations possible where the object isnt deleted at all.
Note that the thread-implementation makes sure that the callback method is called exactly one time per thread in any case.
Edit / Update:
What if I change Callback(..) to:
void CallBack(int id)
{
mMutex.Obtain()
if (id == 1)
{
t1_finished = true;
}
else
{
t2_finished = true;
}
bool both_finished = (t1_finished && t2_finished);
mMutex.Release();
if (both_finished)
{
delete this;
}
}
Can this considered to be safe? (with mMutex being a member of the m class?)
I think it is, if I don't access any member after releasing the mutex?!

Use Boost's Smart Pointer. It handles this automatically; your object won't have to delete itself, and it is thread safe.
Edit:
From the code you've posted above, I can't really say, need more info. But you could do it like this: each thread has a shared_ptr object and when the callback is called, you call shared_ptr::reset(). The last reset will delete M. Each shared_ptr could be stored with thread local storeage in each thread. So in essence, each thread is responsible for its own shared_ptr.

Instead of using two separate flags, you could consider setting a counter to the number of threads that you're waiting on and then using interlocked decrement.
Then you can be 100% sure that when the thread counter reaches 0, you're done and should clean up.
For more info on interlocked decrement on Windows, on Linux, and on Mac.

I once implemented something like this that avoided the ickiness and confusion of delete this entirely, by operating in the following way:
Start a thread that is responsible for deleting these sorts of shared objects, which waits on a condition
When the shared object is no longer being used, instead of deleting itself, have it insert itself into a thread-safe queue and signal the condition that the deleter thread is waiting on
When the deleter thread wakes up, it deletes everything in the queue
If your program has an event loop, you can avoid the creation of a separate thread for this by creating an event type that means "delete unused shared objects" and have some persistent object respond to this event in the same way that the deleter thread would in the above example.

I can't imagine that this is possible, especially within the class itself. The problem is two fold:
1) There's no way to notify the outside world not to call the object so the outside world has to be responsible for setting the pointer to 0 after calling "CallBack" iff the pointer was deleted.
2) Once two threads enter this function you are, and forgive my french, absolutely fucked. Calling a function on a deleted object is UB, just imagine what deleting an object while someone is in it results in.
I've never seen "delete this" as anything but an abomination. Doesn't mean it isn't sometimes, on VERY rare conditions, necessary. Problem is that people do it way too much and don't think about the consequences of such a design.
I don't think "to be deleted" is going to work well. It might work for two threads, but what about three? You can't protect the part of code that calls delete because you're deleting the protection (as you state) and because of the UB you'll inevitably cause. So the first goes through, sets the flag and aborts....which of the rest is going to call delete on the way out?

The more robust implementation would be to implement reference counting. For each thread you start, increase a counter; for each callback call decrease the counter and if the counter has reached zero, delete the object. You can lock the counter access, or you could use the Interlocked class to protect the counter access, though in that case you need to be careful with potential race between the first thread finishing and the second starting.
Update: And of course, I completely ignored the fact that this is C++. :-) You should use InterlockExchange to update the counter instead of the C# Interlocked class.

How can I schedule some code to run after all '_atexit()' functions are completed

I'm writing a memory tracking system and the only problem I've actually run into is that when the application exits, any static/global classes that didn't allocate in their constructor, but are deallocating in their deconstructor are deallocating after my memory tracking stuff has reported the allocated data as a leak.
As far as I can tell, the only way for me to properly solve this would be to either force the placement of the memory tracker's _atexit callback at the head of the stack (so that it is called last) or have it execute after the entire _atexit stack has been unwound. Is it actually possible to implement either of these solutions, or is there another solution that I have overlooked.
Edit:
I'm working on/developing for Windows XP and compiling with VS2005.

I've finally figured out how to do this under Windows/Visual Studio. Looking through the crt startup function again (specifically where it calls the initializers for globals), I noticed that it was simply running "function pointers" that were contained between certain segments. So with just a little bit of knowledge on how the linker works, I came up with this:
#include <iostream>
using std::cout;
using std::endl;
// Typedef for the function pointer
typedef void (*_PVFV)(void);
// Our various functions/classes that are going to log the application startup/exit
struct TestClass
{
int m_instanceID;
TestClass(int instanceID) : m_instanceID(instanceID) { cout << " Creating TestClass: " << m_instanceID << endl; }
~TestClass() {cout << " Destroying TestClass: " << m_instanceID << endl; }
};
static int InitInt(const char *ptr) { cout << " Initializing Variable: " << ptr << endl; return 42; }
static void LastOnExitFunc() { puts("Called " __FUNCTION__ "();"); }
static void CInit() { puts("Called " __FUNCTION__ "();"); atexit(&LastOnExitFunc); }
static void CppInit() { puts("Called " __FUNCTION__ "();"); }
// our variables to be intialized
extern "C" { static int testCVar1 = InitInt("testCVar1"); }
static TestClass testClassInstance1(1);
static int testCppVar1 = InitInt("testCppVar1");
// Define where our segment names
#define SEGMENT_C_INIT ".CRT$XIM"
#define SEGMENT_CPP_INIT ".CRT$XCM"
// Build our various function tables and insert them into the correct segments.
#pragma data_seg(SEGMENT_C_INIT)
#pragma data_seg(SEGMENT_CPP_INIT)
#pragma data_seg() // Switch back to the default segment
// Call create our call function pointer arrays and place them in the segments created above
#define SEG_ALLOCATE(SEGMENT) __declspec(allocate(SEGMENT))
SEG_ALLOCATE(SEGMENT_C_INIT) _PVFV c_init_funcs[] = { &CInit };
SEG_ALLOCATE(SEGMENT_CPP_INIT) _PVFV cpp_init_funcs[] = { &CppInit };
// Some more variables just to show that declaration order isn't affecting anything
extern "C" { static int testCVar2 = InitInt("testCVar2"); }
static TestClass testClassInstance2(2);
static int testCppVar2 = InitInt("testCppVar2");
// Main function which prints itself just so we can see where the app actually enters
void main()
{
cout << " Entered Main()!" << endl;
}
which outputs:
Called CInit();
Called CppInit();
Initializing Variable: testCVar1
Creating TestClass: 1
Initializing Variable: testCppVar1
Initializing Variable: testCVar2
Creating TestClass: 2
Initializing Variable: testCppVar2
Entered Main()!
Destroying TestClass: 2
Destroying TestClass: 1
Called LastOnExitFunc();
This works due to the way MS have written their runtime library. Basically, they've setup the following variables in the data segments:
(although this info is copyright I believe this is fair use as it doesn't devalue the original and IS only here for reference)
extern _CRTALLOC(".CRT$XIA") _PIFV __xi_a[];
extern _CRTALLOC(".CRT$XIZ") _PIFV __xi_z[]; /* C initializers */
extern _CRTALLOC(".CRT$XCA") _PVFV __xc_a[];
extern _CRTALLOC(".CRT$XCZ") _PVFV __xc_z[]; /* C++ initializers */
extern _CRTALLOC(".CRT$XPA") _PVFV __xp_a[];
extern _CRTALLOC(".CRT$XPZ") _PVFV __xp_z[]; /* C pre-terminators */
extern _CRTALLOC(".CRT$XTA") _PVFV __xt_a[];
extern _CRTALLOC(".CRT$XTZ") _PVFV __xt_z[]; /* C terminators */
On initialization, the program simply iterates from '__xN_a' to '__xN_z' (where N is {i,c,p,t}) and calls any non null pointers it finds. If we just insert our own segment in between the segments '.CRT$XnA' and '.CRT$XnZ' (where, once again n is {I,C,P,T}), it will be called along with everything else that normally gets called.
The linker simply joins up the segments in alphabetical order. This makes it extremely simple to select when our functions should be called. If you have a look in defsects.inc (found under $(VS_DIR)\VC\crt\src\) you can see that MS have placed all the "user" initialization functions (that is, the ones that initialize globals in your code) in segments ending with 'U'. This means that we just need to place our initializers in a segment earlier than 'U' and they will be called before any other initializers.
You must be really careful not to use any functionality that isn't initialized until after your selected placement of the function pointers (frankly, I'd recommend you just use .CRT$XCT that way its only your code that hasn't been initialized. I'm not sure what will happen if you've linked with standard 'C' code, you may have to place it in the .CRT$XIT block in that case).
One thing I did discover was that the "pre-terminators" and "terminators" aren't actually stored in the executable if you link against the DLL versions of the runtime library. Due to this, you can't really use them as a general solution. Instead, the way I made it run my specific function as the last "user" function was to simply call atexit() within the 'C initializers', this way, no other function could have been added to the stack (which will be called in the reverse order to which functions are added and is how global/static deconstructors are all called).
Just one final (obvious) note, this is written with Microsoft's runtime library in mind. It may work similar on other platforms/compilers (hopefully you'll be able to get away with just changing the segment names to whatever they use, IF they use the same scheme) but don't count on it.

atexit is processed by the C/C++ runtime (CRT). It runs after main() has already returned. Probably the best way to do this is to replace the standard CRT with your own.
On Windows tlibc is probably a great place to start: http://www.codeproject.com/KB/library/tlibc.aspx
Look at the code sample for mainCRTStartup and just run your code after the call to _doexit();
but before ExitProcess.
Alternatively, you could just get notified when ExitProcess gets called. When ExitProcess gets called the following occurs (according to http://msdn.microsoft.com/en-us/library/ms682658%28VS.85%29.aspx):
All of the threads in the process, except the calling thread, terminate their execution without receiving a DLL_THREAD_DETACH notification.
The states of all of the threads terminated in step 1 become signaled.
The entry-point functions of all loaded dynamic-link libraries (DLLs) are called with DLL_PROCESS_DETACH.
After all attached DLLs have executed any process termination code, the ExitProcess function terminates the current process, including the calling thread.
The state of the calling thread becomes signaled.
All of the object handles opened by the process are closed.
The termination status of the process changes from STILL_ACTIVE to the exit value of the process.
The state of the process object becomes signaled, satisfying any threads that had been waiting for the process to terminate.
So, one method would be to create a DLL and have that DLL attach to the process. It will get notified when the process exits, which should be after atexit has been processed.
Obviously, this is all rather hackish, proceed carefully.

This is dependent on the development platform. For example, Borland C++ has a #pragma which could be used for exactly this. (From Borland C++ 5.0, c. 1995)
#pragma startup function-name [priority]
#pragma exit function-name [priority]
These two pragmas allow the program to specify function(s) that should be called either upon program startup (before the main function is called), or program exit (just before the program terminates through _exit).
The specified function-name must be a previously declared function as:
void function-name(void);
The optional priority should be in the range 64 to 255, with highest priority at 0; default is 100. Functions with higher priorities are called first at startup and last at exit. Priorities from 0 to 63 are used by the C libraries, and should not be used by the user.
Perhaps your C compiler has a similar facility?

I've read multiple times you can't guarantee the construction order of global variables (cite). I'd think it is pretty safe to infer from this that destructor execution order is also not guaranteed.
Therefore if your memory tracking object is global, you will almost certainly be unable any guarantees that your memory tracker object will get destructed last (or constructed first). If it's not destructed last, and other allocations are outstanding, then yes it will notice the leaks you mention.
Also, what platform is this _atexit function defined for?

Having the memory tracker's cleanup executed last is the best solution. The easiest way I've found to do that is to explicitly control all the relevant global variables' initialization order. (Some libraries hide their global state in fancy classes or otherwise, thinking they're following a pattern, but all they do is prevent this kind of flexibility.)
Example main.cpp:
#include "global_init.inc"
int main() {
// do very little work; all initialization, main-specific stuff
// then call your application's mainloop
}
Where the global-initialization file includes object definitions and #includes similar non-header files. Order the objects in this file in the order you want them constructed, and they'll be destructed in the reverse order. 18.3/8 in C++03 guarantees that destruction order mirrors construction: "Non-local objects with static storage duration are destroyed in the reverse order of the completion of their constructor." (That section is talking about exit(), but a return from main is the same, see 3.6.1/5.)
As a bonus, you're guaranteed that all globals (in that file) are initialized before entering main. (Something not guaranteed in the standard, but allowed if implementations choose.)

I've had this exact problem, also writing a memory tracker.
A few things:
Along with destruction, you also need to handle construction. Be prepared for malloc/new to be called BEFORE your memory tracker is constructed (assuming it is written as a class). So you need your class to know whether it has been constructed or destructed yet!
class MemTracker
{
enum State
{
unconstructed = 0, // must be 0 !!!
constructed,
destructed
};
State state;
MemTracker()
{
if (state == unconstructed)
{
// construct...
state = constructed;
}
}
};
static MemTracker memTracker; // all statics are zero-initted by linker
On every allocation that calls into your tracker, construct it!
MemTracker::malloc(...)
{
// force call to constructor, which does nothing after first time
new (this) MemTracker();
...
}
Strange, but true. Anyhow, onto destruction:
~MemTracker()
{
OutputLeaks(file);
state = destructed;
}
So, on destruction, output your results. Yet we know that there will be more calls. What to do? Well,...
MemTracker::free(void * ptr)
{
do_tracking(ptr);
if (state == destructed)
{
// we must getting called late
// so re-output
// Note that this might happen a lot...
OutputLeaks(file); // again!
}
}
And lastly:
be careful with threading
be careful not to call malloc/free/new/delete inside your tracker, or be able to detect the recursion, etc :-)
EDIT:
and I forgot, if you put your tracker in a DLL, you will probably need to LoadLibrary() (or dlopen, etc) yourself to up your reference count, so that you don't get removed from memory prematurely. Because although your class can still be called after destruction, it can't if the code has been unloaded.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js