Is the main thread allowed to spawn a POSIX thread before it enters main()?

I have an object that contains a thread. I want the fate of the object and the fate of the thread to be one and the same. So the constructor creates a thread (with pthread_create), and the destructor performs actions to cause the thread to return in a reasonable amount of time and then joins the thread. This works fine as long as I don't instantiate one of these objects with static storage duration. If I instantiate one of these objects at global, namespace, or static class scope, the program compiles fine (gcc 4.8.1) but segfaults immediately upon running. With print statements I have determined that the main thread doesn't even enter main() before the segfault. Any ideas?
Update: Also added a print statement to the first line of the constructor (so before pthread_create is called), and not even that gets printed before the segfault BUT the constructor does use an initialization list so it is possible something there is causing it?
Here is the constructor:
worker::worker(size_t buffer_size):
m_head(nullptr),m_tail(nullptr),
m_buffer_A(operator new(buffer_size)),
m_buffer_B(operator new(buffer_size)),
m_next(m_buffer_A),
m_buffer_size(buffer_size),
m_pause_gate(true),
m_worker_thread([this]()->void{ thread_func(); }),
m_running(true)
{
print("this wont get printed b4 segfault");
scoped_lock lock(worker_lock);
m_worker_thread.start();
all_workers.push_back(this);
}
And destructor:
worker::~worker()
{
{
scoped_lock lock(worker_lock);
auto w=all_workers.begin();
while(w!=all_workers.end())
{
if(*w==this)
{
break;
}
++w;
}
all_workers.erase(w);
}
{
scoped_lock lock(m_lock);
m_running=false;
}
m_sem.release();
m_pause_gate.open();
m_worker_thread.join();
operator delete(m_buffer_A);
operator delete(m_buffer_B);
}
Update 2:
Okay I figured it out. My print function is atomic and likewise protects cout with an extern namespace scope mutex defined elsewhere. I changed to just plain cout and it printed at the beginning of the ctor. Apparently none of these static storage duration mutexes are getting initialized before things are trying to access them. So yeah it is probably Casey's answer.
I'm just not going to bother with complex objects and static storage duration. It's no big deal anyway.

Initialization of non-local variables is described in C++11 §3.6.2. There's a ton of scary stuff in paragraph 2 that has to do with threads:
If a program starts a thread (30.3), the subsequent initialization of a variable is unsequenced with respect to the initialization of a variable defined in a different translation unit. Otherwise, the initialization of a variable is indeterminately sequenced with respect to the initialization of a variable defined in a different translation unit. If a program starts a thread, the subsequent unordered initialization of a variable is unsequenced with respect to every other dynamic initialization.
I interpret "the subsequent unordered initialization of a variable is unsequenced with respect to every other dynamic initialization" to mean that the spawned thread cannot access any variable with dynamic initialization that was not initialized before the thread was spawned without causing a data race. If that thread doesn't somehow synchronize with main, you're basically dancing through a minefield with your hands over your eyes.
I'd strongly suggest you read through and understand all of 3.6; even without threads it's a huge PITA to do much before main starts.
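The failure described in Update 2 (a constructor with static storage duration using a mutex from another translation unit before that mutex was initialized) is commonly avoided with a function-local static, which is initialized on first use rather than in an unspecified cross-translation-unit order. The following is only a minimal sketch: log_mutex, log_lines, safe_log and early_user are hypothetical names, not the asker's code, and std::mutex stands in for whatever mutex type the asker actually uses.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Hypothetical accessor: the mutex is created the first time this function
// is called, so it exists even when the caller runs before main().
std::mutex& log_mutex() {
    static std::mutex m;
    return m;
}

// Same idea for the shared data the mutex protects.
std::vector<std::string>& log_lines() {
    static std::vector<std::string> lines;
    return lines;
}

// Stand-in for the asker's atomic print function.
void safe_log(const std::string& msg) {
    std::lock_guard<std::mutex> lock(log_mutex());
    log_lines().push_back(msg);
}

// An object with static storage duration whose constructor logs.
struct early_user {
    early_user() { safe_log("constructed before main"); }
};

early_user g_early;  // constructed during dynamic initialization
```

Because the constructor reaches the mutex through a function call, it no longer matters which translation unit's globals are initialized first. This does not address the data-race concerns quoted above for threads started before main().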

What happens before entering main will be platform specific, but here is a link on how main() executes on Linux
http://linuxgazette.net/84/hawk.html
The useful snippet is
__libc_start_main initializes necessary stuffs, especially C library(such as malloc) and thread environment and calls our main.
For more information look up __libc_start_main
Not sure how this behaves on Windows, but it seems like any standard C library call before entering main is not a good idea

There may be many ways to do that. See the snippet below where the constructor of class A called before main because we have declared an object of class A at global scope: (I have expanded the example to demonstrate how a thread can be created before main executes)
#include <iostream>
#include <stdlib.h>
#include <unistd.h> // sleep()
#include <pthread.h>
using namespace std;
void *fun(void *x)
{
while (true) {
cout << "Thread\n";
sleep(2);
}
}
pthread_t t_id;
class A
{
public:
A()
{
cout << "Hello before main \n " ;
pthread_create(&t_id, 0, fun, 0);
sleep(6);
}
};
A a;
int main()
{
cout << "I am main\n";
sleep(40);
return 0;
}

I found this question after I posted my own question about threads. Reviewing my question might be helpful to others. I found that when I created, at global scope, an object that starts a thread in its constructor, I got strange behavior, but if I moved the object creation just inside main(), things worked as I expected. That seems to be consistent with the comments on this question.

Related

Why do I get "Segmentation fault (core dumped)" error when trying to implement multithreading in c++?

I have a main file where I plan to initiate the threads for my c++ program, for now, I only want to get one of the threads up and running before moving on to the others, but that is proving to be difficult. The purpose of the threads is for a TCP Server and Client to run at the same time, I have already tested my TCP code and it works, the issue now is running each one in its own thread. The following shows my main.cpp code:
#include <thread>
#include <iostream>
#include <functional>
#include "./hdr/tcpip_server.hpp"
#include "./hdr/tcpip_client.hpp"
using namespace std;
tcpServer *backendServer;
//This is done because the callback function of std::thread tcpip_server_thread complains when I only use 'backendServer->Monitor' as my callback function
void StartThread (void) {backendServer->Monitor();}
int main (void)
{
/*Initiate and start TCP server thread*/
std::thread tcpip_server_thread; // done to define object to be used outside the scope of the if statement below
if (backendServer->Init())
{
std::thread tcpip_server_thread (StartThread);
}
/*Initiate and start data reader thread*/
//std::thread tcpip_client_thread (tcpip_client);
tcpip_server_thread.join();
//tcpip_client_thread.join();
return 0;
}
The backendServer class is as follows:
class tcpServer
{
private:
int listening;
sockaddr_in hint;
sockaddr_in client;
socklen_t clientSize;
int clientSocket;
char host[NI_MAXHOST];
char service[NI_MAXSERV];
char buf[4096];
public:
bool Init ();
void Monitor ();
};
The only error I am getting with this code is the one in the title, and I only get it when the code is executing, no errors are received while compiling the code.
When trying the following:
std::thread tcpip_server_thread (backendServer->Monitor);
I get the following warning:
a pointer to a bound function may only be used to call the function
and
no instance of constructor "std::thread::thread" matches the argument list
Any help would be appreciated as this is my first project implementing threads.

1. Initializing backendServer:
backendServer is a pointer to tcpServer, but it is uninitialized (and does not point to any valid object).
Therefore backendServer->Init(); invokes undefined behavior (UB) and is likely to crash.
If you must use a pointer you must allocate it. Better still use a smart pointer like std::unique_ptr instead.
But in your case I believe the best solution is not to use a pointer at all, and define backendServer as a local variable in main:
int main(void)
{
tcpServer backendServer;
// ...
}
This will require accessing it with backendServer. instead of backendServer->.
2. The thread issue:
At the moment, you have 2 tcpip_server_thread variables.
The 2nd one inside the if is shadowing the 1st one you have before.
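When you get out of the if's scope, the 2nd tcpip_server_thread is destroyed while it is still joinable. A std::thread must be joined (or detached) before destruction; otherwise its destructor calls std::terminate.
Later on you attempt to join the 1st one, which was never started and is therefore not joinable; calling join() on it throws std::system_error, causing a 2nd problem.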
In order to fix it:
Inside the if, do not declare a new variable. Instead use the one you already have:
tcpip_server_thread = std::thread(StartThread);
If you made backendServer a local in main as suggested above, you can use a lambda that captures it by reference:
tcpip_server_thread = std::thread(
[&backendServer]() { backendServer.Monitor();});
//--------------^^^^^^^^^^^^^^---------------------------------
Before you join the thread check that it is joinable. In the current code this will not be the case if you didn't enter the if that started the thread:
if (tcpip_server_thread.joinable())
{
tcpip_server_thread.join();
}
A side note: see "Why is "using namespace std;" considered bad practice?"
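Putting the pieces of this answer together, the flow could look roughly like the sketch below. demo_server is a hypothetical stand-in for tcpServer (its Init and Monitor do no real networking), and run_server is an illustrative helper, not part of the asker's code.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the asker's tcpServer; only the shape matters.
struct demo_server {
    bool init_ok = true;                  // lets the sketch simulate a failed Init()
    std::atomic<bool> monitored{false};
    bool Init() { return init_ok; }
    void Monitor() { monitored = true; }
};

// Returns true if the monitor thread was started and joined.
bool run_server(demo_server& server) {
    std::thread worker;                   // default-constructed: not joinable yet
    if (server.Init()) {
        // Assign to the existing variable instead of declaring a new one.
        worker = std::thread([&server] { server.Monitor(); });
    }
    if (!worker.joinable()) {
        return false;                     // Init() failed, nothing to join
    }
    worker.join();
    return true;
}
```

The server object is passed in by reference, so neither a global pointer nor the StartThread helper is needed.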

The main issue of your code is an uninitialised (actually: zero-initialised) pointer:
tcpServer *backendServer;
Note that you never assign a value to it! As a global variable, the pointer is therefore initialised to nullptr, which you dereference illegally later on, for the first time during the programme run at
if (backendServer->Init())
which most likely caused the crash. A quick and dirty fix might look as:
int main()
{
backendServer = new tcpServer(); // possibly with arguments, depending
// on how your constructor looks like
// the code you have so far
delete backendServer; // avoid memory leak!!!
return 0;
}
You spare all this hassle around manual memory management (-> explicit delete) if you use smart pointers instead, e.g. std::unique_ptr. However, unless you want to dynamically exchange the backend server, limit its lifetime to something shorter than the entire programme run, or construct it with arguments that need to be retrieved or calculated within main first (none of which appears very likely to me in the given case), you are most likely better off with a global object:
tcpServer backendServer; // note the dropped asterisk!
This way the object is created before entering main and correctly destructed after leaving.
As it is no longer a pointer, you now refer to members via . instead of ->, i.e. backendServer.Monitor() for instance.
You actually can construct a std::thread with member function pointers, too. You need, though, to pass the object on which this member function should get called to the thread as well:
std::thread(&tcpServer::Monitor, backendServer);
This works with both pointers and objects; the latter are accepted by value, though, so if you use a global object as recommended above you will rather want to pass a pointer to it:
std::thread(&tcpServer::Monitor, &backendServer);
// ^ (!)
// note: NOT if your variable remains a pointer!!!
This way you can actually spare the global variable entirely and create the object within main and the StartThread (actually you should better have named it RunThread) gets entirely obsolete as well.
Alternatives are converting the Monitor function into an operator() or adding one such as
void tcpServer::operator()()
{
this->Monitor();
}
which makes the object itself callable, thus you could pass it directly to the thread's constructor (std::thread(std::ref(backendServer)); with std::ref preventing the object getting copied) or using a lambda:
std::thread([&]() { backendServer.Monitor(); });
both with the same advantage as providing the member function that you can spare global variable and StartThread function.
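As a rough illustration, the three ways of starting the thread mentioned above can be compared side by side. counter_task is a hypothetical class standing in for tcpServer, and each thread is joined before the next one starts so the plain int counter is never accessed concurrently.

```cpp
#include <functional>
#include <thread>

// Hypothetical callable class standing in for tcpServer.
struct counter_task {
    int calls = 0;
    void Monitor() { ++calls; }
    void operator()() { Monitor(); }      // makes the object itself callable
};

// Starts the same object three different ways and reports the call count.
int start_three_ways() {
    counter_task task;

    // 1. Member function pointer plus a pointer to the object.
    std::thread a(&counter_task::Monitor, &task);
    a.join();

    // 2. The object itself, wrapped in std::ref so it is not copied.
    std::thread b(std::ref(task));
    b.join();

    // 3. A lambda capturing the object by reference.
    std::thread c([&task] { task.Monitor(); });
    c.join();

    return task.calls;
}
```

Without the & in variant 1 or the std::ref in variant 2, the thread would operate on a copy and the original object's counter would stay unchanged.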
Still your code reveals another problem:
if (backendServer->Init())
{
std::thread tcpip_server_thread(StartThread);
}
You create here a second local variable tcpip_server_thread which, as long as it exists, hides the previous one, but which runs out of scope and thus gets destructed again right after the end of the if-body!
Instead you want to assign the newly created thread to the already existing variable, which would look like:
tcpip_server_thread = std::thread(StartThread);
Actually you get nicer code if you move the entire thread-code into the if block:
// no thread code left here any more
if(backendServer->Init())
{
std::thread tcpip_server_thread(StartThread);
// start second thread here, too!
tcpip_server_thread.join();
}
// no thread code left here any more
Finally you should not join a thread that actually has failed to start. You spot this by checking if the thread is joinable
std::thread tcpip_server_thread (StartThread);
if(tcpip_server_thread.joinable())
{
// see above for correct construction!
std::thread tcpip_client_thread(tcpip_client);
if(tcpip_client_thread.joinable())
{
tcpip_server_thread.join();
}
else
{
// you might need some appropriate error handling like
// printing/logging a warning message
// and possibly stop the server thread
}
}
else
{
// error handling, see above
}

To fix the code I had to do 2 things, one was to not define the tcpServer variable, backendServer, as a pointer, since I never pointed it toward an actual object of the type tcpServer.
Next, I removed the first tcpip_server_thread variable and made sure that the code that initiates tcpip_server_thread and the code that joins it are in the same scope. In the future, I will implement the std::move function as explained by @wohlstad.
My working code:
#include <thread>
#include <iostream>
#include <functional>
#include "./hdr/tcpip_server.hpp"
#include "./hdr/tcpip_client.hpp"
using namespace std;
/*All the threads*/
tcpServer backendServer;
void StartThread (void) {backendServer.Monitor();}
int main (void)
{
/*Initiate and start tcp server thread*/
if (backendServer.Init())
{
std::thread tcpip_server_thread (StartThread);
if (tcpip_server_thread.joinable())
{
tcpip_server_thread.join();
}
else
{
cout << "error";
}
}
return 0;
}

c++ capture ctrl+c without using globals

I have simplified my example for an easier explanation. I am writing an application that counts to 100 but at any given time I allow the user to cancel the program by entering ctrl+c through the keyboard.
What seemingly started as a simple program quickly became complicated based on my lack of knowledge on function pointers. This is what I'm attempting to do:
Capture the SIGINT signal when ctrl+c is pressed.
Once captured, call a member function that shuts down a third-party resource.
The catch is that unlike the two examples that Michael Haidl and Grijesh Chauhan give on capturing SIGINT, I am not permitted to store any global variables. The ideal scenario is one in which all variables and function calls related to signal() are encapsulated within a class of mine.
Here's my modified attempt based on Haidl and Grijesh's code:
#include <thread>
#include <chrono>
#include <functional>
#include <iostream>
#include <signal.h>
class MyClass {
public:
volatile sig_atomic_t cancel = 0;
void sig_handler(int signal) {
cancel = true;
this->libCancel();
}
void libCancel() { std::cout << "Cancel and cleanup" << std::endl; }
};
int main(int argc, char *argv[]) {
MyClass mc;
//using std::placeholders::_1;
//std::function<void(int)> handler = std::bind(&MyClass::sig_handler, mc, _1);
//signal(SIGINT, handler);
signal(SIGINT, &mc.sig_handler); // **compiler error**
for (int i = 0; !mc.cancel && i < 100; ++i)
{
std::cout << i << std::endl;
std::this_thread::sleep_for(std::chrono::seconds(1));
}
return 0;
}
As you can see, I'd like the code to simply count to 100 and exit if all goes well. But if the user calls ctrl+c then the class should handle SIGINT, call the external library for cleanup, and the for loop will exit.
The main problem is that I can't seem to setup the signal() declaration to bind to my instance of MyClass::sig_handler. I even tried casting my member function to std::function to be used by signal(), commented out, but the compiler isn't happy about the fact that C++ function<void(int)> isn't equivalent to the C lang void (*)(int).
Any and all criticism is welcome. I'm not at all tied to what I've written and I clearly don't have a great fundamental understanding of how to use function pointers with member functions.

It is not possible to communicate between the signal handler and the rest of the program using local variables. No parameters are passed into the handler other than the raised signal and the handler returns no value.
The words "global variables" are somewhat ambiguous. People sometimes mean different things depending on context. If your restriction applies only to the global scope, then simply use a volatile sig_atomic_t within some namespace. Or use static member variable, if you so prefer.
If your restriction applies to static storage duration, then you can use a thread local variable instead.
If your restriction applies to all global memory, then your problem is unsolvable using a signal handler. You simply need a global variable of some sort.
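If only namespace-scope globals are off limits, the static-member option could look like the sketch below. cancel_flag is a hypothetical class name. The handler only sets a volatile sig_atomic_t flag; any cleanup is expected to run later in normal code that polls requested().

```cpp
#include <csignal>

// Hypothetical wrapper: all signal-related state lives inside the class.
class cancel_flag {
public:
    // Registers the static member function as the SIGINT handler.
    static void install() { std::signal(SIGINT, &cancel_flag::on_signal); }

    // Polled from normal code, e.g. in the loop condition.
    static bool requested() { return flag_ != 0; }

private:
    // A static member function has no 'this', so it fits void (*)(int).
    static void on_signal(int) { flag_ = 1; }

    static volatile std::sig_atomic_t flag_;
};

volatile std::sig_atomic_t cancel_flag::flag_ = 0;
```

The flag still has static storage duration, so this only satisfies the weakest reading of "no globals".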
If you can rely on POSIX rather than only the C++ standard, a way to handle SIGINT without globals is to install no handler for it, block it (for example with pthread_sigmask), and have a thread wait for it with sigwait. If the call returns SIGINT, then stop the program; otherwise do what you want to do with the signal that was caught.
Of course, this means that the blocking thread doesn't do anything other than wait for signals. You'll need to do the actual work in other thread(s).
Technically though, global memory is probably still used. The use is simply hidden inside system library.
Furthermore, it is not safe to use std::cout within a signal handler. I know that is only an example, but "call the external library for cleanup" is very likely also async signal unsafe.
This can be fixed simply by calling the cleanup outside the for loop rather than inside the handler.
The main problem is that I can't seem to setup the signal() declaration to bind to my instance of MyClass::sig_handler.
That's because signal requires a function pointer (of type void (*)(int)). Non-static member functions cannot be pointed to by function pointers. They can only be pointed to by member function pointers, which signal doesn't accept.
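The sigwait approach mentioned in this answer might be sketched as follows on a POSIX system. block_sigint and wait_for_signal are hypothetical helper names. In a real program the mask would be set before any other thread is created, so that all threads inherit it, and the waiting would happen on a dedicated thread.

```cpp
#include <pthread.h>  // pthread_sigmask
#include <signal.h>   // sigset_t, sigemptyset, sigaddset, sigwait

// Blocks SIGINT in the calling thread and returns the set containing it.
// Threads created afterwards inherit this mask.
sigset_t block_sigint() {
    sigset_t set;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    pthread_sigmask(SIG_BLOCK, &set, nullptr);
    return set;
}

// Waits until one of the signals in 'set' is pending and consumes it.
// Returns the signal number, or -1 if sigwait itself failed.
int wait_for_signal(const sigset_t& set) {
    int sig = 0;
    return sigwait(&set, &sig) == 0 ? sig : -1;
}
```

No handler runs asynchronously here, so the code that reacts to the signal is ordinary code and may call functions that are not async-signal-safe.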

C++ syntax I don't understand

I've found a C++ code that has this syntax:
void MyClass::method()
{
beginResetModel();
{
// Various stuff
}
endResetModel();
}
I have no idea why there are { } after a line ending with ; but it compiles and runs without problems. Could this have something to do with the code possibly being asynchronous (I'm not sure yet)? Or maybe the { } are only there to delimit a part of the code and don't really make a difference, but honestly I doubt this. Does anyone have a clue what this syntax means?
More info: There is no other reference to beginResetModel, resetModel or ResetModel in the whole project (searched with grep). Btw the project is a Qt one. Maybe it's another Qt-related macro I haven't heard of.

Using {} will create a new scope. In your case, any variable created in those braces will cease to exist at the } in the end.

beginResetModel();
{
// Various stuff
}
endResetModel();
The open and close braces in your code are a very important feature in C++, as they delimit a new scope. You can appreciate the power of this in combination with another powerful language feature: destructors.
So, suppose that inside those braces you have code that creates various objects, like graphics models, or whatever.
Assuming that these objects are instances of classes that allocate resources (e.g. textures on the video card), and those classes have destructors that release the allocated resources, you are guaranteed that, at the }, these destructors are automatically invoked.
In this way, all the allocated resources are automatically released, before the code outside the closing curly brace, e.g. before the call to endResetModel() in your sample.
This automatic and deterministic resource management is a key powerful feature of C++.
Now, suppose that you remove the curly braces, and your method looks like this:
void MyClass::method()
{
beginResetModel();
// {
// Various stuff
// }
endResetModel();
}
Now, all the objects created in the Various stuff section of code will be destroyed before the } that terminates the MyClass::method(), but after the call to endResetModel().
So, in this case, you end up with the endResetModel() call, followed by other release code that runs after it. This may cause bugs.
On the other hand, the curly braces that define a new scope enclosed in begin/endResetModel() do guarantee that all the objects created inside this scope are destroyed before endResetModel() is invoked.
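The difference in destruction order can be made visible with a small tracing class. The sketch below is purely illustrative: scoped_note and the two functions are hypothetical names, and the "begin" and "end" log entries stand in for beginResetModel() and endResetModel().

```cpp
#include <string>
#include <vector>

// Hypothetical resource: logs when it is acquired and when it is released.
struct scoped_note {
    std::vector<std::string>& log;
    explicit scoped_note(std::vector<std::string>& l) : log(l) {
        log.push_back("acquire");
    }
    ~scoped_note() { log.push_back("release"); }
};

// With the inner braces: the destructor runs at the inner '}', before "end".
void with_inner_scope(std::vector<std::string>& log) {
    log.push_back("begin");
    {
        scoped_note note(log);
    }
    log.push_back("end");
}

// Without them: the destructor runs at the function's '}', after "end".
void without_inner_scope(std::vector<std::string>& log) {
    log.push_back("begin");
    scoped_note note(log);
    log.push_back("end");
}
```

The first function records begin, acquire, release, end; the second records begin, acquire, end, release.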

{} delimits a scope. That means that any variable declared inside there is not accessible outside of it and is erased from memory once the } is reached. Here is an example:
#include <iostream>
using namespace std;
class MyClass{
public:
~MyClass(){
cout << "Destructor called" << endl;
}
};
int main(){
{
int x = 3;
MyClass foo;
cout << x << endl; //Prints 3
} //Here "Destructor called" is printed since foo is cleared from the memory
cout << x << endl; //Compiler error, x isn't defined here
return 0;
}
Usually scopes are used for functions, loops, if-statements, etc, but you're perfectly allowed to use scopes without any statement before them. This can be particularly useful to declare variables inside a switch (this answer explains why).

As others have pointed out, the curly braces create a new scope, but maybe the interesting thing is why you would want to do that, that is, what the difference is between using it and not using it. There are cases where scopes are obviously necessary, such as with if or for blocks; if you don't create a scope after them you can only have one statement. Another possible reason is that you use a variable in one part of the function and do not want it to be used outside of that part, so you put it into its own scope. However, the main use of scopes outside of control statements has to do with RAII. When you declare an instance variable (not a pointer or reference), it is always initialized; when it goes out of scope, it is always destroyed. This can be used to define blocks that require some setup at the beginning and some teardown at the end (if you are familiar with Python, similar to with blocks).
Take this example:
#include <mutex>
void fun(std::mutex & mutex) {
// 1. Perform some computations...
{
std::lock_guard<std::mutex> lock(mutex);
// 2. Operations in this scope are performed with the mutex locked
}
// 3. More computations...
}
In this example, part 2 only runs after the mutex has been acquired, and the mutex is released before part 3 starts. If you remove the additional scope:
#include <mutex>
void fun(std::mutex & mutex) {
// 1. Perform some computations...
std::lock_guard<std::mutex> lock(mutex);
// 2. Operations in this scope are performed with the mutex locked
// 3. More computations...
}
In this case the mutex is acquired before starting part 2, but it is held until part 3 is complete (possibly producing more interlocking between threads than necessary). Note, however, that in both cases there was no need to specify when the lock is released; std::lock_guard is responsible for both acquiring the lock on construction and releasing it on destruction (i.e. when it goes out of scope).

std::thread::join() hangs if called after main() exits when using VS2012 RC

The following example runs successfully (i.e. doesn't hang) if compiled using Clang 3.2 or GCC 4.7 on Ubuntu 12.04, but hangs if I compile using VS11 Beta or VS2012 RC.
#include <chrono>
#include <iostream>
#include <string>
#include <thread>
#include <typeinfo>
#include "boost/thread/thread.hpp"
void SleepFor(int ms) {
std::this_thread::sleep_for(std::chrono::milliseconds(ms));
}
template<typename T>
class ThreadTest {
public:
ThreadTest() : thread_([] { SleepFor(10); }) {}
~ThreadTest() {
std::cout << "About to join\t" << id() << '\n';
thread_.join();
std::cout << "Joined\t\t" << id() << '\n';
}
private:
std::string id() const { return typeid(decltype(thread_)).name(); }
T thread_;
};
int main() {
static ThreadTest<std::thread> std_test;
static ThreadTest<boost::thread> boost_test;
// SleepFor(100);
}
The issue appears to be that std::thread::join() never returns if it is invoked after main has exited. It is blocked at WaitForSingleObject in _Thrd_join defined in cthread.c.
Uncommenting SleepFor(100); at the end of main allows the program to exit properly, as does making std_test non-static. Using boost::thread also avoids the issue.
So I'd like to know if I'm invoking undefined behaviour here (seems unlikely to me), or if I should be filing a bug against VS2012?

Tracing through Fraser's sample code in his connect bug (https://connect.microsoft.com/VisualStudio/feedback/details/747145)
with VS2012 RTM seems to show a fairly straightforward case of deadlocking. This likely isn't specific to std::thread - likely _beginthreadex suffers the same fate.
What I see in the debugger is the following:
On the main thread, the main() function has completed, the process cleanup code has acquired a critical section called _EXIT_LOCK1, called the destructor of ThreadTest, and is waiting (indefinitely) on the second thread to exit (via the call to join()).
The second thread's anonymous function completed and is in the thread cleanup code waiting to acquire the _EXIT_LOCK1 critical section. Unfortunately, due to the timing of things (whereby the second thread's anonymous function's lifetime exceeds that of the main() function) the main thread already owns that critical section.
DEADLOCK.
Anything that extends the lifetime of main() such that the second thread can acquire _EXIT_LOCK1 before the main thread avoids the deadlock situation. That's why uncommenting the sleep in main() results in a clean shutdown.
Alternatively if you remove the static keyword from the ThreadTest local variable, the destructor call is moved up to the end of the main() function (instead of in the process cleanup code) which then blocks until the second thread has exited - avoiding the deadlock situation.
Or you could add a function to ThreadTest that calls join() and call that function at the end of main() - again avoiding the deadlock situation.
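The last suggestion, an explicit function that joins before main() returns, could be sketched like this. joining_thread and stop() are hypothetical names; the destructor still joins as a fallback, but becomes a no-op once stop() has been called.

```cpp
#include <functional>
#include <thread>
#include <utility>

// Hypothetical wrapper: owns a thread and offers an explicit join point.
class joining_thread {
public:
    explicit joining_thread(std::function<void()> work)
        : thread_(std::move(work)) {}

    // Call this at the end of main(), before static destruction begins.
    void stop() {
        if (thread_.joinable()) {
            thread_.join();
        }
    }

    bool joinable() const { return thread_.joinable(); }

    // Fallback only; does nothing if stop() already joined the thread.
    ~joining_thread() { stop(); }

private:
    std::thread thread_;
};
```

With a static instance, calling stop() as the last statement of main() means the join happens while main() is still running, so the destructor that runs later during process cleanup has nothing left to wait for.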

I realize this is an old question regarding VS2012, but the bug is still present in VS2013. For those who are stuck on VS2013, perhaps due to Microsoft's refusal to provide upgrade pricing for VS2015, I offer the following analysis and workaround.
The problem is that the mutex (at_thread_exit_mutex) used by _Cnd_do_broadcast_at_thread_exit() is either not yet initialized, or has already been destroyed, depending on the exact circumstances. In the former case, _Cnd_do_broadcast_at_thread_exit() tries to initialize the mutex during shutdown, causing a deadlock. In the latter case, where the mutex has already been destroyed via the atexit stack, the program will crash on the way out.
The solution I found is to explicitly call _Cnd_do_broadcast_at_thread_exit() (which thankfully is declared publicly) early during program startup. This has the effect of creating the mutex before anyone else tries to access it, as well as ensuring that the mutex continues to exist until the last possible moment.
So, to fix the problem, insert the following code at the bottom of a source module, for instance somewhere below main().
#pragma warning(disable:4073) // initializers put in library initialization area
#pragma init_seg(lib)
#if _MSC_VER < 1900
struct VS2013_threading_fix
{
VS2013_threading_fix()
{
_Cnd_do_broadcast_at_thread_exit();
}
} threading_fix;
#endif

I believe your threads have already been terminated and their resources freed following the termination of your main function and before static destruction. This is the behavior of the VC runtimes dating back to at least VC6.
Do child threads exit when the parent thread terminates
boost thread and process cleanup on windows

My answer is far too late, but I hope it will help someone.
I was stuck on this bug and found a trick that solves the problem; it worked in my code.
int main()
{
ThreadTest<std::thread> trick_obj; //trick... You can put this line of code anywhere
static ThreadTest<std::thread> std_test;
return 1;
}

I have been battling this bug for a day, and found the following workaround, which turned out to be the least dirty trick:
Instead of returning, one can use the standard Windows API function call ExitThread() to terminate the thread. This method of course may mess up the internal state of the std::thread object and associated library, but since the program is going to terminate anyway, well, so be it.
#include <windows.h>
template<typename T>
class ThreadTest {
public:
ThreadTest() : thread_([] { SleepFor(10); ExitThread(NULL); }) {}
~ThreadTest() {
std::cout << "About to join\t" << id() << '\n';
thread_.join();
std::cout << "Joined\t\t" << id() << '\n';
}
private:
std::string id() const { return typeid(decltype(thread_)).name(); }
T thread_;
};
The join() call apparently works correctly. However, I chose to use a safer method in our solution. One can get the thread HANDLE via std::thread::native_handle(). With this handle we can call the Windows API directly to join the thread:
WaitForSingleObject(thread_.native_handle(), INFINITE);
CloseHandle(thread_.native_handle());
Thereafter, the std::thread object must not be destroyed: it still considers itself joinable, so its destructor would call std::terminate. So we just leave the std::thread object dangling at program exit.

How can I schedule some code to run after all '_atexit()' functions are completed

I'm writing a memory tracking system, and the only problem I've actually run into is that when the application exits, any static/global objects that didn't allocate in their constructor but deallocate in their destructor do so after my memory tracker has already reported the allocated data as a leak.
As far as I can tell, the only way for me to properly solve this would be to either force the placement of the memory tracker's _atexit callback at the head of the stack (so that it is called last) or have it execute after the entire _atexit stack has been unwound. Is it actually possible to implement either of these solutions, or is there another solution that I have overlooked?
Edit:
I'm working on/developing for Windows XP and compiling with VS2005.

I've finally figured out how to do this under Windows/Visual Studio. Looking through the crt startup function again (specifically where it calls the initializers for globals), I noticed that it was simply running "function pointers" that were contained between certain segments. So with just a little bit of knowledge on how the linker works, I came up with this:
#include <cstdio>   // puts
#include <cstdlib>  // atexit
#include <iostream>
using std::cout;
using std::endl;
// Typedef for the function pointer
typedef void (*_PVFV)(void);
// Our various functions/classes that are going to log the application startup/exit
struct TestClass
{
int m_instanceID;
TestClass(int instanceID) : m_instanceID(instanceID) { cout << " Creating TestClass: " << m_instanceID << endl; }
~TestClass() {cout << " Destroying TestClass: " << m_instanceID << endl; }
};
static int InitInt(const char *ptr) { cout << " Initializing Variable: " << ptr << endl; return 42; }
static void LastOnExitFunc() { puts("Called " __FUNCTION__ "();"); }
static void CInit() { puts("Called " __FUNCTION__ "();"); atexit(&LastOnExitFunc); }
static void CppInit() { puts("Called " __FUNCTION__ "();"); }
// our variables to be initialized
extern "C" { static int testCVar1 = InitInt("testCVar1"); }
static TestClass testClassInstance1(1);
static int testCppVar1 = InitInt("testCppVar1");
// Define our segment names
#define SEGMENT_C_INIT ".CRT$XIM"
#define SEGMENT_CPP_INIT ".CRT$XCM"
// Build our various function tables and insert them into the correct segments.
#pragma data_seg(SEGMENT_C_INIT)
#pragma data_seg(SEGMENT_CPP_INIT)
#pragma data_seg() // Switch back to the default segment
// Create our function pointer arrays and place them in the segments created above
#define SEG_ALLOCATE(SEGMENT) __declspec(allocate(SEGMENT))
SEG_ALLOCATE(SEGMENT_C_INIT) _PVFV c_init_funcs[] = { &CInit };
SEG_ALLOCATE(SEGMENT_CPP_INIT) _PVFV cpp_init_funcs[] = { &CppInit };
// Some more variables just to show that declaration order isn't affecting anything
extern "C" { static int testCVar2 = InitInt("testCVar2"); }
static TestClass testClassInstance2(2);
static int testCppVar2 = InitInt("testCppVar2");
// Main function which prints itself just so we can see where the app actually enters
int main()
{
cout << " Entered Main()!" << endl;
}
which outputs:
Called CInit();
Called CppInit();
Initializing Variable: testCVar1
Creating TestClass: 1
Initializing Variable: testCppVar1
Initializing Variable: testCVar2
Creating TestClass: 2
Initializing Variable: testCppVar2
Entered Main()!
Destroying TestClass: 2
Destroying TestClass: 1
Called LastOnExitFunc();
This works due to the way MS have written their runtime library. Basically, they've set up the following variables in the data segments:
(although this info is copyright I believe this is fair use as it doesn't devalue the original and IS only here for reference)
extern _CRTALLOC(".CRT$XIA") _PIFV __xi_a[];
extern _CRTALLOC(".CRT$XIZ") _PIFV __xi_z[]; /* C initializers */
extern _CRTALLOC(".CRT$XCA") _PVFV __xc_a[];
extern _CRTALLOC(".CRT$XCZ") _PVFV __xc_z[]; /* C++ initializers */
extern _CRTALLOC(".CRT$XPA") _PVFV __xp_a[];
extern _CRTALLOC(".CRT$XPZ") _PVFV __xp_z[]; /* C pre-terminators */
extern _CRTALLOC(".CRT$XTA") _PVFV __xt_a[];
extern _CRTALLOC(".CRT$XTZ") _PVFV __xt_z[]; /* C terminators */
On initialization, the program simply iterates from '__xN_a' to '__xN_z' (where N is {i,c,p,t}) and calls any non null pointers it finds. If we just insert our own segment in between the segments '.CRT$XnA' and '.CRT$XnZ' (where, once again n is {I,C,P,T}), it will be called along with everything else that normally gets called.
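The table walk described above can be sketched in portable C++. This is only an illustration of the idea, not the CRT's actual code: the names InitFn, run_table and bump are made up for this sketch, and a plain local array stands in for the linker-assembled segment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the CRT's _PVFV typedef.
typedef void (*InitFn)(void);

// Walks [first, last) and calls every non-null entry, the same way the
// startup code is described as walking from __xc_a to __xc_z.
// Returns the number of functions that were actually called.
std::size_t run_table(const InitFn* first, const InitFn* last) {
    assert(first <= last);
    std::size_t called = 0;
    for (const InitFn* p = first; p < last; ++p) {
        if (*p != nullptr) {   // padding between segments shows up as null slots
            (*p)();
            ++called;
        }
    }
    return called;
}

// A trivial "initializer" so the walk has something to call.
static int g_counter = 0;
static void bump() { ++g_counter; }
```

Calling run_table on an array such as { nullptr, &bump, nullptr, &bump } invokes bump twice and skips the null slots, which is why inserting an extra pointer between the 'A' and 'Z' markers is enough to get it called.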
The linker simply joins up the segments in alphabetical order. This makes it extremely simple to select when our functions should be called. If you have a look in defsects.inc (found under $(VS_DIR)\VC\crt\src\) you can see that MS have placed all the "user" initialization functions (that is, the ones that initialize globals in your code) in segments ending with 'U'. This means that we just need to place our initializers in a segment earlier than 'U' and they will be called before any other initializers.
You must be really careful not to use any functionality that isn't initialized until after your selected placement of the function pointers. Frankly, I'd recommend you just use .CRT$XCT; that way it's only your code that hasn't been initialized. I'm not sure what will happen if you've linked with standard 'C' code; you may have to place it in the .CRT$XIT block in that case.
One thing I did discover was that the "pre-terminators" and "terminators" aren't actually stored in the executable if you link against the DLL versions of the runtime library, so you can't really use them as a general solution. Instead, the way I made my specific function run as the last "user" function was to simply call atexit() within the 'C initializers'. That way, no other function could have been added to the stack before it (the stack is unwound in the reverse of the order in which functions were added, and it is also how all global/static destructors get called).
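The ordering that this trick relies on can be seen without any Microsoft-specific segments. The following is a minimal sketch in standard C++; Noisy, final_report and register_then_construct are hypothetical names used only for illustration. A handler registered with std::atexit before a static object finishes construction runs after that object's destructor.

```cpp
#include <cstdio>
#include <cstdlib>

struct Noisy {
    ~Noisy() { std::puts("~Noisy: destroyed first"); }
};

static void final_report() { std::puts("final_report: runs after ~Noisy"); }

// Registers the exit handler, then constructs a static object.
// Returns true if std::atexit accepted the handler.
bool register_then_construct() {
    bool ok = (std::atexit(final_report) == 0);
    static Noisy late;  // construction completes after the registration above
    (void)late;
    return ok;
}
```

If register_then_construct() is called once early in the program, the destructor message should appear before the handler's message at exit, because exit-time work is unwound in reverse order of registration and construction.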
Just one final (obvious) note: this is written with Microsoft's runtime library in mind. It may work similarly on other platforms/compilers (hopefully you'll be able to get away with just changing the segment names to whatever they use, IF they use the same scheme), but don't count on it.

atexit is processed by the C/C++ runtime (CRT). It runs after main() has already returned. Probably the best way to do this is to replace the standard CRT with your own.
On Windows tlibc is probably a great place to start: http://www.codeproject.com/KB/library/tlibc.aspx
Look at the code sample for mainCRTStartup and just run your code after the call to _doexit(); but before ExitProcess.
Alternatively, you could just get notified when ExitProcess gets called. When ExitProcess gets called the following occurs (according to http://msdn.microsoft.com/en-us/library/ms682658%28VS.85%29.aspx):
1. All of the threads in the process, except the calling thread, terminate their execution without receiving a DLL_THREAD_DETACH notification.
2. The states of all of the threads terminated in step 1 become signaled.
3. The entry-point functions of all loaded dynamic-link libraries (DLLs) are called with DLL_PROCESS_DETACH.
4. After all attached DLLs have executed any process termination code, the ExitProcess function terminates the current process, including the calling thread.
5. The state of the calling thread becomes signaled.
6. All of the object handles opened by the process are closed.
7. The termination status of the process changes from STILL_ACTIVE to the exit value of the process.
8. The state of the process object becomes signaled, satisfying any threads that had been waiting for the process to terminate.
So, one method would be to create a DLL and have that DLL attach to the process. It will get notified when the process exits, which should be after atexit has been processed.
Obviously, this is all rather hackish, proceed carefully.

This is dependent on the development platform. For example, Borland C++ has a #pragma which could be used for exactly this. (From Borland C++ 5.0, c. 1995)
#pragma startup function-name [priority]
#pragma exit function-name [priority]
These two pragmas allow the program to specify function(s) that should be called either upon program startup (before the main function is called), or program exit (just before the program terminates through _exit).
The specified function-name must be a previously declared function as:
void function-name(void);
The optional priority should be in the range 64 to 255, with highest priority at 0; default is 100. Functions with higher priorities are called first at startup and last at exit. Priorities from 0 to 63 are used by the C libraries, and should not be used by the user.
Perhaps your C compiler has a similar facility?
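As one example of a similar facility, GCC and Clang accept constructor and destructor function attributes with an optional priority. The sketch below assumes one of those compilers (it will not build with MSVC), and the function names and log buffer are invented for illustration. With these attributes, a smaller priority number runs earlier at startup and later at exit, and priorities up to 100 are reserved for the implementation.

```cpp
#include <cstdio>

// Zero-initialized before any startup function runs, so it is safe to write to.
static char g_log[8];
static int  g_len;

static void note(char c) { if (g_len < 7) g_log[g_len++] = c; }

// Startup functions: 101 runs before 102.
__attribute__((constructor(101))) static void start_first()  { note('1'); }
__attribute__((constructor(102))) static void start_second() { note('2'); }

// Exit functions: the order is reversed, so 102 runs before 101.
__attribute__((destructor(102))) static void exit_first() { std::puts("exit_first"); }
__attribute__((destructor(101))) static void exit_last()  { std::puts("exit_last"); }

// Returns the order in which the startup functions ran.
const char* startup_log() { return g_log; }
```

By the time main() is entered, startup_log() should return "12". This is compiler-specific behaviour rather than anything the C or C++ standards promise.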

I've read multiple times you can't guarantee the construction order of global variables (cite). I'd think it is pretty safe to infer from this that destructor execution order is also not guaranteed.
Therefore, if your memory tracking object is global, you will almost certainly be unable to make any guarantees that your memory tracker object will get destructed last (or constructed first). If it's not destructed last, and other allocations are outstanding, then yes, it will notice the leaks you mention.
Also, what platform is this _atexit function defined for?

Having the memory tracker's cleanup executed last is the best solution. The easiest way I've found to do that is to explicitly control all the relevant global variables' initialization order. (Some libraries hide their global state in fancy classes or otherwise, thinking they're following a pattern, but all they do is prevent this kind of flexibility.)
Example main.cpp:
#include "global_init.inc"
int main() {
// do very little work; all initialization, main-specific stuff
// then call your application's mainloop
}
Where the global-initialization file includes object definitions and #includes similar non-header files. Order the objects in this file in the order you want them constructed, and they'll be destructed in the reverse order. 18.3/8 in C++03 guarantees that destruction order mirrors construction: "Non-local objects with static storage duration are destroyed in the reverse order of the completion of their constructor." (That section is talking about exit(), but a return from main is the same, see 3.6.1/5.)
As a bonus, you're guaranteed that all globals (in that file) are initialized before entering main. (Something not guaranteed in the standard, but allowed if implementations choose.)
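To make the ordering concrete, here is a small sketch of what such a global-initialization file might contain. Tracked and the three object names are hypothetical; the point is only that objects defined in one translation unit are constructed in definition order and destroyed in the reverse order.

```cpp
#include <cstdio>

// Plain arrays and ints with static storage are zero-initialized before any
// constructor runs, so global constructors can safely write to them.
static char g_order[16];
static int  g_len;

struct Tracked {
    char tag;
    explicit Tracked(char t) : tag(t) {
        if (g_len < 15) g_order[g_len++] = tag;  // record construction order
    }
    ~Tracked() { std::printf("destroying %c\n", tag); }
};

// Stand-in for the contents of "global_init.inc": one file, one explicit order.
static Tracked memory_tracker('A');  // constructed first, destroyed last
static Tracked subsystem_one('B');
static Tracked subsystem_two('C');   // constructed last, destroyed first

// Returns the recorded construction order.
const char* construction_order() { return g_order; }
```

Inside main(), construction_order() should return "ABC", and at exit the destructors should report C, B, A, so the tracker placed first in the file outlives everything defined after it.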

I've had this exact problem, also writing a memory tracker.
A few things:
Along with destruction, you also need to handle construction. Be prepared for malloc/new to be called BEFORE your memory tracker is constructed (assuming it is written as a class). So you need your class to know whether it has been constructed or destructed yet!
class MemTracker
{
public:
    enum State
    {
        unconstructed = 0, // must be 0 !!!
        constructed,
        destructed
    };
    State state; // deliberately not initialized in the constructor
    MemTracker()
    {
        if (state == unconstructed)
        {
            // construct...
            state = constructed;
        }
    }
};
static MemTracker memTracker; // statics are zero-initialized before any constructor runs
On every allocation that calls into your tracker, construct it!
MemTracker::malloc(...)
{
// force call to constructor, which does nothing after first time
new (this) MemTracker();
...
}
Strange, but true. Anyhow, onto destruction:
~MemTracker()
{
OutputLeaks(file);
state = destructed;
}
So, on destruction, output your results. Yet we know that there will be more calls. What to do? Well,...
MemTracker::free(void * ptr)
{
do_tracking(ptr);
if (state == destructed)
{
// we must be getting called late
// so re-output
// Note that this might happen a lot...
OutputLeaks(file); // again!
}
}
And lastly:
be careful with threading
be careful not to call malloc/free/new/delete inside your tracker, or be able to detect the recursion, etc :-)
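Putting the pieces above together, here is a minimal, self-contained sketch of the state-machine idea. It is not the tracker described in this answer: LeakCounter and its members are hypothetical, it only counts allocations, and it swaps the placement-new trick for an explicit ensure_constructed() call so the struct needs no constructor at all.

```cpp
#include <cstdio>

struct LeakCounter {
    enum State { unconstructed = 0, constructed, destructed };

    State state;       // zero for static storage, i.e. unconstructed
    long  outstanding; // allocations not yet freed
    int   reports;     // how many times a report was written

    // Safe to call at any time; only does work on first use.
    void ensure_constructed() {
        if (state == unconstructed) {
            outstanding = 0;
            reports = 0;
            state = constructed;
        }
    }
    void on_alloc() { ensure_constructed(); ++outstanding; }
    void on_free() {
        ensure_constructed();
        --outstanding;
        if (state == destructed) report();  // late call: refresh the report
    }
    void report() {
        ++reports;
        std::printf("outstanding allocations: %ld\n", outstanding);
    }
    // Plays the role of the destructor in the answer above.
    void shutdown() { ensure_constructed(); report(); state = destructed; }
};

// No constructor, so this is zero-initialized before any code runs.
static LeakCounter g_counter;
```

Two allocations and one free leave outstanding at 1; shutdown() writes the first report, and a free arriving afterwards lowers the count to 0 and writes a second one. A real tracker would also need the threading and recursion protection mentioned above.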
EDIT:
I forgot to mention: if you put your tracker in a DLL, you will probably need to call LoadLibrary() (or dlopen, etc.) yourself to bump its reference count, so that it doesn't get removed from memory prematurely. Although your class can still be called after destruction, it can't be called once its code has been unloaded.
