I would like to watch a global variable before the start of the main function. One possible solution is to create a function that initializes the global variable and set a breakpoint on that function:
int Init()
{
return 0;
}
int globalX = Init();
//gdb: break Init
//gdb: run
//gdb: awatch globalX
Is it possible to watch a global variable (before the start of the main function) without defining a function breakpoint? watch globalX doesn't work.
Is it possible to watch a global variable (before the start of the main function) without defining a function breakpoint?
Yes. However, GDB will only stop when the value of the variable changes after the program starts, and a variable that is initialized with a constant value already has that value before the first instruction of the program executes.
More precisely:
int foo;
int bar = 42;
int baz = func();
The variable foo is allocated in the .bss section, and has value 0 before the first instruction in the process executes.
Likewise, variable bar is allocated in the .data section, and has value 42 before the first instruction (the corresponding location in the .data section has this value on disk, and it is simply mmaped into the process before the process starts).
The variable baz is allocated in the .data section, and is dynamically initialized -- this is the only variable that actually changes its value after the process starts. You can watch that variable and observe where the initialization happens without setting a breakpoint on func().
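For example, a session along these lines should stop inside the dynamic-initialization code that runs before main (a sketch; it assumes your GDB accepts the watchpoint before the program is running, which recent versions do for a global with a fixed address):
//gdb: watch baz
//gdb: run
//gdb stops when baz is written during dynamic initialization, before main is entered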
Related
#include <iostream>
using std::cout;

int testFun(int A)
{
    return A + 1;
}

int main()
{
    int x = 0;
    int y = testFun(x);
    cout << y;
}
As we know, the stack holds the local variables, which means that when I was in the main function the stack held the variables x and y, and when I called testFun the stack held the variable A,
and when I return from testFun, the stack pops the last frame.
But the question here is: when I return from testFun, how does it know the last place it was in the main function before calling testFun?
when I return from testFun, how does it know the last place it was in the main function before calling testFun?
The compiler parses the code and generates machine instructions that run on the CPU. A function call produces a CALL instruction. When the function exits, a RET instruction is used to return to the caller.
The CALL instruction pushes the address of the instruction that follows the CALL itself onto the call stack, then jumps to the starting address of the specified function.
The RET instruction pops that address from the call stack, then jumps to the specified address.
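A small sketch makes the saved return address visible (this assumes GCC or Clang, which provide __builtin_return_address; it is not part of the original question):

#include <cstdio>

int testFun(int A)
{
    // The address that CALL pushed for this invocation: the place in main
    // that RET will jump back to.
    std::printf("testFun will return to %p\n", __builtin_return_address(0));
    return A + 1;
}

int main()
{
    int x = 0;
    int y = testFun(x); // CALL pushes the address of the next instruction, then jumps
    std::printf("y = %d\n", y);
}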
I began learning about POSIX threads recently, and I've learned that when you have two threads, Main and B, thread B can continuously change a variable in thread Main if I pass a reference to the variable as the void pointer in thread B's creation.
That led me to wonder how to make thread Main continuously change a variable in thread B. I wrote a program to test whether changing the passed parameter changes thread B, by running thread B and then changing the referenced variable. It didn't do anything. Is this result right?
So basically:
#include <pthread.h>
#include <cstddef>

void *someFunc(void *var) {
    int *num = (int *) var;
    int num2 = *num;
    while (true) {
        if (num2 == 1) {
            *num = 3;
        } else {
            *num = 5;
        }
    }
    return NULL;
}

int main() {
    int someVar = 1;
    pthread_t threadB;
    if (pthread_create(&threadB, NULL, someFunc, &someVar)) {
        return 1;
    }
    someVar = 2;
    // then join both threads later and print someVar
    // will someVar be 3 or 5?
}
Basically, when I reference a variable using the void pointer in thread creation, will any future changes to that variable affect the newly created thread? If this is not true, in order to continuously change it, is there some particular call for that? Should I look into locks/mutex or just put someFunc into a class and change its initializer variables?
Thanks!
The line
int num2=*num;
creates a copy of the number pointed to by the pointer that the main thread passed in. You therefore have a race: if the value is changed before the copy is made, one thing will happen; otherwise, the child thread will never see the change.
Because you pass someVar by pointer to someFunc, and that pointer is copied into num, any change to someVar will immediately change the value of *num.
But num2 will not be affected by changes to someVar, because num2 is a different variable allocated on the stack of thread B. Therefore, the outcome of the while loop is determined by the value that was assigned to num2 when the thread started. That can be either 1 or 2, depending on how fast the main thread and thread B run. Such a dependency is non-deterministic behavior called a "race condition", and you need to be very careful to avoid it.
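Not part of the original answer, but one way to avoid both the stale copy and the data race (assuming C++11's std::atomic is available alongside pthreads) is to share an atomic variable and re-read it inside the loop instead of copying it once:

#include <atomic>
#include <pthread.h>
#include <cstdio>

std::atomic<int> someVar(1);

void *someFunc(void *var) {
    std::atomic<int> *num = static_cast<std::atomic<int> *>(var);
    // Bounded loop so the thread can actually be joined in this sketch.
    for (int i = 0; i < 1000000; ++i) {
        // Re-read the shared value every iteration; no private copy like num2.
        if (num->load() == 1) {
            num->store(3);
        } else {
            num->store(5);
        }
    }
    return NULL;
}

int main() {
    pthread_t threadB;
    if (pthread_create(&threadB, NULL, someFunc, &someVar)) {
        return 1;
    }
    someVar.store(2);   // thread B will see this on a later iteration
    pthread_join(threadB, NULL);
    std::printf("someVar = %d\n", someVar.load()); // almost certainly 5
    return 0;
}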
I'd like to have a thread_local variable to change the level of logging applied in each thread of my application. Something like so:
enum class trace_level { none, error, warning, log, debug, verbose };
static thread_local trace_level min_level = trace_level::log;
The default value should be trace_level::log for the main thread when the application starts, but if it is changed before launching other threads, then I would like the child threads to start with the current value of the parent.
Is there any way to do this using a thread_local variable? Since this code is buried in a library it is not an option to simply set the value manually at the start of each thread.
This already happens if the initialization is dynamic. The standard requires that variables with "thread storage duration" and dynamic initialization be initialized sometime between the start of the thread and the first odr-use. However, since you generally can't control exactly when that initialization occurs (only that it is sometime after the thread object is created and sometime before the thread ends, assuming the thread-local variable actually gets used by the thread), the problem is that the thread-local variable might get initialized with a value that your main thread sets after the thread is created.
For a concrete example, consider:
#include <stdio.h>
#include <chrono>
#include <functional>
#include <thread>
#include <string>
using std::string;
enum class trace_level { none, error, warning, log, debug, verbose };
trace_level log_level = trace_level::log;
static thread_local trace_level min_level = log_level;
void f(string const& s)
{
printf("%s, min_level == %d\n", s.c_str(), (int) min_level);
}
int main()
{
std::thread t1{std::bind(f,"thread 1")};
//TODO: std::this_thread::sleep_for(std::chrono::milliseconds(50));
log_level = trace_level::verbose;
std::thread t2{std::bind(f,"thread 2")};
t1.join();
t2.join();
}
With the sleep_for() call commented out as above, I get the following output (usually):
C:\so-test>test
thread 1, min_level == 5
thread 2, min_level == 5
However, with the sleep_for() uncommented, I get (again - usually):
C:\so-test>test
thread 1, min_level == 3
thread 2, min_level == 5
So as long as you're willing to live with a bit of uncertainty regarding which logging level a thread will get if the level gets changed in the main thread soon after the thread starts, you can probably just do what you're looking to do pretty naturally.
There's one remaining caveat - data races. The code above has a data race on the log_level variable, so it actually has undefined behavior. The fix for that is to make the variable either an atomic type or wrap it in a class that uses a mutex to protect updates and reads from data races. So change the declaration of the global log_level to:
std::atomic<trace_level> log_level(trace_level::log);
Standards citations:
3.6.2 Initialization of non-local variables [basic.start.init]
... Non-local variables with thread storage duration are initialized
as a consequence of thread execution. ...
and
3.7.2/2 Thread storage duration [basic.stc.thread]
A variable with thread storage duration shall be initialized before
its first odr-use (3.2) and, if constructed, shall be destroyed on
thread exit.
You can create a global pointer to a parent thread local variable.
In global scope
thread_local trace_level min_level = trace_level::log;
trace_level *min_level_ptr = nullptr;
Then, in each thread you can do:
if (!min_level_ptr)
min_level_ptr = &min_level;
else
min_level = *min_level_ptr;
(Possibly, make the min_level_ptr atomic for added safety and use atomic compare exchange instead of assignment).
The idea goes as follows: each thread's local storage occupies a different region in memory, so the min_level variable in one thread has a unique storage address, different from all the others. min_level_ptr, on the other hand, has the same address no matter which thread accesses it. As the "parent" thread starts before all the others, it claims the globally shared pointer with the address of its own min_level. The children then initialize their values from that location.
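A minimal sketch of how that could be wired up (the adopt_parent_level helper is purely illustrative, and as noted above, min_level_ptr should really be atomic):

#include <thread>
#include <cstdio>

enum class trace_level { none, error, warning, log, debug, verbose };

thread_local trace_level min_level = trace_level::log;
trace_level *min_level_ptr = nullptr;

// The first thread to call this publishes the address of its min_level;
// every later thread copies the parent's current value instead.
void adopt_parent_level()
{
    if (!min_level_ptr)
        min_level_ptr = &min_level;   // parent thread claims the shared pointer
    else
        min_level = *min_level_ptr;   // child inherits the parent's current level
}

void worker()
{
    adopt_parent_level();
    std::printf("worker min_level == %d\n", (int) min_level);
}

int main()
{
    adopt_parent_level();               // main thread claims min_level_ptr
    min_level = trace_level::verbose;   // visible to threads started afterwards
    std::thread t(worker);
    t.join();
}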
I'm currently working on implementing patch files in some code, and apparently one of the patch files uses return 0 in a class outside of main. I know return 0 would close the application if it were in the main function; however, I'm not sure how it would function in a class outside of the main function. Basically, the code could be summed up like this in pseudocode:
boost::uint64_t
namespace::class(etc. etc.)
{
if (method.isValid)
{
//do stuff
}
return 0;
}
Normally when I think of return 0 in C++, I think of exiting the application by calling it in main; however, in this case I'm not sure whether this would exit the application, or just the class's functionality/the class itself. Could someone please explain what the return 0 would actually be doing in this situation?
Thanks,
Flyboy
No.
Think of what would happen if this was the case:
int add(int a, int b) { return a + b; }
// somewhere:
int zero = add(2, -2); // would this exit the program?
It isn't the zero that is important in the return from main, it's the return. You can return any value from main and doing so will cause the program to exit (after all global variables are cleaned up, streams are closed, and other cleanup tasks are completed).
No, returning 0 (or anything else) from a function won't exit the application. Returning from main -- regardless of the value returned -- exits from a (single-threaded) application. But other functions come and go all the time.
return 0 only applies to the scope of the current function, so it will not close the application if it is outside of main.
Returning from main exits the application (regardless of the value being returned -- though the standard only defines meanings for 0, EXIT_SUCCESS, and EXIT_FAILURE). Returning from some other function just returns the designated value (if any) to the caller. The control flow doesn't change just because the value being returned happens to be zero.
I'm writing a memory tracking system, and the only problem I've actually run into is that when the application exits, any static/global objects that didn't allocate in their constructors but do deallocate in their destructors end up deallocating after my memory tracking code has already reported the allocated data as a leak.
As far as I can tell, the only way for me to properly solve this would be to either force the placement of the memory tracker's _atexit callback at the head of the stack (so that it is called last) or have it execute after the entire _atexit stack has been unwound. Is it actually possible to implement either of these solutions, or is there another solution that I have overlooked?
Edit:
I'm working on/developing for Windows XP and compiling with VS2005.
I've finally figured out how to do this under Windows/Visual Studio. Looking through the CRT startup function again (specifically where it calls the initializers for globals), I noticed that it simply runs "function pointers" contained between certain segments. So, with just a little bit of knowledge of how the linker works, I came up with this:
#include <iostream>
using std::cout;
using std::endl;
// Typedef for the function pointer
typedef void (*_PVFV)(void);
// Our various functions/classes that are going to log the application startup/exit
struct TestClass
{
int m_instanceID;
TestClass(int instanceID) : m_instanceID(instanceID) { cout << " Creating TestClass: " << m_instanceID << endl; }
~TestClass() {cout << " Destroying TestClass: " << m_instanceID << endl; }
};
static int InitInt(const char *ptr) { cout << " Initializing Variable: " << ptr << endl; return 42; }
static void LastOnExitFunc() { puts("Called " __FUNCTION__ "();"); }
static void CInit() { puts("Called " __FUNCTION__ "();"); atexit(&LastOnExitFunc); }
static void CppInit() { puts("Called " __FUNCTION__ "();"); }
// our variables to be intialized
extern "C" { static int testCVar1 = InitInt("testCVar1"); }
static TestClass testClassInstance1(1);
static int testCppVar1 = InitInt("testCppVar1");
// Define our segment names
#define SEGMENT_C_INIT ".CRT$XIM"
#define SEGMENT_CPP_INIT ".CRT$XCM"
// Build our various function tables and insert them into the correct segments.
#pragma data_seg(SEGMENT_C_INIT)
#pragma data_seg(SEGMENT_CPP_INIT)
#pragma data_seg() // Switch back to the default segment
// Create our function pointer arrays and place them in the segments defined above
#define SEG_ALLOCATE(SEGMENT) __declspec(allocate(SEGMENT))
SEG_ALLOCATE(SEGMENT_C_INIT) _PVFV c_init_funcs[] = { &CInit };
SEG_ALLOCATE(SEGMENT_CPP_INIT) _PVFV cpp_init_funcs[] = { &CppInit };
// Some more variables just to show that declaration order isn't affecting anything
extern "C" { static int testCVar2 = InitInt("testCVar2"); }
static TestClass testClassInstance2(2);
static int testCppVar2 = InitInt("testCppVar2");
// Main function which prints itself just so we can see where the app actually enters
int main()
{
cout << " Entered Main()!" << endl;
}
which outputs:
Called CInit();
Called CppInit();
Initializing Variable: testCVar1
Creating TestClass: 1
Initializing Variable: testCppVar1
Initializing Variable: testCVar2
Creating TestClass: 2
Initializing Variable: testCppVar2
Entered Main()!
Destroying TestClass: 2
Destroying TestClass: 1
Called LastOnExitFunc();
This works due to the way MS have written their runtime library. Basically, they've set up the following variables in the data segments:
(although this info is copyrighted, I believe this is fair use as it doesn't devalue the original and is only here for reference)
extern _CRTALLOC(".CRT$XIA") _PIFV __xi_a[];
extern _CRTALLOC(".CRT$XIZ") _PIFV __xi_z[]; /* C initializers */
extern _CRTALLOC(".CRT$XCA") _PVFV __xc_a[];
extern _CRTALLOC(".CRT$XCZ") _PVFV __xc_z[]; /* C++ initializers */
extern _CRTALLOC(".CRT$XPA") _PVFV __xp_a[];
extern _CRTALLOC(".CRT$XPZ") _PVFV __xp_z[]; /* C pre-terminators */
extern _CRTALLOC(".CRT$XTA") _PVFV __xt_a[];
extern _CRTALLOC(".CRT$XTZ") _PVFV __xt_z[]; /* C terminators */
On initialization, the program simply iterates from '__xN_a' to '__xN_z' (where N is {i,c,p,t}) and calls any non-null pointers it finds. If we just insert our own segment between the segments '.CRT$XnA' and '.CRT$XnZ' (where, once again, n is {I,C,P,T}), it will be called along with everything else that normally gets called.
The linker simply joins up the segments in alphabetical order. This makes it extremely simple to select when our functions should be called. If you have a look in defsects.inc (found under $(VS_DIR)\VC\crt\src\) you can see that MS have placed all the "user" initialization functions (that is, the ones that initialize globals in your code) in segments ending with 'U'. This means that we just need to place our initializers in a segment earlier than 'U' and they will be called before any other initializers.
You must be really careful not to use any functionality that isn't initialized until after your selected placement of the function pointers (frankly, I'd recommend you just use .CRT$XCT; that way it's only your own code that hasn't been initialized yet. I'm not sure what will happen if you've linked with standard 'C' code; you may have to place it in the .CRT$XIT block in that case).
One thing I did discover was that the "pre-terminators" and "terminators" aren't actually stored in the executable if you link against the DLL versions of the runtime library. Because of this, you can't really use them as a general solution. Instead, the way I made it run my specific function as the last "user" function was simply to call atexit() within the 'C initializers'; this way, no other function could have been added to the stack yet (the atexit stack is called in the reverse order to which functions are added, and is how global/static destructors are all called).
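For reference, here is a tiny sketch of the LIFO behaviour being relied on: atexit() handlers run in the reverse order of registration, so a handler registered before everything else runs last.

#include <cstdio>
#include <cstdlib>

void registered_first()  { std::puts("called last");  }
void registered_second() { std::puts("called first"); }

int main()
{
    std::atexit(registered_first);    // registered first  -> called last at exit
    std::atexit(registered_second);   // registered second -> called first at exit
    return 0;
}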
Just one final (obvious) note: this is written with Microsoft's runtime library in mind. It may work similarly on other platforms/compilers (hopefully you'll be able to get away with just changing the segment names to whatever they use, IF they use the same scheme), but don't count on it.
atexit is processed by the C/C++ runtime (CRT). It runs after main() has already returned. Probably the best way to do this is to replace the standard CRT with your own.
On Windows tlibc is probably a great place to start: http://www.codeproject.com/KB/library/tlibc.aspx
Look at the code sample for mainCRTStartup and just run your code after the call to _doexit(), but before ExitProcess.
Alternatively, you could just get notified when ExitProcess gets called. When ExitProcess gets called the following occurs (according to http://msdn.microsoft.com/en-us/library/ms682658%28VS.85%29.aspx):
All of the threads in the process, except the calling thread, terminate their execution without receiving a DLL_THREAD_DETACH notification.
The states of all of the threads terminated in step 1 become signaled.
The entry-point functions of all loaded dynamic-link libraries (DLLs) are called with DLL_PROCESS_DETACH.
After all attached DLLs have executed any process termination code, the ExitProcess function terminates the current process, including the calling thread.
The state of the calling thread becomes signaled.
All of the object handles opened by the process are closed.
The termination status of the process changes from STILL_ACTIVE to the exit value of the process.
The state of the process object becomes signaled, satisfying any threads that had been waiting for the process to terminate.
So, one method would be to create a DLL and have that DLL attach to the process. It will get notified when the process exits, which should be after atexit has been processed.
Obviously, this is all rather hackish, proceed carefully.
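As a sketch of that approach (illustrative only; keep DllMain minimal, since the loader lock is held and the CRT may already be partially torn down by the time DLL_PROCESS_DETACH arrives):

#include <windows.h>

BOOL WINAPI DllMain(HINSTANCE hinstDLL, DWORD fdwReason, LPVOID lpvReserved)
{
    if (fdwReason == DLL_PROCESS_DETACH)
    {
        // The process is exiting (or this DLL is being unloaded):
        // one of the last chances to report outstanding allocations.
    }
    return TRUE;
}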
This is dependent on the development platform. For example, Borland C++ has a #pragma which could be used for exactly this. (From Borland C++ 5.0, c. 1995)
#pragma startup function-name [priority]
#pragma exit function-name [priority]
These two pragmas allow the program to specify function(s) that should be called either upon program startup (before the main function is called), or program exit (just before the program terminates through _exit).
The specified function-name must be a previously declared function as:
void function-name(void);
The optional priority should be in the range 64 to 255, with highest priority at 0; default is 100. Functions with higher priorities are called first at startup and last at exit. Priorities from 0 to 63 are used by the C libraries, and should not be used by the user.
Perhaps your C compiler has a similar facility?
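For example, GCC and Clang offer a comparable facility through function attributes (a sketch; priorities 0-100 are reserved for the implementation, and lower-numbered constructors run earlier):

#include <stdio.h>

__attribute__((constructor(101)))
static void my_startup(void)
{
    puts("runs before main()");
}

__attribute__((destructor(101)))
static void my_cleanup(void)
{
    puts("runs during program termination");
}

int main(void)
{
    puts("main()");
    return 0;
}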
I've read multiple times you can't guarantee the construction order of global variables (cite). I'd think it is pretty safe to infer from this that destructor execution order is also not guaranteed.
Therefore, if your memory tracking object is global, you almost certainly cannot guarantee that it will be destructed last (or constructed first). If it's not destructed last and other allocations are still outstanding, then yes, it will report the leaks you mention.
Also, what platform is this _atexit function defined for?
Having the memory tracker's cleanup executed last is the best solution. The easiest way I've found to do that is to explicitly control all the relevant global variables' initialization order. (Some libraries hide their global state in fancy classes or otherwise, thinking they're following a pattern, but all they do is prevent this kind of flexibility.)
Example main.cpp:
#include "global_init.inc"
int main() {
// do very little work; all initialization, main-specific stuff
// then call your application's mainloop
}
Where the global-initialization file includes object definitions and #includes similar non-header files. Order the objects in this file in the order you want them constructed, and they'll be destructed in the reverse order. 18.3/8 in C++03 guarantees that destruction order mirrors construction: "Non-local objects with static storage duration are destroyed in the reverse order of the completion of their constructor." (That section is talking about exit(), but a return from main is the same, see 3.6.1/5.)
As a bonus, you're guaranteed that all globals (in that file) are initialized before entering main. (Something not guaranteed in the standard, but allowed if implementations choose.)
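A self-contained sketch of the ordering being relied on (the names are made up; in a real project the object definitions would live in global_init.inc and be included once from main.cpp):

#include <cstdio>

struct Tracked
{
    const char *name;
    explicit Tracked(const char *n) : name(n) { std::printf("construct %s\n", name); }
    ~Tracked() { std::printf("destroy %s\n", name); }
};

// Defined in one translation unit, so construction runs top to bottom
// and destruction runs bottom to top (C++03 18.3/8).
Tracked g_memTracker("memory tracker");   // constructed first, destroyed last
Tracked g_logger("logger");               // constructed second, destroyed first

int main()
{
    std::printf("main\n");
}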
I've had this exact problem, also writing a memory tracker.
A few things:
Along with destruction, you also need to handle construction. Be prepared for malloc/new to be called BEFORE your memory tracker is constructed (assuming it is written as a class). So you need your class to know whether it has been constructed or destructed yet!
class MemTracker
{
enum State
{
unconstructed = 0, // must be 0 !!!
constructed,
destructed
};
State state;
MemTracker()
{
if (state == unconstructed)
{
// construct...
state = constructed;
}
}
};
static MemTracker memTracker; // all statics are zero-initialized before any code runs
On every allocation that calls into your tracker, construct it!
MemTracker::malloc(...)
{
// force call to constructor, which does nothing after first time
new (this) MemTracker();
...
}
Strange, but true. Anyhow, onto destruction:
~MemTracker()
{
OutputLeaks(file);
state = destructed;
}
So, on destruction, output your results. Yet we know that there will be more calls. What to do? Well,...
MemTracker::free(void * ptr)
{
do_tracking(ptr);
if (state == destructed)
{
// we must be getting called late
// so re-output
// Note that this might happen a lot...
OutputLeaks(file); // again!
}
}
And lastly:
be careful with threading
be careful not to call malloc/free/new/delete inside your tracker, or be able to detect the recursion, etc :-)
EDIT:
and I forgot: if you put your tracker in a DLL, you will probably need to call LoadLibrary() (or dlopen, etc.) on yourself to bump your reference count, so that you don't get unloaded from memory prematurely. Although your class can still be called after destruction, it can't be if the code has been unloaded.