I'm performing a long task, implemented in some function, in a new thread (i.e. not the main GUI thread) in FLTK (in C++).
I achieve this with a callback function that in turn creates the thread so that I have something of the form
void callback(Fl_Widget* widget,void* passed_data){
data_type* data = new data_type;
data->value = x; //populate data structure to send to function
fl_create_thread(thread1,function,data);
}
where fl_create_thread (at least for my purposes) is just using pthread_create meaning the data variable is passed as a void pointer and so 'function' takes a void pointer too.
I realised that this would actually create a memory leak as I don't delete 'data':
I can't delete it after the line with fl_create_thread as the thread hasn't necessarily (or ever) finished running. I have tried deleting the pointer at the end of 'function' but this raises two issues
1) Deleting a void pointer is undefined and so I am getting warnings to that effect.
2) This almost defeats the point of using a function: is there a better general coding practice?
Can anyone tell me how I should approach this? Thanks.
1) Deleting a void pointer is undefined and so I am getting warnings
to that effect.
You're probably casting to the proper type at the very beginning of your function (or you should be, in order to use the parameter meaningfully). Delete that typed pointer instead of the void * parameter and you will not get a warning.
2) This almost defeats the point of using a function: is there a
better general coding practice?
Not sure I see your point. This thread interface (pthread) is very C-like, so it has to resort to void pointers in order to pass arbitrary data. You can look at the C++ thread interface (std::thread) for a more C++ way of defining what to execute and with what parameters.
With memory, you should always be clear about who owns each allocation. This approach transfers ownership to the newly created thread.
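A minimal sketch of that ownership hand-off, assuming plain pthreads rather than FLTK's wrapper. The names Job, worker and run_job are made up for illustration, and the join is only there so the example can return a result; a GUI callback would normally not block like this.

```cpp
#include <pthread.h>
#include <memory>

// Hypothetical payload; stands in for the question's data_type.
struct Job {
    int value;
    int* result;  // where the worker stores its answer (owned by the caller)
};

// Thread entry point with the signature pthread_create requires.
extern "C" void* worker(void* raw) {
    // Cast back to the real type first. The unique_ptr now owns the
    // allocation and deletes it as a Job (not as void*) on return.
    std::unique_ptr<Job> job(static_cast<Job*>(raw));
    *job->result = job->value * 2;
    return nullptr;
}

// Starts the worker, waits for it, and returns what it computed.
int run_job(int value) {
    int result = 0;
    Job* job = new Job{value, &result};
    pthread_t tid;
    if (pthread_create(&tid, nullptr, worker, job) != 0) {
        delete job;  // the thread never started, so ownership stays here
        return -1;
    }
    // From here on the worker owns job; this function must not touch it.
    pthread_join(tid, nullptr);
    return result;
}
```

The caller allocates, the thread deletes, and nobody ever calls delete on a void pointer.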
I am using boost 1.55 (io_service doc). I need to call the destructor on my io_service to reset it after power is cycled on my serial device to get new data. The problem is that when the destructor is called twice (re-trying connection), I get a segmentation fault.
In header file
boost::asio::io_service io_service_port_1;
In function that closes connection
io_service_port_1.stop();
io_service_port_1.reset();
io_service_port_1.~io_service(); // how to check for NULL?
// do I need to re-construct it?
The following does not work:
if (io_service_port_1)
if (io_service_port_1 == NULL)
Thank you.
If you need manual control over when the object is created and destroyed, you should be wrapping it in a std::unique_ptr object.
std::unique_ptr<boost::asio::io_service> service_ptr =
std::make_unique<boost::asio::io_service>();
/*Do stuff until connection needs to be reset*/
service_ptr->stop();
//I don't know your specific use case, but the call to io_service's member function reset is probably unnecessary.
//service_ptr->reset();
service_ptr.reset();//"reset" is a member function of unique_ptr, will delete object.
/*For your later check*/
if(service_ptr) //returns true if a valid object exists in the pointer
if(!service_ptr) //returns true if no object is being pointed to.
Generally speaking, you should never directly call ~object_name();. Ever. Ever. Ever. There are several reasons why:
As a normal part of stack unwinding, it will get called anyway when the method returns.
Deleting a pointer will call it.
"Smart pointers" (like std::unique_ptr and std::shared_ptr) will call it when they self-destruct.
Directly calling ~object_name(); should only ever be done in rare cases, usually involving Allocators, and even then, there are usually cleaner solutions.
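To show the destroy-and-recreate cycle without dragging in Boost, here is a small sketch using a stand-in class. Service is hypothetical and only counts live instances; in the real code it would be boost::asio::io_service. The helper names reconnect and reconnect_demo are also made up.

```cpp
#include <memory>

// Hypothetical stand-in for boost::asio::io_service; counts live instances.
struct Service {
    static int live;
    Service() { ++live; }
    ~Service() { --live; }
};
int Service::live = 0;

// Destroys the current service (if any) and builds a fresh one.
void reconnect(std::unique_ptr<Service>& service) {
    service.reset();                        // runs ~Service() once; no-op if empty
    service = std::make_unique<Service>();  // brand-new object for the next attempt
}

// Returns the number of Service objects still alive at the end.
int reconnect_demo() {
    std::unique_ptr<Service> service = std::make_unique<Service>();
    reconnect(service);
    reconnect(service);  // retrying the connection is safe
    service.reset();
    service.reset();     // resetting an empty pointer does nothing
    return Service::live;
}
```

Because the destructor only ever runs through the unique_ptr, it cannot run twice on the same object, which is what caused the original segmentation fault.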
I'm aware of the threading issues etc that this could cause and of its dangers but I need to know how to do this for a security project I am doing at school. I need to know how to call a function in a remote address space of a given calling convention with arguments - preferably recovering the data the remote function has returned though its really not required that I do.
If I can get specifics from the remote function's function prototype at compile time, I will be able to make this method work. I need to know how big the arguments are and if the arguments are explicitly declared as pointers or not (void*, char*, int*, etc...)
I.e if I define a function prototype like:
typedef void (__cdecl *testFunc_t)(int* pData);
I would need to, at compile time, get the size of arguments at least, and if I could, which ones are pointers or not. Here we are assuming the remote function is either an stdcall or _cdecl call.
The IDE I am using is Microsoft Visual Studio 2007 in case the solution is specific to a particular product.
Here is my plan:
Create a thread in the remote process using CreateRemoteThread at the origin of the function I want to call, though I would do so in a suspended state.
I would set up the stack such that the return address was that of a stub of code allocated inside the process that would call ExitThread(eax), as this would exit the thread with the function's return value. I would then recover this by using GetExitCodeThread.
I would also copy the arguments for the function call from my local stack to that of the newly created thread - this is where I need to know if function arguments are pointers and the size of the arguments.
Resume the thread and wait for it to exit, at which point I will return to the caller with the thread's exit code.
I know that this should be doable at compile time but whether the compiler has some method I can use to do it, I'm not sure. I'm also aware all this data can be easily recovered from a PDB file created after compiling the code and that the size of arguments might change if the compiler performs optimizations. I don't need to be told how dangerous this is, as I am fully aware of it, but this is not a commercial product but a small project I must do for school.
The question:
If I have a function prototype such as
typedef void (__cdecl *testFunc_t)(int* pData);
Is there any way I can get the size of this prototype's arguments at compile time (i.e. in the above example, the arguments would sum to a total size of sizeof(int*))? If, for example, I have a function like:
template<typename T> unsigned long getPrototypeArgLength()
{
//would return size of arguments described in the prototype T
}
//when called as
getPrototypeArgLength<testFunc_t>()
This seems like quite a school project...
For step 3 you can use ReadProcessMemory / WriteProcessMemory (one of them). For example, the new thread could receive the address (on the calling process), during the thread creation, of the parameters on the start (begin and end). Then it could read the caller process memory from that region and copy it to its own stack.
Did you consider using COM for this whole thing? you could probably get things done much easier if you use a mechanism that was designed especially for that.
Alright, I figured out that I can use the Boost library to get a lot of type information at compile time. Specifically, I am using boost::function_traits; if you look around the Boost library, you will find that you can recover quite a bit of information. Here's a bit of code I wrote to demonstrate how to get the number of arguments of a function prototype.
(Actually, I haven't tested the code below; it's just something I'm throwing together from another function I've made and tested.)
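#include <boost/type_traits/function_traits.hpp>
#include <boost/type_traits/remove_pointer.hpp>
#include <boost/typeof/typeof.hpp>
template<typename T>
unsigned long getArgCount()
{
// typename is required because remove_pointer<T>::type is a dependent type
return boost::function_traits<typename boost::remove_pointer<T>::type>::arity;
}
void (*pFunc)(int, int);
unsigned long argCount = getArgCount<BOOST_TYPEOF(pFunc)>(); // argCount == 2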
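For what it's worth, the same information can be pulled out without Boost if a C++11 compiler is available (which the Visual Studio version in the question would not have been). The sketch below is only that, a sketch: ArgInfo, SizeSum, PointerCount and twoArgs_t are names invented here, the calling-convention keyword is left out, and it sums plain sizeof values, which is not necessarily the stack space a given calling convention reserves per argument.

```cpp
#include <cstddef>
#include <type_traits>

// Sums sizeof(T) over a list of types (C++11, no fold expressions needed).
template <typename... Ts>
struct SizeSum { static constexpr std::size_t value = 0; };

template <typename T, typename... Rest>
struct SizeSum<T, Rest...> {
    static constexpr std::size_t value = sizeof(T) + SizeSum<Rest...>::value;
};

// Counts how many of the listed types are pointers.
template <typename... Ts>
struct PointerCount { static constexpr std::size_t value = 0; };

template <typename T, typename... Rest>
struct PointerCount<T, Rest...> {
    static constexpr std::size_t value =
        (std::is_pointer<T>::value ? 1 : 0) + PointerCount<Rest...>::value;
};

// Primary template left undefined: only function pointer types are supported.
template <typename F>
struct ArgInfo;

template <typename R, typename... Args>
struct ArgInfo<R (*)(Args...)> {
    static constexpr std::size_t count = sizeof...(Args);
    static constexpr std::size_t total_size = SizeSum<Args...>::value;
    static constexpr std::size_t pointers = PointerCount<Args...>::value;
};

// Function wrappers, convenient in ordinary expressions.
template <typename F> constexpr std::size_t argCount() { return ArgInfo<F>::count; }
template <typename F> constexpr std::size_t argBytes() { return ArgInfo<F>::total_size; }
template <typename F> constexpr std::size_t argPointers() { return ArgInfo<F>::pointers; }

typedef void (*testFunc_t)(int* pData);   // prototype from the question, minus __cdecl
typedef int (*twoArgs_t)(char, double*);  // hypothetical second prototype

static_assert(argCount<testFunc_t>() == 1, "one argument");
static_assert(argBytes<testFunc_t>() == sizeof(int*), "size of the single pointer");
static_assert(argPointers<twoArgs_t>() == 1, "only the double* is a pointer");
```

Everything is resolved at compile time, so the static_assert lines double as the test.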
I know it looks unnecessary, but I hope it will help me find a memory leak.
So, given a member function of a class that returns int, how can I call it from another member function of the same class so that it runs in another thread?
You are trying to find a memory leak in a function by having it called from another thread? That is like trying to find a needle in a haystack by adding more hay to the stack.
Thread programming 101:
Spawn a new thread ("thread2") that invokes a new function ("foo").
Have the original thread join against thread2 immediately after the spawn.
Read a global variable that foo() has written its final value to.
Notice that foo() cannot return its value to the original thread; it must write the value to some shared memory (i.e. a global variable). Also note that this will not solve your memory leak problem, or even make it obvious where your memory leak is coming from.
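The spawn, join, read sequence above could look roughly like this with std::thread (assuming C++11). Worker, compute and compute_in_thread are hypothetical names, and a local variable captured by reference plays the role of the shared variable instead of a global.

```cpp
#include <thread>

class Worker {
public:
    // Hypothetical member function that returns int.
    int compute() { return 6 * 7; }

    // Runs compute() on a second thread and hands back its result.
    int compute_in_thread() {
        int result = 0;  // shared slot the new thread writes to
        std::thread t([this, &result] { result = compute(); });
        t.join();        // wait first; only then is it safe to read result
        return result;
    }
};
```

As the answer says, this gains nothing for leak hunting; it just shows the mechanics.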
Look for memory leaks with Valgrind. And read a book or tutorial about multithreading.
The operating system will not reclaim memory leaks in worker threads. That's not how it works.
Fix your bugs. The world doesn't need any more crappy software.
I am not very good in multithreading programming so I would like to ask for some help/advice.
In my application I have two threads trying to access a shared object.
One can think of two tasks trying to call functions from within another object. For clarity I will show some parts of the program which may not be very relevant but will hopefully help explain my problem better.
Please take a look at the sample code below:
//DataLinkLayer.h
class DataLinkLayer: public iDataLinkLayer {
public:
DataLinkLayer(void);
~DataLinkLayer(void);
};
Where iDataLinkLayer is an interface (abstract class without any implementation) containing pure virtual functions and a reference (pointer) declaration to the instance of the DataLinkLayer object (dataLinkLayer).
// DataLinkLayer.cpp
#include "DataLinkLayer.h"
DataLinkLayer::DataLinkLayer(void) {
/* In reality task constructors takes bunch of other parameters
but they are not relevant (I believe) at this stage. */
dll_task_1* task1 = new dll_task_1(this);
dll_task_2* task2 = new dll_task_2(this);
/* Start multithreading */
task1->start(); // task1 extends thread class
task2->start(); // task2 also extends thread class
}
/* sample stub functions for testing */
void DataLinkLayer::from_task_1() {
printf("Test data Task 1");
}
void DataLinkLayer::from_task_2() {
printf("Test data Task 2");
}
The implementation of task 1 is below. The dataLinkLayer interface (iDataLinkLayer) pointer is passed to the class constructor in order to be able to access the necessary functions from within the dataLinkLayer instance.
//data_task_1.cpp
#include "iDataLinkLayer.h" // interface to DataLinkLayer
#include "data_task_1.h"
dll_task_1::dll_task_1(iDataLinkLayer* pDataLinkLayer) {
this->dataLinkLayer = pDataLinkLayer; // dataLinkLayer declared in dll_task_1.h
}
// Run method - executes the thread
void dll_task_1::run() {
// program reaches this point and prints the stuff
this->dataLinkLayer->from_task_1();
}
// more stuff following - not relevant to the problem
...
And task 2 looks similar:
//data_task_2.cpp
#include "iDataLinkLayer.h" // interface to DataLinkLayer
#include "data_task_2.h"
dll_task_2::dll_task_2(iDataLinkLayer* pDataLinkLayer){
this->dataLinkLayer = pDataLinkLayer; // dataLinkLayer declared in dll_task_2.h
}
// Run method - executes the thread
void dll_task_2::run() {
// ERROR: 'Access violation reading location 0xcdcdcdd9' is signalled at this point
this->dataLinkLayer->from_task_2();
}
// more stuff following - not relevant to the problem
...
If I understand correctly, I am accessing the shared pointer from two different threads (tasks), and that is not allowed.
Frankly, I thought I would still be able to access the object, even if the results might be unexpected.
It seems that something goes terribly wrong at the point when dll_task_2 tries to call the function using the pointer to the DataLinkLayer. dll_task_2 has lower priority, hence it is started afterwards. I don't understand why I still cannot at least access the object...
I can use the mutex to lock the variable but I thought that the primary reason for this is to protect the variable/object.
I am using Microsoft Visual C++ 2010 Express.
I don't know much about multithreading so maybe you can suggest a better solution to this problem as well as explain the reason of the problem.
The address of the access violation is a very small positive offset from 0xcdcdcdcd
Wikipedia says:
CDCDCDCD Used by Microsoft's C++ debugging runtime library to mark uninitialised heap memory
Here is the relevant MSDN page.
The corresponding value after free is 0xdddddddd, so it's likely to be incomplete initialization rather than use-after-free.
EDIT: James asked how optimization could mess up virtual function calls. Basically, it's because the currently standardized C++ memory model makes no guarantees about threading. The C++ standard defines that virtual calls made from within a constructor will use the declaring type of the constructor currently being run, not the final dynamic type of the object. So this means that, from the perspective of the C++ sequential execution memory model, the virtual call mechanism (practically speaking, a v-table pointer) must be set up before the constructor starts running (I believe the specific point is after base subobject construction in the ctor-initializer-list and before member subobject construction).
Now, two things can happen to make the observable behavior different in a threaded scenario:
First, the compiler is free to perform any optimization that would, in the C++ sequential execution model, act as-if the rules were being followed. For example, if the compiler can prove that no virtual calls are made inside the constructor, it could wait and set the v-table pointer at the end of the constructor body instead of the beginning. If the constructor doesn't give out the this pointer, since the caller of the constructor also hasn't received its copy of the pointer yet, then none of the functions called by the constructor can call back (virtually or statically) to the object under construction. But the constructor DOES give away the this pointer.
We have to look closer. If the function to which the this pointer is given is visible to the compiler (i.e. included in the current compilation unit), then the compiler can include its behavior in the analysis. We weren't given that function in this question (the constructor and member functions of class task), but it seems likely that the only thing that happens is that said pointer is stored in a subobject which is also not reachable from outside the constructor.
"Foul!", you cry, "I passed the address of that task subobject to a library CreateThread function, therefore it is reachable and through it, the main object is reachable." Ah, but you do not comprehend the mysteries of the "strict aliasing rules". That library function does not accept a parameter of type task *, now does it? And being a parameter whose type is perhaps intptr_t, but definitely neither task * nor char *, the compiler is permitted to assume, for purposes of as-if optimization, that it does not point to a task object (even if it clearly does). And if it does not point to a task object, and the only place our this pointer got stored is in a task member subobject, then it cannot be used to make virtual calls to this, so the compiler may legitimately delay setting up the virtual call mechanism.
But that's not all. Even if the compiler does set up the virtual call mechanism on schedule, the CPU memory model only guarantees that the change is visible to the current CPU core. Writes may become visible to other CPU cores in a completely different order. Now, the library create thread function ought to introduce a memory barrier that constrains CPU write reordering, but the fact that Koz's answer introducing a critical section (which definitely includes a memory barrier) changes the behavior suggests that perhaps no memory barrier was present in the original code.
And, CPU write reordering can not only delay the v-table pointer, but the storage of the this pointer into the task subobject.
I hope you have enjoyed this guided tour of one small corner of the cave of "multithreaded programming is hard".
printf is not, afaik, thread safe. Try surrounding the printf with a critical section.
To do this, call InitializeCriticalSection inside the iDataLinkLayer class. Then around the printfs you need an EnterCriticalSection and a LeaveCriticalSection. This will prevent both functions from entering the printf simultaneously.
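A portable sketch of the same idea, with std::mutex swapped in for the Win32 CRITICAL_SECTION (so this is not the Windows API itself) and a shared string standing in for the output stream. log_mutex, log_buffer, log_line and log_demo are all made-up names.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>

std::mutex log_mutex;    // plays the role of the CRITICAL_SECTION
std::string log_buffer;  // shared resource, standing in for the output stream

// Appends one line; only one thread at a time gets past the lock.
void log_line(const std::string& text) {
    std::lock_guard<std::mutex> lock(log_mutex);  // "enter"; "leave" happens at scope exit
    log_buffer += text;
    log_buffer += '\n';
}

// Two threads write 100 lines each; returns the final buffer size.
std::size_t log_demo() {
    log_buffer.clear();
    std::thread a([] { for (int i = 0; i < 100; ++i) log_line("Test data Task 1"); });
    std::thread b([] { for (int i = 0; i < 100; ++i) log_line("Test data Task 2"); });
    a.join();
    b.join();
    return log_buffer.size();
}
```

With the lock in place every line arrives whole, whichever order the threads happen to run in.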
Edit: Try changing this code:
dll_task_1* task1 = new task(this);
dll_task_2* task2 = new task(this);
to
dll_task_1* task1 = new dll_task_1(this);
dll_task_2* task2 = new dll_task_2(this);
I'm guessing that task is in fact the base class of dll_task_1 and dll_task_2 ... so, more than anything, I'm surprised it compiles ....
I think it's not always safe to use 'this' (i.e. to call a member function) before the end of the constructor. It could be that the tasks are calling a member function of DataLinkLayer before the DataLinkLayer constructor has finished. This is especially risky if the member function is virtual:
http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.7
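One way to avoid handing out this too early is to keep the constructor free of thread starts and add a separate start step. The sketch below assumes std::thread instead of the question's own thread class, and Layer, start, stop and layer_demo are hypothetical names.

```cpp
#include <atomic>
#include <thread>

class Layer {
public:
    Layer() : calls_(0) {}  // constructor only initialises members
    ~Layer() { stop(); }    // never destroy a std::thread that is still joinable

    // Called by the owner once construction has completely finished.
    void start() {
        task1_ = std::thread([this] { from_task(); });
        task2_ = std::thread([this] { from_task(); });
    }

    // Waits for both tasks; safe to call more than once.
    void stop() {
        if (task1_.joinable()) task1_.join();
        if (task2_.joinable()) task2_.join();
    }

    int calls() const { return calls_.load(); }

private:
    void from_task() { ++calls_; }  // stub, like from_task_1/from_task_2
    std::atomic<int> calls_;
    std::thread task1_;
    std::thread task2_;
};

// Builds the object fully, then starts the tasks and waits for them.
int layer_demo() {
    Layer layer;
    layer.start();
    layer.stop();
    return layer.calls();
}
```

By the time either task runs, the object (including any v-table pointer) is fully set up.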
I wanted to comment on the creation of the DataLinkLayer.
When I call the DataLinkLayer constructor from main:
int main () {
DataLinkLayer* dataLinkLayer = new DataLinkLayer();
while(true); // to keep the main thread running
}
I, of course, do not destruct the object; that is the first point. Now, inside the DataLinkLayer constructor I initialize many other object instances (not only these two tasks) and pass the dataLinkLayer pointer (using this) to most of them. This is legal, as far as I am concerned. What is more, it compiles and runs as expected.
What I became curious about is the overall design idea that I am following (if any :) ).
The DataLinkLayer is a parent class that is accessed by several tasks which try to modify its parameters or perform some other processing. Since I want everything to remain as decoupled as possible, I provide only interfaces for the accessors and encapsulate the data so that I don't have any global variables, friend functions etc.
It would have been a pretty easy task if multithreading were not involved. I believe I will encounter many other pitfalls on my way.
Feel free to discuss it please and merci for your generous comments!
UPD:
Speaking of passing the iDataLinkLayer interface pointer to the tasks: is this a good way to do it? In Java it would be pretty usual to use containment or the so-called strategy pattern to keep things decoupled. However, I am not 100% sure whether it is a good solution in C++... Any suggestions/comments on it?
All thread create methods like pthread_create() or CreateThread() in Windows expect the caller to provide a pointer to the arg for the thread. Isn't this inherently unsafe?
This can work 'safely' only if the arg is in the heap, and then again creating a heap variable adds to the overhead of cleaning the allocated memory up. If a stack variable is provided as the arg then the result is at best unpredictable.
This looks like a half-cooked solution to me, or am I missing some subtle aspect of the APIs?
Context.
Many C APIs provide an extra void * argument so that you can pass context through third party APIs. Typically you might pack some information into a struct and point this variable at the struct, so that when the thread initializes and begins executing it has more information than just the particular function that it's started with. There's no necessity to keep this information at the location given. For instance you might have several fields that tell the newly created thread what it will be working on, and where it can find the data it will need. Furthermore there's no requirement that the void * actually be used as a pointer. It's a typeless argument with the most appropriate width on a given architecture (pointer width), through which anything can be made available to the new thread. For instance you might pass an int directly if sizeof(int) <= sizeof(void *): (void *)3.
As a related example of this style: A FUSE filesystem I'm currently working on starts by opening a filesystem instance, say struct MyFS. When running FUSE in multithreaded mode, threads arrive onto a series of FUSE-defined calls for handling open, read, stat, etc. Naturally these can have no advance knowledge of the actual specifics of my filesystem, so this is passed in the fuse_main function void * argument intended for this purpose. struct MyFS *blah = myfs_init(); fuse_main(..., blah);. Now when the threads arrive at the FUSE calls mentioned above, the void * received is converted back into struct MyFS * so that the call can be handled within the context of the intended MyFS instance.
Isn't this inherently unsafe?
No. It is a pointer. Since you (as the developer) have created both the function that will be executed by the thread and the argument that will be passed to the thread you are in full control. Remember this is a C API (not a C++ one) so it is as safe as you can get.
This can work 'safely' only if the arg is in the heap,
No. It is safe as long as its lifespan in the parent thread is as long as the lifetime that it can be used in the child thread. There are many ways to make sure that it lives long enough.
and then again creating a heap variable adds to the overhead of cleaning the allocated memory up.
Seriously? That's an argument? This is basically how it is done for all threads, unless you are passing something much simpler like an integer (see below).
If a stack variable is provided as the arg then the result is at best unpredictable.
It's as predictable as you (the developer) make it. You created both the thread and the argument. It is your responsibility to make sure that the lifetime of the argument is appropriate. Nobody said it would be easy.
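As a small illustration of a stack argument being perfectly predictable, here is a sketch where the parent joins the child before the variable goes out of scope. Args, square and square_in_thread are hypothetical names.

```cpp
#include <pthread.h>

// Hypothetical argument block: one input, one output.
struct Args {
    int input;
    int output;
};

extern "C" void* square(void* raw) {
    Args* args = static_cast<Args*>(raw);
    args->output = args->input * args->input;
    return nullptr;
}

// Passes a stack variable to the thread and outlives it by joining.
int square_in_thread(int n) {
    Args args = {n, 0};  // lives on this function's stack
    pthread_t tid;
    if (pthread_create(&tid, nullptr, square, &args) != 0) {
        return -1;
    }
    pthread_join(tid, nullptr);  // args must stay alive until this returns
    return args.output;
}
```

The trouble only starts if the function returns (or the thread is detached) while the child is still using args.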
This looks like a half-cooked solution to me, or am i missing some subtle aspects of the APIs?
You are missing that this is the most basic of threading APIs. It is designed to be as flexible as possible so that safer systems can be developed with as few strings attached as possible. So we now have boost::thread, which I guess is built on top of these basic threading facilities but provides a much safer and easier to use infrastructure (at some extra cost).
If you want RAW unfettered speed and flexibility, use the C API (with some danger).
If you want something slightly safer, use a higher level API like boost::thread (but it is slightly more costly).
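To give a feel for the higher level style, here is a sketch using std::thread, the standard library facility that was modelled on boost::thread (so it is a swapped-in stand-in, not Boost itself). Arguments keep their types and there is no void * in sight. sum_range, sum_in_thread and sum_demo are made-up names.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Ordinary typed function; no casting from void* required.
void sum_range(const std::vector<int>& values, long& total) {
    total = 0;
    for (int v : values) total += v;
}

// Runs sum_range on another thread and returns its result.
long sum_in_thread(const std::vector<int>& values) {
    long total = 0;
    // std::cref/std::ref pass references; by default arguments are copied.
    std::thread t(sum_range, std::cref(values), std::ref(total));
    t.join();
    return total;
}

long sum_demo() {
    std::vector<int> values;
    for (int i = 1; i <= 4; ++i) values.push_back(i);
    return sum_in_thread(values);
}
```

The lifetime question does not go away (values and total must still outlive the thread), but the compiler now checks the argument types.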
Thread specific storage with no dynamic allocation (Example)
#include <pthread.h>
#include <cstdint> // intptr_t
#include <iostream>
struct ThreadData
{
// Stuff for my thread.
};
ThreadData threadData[5];
extern "C" void* threadStart(void* data);
void* threadStart(void* data)
{
intptr_t id = reinterpret_cast<intptr_t>(data);
ThreadData& tData = threadData[id];
// Do Stuff
return NULL;
}
int main()
{
for(intptr_t loop = 0;loop < 5; ++loop)
{
pthread_t threadInfo; // Not good just makes the example quick to write.
pthread_create(&threadInfo, NULL, threadStart, reinterpret_cast<void*>(loop));
}
// You should wait here for threads to finish before exiting.
}
Allocation on the heap does not add a lot of overhead.
Besides the heap and the stack, global variable space is another option. Also, it's possible to use a stack frame that will last as long as the child thread. Consider, for example, local variables of main.
I favor putting the arguments to the thread in the same structure as the pthread_t object itself. So wherever you put the pthread record, put its arguments as well. Problem solved :v) .
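Roughly what that could look like, as a sketch only. Task, task_main and run_tasks are hypothetical names, and the array simply lives on the stack of the function that joins the threads.

```cpp
#include <pthread.h>

// The thread handle and the thread's arguments share one record.
struct Task {
    pthread_t thread;
    int input;
    int output;
};

extern "C" void* task_main(void* raw) {
    Task* task = static_cast<Task*>(raw);
    task->output = task->input + 1;
    return nullptr;
}

// Starts three tasks, joins them, and returns the sum of their outputs.
int run_tasks() {
    Task tasks[3];
    int started = 0;
    for (int i = 0; i < 3; ++i) {
        tasks[i].input = i * 10;
        tasks[i].output = 0;
        if (pthread_create(&tasks[i].thread, nullptr, task_main, &tasks[i]) != 0) {
            break;  // stop launching; join only what actually started
        }
        ++started;
    }
    int sum = 0;
    for (int i = 0; i < started; ++i) {
        pthread_join(tasks[i].thread, nullptr);
        sum += tasks[i].output;
    }
    return sum;
}
```

Whatever keeps the pthread_t alive long enough to be joined keeps the arguments alive too.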
This is a common idiom in all C programs that use function pointers, not just for creating threads.
Think about it. Suppose your function void f(void (*fn)()) simply calls into another function. There's very little you can actually do with that. Typically a function pointer has to operate on some data. Passing in that data as a parameter is a clean way to accomplish this, without, say, the use of global variables. Since the function f() doesn't know what the purpose of that data might be, it uses the ever-generic void * parameter, and relies on you the programmer to make sense of it.
If you're more comfortable with thinking in terms of object-oriented programming, you can also think of it like calling a method on a class. In this analogy, the function pointer is the method and the extra void * parameter is the equivalent of what C++ would call the this pointer: it provides you some instance variables to operate on.
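The analogy can be made literal with a static "trampoline" that turns the void * back into an object. In this sketch call_twice is a made-up C-style API, and Counter with its on_event function is equally hypothetical.

```cpp
// C-style API: a callback plus an opaque context pointer (hypothetical).
typedef void (*callback_t)(void* context);

void call_twice(callback_t fn, void* context) {
    fn(context);
    fn(context);
}

class Counter {
public:
    Counter() : count_(0) {}
    void increment() { ++count_; }
    int count() const { return count_; }

    // Static trampoline: recovers the object from the context pointer,
    // then forwards to the real member function.
    static void on_event(void* context) {
        static_cast<Counter*>(context)->increment();
    }

private:
    int count_;
};

// Registers a Counter with the C-style API and reports how often it fired.
int counter_demo() {
    Counter counter;
    call_twice(&Counter::on_event, &counter);
    return counter.count();
}
```

The same pattern is what C++ wrappers around pthread_create or CreateThread typically do under the hood.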
The pointer is a pointer to the data that you intend to use in the function. Windows style APIs require that you give them a static or global function.
Often this is a pointer to the object you intend to use (a pointer to this, or pThis if you will), and the intention is that you will delete pThis after the thread has ended.
It's a very procedural approach, but it has a very big advantage which is often overlooked: the CreateThread C style API is binary compatible, so you can actually wrap it with a C++ class (or almost any other language). If the parameter were typed, you wouldn't be able to access it from another language as easily.
So yes, this is unsafe but there's a good reason for it.