I've read that different threads share the same memory segments apart from the stack. There is something I've been trying to understand. I have a class which creates different threads. Below is a simple example of what I am doing, creating one thread in the constructor.
When an object of this class is created, in main() for example, all the threads can access the same member variables. If every thread gets its own stack, why doesn't each thread get a copy of the member variables rather than access to the same variables? I have looked around and I'm trying to get a picture in my mind of what is going on in memory here with the different stack frames. Many thanks in advance for any replies.
////////////////////////
MyClass::MyClass()
{
    t1 = std::thread([this] { this->threadFunc(); });
    t1.detach();
}
/////////////////////////
void MyClass::threadFunc()
{
    // do stuff... update member variables etc.
}
I've read that different threads share the same memory segments apart from the stack.
This is not entirely accurate. Threads share the same address space, as opposed to being sandboxed from each other the way processes are.
The stack is just some memory in your application that has been specifically reserved and is used to hold things such as function parameters, local variables, and other function-related information.
Every thread has its own stack. This means that when a particular thread is executing, it will use its own specific stack to avoid trampling over other threads which might be idle or executing simultaneously on a multi-core system.
Remember that these stacks are all still inside the same address space, which means that any thread can access the contents of another thread's stack.
A simple example:
#include <iostream>
#include <thread>
void Foo(int& i)
{
    // If thread t is executing this function then j sits on thread t's stack;
    // if we call this function from the main thread then j sits on the main thread's stack.
    int j = 456;
    i++; // we can see i because threads share the same address space
}

int main()
{
    int i = 123; // this sits on the main thread's stack
    std::thread t(Foo, std::ref(i)); // we pass i by reference, so the thread works on the same variable
    t.join();
    std::cout << i << '\n';
    return 0;
}
An illustration in the original answer (not reproduced here) shows each thread's stack as a separate region within the process's memory.
As that picture suggests, each thread has its own stack (which is just some part of the process's memory), yet all of these stacks live inside the same address space.
I've read that different threads share the same memory segments apart from the stack. [...] If every thread gets its own stack, why doesn't each thread get a copy of the member variables rather than access to the same variable.
Hmm... the short answer is, that's the purpose for which threads were designed: multiple threads of execution, all accessing the same data. You then have threading primitives - like mutexes - to make sure that they don't step on each other's toes, so to speak - so that you don't have concurrent writes, for example.
Thus in the default case, threads share objects, and you have to do extra work to synchronize accesses. If you want each thread to have its own copy of an object, then you have to write the code to give each thread a copy rather than a reference to the same object. Or you may consider using processes (e.g. via fork()) instead of threads.
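A minimal sketch of both options, using a plain global counter rather than the question's class (the names workerShared and workerCopy are invented for illustration):
#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;       // shared by all threads
std::mutex counterMutex;

// Option 1: threads share `counter`, so every access is guarded by the mutex.
void workerShared()
{
    std::lock_guard<std::mutex> lock(counterMutex);
    ++counter;
}

// Option 2: the thread works on its own copy; no lock is needed,
// but its changes are not visible to anyone else.
void workerCopy(int copy)
{
    ++copy;            // modifies this thread's private copy only
}

int main()
{
    std::thread t1(workerShared);
    std::thread t2(workerShared);
    std::thread t3(workerCopy, counter);  // pass by value: t3 gets a snapshot
    t1.join();
    t2.join();
    t3.join();
    std::cout << counter << '\n';         // 2: only the sharing threads changed it
    return 0;
}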
If you have
MyClass c1;
MyClass c2;
then there will be two instances of MyClass, c1 and c2, each with its own members. The thread started during construction of c1 will have access to c1's members only and the thread started in c2 will have access to c2's members only.
If you are talking about several threads started by one object (not obvious from your code), then each thread has its own copy of this, which is just a pointer to your MyClass object. New (local) variables will be allocated on the thread's stack and will be visible only to the thread which creates them.
I have seen code where a mutex or critical section is declared as a member variable of a class to make it thread-safe, something like the following.
#include <mutex>

class ThreadSafeClass
{
public:
    ThreadSafeClass() { x = new int; }
    ~ThreadSafeClass() { delete x; }
    void reallocate()
    {
        std::lock_guard<std::mutex> lock(m);
        delete x;
        x = new int;
    }
    int* x;
    std::mutex m;
};
But doesn't that make it thread-safe only if the same object is shared by multiple threads? In other words, if each thread creates its own instance of this class, the instances are completely independent, their member variables never conflict with each other, and synchronization is not even needed in that case!?
It appears to me that defining the mutex as a member variable really reduces synchronization to the cases where the same object is shared by multiple threads. It doesn't make the class any more thread-safe if each thread has its own copy of the class (for example, if the class were to access other global objects). Is this a correct assessment?
If you can guarantee that any given object will only be accessed by one thread, then a mutex is an unnecessary expense. This, however, must be well documented in the class's contract to prevent misuse.
PS: new and delete are internally synchronized by the allocator, so even without a lock of your own, concurrent allocations from several threads can still contend with each other.
EDIT: The more you keep threads independent from each other, the better, because it eliminates the need for locks. However, if your class will work heavily with a shared resource (e.g. a database, file, socket, or shared memory), then having a per-thread instance is of little advantage, so you might as well share one object between threads. Real independence is achieved by having different threads work with separate memory locations or resources.
If you will have potentially long waits on your locks, then it might be a good idea to have a single instance running in its own thread and take "jobs" from a synchronized queue.
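A rough sketch of that job-queue idea, assuming C++11 primitives (the Job type, queue names and shutdown handling here are invented for illustration):
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

using Job = std::function<void()>;

std::queue<Job> jobs;
std::mutex jobsMutex;
std::condition_variable jobsCv;
bool done = false;

// The single worker owns the shared resource; other threads only enqueue jobs.
void workerLoop()
{
    for (;;) {
        Job job;
        {
            std::unique_lock<std::mutex> lock(jobsMutex);
            jobsCv.wait(lock, [] { return done || !jobs.empty(); });
            if (done && jobs.empty())
                return;
            job = std::move(jobs.front());
            jobs.pop();
        }
        job();   // run the job outside the lock
    }
}

void submit(Job job)
{
    {
        std::lock_guard<std::mutex> lock(jobsMutex);
        jobs.push(std::move(job));
    }
    jobsCv.notify_one();
}

int main()
{
    std::thread worker(workerLoop);
    submit([] { /* work with the shared resource */ });
    {
        std::lock_guard<std::mutex> lock(jobsMutex);
        done = true;
    }
    jobsCv.notify_one();
    worker.join();
    return 0;
}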
I'm wondering what the best practice is for locking and unlocking mutexes for variables within an object that is shared between threads.
This is what I have been doing and it seems to work just fine so far; I'm just wondering whether this is excessive or not:
#include <pthread.h>

class sharedobject
{
private:
    bool m_Var1;
    pthread_mutex_t var1mutex;
public:
    sharedobject()
    : m_Var1(false)
    {
        pthread_mutex_init(&var1mutex, NULL);
    }
    ~sharedobject()
    {
        pthread_mutex_destroy(&var1mutex);
    }
    bool GetVar1()
    {
        pthread_mutex_lock(&var1mutex);
        bool temp = m_Var1;
        pthread_mutex_unlock(&var1mutex);
        return temp;
    }
    void SetVar1(bool status)
    {
        pthread_mutex_lock(&var1mutex);
        m_Var1 = status;
        pthread_mutex_unlock(&var1mutex);
    }
};
This isn't my actual code, but it shows how I am using mutexes for every variable that is shared in an object between threads. The reason I don't have a single mutex for the entire object is that one thread might take seconds to complete an operation on part of the object, while another thread checks the status of the object, and yet another thread gets data from the object.
My question is: is it good practice to create a mutex for every variable within an object that is accessed by multiple threads, and then lock and unlock that mutex whenever the variable is read or written?
I use trylock for the variables whose status I'm checking (so I don't create extra threads while the variable is being processed, and don't make the program wait to get a lock).
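A minimal sketch of how that trylock-style check could look as an extra accessor on the sharedobject class above (TryGetVar1 is an invented name, not part of the original code):
// Returns false without blocking if another thread currently holds the lock.
bool TryGetVar1(bool& out)
{
    if (pthread_mutex_trylock(&var1mutex) != 0)
        return false;          // mutex busy; skip this check and try again later
    out = m_Var1;
    pthread_mutex_unlock(&var1mutex);
    return true;
}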
I haven't had a lot of experience working with threading. I would like to make the program thread-safe, but it also needs to perform as well as possible.
If the members you're protecting are read-write, and may be accessed at any time by more than one thread, then what you're doing is not excessive - it's necessary.
If you can prove that a member will not change (is immutable) then there is no need to protect it with a mutex.
Many people prefer multi-threaded solutions where each thread has an immutable copy of data rather than those in which many threads access the same copy. This eliminates the need for memory barriers and very often improves execution times and code safety.
Your mileage may vary.
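A small sketch of that per-thread-copy style, assuming the data can simply be copied when each thread starts (the Config type and worker names are illustrative):
#include <iostream>
#include <string>
#include <thread>
#include <vector>

struct Config
{
    bool var1;
    int limit;
};

// Each worker receives its own copy of the configuration by value,
// so it can read it freely without any locks or memory barriers.
void worker(Config cfg, int id)
{
    if (cfg.var1 && id < cfg.limit)
        std::cout << ("worker " + std::to_string(id) + " running\n");
}

int main()
{
    const Config cfg{true, 4};
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back(worker, cfg, i);   // cfg is copied into each thread
    for (auto& t : threads)
        t.join();
    return 0;
}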
If I have a class
#include <ucontext.h>

class A
{
    A()
    {
        getcontext(&context);
        makecontext(&context, fun1, 0 /* etc. */);
        // put the context pointer on a queue
    }
    void fun1(/* args */)
    {
        // something
    }
    ucontext_t context;
};
If I make an instance of class A in Thread1 running on CPU1, and then try to pop the context off the queue and swap it in from Thread2 on CPU2, will there be a problem because the object was instantiated on the stack of Thread1 on CPU1, and hence the pointer to fun1 which is attached to this context is not reachable?
I'd say the answer is yes and no. All threads share the same memory. If more than one thread accesses the object, you need to be careful to synchronize access between threads. That's the yes part.
The no part is when you said "stack of Thread1". Stack variables are local to a function. If the function returns, the local variables are no longer valid. You don't show quite enough code for me to see where context gets created, and whether the function that allocates the object on the stack waits until Thread2 finishes.
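A small illustration of that lifetime issue (not the question's actual code): whether a pointer handed to another thread stays valid depends entirely on whether the creating function outlives the consumer.
#include <iostream>
#include <thread>

void startWorker()
{
    int local = 42;                          // lives on this function's stack frame
    std::thread t([&local] { std::cout << local << '\n'; });
    // OK: we join before returning, so `local` is still alive while the worker reads it.
    t.join();

    // Dangerous variant: t.detach(); return;  -- the worker might read `local`
    // after this frame has been popped, which is undefined behaviour.
}

int main()
{
    startWorker();
    return 0;
}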
No. The whole point of threads is that they share all memory.
Let's say there are 4 consumer threads that each run in a loop continuously:
void consumerLoop(int threadIndex)
{
    int myArray[100];
    while (true) {
        // ...process data...
        myArray[myIndex] += newValue;  // myIndex and newValue come from the processing step
    }
}
I have another monitor thread which does other background tasks.
I need to access myArray for each of these threads from the monitor thread.
Assume that the loops run forever (so the local variables always exist) and that the only operation required from the monitor thread is to read the array contents of all the threads.
One alternative is to change myArray to a global array of arrays, but I am guessing that would slow down the consumer loops.
What are the ill effects of declaring a global pointer array int *p[4];, assigning each element to the address of the corresponding local array by adding a line in consumerLoop such as p[threadIndex] = myArray;, and then accessing p from the monitor thread?
Note: I am running it on a Linux system and the language is C++. I am not concerned about synchronization or the validity of the array contents when I am accessing them from the monitor thread. Let's stay away from a discussion of locking.
If you are really interested in the performance difference, you have to measure. I would guess that there is nearly no difference.
Both approaches are correct, as long as the monitor thread doesn't access stack local variables that have become invalid because the function returned.
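For what it's worth, a bare sketch of the pointer-array approach from the question (synchronization deliberately ignored, as the question asks; the loop bodies are stubs):
#include <chrono>
#include <iostream>
#include <thread>

int* p[4];   // global: p[i] will point at consumer i's local array

void consumerLoop(int threadIndex)
{
    int myArray[100] = {};
    p[threadIndex] = myArray;            // publish the address of this thread's stack array
    for (int i = 0; ; i = (i + 1) % 100) {
        // ...process data...
        myArray[i] += 1;
    }
}

void monitorLoop()
{
    for (;;) {
        for (int t = 0; t < 4; ++t) {
            if (p[t] != nullptr)         // unsynchronized read, as the question allows
                std::cout << "thread " << t << " slot 0 = " << p[t][0] << '\n';
        }
        std::this_thread::sleep_for(std::chrono::seconds(1));
    }
}

int main()
{
    for (int t = 0; t < 4; ++t)
        std::thread(consumerLoop, t).detach();   // loops run forever, so the arrays stay alive
    monitorLoop();                               // never returns in this sketch
}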
You cannot access myArray from a different thread because it is a local variable. You can either 1) use a global variable, or 2) malloc a buffer and pass its address to all the threads.
Please protect the critical section when all the threads rush to use the common memory.
Let's say we have a C++ class like:
class MyClass
{
    void processArray( /* an array of 255 integers */ )
    {
        int i;
        for (i = 0; i < 255; i++)
        {
            // do something with values in the array
        }
    }
};
and one instance of the class like:
MyClass myInstance;
and 2 threads which call the processArray method of that instance (depending on how the system schedules the threads, probably in a completely irregular order). There is no mutex lock used in that scope, so both threads can enter simultaneously.
My question is: what happens to i? Does each thread get its own i in its own scope, or would each entering thread modify the same i in the for loop, causing i to change unpredictably?
i is allocated on the stack. Since each thread has its own separate stack, each thread gets its own copy of i.
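A quick way to convince yourself of this (a standalone sketch, not the question's class): each thread's loop counter counts independently.
#include <iostream>
#include <string>
#include <thread>

void countTo(int limit, const char* name)
{
    int i;                       // lives on the calling thread's stack
    for (i = 0; i < limit; i++)
    {
        // each thread advances its own i; neither sees the other's counter
    }
    std::cout << std::string(name) + " finished with i = " + std::to_string(i) + "\n";
}

int main()
{
    std::thread a(countTo, 255, "a");
    std::thread b(countTo, 1000, "b");
    a.join();
    b.join();
    return 0;
}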
Be careful. In the example provided, the method processArray seems to be reentrant (it's not clear what happens in "// do something with values in the array"). If so, no race occurs while two or more threads invoke it simultaneously, and therefore it's safe to call it without any locking mechanism.
To enforce this, you could mark both the instance and the method with the volatile qualifier, to let users know that no lock is required.
Andrei Alexandrescu published an interesting article about the volatile qualifier and how it can be used to write correct multithreaded classes. The article is available here:
http://www.ddj.com/cpp/184403766
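A bare-bones sketch of that convention, loosely following the article (the Gadget class and its members are invented here): shared instances are declared volatile, and only volatile-qualified member functions, i.e. the ones intended to be safe without external locking, can be called on them.
class Gadget
{
public:
    // Reentrant: touches only its arguments and locals, so it is
    // volatile-qualified and callable on shared (volatile) instances.
    int twice(int x) volatile { return 2 * x; }

    // Not volatile-qualified: the compiler rejects calls on a volatile Gadget,
    // reminding the caller that this member needs external locking.
    void unsafeReset() { count = 0; }

private:
    int count = 0;
};

volatile Gadget shared;   // "volatile" documents: this object is used by several threads

int main()
{
    int y = shared.twice(21);   // fine
    // shared.unsafeReset();    // would not compile: discards the volatile qualifier
    return y == 42 ? 0 : 1;
}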
Since i is a local variable it is stored on the thread's own private stack. Hence, you do not need to protect i with a critical section.
As Adam said, i is a variable stored on the stack and the arguments are passed in, so this is safe. Where you have to be careful and apply mutexes or other synchronization mechanisms is if you are accessing shared member variables of the same instance of the class, or global variables in the program (even scoped statics).