Consider a program with three threads A, B, and C.
They have a shared global object G.
I want to use an atomic variable (i) inside G which is written by thread B and read by thread A.
My approach was:
declare i in G as:
std::atomic<int> i;
write it from thread B using a pointer to G as:
G* pG; //this is available inside A and B
pG->i = 23;
And read it from thread A using the same way.
int k = pG->i;
Is my approach correct if these threads try to access this variable simultaneously?
Like JV says, it depends on what your definition of "correct" is. See http://preshing.com/20120612/an-introduction-to-lock-free-programming/. If it doesn't need to synchronize with anything, you should use std::memory_order_relaxed stores instead of the default sequentially consistent stores, so it compiles to more efficient asm (no memory-barrier instructions).
But yes, accessing an atomic struct member through a pointer is fine, as long as the pointer itself is initialized before the threads start.
If the struct is a global, then don't use a pointer to it, just access the global directly. Having a separate variable that always points to the same global is an extra level of indirection for no benefit.
If you want to change the pointer, it also needs to be std::atomic<struct foo *> pG, and changing it gets complicated as far as deciding when it's safe to free the old data after changing it.
Related
I'm wondering if I need to use std::atomic in the following case:
a (pointer to a) member variable is initialized in an object's constructor
at some point in the future, there is exactly one write by some thread
several other threads are reading it concurrently (reads happen both before and after the write)
if I'm only looking for the following type of consistency:
a thread sees either the initial value of the member variable or the value after the write
each thread eventually sees the value after write (provided it runs long enough)
If yes, which memory order should I use in load/store (out of memory_order_consume, memory_order_acquire, memory_order_release, memory_order_acq_rel, memory_order_seq_cst) to get as little overhead as possible?
As an example, suppose I want to implement a "static" singly-linked list which can only insert at tail and never delete or change any of the next pointers, i.e.:
struct Entry {
    ...
    const Entry* next; // or std::atomic<const Entry*> next;
    Entry() : next(NULL) { ... }
    ...
};
void Insert(Entry* tail, const Entry* e) {
    tail->next = e; // assuming tail != NULL (i.e. we always have a dummy node)
}
Memory order only dictates which writes and reads to variables other than the atomic one become visible to other threads. If you don't care how the other writes and reads in your thread are ordered relative to your member variable, you can even use std::memory_order_relaxed.
As to how fast other threads see writes to your atomic variable, the standard says the following (§ 29.3.13):
Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.
To decide what memory ordering you need, you need to provide more information about what you will use the member variable for. If you are using an atomic pointer field to reference a newly created object, and that object has fields you want the readers to access, then you need to make sure synchronization is established. That means the store needs to be a release, and the loads probably should be acquires; depending on the details, consume might work. At this point there isn't really any performance advantage to using consume, so I'd probably stick to acquire.
Try playing around with examples using CDSChecker to get an idea of what you need to do.
What are the C++98 and C++11 memory models for local arrays and interactions with threads?
I'm not referring to the C++11 thread_local keyword, which pertains to global and static variables.
Instead, I'd like to find out what the guaranteed behavior of threads is for arrays that are allocated at compile time. By compile time I mean "int array[100]", which is different from allocation using the new[] keyword. I do not mean static variables.
For example, let's say I have the following struct/class:
struct xyz { int array[100]; };
and the following function:
void fn(int x) {
    for (int i = 0; i < 100; ++i) { dog.array[i] = x; }
    // do something else with dog.array, e.g. call another function with dog as parameter
}
Is it safe to call fn() from multiple threads?
It seems that the C++ memory model is: all local non-static variables and arrays are allocated on the stack, and that each thread has its own stack. Is this true (ie. is this officially part of the standard) ?
Such variables are allocated on the stack, and since each thread has its own stack, it is perfectly safe to use local arrays. They are not different from e.g. local ints.
C++98 didn't say anything about threads. Programs otherwise written to C++98 but which use threads do not have a meaning that is defined by C++98. Of course, it's sensible for threading extensions to provide stable, private local variables to threads, and they usually do. But there can exist threads for which this is not the case: for instance processes created by vfork on some Unix systems, whereby the parent and child will execute in the same stack frame, since the v in vfork means not to clone the address space, and vfork does not redirect the new process to a different function, either.
In C++11, there is threading support. Local variables in separate activation chains in separate C++11 threads do not interfere. But if you go outside the language and whip out vfork or anything resembling it, then all bets are off, like before.
But here is something. C++ has closures now. What if two threads both invoke the same closure? Then you have two threads sharing the same local variables. A closure is like an object and its captured local variables are like members. If two or more threads call the same closure, then you have a de facto multi-threaded object whose members (i.e. captured lexical variables) are shared.
A thread can also simply pass the address of its local variables to another thread, thereby causing them to become shared.
While browsing open source code (from OpenCV), I came across the following type of code inside a method:
// copy class member to local variable for optimization
int foo = _foo; //where _foo is a class member
for (...) //a heavy loop that makes use of foo
From another question on SO I've concluded that the answer to whether or not this actually needs to be done or is done automatically by the compiler may be compiler/setting dependent.
My question is if it would make any difference if _foo were a static class member? Would there still be a point in this manual optimization, or is accessing a static class member no more 'expensive' than accessing a local variable?
P.S. - I'm asking out of curiosity, not to solve a specific problem.
Accessing a member means dereferencing the object in order to reach it.
As the member may change during execution (think threads), the compiler will read the value from memory each time it is accessed.
Using a local variable allows the compiler to keep the value in a register, as it can safely assume the value won't change from the outside. This way, the value is read only once from memory.
As for your question about the static member, it's the same: it can also be changed by another thread, for instance, so the compiler still needs to read the value from memory each time.
I think a local variable is more likely to participate in some optimization, precisely because it is local to the function: this fact can be used by the compiler, for example if it sees that nobody modifies the local variable, then the compiler may load it once, and use it in every iteration.
In case of member data, the compiler may have to work more to conclude that nobody modifies the member. Think about multi-threaded application, and note that the memory model in C++11 is multi-threaded, which means some other thread might modify the member, so the compiler may not conclude that nobody modifies it, in consequence it has to emit code for load member for every expression which uses it, possibly multiple times in a single iteration, in order to work with the updated value of the member.
In this example _foo will be copied into a new local variable, so both cases are the same.
Static values are like any other variable; they are just stored in a different memory segment dedicated to static data.
Reading a static class member is effectively like reading a global variable. They both have a fixed address. Reading a non-static one means first reading the this-pointer, adding an offset to the result and then reading that address. In other words, reading a non-static one requires more steps and memory accesses.
Having spent the last few days debugging a multi-threading issue where one thread was deleting an object still in use by another, I realised that the issue would have been far easier and quicker to diagnose if I could have made 'this' volatile. It would have changed the crash dump on the system (Symbian OS) to something far more informative.
So, is there any reason why it cannot be, or shouldn't be?
Edit:
So there really is no safe way to prevent or check for this scenario. Would it be correct to say that one solution to accessing stale class pointers is to have a global variable that holds the pointer, and any functions that are called should be statics that use the global variable as a replacement for 'this'?
static TAny* gGlobalPointer = NULL;
#define Harness static_cast<CSomeClass*>(gGlobalPointer)
class CSomeClass : public CBase
{
public:
    static void DoSomething();
private:
    int iMember;
};
void CSomeClass::DoSomething()
{
    if (!Harness)
    {
        return;
    }
    Harness->iMember = 0;
}
So if another thread deleted and NULLed the global pointer it would be caught immediately.
One issue I see with this is that the compiler might cache the value of Harness instead of checking it each time it's used.
this is not a variable, but a constant. You can change the object referenced by this, but you can't change the value of this. Because constants never change, there is no need to mark them as volatile.
It wouldn't help: making a variable volatile means that the compiler will make sure it reads its value from memory each time it's accessed, but the value of this doesn't change, even if, from a different context or the same, you delete the object it's pointing at.
It can be. Just declare the member function as volatile.
struct a
{
void foo() volatile {}
};
volatile wouldn't help you: it would make accesses to the variable volatile, but it won't wait for any method to complete.
Use smart pointers instead, e.g. shared_ptr. There's also a version in std in newer C++ versions.
If the object is being read whilst being deleted, that is a clear memory error and has nothing to do with volatile.
volatile is intended to stop the compiler from "remembering" the value of a variable in memory that may be changed by a different thread, i.e. prevent the compiler from optimising.
e.g. if your class has a pointer member p and your method is accessing:
p->a;
p->b;
etc., and p is volatile within "this", so p->b could be accessing a different object than it was when it did p->a.
If p has been destroyed because "this" was deleted, though, volatile will not come to your rescue. Presumably you think it will be "nulled", but it won't be.
Incidentally it is also a pretty sound rule that if your destructor locks a mutex in order to protect another thread using the same object, you have an issue. Your destructor may lock due to it removing its own presence from an external object that needs synchronous activity, but not to protect its own members.
What is the problem with having static variables (especially within functions) in multithreaded programs?
Thanks.
Initialization is not thread-safe. Two threads can enter the function and both may initialize the function-scope static variable. That's not good. There's no telling what the result might be.
In C++0x, initialization of function-scope static variables will be thread-safe; the first thread to call the function will initialize the variable and any other threads calling that function will need to block until that initialization is complete.
I don't think there are currently any compiler + standard library pairs that fully implement the C++0x concurrency memory model and the thread support and atomics libraries.
To pick an illustrative example at random, take an interface like asctime in the C library. The prototype looks like this:
char *
asctime(const struct tm *timeptr);
This implicitly must have some global buffer to store the characters in the char* returned. The most common and simple way to accomplish this would be something like:
char *
asctime(const struct tm *timeptr)
{
    static char buf[MAX_SIZE];
    /* TODO: convert timeptr into string */
    return buf;
}
This is totally broken in a multi-threaded environment, because buf will be at the same address for each call to asctime(). If two threads call asctime() at the same time, they run the risk of overwriting each other's results. Implicit in the contract of asctime() is that the characters of the string will stick around until the next call to asctime(), and concurrent calls break this.
There are some language extensions that work around this particular problem in this particular example via thread-local storage (__thread, __declspec(thread)). I believe this idea made it into C++0x as the thread_local keyword.
Even so, I would argue it's a bad design decision to use it this way, for reasons similar to why it's bad to use global variables. Among other things, it is arguably a cleaner interface for the caller to maintain and provide this kind of state, rather than the callee. These are subjective arguments, however.
A static variable usually means multiple invocations of your function would share a state and thus interfere with one another.
Normally you want your functions to be self contained; have local copies of everything they work on and share nothing with the outside world bar parameters and return values. (Which, if you think a certain way, aren't a part of the function anyway.)
Consider:
int add(int x, int y);
definitely thread-safe, local copies of x and y.
void print(const char *text, Printer *printer);
dangerous, someone outside might be doing something with the same printer, e.g. calling another print() on it.
void print(const char *text);
definitely non-thread-safe, two parallel invocations are guaranteed to use the same printer.
Of course, there are ways to secure access to shared resources (search keyword: mutex); this is just how your gut feeling should be.
Unsynchronized parallel writes to a variable are also non-thread-safe most of the time, as are a read and write. (search keywords: synchronization, synchronization primitives [of which mutex is but one], also atomicity/atomic operation for when parallel access is safe.)