Is the default copy constructor thread-safe in c++? - c++

class CSample {
    int a;
    // ... lots of fields
};
CSample c;
As we know, CSample has a default copy constructor. When I do this:
CSample d = c;
the default copy constructor runs. My question is: is it thread-safe? Maybe someone modifies c in another thread while the copy constructor is running. If so, how does the compiler handle it? And if not, I think it's horrible that the compiler can't guarantee that the copy constructor is thread-safe.

Nothing in C++ is thread-safe¹ unless explicitly noted.
If you need to read object c while it may be modified in another thread, you are responsible for locking it. That is a general rule and there is no reason why reading it for purpose of creating a copy should be an exception.
Note, that the copy being created does not need to be locked, because no other thread knows about it yet. Only the source needs to be.
The compiler does not guarantee anything to be thread-safe on its own, because 99.9% of things don't need to be thread-safe. Most things only need to be reentrant³. So in the rare case you actually need to make something thread-safe, you have to use locks (std::mutex) or atomic types (std::atomic&lt;int&gt;).
You can also simply make your objects constant, and then you can read them without locking, because nothing writes to them after creation. Code using constant objects is both more easily parallelised and more easily understood in general, because there are fewer things with state you have to track.
Note that on the most common architecture the mov instruction with int operands happens to be thread-safe. On other CPU types even that might not be true. And because the compiler is allowed to preload values, plain integer assignment in C++ is not thread-safe anyway.
¹A set of operations is considered thread-safe if calling them concurrently on the same object is well defined². In C++, calling any modifying operation and any other operation concurrently on the same object is a data race, which is UndefinedBehaviour™.
²It is important to note, that if an object is "thread-safe", it does not really help you much most of the time anyway. Because if an object guarantees that when it's concurrently written you'll always read the new or the old value (C++ allows that when an int c is being changed from 0 to 1000 by one thread, another thread may read, say, 232), most of the time that won't help you, because you need to read multiple values in a consistent state, for which you have to lock over them yourself anyway.
³Reentrant means that the same operation may be called on different objects at the same time. There are a few functions in the standard C library that are not reentrant, because they use global (static) buffers or other state. Most have reentrant variants (usually with an _r suffix) and the standard C++ library uses these, so the C++ part is generally reentrant.

The general rule in the standard is simple: if an object (and sub-objects are objects) is accessed by more than one thread, and is modified by any thread, then all accesses must be synchronized. There are numerous reasons for this, but the most basic one is that protecting at the lowest level is usually the wrong level of granularity; adding synchronization primitives would only make the code run significantly slower, without any real advantage for the user, even in a multithreaded environment. Even if the copy constructor were "thread-safe", unless the object is somehow totally independent of all other context, you'll probably need some sort of synchronization primitives at a higher level.
And with regard to "thread-safety": the usual meaning among experienced practitioners is that the object/class/whatever specifies exactly how much protection it guarantees. Precisely because low-level definitions such as the one you (and many, many others) seem to use are useless. Synchronizing each function in a class is generally useless. (Java made the experiment, and then backed off, because the guarantees made in the initial versions of its containers turned out to be expensive and worthless.)

Assuming that d or c is accessed concurrently from multiple threads, this is not thread-safe. It amounts to a data race, which is undefined behavior.
CSample d = c;
is just as unsafe as
int d = c;
is.

Related

Is it OK for a class constructor to block forever?

Let's say I have an object that provides some sort of functionality in an infinite loop.
Is it acceptable to just put the infinite loop in the constructor?
Example:
class Server {
public:
    Server() {
        for (;;) {
            // ...
        }
    }
};
Or is there an inherent initialization problem in C++ if the constructor never completes?
(The idea is that to run a server you just say Server server;, possibly in a thread somewhere...)
It's not wrong per standard, it's just a bad design.
Constructors don't usually block. Their purpose is to take a raw chunk of memory, and transform it into a valid C++ object. Destructors do the opposite: they take valid C++ objects and turn them back into raw chunks of memory.
If your constructor blocks forever (emphasis on forever), it does something different than just turn a chunk of memory into an object.
It's ok to block for a short time (a mutex is a perfect example of it), if this serves the construction of the object.
In your case, it looks like your constructor is accepting and serving clients. This is not turning memory into objects.
I suggest you split the constructor into a "real" constructor that builds a server object and another start method that serves clients (by starting an event loop).
P.S.: In some cases you have to execute the functionality/logic of the object separately from the constructor anyway, for example if your class inherits from std::enable_shared_from_this.
It's allowed. But like any other infinite loop, it must have observable side effects, otherwise you get undefined behavior.
Calling the networking functions counts as "observable side effects", so you're safe. This rule only bans loops that either do literally nothing, or just shuffle data around without interacting with the outside world.
It's legal, but it's a good idea to avoid it.
The main issue is that you should avoid surprising users. It's unusual to have a constructor that never returns, because it isn't logical: why would you construct something you can never use? As such, while the pattern may work, it is unlikely to be an expected behavior.
A secondary issue is that it limits how your Server class can be used. The construction and destruction processes of C++ are fundamental to the language, so hijacking them can be tricky. For example, one might want to have a Server that is the member of a class, but now that overarching class' constructor will block... even if that isn't intuitive. It also makes it very difficult to put these objects into containers, as this can involve allocating many objects.
The closest I can think of to what you are doing is that of std::thread. Thread does not block forever, but it does have a constructor that does a surprisingly large amount of work. But if you look at std::thread, you realize that when it comes to multithreading, being surprised is the norm, so people have less trouble with such choices. (I am not personally aware of the reasons for starting the thread upon construction, but there's so many corner cases in multithreading that I would not be surprised if it resolves some of them)
A user might expect to set up your Server object in the main thread. Then call the server.endless_loop() function within a worker thread.
In an actual server, the process of acquiring a port requires escalated privileges which can then be dropped. Or perhaps you have an object that needs to load settings. Those sort of tasks could take place in the main thread before the long term looping takes place elsewhere.
Personally, I'd prefer your object had a "poll" function that was fast and non blocking. You could then have a loop function that called poll and sleep in an endless loop. You might even have an atomic variable that you can set to exit the loop from a different thread. Another feature would be to launch an internal thread within the Server object.
As others have pointed out, there's nothing "wrong" with this as far as C++ semantics are concerned, but it's poor design. The point of a constructor is to construct an object, so if that task never completes then it will be surprising to users.
Others have made suggestions regarding splitting the construction & run steps into constructor and method, which makes sense if you have other things you might want to do with the Server besides run it, or if you actually might want to construct it, do other stuff, and then run it.
But if you expect the caller will always just do Server server; server.run(), then maybe you don't even need a class -- it could just be a stand-alone function run_server(). If you don't have state to encapsulate and pass around in the first place, then you don't necessarily need objects. A stand-alone function can even be marked [[noreturn]] to make it clear both to the user and the compiler that the function never returns.
It's hard to say which makes more sense without knowing more about your use case. But in short: constructors construct objects -- if you're doing something else, don't use them for that.
In most cases, your code has no problem. Because of the following rule:
A class is considered a completely-defined object type ([basic.types]) (or complete type) at the closing } of the class-specifier. Within the class member-specification, the class is regarded as complete within function bodies, default arguments, noexcept-specifiers, and default member initializers (including such things in nested classes). Otherwise it is regarded as incomplete within its own class member-specification.
However, one restriction on your code is that you cannot access the object through a glvalue that is not obtained from the this pointer, because the behavior is unspecified. It's governed by this rule:
During the construction of an object, if the value of the object or any of its subobjects is accessed through a glvalue that is not obtained, directly or indirectly, from the constructor's this pointer, the value of the object or subobject thus obtained is unspecified.
Moreover, you cannot use the utility shared_ptr to manage such class objects. In general, placing an infinite loop in a constructor is not a good idea. Many restrictions will apply to the object when you use it.

C++11 atomic<>: only to be read/written with provided methods?

I wrote some multithreaded but lock-free code that compiled and apparently executed fine on an earlier C++11-supporting GCC (7 or older). The atomic fields were ints and so on. To the best of my recollection, I used normal C/C++ operations to operate on them (a=1;, etc.) in places where atomicity or event ordering wasn't a concern.
Later I had to do some double-width CAS operations, and made a little struct with a pointer and counter as is common. I tried doing the same normal C/C++ operations, and errors came that the variable had no such members. (Which is what you'd expect from most normal templates, but I half-expected atomic to work differently, in part because normal assignments to and from were supported, to the best of my recollection, for ints.).
So two part question:
Should we use the atomic methods in all cases, even (say) initialization done by one thread with no race conditions? 1a) so once declared atomic there's no way to access unatomically? 1b) we also have to use the verboser verbosity of the atomic<> methods to do so?
Otherwise, if for integer types at least, we can use normal C/C++ operations. But in this case will those operations be the same as load()/store() or are they merely normal assignments?
And a semi-meta question: is there any insight as to why normal C/C++ operations aren't supported on atomic<> variables? I'm not sure if the C++11 language as spec'd has the power to write code that does that, but the spec can certainly require the compiler to do things the language as spec'd isn't powerful enough to do.
You're maybe looking for C++20 std::atomic_ref<T> to give you the ability to do atomic ops on objects that can also be accessed non-atomically. Make sure your non-atomic T object is declared with sufficient alignment for atomic<T>. e.g.
alignas(std::atomic_ref<long long>::required_alignment)
long long sometimes_shared_var;
But that requires C++20, and nothing equivalent is available in C++17 or earlier. Once an atomic object is constructed, I don't think there's any guaranteed portable safe way to modify it other than its atomic member functions.
Its internal object representation isn't guaranteed by the standard, so using memcpy to pull the struct sixteenbyte object out of an atomic&lt;sixteenbyte&gt; efficiently isn't guaranteed to be safe, even if no other thread has a reference to it. You'd have to know how a specific implementation stores it. Checking sizeof(atomic&lt;T&gt;) == sizeof(T) is a good sign, though, and mainstream implementations do in practice just have a T as the object representation for atomic&lt;T&gt;.
Related: How can I implement ABA counter with c++11 CAS? for a nasty union hack ("safe" in GNU C++) to give efficient access to a single member, because compilers don't optimize foo.load().ptr to just atomically load that member. Instead GCC and clang will use lock cmpxchg16b to load the whole pointer+counter pair, then extract just the first member. C++20 atomic_ref&lt;&gt; should solve that.
Accessing members of atomic&lt;struct foo&gt;: one reason for not allowing shared.x = tmp; is that it's the wrong mental model. If two different threads are storing to different members of the same struct, how does the language define any ordering for what other threads see? Plus it was probably considered too easy for programmers to design their lockless algorithms incorrectly if stuff like that were allowed.
Also, how would you even implement that? Return an lvalue-reference? It can't be to the underlying non-atomic object. And what if the code captures that reference and keeps using it long after calling some function that's not load or store?
Remember that ISO C++'s ordering model works in terms of synchronizes-with, not in terms of local reordering and a single cache-coherent domain like the way real ISAs define their memory models. The ISO C++ model is always strictly in terms of reading, writing, or RMWing the entire atomic object. So a load of the object can always sync-with any store of the whole object.
In hardware that would actually still work for a store to one member and a load from a different member if the whole object is in one cache line, on real-world ISAs. At least I think so, although possibly not on some SMT systems. (Being in one cache line is necessary for lock-free atomic access to the whole object to be possible on most ISAs.)
we also have to use the verboser verbosity of the atomic<> methods to do so?
The member functions of atomic<T> include overloads of all the operators, including operator= (store) and cast back to T (load). a = 1; is equivalent to a.store(1, std::memory_order_seq_cst) for atomic<int> a; and is the slowest way to set a new value.
Should we use the atomic methods in all cases, even (say) initialization done by one thread with no race conditions?
You don't have any choice, other than passing args to the constructors of std::atomic<T> objects.
You can use mo_relaxed loads / stores while your object is still thread-private, though. Avoid any RMW operators like +=. e.g. a.store(a.load(relaxed) + 1, relaxed); will compile about the same as for non-atomic objects of register-width or smaller.
(Except that the compiler can't optimize the atomic object away and keep its value in a register, so use local temporaries instead of repeatedly updating the atomic object.)
But for atomic objects too large to be lock-free, there's not really anything you can do efficiently except construct them with the right values in the first place.
The atomic fields were ints and so on. ...
and apparently executed fine
If you mean plain int, not atomic<int> then it wasn't portably safe.
Data-race UB doesn't guarantee visible breakage, the nasty thing with undefined behaviour is that happening to work in your test case is one of the things that's allowed to happen.
And in many cases with pure load or pure store, it won't break, especially on strongly ordered x86, unless the load or store can hoist or sink out of a loop (see Why is integer assignment on a naturally aligned variable atomic on x86?). It'll eventually bite you when a compiler manages to do cross-file inlining and reorder some operations at compile time, though.
why normal C/C++ operations aren't supported on atomic<> variables?
... but the spec can certainly require the compiler to do things the language as spec'd isn't powerful enough to do.
This was in fact a limitation of C++11 through C++17. Most compilers have no underlying problem with it: for example, gcc/clang's implementation of the &lt;atomic&gt; header uses __atomic_ builtins which take a plain T* pointer.
The C++20 proposal for atomic_ref is p0019, which cites as motivation:
An object could be heavily used non-atomically in well-defined phases
of an application. Forcing such objects to be exclusively atomic would
incur an unnecessary performance penalty.
3.2. Atomic Operations on Members of a Very Large Array
High-performance computing (HPC) applications use very large arrays. Computations with these arrays typically have distinct phases that allocate and initialize members of the array, update members of the array, and read members of the array. Parallel algorithms for initialization (e.g., zero fill) have non-conflicting access when assigning member values. Parallel algorithms for updates have conflicting access to members which must be guarded by atomic operations. Parallel algorithms with read-only access require best-performing streaming read access, random read access, vectorization, or other guaranteed non-conflicting HPC pattern.
All of these things are a problem with std::atomic<>, confirming your suspicion that this is a problem for C++11.
Instead of introducing a way to do non-atomic access to std::atomic<T>, they introduced a way to do atomic access to a T object. One problem with this is that atomic<T> might need more alignment than a T would get by default, so be careful.
Unlike with giving atomic access to members of T, you could plausibly have a .non_atomic() member function that returned an lvalue reference to the underlying object.

Use constructor in place of atomic.store() when atomicity is not currently needed

I use std::atomic for atomicity. Still, somewhere in the code, atomicity is not needed by program logic. In this case, I'm wondering whether it is OK, both pedantically and practically, to use the constructor in place of store() as an optimization. For example,
// p.store(nullptr, std::memory_order_relaxed);
new (&p) std::atomic<node*>(nullptr);
In accord with the standard, whether this works depends entirely on the implementation of std::atomic<T>. If it is lock-free for that T, then the implementation probably just stores a T. If it isn't lock-free, things get more complex, since it may store a mutex or some other thing.
The thing is, you don't know what std::atomic<T> stores. This matters because if it stores a const-qualified object or a reference type, then reusing the storage here will cause problems. The pointer returned by placement-new can certainly be used, but if a const or reference type is used, the original object name p cannot.
Why would std::atomic<T> store a const or reference type? Who knows; my point is that, because its implementation is not under your control, then pedantically you cannot know how any particular implementation behaves.
As for "practically", it's unlikely that this will cause a problem. Especially if the atomic<T> is always lock-free.
That being said, "practically" should also include some notion of how other users will interpret this code. While people experienced with doing things like reusing storage will be able to understand what the code is doing, they will likely be puzzled by why you're doing it. That means you'll need to either stick a comment on that line or make a (template) function non_atomic_reset.
Also, it should be noted that std::shared_ptr uses atomic increments/decrements for its reference counter. I bring that up because there is no std::single_threaded_shared_ptr that doesn't use atomics, or a special constructor that doesn't use atomics. So even in cases where you're using shared_ptr in pure single-threaded code, those atomics are still firing. This was considered a reasonable tradeoff by the C++ standards committee.
Atomics aren't cheap, but they're not that expensive (most of the time) that using unusual mechanisms like this to bypass an atomic store is a good idea. As always, profile to see if the code obfuscation is worth it.

c++ constructors and concurrency

I've been thinking about writing a container class to control access to a complex data structure that will have use in a multi-threaded environment.
And then the question occurred to me:
Is there ever a situation where c++ constructors must be thread-safe?
In general, a constructor cannot be called for the same object by two threads simultaneously. However, the same constructor can certainly be called for different objects at the same time.
Certainly you can invoke the same constructor from more than one thread at once. In that sense, constructors must be thread-safe, just as any other function must be. If the constructor is going to modify shared state, for example your container, then you must use synchronization to ensure that the state is modified in a deterministic way.
You can't construct the same object on more than one thread at once, because each object is only constructed once, so there's no way to invoke the constructor on the same object more than once, much less on two different threads at the same time.
Not in my experience. It is the code that calls the constructor, implicitly or otherwise, which needs to be made thread-safe should the application require it.
The rationale is that only one thread should be initializing an object at a time, so no synchronization is necessary to protect the object being initialized within the constructor itself (if the object hasn't finished initialization, it shouldn't be shared between threads anyway).
Another way to look at it is this: Objects are to be treated as logically nonexistent until their constructors have returned. So, a thread that is in the process of creating an object is the only thread that "knows" about it.
Of course, proper synchronization rules apply to any shared resource the constructor itself accesses, but that applies to any function (I've encountered people that fail to realize this, believing constructors are special and somehow provide exclusive access to all resources).

What Rules does compiler have to follow when dealing with volatile memory locations?

I know that when reading from a memory location written to by several threads or processes, the volatile keyword should be used for that location, as in the cases below. But I want to know more about what restrictions it really places on the compiler: what rules does the compiler have to follow when dealing with such a case, and is there any exceptional case where, despite simultaneous access to a memory location, the programmer can omit volatile?
volatile SomeType * ptr = someAddress;
void someFunc(volatile const SomeType& input) {
    // function body
}
What you know is false. Volatile is not used to synchronize memory access between threads, apply any kind of memory fences, or anything of the sort. Operations on volatile memory are not atomic, and they are not guaranteed to be in any particular order. volatile is one of the most misunderstood facilities in the entire language. "Volatile is almost useless for multi-threaded programming."
What volatile is used for is interfacing with memory-mapped hardware, signal handlers, and setjmp/longjmp.
It can also be used in a similar way that const is used, and this is how Alexandrescu uses it in this article. But make no mistake. volatile doesn't make your code magically thread safe. Used in this specific way, it is simply a tool that can help the compiler tell you where you might have messed up. It is still up to you to fix your mistakes, and volatile plays no role in fixing those mistakes.
EDIT: I'll try to elaborate a little bit on what I just said.
Suppose you have a class that has a pointer to something that cannot change. You might naturally make the pointer const:
class MyGizmo
{
public:
const Foo* foo_;
};
What does const really do for you here? It doesn't do anything to the memory. It's not like the write-protect tab on an old floppy disc. The memory itself it still writable. You just can't write to it through the foo_ pointer. So const is really just a way to give the compiler another way to let you know when you might be messing up. If you were to write this code:
gizmo.foo_->bar_ = 42;
...the compiler won't allow it, because it's marked const. Obviously you can get around this by using const_cast to cast away the const-ness, but if you need to be convinced this is a bad idea then there is no help for you. :)
Alexandrescu's use of volatile is exactly the same. It doesn't do anything to make the memory somehow "thread safe" in any way whatsoever. What it does is it gives the compiler another way to let you know when you may have screwed up. You mark things that you have made truly "thread safe" (through the use of actual synchronization objects, like Mutexes or Semaphores) as being volatile. Then the compiler won't let you use them in a non-volatile context. It throws a compiler error you then have to think about and fix. You could again get around it by casting away the volatile-ness using const_cast, but this is just as Evil as casting away const-ness.
My advice to you is to completely abandon volatile as a tool in writing multithreaded applications (edit:) until you really know what you're doing and why. It has some benefit, but not in the way that most people think, and if you use it incorrectly, you could write dangerously unsafe applications.
It's not as well defined as you probably want it to be. Most of the relevant standardese from C++98 is in section 1.9, "Program Execution":
The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.
Accessing an object designated by a volatile lvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression might produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.
Once the execution of a function begins, no expressions from the calling function are evaluated until execution of the called function has completed.
When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects with type other than volatile sig_atomic_t are unspecified, and the value of any object not of volatile sig_atomic_t that is modified by the handler becomes undefined.
An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).
The least requirements on a conforming implementation are:
At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred.
At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.
The input and output dynamics of interactive devices shall take place in such a fashion that prompting messages actually appear prior to a program waiting for input. What constitutes an interactive device is implementation-defined.
So what that boils down to is:
The compiler cannot optimize away reads or writes to volatile objects. For simple cases like the one casablanca mentioned, that works the way you might think. However, in cases like
volatile int a;
int b;
b = a = 42;
people can and do argue about whether the compiler has to generate code as if the last line had read
a = 42; b = a;
or if it can, as it normally would (in the absence of volatile), generate
a = 42; b = 42;
(C++0x may have addressed this point, I haven't read the whole thing.)
The compiler may not reorder operations on two different volatile objects that occur in separate statements (every semicolon is a sequence point) but it is totally allowed to rearrange accesses to non-volatile objects relative to volatile ones. This is one of the many reasons why you should not try to write your own spinlocks, and is the primary reason why John Dibling is warning you not to treat volatile as a panacea for multithreaded programming.
Speaking of threads, you will have noticed the complete absence of any mention of threads in the standards text. That is because C++98 has no concept of threads. (C++0x does, and may well specify their interaction with volatile, but I wouldn't be assuming anyone implements those rules yet if I were you.) Therefore, there is no guarantee that accesses to volatile objects from one thread are visible to another thread. This is the other major reason volatile is not especially useful for multithreaded programming.
There is no guarantee that volatile objects are accessed in one piece, or that modifications to volatile objects avoid touching other things right next to them in memory. This is not explicit in what I quoted but is implied by the stuff about volatile sig_atomic_t -- the sig_atomic_t part would be unnecessary otherwise. This makes volatile substantially less useful for access to I/O devices than it was probably intended to be, and compilers marketed for embedded programming often offer stronger guarantees, but it's not something you can count on.
Lots of people try to make specific accesses to objects have volatile semantics, e.g. doing
T x;
*(volatile T *)&x = foo();
This is legit (because it says "object designated by a volatile lvalue" and not "object with a volatile type") but has to be done with great care, because remember what I said about the compiler being totally allowed to reorder non-volatile accesses relative to volatile ones? That goes even if it's the same object (as far as I know anyway).
If you are worried about reordering of accesses to more than one volatile value, you need to understand the sequence point rules, which are long and complicated and I'm not going to quote them here because this answer is already too long, but here's a good explanation which is only a little simplified. If you find yourself needing to worry about the differences in the sequence point rules between C and C++ you have already screwed up somewhere (for instance, as a rule of thumb, never overload &&).
A particular and very common optimization that is ruled out by volatile is to cache a value from memory into a register, and use the register for repeated access (because this is much faster than going back to memory every time).
Instead the compiler must fetch the value from memory every time (taking a hint from Zach, I should say that "every time" is bounded by sequence points).
Nor can a sequence of writes make use of a register and only write the final value back later on: every write must be pushed out to memory.
Why is this useful? On some architectures certain IO devices map their inputs or outputs to a memory location (i.e. a byte written to that location actually goes out on the serial line). If the compiler redirects some of those writes to a register that is only flushed occasionally then most of the bytes won't go onto the serial line. Not good. Using volatile prevents this situation.
Declaring a variable as volatile means the compiler can't make any assumptions about the value that it could have done otherwise, and hence prevents the compiler from applying various optimizations. Essentially it forces the compiler to re-read the value from memory on each access, even if the normal flow of code doesn't change the value. For example:
int *i = ...;
cout << *i; // line A
// ... (some code that doesn't use i)
cout << *i; // line B
In this case, the compiler would normally assume that since the value at i wasn't modified in between, it's okay to retain the value from line A (say in a register) and print the same value in B. However, if you mark i as volatile, you're telling the compiler that some external source could have possibly modified the value at i between line A and B, so the compiler must re-fetch the current value from memory.
The compiler is not allowed to optimize away reads of a volatile object in a loop, which it would otherwise normally do (e.g. hoisting a strlen() call out of a loop).
It's commonly used in embedded programming when reading a hardware register at a fixed address, whose value may change unexpectedly. (In contrast with "normal" memory, which doesn't change unless written to by the program itself.)
That is its main purpose.
It can also be used to make sure one thread sees a change in a value written by another, but it in no way guarantees atomicity when reading/writing said object.