I was looking up the keyword volatile and what it's for, and the answer I got was pretty much:
It's used to prevent the compiler from optimizing away code.
There were some examples, such as when polling memory-mapped hardware: without volatile the polling loop would be removed as the compiler might recognize that the condition value is never changed. But since there only were one example or maybe two, it got me thinking: Are there other situations where we need to use volatile in terms of avoiding unwanted optimization? Are condition variables the only place where volatile is needed?
I imagine that optimization is compiler-specific and therefore is not specified in the C++ specification. Does that mean we have to go by gut feeling, saying Hm, I suspect my compiler will do away with this if I don't declare that variable as volatile or are there any clear rules to go by?
Basically, volatile announces that a value might change behind your program's back. That prevents compilers from caching the value (in a CPU register) and from optimizing away accesses to that value when they seem unnecessary from the POV of your program.
What should trigger usage of volatile is when a value changes despite the fact that your program hasn't written to it, and when no other memory barriers (like mutexes as used for multi-threaded programs) are present.
The observable behavior of a C++ program is determined by read and writes to volatile variables, and any calls to input/output functions.
What this entails is that all reads and writes to volatile variables must happen in the order they appear in code, and they must happen. (If a compiler broke one of those rules, it would be breaking the as-if rule.)
That's all. It's used when you need to indicate that reading or writing a variable is to be seen as an observable effect. (Note, the "C++ and the Perils of Double-Checked Locking" article touches on this quite a bit.)
So to answer the title question, it prevents any optimization that might re-order the evaluation of volatile variables relative to other volatile variables.
That means a compiler that changes:
int x = 2;
volatile int y = 5;
x = 5;
y = 7;
To
int x = 5;
volatile int y = 5;
y = 7;
Is fine, since the value of x is not part of the observable behavior (it's not volatile). What wouldn't be fine is changing the assignment from 5 to an assignment to 7, because that write of 5 is an observable effect.
Condition variables are not where volatile is needed; strictly it is only needed in device drivers.
volatile guarantees that reads and writes to the object are not optimized away, or reordered with respect to another volatile. If you are busy-looping on a variable modified by another thread, it should be declared volatile. However, you shouldn't busy-loop. Because the language wasn't really designed for multithreading, this isn't very well supported. For example, the compiler may move a write to a non-volatile variable from after to before the loop, violating the lock. (For indefinite spinloops, this might only happen under C++0x.)
When you call a thread-library function, it acts as a memory fence, and the compiler will assume that any and all values have changed — essentially everything is volatile. This is either specified or tacitly implemented by any threading library to keep the wheels turning smoothly.
C++0x might not have this shortcoming, as it introduces formal multithreading semantics. I'm not really familiar with the changes, but for the sake of backward compatibility, it doesn't require to declare anything volatile that wasn't before.
Remember that the "as if rule" means that the compiler can, and should, do whatever it wants, as long as the behaviour as seen from outside the program as a whole is the same. In particular, while a variable conceptually names an area in memory, there is no reason why it actually should be in memory.
It could be in a register:
Its value could be calculated away, e.g. in:
int x = 2;
int y = x + 7;
return y + 1;
Need not have an x and y at all, but could just be replaced with:
return 10;
And another example, is that any code that doesn't affect state from the outside could be removed entirely. E.g. if you zeroise sensitive data, the compiler can see this as a wasted exercise ("why are you writing to what won't be read?") and remove it. volatile can be used to stop that happening.
volatile can be thought of as meaning "the state of this variable must be considered part of the outwardly visible state, and not messed with". Optimisations that would use it other than literally following the source code are not allowed.
(A note C#. A lot I've seen of late on volatile suggests that people are reading about C++ volatile and applying it to C#, and reading about it in C# and applying it to C++. Really though, volatile behaves so differently between the two as to not be useful to consider them related).
Volatile doesn't try to keep data to a cpu register (100's of times faster than memory). It has to read it from memory every time it is used.
One way to think about a volatile variable is to imagine that it's a virtual property; writes and even reads may do things compiler can't know about. The actual generated code for a writing/reading a volatile variable is simply a memory write or read(*), but the compiler has to regard the code as opaque; it can't make any assumptions under which it might be superfluous. The issue isn't merely with making sure that the compiled code notices that something has caused a variable to change. On some systems, even memory reads can "do" things.
(*) On some compilers, volatile variables may be added to, subtracted from, incremented, decremented, etc. as distinct operations. It's probably useful for a compiler to compile:
volatilevar++;
as
inc [_volatilevar]
since the latter form may be atomic on many microprocessors (though not on modern multi-core PCs). It's important to note, however, that if the statement were:
volatilevar2 = (volatilevar1++);
the correct code would not be:
mov ax,[_volatilevar1] ; Reads it once
inc [_volatilevar] ; Reads it again (oops)
mov [_volatilevar2],ax
nor
mov ax,[_volatilevar1]
mov [_volatilevar2],ax ; Writes in wrong sequence
inc ax
mov [_volatilevar1],ax
but rather
mov ax,[_volatilevar1]
mov bx,ax
inc ax
mov [_volatilevar1],ax
mov [_volatilevar2],bx
Writing the source code differently would allow the generation of more efficient (and possibly safer) code. If 'volatilevar1' didn't mind being read twice and 'volatilevar2' didn't mind being written before volatilevar1, then splitting the statement into
volatilevar2 = volatilevar1;
volatilevar1++;
would allow for faster, and possibly safer, code.
Unless you are on an embedded system, or you are writing hardware drivers where memory mapping is used as the means of communication, you should never ever ever be using volatile
Consider:
int main()
{
volatile int SomeHardwareMemory; //This is a platform specific INT location.
for(int idx=0; idx < 56; ++idx)
{
printf("%d", SomeHardwareMemory);
}
}
Has to produce code like:
loadIntoRegister3 56
loadIntoRegister2 "%d"
loopTop:
loadIntoRegister1 <<SOMEHARDWAREMEMORY>
pushRegister2
pushRegister1
call printf
decrementRegister3
ifRegister3LessThan 56 goto loopTop
whereas without volatile it could be:
loadIntoRegister3 56
loadIntoRegister2 "%d"
loadIntoRegister1 <<SOMEHARDWAREMEMORY>
loopTop:
pushRegister2
pushRegister1
call printf
decrementRegister3
ifRegister3LessThan 56 goto loopTop
The assumption about volatile is that the memory location of the variable may be changed. You are forcing the compiler to load the actual value from memory each time the variable is used; and you tell the compiler that reuse of that value in a register is not allowed.
usually compiler assumes that a program is single threaded, therefore it has complete knowledge of what's happening with variable values. a smart compiler can then prove that the program can be transformed into another program with equivalent semantics but better performance. for example
x = y+y+y+y+y;
can be transformed to
x = y*5;
however, if a variable can be changed outside the thread, compiler doesn't have a complete knowledge of what's going on by simply examining this piece of code. it can no longer make optimizations like above. (edit: it probably can in this case; we need more sophisticated examples)
by default, for performance optimization, single thread access is assumed. this assumption is usually true. unless programmer explicitly instruct otherwise with the volatile keyword.
Related
In ISO/IEC 14882:2003 (C++03) is stated under 7.1.5.1/8, section "The cv-qualifers":
[Note: volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation. See 1.9 for detailed semantics. In general, the semantics of volatile are intended to be the same in C++ as they are in C. ]
These "means", that are undetectable by an implementation has been also already subject of Nawaz´ Q&A Why do we use volatile keyword:
However, sometimes, optimization (of some parts of your program) may be undesirable, because it may be that someone else is changing the value of some_int from outside the program which compiler is not aware of, since it can't see it; but it's how you've designed it. In that case, compiler's optimization would not produce the desired result!
But unfortunately, he missed to explain what these "means", that may change objects from outside the program, can be and how they can change objects.
My Question:
What are examples for these "undetectable means" and how are they be able to change internal objects of a program from outside of it?
A pointer in memory may be visible by other parts of the same or another program. For example, a variable that exists in shared memory and can be changed by another program.
The compiler cannot detect that.
Other examples are hardware-based memory locations.
Generally, apps that need volatile variables usually deal with stuff like asynchronous audio, and at the system level, interrupts, APIC etc. Most apps do not need them.
An imaginary example:
int v = 0;
// Some thread
SetUpdatesOn(&v);
// Another thread
for(;;)
{
int g = v;
std::cout << g;
}
Assume that an imaginary OS-level function SetUpdatesOn periodically changes the variable passed to it. If the variable is not declared volatile, the compiler might optimize the int g = v call or assume that v always has the same value.
If the variable is declared volatile, the compiler will keep reading it in the loop.
Note that very often it is difficult to debug such programming mistakes because the optimization may only exist in release builds.
In C++03 Standard observable behavior (1.9/6) includes reading and writing volatile data. Now I have this code:
int main()
{
const volatile int value = 0;
if( value ) {
}
return 0;
}
which formally initializes a volatile variable and then reads it. Visual C++ 10 emits machine code that makes room on the stack by pushing a dword there, then writes zero into that stack location, then reads that location.
To me it makes no sense - no other code or hardware could possibly know where the local variable is located (since it's in automatic storage) and so it's unreasonable to expect that the variable could have been read/written by any other party and so it can be eliminated in this case.
Is eliminating this variable access allowed? Is accessing a volatile local which address is not known to any other party observable behavior?
The thread's entire stack might be located on a protected memory page, with a handler that logs all reads and writes (and allows them to complete, of course).
However, I don't think MSVC really cares whether or how the memory access might be detected. It understands volatile to mean, among other things, "do not bother applying optimizations to this object". So it doesn't. It doesn't have to make sense, because MSVC is not interested in speeding up this kind of use of volatile.
Since it's implementation-dependent whether and how observable behavior can actually be observed, I think you're right that an implementation can "cheat" if it knows, because of details of the hardware, that the access cannot possibly be detected. Observable behavior that has no physically-detectable effect can be skipped: no matter what the standard says, the means to detect non-conforming behavior are limited to what's physically possible.
If an implementation fails to conform to the standard in a forest, and nobody notices, does it make a sound? Kind of thing.
That's the whole point of declaring a variable volatile: you tell the implementation that that variable may change or be read by means unknown to the implementation itself and that the implementation should refrain from performing optimizations that might impact such access.
When a variable is declared both volatile and const your program may not change it, but it may still be changed from outside. This implies that not only the variable itself but also all read operations on it cannot be optimized away.
no other code or hardware could possibly know
You can look a the assembly (you just did!), figure out the address of the variable, and map it to some hardware for the duration of the call. volatile means the implementation is obliged to account for such things too.
Volatile also applies to your own code.
volatile int x;
spawn_thread(&x);
x = 0;
while (x == 0){};
This will be an endless loop if x is not volatile.
As for the const. I'm unsure whether the compiler can use that to decide.
To me it makes no sense - no other code or hardware could possibly
know where the local variable is located (since it's in automatic
storage)
Really? So if I write an x86 emulator and run your code on it, then that emulator won't know about that local variable?
The implementation can never actually know for sure that the behaviour is unobservable.
My answer is a bit late. Anyway, this statement
To me it makes no sense - no other code or hardware could possibly
know where the local variable is located (since it's in automatic
storage)
is wrong. The difference between volatile or not is actually very observable in VC++2010. For instance, in Release build, you cannot add a break point to a local variable declaration that was eliminated by optimization. Hence, if you need to set a break point to a variable declaration or even just to watch its value in the Debugger, we have to use Debug build. To debug a specific local variable in Release build, we could make use of the volatile keyword:
int _tmain(int argc, _TCHAR* argv[])
{
int a;
//int volatile a;
a=1; //break point here is not possible in Release build, unless volatile used
printf("%d\n",a);
return 0;
}
I know when reading from a location of memory which is written to by several threads or processes the volatile keyword should be used for that location like some cases below but I want to know more about what restrictions does it really make for compiler and basically what rules does compiler have to follow when dealing with such case and is there any exceptional case where despite simultaneous access to a memory location the volatile keyword can be ignored by programmer.
volatile SomeType * ptr = someAddress;
void someFunc(volatile const SomeType & input){
//function body
}
What you know is false. Volatile is not used to synchronize memory access between threads, apply any kind of memory fences, or anything of the sort. Operations on volatile memory are not atomic, and they are not guaranteed to be in any particular order. volatile is one of the most misunderstood facilities in the entire language. "Volatile is almost useless for multi-threadded programming."
What volatile is used for is interfacing with memory-mapped hardware, signal handlers and the setjmp machine code instruction.
It can also be used in a similar way that const is used, and this is how Alexandrescu uses it in this article. But make no mistake. volatile doesn't make your code magically thread safe. Used in this specific way, it is simply a tool that can help the compiler tell you where you might have messed up. It is still up to you to fix your mistakes, and volatile plays no role in fixing those mistakes.
EDIT: I'll try to elaborate a little bit on what I just said.
Suppose you have a class that has a pointer to something that cannot change. You might naturally make the pointer const:
class MyGizmo
{
public:
const Foo* foo_;
};
What does const really do for you here? It doesn't do anything to the memory. It's not like the write-protect tab on an old floppy disc. The memory itself it still writable. You just can't write to it through the foo_ pointer. So const is really just a way to give the compiler another way to let you know when you might be messing up. If you were to write this code:
gizmo.foo_->bar_ = 42;
...the compiler won't allow it, because it's marked const. Obviously you can get around this by using const_cast to cast away the const-ness, but if you need to be convinced this is a bad idea then there is no help for you. :)
Alexandrescu's use of volatile is exactly the same. It doesn't do anything to make the memory somehow "thread safe" in any way whatsoever. What it does is it gives the compiler another way to let you know when you may have screwed up. You mark things that you have made truly "thread safe" (through the use of actual synchronization objects, like Mutexes or Semaphores) as being volatile. Then the compiler won't let you use them in a non-volatile context. It throws a compiler error you then have to think about and fix. You could again get around it by casting away the volatile-ness using const_cast, but this is just as Evil as casting away const-ness.
My advice to you is to completely abandon volatile as a tool in writing multithreadded applications (edit:) until you really know what you're doing and why. It has some benefit but not in the way that most people think, and if you use it incorrectly, you could write dangerously unsafe applications.
It's not as well defined as you probably want it to be. Most of the relevant standardese from C++98 is in section 1.9, "Program Execution":
The observable behavior of the abstract machine is its sequence of reads and writes to volatile data and calls to library I/O functions.
Accessing an object designated by a volatile lvalue (3.10), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression might produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.
Once the execution of a function begins, no expressions from the calling function are evaluated until execution of the called function has completed.
When the processing of the abstract machine is interrupted by receipt of a signal, the values of objects with type other than volatile sig_atomic_t are unspecified, and the value of any object not of volatile sig_atomic_t that is modified by the handler becomes undefined.
An instance of each object with automatic storage duration (3.7.2) is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).
The least requirements on a conforming implementation are:
At sequence points, volatile objects are stable in the sense that previous evaluations are complete and subsequent evaluations have not yet occurred.
At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.
The input and output dynamics of interactive devices shall take place in such a fashion that prompting messages actually appear prior to a program waiting for input. What constitutes an interactive device is implementation-defined.
So what that boils down to is:
The compiler cannot optimize away reads or writes to volatile objects. For simple cases like the one casablanca mentioned, that works the way you might think. However, in cases like
volatile int a;
int b;
b = a = 42;
people can and do argue about whether the compiler has to generate code as if the last line had read
a = 42; b = a;
or if it can, as it normally would (in the absence of volatile), generate
a = 42; b = 42;
(C++0x may have addressed this point, I haven't read the whole thing.)
The compiler may not reorder operations on two different volatile objects that occur in separate statements (every semicolon is a sequence point) but it is totally allowed to rearrange accesses to non-volatile objects relative to volatile ones. This is one of the many reasons why you should not try to write your own spinlocks, and is the primary reason why John Dibling is warning you not to treat volatile as a panacea for multithreaded programming.
Speaking of threads, you will have noticed the complete absence of any mention of threads in the standards text. That is because C++98 has no concept of threads. (C++0x does, and may well specify their interaction with volatile, but I wouldn't be assuming anyone implements those rules yet if I were you.) Therefore, there is no guarantee that accesses to volatile objects from one thread are visible to another thread. This is the other major reason volatile is not especially useful for multithreaded programming.
There is no guarantee that volatile objects are accessed in one piece, or that modifications to volatile objects avoid touching other things right next to them in memory. This is not explicit in what I quoted but is implied by the stuff about volatile sig_atomic_t -- the sig_atomic_t part would be unnecessary otherwise. This makes volatile substantially less useful for access to I/O devices than it was probably intended to be, and compilers marketed for embedded programming often offer stronger guarantees, but it's not something you can count on.
Lots of people try to make specific accesses to objects have volatile semantics, e.g. doing
T x;
*(volatile T *)&x = foo();
This is legit (because it says "object designated by a volatile lvalue" and not "object with a volatile type") but has to be done with great care, because remember what I said about the compiler being totally allowed to reorder non-volatile accesses relative to volatile ones? That goes even if it's the same object (as far as I know anyway).
If you are worried about reordering of accesses to more than one volatile value, you need to understand the sequence point rules, which are long and complicated and I'm not going to quote them here because this answer is already too long, but here's a good explanation which is only a little simplified. If you find yourself needing to worry about the differences in the sequence point rules between C and C++ you have already screwed up somewhere (for instance, as a rule of thumb, never overload &&).
A particular and very common optimization that is ruled out by volatile is to cache a value from memory into a register, and use the register for repeated access (because this is much faster than going back to memory every time).
Instead the compiler must fetch the value from memory every time (taking a hint from Zach, I should say that "every time" is bounded by sequence points).
Nor can a sequence of writes make use of a register and only write the final value back later on: every write must be pushed out to memory.
Why is this useful? On some architectures certain IO devices map their inputs or outputs to a memory location (i.e. a byte written to that location actually goes out on the serial line). If the compiler redirects some of those writes to a register that is only flushed occasionally then most of the bytes won't go onto the serial line. Not good. Using volatile prevents this situation.
Declaring a variable as volatile means the compiler can't make any assumptions about the value that it could have done otherwise, and hence prevents the compiler from applying various optimizations. Essentially it forces the compiler to re-read the value from memory on each access, even if the normal flow of code doesn't change the value. For example:
int *i = ...;
cout << *i; // line A
// ... (some code that doesn't use i)
cout << *i; // line B
In this case, the compiler would normally assume that since the value at i wasn't modified in between, it's okay to retain the value from line A (say in a register) and print the same value in B. However, if you mark i as volatile, you're telling the compiler that some external source could have possibly modified the value at i between line A and B, so the compiler must re-fetch the current value from memory.
The compiler is not allowed to optimize away reads of a volatile object in a loop, which otherwise it'd normally do (i.e. strlen()).
It's commonly used in embedded programming when reading a hardware registry at a fixed address, and that value may change unexpectedly. (In contrast with "normal" memory, that doesn't change unless written to by the program itself...)
That is it's main purpose.
It could also be used to make sure one thread see the change in a value written by another, but it in no way guarantees atomicity when reading/writing to said object.
[edit] For background reading, and to be clear, this is what I am talking about: Introduction to the volatile keyword
When reviewing embedded systems code, one of the most common errors I see is the omission of volatile for thread/interrupt shared data. However my question is whether it is 'safe' not to use volatile when a variable is accessed via an access function or member function?
A simple example; in the following code...
volatile bool flag = false ;
void ThreadA()
{
...
while (!flag)
{
// Wait
}
...
}
interrupt void InterruptB()
{
flag = true ;
}
... the variable flag must be volatile to ensure that the read in ThreadA is not optimised out, however if the flag were read via a function thus...
volatile bool flag = false ;
bool ReadFlag() { return flag }
void ThreadA()
{
...
while ( !ReadFlag() )
{
// Wait
}
...
}
... does flag still need to be volatile? I realise that there is no harm in it being volatile, but my concern is for when it is omitted and the omission is not spotted; will this be safe?
The above example is trivial; in the real case (and the reason for my asking), I have a class library that wraps an RTOS such that there is an abstract class cTask that task objects are derived from. Such "active" objects typically have member functions that access data than may be modified in the object's task context but accessed from other contexts; is it critical then that such data is declared volatile?
I am really interested in what is guaranteed about such data rather than what a practical compiler might do. I may test a number of compilers and find that they never optimise out a read through an accessor, but then one day find a compiler or a compiler setting that makes this assumption untrue. I could imagine for example that if the function were in-lined, such an optimisation would be trivial for a compiler because it would be no different than a direct read.
My reading of C99 is that unless you specify volatile, how and when the variable is actually accessed is implementation defined. If you specify volatile qualifier then code must work according to the rules of an abstract machine.
Relevant parts in the standard are: 6.7.3 Type qualifiers (volatile description) and 5.1.2.3 Program execution (the abstract machine definition).
For some time now I know that many compilers actually have heuristics to detect cases when a variable should be reread again and when it is okay to use a cached copy. Volatile makes it clear to the compiler that every access to the variable should be actually an access to the memory. Without volatile it seems compiler is free to never reread the variable.
And BTW wrapping the access in a function doesn't change that since a function even without inline might be still inlined by the compiler within the current compilation unit.
P.S. For C++ probably it is worth checking the C89 which the former is based on. I do not have the C89 at hand.
Yes it is critical.
Like you said volatile prevents code breaking optimization on shared memory [C++98 7.1.5p8].
Since you never know what kind of optimization a given compiler may do now or in the future, you should explicitly specify that your variable is volatile.
Of course, in the second example, writing/modifying variable 'flag' is omitted. If it is never written to, there is no need for it being volatile.
Concerning the main question
The variable has still to be flagged volatile even if every thread accesses/modifies it through the same function.
A function can be "active" simultaneously in several threads. Imagine that the function code is just a blueprint that gets taken by a thread and executed. If thread B interrupts the execution of ReadFlag in thread A, it simply executes a different copy of ReadFlag (with a different context, e.g. a different stack, different register contents). And by doing so, it could mess up the execution of ReadFlag in thread A.
In C, the volatile keyword is not required here (in the general sense).
From the ANSI C spec (C89), section A8.2 "Type Specifiers":
There are no
implementation-independent semantics
for volatile objects.
Kernighan and Ritchie comment on this section (referring to the const and volatile specifiers):
Except that it should diagnose
explicit attempts to change const
objects, a compiler may ignore these
qualifiers.
Given these details, you can't be guaranteed how a particular compiler interprets the volatile keyword, or if it ignores it altogether. A keyword that is completely implementation dependent shouldn't be considered "required" in any situation.
That being said, K&R also state that:
The purpose of volatile is to force
an implementation to suppress
optimization that could otherwise
occur.
In practice, this is how practically every compiler I have seen interprets volatile. Declare a variable as volatile and the compiler will not attempt to optimize accesses to it in any way.
Most of the time, modern compilers are pretty good about judging whether or not a variable can be safely cached or not. If you find that your particular compiler is optimizing away something that it shouldn't, then adding a volatile keyword might be appropriate. Be aware, though, that this can limit the amount of optimization that the compiler can do on the rest of the code in the function that uses the volatile variable. Some compilers are better about this than others; one embedded C compiler I used would turn off all optimizations for a function that accesses a volatile, but others like gcc seem to be able to still perform some limited optimizations.
Accessing the variable through an accessor function should prevent the function from caching the value. Even if the function is auto-inlined, each call to the function should re-call the function and re-fetch a new value. I have never seen a compiler that would auto-inline the accessor function and then optimize away the data re-fetch. I'm not saying it can't happen (since this is implementation-dependent behavior), but I wouldn't write any code that expects that to happen. Your second example is essentially placing a wrapper API around the variable, and libraries do this without using volatile all the time.
All in all, the treatment of volatile objects in C is implementation-dependent. There is nothing "guaranteed" about them according to the ANSI C89 spec.
Your code is sharing the volatile object between a thread and an interrupt routine. No compiler implementation (that I have ever seen) gives volatile enough power to be sufficient for handling parallel access. You should use some sort of locking mechanism to guarantee that the two threads (in your first example) don't step on each other's toes (even though one is an interrupt handler, you can still have parallel access on a multi-CPU or multi-core system).
Edit: I didn't read the code very closely and so I thought this was a question about thread synchronization, for which volatile should never be used, however this usage looks like it might be OK (Depending on how else the variable in question is used, and if the interrupt is always running such that it's view of memory is (cache-)coherent with the one that the thread sees. In the case of 'can you remove the volatile qualifier if you wrap it in a function call?' the accepted answer is correct, you cannot. I'm going to leave my original answer because it's important for people reading this question to know that volatile is almost useless outside of certain very special cases.
More Edit: Your RTOS use case may require additional protection above and beyond volatile, you may need to use memory barriers in some cases or make them atomic... I can't really tell you for sure, it's just something you need to be careful of (I'd suggest looking at the Linux kernel documentation link I have below though, Linux doesn't use volatile for that kind of thing, very probably with a good reason). Part of when you do and do not need volatile depends very strongly on the memory model of the CPU you're running on, and often volatile is not good enough.
volatile is the WRONG way to do this, it does NOT guarantee that this code will work, it wasn't meant for this kind of use.
volatile was intended for reading/writing to memory mapped device registers, and as such it is sufficient for that purpose, however it DOES NOT help when you're talking about stuff going between threads. (In particular the compiler is still aloud to re-order some reads and writes, as is the CPU while it's executing (this one's REALLY important since volatile doesn't tell the CPU to do anything special (sometimes it means bypass cache, but that's compiler/CPU dependent))
see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html, Intel developer article, CERT, Linux kernel documentation
Short version of those articles, volatile used the way you want to is both BAD and WRONG. Bad because it will make your code slower, wrong because it doesn't actually do what you want.
In practice, on x86 your code will function correctly with or without volatile, however it will be non-portable.
EDIT: Note to self actually read the code... this is the sort of thing volatile is meant to do.
A compiler cannot eliminate or reorder reads/writes to a volatile-qualified variables.
But what about the cases where other variables are present, which may or may not be volatile-qualified?
Scenario 1
volatile int a;
volatile int b;
a = 1;
b = 2;
a = 3;
b = 4;
Can the compiler reorder first and the second, or third and the fourth assignments?
Scenario 2
volatile int a;
int b, c;
b = 1;
a = 1;
c = b;
a = 3;
Same question, can the compiler reorder first and the second, or third and the fourth assignments?
The C++ standard says (1.9/6):
The observable behavior of the
abstract machine is its sequence of
reads and writes to volatile data and
calls to library I/O functions.
In scenario 1, either of the changes you propose changes the sequence of writes to volatile data.
In scenario 2, neither change you propose changes the sequence. So they're allowed under the "as-if" rule (1.9/1):
... conforming implementations are
required to emulate (only) the
observable behavior of the abstract
machine ...
In order to tell that this has happened, you would need to examine the machine code, use a debugger, or provoke undefined or unspecified behavior whose result you happen to know on your implementation. For example, an implementation might make guarantees about the view that concurrently-executing threads have of the same memory, but that's outside the scope of the C++ standard. So while the standard might permit a particular code transformation, a particular implementation could rule it out, on grounds that it doesn't know whether or not your code is going to run in a multi-threaded program.
If you were to use observable behavior to test whether the re-ordering has happened or not (for example, printing the values of variables in the above code), then of course it would not be allowed by the standard.
For scenario 1, the compiler should not perform any of the reorderings you mention. For scenario 2, the answer might depend on:
and whether the b and c variables are visible outside the current function (either by being non-local or having had their address passed
who you talk to (apparently there is some disagreement about how string volatile is in C/C++)
your compiler implementation
So (softening my first answer), I'd say that if you're depending on certain behavior in scenario 2, you'd have to treat it as non-portable code whose behavior on a particular platform would have be determined by whatever the implementation's documentation might indicate (and if the docs said nothing about it, then you're out of luck with a guaranteed behavior.
from C99 5.1.2.3/2 "Program execution":
Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place.
...
(paragraph 5) The least requirements on a conforming implementation are:
At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.
Here's a little of what Herb Sutter has to say about the required behavior of volatile accesses in C/C++ (from "volatile vs. volatile" http://www.ddj.com/hpc-high-performance-computing/212701484) :
what about nearby ordinary reads and writes -- can those still be reordered around unoptimizable reads and writes? Today, there is no practical portable answer because C/C++ compiler implementations vary widely and aren't likely to converge anytime soon. For example, one interpretation of the C++ Standard holds that ordinary reads can move freely in either direction across a C/C++ volatile read or write, but that an ordinary write cannot move at all across a C/C++ volatile read or write -- which would make C/C++ volatile both less restrictive and more restrictive, respectively, than an ordered atomic. Some compiler vendors support that interpretation; others don't optimize across volatile reads or writes at all; and still others have their own preferred semantics.
And for what it's worth, Microsoft documents the following for the C/C++ volatile keyword (as Microsoft-sepcific):
A write to a volatile object (volatile write) has Release semantics; a reference to a global or static object that occurs before a write to a volatile object in the instruction sequence will occur before that volatile write in the compiled binary.
A read of a volatile object (volatile read) has Acquire semantics; a reference to a global or static object that occurs after a read of volatile memory in the instruction sequence will occur after that volatile read in the compiled binary.
This allows volatile objects to be used for memory locks and releases in multithreaded applications.
Volatile is not a memory fence. Assignments to B and C in snippet #2 can be eliminated or performed whenever. Why would you want the declarations in #2 to cause the behavior of #1?
Some compilers regard accesses to volatile-qualified objects as a memory fence. Others do not. Some programs are written to require that volatile works as a fence. Others aren't.
Code which is written to require fences, running on platforms that provide them, may run better than code which is written to not require fences, running on platforms that don't provide them, but code which requires fences will malfunction if they are not provided. Code which doesn't require fences will often run slower on platforms that provide them than would code which does require the fences, and implementations which provide fences will run such code more slowly than those that don't.
A good approach may be to define a macro semi_volatile as expanding to nothing on systems where volatile implies a memory fence, or to volatile on systems where it doesn't. If variables that need to have accesses ordered with respect to other volatile variables but not to each other are qualified as semi-volatile, and that macro is defined correctly, reliable operation will be achieved on systems with or without memory fences, and the most efficient operation that can be achieved on systems with fences will be achieved. If a compiler actually implements a qualifier that works as required, semivolatile, it could be defined as a macro that uses that qualifier and achieve even better code.
IMHO, that's an area the Standard really should address, since the concepts involved are applicable on many platforms, and any platform where fences aren't meaningful can simply ignore them.