volatile-Correctness on Win32/pthreads Threading Functions - C++

After reading this wonderful article, I started digging around trying to volatile-correct some code. One consequence of volatile-correctness (as I understand it) is that methods accessed from different threads should be volatile-qualified.
A simple example might be a mutex using Win32 or pthreads. We could make a class Mutex and give it a field:
#if defined BACKEND_WIN32
CRITICAL_SECTION _lock;
#elif defined BACKEND_POSIX
pthread_mutex_t _lock;
#endif
The ".acquire()" method might look like:
void Mutex::acquire(void) volatile {
#if defined BACKEND_WIN32
    EnterCriticalSection(&_lock);
#elif defined BACKEND_POSIX
    pthread_mutex_lock(&_lock);
#endif
}
Trying this doesn't work. From MSVC:
error C2664: 'void EnterCriticalSection(LPCRITICAL_SECTION)' : cannot convert argument 1 from 'volatile CRITICAL_SECTION *' to 'LPCRITICAL_SECTION'
From g++:
error: invalid conversion from ‘volatile pthread_mutex_t*’ to ‘pthread_mutex_t*’ [-fpermissive]
You'll get similar problems trying to unlock _lock. Two questions:
My impression is that these are just artifacts of the APIs being outdated. Is this in fact the case? Did I misunderstand something?
The whole purpose of a lock or unlock API function is to move into or out of a critical section. Is it therefore the case that I should just use the inadequately named const_cast to cast away the volatile-ness of _lock before passing it to these functions, with no ill effect?

I think the best summary of what the standard (now) intends by volatile is to be found in n3797 S7.1.6.1 /6/7
What constitutes an access to an object that has volatile-qualified type is implementation-defined. If an attempt is made to refer to an object defined with a volatile-qualified type through the use of a glvalue with a non-volatile-qualified type, the program behavior is undefined.
[ Note: volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation. Furthermore, for some implementations, volatile might indicate that special hardware instructions are required to access the object. See 1.9 for detailed semantics. In general, the semantics of volatile are intended to be the same in C++ as they are in C. —end note ]
Sadly, this means that a standards-complying implementation is not required to do as much as that article seems to suggest. There is no obligation to store values promptly and a limited obligation to reload values that might have changed. The compiler is not free to elide code that loads a value that it has not stored, but it is free to elide code that stores a value that it never uses.
As noted in the comments, it's better to use implementation-defined APIs or atomic or a mutex. Volatile will let you down. Stroustrup says as much in the linked question.
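For completeness, the const_cast workaround from question 2 does compile. Below is a minimal pthreads-only sketch (the Win32 branch is omitted, and the fields are public purely for brevity); bear in mind that, per the standard text quoted above, accessing a volatile-qualified object through a non-volatile glvalue is formally undefined, so this illustrates the mechanics rather than endorsing the approach:

```cpp
#include <pthread.h>

// Sketch of the workaround the question asks about: volatile-qualified
// methods use const_cast (which can also strip volatile, despite its
// name) to hand a plain pointer to the C API.
struct Mutex {
    pthread_mutex_t _lock;

    // Constructors/destructors treat *this as non-volatile even for
    // volatile objects, so no cast is needed here.
    Mutex()  { pthread_mutex_init(&_lock, nullptr); }
    ~Mutex() { pthread_mutex_destroy(&_lock); }

    void acquire() volatile {
        // Inside a volatile method, &_lock is volatile pthread_mutex_t*;
        // const_cast removes the volatile qualifier.
        pthread_mutex_lock(const_cast<pthread_mutex_t*>(&_lock));
    }

    void release() volatile {
        pthread_mutex_unlock(const_cast<pthread_mutex_t*>(&_lock));
    }
};
```

In modern C++ you would reach for std::mutex or std::atomic instead; this sketch only shows that the cast silences the conversion errors from the question.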

Related

How to safely zero std::array?

I'm trying to safely zero a std::array in a class destructor. By safely, I mean I want to be sure that the compiler never optimizes this zeroing away. Here is what I came up with:
template<size_t SZ>
struct Buf {
    ~Buf() {
        auto ptr = static_cast<volatile uint8_t*>(buf_.data());
        std::fill(ptr, ptr + buf_.size(), 0);
    }
    std::array<uint8_t, SZ> buf_{};
};
Is this code working as expected? Will the volatile keyword prevent the compiler from optimizing the zeroing away in every case?
The C++ standard itself doesn't make explicit guarantees. It says:
[dcl.type.cv]
The semantics of an access through a volatile glvalue are implementation-defined. ...
[Note 5: volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation.
Furthermore, for some implementations, volatile might indicate that special hardware instructions are required to access the object.
See [intro.execution] for detailed semantics.
In general, the semantics of volatile are intended to be the same in C++ as they are in C.
— end note]
Despite the lack of guarantees by the C++ standard, overwriting the memory through a pointer to volatile is one way that some crypto libraries clear memory - at least as a fallback when a system-specific function isn't available.
P.S. I recommend using const_cast instead, in order to avoid accidentally casting to a different type rather than differently qualified same type:
auto ptr = const_cast<volatile std::uint8_t*>(buf_.data());
Implicit conversion also works:
volatile std::uint8_t* ptr = buf_.data();
System-specific functions for this purpose are SecureZeroMemory on Windows and explicit_bzero in some BSDs and glibc.
The C11 standard has an optional function memset_s for this purpose; it may be available to you in C++ too, but of course it isn't guaranteed to be available.
There is a proposal P1315 to introduce a similar function to the C++ standard.
Note that secure erasure is not the only consideration that has to be taken into account to minimise the possibility of leaking sensitive data. For example, the operating system may swap the memory onto permanent storage unless instructed not to do so. There's no standard way to issue such an instruction in C++. There's mlock in POSIX and VirtualLock on Windows.
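As a concrete illustration of the volatile-pointer fallback described above (the helper name secure_zero is mine, not a standard API):

```cpp
#include <cstddef>

// Each byte is written through a volatile glvalue, so the stores count as
// observable behavior and a conforming compiler should not elide them.
// This is the fallback technique some crypto libraries use when
// SecureZeroMemory / explicit_bzero / memset_s are unavailable.
void secure_zero(void* p, std::size_t n) {
    volatile unsigned char* vp = static_cast<volatile unsigned char*>(p);
    while (n--) {
        *vp++ = 0;
    }
}
```

This carries the same caveat as the answer: the standard makes the semantics implementation-defined, so it is a widely honored convention rather than an ironclad guarantee.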

In C++ - is it possible to compare volatile shared_ptr to nullptr?

It seems like the volatile comparison functions do not exist in the shared_ptr implementation.
Does it even make sense to exist?
Essentially no, the standard doesn't cater for comparisons or boolean conversion on a volatile shared_ptr.
The following fails to compile...
auto main() -> int {
    volatile shared_ptr<int> a;
    if (a == nullptr) // fails
        ; // empty
    if (a) // fails
        ; // empty
}
You could cast the volatile off (via a const_cast), but I'm not sure that will have the desired outcome. From cppreference:
Modifying a const object through a non-const access path and referring to a volatile object through a non-volatile glvalue result in undefined behavior.
In more general terms, in not marking member methods as volatile, the class or library implementors are effectively saying "this is not intended to be used as a volatile object" - if it were, then the appropriate methods or overloads would provide for volatile objects. Similarly, this applies to const: in marking members as const, they are saying "this class can be used as a const object."
Why is the volatile needed? What external sources could be modifying the shared_ptr "without the knowledge of the compiler" (this is one of the uses of volatile)? If there are threading issues, then you would be better served by one of the thread library utilities, or if the requirement is simply turning off optimisations, various compilers have mechanisms for this already (pragmas etc.).
volatile is just an indication to the compiler that the memory may change unexpectedly; it turns off some optimizations. Under the covers it's just a memory address.
The shared pointer is just holding/managing the resource. That said, since std::shared_ptr::get() isn't marked volatile, you cannot really use a shared_ptr to manage a volatile pointer in an accessible way.
I'd recommend using a naked pointer, and using scoped exit or the destructor to clean up after it.
If you are using volatile because the pointer may be modified by another thread, you may want to consider using std::atomic instead of volatile. In threading workflows, before std::atomic, pointers that were accessed from other threads were usually marked volatile. That is no longer best practice.
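Following the std::atomic suggestion above, here is a minimal sketch using the C++11 atomic free functions for shared_ptr (C++20 deprecates these in favor of std::atomic<std::shared_ptr<T>>); the names global, writer, and is_set are illustrative:

```cpp
#include <atomic>
#include <memory>

std::shared_ptr<int> global; // shared between threads

void writer() {
    // Atomically publish a new value; other threads observe either the
    // old pointer or the new one, never a torn value.
    std::atomic_store(&global, std::make_shared<int>(42));
}

bool is_set() {
    // atomic_load returns a consistent snapshot; comparing the snapshot
    // to nullptr is well-defined, unlike comparing a volatile shared_ptr.
    return std::atomic_load(&global) != nullptr;
}
```

This gets you the "may change under my feet" semantics the volatile qualifier was reaching for, plus actual inter-thread synchronization, which volatile never provides.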

C++ - What does volatile represent when applied to a method?

If I have a C++ method declaration as follows:
class A
{
public:
double getPrice() volatile;
};
What does volatile represent here?
What could it be used for?
You might be interested in this Dr Dobbs article by Andrei Alexandrescu. I was :)
Edit:
That article was written a while back and now it seems that the community has moved on. Herb Sutter has this to say. Thanks Iain (and Herb!)
mlimber points out that Andrei had a follow up article here where he continues to advocate the use of volatile correctness as a valuable tool for detecting race conditions on systems supporting POSIX-like mutexes.
You're probably familiar with const methods and const-correctness (cf. "Item 15 - Use const proactively" in C++ Coding Standards by Sutter and Alexandrescu), and volatile works in similar but slightly different ways to yield what might be called "volatile-correctness."
Like const, volatile is a type modifier. When attached to a member function as in your example, either modifier (or both!) mean that the object on which the method is called must have or be convertible to that type.
Consider:
struct A
{
    void f();
    void cf() const;
    void vf() volatile;
    void cvf() const volatile;
    // ...
};

void foo( A& a, const A& ca, volatile A& va, const volatile A& cva )
{
    a.f();     // Ok
    a.cf();    // Ok: Can convert non-const obj to const obj
    a.vf();    // Ok: Can convert non-volatile obj to volatile obj
    a.cvf();   // Ok: Can convert non-cv obj to cv obj

    ca.f();    // Error: can't call non-const method on const obj
    ca.cf();   // Ok
    ca.vf();   // Error: can't call non-const method on const obj
    ca.cvf();  // Ok: Can convert

    va.f();    // Error: can't call non-volatile method on volatile obj
    va.cf();   // Error: can't call non-volatile method on volatile obj
    va.vf();   // Ok
    va.cvf();  // Ok: Can convert

    cva.f();   // Error: can't call non-cv method on cv obj
    cva.cf();  // Error: can't call non-cv method on cv obj
    cva.vf();  // Error: can't call non-cv method on cv obj
    cva.cvf(); // Ok
}
Note these are compile-time errors, not run-time errors, and that is where its potential usefulness comes in.
Const-correctness prevents unintentional errors at compile-time as well as making code "easier to understand, track, and reason about" (Sutter and Alexandrescu). Volatile-correctness can function similarly but is much less used (note that const_cast in C++ can cast away const, volatile, or const volatile, but rather than calling it cv_cast or similar, it's named after const alone because it is far more commonly used for casting away just const).
For instance, in "volatile - Multithreaded Programmer's Best Friend", Andrei Alexandrescu gives some examples of how this can be used to have the compiler automatically detect race conditions in multithreaded code. It has plenty of explanation about how type modifiers work, too, but see also his follow-up comments in his subsequent column.
Update:
Note that C++11 changes the meaning of const. Thus sayeth the Sutter: "const now really does mean 'read-only, or safe to read concurrently'—either truly physically/bitwise const, or internally synchronized so that any actual writes are synchronized with any possible concurrent const accesses so the callers can’t tell the difference."
Elsewhere, he notes that while C++11 has added concurrency primitives, volatile is still not one of them: "C++ volatile variables (which have no analog in languages like C# and Java) are always beyond the scope of this and any other article about the memory model and synchronization. That's because C++ volatile variables aren't about threads or communication at all and don't interact with those things. Rather, a C++ volatile variable should be viewed as a portal into a different universe beyond the language - a memory location that by definition does not obey the language's memory model, because that memory location is accessed by hardware (e.g., written to by a daughter card), has more than one address, or is otherwise 'strange' and beyond the language. So C++ volatile variables are universally an exception to every guideline about synchronization, because they are always inherently 'racy' and unsynchronizable using the normal tools (mutexes, atomics, etc.) and, more generally, exist outside the normal rules of the language and compiler, including that they generally cannot be optimized by the compiler.... For more discussion, see my article 'volatile vs. volatile.'"
It is a volatile member which, just like a const member can only be called on const objects, can only be called on volatile objects.
What's the use? Well, globally volatile is of little use (it is often misunderstood as being applicable to multi-threaded (MT) programming, but that isn't the case in C++; see for instance http://www.drdobbs.com/high-performance-computing/212701484), and volatile class objects are even less useful.
IIRC A. Alexandrescu proposed using the type checking done on volatile objects to statically ensure some properties useful for MT programming (say, that a lock has been taken before calling a member function). Sadly, I couldn't find the article at first. (Here it is: http://www.drdobbs.com/184403766)
Edit: added links from the comments (they were also added to the question).
In member functions (the only functions that can have cv-qualifiers), the const or volatile effectively modifies the this pointer. Therefore, like a const member function can access the object only as if through a const pointer, a volatile member function can access the object only as if through a volatile pointer.
The informal meaning of volatile is that an object can change due to circumstances outside the program (such as memory-mapped I/O or shared memory). The precise meaning is that any access to volatile data must be done in reality as it is written in the code, and may not be optimized out or changed in order relative to other volatile accesses or I/O operations.
What this means is that any operations relating to the object in volatile member functions must be done in order as written.
Further, a volatile member function can only call other volatile (or const volatile) member functions.
As for what use it is... frankly, I can't think of a good use right now. volatile is vital for some data objects, such as pointers to I/O registers, but I can't think of why a volatile member function would be useful.
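One concrete use is the pattern from the Alexandrescu articles cited above: let the type system's volatile checking turn "forgot to take the lock" into a compile-time error. Here is a condensed sketch, using std::mutex as a stand-in for the article's platform mutex; LockingPtr and Counter are illustrative names, not the article's exact code:

```cpp
#include <mutex>

// Shared data is declared volatile. The only way to obtain a plain
// (non-volatile) reference to it is through LockingPtr, which takes the
// lock first. Calling a non-volatile member directly on the volatile
// object fails to compile, so a forgotten lock is caught at compile time.
template <typename T>
class LockingPtr {
public:
    LockingPtr(volatile T& obj, std::mutex& m)
        : obj_(const_cast<T*>(&obj)), lock_(m) {}

    T& operator*()  { return *obj_; }
    T* operator->() { return obj_; }

private:
    T* obj_;                           // volatile cast away: we hold the lock
    std::lock_guard<std::mutex> lock_; // held for this object's lifetime
};

// Illustrative shared structure.
struct Counter {
    int n = 0;
    void inc() { ++n; } // non-volatile: only reachable via LockingPtr
};
```

With `volatile Counter c; std::mutex m;`, the direct call `c.inc()` does not compile, while `LockingPtr<Counter>(c, m)->inc()` does, taking the lock for the duration of the access.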

C++ volatile and operator overloading for CUDA application

I have a class A that I overload its operator=. However it is required that I need to do something like this:
volatile A x;
A y;
x = y;
which raised an error while compiling
error: no operator "=" matches these operands
operand types are: volatile A = A
If I removed volatile, it's compilable. Is there anyway to have this compiled without removing the "volatile" (and still keep the behavior of volatile) ?
Basically this is a CUDA program in which 'x' is in shared memory (all threads can access and modify its value). I want it to be "volatile" in order to avoid the compiler optimization that re-uses a cached value instead of accessing the memory address.
More on the problem: in the beginning A was just a primitive type, e.g. an integer; volatile worked as expected and didn't cause any problem. Now I want it to be a custom class (a 128-bit integer, for example). I'm not sure why C++ complains in this case but not with a primitive data type.
Thanks in advance.
Assuming the volatile qualification is necessary, you'll have to add a volatile assignment operator to A (A& A::operator=(const A&) volatile).
const_cast<A&>(x) = y will make it compile, but will technically cause undefined behaviour, and will certainly remove the guarantees that volatile gives.
The "volatile isn't a lot of use in C++ threading" comment is irrelevant to the question, which is CUDA-specific. volatile is needed for warp-synchronous coding in CUDA.
volatile isn't a lot of use in C++ threading (see Dave Butenhof's explanation at http://www.lambdacs.com/cpt/FAQ.html#Q56). It's not sufficient to ensure your program flushes data written to core-local cache out to a point where other programs can see the updates in shared memory, and given almost everyone's multi-core these days, that's a serious problem. I suggest you use proper threading synchronisation methods, such as boost's if your portability needs match it, or perhaps POSIX mutexes and condition variables; failing that, more architecture-dependent techniques like memory barriers or atomic operations that implicitly sync memory between cores.
I'm sure you want it to be fast, but fast and unstable isn't generally as useful as slow and reliable, especially if you ship a product that's only unstable on your customer's hardware.
Declaring an assignment operator
volatile A& operator=(volatile A&) volatile;
worked for me with nvcc. Note that you may have to pass around the non-primitive type by reference only. Otherwise you'll need additional constructors that convert volatile instances to non-volatile ones whenever the non-primitive type is passed by value to a non-volatile parameter.
It really boils down to establishing volatile-correctness (much like const-correctness).
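To make the fix concrete, here is a minimal sketch of a non-primitive type with a volatile-qualified assignment operator. Int128 and its two-word layout are illustrative, not the asker's actual class, and this compiles as plain C++ (nvcc-specific concerns aside):

```cpp
#include <cstdint>

// Hypothetical 128-bit integer built from two 64-bit words.
struct Int128 {
    std::uint64_t lo = 0, hi = 0;

    Int128() = default;
    Int128(std::uint64_t l, std::uint64_t h) : lo(l), hi(h) {}

    // volatile-qualified assignment: callable when *this is volatile,
    // so `volatile Int128 x; x = y;` now finds a matching operator=.
    volatile Int128& operator=(const Int128& rhs) volatile {
        lo = rhs.lo; // member accesses go through the volatile *this
        hi = rhs.hi;
        return *this;
    }
};
```

A primitive type gets this for free (assignment to a volatile int is built in), which is why the error only appears once A becomes a class type with a compiler-generated, non-volatile operator=.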

Is `volatile` required for shared memory accessed via access function?

[edit] For background reading, and to be clear, this is what I am talking about: Introduction to the volatile keyword
When reviewing embedded systems code, one of the most common errors I see is the omission of volatile for thread/interrupt shared data. However my question is whether it is 'safe' not to use volatile when a variable is accessed via an access function or member function?
A simple example; in the following code...
volatile bool flag = false;

void ThreadA()
{
    ...
    while (!flag)
    {
        // Wait
    }
    ...
}

interrupt void InterruptB()
{
    flag = true;
}
... the variable flag must be volatile to ensure that the read in ThreadA is not optimised out, however if the flag were read via a function thus...
volatile bool flag = false;

bool ReadFlag() { return flag; }

void ThreadA()
{
    ...
    while ( !ReadFlag() )
    {
        // Wait
    }
    ...
}
... does flag still need to be volatile? I realise that there is no harm in it being volatile, but my concern is for when it is omitted and the omission is not spotted; will this be safe?
The above example is trivial; in the real case (and the reason for my asking), I have a class library that wraps an RTOS such that there is an abstract class cTask that task objects are derived from. Such "active" objects typically have member functions that access data than may be modified in the object's task context but accessed from other contexts; is it critical then that such data is declared volatile?
I am really interested in what is guaranteed about such data rather than what a practical compiler might do. I may test a number of compilers and find that they never optimise out a read through an accessor, but then one day find a compiler or a compiler setting that makes this assumption untrue. I could imagine for example that if the function were in-lined, such an optimisation would be trivial for a compiler because it would be no different than a direct read.
My reading of C99 is that unless you specify volatile, how and when the variable is actually accessed is implementation-defined. If you specify the volatile qualifier, then the code must work according to the rules of an abstract machine.
Relevant parts in the standard are: 6.7.3 Type qualifiers (volatile description) and 5.1.2.3 Program execution (the abstract machine definition).
I have known for some time that many compilers actually have heuristics to detect cases when a variable should be re-read and when it is okay to use a cached copy. volatile makes it clear to the compiler that every access to the variable must actually be an access to the memory. Without volatile, it seems the compiler is free to never re-read the variable.
And BTW, wrapping the access in a function doesn't change that, since a function, even without the inline keyword, might still be inlined by the compiler within the current compilation unit.
P.S. For C++ it is probably worth checking C89, on which the former is based. I do not have the C89 standard at hand.
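Restating the point in modern terms: what matters is the qualifier on the variable itself, not the accessor wrapper. Since C++11/C11, the portable way to write this flag is std::atomic, which, unlike volatile, also gives inter-thread ordering guarantees. A minimal sketch of the question's pattern:

```cpp
#include <atomic>

std::atomic<bool> flag{false}; // set by an interrupt / another thread

// The accessor may well be inlined; the load is still a real load,
// because the atomicity lives on the variable, not on the wrapper.
bool ReadFlag() {
    return flag.load(std::memory_order_acquire);
}

void SetFlag() {
    flag.store(true, std::memory_order_release);
}
```

On small embedded targets used from an ISR, it is worth checking that std::atomic<bool> is lock-free (is_always_lock_free in C++17) before relying on this.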
Yes it is critical.
Like you said, volatile prevents code-breaking optimization of shared memory [C++98 7.1.5p8].
Since you never know what kind of optimization a given compiler may do now or in the future, you should explicitly specify that your variable is volatile.
Of course, in the second example, the writing/modifying of the variable 'flag' is not shown. If it is never written to, there is no need for it to be volatile.
Concerning the main question
The variable has still to be flagged volatile even if every thread accesses/modifies it through the same function.
A function can be "active" simultaneously in several threads. Think of the function code as just a blueprint that a thread takes and executes: each thread runs its own activation of ReadFlag, with its own context (e.g. its own stack and register contents). The wrapper function is therefore not a synchronization point, and it does nothing to tell the compiler that the underlying variable may change behind its back - that information has to come from the volatile qualifier on the variable itself.
In C, the volatile keyword is not required here (in the general sense).
From the ANSI C spec (C89), section A8.2 "Type Specifiers":
There are no implementation-independent semantics for volatile objects.
Kernighan and Ritchie comment on this section (referring to the const and volatile specifiers):
Except that it should diagnose explicit attempts to change const objects, a compiler may ignore these qualifiers.
Given these details, you can't be guaranteed how a particular compiler interprets the volatile keyword, or if it ignores it altogether. A keyword that is completely implementation dependent shouldn't be considered "required" in any situation.
That being said, K&R also state that:
The purpose of volatile is to force an implementation to suppress optimization that could otherwise occur.
In practice, this is how practically every compiler I have seen interprets volatile. Declare a variable as volatile and the compiler will not attempt to optimize accesses to it in any way.
Most of the time, modern compilers are pretty good about judging whether or not a variable can be safely cached or not. If you find that your particular compiler is optimizing away something that it shouldn't, then adding a volatile keyword might be appropriate. Be aware, though, that this can limit the amount of optimization that the compiler can do on the rest of the code in the function that uses the volatile variable. Some compilers are better about this than others; one embedded C compiler I used would turn off all optimizations for a function that accesses a volatile, but others like gcc seem to be able to still perform some limited optimizations.
Accessing the variable through an accessor function should prevent the caller from caching the value. Even if the accessor is auto-inlined, each call should re-execute its body and re-fetch a new value. I have never seen a compiler that would auto-inline the accessor function and then optimize away the data re-fetch. I'm not saying it can't happen (since this is implementation-dependent behavior), but I wouldn't write any code that expects it to happen. Your second example is essentially placing a wrapper API around the variable, and libraries do this without using volatile all the time.
All in all, the treatment of volatile objects in C is implementation-dependent. There is nothing "guaranteed" about them according to the ANSI C89 spec.
Your code is sharing the volatile object between a thread and an interrupt routine. No compiler implementation (that I have ever seen) gives volatile enough power to be sufficient for handling parallel access. You should use some sort of locking mechanism to guarantee that the two threads (in your first example) don't step on each other's toes (even though one is an interrupt handler, you can still have parallel access on a multi-CPU or multi-core system).
Edit: I didn't read the code very closely, and so I thought this was a question about thread synchronization, for which volatile should never be used. However, this usage looks like it might be OK (depending on how else the variable in question is used, and on whether the interrupt handler's view of memory is (cache-)coherent with the one that the thread sees). On the question of "can you remove the volatile qualifier if you wrap it in a function call?", the accepted answer is correct: you cannot. I'm going to leave my original answer because it's important for people reading this question to know that volatile is almost useless outside of certain very special cases.
More Edit: Your RTOS use case may require additional protection above and beyond volatile, you may need to use memory barriers in some cases or make them atomic... I can't really tell you for sure, it's just something you need to be careful of (I'd suggest looking at the Linux kernel documentation link I have below though, Linux doesn't use volatile for that kind of thing, very probably with a good reason). Part of when you do and do not need volatile depends very strongly on the memory model of the CPU you're running on, and often volatile is not good enough.
volatile is the WRONG way to do this, it does NOT guarantee that this code will work, it wasn't meant for this kind of use.
volatile was intended for reading/writing memory-mapped device registers, and as such it is sufficient for that purpose; however, it DOES NOT help when you're talking about data shared between threads. In particular, the compiler is still allowed to re-order some reads and writes, and so is the CPU while it is executing. This last point is REALLY important, since volatile doesn't tell the CPU to do anything special (sometimes it means bypass the cache, but that's compiler/CPU dependent).
see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2016.html, Intel developer article, CERT, Linux kernel documentation
Short version of those articles, volatile used the way you want to is both BAD and WRONG. Bad because it will make your code slower, wrong because it doesn't actually do what you want.
In practice, on x86 your code will function correctly with or without volatile, however it will be non-portable.
EDIT: Note to self: actually read the code... this is the sort of thing volatile is meant to do.