In a C++ Linux app, what is the simplest way to get the functionality that the Interlocked functions on Win32 provide? Specifically, a lightweight way to atomically increment or add 32 or 64 bit integers?
Just a few notes to clarify the issue, which has nothing to do with Linux.
RMW (read-modify-write) operations, and any others that do not execute in a single step, need hardware support to execute atomically; these include increments, decrements, fetch_and_add, etc.
For some architectures (including i386, AMD64 and IA-64), gcc has built-in support for atomic memory access, so no external library is required. Here you can read some information about the API.
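As a rough sketch of what that looks like in practice, the wrappers below map the two most common Interlocked calls onto gcc's __sync builtins. The wrapper names are made up for illustration; only __sync_add_and_fetch and __sync_fetch_and_add come from gcc. The 64-bit variants assume a target with native 64-bit atomics (for example x86-64).

```cpp
#include <stdint.h>

// Hypothetical wrappers around gcc's __sync builtins (full-barrier atomics).

// Like InterlockedIncrement: returns the value AFTER the increment.
inline int32_t atomic_increment32(volatile int32_t* target) {
    return __sync_add_and_fetch(target, 1);
}

inline int64_t atomic_increment64(volatile int64_t* target) {
    return __sync_add_and_fetch(target, 1);
}

// Like InterlockedExchangeAdd: returns the value BEFORE the addition.
inline int32_t atomic_exchange_add32(volatile int32_t* target, int32_t delta) {
    return __sync_fetch_and_add(target, delta);
}

inline int64_t atomic_exchange_add64(volatile int64_t* target, int64_t delta) {
    return __sync_fetch_and_add(target, delta);
}
```

The difference between the "add_and_fetch" and "fetch_and_add" forms is only which value is returned; both perform the same atomic update.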
Intel's open-source Threading Building Blocks has a template, atomic, that offers the same functionality as .NET's Interlocked class.
Unlike gcc's Atomic built-ins, it's cross platform and doesn't depend on a particular compiler. As Nemanja Trifunovic correctly points out above, it does depend on the compare-and-swap CPU instruction provided by x86 and Itanium chips. I guess you wouldn't expect anything else from an Intel library : )
Strictly speaking, Linux cannot offer atomic "interlocked" functions like ones in Win32, simply because these functions require hardware support, and Linux runs on some platforms that don't offer that support. Having said that, if you can constrain yourself to Intel x86/x64, take a look at the implementation of reference counting in Boost shared pointers library.
The atomic functions from the Apache Portable Runtime are really close to the Win32 InterlockedXXX functions.
You can insert some assembly code in your source to use the x86 interlocked instructions directly.
You should use a lock xadd operation.
See for instance this.
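A minimal sketch of the lock xadd approach, assuming a gcc-compatible compiler (the inline-assembly syntax is gcc's, and the function name is made up for illustration). On non-x86 targets it falls back to the compiler builtin so that the snippet still builds.

```cpp
// Atomically adds 'value' to *target and returns the previous contents.
inline int fetch_and_add(volatile int* target, int value) {
#if defined(__i386__) || defined(__x86_64__)
    // xadd swaps the register with memory and stores the sum in memory;
    // the lock prefix makes the whole read-modify-write atomic.
    __asm__ __volatile__("lock; xaddl %0, %1"
                         : "+r"(value), "+m"(*target)
                         :
                         : "memory", "cc");
    return value;  // the register now holds the old memory contents
#else
    return __sync_fetch_and_add(target, value);  // non-x86 fallback (gcc builtin)
#endif
}
```

In most code the builtin alone is the simpler choice; hand-written assembly mainly makes sense when the compiler offers nothing equivalent.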
The fairly common glib library that's used in GTK and QT programming as well as standalone offers a variety of atomic operations. See http://library.gnome.org/devel/glib/2.16/glib-Atomic-Operations.html for a list. There are g_atomic functions for most of the operations that Interlocked supports on Win32, and on platforms where the hardware directly supports these, they are inlined as the needed assembly code.
Upon further review, this looks promising. Yay stack overflow.
Related
When implementing condition variables into a Win32 C++ program, would it be better to use Win32 functions, classes, and data types (e.g. CreateThread, SleepConditionVariableCS, WaitForSingleObjectEx, ReleaseMutex, CONDITION_VARIABLE) or those from the C++11 standard libraries (e.g. thread, wait, join, unlock, condition_variable)?
Since the answer to this question is probably not binary, what considerations should one take into account when making such a decision?
The C++ synchronization mechanisms are designed to C++ principles. They free their resources in the destructor, and they also use RAII to ensure safe locking. They use exceptions to signal errors.
Essentially, they are much harder to use incorrectly than the function-based native Windows API. This means that if you can use them (your implementation supports them), you always should use them.
Oh, and they are cross-platform.
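To make the RAII point concrete, here is a small sketch of a blocking queue built on the C++11 classes. The class and member names are invented for illustration. Note that there is no explicit unlock or cleanup call anywhere: the lock objects release the mutex when they go out of scope, even if an exception is thrown.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical example class: a queue whose pop() blocks until data arrives.
class BlockingQueue {
public:
    void push(int value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);  // unlocked at end of scope
            items_.push(value);
        }
        ready_.notify_one();  // wake one waiting consumer, if any
    }

    int pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        // wait() releases the mutex while sleeping and re-acquires it on wake-up;
        // the predicate guards against spurious wake-ups.
        ready_.wait(lock, [this] { return !items_.empty(); });
        int value = items_.front();
        items_.pop();
        return value;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<int> items_;
};
```

The Win32 equivalent would need InitializeCriticalSection/DeleteCriticalSection and matching Enter/Leave calls on every code path, which is exactly where mistakes tend to creep in.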
One consideration should be what your compiler can handle. For example, when you install MinGW on Windows, you can choose whether to install the API for POSIX threads or Win32 threads. On the other hand, if you use TDM-GCC, you should be aware that versions 4.7.1 and lower use Win32 threads, while versions 4.8.1 and higher use POSIX threads. And as woolstar mentioned above, if you're using Microsoft's compiler, you should check to see whether the bugs in its support for these classes have been worked out.
If your compiler supports POSIX threads, you can use the C++ thread classes of the Standard Library (e.g. thread, mutex, condition_variable). If your compiler supports Win32 threads, you can use the Win32 thread functions.
In my case, I originally had TDM-GCC 4.7.1 and tried to use the C++ Standard Library classes, but that didn't work (for reasons explained above). So I installed MinGW by itself and chose "posix" in the "threads" option of the installer. Then I was able to use those classes.
What are the advantages and disadvantages of using the Interlocked winapi functions instead of a library that provides atomic operations on the Win32 platform?
Portability is not an issue.
If portability is not a concern then you're basically down to deciding whom you trust more to get this right. A library is generally designed to provide portability. It otherwise has a tough time competing with an OS provided implementation that's been battle-hardened for over 15 years.
Check this thread to see an example of how the obvious implementation is not in fact the best.
The Interlocked winapi functions work on old processors even when there is no CPU support for locked operations. That means the 386 and maybe the 486, so it is not really an issue today unless you still support Win9x and older NT.
It would likely depend upon the specific atomic library in question.
A good library with a specific back-end would likely end up with the same implementation: a couple of ASM instructions that issue an x86 lock instruction and do the work. And, assuming the library itself is portable, it would in turn make your code portable.
A naive atomic implementation might do something heavier like use a mutex to protect a normal variable. I don't know of any that do - just making the point for argument.
As such, given your stated non-portability requirements, using the Win32 functions should be fine. Alternately, go ahead with an Atomic version, but perhaps look at the actual implementation.
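To illustrate the two kinds of implementation being contrasted, here is a sketch using C++11's std::atomic as a stand-in for "a good atomic library" (std::atomic is not mentioned in the question; it is just a convenient example). Both class names are made up. The is_lock_free() call is one way to look at what an implementation actually does on your platform.

```cpp
#include <atomic>
#include <mutex>

// Backed by the hardware where possible (typically a lock-prefixed
// instruction on x86), much like InterlockedIncrement.
class AtomicCounter {
public:
    AtomicCounter() : value_(0) {}
    long increment() { return value_.fetch_add(1) + 1; }  // returns the new value
    bool is_lock_free() const { return value_.is_lock_free(); }
private:
    std::atomic<long> value_;
};

// The "naive" variant: an ordinary variable guarded by a mutex.
// Correct, but heavier than a single atomic instruction.
class MutexCounter {
public:
    MutexCounter() : value_(0) {}
    long increment() {
        std::lock_guard<std::mutex> lock(mutex_);
        return ++value_;
    }
private:
    std::mutex mutex_;
    long value_;
};
```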
Is there a free, portable (Windows, GNU/Linux & MacOSX) library providing a lock-free atomic swap function?
If not, how would it be implemented for each of these platforms? (x86 with VC++ or g++)
Thanks
There's a lock-free library pending review in boost. Also if you dig into source of boost smart pointers library you will find atomic ops inlined for multiple platforms. Another one - Intel Threading Building Blocks has implementation of atomic<> template.
Depends what you want to swap. In assembler for x86 you might be able to get a "nearly" atomic xor swap, otherwise I'd go with some solution that uses locking, which will differ on Win32/{Linux,Darwin}.
If you are looking for a library, have a look at APR (Apache Portable Runtime) - http://apr.apache.org/
Boost has a set of macros for facilitating lock-free operations in a portable way.
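For the "how would it be implemented" part of the question, a minimal sketch for VC++ and g++ might look like the following. The function name atomic_swap is made up; InterlockedExchange is the Win32 call, and __atomic_exchange_n is a gcc builtin that assumes g++ 4.7 or later (or clang). Older g++ versions offer __sync_lock_test_and_set instead, which is documented as an acquire barrier only.

```cpp
#if defined(_MSC_VER)
#include <windows.h>

// Stores 'value' in *target and returns what was there before.
inline long atomic_swap(volatile long* target, long value) {
    return InterlockedExchange(target, value);
}
#else
// Same contract, using the gcc/clang builtin with sequentially
// consistent ordering.
inline long atomic_swap(volatile long* target, long value) {
    return __atomic_exchange_n(target, value, __ATOMIC_SEQ_CST);
}
#endif
```

On x86 both branches typically compile down to an xchg instruction, which is implicitly locked when it has a memory operand.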
It is easy to set memory barriers on the kernel side: the macros mb, wmb, rmb, etc. are always in place thanks to the Linux kernel headers.
How to accomplish this on the user side?
You are looking for the full memory barrier atomic builtins of gcc.
Please note that the reference I gave here says,
The [following] builtins are intended to be compatible with those described in the Intel Itanium Processor-specific Application Binary Interface, section 7.4. As such, they depart from the normal GCC practice of using the “__builtin_” prefix, and further that they are overloaded such that they work on multiple types.
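As a sketch of how the full-barrier builtin is typically used in user space, here is a pre-C++11-style publish/consume pair. The Mailbox struct and function names are invented for illustration; __sync_synchronize() is the gcc builtin. Under the C++11 memory model you would use std::atomic for the flag instead.

```cpp
// Hypothetical one-slot mailbox shared between a writer and a reader.
struct Mailbox {
    int payload;
    volatile int ready;
};

inline void publish(Mailbox* box, int value) {
    box->payload = value;
    __sync_synchronize();  // payload must be visible before the flag is set
    box->ready = 1;
}

inline bool try_read(Mailbox* box, int* out) {
    if (!box->ready) return false;
    __sync_synchronize();  // do not read the payload before seeing the flag
    *out = box->payload;
    return true;
}
```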
Posix defines a number of functions as acting as memory barriers. Memory locations must not be concurrently accessed; to prevent this, use synchronization - and that synchronization will also work as a barrier.
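A minimal sketch of that approach: if every access to the shared data goes through a pthread mutex, no explicit barrier is needed, because the lock and unlock calls already synchronize memory. The struct and function names here are made up for illustration.

```cpp
#include <pthread.h>

// Hypothetical shared value that is only ever touched under its mutex.
struct SharedTotal {
    pthread_mutex_t lock;
    long total;
};

inline void shared_total_init(SharedTotal* s) {
    pthread_mutex_init(&s->lock, NULL);
    s->total = 0;
}

inline long shared_total_add(SharedTotal* s, long delta) {
    pthread_mutex_lock(&s->lock);    // also acts as a memory barrier
    s->total += delta;
    long result = s->total;
    pthread_mutex_unlock(&s->lock);  // makes the update visible to the next locker
    return result;
}

inline void shared_total_destroy(SharedTotal* s) {
    pthread_mutex_destroy(&s->lock);
}
```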
Use libatomic_ops. http://www.hpl.hp.com/research/linux/atomic_ops/
It's not compiler-specific, and less buggy than the GCC stuff. It's not a giganto-library that provides tons of functionality you don't care about. It just provides atomic operations. Also, it's portable to different CPU architectures.
Linux x64 means you can use the Intel memory barrier instructions. You might wrap them in macros similar to those in the Linux headers, if those macros aren't appropriate or accessible to your code.
__sync_synchronize() in GCC 4.4+
The Intel Memory Ordering White Paper, a section from Volume 3A of Intel 64 and IA-32 manual http://developer.intel.com/Assets/PDF/manual/253668.pdf
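A sketch of what such wrapper macros might look like with gcc inline assembly. The user_ prefix is made up to avoid clashing with the kernel's own mb/rmb/wmb names. The x86 branch uses the mfence/lfence/sfence instructions; other targets fall back to __sync_synchronize().

```cpp
#if defined(__x86_64__) || defined(__i386__)
// The "memory" clobber also stops the compiler from reordering across the fence.
#define user_mb()  __asm__ __volatile__("mfence" ::: "memory")  /* full barrier  */
#define user_rmb() __asm__ __volatile__("lfence" ::: "memory")  /* read barrier  */
#define user_wmb() __asm__ __volatile__("sfence" ::: "memory")  /* write barrier */
#else
// Conservative fallback: a full barrier for all three.
#define user_mb()  __sync_synchronize()
#define user_rmb() __sync_synchronize()
#define user_wmb() __sync_synchronize()
#endif

// Example use: make a store globally visible before anything that follows.
inline void store_with_full_fence(volatile int* flag, int value) {
    *flag = value;
    user_mb();
}
```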
The Qprof profiling library (nothing to do with Qt) also includes in its source code a library of atomic operations, including memory barriers. They work on many compilers and architectures. I'm using it on a project of mine.
http://www.hpl.hp.com/research/linux/qprof/download.php4
The include/arch/qatomic_*.h headers of a recent Qt distribution include (LGPL) code for a lot of architectures and all kinds of memory barriers (acquire, release, both).
Simply borrow the barriers defined for the Linux kernel: just add those macros to your header file: http://lxr.linux.no/#linux+v3.6.5/arch/x86/include/asm/barrier.h#L21 . And of course, give the Linux developers credit in your source code.
Been doing mostly Java and smattering of .NET for last five years and haven't written any significant C or C++ during that time. So have been away from that scene for a while.
If I want to write a C or C++ program today that does some multi-threading and is source code portable across Windows, Mac OS X, and Linux/Unix - is PThread a good choice?
The C or C++ code won't be doing any GUI, so won't need to worry with any of that.
For the Windows platform, I don't want to bring a lot of Unix baggage, though, in terms of unix emulation runtime libraries. Would prefer a PThread API for Windows that is a thin-as-possible wrapper over existing Windows threading APIs.
ADDENDUM EDIT:
Am leaning toward going with boost::thread - I also want to be able to use C++ try/catch exception handling too. And even though my program will be rather minimal and not particularly OOPish, I like to encapsulate using class and namespace - as opposed to C disembodied functions.
Well, pthreads is the old POSIX standard for writing threaded programs. It's the lowest-level set of threading routines, so it's a good choice for cross-platform threading.
However, there are alternatives:
boost::thread - an STL-style threading library
Intel's Threading Building Blocks and OpenMP - both of these are a higher-level way of writing threaded apps without needing to make any threading calls.
As these alternatives are all fully supported on all platforms, they are perhaps a better choice (pthreads requires a bit of compiler setup, as it is only part of the Windows POSIX subsystem, unless you want to use Pthreads-w32). boost::thread is more like a threading library; the other two are high-level ways of achieving parallelism without needing to code 'threads'. They allow you to write loops that run concurrently automatically (subject to common-sense conditions).
Boost::thread is not a C compatible library though.
edit: cross-platform abilities of the above:
Intel TBB is cross-platform (Windows*, Linux*, and Mac OS* X), supports 32-bit and 64-bit applications and works with Intel, Microsoft and GNU compilers.
OpenMP depends on the compiler you want to use, but GCC and/or Intel compilers have supported OpenMP on Windows, Linux and MacOS.
If you need your code to be truly portable then it may be best to stay away from the various libraries that scatter the internet. At some point you'll find a platform they don't support and will then have to create your own branch.
This is also not a hard problem to solve and can be a good exercise for creating cross-platform code.
I'd suggest you create a class, e.g. CThread, that has separate .cpp implementations for each platform and a pure-virtual execute() function that is called after your thread is constructed/run.
That allows all of your thread-creation and sleep/shutdown/priority code to be implemented using the most appropriate API for the platform. You may also need a header (e.g. ThreadTypes.h) that contains defines/typedefs for each platform.
E.g.
// ThreadTypes.h
// Per-platform typedefs used by the threading classes.
#if defined(PLATFORM_WIN) || defined(PLATFORM_XBOX)
#include <windows.h>   // for DWORD
typedef DWORD ThreadID;
#elif defined(PLATFORM_PS3)
// etc etc
#endif
This is how I have written all my cross-platform threading code for platforms such as PC/PS2/PS3/360/Wii. It is also a good pattern to follow for things like mutexes and semaphores, which if you have threads you're certain to need at some point :)
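A rough sketch of what such a CThread class might look like. For brevity it keeps both platforms in one file behind #if rather than in separate .cpp files as described, and it covers only start/join. Everything except CThread and execute() (NativeThread, start, join, entry, AnswerThread) is a name made up for illustration.

```cpp
#if defined(_WIN32)
#include <windows.h>
typedef HANDLE NativeThread;
#else
#include <pthread.h>
typedef pthread_t NativeThread;
#endif

class CThread {
public:
    CThread() : started_(false) {}
    virtual ~CThread() {}

    // Creates the native thread, which then calls execute().
    bool start() {
        if (started_) return false;
#if defined(_WIN32)
        handle_ = CreateThread(NULL, 0, &CThread::entry, this, 0, NULL);
        started_ = (handle_ != NULL);
#else
        started_ = (pthread_create(&handle_, NULL, &CThread::entry, this) == 0);
#endif
        return started_;
    }

    // Blocks until execute() has returned. Call before destroying the object.
    void join() {
        if (!started_) return;
#if defined(_WIN32)
        WaitForSingleObject(handle_, INFINITE);
        CloseHandle(handle_);
#else
        pthread_join(handle_, NULL);
#endif
        started_ = false;
    }

protected:
    virtual void execute() = 0;  // the thread body, supplied by subclasses

private:
    // Platform-specific trampoline from the native entry point to execute().
#if defined(_WIN32)
    static DWORD WINAPI entry(LPVOID self) {
        static_cast<CThread*>(self)->execute();
        return 0;
    }
#else
    static void* entry(void* self) {
        static_cast<CThread*>(self)->execute();
        return NULL;
    }
#endif

    CThread(const CThread&);             // non-copyable
    CThread& operator=(const CThread&);

    NativeThread handle_;
    bool started_;
};

// Example subclass: stores a result that is safe to read after join().
class AnswerThread : public CThread {
public:
    AnswerThread() : answer(0) {}
    int answer;
protected:
    virtual void execute() { answer = 42; }
};
```

Typical use is to construct an AnswerThread, call start(), do other work, then call join() before reading answer. On older Linux toolchains the build may need -pthread.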
Nope, pthreads aren't normally available on Windows. (There are a few attempts at implementing it, but it's not supported by the OS directly, at least.)
If you're writing C++, Boost is, as usual, the answer. Boost.Thread has a portable (and safer) threading library.
In C, the simplest solution is probably to write a common wrapper for both pthreads and the Windows threading API.
I will bet on ZThread
Simple API, easier to use than PThreads and FREE
Have a look at ting also:
http://code.google.com/p/ting/
It is cross platform between Windows and Linux. No Mac OS support yet.