Does Boost.Interprocess sacrifice performance to achieve portability


I just read this page of the Boost.Interprocess documentation. It seems to suggest that in order to accommodate the differences among different operating systems and come to some agreement, certain interprocess mechanisms are not implemented with the directly corresponding native mechanism provided by the operating system, but instead emulated using other mechanisms. I'm wondering whether this may impose a considerable performance hit.
The last section on that page is particularly concerning to me; it is quoted below:
Since each mechanism can be emulated through diferent mechanisms (a
semaphore might be implement using mapped files or native semaphores)
permissions types could vary when the implementation of a named
resource changes (eg.: in Windows mutexes require synchronize
permissions, but that's not the case of files). To avoid this,
Boost.Interprocess relies on file-like permissions, requiring file
read-write-delete permissions to open named synchronization mechanisms
(mutex, semaphores, etc.) and appropiate read or read-write-delete
permissions for shared memory. This approach has two advantages: it's
similar to the UNIX philosophy and the programmer does not need to
know how the named resource is implemented.
Based on this text, I'm guessing that most of the kernel objects provided natively by Windows for interprocess synchronization (e.g., Event, Mutex, Semaphore) are simply not used by Boost.Interprocess.

I've seen before that native kernel objects are used.
As I read it, the message speaks of permissions only.
It mentions that this is emulated in case the underlying objects have different access control. It doesn't actually mention how it's emulated.

Related

How to create a single process mutex within C++?

So I'm reading about monitors vs mutexes and finding mentions that suggest that monitors are faster than mutexes because they don't lock system-wide but rather only across the threads of a given process.
Is there some way in C++ to accomplish or simulate this?
Edit: I'm curious now what the difference is between a system-wide mutex and one restricted to a specific process.

The C++ standard does not distinguish between system-wide and per-process primitives, so it does not specify whether std::mutex is system-wide.
Reasonable implementations provide an efficient per-process std::mutex; for a system-wide mutex you'll need to use a library or the operating system objects for your platform.
The difference is that a per-process mutex can rely on ordinary memory operations to avoid system calls, because the process memory is shared among the process's threads. Atomic operations on that memory are more efficient, and they often make a system call unnecessary. A system-wide mutex will either start with system calls (not efficient) or will have to use shared memory (which might be unsafe, and may still have some overhead).
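To make the per-process case concrete, here is a minimal sketch using std::mutex between threads of one process. The helper name count_with_mutex is made up for illustration; the mutex lives in ordinary process memory and is invisible to other processes.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: increments a shared counter from several threads,
// protecting it with a per-process std::mutex.
long count_with_mutex(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    std::mutex m;      // only visible to the threads of this process
    long counter = 0;  // plain variable, guarded by m
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<std::mutex> lock(m);  // unlocks at end of scope
                ++counter;
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling count_with_mutex(4, 10000) should return 40000, since every increment happens under the lock.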

The answer by @Alex Guteniev is as accurate as one can get (and should be considered the accepted answer). It states that the C++ standard doesn't define a system-wide concept, and that mutexes are for all practical purposes per-process, i.e. for synchronization between threads (execution agents) in a single process (and therefore fit your needs). The C++ standard makes it clear what a thread (std::thread) is (33.3 - ... intended to map one-to-one with OS threads (in my draft, at least...N4687)).
Since VC2015, Microsoft has improved its implementation to use Windows primitives, as stated here. This is also indicated here in the most upvoted answer. I've also looked at the Boost library implementation (Boost often precedes/influences the C++ standard) for Microsoft platforms, and (AFAICT) it doesn't use any inter-process calls.
So, to answer your question: in C++, mutexes and monitors are practically the same thing, if this definition is to be considered accurate.

Update: I stumbled across the answer to this while researching something related.
On Windows, Critical Sections can be used within a single process instead of system-wide mutexes, and are often faster.
Edit:
While the above statement is correct, C++ doesn't have the concept of a system-wide mutex. This concept only exists when using OS-specific primitives such as Win32 CreateMutex and is not relevant to standard C++.
Source:
std::mutex performance compared to win32 CRITICAL_SECTION
On Linux, pthread mutexes are per-process by default.

POSIX Shared Memory Sync Across Processes C++/C++11

Problem (in short):
I'm using POSIX shared memory and so far have only used POSIX semaphores, and I need to control multiple readers and multiple writers. I need help with what variables/methods I can use to control access within the limitations described below.
I've found an approach that I want to implement, but I'm unsure how to implement it when using POSIX shared memory.
What I've Found
https://stackoverflow.com/a/28140784
This link has the algorithm I'd like to use, but I'm unsure how to implement it with shared memory. Do I store the class in shared memory somehow? This is where I need help, please.
The reason I'm unsure is that a lot of my research points towards keeping shared memory to primitives only, to avoid addressing problems, and towards STL objects not being usable there.
NOTE:
For all my multi-threading I'm using C++11 features. The shared memory will be used by completely separate program executables, each using C++11 std::thread, and any thread of any process/executable may want access. I have avoided Linux pthreads for all of my multi-threading and will continue to do so (except if it's just a control variable, not actual pthread threads).
Solution Parameters aimed for
Must be shareable between 2+ processes, each of which runs multiple C++11 std::threads that may want access, i.e. multiple writers (exclusive, one at a time) while allowing multiple simultaneous readers when no writer wants access.
Not using Boost libraries. Ideally native C++11 or built-in Linux libraries, something that will work without the need to install additional libraries.
Not using actual pthread threads, but I could use some object from pthreads that will work with C++11 std::thread.
Ideally it can handle a process crash while in operation. E.g. with a POSIX semaphore, if a process crashes while it holds the semaphore, everyone is screwed. I have seen people using file locks?
Thanks in advance

keeping shared memory to primitives only to avoid addressing problems
You can use pointers in and to shared memory objects across programs, so long as the memory is mmapped to the same address in each of them. This is actually a straightforward proposition, especially on 64-bit systems. See this open source C library I wrote for implementation details: rszshm - resizable pointer-safe shared memory.
Using POSIX semaphore if a process crashes while it has the semaphore, everyone is screwed.
If you want to use OS-mediated semaphores, the SysV semaphores have SEM_UNDO, which recovers in this case. OTOH, pthreads offers robust mutexes that can be embedded in shared memory and shared between processes. These can be used to build more sophisticated mechanisms.
The SysV scheme of providing multiple semaphores in a semaphore set, where a group of actions must all succeed or the call blocks, permits building sophisticated mechanisms too. A read/write lock can be made with a set of three semaphores.
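A rough sketch of the robust-mutex idea, assuming Linux/glibc. To keep it self-contained it uses an anonymous shared mapping and fork() rather than shm_open with unrelated executables, and the helper name lock_after_owner_death is made up. It shows that when the owner dies while holding the mutex, the next locker is told about it (EOWNERDEAD) instead of hanging forever.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: puts a robust, process-shared mutex in shared memory,
// lets a child process exit while holding it, and returns the code the parent
// gets from its next pthread_mutex_lock call.
int lock_after_owner_death() {
    void* mem = mmap(nullptr, sizeof(pthread_mutex_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    auto* mtx = static_cast<pthread_mutex_t*>(mem);

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);  // usable across processes
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);     // report dead owners
    pthread_mutex_init(mtx, &attr);
    pthread_mutexattr_destroy(&attr);

    pid_t child = fork();
    if (child < 0) return -1;
    if (child == 0) {
        pthread_mutex_lock(mtx);  // take the lock, then "crash" without unlocking
        _exit(0);
    }
    waitpid(child, nullptr, 0);

    int rc = pthread_mutex_lock(mtx);   // expected: EOWNERDEAD, and we now hold the lock
    if (rc == EOWNERDEAD) {
        // Repair the protected data here, then mark the mutex usable again.
        pthread_mutex_consistent(mtx);
    }
    pthread_mutex_unlock(mtx);
    pthread_mutex_destroy(mtx);
    munmap(mem, sizeof(pthread_mutex_t));
    return rc;
}
```

In a real multi-executable setup the mutex would sit inside the shm_open/mmap region, initialised once by whichever process creates it. pthread_rwlock_t can also be made process-shared, but as far as I know it has no robust variant, so a crash-tolerant reader/writer scheme would have to be built on top of a robust mutex.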

Exchange and store values between different processes

My application needs to store and exchange a value (just one) between different processes. The value is also needed after the application is restarted (but not after a system reboot).
I can write and read this value in a file and synchronize the access. The file could live in a ramfs. This solution would work, but I have the feeling I'm using the wrong method.
Is there a better lightweight solution for this? Am I missing a straightforward approach?
I was thinking about named pipes (mkfifo), but don't they always need an active writer and reader?

You are asking about inter-process communication. There are a number of methods for communicating between processes:
Low-level shared memory
Low-level named pipes
Low-level sockets
Remote procedure call mechanisms (DCOM, Corba, ONC RPC, etc.)
REST api
Distributed system frameworks
Which of these will work best for you depends on the complexity of the messages exchanged between processes, the complexity of the overall system, the need for portability, etc.
Shared memory is a very low-level approach and can feel like the easiest solution "because it's just bytes in memory addressed by a pointer". However, its inherently low-level nature also makes it tedious to use. There is no universally agreed upon C++ interface to these facilities, so you are left with low-level C-style APIs for accessing and configuring shared memory between processes. There are differences between platforms (POSIX does it one way; Windows does it another).
Boost.Interprocess gives you a portable way to access shared memory mechanisms and aims to make using them simpler.
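For a feel of what the low-level C-style API looks like on POSIX, here is a minimal sketch that keeps a single int in a named shared memory object. The helper names store_value and load_value and the object name used below are made up, and there is deliberately no synchronization, so concurrent writers would still need a lock.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helpers: keep one int in a named POSIX shared memory object.
// The object outlives the processes that use it, but not a reboot.
bool store_value(const char* name, int value) {
    assert(name != nullptr);
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);
    if (fd < 0) return false;
    if (ftruncate(fd, sizeof(int)) != 0) { close(fd); return false; }
    void* mem = mmap(nullptr, sizeof(int), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (mem == MAP_FAILED) return false;
    *static_cast<int*>(mem) = value;
    munmap(mem, sizeof(int));
    return true;
}

bool load_value(const char* name, int& out) {
    assert(name != nullptr);
    int fd = shm_open(name, O_RDONLY, 0);
    if (fd < 0) return false;  // e.g. nobody has stored a value yet
    void* mem = mmap(nullptr, sizeof(int), PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (mem == MAP_FAILED) return false;
    out = *static_cast<const int*>(mem);
    munmap(mem, sizeof(int));
    return true;
}
```

One process could call store_value("/my_app_value", 42) and another, later, load_value("/my_app_value", v). The object stays around until shm_unlink is called or the machine reboots. On older glibc versions this needs linking with -lrt.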

C++ how to check if file is in use - multi-threaded multi-process system

C++:
Is there a way to check if a file has been opened for writing by another process/class/device?
I am trying to read files from a folder that may be accessed by other processes for writing. If I read a file that is simultaneously being written to, both the read and the write process give me errors (the writing is incomplete; I might only get a header).
So I must check for some type of condition before I decide whether to open that specific file.
I have been using boost::filesystem to get my file list. I want compatibility with both Unix and Windows.

You must use a file lock. On Unix, this is flock (an advisory lock); on Windows it is LockFile.
However, the fact that your reading process is erroring probably indicates that you have not opened the file in read-only mode in that process. You must specify the correct flags for read-only access, or from the OS's perspective you have two writers.
Both operating systems support reader-writer locks, where unlimited readers are allowed, but only in the absence of writers, and at most one writer at a time will have access.
Since you say your system is multi-process (i.e., not multi-threaded), you can't use a condition variable (unless it's in interprocess shared memory). You also can't use a single writer as a coordinator unless you're willing to shuttle your data there via sockets or shared memory.

From what I understand about boost::filesystem, you're not going to get the granularity you need from that feature-set in order to perform the tasks you're requesting. In general, there are two different approaches you can take:
Use a synchronization mechanism such as a named semaphore visible at the file-system level
Use file-locks (i.e., fcntl or flock on POSIX systems)
Unfortunately both approaches are going to be platform-specific, or at least specific to POSIX vs. Win32.
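As an illustration of the file-lock route on the POSIX side, here is a small sketch using flock. The helper name try_lock_exclusive is made up. Keep in mind the lock is advisory: it only helps if every reader and writer takes the lock before touching the file.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical helper: try to take an exclusive advisory lock without blocking.
// Returns the open descriptor on success (the lock is held until it is closed),
// or -1 if the file cannot be opened or is locked through another descriptor.
int try_lock_exclusive(const char* path) {
    assert(path != nullptr);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {  // EWOULDBLOCK when someone else holds it
        close(fd);
        return -1;
    }
    return fd;
}
```

A writer would call this before writing and close the descriptor when done; a reader could use LOCK_SH instead of LOCK_EX so that several readers can proceed together. On Windows the equivalent would go through LockFile/LockFileEx, which is not shown here.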

A very nice solution can be found here using Sutter's active object: https://sites.google.com/site/kjellhedstrom2/active-object-with-cpp0x
This is quite advanced but scales really well on many cores.

Why do libraries implement their own basic locks on windows?

Windows provides a number of objects useful for synchronising threads, such as events (with SetEvent and WaitForSingleObject), mutexes and critical sections.
Personally I have always used them, especially critical sections, since I'm pretty certain they incur very little overhead unless already locked. However, looking at a number of libraries, such as Boost, people tend to go to a lot of trouble to implement their own locks using the interlocked functions on Windows.
I can understand why people would write lock-less queues and such, since that's a specialised case, but is there any reason why people choose to implement their own versions of the basic synchronisation objects?

Libraries aren't implementing their own locks. That is pretty much impossible to do without OS support.
What they are doing is simply wrapping the OS-provided locking mechanisms.
Boost does it for a couple of reasons:
They're able to provide a much better designed locking API, taking advantage of C++ features. The Windows API is C only, and not very well-designed C at that.
They are able to offer a degree of portability. The same Boost API can be used if you run your application on a Linux machine or on a Mac. Windows' own API is obviously Windows-specific.
The Windows-provided mechanisms have a glaring disadvantage: they require you to include windows.h, which you may want to avoid for a large number of reasons, not least its extreme macro abuse polluting the global namespace.
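To illustrate what "wrapping the OS-provided locking mechanism" can look like, here is a minimal sketch over pthreads (a Windows build would wrap CRITICAL_SECTION in the same shape). This is not Boost's actual code; the class names os_mutex and os_lock_guard are made up.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical wrapper: a thin C++ class over the OS-provided pthread mutex.
class os_mutex {
public:
    os_mutex() {
        int rc = pthread_mutex_init(&m_, nullptr);
        assert(rc == 0);
        (void)rc;  // silence unused warning when assertions are disabled
    }
    ~os_mutex() { pthread_mutex_destroy(&m_); }
    os_mutex(const os_mutex&) = delete;
    os_mutex& operator=(const os_mutex&) = delete;

    void lock() { pthread_mutex_lock(&m_); }
    void unlock() { pthread_mutex_unlock(&m_); }
    bool try_lock() { return pthread_mutex_trylock(&m_) == 0; }

private:
    pthread_mutex_t m_;
};

// RAII guard: unlocks on scope exit, including when an exception propagates.
class os_lock_guard {
public:
    explicit os_lock_guard(os_mutex& m) : m_(m) { m_.lock(); }
    ~os_lock_guard() { m_.unlock(); }
    os_lock_guard(const os_lock_guard&) = delete;
    os_lock_guard& operator=(const os_lock_guard&) = delete;

private:
    os_mutex& m_;
};
```

User code only ever writes os_lock_guard guard(m); and never sees the platform header or has to remember the matching unlock call, which is the API advantage described above.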

One particular reason I can think of is portability. Windows locks are just fine on their own, but they are not portable to other platforms. A library which wishes to be portable must implement its own lock to guarantee the same semantics across platforms.

In many libraries (e.g. Boost) you need to write cross-platform code, so using WaitForSingleObject and SetEvent is a no-go. Also, there are common idioms, like monitors and condition variables, that the Win32 API lacks (but they can be implemented using these basic primitives).
Some lock-free data structures, like atomic counters, are very useful. For example, boost::shared_ptr uses them in order to be thread safe without the overhead of a critical section, and most compilers (not MSVC) use atomic counters to implement a thread-safe copy-on-write std::string.
Some things, like queues, can be implemented very efficiently in a thread-safe way without locks at all, which may give a significant performance boost in certain applications.
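The atomic-counter idea can be sketched with std::atomic. This is only an illustration of the technique, not how boost::shared_ptr is actually written; the names ref_count and churn_references are made up.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical reference counter: no lock, just atomic read-modify-write.
class ref_count {
public:
    void add_ref() { count_.fetch_add(1, std::memory_order_relaxed); }
    // Returns true when the caller has just dropped the last reference.
    bool release() { return count_.fetch_sub(1, std::memory_order_acq_rel) == 1; }
    long use_count() const { return count_.load(std::memory_order_acquire); }

private:
    std::atomic<long> count_{1};  // starts with one owner
};

// Hypothetical stress helper: many threads take and drop references at once.
void churn_references(ref_count& rc, int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&rc, iterations] {
            for (int i = 0; i < iterations; ++i) {
                rc.add_ref();
                rc.release();
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

After churn_references returns, use_count() is back to 1 however the threads interleaved, and no thread ever blocked on a lock.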

There may occasionally be good reasons for implementing your own locks that don't use the Windows OS synchronization objects. But doing so is a "sharp stick." It's easy to poke yourself in the foot.
Here's an example: If you know that you are running the same number of threads as there are hardware contexts, and if the latency of waking up one of those threads which is waiting for a lock is very important to you, you might choose a spin lock implemented completely in user space. If the waiting thread is the only thread spinning on the lock, the latency of transferring the lock from the thread that owns it to the waiting thread is just the latency of moving the cache line to the owner thread and back to the waiting thread -- orders of magnitude faster than the latency of signaling a thread with an OS lock under the same circumstances.
But the scenarios where you want to do this are pretty narrow. As soon as you start having more software threads than hardware threads, you'll likely regret it. In that scenario, you could spend entire OS scheduling quanta doing nothing but spinning on your spin lock. And, if you care about power, spinlocks are bad because they prevent the processor from going into a low-power state.
I'm not sure I buy the portability argument. Portable libraries often have an OS portability layer that abstracts the different OS APIs for synchronization. If you're dealing with locks, a pthread_mutex can be made semantically the same as a Windows Mutex or Critical Section under an abstraction layer. There are some exceptions here, but for most people this is true. If you're dealing with Windows Events or POSIX condition variables, well, those are tougher to abstract. (Vista did introduce POSIX-style condition variables, but not many Windows software developers are in a position to require Vista...)
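For reference, a user-space spin lock of the kind described can be sketched with std::atomic_flag. The names spin_lock and count_with_spin_lock are made up, and the caveats above apply in full: it only makes sense when waiters are few and critical sections are very short.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical spin lock: waits by looping in user space, never calls the OS.
class spin_lock {
public:
    void lock() {
        // test_and_set returns the previous value: true means someone else holds it.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical helper: increments a plain counter under the spin lock.
long count_with_spin_lock(int threads, int iterations) {
    assert(threads > 0 && iterations >= 0);
    spin_lock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

With as many threads as hardware contexts the hand-off is just a cache line moving between cores; with more threads than cores, a waiter can burn its whole time slice in the while loop.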

Writing locking code for a library is useful if that library is meant to be cross-platform. Users of the library can use the library's locking functionality and not have to care about the underlying platform implementation. Assuming the library has versions for all the platforms being targeted, it's one less bit of code that has to be ported.
