I was wondering why the new C++11 standard added threads and not processes.
Couldn't they have written a wrapper around the platform-specific functions?
Any suggestions on the most portable way to do multiprocessing? fork()? OpenMP?
If you can use Qt, the QProcess class could be an elegant, platform-independent solution.
If you want to do this portably, I'd suggest you avoid calling fork() directly and instead write your own library function that can be mapped onto a combination of fork() and exec() on systems where those are available. If you're careful you can give your function the same or similar semantics as CreateProcess() on Win32.
UNIX systems tend to have a quite different approach to processes and process management compared to Windows-based systems, so it's non-trivial to make all but the simplest wrappers portable.
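As a rough illustration, here is what the POSIX half of such a wrapper might look like. The name spawn_process and its interface are invented for this sketch; a real version would add a CreateProcess()-based branch for Win32 and proper error handling.

```
// Minimal sketch of the POSIX side of a portable process-spawning wrapper.
// spawn_process() and its interface are invented for illustration only.
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>
#include <vector>

pid_t spawn_process(const std::string& program, const std::vector<std::string>& args)
{
    pid_t pid = fork();
    if (pid == 0) {
        // Child: build an argv array and replace the process image.
        std::vector<char*> argv;
        argv.push_back(const_cast<char*>(program.c_str()));
        for (const std::string& a : args)
            argv.push_back(const_cast<char*>(a.c_str()));
        argv.push_back(nullptr);
        execvp(program.c_str(), argv.data());
        _exit(127);  // Only reached if exec failed.
    }
    return pid;      // Parent: child pid on success, -1 if fork failed.
}

// The caller reaps the child roughly the way WaitForSingleObject() would be
// used on Win32:
//   int status = 0;
//   waitpid(spawn_process("ls", {"-l"}), &status, 0);
```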
Of course, if you have C++11 or Boost available I'd just stick with that. If you don't have any globals (which is generally a good thing anyway) and don't set up any shared data another way, then the practical differences between threads and processes on modern systems are slim. All the threads you create can make progress independently of each other in the same way that processes can.
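For completeness, a minimal C++11 sketch of that last point: two threads with no shared state, each making progress on its own, much as two processes would (output order may interleave).

```
#include <iostream>
#include <thread>

// Each thread works on independent data; no globals, no shared state.
void worker(int id)
{
    std::cout << "worker " << id << " running\n";
}

int main()
{
    std::thread a(worker, 1);
    std::thread b(worker, 2);
    a.join();
    b.join();
}
```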
Failing that, you could look at an MPI implementation if message passing suits your task, or at a batch scheduler system.
I am using Boost.Interprocess.
It does not provide a way to create new processes, but once they exist, it allows them to communicate.
In this particular case I can create the processes I need from a shell script.
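For anyone curious, here is a rough sketch of the kind of thing Boost.Interprocess makes easy once the processes exist; the segment name "my_shm" and the size are just examples.

```
// One of the shell-launched processes creates a named shared-memory segment;
// the others open the same name and see the same bytes.
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <cstring>

namespace bip = boost::interprocess;

int main()
{
    bip::shared_memory_object shm(bip::open_or_create, "my_shm", bip::read_write);
    shm.truncate(4096);                              // Size the segment once.
    bip::mapped_region region(shm, bip::read_write); // Map it into this process.
    std::memset(region.get_address(), 0, region.get_size());
    // ... read/write through region.get_address() ...
    // bip::shared_memory_object::remove("my_shm");  // Cleanup when finished.
}
```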
My question is quite simple.
On Linux it is quite common to use fork() without exec().
However, I have found that on macOS this is not possible (see the fork manual page):
https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/fork.2.html
There are limits to what you can do in the child process. To be totally safe you should restrict yourself to only executing async-signal safe operations until such time as one of the exec functions is called. All APIs, including global data symbols, in any framework or library should be assumed to be unsafe after a fork() unless explicitly documented to be safe or async-signal safe. If you need to use these frameworks in the child process, you must exec. In this situation it is reasonable to exec yourself.
This seems strange to me. What is the reason? Is it possible to work around it?
It's okay to use fork on OS X, under the same restrictions you would observe when using fork on Linux; Linux has similar caveats (the linked write-up is also available via the Wayback Machine).
If you are building an application that is single-threaded and relies on core UNIX APIs and design philosophy, you should be fine. If you are linking to additional libraries, you should be intimately familiar with their behavior. Imagine linking to a library that started a background thread: after forking you would be in a potentially undefined state, since only the thread that called fork is cloned into the child.
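To make the man page's "exec yourself" escape hatch concrete, here is a hedged sketch of that pattern: fork, then immediately re-exec the same binary with a flag so the child restarts from a clean state. The "--child" flag is purely illustrative.

```
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstring>
#include <cstdio>

int main(int argc, char* argv[])
{
    if (argc > 1 && std::strcmp(argv[1], "--child") == 0) {
        std::printf("running as the re-exec'd child\n");
        return 0;
    }
    pid_t pid = fork();
    if (pid == 0) {
        // Child: do only async-signal-safe work before exec.
        execl(argv[0], argv[0], "--child", static_cast<char*>(nullptr));
        _exit(127);  // Only reached if exec failed.
    }
    waitpid(pid, nullptr, 0);
    return 0;
}
```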
OS X offers some awesome features for taking advantage of multiple cores, such as Grand Central Dispatch, which may be worth considering.
I'd recommend you read this article by Mike Ash on fork safety under OS X.
I want to build a portable and efficient server in C++; it will have lots of clients trying to connect at the same time, so it must be able to handle requests in parallel.
I have been trying to find documentation, guides... etc. for multithreading. I have found a lot about POSIX Pthread, but almost nothing for GNU Pth (apart from the official manual in gnu.org).
So, can anyone explain to me the difference between POSIX Pthreads and GNU Pth? Please, I'd like the answer not to be a copy of Wikipedia's contents (keep in mind that I'm an absolute newbie to multithreading). I want my server to be portable and efficient across all *nix-based systems, while staying away from heavy fork() calls.
Thanks for your help.
PS: I think it's better to ask this here: what about Windows? Are Pthreads or Pth an option there? If not, what is the API for that operating system?
Use Pthreads; it's much more widely used, so there is far more information and support available for it. I've never met anyone who actually uses GNU Pth. Better yet, if you are using C++11, use std::thread, and if not, use boost::thread.
So, can anyone explain to me the difference between POSIX Pthreads and GNU Pth?
Pthreads is a cross-platform standard for preemptive multithreading, meaning (usually) the OS kernel manages the threads and the OS scheduler decides when each thread gets to run (if you have a single core, only one thread can run at a time; if you have multiple cores, multiple threads can run at the same time). The OS scheduler can pause any thread at (almost) any time and let another thread run, so each thread gets a limited "time slice" before other threads get to run.
GNU Pth is a non-preemptive user-space threading library, meaning the threads, and which ones run at which time, are decided in user space rather than by the kernel. Some people say programs using non-preemptive threading libraries are easier to reason about, because your thread won't get paused at arbitrary times for another thread to run.
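As a rough illustration of the cooperative model (based on my reading of the Pth manual, so treat it as a sketch rather than gospel): Pth threads only switch at explicit scheduling points such as pth_yield().

```
#include <pth.h>
#include <cstdio>

// Two cooperative threads: each runs until it voluntarily yields.
static void* ticker(void* name)
{
    for (int i = 0; i < 3; ++i) {
        std::printf("%s: tick %d\n", static_cast<const char*>(name), i);
        pth_yield(NULL);   // Explicitly hand control to another Pth thread.
    }
    return NULL;
}

int main()
{
    pth_init();
    pth_t a = pth_spawn(PTH_ATTR_DEFAULT, ticker, (void*)"a");
    pth_t b = pth_spawn(PTH_ATTR_DEFAULT, ticker, (void*)"b");
    pth_join(a, NULL);
    pth_join(b, NULL);
    pth_kill();
}
```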
I want my server to be portable and efficient across all *nix-based systems, while staying away from heavy fork() calls.
fork is not heavy on UNIX.
What about Windows? Are Pthreads or Pth an option there? If not, what is the API for that operating system?
There are Pthreads implementations for Windows, but they're not native to the OS. I don't know whether GNU Pth works on Windows; I doubt it, unless you use Cygwin. Windows has its own Win32 threading model.
Using std::thread or boost::thread is portable to POSIX platforms and Windows, and makes certain parts of the API easier to use (specifically, locking and unlocking mutexes can be easily done in an exception safe way and condition variables are easier to use.)
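A small sketch of what that means in practice (the queue of ints is just a placeholder workload): std::lock_guard releases the mutex even if an exception is thrown, and std::condition_variable keeps the wait/notify logic compact.

```
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

std::mutex m;
std::condition_variable cv;
std::queue<int> work;

void producer()
{
    {
        std::lock_guard<std::mutex> lock(m);  // Unlocks automatically, even on exceptions.
        work.push(42);
    }
    cv.notify_one();
}

void consumer()
{
    std::unique_lock<std::mutex> lock(m);
    cv.wait(lock, [] { return !work.empty(); });  // The predicate handles spurious wakeups.
    std::cout << "got " << work.front() << "\n";
}

int main()
{
    std::thread c(consumer), p(producer);
    p.join();
    c.join();
}
```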
GNU Pth is for a very limited use case: you want to use a multi-threaded implementation paradigm but you don't want to use multiple CPUs or cores and you don't want to rely on any OS or kernel-level support. Since almost all general-purpose CPUs now have multiple cores, this use case is increasingly irrelevant.
Windows has a separate threading model from POSIX; if you want your application to be cross-platform it is best to use a cross-platform threading library such as boost::thread.
I think GNU Pth is meant for C in the first place. You can use it from C++ too, but C++ has its own threading facilities anyway.
There are quite a few applications using Pth, such as low-level disc-burning tools (so GUI tools like K3b and Brasero depend on Pth); GnuPG, the Arch Linux package manager, and some multimedia software use it as well.
On Windows it's always a bit complicated. Microsoft never got over the fact that C is the programming language from and for UNIX systems, and so suffers from NIH syndrome (Not Invented Here).
So they do a lot of things differently, without any real advantage, just to be different.
If you are writing an application that should run everywhere and it is not low-level, use Qt with its QThread and QThreadPool (see the sketch below):
It's 100% the same on all operating systems.
You need much less code.
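A minimal sketch of what that looks like with QThreadPool; it assumes a working Qt build environment, and the Task class is just an example.

```
#include <QThreadPool>
#include <QRunnable>
#include <QDebug>

class Task : public QRunnable
{
public:
    void run() override { qDebug() << "task running in the global thread pool"; }
};

int main()
{
    // The pool takes ownership of the QRunnable and deletes it after run().
    QThreadPool::globalInstance()->start(new Task);
    QThreadPool::globalInstance()->waitForDone();
    return 0;
}
```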
If you write a "low-level" application, I recommend splitting it into a backend and a frontend, writing a separate backend for each OS, and using whichever library causes the fewest problems.
I have a GNU Radio application which uses both Python and C++ code. I want to be able to signal an event to the C++ code. If they were in the same scope I would normally use a simple boolean, but the code is separated to the point where some form of shared memory is required. The code in question is performance-critical, so an efficient method is required.
I was initially thinking about a shared memory segment that is accessible by both Python and C++. Therefore I could set a flag in the python code and check it from C++. Since I just need a simple flag to pause the C++ code, would a semaphore suffice?
To be clear, I need to set a flag from Python and the C++ code will simply check this flag, and if it is set enter a busy loop.
So would trying to implement a shared memory segment between Python/C++ be a reasonable approach? How about a semaphore? On Linux, which is easier to implement?
Thanks!
Assuming this is two separate applications on one machine and you need decent real-time performance, you don't want to go with sockets. I would use a flag in shared memory, and probably a semaphore to make sure both programs can't access the flag at once. This library provides access to semaphores and shared memory from Python and supports Python versions 2.4-3.1 (not 3.0): http://semanchuk.com/philip/posix_ipc
EDIT: Changed recommendation to using a semaphore protecting the flag in shared memory
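For reference, here is a hedged sketch of what the C++ side of that could look like, using the plain POSIX calls that the posix_ipc module wraps. The names "/grc_flag" and "/grc_sem" are illustrative, error checking is omitted, and older Linux systems may need -lrt/-lpthread at link time.

```
#include <fcntl.h>
#include <semaphore.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdint>

int main()
{
    // Shared 32-bit flag; the Python side opens the same names via posix_ipc.
    int fd = shm_open("/grc_flag", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(uint32_t));
    uint32_t* flag = static_cast<uint32_t*>(
        mmap(nullptr, sizeof(uint32_t), PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    sem_t* sem = sem_open("/grc_sem", O_CREAT, 0600, 1);

    // Inside the performance-critical loop: briefly take the semaphore,
    // read the flag Python sets, then release and spin/sleep while paused.
    sem_wait(sem);
    bool paused = (*flag != 0);
    sem_post(sem);
    (void)paused;

    sem_close(sem);
    munmap(flag, sizeof(uint32_t));
    close(fd);
}
```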
Why not open a Unix socket? Or use DBus?
If Boost is an option, you could use Boost.Python and Boost.Interprocess. Boost.Python gives you a way for Python & C++ objects to interact and Boost.Interprocess gives you plenty of options for shared memory or synchronization primitives across process boundaries.
DBus looks promising. It supports signals, so you should be able to stop an application on demand. However, I'm not sure whether its performance will be enough for you.
You can try using custom signals. I don't know whether Python code can send custom signals, but your C/C++ code can certainly define custom signal handling with SIGIO.
If you have stringent response-time requirements, you might need to look beyond your application code and into some type of OS with support for real-time signals (rt-linux, muOs, etc.).
I suppose Boost.MPI and Boost.Interprocess are different, right?
From a performance perspective, which is faster? Has anyone ever done benchmarking?
Can I use them to pass data within the same process (i.e. among different threads)?
Thanks!
They are totally different. Boost.MPI is for parallel/distributed computing (think massively parallel supercomputers). It requires an existing installation of MPI (Message Passing Interface), such as Open MPI. MPI is usually used with high-performance clusters of networked computers or with supercomputers. The Boost.MPI library is basically just a nice wrapper around the normal MPI function calls.
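To give a flavour of the difference, here is a tiny Boost.MPI sketch; it assumes an MPI installation plus the boost_mpi and boost_serialization libraries, and would be launched with something like mpirun -n 2.

```
#include <boost/mpi.hpp>
#include <iostream>
#include <string>

namespace mpi = boost::mpi;

int main(int argc, char* argv[])
{
    mpi::environment env(argc, argv);   // Initializes and finalizes MPI.
    mpi::communicator world;

    if (world.rank() == 0) {
        std::string msg = "hello from rank 0";
        world.send(1, 0, msg);          // (destination rank, tag, value)
    } else if (world.rank() == 1) {
        std::string msg;
        world.recv(0, 0, msg);          // (source rank, tag, value)
        std::cout << msg << std::endl;
    }
    return 0;
}
```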
Boost.Interprocess, on the other hand, is an API for IPC (Interprocess Communications), i.e. communicating between two processes on a single computer.
If you want to share data between processes on the same computer, Boost.Interprocess is useful. But if, as you suggest, you just want to share data between threads, you don't need any of this. You just need a threading API.
I am looking for a cross-platform C++ master/worker library or work queue library. The general idea is that my application would create some sort of Task or Work objects and pass them to the work master or work queue, which would in turn execute the work in separate threads or processes. To provide a bit of context, the application is a CD ripper, and the tasks that I want to parallelize are things like "rip track", "encode WAV to MP3", etc.
My basic requirements are:
Must support a configurable number of concurrent tasks.
Must support dependencies between tasks, such that tasks are not executed until all tasks that they depend on have completed.
Must allow for cancellation of tasks (or at least not prevent me from coding cancellation into my own tasks).
Must allow for reporting of status and progress information back to the main application thread.
Must work on Windows, Mac OS X, and Linux
Must be open source.
It would be especially nice if this library also:
Integrated with Qt's signal/slot mechanism.
Supported the use of threads or processes for executing tasks.
By way of analogy, I'm looking for something similar to Java's ExecutorService or some other similar thread pooling library, but in cross-platform C++. Does anyone know of such a beast?
Thanks!
It has been long enough since I used it that I'm not positive whether it exactly meets your needs, but check out the Adaptive Communication Environment (ACE). This library allows you to construct "active objects" which have work queues and execute their main body in their own threads, as well as thread pools that can be shared among objects. You can then pass queued work objects to active objects for them to process, and objects can be chained in various ways. The library is fairly heavy and there is a lot to learn, but a couple of books have been written about it and there's a fair amount of tutorial material available online as well. It should be able to do everything you want and more; my only concern is whether it provides the interfaces you are looking for out of the box or whether you'd need to build on top of it to get exactly what you want.
I think this calls for Intel's Threading Building Blocks, which pretty much does what you want.
Check out Intel's Threading Building Blocks library.
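As a rough sketch of how the rip/encode dependency could map onto TBB's tbb::task_group (the rip_track/encode_track functions are placeholders for the asker's own work items):

```
#include <tbb/task_group.h>
#include <cstdio>

void rip_track(int n)    { std::printf("ripping track %d\n", n); }
void encode_track(int n) { std::printf("encoding track %d\n", n); }

int main()
{
    tbb::task_group tg;
    for (int track = 1; track <= 4; ++track) {
        tg.run([track] {
            rip_track(track);      // Dependency: encode only after rip is done.
            encode_track(track);
        });
    }
    tg.wait();                     // Block until all tasks have completed.
}
```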
Sounds like you require some kind of "time-sharing system".
There are some good open-source ones out there, but I don't know whether they have built-in Qt slot support.
This is probably huge overkill for what you need, but still worth mentioning:
BOINC is a distributed framework for such tasks. There's a main server that gives out tasks to perform and a cloud of workers that do its bidding. It is the framework behind projects like SETI@home and many others.
See this post for creating threads using the Boost library in C++:
Simple example of threading in C++
(it is a C++ thread even though the title says C)
Basically, create your own "master" object that takes a "runnable" object and starts it running in a new thread.
Then you can create new classes that implement "runnable" and hand them to your master runner any time you want.
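A bare-bones sketch of that master/runnable idea, here using std::thread (boost::thread is nearly identical); the Master and Runnable names are just illustrative, and there is no pooling, cancellation, or progress reporting.

```
#include <iostream>
#include <memory>
#include <thread>
#include <vector>

struct Runnable {
    virtual ~Runnable() = default;
    virtual void run() = 0;
};

class Master {
public:
    // Start the given task in its own thread.
    void submit(std::shared_ptr<Runnable> task) {
        threads_.emplace_back([task] { task->run(); });
    }
    void join_all() {
        for (auto& t : threads_) t.join();
    }
private:
    std::vector<std::thread> threads_;
};

struct RipTrack : Runnable {
    void run() override { std::cout << "ripping...\n"; }
};

int main() {
    Master master;
    master.submit(std::make_shared<RipTrack>());
    master.join_all();
}
```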