Cross-Platform Threading/Forking-with-static-variables in C/C++

I'm trying to write a server program which can keep a track of the number of instances of some object.
At the moment I'm using a static int which is incremented during the object's constructor:
class myObj {
public:
    static int numOfInstances;  // one counter per process, shared by all instances
    myObj();
};
int myObj::numOfInstances = 0;
myObj::myObj() {
    ++numOfInstances;  // count this instance
}
But I also want to fork for each connection, with a child process handling each one and the parent constantly listening for new connections.
If I use fork(), each child process is unaware of new connections, and new objects created due to them.
I think threading might be a solution, but I'm not sure if threading is cut out for this kind of thing (most of the program would run in the thread). Even if it is, it's not in the ANSI standard, so I'd rather find a solution which uses fork.
If there's no sane solution with fork, which threading solution do people recommend? I'm writing for Linux, but I'd much prefer a cross-platform solution.

Multiprocessing is not part of the C++ standard. However, if you are on a POSIX system (where you have fork()), you can obtain shared memory from the operating system; look at the shmget() family of functions (System V) or shm_open() and mmap() (POSIX). You will need some synchronisation mechanism for access to the shared memory (like a mutex or a semaphore); those are also provided.
I suggest man shm_overview and man sem_overview as starting points.
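A minimal sketch of the shared-counter idea on Linux, under some assumptions: it uses an anonymous shared mapping from mmap() rather than shmget() or shm_open(), and because the shared state is a single integer it substitutes a lock-free std::atomic<int> for the mutex or semaphore mentioned above. The name count_across_forks is made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks `children` processes, each of which bumps a
// counter living in memory shared with the parent. Returns the final count,
// or -1 if the mapping could not be created.
int count_across_forks(int children) {
    assert(children >= 0);

    // MAP_SHARED | MAP_ANONYMOUS: the mapping stays shared across fork().
    void* mem = mmap(nullptr, sizeof(std::atomic<int>), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;

    // Assumption: std::atomic<int> is lock-free on this platform, so it can
    // be used from several processes that map the same page.
    auto* counter = new (mem) std::atomic<int>(0);

    int forked = 0;
    for (int i = 0; i < children; ++i) {
        pid_t pid = fork();
        if (pid == 0) {             // child: stands in for "an object was created"
            counter->fetch_add(1);
            _exit(0);
        }
        if (pid > 0) ++forked;
    }
    for (int i = 0; i < forked; ++i) wait(nullptr);  // reap every child

    int result = counter->load();
    munmap(mem, sizeof(std::atomic<int>));
    return result;
}
```

With a plain static int, each child would only increment its own private copy after fork(); here every process updates the same page. Anything more complex than one integer would need a real lock in the shared region.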

I don't really know how resource sharing works on POSIX systems (so I can't tell you whether to simply fork() or use threads) but there's the portable Boost.Thread library, in addition to pthreads, if you decide to go that way.
Note that there's also a race condition in your code: two threads, whether in the same process or in different processes sharing memory, cannot write to the same location without some kind of synchronization.
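If you go the threading route, one way to remove that race is to make the counter atomic. This is only a sketch: it assumes C++11 is available, and the copy constructor and decrementing destructor are my additions (the question only increments), on the assumption that you want to count live objects.

```cpp
#include <atomic>
#include <cassert>

class myObj {
public:
    static std::atomic<int> numOfInstances;    // safe to update from any thread

    myObj() { ++numOfInstances; }              // atomic increment
    myObj(const myObj&) { ++numOfInstances; }  // a copy is an instance too
    ~myObj() {
        assert(numOfInstances > 0);            // every destructor matches a constructor
        --numOfInstances;                      // assumption: track live objects only
    }
};

std::atomic<int> myObj::numOfInstances{0};
```

This only helps between threads of one process; after fork() each process still has its own copy of the static.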

Related

Is it possible to use fork in modern C++?

Traditional C++ was very straightforward and only a library intended to create threads (like pthread) gave rise to other threads.
Modern C++ is much closer to Java with many functions being thread based, with thread pools ready to run asynchronous jobs, etc. It's much more likely that some library, including the standard library, uses threads to compute asynchronously some function, or sets up the infrastructure to do so even if it isn't used.
In that context, is it ever safe to use functions with global impact like fork?

The answer to this question, like almost everything else in C++, is "it depends".
If we assume there are other threads in the program, and those threads are synchronizing with each other, calling fork is dangerous. This is because fork does not wait for all threads to reach a synchronization point (e.g. a mutex release) before forking the process. In the forked process, only the thread that called fork will be present; the others simply do not exist there, and may have been cut off in the middle of a critical section. This means any memory shared with other threads that wasn't a std::atomic<int> or similar is in an undefined state.
If your forked process reads from this memory, or indeed expects the other threads to be running, it is likely not going to work reliably. However, most uses of fork actually have effectively no preconditions on program state. That is because the most common thing to do is to immediately call execv or similar to spawn a subprocess. In this case your entire process is kinda "replaced" by some new process, and all memory from your old process is discarded.
tl;dr - Calling fork may not be safe in multithreaded programs. Sometimes it is safe, e.g. if no threads have been spawned yet, or if execv is called immediately. If you are using fork for something else, consider using a thread instead.
See the fork man page and this helpful blog post for the nitty-gritty.
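For the common fork-then-exec case, a rough sketch might look like this (POSIX only; run_and_wait is a made-up name, not something from the answer above):

```cpp
#include <cassert>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: run the program at `path` with `argv` and wait for it.
// Returns the child's exit status (127 if execv failed), or -1 if fork or
// waitpid failed or the child did not exit normally.
int run_and_wait(const char* path, char* const argv[]) {
    assert(path != nullptr && argv != nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;  // fork failed
    if (pid == 0) {
        // Child: only the forking thread exists here. Go straight to exec
        // without touching locks or heap state inherited from the parent.
        execv(path, argv);
        _exit(127);          // reached only if execv failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child does nothing between fork and execv, so it never depends on state that another thread may have left half-updated.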

To add to peteigel's answer, my advice is - if you want to fork, do it very early, before any other threads than the main thread are started.
In general, anything you can do in C, you can do in C++, since C++, especially on Linux with clang or gcc extensions, is pretty darn close to a perfect superset of C. Of course, when there are good portable APIs in std C++, use them. The canonical example is preferring std::thread over pthreads C API.
One caveat is pthread_cancel, which must be avoided on C++ due to exceptions. See e.g. pthread cancel harmful on C++.
Here is another link that explains the problem:
pthread_cancel while in destructor
In general, C++ cleanup handling is easier and more elegant than in C, since RAII is part and parcel of C++ culture, and C does not have destructors.

POSIX Shared Memory Sync Across Processes C++/C++11

Problem (in short):
I'm using POSIX shared memory, so far with POSIX semaphores only, and I need to control multiple readers and multiple writers. I need help with what variables/methods I can use to control access within the limitations described below.
I've found an approach that I want to implement, but I'm unsure how to implement it on top of POSIX shared memory.
What I've Found
https://stackoverflow.com/a/28140784
This link has the algorithm I'd like to use, but I'm unsure how to implement it with shared memory. Do I store the class in shared memory somehow? This is where I need help, please.
The reason I'm unsure is that a lot of my research points towards keeping shared memory to primitives only, to avoid addressing problems, and says STL objects can't be used.
NOTE:
For all my multi-threading I'm using C++11 features. This shared memory will be used by completely separate program executables, each using C++11 std::thread, and any thread of any process/executable may want access. I have avoided Linux pthreads for all of my multi-threading and will continue to do so (except if it's just a control variable, not actual pthreads).
Solution Parameters aimed for
Must be shareable between 2+ processes which will be running multiple C++11 std::thread that may wish access. I.e. Multiple Writers (exclusive one at a time) while allowing multiple simultaneous readers when no writer wants access.
Not using Boost libraries. Ideally native C++11 or built-in Linux libraries, something that will work without the need to install extra libraries.
Not using actual pthread threads, but could use some object from there that will work with C++11 std::thread.
Ideally can handle a process crash while in operation. E.g. Using POSIX semaphore if a process crashes while it has the semaphore, everyone is screwed. I have seen people using file locks?
Thanks in advance

keeping shared memory to primitives only to avoid addressing problems
You can use pointers in and to shared memory objects across programs, so long as the memory is mmaped to the same address. This is actually a straightforward proposition, especially on 64 bit. See this open source C library I wrote for implementation details: rszshm - resizable pointer-safe shared memory.
Using POSIX semaphore if a process crashes while it has the semaphore, everyone is screwed.
If you want to use OS mediated semaphores, the SysV semaphores have SEM_UNDO, which recovers in this case. OTOH pthread offers robust mutexes that can be embedded and shared in shared memory. This can be used to build more sophisticated mechanisms.
The SysV scheme of providing multiple semaphores in a semaphore set, where a group of actions must all succeed or the call blocks, permits building sophisticated mechanisms too. A read/write lock can be made with a set of three semaphores.
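To illustrate the robust mutex option, here is a rough Linux sketch. The helper names are made up, and it uses an anonymous shared mapping plus fork() only to keep the example short; unrelated executables would put the mutex in a named segment from shm_open() instead. It shows a plain mutex surviving the death of its owner, not the reader/writer algorithm from the question.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Create a robust, process-shared mutex in shared memory (nullptr on failure).
pthread_mutex_t* make_shared_robust_mutex() {
    void* mem = mmap(nullptr, sizeof(pthread_mutex_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return nullptr;
    auto* mutex = static_cast<pthread_mutex_t*>(mem);

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    int rc = pthread_mutex_init(mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    if (rc != 0) {
        munmap(mem, sizeof(pthread_mutex_t));
        return nullptr;
    }
    return mutex;
}

// Lock the mutex. If the previous owner died while holding it, mark the
// mutex consistent again and report that through `owner_died`.
// Returns 0 on success or an errno value.
int lock_robust(pthread_mutex_t* mutex, bool* owner_died) {
    assert(mutex != nullptr && owner_died != nullptr);
    *owner_died = false;
    int rc = pthread_mutex_lock(mutex);
    if (rc == EOWNERDEAD) {
        // The protected data may be half-updated; repair it here, then:
        *owner_died = true;
        rc = pthread_mutex_consistent(mutex);
    }
    return rc;
}

// Demonstration: a child takes the lock and exits without releasing it.
// Returns true if the parent still gets the lock and sees the owner's death.
bool survives_owner_death() {
    pthread_mutex_t* mutex = make_shared_robust_mutex();
    if (mutex == nullptr) return false;

    pid_t pid = fork();
    if (pid == 0) {
        pthread_mutex_lock(mutex);
        _exit(0);  // simulated crash while holding the lock
    }
    if (pid < 0) return false;
    waitpid(pid, nullptr, 0);

    bool owner_died = false;
    bool ok = lock_robust(mutex, &owner_died) == 0 && owner_died;
    if (ok) pthread_mutex_unlock(mutex);
    munmap(mutex, sizeof(pthread_mutex_t));
    return ok;
}
```

The mutex is a plain C object, so it can live in shared memory and be locked from std::thread threads without creating any pthread threads yourself. On older glibc versions you may need to link with -pthread.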

Multiplatform multiprocessing?

I was wondering why the new C++11 added threads and not processes.
Couldn't they have written a wrapper around the platform-specific functions?
Any suggestion about the most portable way to do multiprocessing? fork()? OpenMP?

If you could use Qt, QProcess class could be an elegant platform independent solution.

If you want to do this portably I'd suggest you avoid calling fork() directly and instead write your own library function that can be mapped on to a combination of fork() and exec() on systems where that's available. If you're careful you can make your function have the same or similar semantics as CreateProcess() on Win32.
UNIX systems tend to have a quite different approach to processes and process management compared to Windows based systems so it's non-trivial to make all but the simplest wrappers portable.
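A rough sketch of what such a wrapper could look like. The name run_command and its interface are hypothetical; the POSIX branch hands the command line to /bin/sh, so quoting rules will differ from the Windows branch, which is exactly the kind of mismatch that makes these wrappers hard to get fully portable.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/wait.h>
#include <unistd.h>
#endif

// Hypothetical portable wrapper: start `command_line`, wait for it to finish,
// and return its exit code, or -1 if the process could not be created.
int run_command(const std::string& command_line) {
    assert(!command_line.empty());
#ifdef _WIN32
    STARTUPINFOA si = {};
    si.cb = sizeof(si);
    PROCESS_INFORMATION pi = {};
    std::string buffer = command_line;  // CreateProcessA may modify this buffer
    if (!CreateProcessA(nullptr, &buffer[0], nullptr, nullptr, FALSE, 0,
                        nullptr, nullptr, &si, &pi)) {
        return -1;
    }
    WaitForSingleObject(pi.hProcess, INFINITE);
    DWORD code = 0;
    GetExitCodeProcess(pi.hProcess, &code);
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return static_cast<int>(code);
#else
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Let the shell split the command line into arguments.
        execl("/bin/sh", "sh", "-c", command_line.c_str(),
              static_cast<char*>(nullptr));
        _exit(127);                     // reached only if exec failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
#endif
}
```

Callers only see run_command(); the fork()/exec() versus CreateProcess() difference stays inside one function.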
Of course if you have C++11 or Boost available I'd just stick with that. If you don't have any globals (which is a good thing generally anyway) and don't set up any shared data some other way, then the practical differences between threads and processes on modern systems are slim. All the threads you create can make progress independently of each other in the same way that processes can.
Failing that you could look at an MPI implementation if message passing suits your task, or a batch scheduler system.

I am using Boost Interprocess.
It does not provide the possibility to create new processes, but once they are there, it allows them to communicate.
In this particular case I can create the processes I need from a shell script.

Is there a disadvantage to using boost::interprocess::interprocess_semaphore within a single multithreaded c++ process?

The disadvantage would be in comparison to a technique that was specialized to work on threads that are running within the same process. For example, does wait/post cause the whole process to yield, rather than just the executing thread, even though anyone waiting for a post would be within the same process?
The semaphore would be used, for example, to solve a producer/consumer problem in a shared buffer between two threads in the same process.
Are there any reasonable alternatives?

Use Boost.Thread condition variables as shown here. The accompanying article has a good summary of Boost.Thread features.
Using interprocess semaphores will work but it's likely to place a tax on your execution due to use of unnecessarily heavyweight underlying OS locking primitives (named kernel objects in Windows, for example).
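If you'd rather keep semaphore semantics, a lightweight in-process one can be built from a mutex and a condition variable. This sketch uses the C++11 std:: equivalents instead of Boost.Thread (an assumption that C++11 is available), and ThreadSemaphore is a made-up name.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Minimal counting semaphore for threads of a single process.
class ThreadSemaphore {
public:
    explicit ThreadSemaphore(unsigned initial = 0) : count_(initial) {}

    // Release one unit and wake one waiting thread, if any.
    void post() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        available_.notify_one();
    }

    // Block the calling thread (and only that thread) until a unit is free.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        available_.wait(lock, [this] { return count_ > 0; });
        assert(count_ > 0);  // guaranteed by the predicate above
        --count_;
    }

    // Take a unit if one is free, without blocking.
    bool try_wait() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0) return false;
        --count_;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable available_;
    unsigned count_;
};
```

For a producer/consumer buffer you would typically use two of these, one counting free slots and one counting filled slots, plus a mutex around the buffer itself.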

How can I pass data from a thread to the parent process?

I have a main process that uses a single-threaded library, and I can only call the library functions from the main process. I have a thread spawned by the parent process that puts info it receives from the network into a queue.
I need to be able to tell the main process that something is on the queue. Then it can access the queue and process the objects. The thread cannot process those objects because the library can only be called by one process.
I guess I need to use pipes and signals. I also read in various newsgroups that I need to use the 'self-pipe trick'.
How should this scenario be implemented?
A more specific case of the following post:
How can unix pipes be used between main process and thread?

Why not use a simple FIFO (named pipe)? The main process will automatically block until it can read something.
If it shouldn't block, it must be possible to poll instead, but maybe it will suck CPU. There probably exists an efficient library for this purpose.
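A small sketch of the pipe idea, with one substitution: since both ends live in the same process, it uses an anonymous pipe() rather than a named FIFO. PipeNotifier is a hypothetical name; the worker thread calls notify() after pushing onto the queue and the main thread calls wait().

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Wake-up channel between a worker thread and the main thread.
class PipeNotifier {
public:
    PipeNotifier() {
        int rc = pipe(fds_);
        assert(rc == 0);
        (void)rc;
    }
    ~PipeNotifier() {
        close(fds_[0]);
        close(fds_[1]);
    }
    PipeNotifier(const PipeNotifier&) = delete;
    PipeNotifier& operator=(const PipeNotifier&) = delete;

    // Worker side: call after pushing an item onto the queue.
    void notify() {
        char byte = 1;
        ssize_t written = write(fds_[1], &byte, 1);
        (void)written;  // a full pipe already means wake-ups are pending
    }

    // Main side: wait up to timeout_ms (-1 blocks, 0 just checks).
    // Returns true and consumes one notification if one was pending.
    bool wait(int timeout_ms) {
        pollfd pfd = {fds_[0], POLLIN, 0};
        if (poll(&pfd, 1, timeout_ms) <= 0) return false;
        char byte;
        return read(fds_[0], &byte, 1) == 1;
    }

    // Read end, for main loops that already multiplex with poll()/select().
    int read_fd() const { return fds_[0]; }

private:
    int fds_[2];
};
```

poll() with a timeout sleeps in the kernel, so checking this way does not burn CPU the way a busy loop would.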
I wouldn't recommend using signals because they are easy to get wrong. If you want to use them anyway, the easiest way I've found is:
Mask all signals in every thread,
A special thread handles signals with sigwait(). It may have to wake up another thread which will handle the signal, e.g. using condition variables.
The advantage is that you don't have to worry anymore about which function is safe to call from the handler.
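The two steps could be sketched like this. The function names are made up, and for brevity the sketch handles a single signal number; a real program would add every signal it cares about to the set.

```cpp
#include <cassert>
#include <pthread.h>
#include <signal.h>

// Step 1: block `signo` in the calling thread. Threads created afterwards
// inherit the mask, so call this in main() before starting any other thread.
bool block_signal(int signo) {
    sigset_t set;
    sigemptyset(&set);
    if (sigaddset(&set, signo) != 0) return false;  // invalid signal number
    return pthread_sigmask(SIG_BLOCK, &set, nullptr) == 0;
}

// Step 2: body for the dedicated signal thread. Sleeps until `signo` is
// pending, then returns it (or -1 on error). This is ordinary code, not a
// signal handler, so it may call any function afterwards.
int wait_for_signal(int signo) {
    sigset_t set;
    sigemptyset(&set);
    if (sigaddset(&set, signo) != 0) return -1;
    int received = 0;
    if (sigwait(&set, &received) != 0) return -1;
    assert(received == signo);  // the set contains only this signal
    return received;
}
```

The dedicated thread would call wait_for_signal() in a loop and then notify whichever thread should react, e.g. through a condition variable.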

The "optimal" solution depends quite a bit on your concrete setup. Do you have one process with a main thread and a child thread or do you have one parent process and a child process? Which OS and which thread library do you use?
The reason for the last question is that the current C++03 standard has no notion of a 'thread'. This means in particular that whatever solution your OS and your thread library offer are platform specific. The most portable solutions will only hide these specifics from you in their implementation.
In particular, C++ has no notion of threads in its memory model, nor does it have a notion of atomic operations, synchronization, ordered memory accesses, race conditions etc.
Chances are, however, that whatever library you are using already provides a solution for your problem on your platform.

I highly suggest you use a thread-safe queue such as this one (article and source code). I have personally used it and it's very simple to use. The API consists of simple methods such as push(), try_pop(), wait_and_pop() and empty().
Note that it is based on Boost.Thread.
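For a feel of that API, here is a minimal sketch with the same four methods. It is not the linked article's code: it is written against the C++11 standard library instead of Boost.Thread, and the class name ConcurrentQueue is my own choice.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

template <typename T>
class ConcurrentQueue {
public:
    // Producer side: add an item and wake one waiting consumer.
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        not_empty_.notify_one();
    }

    // Non-blocking pop; returns false if the queue was empty.
    bool try_pop(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    // Blocking pop; sleeps until an item is available.
    void wait_and_pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        assert(!items_.empty());  // guaranteed by the predicate above
        out = std::move(items_.front());
        items_.pop();
    }

    bool empty() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.empty();
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};
```

The network thread would call push(), and the main thread would call wait_and_pop() if it can block, or try_pop() from its own loop if it cannot.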
