share queue between parent and child process in c++ - c++

I know there are many way to handle inter-communication between two processes, but I'm still a bit confused how to deal with it. Is it possible to share queue (from standard library) between two processes in efficient way?
Thanks

I believe your confusion comes from not understanding the relationship between the memory address spaces of the parent and child process. The two address spaces are effectively unrelated. Yes, immediately after the fork() the two processes contain almost identical copies of memory, but you should think of them as copies. Any change one process makes to memory in its address space has no impact on the other process's memory.
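To make the "think of them as copies" point concrete, here is a minimal sketch, assuming a POSIX system. The function name parent_value_after_child_write is made up for illustration.

```cpp
#include <cassert>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns the parent's value of `counter` after a
// forked child has modified its own copy. Returns -1 if fork() fails.
int parent_value_after_child_write() {
    int counter = 1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        counter = 42;  // changes only the child's copy of the variable
        _exit(0);
    }
    int status = 0;
    waitpid(pid, &status, 0);  // wait until the child has finished
    return counter;            // the parent's copy was never touched
}
```

The child's write happens at the same virtual address as the parent's variable, but in a different address space, so the parent still sees its original value.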
Any "plain old data structures" (such as provided by the C++ standard library) are purely abstractions of memory, so there is no way to use them to communicate between the two processes. To send data from one process to the other, you must use one of several system calls that provide interprocess communication.
But note that shared memory is an exception to this. You can use system calls to set up a section of shared memory, and then create data structures in the shared memory. You'll still need to protect these data structures with a mutex, but the mutex will have to be shared-memory aware. With POSIX threads, you'd initialize a mutex attribute object with pthread_mutexattr_init, set PTHREAD_PROCESS_SHARED on it with pthread_mutexattr_setpshared, and pass it to pthread_mutex_init.
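A rough sketch of that setup, assuming Linux or a similar POSIX system where mmap accepts MAP_ANONYMOUS and process-shared mutexes are supported. The struct SharedCounter and the function count_with_shared_mutex are hypothetical names; depending on the toolchain you may need to link with -pthread.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical layout: the mutex lives in the shared region next to the data.
struct SharedCounter {
    pthread_mutex_t mutex;
    long value;
};

// Forks `children` processes that each add `increments` to a shared counter.
long count_with_shared_mutex(int children, int increments) {
    void* mem = mmap(nullptr, sizeof(SharedCounter), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    auto* shared = static_cast<SharedCounter*>(mem);
    shared->value = 0;

    // Mark the mutex as usable from more than one process.
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&shared->mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    int started = 0;
    for (int c = 0; c < children; ++c) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) {
            for (int i = 0; i < increments; ++i) {
                pthread_mutex_lock(&shared->mutex);
                ++shared->value;
                pthread_mutex_unlock(&shared->mutex);
            }
            _exit(0);
        }
        ++started;
    }
    for (int c = 0; c < started; ++c) wait(nullptr);  // reap every child

    long result = shared->value;
    pthread_mutex_destroy(&shared->mutex);
    munmap(mem, sizeof(SharedCounter));
    return result;
}
```

The mapping is created before fork(), so every child inherits it at the same address. Error handling is kept to a minimum here.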

Simple answer: Sharing an std::queue by two processes can be done but it is not trivial to do.
You can use shared memory to hold the queue together with some synchronization mechanism (usually a mutex). Note that not only the std::queue object must be constructed in the shared memory region, but also the contents of the queue, so you will have to provide your own allocator that manages the creation of memory in the shared region.
If you can, try to look at higher level libraries that might provide already packed solutions to your process communication needs. Consider Boost.Interprocess or search in your favorite search engine for interprocess communication.
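If a custom allocator for std::queue is more than you need, one alternative (not std::queue, and not Boost.Interprocess) is a fixed-capacity ring buffer that contains no pointers, constructed directly in the shared region. The sketch below assumes POSIX mmap with MAP_ANONYMOUS, exactly one producer and one consumer, and that std::atomic<std::size_t> is lock-free on your platform. ShmIntQueue and transfer_through_shared_queue are made-up names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <new>
#include <sched.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical single-producer/single-consumer queue of ints. It stores
// indices rather than pointers, so it is valid in both processes.
struct ShmIntQueue {
    static constexpr std::size_t kCapacity = 64;  // power of two
    std::atomic<std::size_t> head{0};  // next slot to read (consumer side)
    std::atomic<std::size_t> tail{0};  // next slot to write (producer side)
    int slots[kCapacity];

    bool try_push(int v) {
        std::size_t t = tail.load(std::memory_order_relaxed);
        if (t - head.load(std::memory_order_acquire) == kCapacity) return false;  // full
        slots[t % kCapacity] = v;
        tail.store(t + 1, std::memory_order_release);
        return true;
    }
    bool try_pop(int& out) {
        std::size_t h = head.load(std::memory_order_relaxed);
        if (h == tail.load(std::memory_order_acquire)) return false;  // empty
        out = slots[h % kCapacity];
        head.store(h + 1, std::memory_order_release);
        return true;
    }
};

// The child pushes 0..count-1; the parent pops and returns what it received.
std::vector<int> transfer_through_shared_queue(int count) {
    std::vector<int> received;
    void* mem = mmap(nullptr, sizeof(ShmIntQueue), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return received;
    ShmIntQueue* q = new (mem) ShmIntQueue;  // construct inside the shared region

    pid_t pid = fork();
    if (pid < 0) { munmap(mem, sizeof(ShmIntQueue)); return received; }
    if (pid == 0) {
        for (int i = 0; i < count; ++i)
            while (!q->try_push(i)) sched_yield();  // queue full: let the parent run
        _exit(0);
    }
    while (static_cast<int>(received.size()) < count) {
        int v;
        if (q->try_pop(v)) received.push_back(v);
        else sched_yield();                         // queue empty: let the child run
    }
    waitpid(pid, nullptr, 0);
    munmap(mem, sizeof(ShmIntQueue));
    return received;
}
```

The busy-wait with sched_yield() keeps the sketch short; real code would more likely block on a process-shared semaphore or condition variable, and would handle the case where the child dies early.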

I don't think there are any simple ways to share structures/objects like that between two processes. If you want to implement a queue/list/array/etc. between two processes, you will need to implement some kind of communication between the processes to manage the queues and to retrieve and store entries.
For example, you could implement the queue management in one process and implement some kind of IPC (shared memory, sockets, pipes, etc.) to hand off entries from one process to the other.
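As an illustration of that hand-off, here is a small sketch, assuming POSIX pipe() and fork(). The parent owns an ordinary std::queue and the child only sends entries to it. The function name collect_from_child and the choice of sending squares are just for the example.

```cpp
#include <cassert>
#include <queue>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the child writes `count` ints into a pipe and the
// parent stores them in a std::queue that lives only in the parent.
std::queue<int> collect_from_child(int count) {
    std::queue<int> q;
    int fds[2];
    if (pipe(fds) != 0) return q;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return q; }
    if (pid == 0) {
        close(fds[0]);  // child only writes
        for (int i = 0; i < count; ++i) {
            int v = i * i;
            if (write(fds[1], &v, sizeof v) != static_cast<ssize_t>(sizeof v)) _exit(1);
        }
        close(fds[1]);
        _exit(0);
    }
    close(fds[1]);  // parent only reads; closing this end lets read() see EOF
    int v;
    // Simplification: each entry is a small fixed-size write, so the parent
    // reads whole ints until the child closes its end of the pipe.
    while (read(fds[0], &v, sizeof v) == static_cast<ssize_t>(sizeof v)) q.push(v);
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return q;
}
```

Calling collect_from_child(5) leaves the parent with a queue holding 0, 1, 4, 9, 16. For variable-length entries you would need some framing, such as a length prefix.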
There may be other methods outside of the standard C++ libraries that will do this for you. For example, there are likely Boost libraries that already implement this.

Related

Confusion regarding multiprocessing.pool memory usage in Python

I've been reading up on Python's "multiprocessing", specifically the "Pool" stuff. I'm familiar with threading but not the approach used here. If I were to pass a very large collection (say a dictionary of some sort) to the process pool ("pool.map(myMethod, humungousDictionary)") are copies made of the dictionary in memory and then handed off to each process, or does there exist only the one dictionary? I'm concerned about memory usage. Thank you in advance.
The short answer is: no, there is not just the one dictionary. Processes work in their own independent memory space, effectively duplicating your data.
If your dictionary is read only, and modifications will not be made, here are some options you could consider:
Save your data into a database. Each worker will read the data and work independently
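Have a single parent process that spawns multiple workers using os.fork. Each forked worker is a separate process that starts with the parent's memory contents, and on typical Unix systems those pages are shared copy-on-write until one side writes to them.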
Use shared memory. Unix systems offer shared memory for interprocess communication. If there is a chance of racing, you will need semaphores as well.
You may also consider referring here for deeper insight on a possible solution.

Ways to share a variable among threads

I have a general question about parallel programming in C and C++ and would appreciate it if you could answer it. As far as I know, we can declare a variable at least one level higher (in the parent thread) to share it among child threads. So, I was wondering if there is any other way to share a variable among threads with the same parent thread? Is this API dependent or not?
For Posix threads, read some pthread tutorial.
For C++11, read the documentation of its thread library
All threads of the same process share the same address space in virtual memory. As commented by Marco A. consider also thread_local variables.
Notice that you share data or memory (not variables, which exist only in the source code)
In practice, you had better protect the shared data with a mutex (for synchronization) to avoid data races.
In the simple case, the mutex and the shared data are in some global variables.
You could also use atomic operations.
BTW, you could also develop a parallel application using some message passing paradigm, e.g. using MPI (or simply using some RPC or other messages, e.g. JSON on sockets). You might consider for regular numerical applications to use the GPGPU e.g. using OpenCL. And of course you might mix all the approaches (using OpenCL, with several threads, and having your parallel software running in several such processes communicating with MPI).
Debugging heavily parallel software can become a nightmare. Performance may depend upon the hardware system and may require tricky tuning. Scalability and synchronization may become a growing concern. Map-reduce is often a useful model.
In C++ and C any memory location (identified by a variable) can be shared among threads. The memory space is the same across all threads. There is no parent/child thread relationship with memory.
The challenge is to control or synchronize access to the memory location among the threads.
That is implementation dependent.
Any global variable is sharable among threads, since threads are light weight processes sharing the same address space. For synchronization, you need to ensure mutual exclusion while updating/accessing those global variables through semaphores or wait notify blocks.
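To tie the answers above together, here is a short C++11 sketch of threads sharing data that is not global: it lives in the function that creates the threads and is captured by reference. One counter is protected by a mutex and the other is an atomic. The names Totals and count_in_threads are made up, and older toolchains may need -pthread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

struct Totals {
    long guarded;       // counter protected by a mutex
    long atomic_count;  // counter updated with atomic operations
};

// Hypothetical helper: every thread increments both shared counters.
Totals count_in_threads(int threads, int increments) {
    long guarded = 0;
    std::mutex m;
    std::atomic<long> atomic_count{0};

    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments; ++i) {
                {
                    std::lock_guard<std::mutex> lock(m);  // mutual exclusion
                    ++guarded;
                }
                atomic_count.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();  // shared data must outlive the threads
    return Totals{guarded, atomic_count.load()};
}
```

Any thread can reach the data as long as it has a reference or pointer to it; what matters is that access is synchronized and that the data stays alive until every thread is joined.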

Simulating Thread with fork()

What's your idea about simulating thread with "fork() function" and a "shared memory" block ...
Is it possible ?
How reasonable is it to do this in a program? (I mean, will it work well?)
For starters, don't mix a thread and fork().
A fork gives you a brand new process, which is a copy of the current process, with the same code segments. As the memory image changes (typically this is due to different behavior of the two processes) you get a separation of the memory images, however the executable code remains the same. Tasks do not share memory unless they use some Inter Process Communication (IPC) primitive.
In contrast, a thread is another thread of execution within the same task. One task can have multiple threads, and the task's memory is shared among them, so shared data must be accessed through synchronization primitives that prevent data corruption.
Yes, it is possible, but I cannot imagine it being a good idea, and it would be a real pain to test.
If you have a shared heap, and you make sure all semaphores etc. are allocated in the heap, and not the stack, then there's no inherent reason you couldn't do something like it. There would be some tricky differences though.
For example, anything you do in an interrupt handler in a multi-threaded program can change data used by all the threads, while in a forked program, you would have to send multiple interrupts, which would be caught at different times, and might lead to unintended effects.
If you want threading behavior, just use a thread.
AFAIK, fork will create a separate process with its own context, stack and so on. Depends what you mean by "simulating"...
You might want to check this out : http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
A few of the answers here focus on "don't mix fork and threads". But the way I read your question is: "can you use two different processes, and still communicate quickly and conveniently with shared memory between them, just like how threads have access to each others' memory?"
And the answer is, yes you can, but you have to remember to explicitly mark which memory areas you want shared. You can not just share your variables between the processes. Also, you can communicate this way between processes not related to each other at all. It is not limited to processes forked from each other.
Have a look at shared memory or "shm".

C++: Is it possible to share a pointer through forked processes?

I have a count variable that should get counted up by a few processes I forked and used/read by the mother process.
I tried to create a pointer in my main() function of the mother process and count that pointer up in the forked children. That does not work! Every child seems to have its own copy even though the address is the same in every process.
What is the best way to do that?
Each child gets its own copy of the parent process's memory (at least as soon as it tries to modify anything). If you need to share between processes you need to look at shared memory or some similar IPC mechanism.
BTW, why are you making this a community wiki - you may be limiting responses by doing so.
Two processes do not share ordinary memory by default. It is true that a forked child process will share the same underlying memory after forking, but an attempt to write to it causes the operating system to give the writing process its own private copy of that page.
Look into another form of IPC to use.
My experience is, that if you want to share information between at least two processes, you almost never want to share just some void* pointer into memory. You might want to have a look at
Boost Interprocess
which can give you an idea, how to share structured data (read "classes" and "structs") between processes.
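No, use IPC or threads. Open file descriptors are inherited across fork(); on POSIX systems the inherited descriptors refer to the same open file, so the seek position is shared as well, but ordinary memory is not.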
You might want to check out shared memory.
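For the counter in the question, a shared-memory version could look roughly like this, assuming a POSIX system with MAP_ANONYMOUS and that std::atomic<int> is lock-free there (which makes it usable across processes in practice). count_across_children is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: each forked child bumps a counter that lives in a
// shared mapping; the parent reads the total after the children exit.
int count_across_children(int children, int increments) {
    void* mem = mmap(nullptr, sizeof(std::atomic<int>), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    auto* counter = new (mem) std::atomic<int>(0);  // construct in shared memory

    int started = 0;
    for (int c = 0; c < children; ++c) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) {
            for (int i = 0; i < increments; ++i) counter->fetch_add(1);
            _exit(0);
        }
        ++started;
    }
    for (int c = 0; c < started; ++c) wait(nullptr);  // reap every child

    int total = counter->load();
    munmap(mem, sizeof(std::atomic<int>));
    return total;
}
```

The pointer value is the same in every process, as in the question, but this time it points into a MAP_SHARED region, so all processes see the same underlying memory.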
A pointer is only meaningful within its own process. It is private to that process's address space. There are different kinds of IPC mechanisms available in any operating system. You can opt for Windows messaging, shared memory, sockets, pipes, etc. Choose one according to your requirements and the size of the data. Another mechanism is to write data into the target process using the available virtual memory APIs and notify that process with the corresponding pointer.
One simple but limited form of IPC that would work well for a shared count is a 'shared data segment'. On Windows this is implemented using the #pragma data_seg directive.
See this article for an example.

Methods of sharing class instances between processes

I have written a C++ class that I need to share an instance of between at least two windows processes. What are the various ways to do this?
Initially I looked into #pragma data_seg only to be disappointed when I realised that it will not work on classes or with anything that allocates on the heap.
The instance of the class must be accessible via a dll because existing, complete applications already use this dll.
You can potentially use memory-mapped files to share data between processes. If you need to call functions on your object, you'd have to use COM or something similar, or you'd have to implement your own RPC protocol.
Look into Boost::interprocess. It takes a bit of getting used to, but it works very well. I've made relatively complex data structures in shared memory that worked fine between processes.
edit: it works with memory-mapped files too. The point is you can use data in a structured way; you don't have to treat the memory blocks (in files or shared memory) as just raw data that you have to carefully read/write to leave in a valid state. Boost::interprocess takes care of that part and you can use STL containers like trees, lists, etc.
You can use placement new to create the object in a shared memory zone. As long as the object doesn't use any pointers, that should be fine.
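A minimal sketch of the placement-new idea follows. The question is about Windows, where the region would come from a file mapping; this example assumes POSIX mmap and fork() instead, purely to keep it short and self-contained. The class Status and the function status_code_set_by_child are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical pointer-free class: only fixed-size members, no heap use,
// no virtual functions, so its bytes mean the same thing in every process.
class Status {
public:
    void set(int code, const char* text) {
        code_ = code;
        std::strncpy(text_, text, sizeof text_ - 1);
        text_[sizeof text_ - 1] = '\0';
    }
    int code() const { return code_; }
    const char* text() const { return text_; }
private:
    int code_ = 0;
    char text_[32] = {};
};

// The child updates the shared object; the parent reads it afterwards.
int status_code_set_by_child() {
    void* mem = mmap(nullptr, sizeof(Status), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    Status* s = new (mem) Status;  // placement new into the shared region

    pid_t pid = fork();
    if (pid < 0) { munmap(mem, sizeof(Status)); return -1; }
    if (pid == 0) {
        s->set(7, "done");
        _exit(0);
    }
    waitpid(pid, nullptr, 0);  // the child has exited, so its write is complete

    bool text_ok = std::strcmp(s->text(), "done") == 0;
    int code = s->code();
    s->~Status();  // objects created with placement new are destroyed by hand
    munmap(mem, sizeof(Status));
    return text_ok ? code : -1;
}
```

Waiting for the child is the only synchronization here; if both processes touched the object at the same time you would also need a lock that works across processes.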
Is it a POD or do you need to be able to share a single instance across processes? Have you considered using the Singleton pattern (static initialization version, for thread safety reasons)? You will need to use Mutexes as well to protect concurrent writes and stuff.
On Windows, you can use COM as well.