I have a count variable that should be incremented by a few processes I forked and read/used by the mother process.
I tried creating a pointer in the main() function of the mother process and incrementing it through that pointer in the forked children. That does not work! Every child seems to have its own copy, even though the address is the same in every process.
What is the best way to do that?
Each child gets its own copy of the parent process's memory (at least as soon as it tries to modify anything). If you need to share between processes, you need to look at shared memory or a similar IPC mechanism.
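Here's a minimal sketch (not your code, just an illustration) of the shared-memory route using an anonymous MAP_SHARED mapping on POSIX; the atomic increment builtin is a GCC/Clang assumption:

```cpp
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

int main() {
    // Memory mapped with MAP_SHARED | MAP_ANONYMOUS stays shared after fork,
    // unlike ordinary variables, which become copy-on-write copies.
    int *count = static_cast<int *>(mmap(nullptr, sizeof(int),
                                         PROT_READ | PROT_WRITE,
                                         MAP_SHARED | MAP_ANONYMOUS, -1, 0));
    *count = 0;

    for (int i = 0; i < 4; ++i) {
        if (fork() == 0) {                       // child
            __sync_fetch_and_add(count, 1);      // atomic increment across processes
            _exit(0);
        }
    }
    while (wait(nullptr) > 0) {}                 // reap all children
    printf("count = %d\n", *count);              // prints 4 in the parent
    munmap(count, sizeof(int));
}
```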
BTW, why are you making this a community wiki - you may be limiting responses by doing so.
Two processes cannot share the same memory by default. It is true that a forked child initially shares the parent's underlying memory pages, but they are copy-on-write: an attempt to write causes the operating system to give the writing process its own private copy of that page.
Look into another form of IPC to use.
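A tiny sketch (my own illustration, assuming POSIX) of the separation described above:

```cpp
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

int counter = 0;   // an ordinary variable: each process ends up with its own copy

int main() {
    pid_t pid = fork();
    if (pid == 0) {            // child
        counter = 42;          // triggers copy-on-write; only the child's copy changes
        _exit(0);
    }
    waitpid(pid, nullptr, 0);
    printf("parent still sees counter = %d\n", counter);   // prints 0
}
```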
My experience is that if you want to share information between two or more processes, you almost never want to share just some raw void* pointer into memory. You might want to have a look at
Boost Interprocess
which can give you an idea of how to share structured data (read: classes and structs) between processes.
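As a rough sketch of the idea (the segment name "MySharedMemory" and the Counter struct are made up for illustration), Boost.Interprocess lets any process that opens the same named segment see the same object:

```cpp
#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/sync/interprocess_mutex.hpp>
#include <boost/interprocess/sync/scoped_lock.hpp>
#include <iostream>

struct Counter {
    boost::interprocess::interprocess_mutex mutex;
    int value = 0;
};

int main() {
    using namespace boost::interprocess;

    // One process creates the segment and constructs the object in it;
    // any other process opening "MySharedMemory" finds the same object.
    managed_shared_memory segment(open_or_create, "MySharedMemory", 65536);
    Counter *c = segment.find_or_construct<Counter>("counter")();

    {
        scoped_lock<interprocess_mutex> lock(c->mutex);
        ++c->value;
    }
    std::cout << c->value << std::endl;
    // shared_memory_object::remove("MySharedMemory");  // cleanup when all processes are done
}
```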
No, use IPC or threads. After a fork, only file descriptors carry over, and since parent and child refer to the same open file description, even the file offset (seek pointer) is shared.
You might want to check out shared memory.
A pointer is only meaningful within its own process: it is private to that process, relative to the process's own address space. There are different kinds of IPC mechanisms available in every operating system. You can opt for Windows messaging, shared memory, sockets, pipes, etc. Choose one according to your requirements and the size of the data. Another mechanism is to write the data into the target process using the available virtual memory APIs and notify that process with the corresponding pointer.
One simple but limited form of IPC that would work well for a shared count is a 'shared data segment'. On Windows this is implemented using the #pragma data_seg directive.
See this article for an example.
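Roughly along these lines (MSVC-specific sketch; typically the shared variable lives in a DLL that every participating process loads):

```cpp
#include <windows.h>

#pragma data_seg(".shared")
volatile LONG g_count = 0;     // must be initialized, or it lands in the normal .bss section
#pragma data_seg()
#pragma comment(linker, "/SECTION:.shared,RWS")   // Read, Write, Shared

void IncrementSharedCount() {
    InterlockedIncrement(&g_count);   // still needs atomic access across processes
}
```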
I've been reading up on Python's "multiprocessing", specifically the "Pool" stuff. I'm familiar with threading but not with the approach used here. If I were to pass a very large collection (say a dictionary of some sort) to the process pool ("pool.map(myMethod, humungousDictionary)"), are copies of the dictionary made in memory and then handed off to each process, or does there exist only the one dictionary? I'm concerned about memory usage. Thank you in advance.
The short answer is: there will not be just one dictionary. Processes work in their own independent memory spaces, effectively duplicating your data.
If your dictionary is read only, and modifications will not be made, here are some options you could consider:
Save your data into a database. Each worker will read the data and work independently.
Have a parent process that spawns multiple workers using os.fork. On Unix the children start as copy-on-write copies of the parent, so a read-only dictionary is not physically duplicated.
Use shared memory. Unix systems offer shared memory for interprocess communication. If there is a chance of a race, you will need semaphores as well.
You may also consider referring here for deeper insight on a possible solution.
What's your idea about simulating threads with the fork() function and a shared memory block?
Is it possible?
How reasonable is it to do this in a program? (I mean, will it work well?)
For starters, don't mix a thread and fork().
A fork gives you a brand new process, which is a copy of the current process, with the same code segments. As the memory image changes (typically due to the different behavior of the two processes) you get a separation of the memory images; however, the executable code remains the same. Tasks do not share memory unless they use some inter-process communication (IPC) primitive.
In contrast, a thread is another execution thread of the same task. One task can have multiple threads, and the task's memory objects are shared among the threads; therefore, shared data must be accessed through synchronization primitives that allow you to avoid data corruption.
Yes, it is possible, but I cannot imagine it being a good idea, and it would be a real pain to test.
If you have a shared heap, and you make sure all semaphores etc. are allocated in the heap, and not the stack, then there's no inherent reason you couldn't do something like it. There would be some tricky differences though.
For example, anything you do in a signal handler in a multi-threaded program can change data used by all the threads, while in a forked program you would have to send multiple signals, which would be caught at different times and might lead to unintended effects.
If you want threading behavior, just use a thread.
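For comparison, a short sketch of the "just use a thread" alternative with C++11 threads, where the counter is shared directly and no IPC is required:

```cpp
#include <atomic>
#include <thread>
#include <vector>
#include <cstdio>

int main() {
    std::atomic<int> count{0};          // shared by all threads in the process
    std::vector<std::thread> workers;

    for (int i = 0; i < 4; ++i)
        workers.emplace_back([&count] { count.fetch_add(1); });

    for (auto &t : workers) t.join();
    printf("count = %d\n", count.load());   // prints 4
}
```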
AFAIK, fork will create a separate process with its own context, stack and so on. Depends what you mean by "simulating"...
You might want to check this out : http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
A few of the answers here focus on "don't mix fork and threads". But the way I read your question is: "can you use two different processes, and still communicate quickly and conveniently with shared memory between them, just like how threads have access to each others' memory?"
And the answer is: yes you can, but you have to remember to explicitly mark which memory areas you want shared. You cannot just share your variables between the processes. Also, you can communicate this way between processes that are not related to each other at all; it is not limited to processes forked from each other.
Have a look at shared memory or "shm".
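A sketch of named POSIX shared memory ("shm"), usable even between processes that are not related by fork; the name "/demo_shm" is made up, and older Linux systems may need -lrt when linking:

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(int));                      // size the segment

    int *shared = static_cast<int *>(mmap(nullptr, sizeof(int),
                                          PROT_READ | PROT_WRITE,
                                          MAP_SHARED, fd, 0));
    (*shared)++;                                     // any process that maps "/demo_shm" sees this
    printf("value = %d\n", *shared);

    munmap(shared, sizeof(int));
    close(fd);
    // shm_unlink("/demo_shm");  // remove the name once all processes are done
}
```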
I need to call a function (an LLVM JIT to be specific) from a C++ application. This call might fail or even call abort() or exit(). How can I avoid, or at least reduce, the effects on my host application? Someone suggested using fork(), however I need a solution for both Windows and POSIX. Even if I were to use fork(), would it be possible for the two processes to communicate (pass some pointers around)?
You basically have to isolate the call that might fail spectacularly, so yes, you probably have to create a separate process for it. I'd actually be tempted to create a small executable just containing this particular call and the necessary supporting functionality and call that from your main executable. This gets you around the lack of fork() on Windows and allows you to use the same mechanisms to communicate.
You can't pass pointers around between processes as they're not sharing the same address space. What I would do is have the spawned process read data from stdin and write to stdout, with the controlling process piping data into the child's stdin and reading from the child's stdout. Basically the way a Unix (command line) filter works. Another alternative, if you're passing around a lot of data, would be to write/read to/from a file on disk (better, a RAM disk) and communicate that way, but unless you're talking a lot of data, that's overkill.
As Eugen pointed out in the comments, you can also use shared memory if you want to pass pointers around or another inter-process communication mechanism depending on how much data you need to pass around. That said, choose the simplest possible method as nested executables like these aren't that easy to debug in the first place.
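A rough POSIX-only sketch of the filter-style setup: the parent writes a request to the child's stdin and reads the result from its stdout. The "./jit_helper" executable and the request text are hypothetical.

```cpp
#include <unistd.h>
#include <sys/wait.h>
#include <cstdio>
#include <cstring>

int main() {
    int to_child[2], from_child[2];
    pipe(to_child);
    pipe(from_child);

    if (fork() == 0) {                       // child: become the helper
        dup2(to_child[0], STDIN_FILENO);     // read requests from the parent
        dup2(from_child[1], STDOUT_FILENO);  // write results back
        close(to_child[1]); close(from_child[0]);
        execl("./jit_helper", "jit_helper", (char *)nullptr);
        _exit(127);                          // exec failed
    }

    close(to_child[0]); close(from_child[1]);
    const char request[] = "compile foo\n";
    write(to_child[1], request, strlen(request));
    close(to_child[1]);                      // signal EOF to the helper

    char reply[256];
    ssize_t n = read(from_child[0], reply, sizeof(reply) - 1);
    if (n > 0) { reply[n] = '\0'; printf("helper said: %s", reply); }
    wait(nullptr);
}
```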
I know there are many ways to handle communication between two processes, but I'm still a bit confused about how to deal with it. Is it possible to share a queue (from the standard library) between two processes in an efficient way?
Thanks
I believe your confusion comes from not understanding the relationship between the memory address spaces of the parent and child process. The two address spaces are effectively unrelated. Yes, immediately after the fork() the two processes contain almost identical copies of memory, but you should think of them as copies. Any change one process makes to memory in its address space has no impact on the other process's memory.
Any "plain old data structures" (such as provided by the C++ standard library) are purely abstractions of memory, so there is no way to use them to communicate between the two processes. To send data from one process to the other, you must use one of several system calls that provide interprocess communication.
But note that shared memory is an exception to this. You can use system calls to set up a section of shared memory, and then create data structures in that shared memory. You'll still need to protect these data structures with a mutex, but the mutex will have to be shared-memory aware. With POSIX threads, you'd use pthread_mutexattr_init with the PTHREAD_PROCESS_SHARED attribute.
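A sketch of that process-shared mutex, placed in an anonymous shared mapping so both the parent and a forked child can lock it (the Shared struct is just for illustration):

```cpp
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

struct Shared {
    pthread_mutex_t mutex;
    int value;
};

int main() {
    // Put the mutex and the data it protects in an anonymous shared mapping.
    Shared *s = static_cast<Shared *>(mmap(nullptr, sizeof(Shared),
                                           PROT_READ | PROT_WRITE,
                                           MAP_SHARED | MAP_ANONYMOUS, -1, 0));
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);  // the key step
    pthread_mutex_init(&s->mutex, &attr);
    s->value = 0;

    if (fork() == 0) {                      // child
        pthread_mutex_lock(&s->mutex);
        s->value += 1;
        pthread_mutex_unlock(&s->mutex);
        _exit(0);
    }
    pthread_mutex_lock(&s->mutex);
    s->value += 1;
    pthread_mutex_unlock(&s->mutex);
    wait(nullptr);
    printf("value = %d\n", s->value);       // prints 2
}
```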
Simple answer: sharing a std::queue between two processes can be done, but it is not trivial.
You can use shared memory to hold the queue together with some synchronization mechanism (usually a mutex). Note that not only the std::queue object must be constructed in the shared memory region, but also the contents of the queue, so you will have to provide your own allocator that manages the creation of memory in the shared region.
If you can, try to look at higher level libraries that might provide already packed solutions to your process communication needs. Consider Boost.Interprocess or search in your favorite search engine for interprocess communication.
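One such packaged solution is Boost.Interprocess's message_queue, which avoids hand-rolling a shared std::queue; a sketch (the queue name "demo_queue" is made up, and the producer and consumer would normally be separate processes):

```cpp
#include <boost/interprocess/ipc/message_queue.hpp>
#include <iostream>

int main() {
    using namespace boost::interprocess;

    // Producer side (another process would open the same name and receive).
    message_queue::remove("demo_queue");
    message_queue mq(create_only, "demo_queue",
                     100,            // max number of messages
                     sizeof(int));   // max size of each message

    int value = 42;
    mq.send(&value, sizeof(value), 0);   // last argument is the priority

    // Consumer side, shown in the same process only for brevity.
    int received = 0;
    message_queue::size_type recvd_size;
    unsigned int priority;
    mq.receive(&received, sizeof(received), recvd_size, priority);
    std::cout << received << std::endl;

    message_queue::remove("demo_queue");
}
```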
I don't think there are any simple ways to share structures/objects like that between two processes. If you want to implement a queue/list/array/etc. between two processes, you will need to implement some kind of communication between the processes to manage the queue and to store and retrieve entries.
For example, you could implement the queue management in one process and implement some kind of IPC (shared memory, sockets, pipes, etc.) to hand off entries from one process to the other.
There may be other methods outside of the standard C++ libraries that will do this for you. For example, there are likely Boost libraries that already implement this.
What data will a process and a thread not share?
Thanks in advance to everybody who gives their time.
Separate processes do not share any data with each other.
Threads can share any heap-allocated or static data if they are running within the same process.
It depends on the context. Completely separate processes do not share any of the same memory in most cases, but in some cases child processes initially share the parent's physical pages (copy-on-write), such as when you use fork on Unix. In older versions of Windows (95, 98, ME) there was a shared memory area shared among all processes, but it was mainly a space for system DLLs, not data.
Generally threads share heap data, but you will want to be careful about deallocating memory in one thread that was allocated in another thread, since some memory managers keep per-thread heaps or caches.
By default there is no sharing of data between processes, but using inter-process communication techniques such as sockets, pipes, RPC, etc., you can share data.
In operating system theory (and AFAIK this applies to operating systems such as Windows, Linux, *BSD, ...) a process is defined as a thread with its own page table, i.e. its own virtual memory space.
Anything else is OS dependent (file descriptors, sockets, etc.). In my experience, such per-process properties are usually copied by the standard system calls that replicate processes. Think about it: it's easier to implement and more resource-friendly too (less housekeeping, and physical memory can be left untouched thanks to copy-on-write).
On UNIX, processes can share file descriptors with their child processes if the file descriptors are not set to close on exec (FD_CLOEXEC). Likewise, Windows supports sharing handles with child processes by setting lpSecurityAttributes->bInheritHandle to TRUE when calling CreateFile() and then setting bInheritHandles to TRUE when calling CreateProcess. Not to mention that the Microsoft C runtime _open() function accepts a _O_NOINHERIT flag.
On Linux, the clone() syscall gives you a lot of control over what the child process shares with its parent: everything from the address space (CLONE_VM) to the file descriptor table (CLONE_FILES) to the parent process ID (CLONE_PARENT) can be either shared or not shared. Of course, this functionality was added to support kernel threads.
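A rough Linux-only sketch of clone() with CLONE_VM: the child runs in the parent's address space, thread-style, so its write to the counter is visible to the parent. Real code would also pass CLONE_FS, CLONE_FILES, etc. as needed.

```cpp
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>
#include <cstdio>

static int counter = 0;
static char child_stack[1024 * 1024];      // clone() needs an explicit stack for the child

static int child_fn(void *) {
    counter = 42;                          // visible to the parent because of CLONE_VM
    return 0;
}

int main() {
    // The stack grows down on most architectures, so pass the top of the buffer.
    int pid = clone(child_fn, child_stack + sizeof(child_stack),
                    CLONE_VM | SIGCHLD, nullptr);
    waitpid(pid, nullptr, 0);
    printf("counter = %d\n", counter);     // prints 42
}
```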
Thread-local storage (TLS) is indexed differently for each thread in a process, but the actual memory is shared between threads.