I have inherited a large body of C++ code in a Linux shared object that I suspect is not re-entrant.
Is there any way to run this code in multiple threads spawned from the same process, by ensuring each thread loads its own copy of the shared object and maintains its own memory space?
Of course not. Threads share the same memory space; processes have separate memory spaces. So you would need to run multiple separate processes if your code is not re-entrant.
I have just started writing a device driver and I am new to threading. I have gone through many documents to get an idea about threads, but I still have some doubts.
What is a kernel thread?
How does it differ from a user thread?
What is the relationship between the two kinds of threads?
How can I implement kernel threads?
Where can I see the output of the implementation?
Can anyone help me?
Thanks.
A kernel thread is a task_struct with no userspace components.
Besides the lack of userspace, it has different ancestors (the kthreadd kernel thread instead of the init process) and is created by a kernel-only API instead of sequences of clone/fork/exec system calls.
All kernel threads have kthreadd as a parent. Apart from that, kernel threads enjoy the same "independence" from one another as userspace processes do.
Use the kthread_run function/macro from the kthread.h header. You will most probably have to write a kernel module in order to call this function, so you should take a look at the Linux Device Drivers book.
If you are referring to the text output of your implementation (via printk calls), you can see this output in the kernel log using the dmesg command.
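For completeness, here is a minimal module sketch along those lines (the thread name "demo_worker", module functions, and log message are all made up for illustration). Build it as an out-of-tree module, insmod it, and watch the messages appear in dmesg:

```c
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/err.h>

static struct task_struct *worker;

/* Thread body: loop until kthread_stop() is called on us. */
static int worker_fn(void *data)
{
    while (!kthread_should_stop()) {
        pr_info("hello from a kernel thread\n");
        msleep(1000);
    }
    return 0;
}

static int __init demo_init(void)
{
    /* kthread_run = kthread_create + wake_up_process */
    worker = kthread_run(worker_fn, NULL, "demo_worker");
    return IS_ERR(worker) ? PTR_ERR(worker) : 0;
}

static void __exit demo_exit(void)
{
    kthread_stop(worker);   /* makes kthread_should_stop() return true */
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```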
A kernel thread is a kernel task running only in kernel mode; it usually has not been created by the fork() or clone() system calls. Examples are kworker and kswapd.
You probably should not implement kernel threads if you don't know what they are.
Google gives many pages about kernel threads, e.g. Frey's page.
User threads & stack:
Each thread has its own stack so that it can use its own local variables; threads share global variables, which are part of the .data or .bss sections of a Linux executable.
Since threads share global variables, we use synchronization mechanisms such as a mutex when we want to access or modify a global variable in a multi-threaded application. Local variables are part of each thread's individual stack, so they need no synchronization.
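A small POSIX-threads sketch of that rule (the helper name `run_counter_demo` is illustrative): the global counter must be guarded by a mutex, while each thread's loop counter lives safely on its own stack.

```c
#include <pthread.h>

/* Shared global: lives in .data/.bss, visible to every thread,
   so every update must be serialized with a mutex. */
static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *bump(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {   /* i is a local on this thread's stack */
        pthread_mutex_lock(&lock);
        counter++;                       /* protected access to the global */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

long run_counter_demo(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, bump, NULL);
    pthread_create(&t2, NULL, bump, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;   /* deterministic thanks to the mutex */
}
```

Without the lock, the two threads' read-modify-write sequences could interleave and the final count would be unpredictable.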
Kernel threads
Kernel threads have emerged from the need to run kernel code in process context. Kernel threads are the basis of the workqueue mechanism. Essentially, a kernel thread is a thread that only runs in kernel mode and has no user address space or other user attributes.
To create a kernel thread, use kthread_create():
#include <linux/kthread.h>
struct task_struct *kthread_create(int (*threadfn)(void *data),
                                   void *data, const char namefmt[], ...);
kernel threads & stack:
Kernel threads are used to perform background processing tasks for the kernel, such as the pdflush threads and workqueue threads.
Kernel threads are basically new processes without an address space (they can be created using the clone() call with the required flags), which means they cannot switch to user space. Kernel threads are schedulable and preemptible just like normal processes.
Kernel threads have their own stacks, which they use to manage local state.
More about kernel stacks:
https://www.kernel.org/doc/Documentation/x86/kernel-stacks
Since you're comparing kernel threads with user[land] threads, I assume you mean something like the following.
The normal way of implementing threads nowadays is to do it in the kernel, so those can be considered "normal" threads. It's however also possible to do it in userland, using signals such as SIGALRM, whose handler will save the current process state (registers, mostly) and change them to another one previously saved. Several OSes used this as a way to implement threads before they got proper kernel thread support. They can be faster, since you don't have to go into kernel mode, but in practice they've faded away.
There's also cooperative userland threads, where one thread runs until it calls a special function (usually called yield), which then switches to another thread in a similar way as with SIGALRM above. The advantage here is that the program is in total control, which can be useful when you have timing concerns (a game for example). You also don't have to care much about thread safety. The big disadvantage is that only one thread can run at a time, and therefore this method is also uncommon now that processors have multiple cores.
Kernel threads are implemented in the kernel. Perhaps you meant how to use them? The most common way is to call pthread_create.
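A minimal sketch of that call (the wrapper name `square_in_thread` is made up): pthread_create hands an argument to the thread function through a void pointer, and pthread_join collects its return value the same way.

```c
#include <pthread.h>
#include <stdint.h>

/* Thread entry point: receives and returns data through void*. */
static void *worker(void *arg) {
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

intptr_t square_in_thread(intptr_t n) {
    pthread_t t;
    void *result = NULL;
    /* On Linux (NPTL), this starts a kernel-backed thread. */
    pthread_create(&t, NULL, worker, (void *)n);
    pthread_join(t, &result);   /* wait and collect the return value */
    return (intptr_t)result;
}
```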
How can I mark data structures that won't be copied on fork?
I have a program that uses fork() and I cannot modify it.
The program loads a shared library that uses threads, and was written in C++.
Since threads are not duplicated on fork(), I am afraid that I will lose references to some objects that I allocated on one of the threads.
How can I avoid that?
Is there an option to mark objects that won't be copied on fork()?
I am writing a library that I load with the application in order to override some of its functions and force it to communicate with a remote server.
The library creates a thread that runs in the background.
The program calls fork().
And I must use Boost objects for all threads/shared memory.
Thanks.
I have a piece of code that handles the multi-threading (with shared resources) issue like this:
CRITICAL_SECTION gCS;
InitializeCriticalSection(&gCS);
EnterCriticalSection(&gCS);
// Do some shared resources stuff
LeaveCriticalSection(&gCS);
In this MSDN page is written: "The threads of a single process [my bold] can use a critical section object for mutual-exclusion synchronization."
So, my question is: what about the case where the operating system decides to divide the threads among different processes, or even different processors?
Does EnterCriticalSection then fail to do its job? And if the answer is "critical sections are no help across processes", what is the alternative?
I prefer not to use the Boost classes.
An operating system will not divide a process's threads among different processes.
EnterCriticalSection is appropriate for programs with multiple threads, as well as systems with multiple processors.
So, my question is: what about the case where the operating
system decides to divide the threads among different processes, or even
different processors.
Different processors - critical sections cover this.
Different processes - you need different synchronization API, which can share [kernel] objects between processes, such as mutexes and semaphores.
See sample usage in Using Mutex Objects section.
If all your threads are started in the same program, they are part of a single process and there is nothing anyone, including the OS, can do to "separate them". They exist only as part of that process and will die with the process. You are perfectly safe using a critical section.
A process is allocated a new address space (stack and heap), whereas a newly created thread is implicitly assigned the creating process's memory space, but with a newly allocated stack of its own (a separate stack is assigned to each and every thread).
To the OS, a thread executes much as if it were a process; naturally, using threads may result in more cache and memory/page hits.
The OS scheduler gives time to the process, which may then use its own scheduler to divide that time among its threads; but this is not a must, since threads are scheduled like processes: they are in the same process table and can run on any core concurrently, at any time, the same as a regular process.
Since threads of the same process share the same memory, they can synchronize on variables/lock objects at user level.
A process should not have access to a different process's allocated memory (unless it is a thread of that shared space), so synchronization between processes must be done in some joined/global space or at kernel level.
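One kernel-level option is a named POSIX semaphore: because the object is managed by the kernel and addressed by name, even unrelated processes can use it. A small sketch (the name "/demo_sem" and the function name are arbitrary examples):

```c
#include <fcntl.h>
#include <semaphore.h>
#include <sys/wait.h>
#include <unistd.h>

/* A named POSIX semaphore is a kernel-managed object, so even
   processes not related by fork() can open it by name. */
int signal_via_kernel_semaphore(void) {
    sem_t *sem = sem_open("/demo_sem", O_CREAT, 0600, 0);
    if (sem == SEM_FAILED)
        return -1;
    if (fork() == 0) {
        sem_post(sem);      /* child signals through the kernel object */
        _exit(0);
    }
    sem_wait(sem);          /* parent blocks until the child posts */
    wait(NULL);
    sem_close(sem);
    sem_unlink("/demo_sem");
    return 0;
}
```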
What's your idea about simulating threads with the fork() function and a shared memory block?
Is it possible?
How reasonable is it to do this in a program? (I mean, will it work well?)
For starters, don't mix threads and fork().
A fork gives you a brand new process, which is a copy of the current process, with the same code segments. As the memory image changes (typically this is due to different behavior of the two processes) you get a separation of the memory images, however the executable code remains the same. Tasks do not share memory unless they use some Inter Process Communication (IPC) primitive.
In contrast, a thread is another execution thread of the same task. One task can have multiple threads, and the task's memory objects are shared among its threads; therefore shared data must be accessed through primitives and synchronization objects that allow you to avoid data corruption.
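The separation of memory images after fork() can be demonstrated in a few lines (the helper name is made up): the child's write to a copied variable is never seen by the parent.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* After fork(), parent and child have separate (copy-on-write)
   memory images: the child's write is invisible to the parent. */
int value_in_parent_after_fork(void) {
    int x = 1;
    pid_t pid = fork();
    if (pid == 0) {
        x = 99;        /* changes only the child's copy */
        _exit(0);
    }
    wait(NULL);
    return x;          /* still 1 in the parent */
}
```

With a thread instead of a child process, the write to x would be visible to both execution threads, which is exactly the difference the paragraph above describes.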
Yes, it is possible, but I cannot imagine it being a good idea, and it would be a real pain to test.
If you have a shared heap, and you make sure all semaphores etc. are allocated in the heap, and not the stack, then there's no inherent reason you couldn't do something like it. There would be some tricky differences though.
For example, anything you do in an interrupt handler in a multi-threaded program can change data used by all the threads, while in a forked program, you would have to send multiple interrupts, which would be caught at different times, and might lead to unintended effects.
If you want threading behavior, just use a thread.
AFAIK, fork will create a separate process with its own context, stack and so on. Depends what you mean by "simulating"...
You might want to check this out : http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them
A few of the answers here focus on "don't mix fork and threads". But the way I read your question is: "can you use two different processes, and still communicate quickly and conveniently with shared memory between them, just like how threads have access to each others' memory?"
And the answer is: yes you can, but you have to remember to explicitly mark which memory areas you want shared. You cannot just share your ordinary variables between the processes. Also, you can communicate this way between processes that are not related to each other at all; it is not limited to processes forked from each other.
Have a look at shared memory or "shm".
In Windows C++, CreateThread() causes some of the threads to slow down if one thread is performing a very CPU-intensive operation. Will CreateProcess() alleviate this? If so, does CreateProcess() imply the code must reside in a second executable, or can this all take place inside the same executable?
The major difference between a process and a thread is that each process has its own memory space, while threads share the memory space of the process that they are running within.
If a thread is truly CPU-bound, it will only slow another thread if they are both executing on the same processor core. CreateProcess will not alleviate this, since a process would still have the same issue.
Also, what kind of machine are you running this on? Does it have more than one core?
Not likely: a process is much "heavier" than a thread, so it is likely to be slower still. I'm not sure what you're asking about the second executable, but you can use CreateProcess on the same .exe.
http://msdn.microsoft.com/en-us/library/ms682425(v=vs.85).aspx
It sounds like you're chasing down some performance issues, so perhaps trying out a threading-oriented profiler would be helpful: http://software.intel.com/en-us/articles/using-intel-thread-profiler-for-win32-threads-philosophy-and-theory/
Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a priority class, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.
A thread is the entity within a process that can be scheduled for execution. All threads of a process share its virtual address space and system resources. In addition, each thread maintains exception handlers, a scheduling priority, thread local storage, a unique thread identifier, and a set of structures the system will use to save the thread context until it is scheduled. The thread context includes the thread's set of machine registers, the kernel stack, a thread environment block, and a user stack in the address space of the thread's process. Threads can also have their own security context, which can be used for impersonating clients.
CreateProcess and CreateThread both cause additional execution in what is a resource-limited environment. No matter how you do parallel processing, at some point your other lines of execution will impede the current one. This is why, for very large problems that are suited to parallelization, distributed systems are used. There are pluses and minuses to both threads and processes.
Threads
Threads allow separate execution inside one address space, meaning you can share data variables and object instances very easily; however, it also means you run into many more synchronization issues. These are painful, and as you can see from the sheer number of API functions involved, not a light subject. Threads are lighter weight on Windows than processes, and as such spin up and down faster and use fewer resources to maintain. Threads also suffer in that one thread can cause the entire process to fail.
Processes
Processes each have their own address space and as such are protected from being brought down by another process, but they lack the ability to communicate easily. Any communication will necessarily involve some kind of IPC (pipes, TCP, ...).
The code does not have to be in a second executable; two instances of the same one just need to run.
That would make things worse. When switching threads, the CPU needs to swap out only a few registers. Since all threads of a process share the same memory, there's no need to flush the cache. But when switching between processes, you also switch mapped memory, so the CPU has to flush the L1 cache. That's painful.
(L2 cache is physically mapped, i.e. uses hardware addresses. Those don't change, of course.)