In one C++ source file that I'm wrapping in a python function, someone has included the following:
namespace some_namespace
{
static double some_double;
}
float function_that_uses_some_double(float input) {
// implementation
return result;
}
The static global some_double is only ever used inside the function, so if I wrap this in a CPython function and call it in single-threaded code, the variable will only ever be used by one function at a time. It's ugly, but no problem there. My question is what happens if I use:
the threading module, or
the multiprocessing module.
When I have multiple processes and / or threads using this module, will they interfere with each other?
If you use the threading module, then all functions will simply share that global variable. Threads in python are switched between bytecode boundaries, so locking is a non-issue.
If you use the multiprocessing module, things are different and it depends a bit on your usage of multiprocessing. Python starts with a single process so, there is only one copy of the global variable. The value of that variable when you start using multiprocessing (i.e. forking new python processes from the main process) will be copied into the subprocesses (tasks), but those processes will each have their own copy of the global variable.
If you get tricky and set up a shared memory segment (mmap with MAP_SHARED) and the variable is a pointer, then the location pointed to will be shared and you'll need to use locking.
If you are going to change some_double (which I assume you are because it's not a const variable), then you will have to do some kind of locking (e.g. mutex) if you use multiple threads.
Related
We are porting an embedded application from Windows CE to a different system. The current processor is an STM32F4. Our current codebase heavily uses TLS. The new prototype is running KEIL CMSIS RTOS which has very reduced functionality.
On http://www.keil.com/support/man/docs/armcc/armcc_chr1359124216560.htm it says that thread local storage is supported since 5.04. Right now we are using 5.04. The problem is that when linking our program with a variable definition of __thread int a; the linker cannot find __aeabi_read_tp which makes sense to me.
My question is: Is it possible to implement __aeabi_read_tp and it will work or is there more to it?
If it simply is not possible for us: Is there a way to implement TLS only in software? Let's not talk about performance there for now.
EDIT
I tried implementing __aeabi_read_tp by looking at old source of freeBSD and other sources. While the function is mostly implemented in assembly I found a version in C which boils down to this:
extern "C"
{
extern osThreadId svcThreadGetId(void);
void *__aeabi_read_tp()
{
return (void*)svcThreadGetId();
}
}
What this basically does is give me the ID (void*) of my currently executing thread. If I understand correctly that is what we want. Can this possibly work?
Not considering the performance and not going into CMIS RTOS specifics (which are unknown to me), you can allocate space needed for your variables - either on heap or as static or global variable - I would suggest to have an array of structures. Then, when you create thread, pass the pointer to the next not used structure to your thread function.
In case of static or global variable, it would be good if you know how many threads are working in parallel for limiting the size of preallocated memory.
EDIT: Added sample of TLS implementation based on pthreads:
#include <pthread.h>
#define MAX_PARALLEL_THREADS 10
static pthread_t threads[MAX_PARALLEL_THREADS];
static struct tls_data tls_data[MAX_PARALLEL_THREADS];
static int tls_data_free_index = 0;
static void *worker_thread(void *arg) {
static struct tls_data *data = (struct tls_data *) arg;
/* Code omitted. */
}
static int spawn_thread() {
if (tls_data_free_index >= MAX_PARALLEL_THREADS) {
// Consider increasing MAX_PARALLEL_THREADS
return -1;
}
/* Prepare thread data - code omitted. */
pthread_create(& threads[tls_data_free_index], NULL, worker_thread, & tls_data[tls_data_free_index]);
}
The not-so-impressive solution is a std::map<threadID, T>. Needs to be wrapped with a mutex to allow new threads.
For something more convoluted, see this idea
I believe this is possible, but probably tricky.
Here's a paper describing how __thread or thread_local behaves in ELF images (though it doesn't talk about ARM architecture for AEABI):
https://www.akkadia.org/drepper/tls.pdf
The executive summary is:
The linker creates .tbss and/or .tdata sections in the resulting executable to provide a prototype image of the thread local data needed for each thread.
At runtime, each thread control block (TCB) has a pointer to a dynamic thread-local vector table (dtv in the paper) that contains the thread-local storage for that thread. It is lazily allocated and initialized the first time a thread attempts to access a thread-local variable. (presumably by __aeabi_read_tp())
Initialization copies the prototype .tdata image and memsets the .tbss image into the allocated storage.
When source code access thread-local variables, the compiler generates code to read the thread pointer from __aeabi_read_tp(), and do all the appropriate indirection to get at the storage for that thread-local variable.
The compiler and linker is doing all the work you'd expect it to, but you need to initialize and return a "thread pointer" that is properly structured and filled out the way the compiler expects it to be, because it's generating instructions directly to follow the hops.
There are a few ways that TLS variables are accessed, as mentioned in this paper, which, again, may or may not totally apply to your compiler and architecture:
http://www.fsfla.org/~lxoliva/writeups/TLS/RFC-TLSDESC-x86.txt
But, the problems are roughly the same. When you have runtime-loaded libraries that may bring their own .tbss and .tdata sections, it gets more complicated. You have to expand the thread-local storage for any thread that suddenly tries to access a variable introduced by a library loaded after the storage for that thread was initialized. The compiler has to generate different access code depending on where the TLS variable is declared. You'd need to handle and test all the cases you would want to support.
It's years later, so you probably already solved or didn't solve your problem. In this case, it is (was) probably easiest to use your OS's TLS API directly.
I'm currently using QT (4) to parallelize a non-threadsafe library that's written in C by non-programmers, and thus has a lot of global variables. The threads don't need to interact or share data, they each just call a bunch of methods of the library and then at the end the library gives an output that is used further.
The problem is, though, that global variables are per default shared between threads, causing the library to crash in different places. There are two ways to fix this:
Refactor the entire library to not use global variables (ouch), or find a way to make global variables non-shared, or find a third magic way.
Is the latter an option with QT or standard (C++01) C++?
Using thread local storage is a way to make global variables non-shared. Starting point for that, with links to details for different implementations:
http://en.wikipedia.org/wiki/Thread-local_storage
You can't "unshare" global variables. The only available option for parallelization (bar refactoring) is to have multiple processes instead of multiple threads. Preferably pooled.
Create a singleton that is responsible for synchronizing the access to the global variables. The global variables become members of the new singleton class and can be accessed by methods that have the same names as your current global variables so you don't need to change code all over the place.
How do I use a boost::thread to execute a function with each thread executing in its own memory space. So that when I allocate a new variable with in the function it only lives as an instance in the executing thread.
Just to clarify I want to spawn threads that execute the same method using boost::thread but I do not want to use locks or semaphores I just want it to execute in a separate space.
Anything you allocate inside the thread function is already local to that function, as long as they're not declared as static. Just write your code as normal (avoiding static local variables) and you'll be fine.
If you need to create a thread that is running completely within its own address space, then what you are looking to do is to create a process, not a thread. Threads by definition are points of execution running within the same address space of the parent process.
If you really need to create threads (i.e. there's still memory and other resources shared between threads), but you also need to have a portion of memory dedicated to a specific thread, then you have few options:
1) as ildjarn suggested, have thread procedure allocate local (or dynamic memory) and write your code so that each thread uses this memory that it allocates for itself
2) Take a look at TLS (Thread Local Storage). It is an API that allows you to create "global" variables which are dedicated to a specific thread. Also some variations of C++ have built-in keywords for declaring variables which use TLS under the hood.
Note that in above options you will not get automatic isolation where a thread would not be able to corrupt another threads memory. The only way to get this isolation is to spawn multiple processes (or switch to one of .NET languages and instantiate multiple AppDomains running within the same process).
I have a program that I link with many libraries. I run my application on profiler and found out that most of the time is spent in "waiting" state after some network requests.
Those requests are effect of my code calling sleeping_function() from external library.
I call this function in a loop which executes many, many times so all waiting times sum up to huge amounts.
As I cannot modify the sleeping_function() I want to start a few threads to run a few iterations of my loop in parallel. The problem is that this function internally uses some global variables.
Is there a way to tell linker on SunOS that I want to link specific libraries in a way that will place all variables from them in Thread Local Storage?
I don’t think you’ll be able to achieve this with just the linker, but you might be able to get something working with some code in C.
The problem is that a call to load a library that is already loaded will return a reference to the already loaded instance instead of loading a new copy. A quick look at the documentation for dlopen and LoadLibrary seems to confirm that there’s no way to load the same library more than once, at least not if you want the image to be prepared for execution. One way to circumvent this would be to prevent the OS from knowing that it is the same library. To do this you could make a copy of the file.
Some pseudo code, just replace calls to sleeping_function with calls to call_sleeping_function_thread_safe:
char *shared_lib_name
void sleeping_function_thread_init(char *lib_name);
void call_sleeping_function_thread_safe()
{
void *lib_handle;
pthread_t pthread;
new_file_name = make_copy_of_file(shared_lib_name);
pthread_create(&pthread, NULL, sleeping_function_thread_init, new_file_name);
}
void sleeping_function_thread_init(char *lib_name)
{
void *lib_handle;
void (*)() sleeping_function;
lib_handle = dlopen(lib_name, RTLD_LOCAL);
sleeping_function = dlsym(lib_handle, "sleeping_function")
while (...)
sleeping_function;
dlclose(lib_handle);
delete_file(lib_name);
}
For windows dlopen becomes LoadLibrary and dlsym becomes GetProcAddress etc... but the basic idea would still work.
In general, this is a bad idea. Global data isn't the only issue that may prevent a non thread-safe library from running in a multithreaded environment.
As one example, what if the library had a global variable that points to a memory-mapped file that it always maps into a single, hardcoded address. In this case, with your technique, you would have one global variable per thread, but they would all point to the same memory location, which would be trashed by multi-threaded access.
In C, declaring a variable static in the global scope makes it a global variable. Is this global variable shared among threads or is it allocated per thread?
Update:
If they are shared among threads, what is an easy way to make globals in a preexisting library unique to a thread/non-shared?
Update2:
Basically, I need to use a preexisting C library with globals in a thread-safe manner.
It's visible to the entire process, i.e., all threads. Of course, this is in practice. In theory, you couldn't say because threads have nothing to do with the C standard (at least up to c99, which is the standard that was in force when this question was asked).
But all thread libraries I've ever used would have globals accessible to all threads.
Update 1:
Many thread libraries (pthreads, for one) will allow you to create thread-specific data, a means for functions to create and use data specific to the thread without having it passed down through the function.
So, for example, a function to return pseudo random numbers may want each thread to have an independent seed. So each time it's called it either creates or attaches to a thread-specific block holding that seed (using some sort of key).
This allows the functions to maintain the same signature as the non-threaded ones (important if they're ISO C functions for example) since the other solution involves adding a thread-specific pointer to the function call itself.
Another possibility is to have an array of globals of which each thread gets one, such as:
int fDone[10];
int idx;
: : :
for (i = 0; i < 10; i++) {
idx = i;
startThread (function, i);
while (idx >= 0)
yield();
}
void function () {
int myIdx = idx;
idx = -1;
while (1) {
: : :
}
}
This would allow the thread function to be told which global variable in the array belongs to it.
There are other methods, no doubt, but short of knowing your target environment, there's not much point in discussing them.
Update 2:
The easiest way to use a non-thread-safe library in a threaded environment is to provide wrapper calls with mutex protection.
For example, say your library has a non-thread-safe doThis() function. What you do is provide a wrapper for it:
void myDoThis (a, b) {
static mutex_t serialize;
mutex_claim (&serialize);
doThis (a, b);
mutex_release (&serialize);
}
What will happen there is that only one thread at a time will be able to claim the mutex (and hence call the non-thread-safe function). Others will be blocked until the current one returns.
C/C++ standard doesn't support threads. So all variables shared among threads.
Thread support implemented in C/C++ runtime library which is not part of the standard. Runtime is specific for each implementation of C/C++. If you want to write portable code in C++ you could use boost interprocess library.
To declare thread local variable in Microsoft Visual Studio you could use Microsoft specific keyword __declspec( thread ).
As #Pax mentioned, static variables are visible to all threads. There's no C++ data construct associated with a particular thread.
However, on Windows you can use the TlsAlloc API to allocate index for a thread-specific data and put that index in a static variable. Each thread has its own slot which you can access using this index and the TlsGetValue and TlsSetValue. For more information, read about Using Thread Local Storage on MSDN.
Update: There's no way to make globals in a preexisting library be thread-specific. Any solutions would require you to modify the code as well to be aware that the data has thread affinity.