c++ multithreading const parameter question - c++

I'm in the design phase of a multi threading problem I might implement in c++. Will be the first time implementing something multi threaded in c++. The question is quite simple: If I have a function with a const parameter as input, is it just the function under consideration that is not allowed to alter it? Or does c++ guarantee that the parameter will not change (even if another thread tries to access it mid-function)?
void someFunction(const SomeObject& mustNotChange){
bool check;
if(mustNotChange.getNumber()==0) check == true; //sleep for 10s
if(check && mustNotChange.getNumber()!=0) //CRASH!!!
}

In your example const doesn't make any difference for other threads as mustNotChange is in your current function stack space (or even a register) which should not be accessible by other threads.
I assume you are more interested in the case where other threads can access the memory, something like:
void someFunction(const int& mustNotChange)
{
//...
}
void someOtherFunction(int& mayCHange)
{
//...
}
int main()
{
int i = 0;
std::thread t0([&i](){someFunction(i);});
std::thread t1([&i](){someOtherFunction(i);});
t0.join();
t1.join();
return 0;
}
In this case the const ensure that someFunction can't change the value of mustNotChange and if it does it's undefined behavior, but this doesn't offer any guarantees about other functions that can access the same memory.
As a conclusion:
if you don't share data (as in the same memory location) between threads as you do in your example you don't have to worry about data races
if you share data, but no function can change the shared data (all functions receive data as const) you don't have to worry about data races. Please note that in current example if you change i before both threads join it's still a data race!
if you share data and at least one function can change the data you
must synchronization mechanisms.

It is not possible to say by just looking at the code if the referred-to object can change or not. This is why it's important to design the application with concurrency in mind, to, for example, minimize explicit sharing of writable data.
It doesn't matter that an object is accessed inside a function or through a const reference, the rules are the same. In the C++ memory model, accessing the same (non-atomic) value concurrently is possible only for reading. As soon as at least one writer is involved (in any thread), no other reads or writes may happen concurrently.
Furthermore, reads and writes in different threads must be synchronized; this is known as the "happens-before" semantics; locking and releasing a mutex or waiting on an atomic are examples of synchronization events which "release" writes to other threads, which subsequently "acquire" those writes.
For more details on C++ concurrency there is a very good book "C++ Concurrency in Action". Herb Sutter also has a nice atomic<> weapons talk.

Basically the answer is yes; You are passing the variable by const value and hence the function to which this variable is scoped is not allowed to alter it.

It is best to think of a const parameter in C++ as a promise by the function not to change the value. It doesn't mean someone else doesn't.

Related

C++ thread safety - map reading

I am working on a program that needs std::map and specifically one like this map<string,map<string,int>> - it is meant to be something like bank change rates - the first string is the original currency and the one in the second map is the desired one and the int is their rate. This whole map will be read only. Do I still need mutexes ? I am a bit confused about the whole thread safety, since this is my first bigger multi-threaded program.
If you are talking about the standard std::map† and no thread writes to it, no synchronization is required. Concurrent reads without writes are fine.
If however at least one thread performs writes on the map, you will indeed need some sort of protection like a mutex.
Be aware that std::map::operator[] counts as write, so use std::map::at (or std::map::find if the key may not exist in the map) instead. You can make the compiler protect you from accidental writes by only referring to the shared map via const map&.
†Was clarified to be the case in the OP. For completeness' sake: Note that other classes may have mutable members. For those, even access through const& may introduce a race. If in doubt, check the documentation or use something else for parallel programming.
The rule of thumb is if you have shared data and at least one thread will be a writer then you need synchronization. If one of the threads is a writer you must have synchronization as you do not want a reader to read an element that is being written to. This can cause issues as the reader might read part of the old value and part of the new value.
In your case since all the threads will only ever being reading data there is nothing they can do that will affect the map so you can have concurrent(unsynchronized) reads.
Wrap a std::map<std::string, std::map<std::string,int>> const in a custom class which has only const member functions [*].
This will make sure that all threads which use an object of the class after its creation will only read from it, which is guaranteed to be safe since C++11.
As documentation says:
All const member functions can be called concurrently by different
threads on the same container.
Wrapping containers in your own custom types is good practice anyway. Increased thread safety is just one positive side effect of that good practice. Other positive effects include increased readability of client code, reduction/adaption of container interface to required functionality, ease of adding additional constraints and checks.
Here is a brief example:
class BankChangeRates
{
public:
BankChangeRates(std::map<std::string, std::map<std::string,int>> const& data) : data(data) {}
int get(std::string const& key, std::string const& inner_key) const
{
auto const find_iter = data.find(key);
if (find_iter != data.end())
{
auto const inner_find_iter = find_iter->second.find(inner_key);
if (inner_find_iter != find_iter->second.end())
{
return inner_find_iter->second;
}
}
// error handling
}
int size() const
{
return data.size();
}
private:
std::map<std::string, std::map<std::string,int>> const data;
};
In any case, the thread-safety problem is then reduced to how to make sure that the constructor does not read from an object to which another thread writes. This is often achieved trivially; for example, the object may be constructed before multi-threading even begins, or it may be initialised with hard-coded initialisation lists. In many other cases, the code which creates the object will generally access only other thread-safe functions and local objects.
The point is that concurrent accesses to your object will always be safe once it has been created.
[*] Of course, the const member functions should keep their promise and not attempt "workarounds" with mutable or const_cast.
If your are completely sure that both the maps are ALWAYS READONLY, Then you never need mutexes.
But you have to be extra careful that no one can update the map by any means during the program execution. Make sure that you are initializing the map at the init stage of program and then never update it for any reason.
If you are confused that, In future you may need to update it in between the program execution, then its better to have macros around the map, which are empty right now. And in future, if you need mutexes around them, just change the macro definition.
PS:: I have used map in answer which can be easily replaced by shared resources. It was for the ease of understanding

Do I have to use atomic<bool> for "exit" bool variable?

I need to set a flag for another thread to exit. That other thread checks the exit flag from time to time. Do I have to use atomic for the flag or just a plain bool is enough and why (with an example of what exactly may go wrong if I use plain bool)?
#include <future>
bool exit = false;
void thread_fn()
{
while(!exit)
{
//do stuff
if(exit) break;
//do stuff
}
}
int main()
{
auto f = std::async(std::launch::async, thread_fn);
//do stuff
exit = true;
f.get();
}
Do I have to use atomic for “exit” bool variable?
Yes.
Either use atomic<bool>, or use manual synchronization through (for instance) an std::mutex. Your program currently contains a data race, with one thread potentially reading a variable while another thread is writing it. This is Undefined Behavior.
Per Paragraph 1.10/21 of the C++11 Standard:
The execution of a program contains a data race if it contains two conflicting actions in different threads,
at least one of which is not atomic, and neither happens before the other. Any such data race results in
undefined behavior.
The definition of "conflicting" is given in Paragraph 1.10/4:
Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one
accesses or modifies the same memory location.
Yes, you must have some synchronization. The easiest way is, as you say, with atomic<bool>.
Formally, as #AndyProwl says, the language definition says that not using an atomic here gives undefined behavior. There are good reasons for that.
First, a read or write of a variable can be interrupted halfway through by a thread switch; the other thread may see a partly-written value, or if it modifies the value, the original thread will see a mixed value. Second, when two threads run on different cores, they have separate caches; writing a value stores it in the cache, but doesn't update other caches, so a thread might not see a value written by another thread. Third, the compiler can reorganize code based on what it sees; in the example code, if nothing inside the loop changes the value of exit, the compiler doesn't have any reason to suspect that the value will change; it can turn the loop into while(1).
Atomics address all three of these problems.
actually, nothing goes wrong with plain bool in this particular example. the only notice is to declare bool exit variable as volatile to keep it in memory.
both CISC and RISC architectures implement bool read/write as strictly atomic processor instruction. also modern multocore processors have advanced smart cache implementstion. so, any memory barriers are not necessary. the Standard citation is not appropriate for this particular case because it deals with the only one writing and the reading from the only one thread.

Are non-mutating operations inherently thread safe (C++)?

This is probably a dumb question, but consider the following psuedo-code:
struct Person {
std::string name;
};
class Registry {
public:
const std::string& name(int id) const {return _people[id].name;}
void name(int id, const std::string& name) { [[scoped mutex]]; _people[id].name = name;}
private:
std::map<int, Person> _people;
};
In this simple example, assume Registry is a singleton that will be accessed by multiple threads. I'm locking during an operation that mutates the data, but not during non-mutating access.
Is this thread safe, or should I also lock during the read operation? I'm preventing multiple threads from trying to modify the data at the same time, but I don't know what would happen if a thread was trying to read at the same time another was writing.
If any thread can modify the data, then you need to lock for all access.
Otherwise, one of your "reading" threads could access the data when it is in an indeterminate state. Modifying a map, for example, requires manipulating several pointers. Your reading thread could acces the map while some - but not all - of the map has been adjusted.
If you can guarantee that the data is not being modified, multiple reads from multiple threads do not need to be locked, however that introduces a fragile scenario that you would have to keep a close eye on.
It's not thread safe to be reading the data while it is being modified. It is perfectly safe to have multiple threads reading the data at once.
This difference is what reader-writer locks are for; they will allow any number of readers but when a writer tries to lock the resource new readers will no longer be allowed and the writer will block until all the current readers are done. Then the writer will proceed and once it's done all the readers will be allowed access again.
The reason it's not safe to read data during modification is that the data can be or can appear to be in an inconsistent state (e.g., the object may temporarily not fulfill invariant). If the reader reads it at that point then it's just like there's a bug in the program failing to keep the data consistent.
// example
int array[10];
int size = 0;
int &top() {
return array[size-1];
}
void insert(int value) {
size++;
top() = value;
}
Any number of threads can call top() at the same time, but if one thread is running insert() then a problem occurs when the lines get interleaved like this:
// thread calling insert thread calling top
size++;
return array[size-1];
array[size-1] = value
The reading thread gets garbage.
Of course this is just one possible way things can go wrong. In general you can't even assume the program will behave as though lines of code on different threads will just interleave. In order to make that assumption valid the language simply tells you that you can't have data races (i.e., what we've been talking about; multiple threads accessing a (non-atomic) object with at least one thread modifying the object)*.
* And for completeness; that all atomic accesses use a sequentially consistent memory ordering. This doesn't matter for you since you're not doing low level work directly with atomic objects.
Is this thread safe, or should I also lock during the read operation?
It is not thread-safe.
Per Paragraph 1.10/4 of the C++11 Standard:
Two expression evaluations conflict if one of them modifies a memory location (1.7) and the other one accesses or modifies the same memory location.
Moreover, per Paragraph 1.10/21:
The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior. [...]
It's not thread safe.
The reading operation could be going through the map (it's a tree) to find the requested object, while a writing operation suddenly adds or removes the something from the map (or worse, the actual point the iterator is at).
If you're lucky you'll get an exception, otherwise it will be just undefined behavior while your map is in an inconsistant state.
I don't know what would happen if a thread was trying to read at the
same time another was writing.
Nobody knows. Chaos would ensue.
Multiple threads can share a read-only resource, but as soon as anyone wants to write it, it becomes unsafe for everyone to access it in any way until the writing is done.
Why?
Writes are not atomic. They happen over multiple clock cycles. A process attempting to read an object as it's written may find a half-modified version, temporary garbage.
So
Lock your reads if they are expected to be concurrent with your writes.
Absolutely not safe!
If you are changing Person::Name from "John Smith" to "James Watt", you may well read back a value of "Jame Smith" or "James mith". Or possibly even something completely different because the way that "change this value for that" may not just copy the new data into the existing place, but entirely replace it with a newly allocated piece of memory that contains some completely undefined content [including something that isn't a valid string].

C++ objects in multithreading

I would like to ask about thread safety in C++ (using POSIX threads with a C++ wrapper for ex.) when a single instance/object of a class is shared between different threads. For example the member methods of this single object of class A would be called within different threads. What should/can I do about thread safety?
class A {
private:
int n;
public:
void increment()
{
++n;
}
void decrement()
{
--n;
}
};
Should I protect class member n within increment/decrement methods with a lock or something else? Also static (class variables) members have such a need for lock?
If a member is immutable, I do not have to worry about it, right?
Anything that I cannot foreseen now?
In addition to the scenario with a single object within multithreads, what about multiple object with multiple threads? Each thread owns an instance of a class. Anything special other than static (class variables) members?
These are the things in my mind, but I believe this is a large topic and I would be glad if you have good resources and refer previous discussions about that.
Regards
Suggestion: don't try do it by hand. Use a good multithread library like the one from Boost: http://www.boost.org/doc/libs/1_47_0/doc/html/thread.html
This article from Intel will give you a good overview: http://software.intel.com/en-us/articles/multiple-approaches-to-multithreaded-applications/
It's a really large topic and probably it's impossible to complete the topic in this thread.
The golden rule is "You can't read while somebody else is writing."
So if you have an object that share a variable you have to put a lock in the function that access the shared variable.
There are very few cases when this is not true.
The first case is for integer number you can use the atomic function as showed by c-smile, in this case the CPU will use an hardware lock on the cache, so other cores can't modify the variables.
The second cases are lock free queue, that are special queue that use the compare and excange function to assure the atomicity of the instruction.
All the other cases are MUST be locked...
the first aproach is to lock everything, this can lead to a lot of problem when more object are involved (ObjA try to read from ObjB but, ObjB is using the variable and also is waiting for ObjC that wait ObjA) Where circular lock can lead to indefinite waiting (deadlock).
A better aproach is to minimize the point where thread share variable.
For example if you have and array of data, and you want to parallelize the computation on the data you can launch two thread and thread one will work only on even index while thread two will work on the odd. The thread are working on the same set of data, but as long the data don't overlap you don't have to use lock. (This is called data parallelization)
The other aproch is to organize the application as a set of "work" (function that run on a thread a produce a result) and make the work communicate only with messages. You only have to implement a thread safe message system and a work sheduler you are done. Or you can use libray like intel TBB.
Both approach don't solve deadlock problem but let you isolate the problem and find bugs more easily. Bugs in multithread are really hard to debug and sometime are also difficoult to find.
So, if you are studing I suggest to start with the thery and start with pThread, then whe you are learned the base move to a more user frendly library like boost or if you are using Gcc 4.6 as compiler the C++0x std::thread
yes, you should protect the functions with a lock if they are used in a multithreading environment. You can use boost libraries
and yes, immutable members should not be a concern, since a such a member can not be changed once it has been initialized.
Concerning "multiple object with multiple threads".. that depends very much of what you want to do, in some cases you could use a thread pool which is a mechanism that has a defined number of threads standing by for jobs to come in. But there's no thread concurrency there since each thread does one job.
You have to protect counters. No other options.
On Windows you can do this using these functions:
#if defined(PLATFORM_WIN32_GNU)
typedef long counter_t;
inline long _inc(counter_t& v) { return InterlockedIncrement(&v); }
inline long _dec(counter_t& v) { return InterlockedDecrement(&v); }
inline long _set(counter_t &v, long nv) { return InterlockedExchange(&v, nv); }
#elif defined(WINDOWS) && !defined(_WIN32_WCE) // lets try to keep things for wince simple as much as we can
typedef volatile long counter_t;
inline long _inc(counter_t& v) { return InterlockedIncrement((LPLONG)&v); }
inline long _dec(counter_t& v) { return InterlockedDecrement((LPLONG)&v); }
inline long _set(counter_t& v, long nv) { return InterlockedExchange((LPLONG)&v, nv); }

What exactly is a reentrant function?

Most of the times, the definition of reentrance is quoted from Wikipedia:
A computer program or routine is
described as reentrant if it can be
safely called again before its
previous invocation has been completed
(i.e it can be safely executed
concurrently). To be reentrant, a
computer program or routine:
Must hold no static (or global)
non-constant data.
Must not return the address to
static (or global) non-constant
data.
Must work only on the data provided
to it by the caller.
Must not rely on locks to singleton
resources.
Must not modify its own code (unless
executing in its own unique thread
storage)
Must not call non-reentrant computer
programs or routines.
How is safely defined?
If a program can be safely executed concurrently, does it always mean that it is reentrant?
What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?
Also,
Are all recursive functions reentrant?
Are all thread-safe functions reentrant?
Are all recursive and thread-safe functions reentrant?
While writing this question, one thing comes to mind:
Are the terms like reentrance and thread safety absolute at all i.e. do they have fixed concrete definitions? For, if they are not, this question is not very meaningful.
1. How is safely defined?
Semantically. In this case, this is not a hard-defined term. It just mean "You can do that, without risk".
2. If a program can be safely executed concurrently, does it always mean that it is reentrant?
No.
For example, let's have a C++ function that takes both a lock, and a callback as a parameter:
#include <mutex>
typedef void (*callback)();
std::mutex m;
void foo(callback f)
{
m.lock();
// use the resource protected by the mutex
if (f) {
f();
}
// use the resource protected by the mutex
m.unlock();
}
Another function could well need to lock the same mutex:
void bar()
{
foo(nullptr);
}
At first sight, everything seems ok… But wait:
int main()
{
foo(bar);
return 0;
}
If the lock on mutex is not recursive, then here's what will happen, in the main thread:
main will call foo.
foo will acquire the lock.
foo will call bar, which will call foo.
the 2nd foo will try to acquire the lock, fail and wait for it to be released.
Deadlock.
Oops…
Ok, I cheated, using the callback thing. But it's easy to imagine more complex pieces of code having a similar effect.
3. What exactly is the common thread between the six points mentioned that I should keep in mind while checking my code for reentrant capabilities?
You can smell a problem if your function has/gives access to a modifiable persistent resource, or has/gives access to a function that smells.
(Ok, 99% of our code should smell, then… See last section to handle that…)
So, studying your code, one of those points should alert you:
The function has a state (i.e. access a global variable, or even a class member variable)
This function can be called by multiple threads, or could appear twice in the stack while the process is executing (i.e. the function could call itself, directly or indirectly). Function taking callbacks as parameters smell a lot.
Note that non-reentrancy is viral : A function that could call a possible non-reentrant function cannot be considered reentrant.
Note, too, that C++ methods smell because they have access to this, so you should study the code to be sure they have no funny interaction.
4.1. Are all recursive functions reentrant?
No.
In multithreaded cases, a recursive function accessing a shared resource could be called by multiple threads at the same moment, resulting in bad/corrupted data.
In singlethreaded cases, a recursive function could use a non-reentrant function (like the infamous strtok), or use global data without handling the fact the data is already in use. So your function is recursive because it calls itself directly or indirectly, but it can still be recursive-unsafe.
4.2. Are all thread-safe functions reentrant?
In the example above, I showed how an apparently threadsafe function was not reentrant. OK, I cheated because of the callback parameter. But then, there are multiple ways to deadlock a thread by having it acquire twice a non-recursive lock.
4.3. Are all recursive and thread-safe functions reentrant?
I would say "yes" if by "recursive" you mean "recursive-safe".
If you can guarantee that a function can be called simultaneously by multiple threads, and can call itself, directly or indirectly, without problems, then it is reentrant.
The problem is evaluating this guarantee… ^_^
5. Are the terms like reentrance and thread safety absolute at all, i.e. do they have fixed concrete definitions?
I believe they do, but then, evaluating a function is thread-safe or reentrant can be difficult. This is why I used the term smell above: You can find a function is not reentrant, but it could be difficult to be sure a complex piece of code is reentrant
6. An example
Let's say you have an object, with one method that needs to use a resource:
struct MyStruct
{
P * p;
void foo()
{
if (this->p == nullptr)
{
this->p = new P();
}
// lots of code, some using this->p
if (this->p != nullptr)
{
delete this->p;
this->p = nullptr;
}
}
};
The first problem is that if somehow this function is called recursively (i.e. this function calls itself, directly or indirectly), the code will probably crash, because this->p will be deleted at the end of the last call, and still probably be used before the end of the first call.
Thus, this code is not recursive-safe.
We could use a reference counter to correct this:
struct MyStruct
{
size_t c;
P * p;
void foo()
{
if (c == 0)
{
this->p = new P();
}
++c;
// lots of code, some using this->p
--c;
if (c == 0)
{
delete this->p;
this->p = nullptr;
}
}
};
This way, the code becomes recursive-safe… But it is still not reentrant because of multithreading issues: We must be sure the modifications of c and of p will be done atomically, using a recursive mutex (not all mutexes are recursive):
#include <mutex>
struct MyStruct
{
std::recursive_mutex m;
size_t c;
P * p;
void foo()
{
m.lock();
if (c == 0)
{
this->p = new P();
}
++c;
m.unlock();
// lots of code, some using this->p
m.lock();
--c;
if (c == 0)
{
delete this->p;
this->p = nullptr;
}
m.unlock();
}
};
And of course, this all assumes the lots of code is itself reentrant, including the use of p.
And the code above is not even remotely exception-safe, but this is another story… ^_^
7. Hey 99% of our code is not reentrant!
It is quite true for spaghetti code. But if you partition correctly your code, you will avoid reentrancy problems.
7.1. Make sure all functions have NO state
They must only use the parameters, their own local variables, other functions without state, and return copies of the data if they return at all.
7.2. Make sure your object is "recursive-safe"
An object method has access to this, so it shares a state with all the methods of the same instance of the object.
So, make sure the object can be used at one point in the stack (i.e. calling method A), and then, at another point (i.e. calling method B), without corrupting the whole object. Design your object to make sure that upon exiting a method, the object is stable and correct (no dangling pointers, no contradicting member variables, etc.).
7.3. Make sure all your objects are correctly encapsulated
No one else should have access to their internal data:
// bad
int & MyObject::getCounter()
{
return this->counter;
}
// good
int MyObject::getCounter()
{
return this->counter;
}
// good, too
void MyObject::getCounter(int & p_counter)
{
p_counter = this->counter;
}
Even returning a const reference could be dangerous if the user retrieves the address of the data, as some other portion of the code could modify it without the code holding the const reference being told.
7.4. Make sure the user knows your object is not thread-safe
Thus, the user is responsible to use mutexes to use an object shared between threads.
The objects from the STL are designed to be not thread-safe (because of performance issues), and thus, if a user want to share a std::string between two threads, the user must protect its access with concurrency primitives;
7.5. Make sure your thread-safe code is recursive-safe
This means using recursive mutexes if you believe the same resource can be used twice by the same thread.
"Safely" is defined exactly as the common sense dictates - it means "doing its thing correctly without interfering with other things". The six points you cite quite clearly express the requirements to achieve that.
The answers to your 3 questions is 3× "no".
Are all recursive functions reentrant?
NO!
Two simultaneous invocations of a recursive function can easily screw up each other, if
they access the same global/static data, for example.
Are all thread-safe functions reentrant?
NO!
A function is thread-safe if it doesn't malfunction if called concurrently. But this can be achieved e.g. by using a mutex to block the execution of the second invocation until the first finishes, so only one invocation works at a time. Reentrancy means executing concurrently without interfering with other invocations.
Are all recursive and thread-safe functions reentrant?
NO!
See above.
The common thread:
Is the behavior well defined if the routine is called while it is interrupted?
If you have a function like this:
int add( int a , int b ) {
return a + b;
}
Then it is not dependent upon any external state. The behavior is well defined.
If you have a function like this:
int add_to_global( int a ) {
return gValue += a;
}
The result is not well defined on multiple threads. Information could be lost if the timing was just wrong.
The simplest form of a reentrant function is something that operates exclusively on the arguments passed and constant values. Anything else takes special handling or, often, is not reentrant. And of course the arguments must not reference mutable globals.
Now I have to elaborate on my previous comment. #paercebal answer is incorrect. In the example code didn't anyone notice that the mutex which as supposed to be parameter wasn't actually passed in?
I dispute the conclusion, I assert: for a function to be safe in the presence of concurrency it must be re-entrant. Therefore concurrent-safe (usually written thread-safe) implies re-entrant.
Neither thread safe nor re-entrant have anything to say about arguments: we're talking about concurrent execution of the function, which can still be unsafe if inappropriate parameters are used.
For example, memcpy() is thread-safe and re-entrant (usually). Obviously it will not work as expected if called with pointers to the same targets from two different threads. That's the point of the SGI definition, placing the onus on the client to ensure accesses to the same data structure are synchronised by the client.
It is important to understand that in general it is nonsense to have thread-safe operation include the parameters. If you've done any database programming you will understand. The concept of what is "atomic" and might be protected by a mutex or some other technique is necessarily a user concept: processing a transaction on a database can require multiple un-interrupted modifications. Who can say which ones need to be kept in sync but the client programmer?
The point is that "corruption" doesn't have to be messing up the memory on your computer with unserialised writes: corruption can still occur even if all individual operations are serialised. It follows that when you're asking if a function is thread-safe, or re-entrant, the question means for all appropriately separated arguments: using coupled arguments does not constitute a counter-example.
There are many programming systems out there: Ocaml is one, and I think Python as well, which have lots of non-reentrant code in them, but which uses a global lock to interleave thread acesss. These systems are not re-entrant and they're not thread-safe or concurrent-safe, they operate safely simply because they prevent concurrency globally.
A good example is malloc. It is not re-entrant and not thread-safe. This is because it has to access a global resource (the heap). Using locks doesn't make it safe: it's definitely not re-entrant. If the interface to malloc had be design properly it would be possible to make it re-entrant and thread-safe:
malloc(heap*, size_t);
Now it can be safe because it transfers the responsibility for serialising shared access to a single heap to the client. In particular no work is required if there are separate heap objects. If a common heap is used, the client has to serialise access. Using a lock inside the function is not enough: just consider a malloc locking a heap* and then a signal comes along and calls malloc on the same pointer: deadlock: the signal can't proceed, and the client can't either because it is interrupted.
Generally speaking, locks do not make things thread-safe .. they actually destroy safety by inappropriately trying to manage a resource that is owned by the client. Locking has to be done by the object manufacturer, thats the only code that knows how many objects are created and how they will be used.
The "common thread" (pun intended!?) amongst the points listed is that the function must not do anything that would affect the behaviour of any recursive or concurrent calls to the same function.
So for example static data is an issue because it is owned by all threads; if one call modifies a static variable the all threads use the modified data thus affecting their behaviour. Self modifying code (although rarely encountered, and in some cases prevented) would be a problem, because although there are multiple thread, there is only one copy of the code; the code is essential static data too.
Essentially to be re-entrant, each thread must be able to use the function as if it were the only user, and that is not the case if one thread can affect the behaviour of another in a non-deterministic manner. Primarily this involves each thread having either separate or constant data that the function works on.
All that said, point (1) is not necessarily true; for example, you might legitimately and by design use a static variable to retain a recursion count to guard against excessive recursion or to profile an algorithm.
A thread-safe function need not be reentrant; it may achieve thread safety by specifically preventing reentrancy with a lock, and point (6) says that such a function is not reentrant. Regarding point (6), a function that calls a thread-safe function that locks is not safe for use in recursion (it will dead-lock), and is therefore not said to be reentrant, though it may nonetheless safe for concurrency, and would still be re-entrant in the sense that multiple threads can have their program-counters in such a function simultaneously (just not with the locked region). May be this helps to distinguish thread-safety from reentarncy (or maybe adds to your confusion!).
The answers your "Also" questions are "No", "No" and "No". Just because a function is recursive and/or thread safe it doesn't make it re-entrant.
Each of these type of function can fail on all the points you quote. (Though I'm not 100% certain of point 5).
non reentrant function means that there will be a static context, maintained by function. when first time entering, there will be create new context for you. and next entering, you don't send more parameter for that, for convenient to token analyze, . e.g. strtok in c. if you have not clear the context, there might be some errors.
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
on the contrary of non-reentrant, reentrant function means calling function in anytime will get the same result without side effect. because there is none of context.
in the view of thread safe, it just means there is only one modification for public variable in current time, in current process. so you should add lock guard to ensure just one change for public field in one time.
so thread safety and reentrant are two different things in different views.reentrant function safety says you should clear context before next time for context analyze. thread safety says you should keep visit public field order.
The terms "Thread-safe" and "re-entrant" mean only and exactly what their definitions say. "Safe" in this context means only what the definition you quote below it says.
"Safe" here certainly doesn't mean safe in the broader sense that calling a given function in a given context won't totally hose your application. Altogether, a function might reliably produce a desired effect in your multi-threaded application but not qualify as either re-entrant or thread-safe according to the definitions. Oppositely, you can call re-entrant functions in ways that will produce a variety of undesired, unexpected and/or unpredictable effects in your multi-threaded application.
Recursive function can be anything and Re-entrant has a stronger definition than thread-safe so the answers to your numbered questions are all no.
Reading the definition of re-entrant, one might summarize it as meaning a function which will not modify any anything beyond what you call it to modify. But you shouldn't rely on only the summary.
Multi-threaded programming is just extremely difficult in the general case. Knowing which part of one's code re-entrant is only a part of this challenge. Thread safety is not additive. Rather than trying to piece together re-entrant functions, it's better to use an overall thread-safe design pattern and use this pattern to guide your use of every thread and shared resources in the your program.